This function plots the \(\beta_{kd}^{m}\) topic parameters across models
\(m\), topics \(k\), and dimensions \(d\). It takes as input a raw
alignment object and then returns a circle heatmap. The size of each circle
corresponds to the value \(\beta_{kd}^m\) for the model in panel \(m\),
topic in column \(k\), and dimension in row \(d\). The plot can be
restricted to only a subset of models by using the models argument,
which may be either a vector of model names or numeric indices into the list
of models. The dimensions can be filtered by using the n_features or
threshold arguments. By default, only dimensions whose average
\(\beta_{kd}^m\) across topics is at least 0.001 are displayed.
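As a rough illustration of the feature filtering described above, the base R sketch below applies a cutoff-style filter and a top-n filter to a toy topic-by-feature matrix. The matrix `beta`, the cutoff value, and all variable names are invented for this example. It follows the "average beta across topics" wording of the threshold argument and is an assumption about the general idea, not the package's actual implementation.

```r
# Toy beta matrix: 3 topics (rows) x 5 features (columns); each row sums to 1.
# All values are made up for illustration.
beta <- rbind(
  c(0.50, 0.30, 0.15, 0.04, 0.01),
  c(0.10, 0.60, 0.25, 0.04, 0.01),
  c(0.30, 0.30, 0.35, 0.04, 0.01)
)
colnames(beta) <- paste0("w", 1:5)

# Average beta of each feature across topics.
avg_beta <- colMeans(beta)

# Threshold-style filter: keep features whose average beta reaches the cutoff.
cutoff <- 0.05  # hypothetical value chosen to suit this toy matrix
keep_by_threshold <- names(avg_beta)[avg_beta >= cutoff]

# n_features-style filter: keep the top n features ranked by average beta.
n_keep <- 2
keep_by_rank <- names(sort(avg_beta, decreasing = TRUE))[seq_len(n_keep)]

print(keep_by_threshold)  # "w1" "w2" "w3"
print(keep_by_rank)       # "w2" "w1"
```

The two filters answer different questions: a cutoff keeps a variable number of rows depending on the data, while a top-n rule fixes the number of rows in the plot.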
plot_beta(
x,
models = "all",
filter_by = "beta",
x_axis = "label",
threshold = 0.001,
n_features = NULL,
beta_aes = "size",
color_by = "path"
)
x: (required) An alignment class object resulting from
align_topics.
models: Which models to display in the heatmap? Defaults to
"all", meaning that all models are shown. If given "last", only
the last model in the models list is plotted. If given a character
vector, only models whose names in the original models list match are
plotted. Similarly, if given a vector of integers, only the models at those
indices in the original models list are plotted.
filter_by: (optional, default = "beta") A character string specifying
whether the data (beta matrices) should be filtered by the average "beta"
across topics or by the "distinctiveness" of the features.
x_axis: (optional, default = "label") A character string specifying
whether the x-axis should display topic indices ("index"), so that they
match the alignment plot order, or topic names ("label").
threshold: (optional, default = 0.001) Words (features) whose average beta or distinctiveness across all topics is below this value are ignored.
n_features: (optional) Alternative to threshold. The maximum
number of words (features) to display along the rows of the plot.
beta_aes: Should word probabilities within a topic be encoded using
circle size ("size") or opacity ("alpha")? Defaults to
"size".
color_by: (optional) What should the color of topics and weights encode? Defaults to "path". Other possible values are "coherence", "refinement", or "topic".
A ggplot2 object describing the word probabilities associated with each topic across models of interest.
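To picture what the returned plot encodes, here is a minimal ggplot2 sketch of a circle heatmap built from a toy table. The data frame `toy`, its column names, and the random beta values (which are not normalized to sum to 1) are all invented for illustration. This is not the code that plot_beta itself uses, only one plausible way to draw the same kind of figure.

```r
library(ggplot2)

# Toy long-format table: one row per (model, topic, feature) triple.
# Model names, topic labels, features, and beta values are all invented.
toy <- expand.grid(
  feature = paste0("w", 1:4),
  topic = c("1", "2"),
  model = c("m1", "m2"),
  stringsAsFactors = FALSE
)
set.seed(1)
toy$beta <- runif(nrow(toy))

# One panel per model, topics along columns, features along rows,
# and circle size mapped to beta.
p <- ggplot(toy, aes(x = topic, y = feature, size = beta)) +
  geom_point() +
  facet_wrap(~model) +
  labs(x = "topic", y = "feature", size = "beta")

# Printing `p` draws the plot; because it is an ordinary ggplot object,
# further layers or themes can be added with `+`.
```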
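library(purrr)
# Simulate a 20 x 10 count matrix (each column is one multinomial draw of size 20)
data <- rmultinom(10, 20, rep(0.1, 20))
# One parameter list per model, with k = 1, ..., 5 topics, named "1" to "5"
lda_params <- setNames(map(1:5, ~ list(k = .)), 1:5)
# Fit one LDA model per parameter list
lda_models <- run_lda_models(data, lda_params)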
#> Using default value 'VEM' for 'method' LDA parameter.
#> Using default value 'VEM' for 'method' LDA parameter.
#> Using default value 'VEM' for 'method' LDA parameter.
#> Using default value 'VEM' for 'method' LDA parameter.
#> Using default value 'VEM' for 'method' LDA parameter.
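# Align topics across the fitted models
alignment <- align_topics(lda_models)
# Plot all models
plot_beta(alignment)
# Plot only the models at positions 3 and 4
plot_beta(alignment, models = c(3, 4))
# Plot only the last model
plot_beta(alignment, models = "last")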
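The different forms accepted by the models argument, two of which appear in the calls above, can be sketched with a small stand-alone helper. The function `select_models` and the list `toy_models` are hypothetical names introduced here for illustration. The sketch shows one way such an argument could be resolved against a named list and is not taken from the package source.

```r
# Hypothetical helper (not part of the package): resolve a `models`-style
# argument against a named list of models.
select_models <- function(model_list, models = "all") {
  if (is.numeric(models)) {
    return(model_list[models])                # positions in the list
  }
  if (identical(models, "all")) {
    return(model_list)                        # keep every model
  }
  if (identical(models, "last")) {
    return(model_list[length(model_list)])    # final model only
  }
  model_list[names(model_list) %in% models]   # match by name
}

# Stand-in list with the same names ("1" to "5") as lda_params above.
toy_models <- setNames(as.list(paste0("model_", 1:5)), 1:5)

names(select_models(toy_models, c(3, 4)))      # "3" "4"
names(select_models(toy_models, "last"))       # "5"
names(select_models(toy_models, c("2", "5")))  # "2" "5"
names(select_models(toy_models))               # "1" "2" "3" "4" "5"
```

Checking for numeric input first keeps the index case separate from the string keywords, so a model that happens to be named "3" is only matched when the name is passed as a character string.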