Let’s say we have some documents as character vectors, and we want to discover their underlying topics. This task is called “topic modeling”, and Latent Dirichlet Allocation (LDA; Blei et al., 2003) is probably the best-known topic model. Here, we consider the Wasserstein Dictionary Learning (WDL) model (Schmitz et al., 2018).
# a very simple example
sentences <- c("this is a sentence", "this is another one", "yet another sentence")
wdl_fit <- wdl(sentences, specs = wdl_specs(
wdl_control = list(num_topics = 2),
word2vec_control = list(min_count = 1)
))
#> Preprocessing the data...
#> Running tokenizer on the sentences...
#> Running Word2Vec for the embeddings and distance matrix...
#> `method` is automatically switched to "log"
#> Running WDL in CUDA mode...
#> This might take a while depending on the problem size...
#> Initializing WDL model with 5 vocabs, 3 docs, and 2 topics...
#> Training WDL model with 2 epochs, 1 batches
#> Epoch 1 of 2, batch 1 of 1
#> batch time: 0.01 sec
#> Epoch 2 of 2, batch 1 of 1
#> batch time: 0.00 sec
#> Inference on the dataset
#> Inference: 3 of 3 docs done
wdl_fit
#> WDL model topics:
#>
#> Topic 1:
#> sentenc yet anoth one </s>
#> 0.482 0.201 0.197 0.071 0.048
#>
#> Topic 2:
#> sentenc one anoth </s> yet
#> 0.419 0.255 0.176 0.112 0.039

We can see that each topic is a probability distribution over the tokens (words) in the vocabulary. If you want to access the topics directly, you can do this:
wdl_fit$topics
#> topic1 topic2
#> one 0.07082487 0.25479149
#> yet 0.20149596 0.03862156
#> anoth 0.19719629 0.17623633
#> sentenc 0.48236498 0.41868685
#> </s>     0.04811789 0.11166377

Alternatively, you can also obtain the per-document weights with which the topics re-construct the input data:
wdl_fit$weights
#> [,1] [,2] [,3]
#> topic1 0.5644257 0.6153338 0.4026389
#> topic2 0.4355743 0.3846662 0.5973611

See also vignette("specs").
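As a sanity check, the topic and weight matrices shown above are ordinary numeric matrices, so their probabilistic structure can be verified with base R. The sketch below copies the values from the printed output; the variable names `topics` and `weights` are only illustrative stand-ins for `wdl_fit$topics` and `wdl_fit$weights`.

```r
# Values copied from the printed output above (rows/cols named accordingly)
topics <- matrix(
  c(0.07082487, 0.25479149,   # one
    0.20149596, 0.03862156,   # yet
    0.19719629, 0.17623633,   # anoth
    0.48236498, 0.41868685,   # sentenc
    0.04811789, 0.11166377),  # </s>
  nrow = 5, byrow = TRUE,
  dimnames = list(c("one", "yet", "anoth", "sentenc", "</s>"),
                  c("topic1", "topic2"))
)
weights <- matrix(
  c(0.5644257, 0.6153338, 0.4026389,
    0.4355743, 0.3846662, 0.5973611),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("topic1", "topic2"), NULL)
)

# Each topic is a probability distribution over the vocabulary,
# and each document's weights are a distribution over the topics,
# so every column sums to 1:
colSums(topics)
colSums(weights)

# Top-3 tokens per topic, matching the printed model summary:
apply(topics, 2, function(p) names(sort(p, decreasing = TRUE))[1:3])
```

This also makes explicit how the printed topic summaries are produced: each column of the topic matrix is sorted in decreasing probability.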
Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan), 993–1022.
Peyré, G., & Cuturi, M. (2019). Computational Optimal Transport: With Applications to Data Science. Foundations and Trends® in Machine Learning, 11(5–6), 355–607. https://doi.org/10.1561/2200000073
Schmitz, M. A., Heitz, M., Bonneel, N., Ngolè, F., Coeurjolly, D., Cuturi, M., Peyré, G., & Starck, J.-L. (2018). Wasserstein dictionary learning: Optimal transport-based unsupervised nonlinear dictionary learning. SIAM Journal on Imaging Sciences, 11(1), 643–678. https://doi.org/10.1137/17M1140431
Xie, F. (2025). Deriving the Gradients of Some Popular Optimal Transport Algorithms (No. arXiv:2504.08722). arXiv. https://doi.org/10.48550/arXiv.2504.08722