Latent Dirichlet Allocation (LDA)
- Topic Model/Mapping
- Natural Language Processing (NLP)
- Beautiful Soup, a Python library designed for quick-turnaround projects like screen scraping
- Term Frequency–Inverse Document Frequency (TF-IDF)
- Probabilistic Latent Semantic Analysis (PLSA)
In Natural Language Processing (NLP), Latent Dirichlet Allocation (LDA) is a generative statistical model in which sets of observations are explained by unobserved groups, and those groups account for why some parts of the data are similar. For example, if the observations are words collected into documents, LDA posits that each document is a mixture of a small number of topics and that each word's presence is attributable to one of the document's topics. LDA is an example of a topic model.
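The generative story described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a fitting or inference routine: it samples one document from topics that are already known. The two toy topics in `TOPICS`, the Dirichlet parameter `alpha`, and the helper names `sample_dirichlet` and `generate_document` are all assumptions made for the example.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topics, alpha, length, rng):
    """Sample one document: a topic mixture, then a topic and a word per position."""
    # Per-document mixture over topics (the "unobserved groups").
    theta = sample_dirichlet(alpha, rng)
    words, assignments = [], []
    for _ in range(length):
        # Pick the topic responsible for this word position.
        z = rng.choices(range(len(topics)), weights=theta)[0]
        # Pick a word from that topic's word distribution.
        vocab = list(topics[z])
        w = rng.choices(vocab, weights=[topics[z][v] for v in vocab])[0]
        words.append(w)
        assignments.append(z)
    return theta, words, assignments


# Hypothetical toy topics: each maps words to probabilities that sum to 1.
TOPICS = [
    {"gene": 0.5, "dna": 0.3, "cell": 0.2},
    {"ball": 0.5, "team": 0.3, "goal": 0.2},
]

rng = random.Random(0)  # fixed seed so the run is repeatable
theta, words, assignments = generate_document(
    TOPICS, alpha=[0.5, 0.5], length=10, rng=rng
)
print("topic mixture:", theta)
print("words:", words)
```

Each generated word is attributable to exactly one topic, recorded in `assignments`. Fitting LDA to real text runs this story in reverse, inferring the topic mixtures and topic word distributions from the observed words alone.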