Difference between revisions of "Topic Model/Mapping"
Revision as of 22:47, 7 January 2019
- Latent Dirichlet Allocation (LDA)
- Natural Language Processing (NLP)
- Beautiful Soup: a Python library designed for quick-turnaround projects such as screen scraping
- Term Frequency–Inverse Document Frequency (TF-IDF)
- Probabilistic Latent Semantic Analysis (PLSA)
- Topic model | Wikipedia
- Hierarchical Dirichlet Process | Wikipedia
- Latent Semantic Analysis | Wikipedia
- Explicit semantic analysis | Wikipedia
- Topic Modeling with LSA, PLSA, LDA & lda2Vec | Joyce Xu
Topic modeling can be described as a method for finding groups of words (i.e., topics) in a collection of documents that best represent the information in the collection. It can also be thought of as a form of text mining: a way to find recurring patterns of words in textual material.
In machine learning and Natural Language Processing (NLP), a topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents. Topic modeling is a frequently used text-mining tool for discovering hidden semantic structures in a body of text. Intuitively, given that a document is about a particular topic, one would expect particular words to appear in the document more or less frequently: "dog" and "bone" will appear more often in documents about dogs, "cat" and "meow" will appear more often in documents about cats, and "the" and "is" will appear about equally often in both.
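The word-frequency intuition above can be illustrated with a small sketch. This is not a topic model: it only counts words in two groups of documents that are already labeled by hand, whereas a real topic model such as LDA has to infer the groupings itself. The toy corpus and the helper name relative_frequencies are invented for this illustration.

```python
from collections import Counter

# Hypothetical toy corpus, invented for illustration only.
dog_docs = [
    "the dog is chewing the bone",
    "the dog buried the bone",
]
cat_docs = [
    "the cat is asleep",
    "the cat said meow and the cat is hungry",
]


def relative_frequencies(docs):
    """Return each word's share of all tokens in the given documents."""
    counts = Counter(word for doc in docs for word in doc.split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


dog_freq = relative_frequencies(dog_docs)
cat_freq = relative_frequencies(cat_docs)

# Topic-specific words are frequent in one group and absent from the other,
# while function words such as "the" and "is" show up in both.
for word in ["dog", "bone", "cat", "meow", "the", "is"]:
    print(f"{word:5s} dogs={dog_freq.get(word, 0.0):.2f} "
          f"cats={cat_freq.get(word, 0.0):.2f}")
```

In this toy data, "dog" and "bone" occur only in the dog documents and "cat" and "meow" only in the cat documents, while "the" and "is" occur in both groups. A topic model works in the opposite direction: it starts from word counts like these and estimates which groups of words tend to occur together.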
Topic Map
A topic map is a standard for the representation and interchange of knowledge, with an emphasis on the findability of information. Topic maps were originally developed in the late 1990s as a way to represent back-of-the-book index structures so that multiple indexes from different sources could be merged. However, the developers quickly realized that with a little additional generalization, they could create a meta-model with potentially far wider application. The ISO standard is formally known as ISO/IEC 13250:2003.
A topic map represents information using
- topics, representing any concept, from people, countries, and organizations to software modules, individual files, and events,
- associations, representing hypergraph relationships between topics, and
- occurrences, representing information resources relevant to a particular topic.
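The three constructs listed above can be sketched as a small in-memory data structure. This is a simplified illustration only, not the ISO/IEC 13250 data model or any existing topic map library; all class, field, and method names (Topic, Occurrence, Association, TopicMap, associate, related) and the sample data are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Occurrence:
    """An information resource relevant to a topic."""
    kind: str       # e.g. "homepage" or "article" (hypothetical kinds)
    resource: str   # a locator such as a URL or a file path


@dataclass
class Topic:
    """Any concept: a person, an organization, a file, an event, ..."""
    identifier: str
    name: str
    occurrences: List[Occurrence] = field(default_factory=list)


@dataclass
class Association:
    """A typed relationship; it may link more than two topics."""
    kind: str
    roles: Dict[str, str]  # role name -> topic identifier


@dataclass
class TopicMap:
    topics: Dict[str, Topic] = field(default_factory=dict)
    associations: List[Association] = field(default_factory=list)

    def add_topic(self, topic: Topic) -> None:
        self.topics[topic.identifier] = topic

    def associate(self, kind: str, **roles: str) -> None:
        # Reject associations that refer to topics missing from the map.
        unknown = [t for t in roles.values() if t not in self.topics]
        if unknown:
            raise KeyError(f"unknown topics: {unknown}")
        self.associations.append(Association(kind, dict(roles)))

    def related(self, identifier: str) -> List[str]:
        """Identifiers of topics sharing an association with the given one."""
        found = set()
        for assoc in self.associations:
            members = set(assoc.roles.values())
            if identifier in members:
                found |= members - {identifier}
        return sorted(found)


# Hypothetical sample data.
tm = TopicMap()
tm.add_topic(Topic("ada", "Ada"))
tm.add_topic(Topic("acme", "Acme Corp"))
tm.add_topic(Topic("oslo", "Oslo"))
tm.topics["acme"].occurrences.append(
    Occurrence("homepage", "http://example.org/acme"))

# One association linking three topics, which a plain edge could not express.
tm.associate("employment", employee="ada", employer="acme", location="oslo")

print(tm.related("ada"))
```

Storing each association as a mapping from role names to topics is what lets a single relationship connect three or more topics at once, which is the sense in which associations form a hypergraph rather than an ordinary graph.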
Topic maps are similar to concept maps and mind maps in many respects, though of the three only topic maps are an ISO standard. Topic maps are a form of semantic web technology similar to RDF.