Difference between revisions of "Class"
* [http://nlp.stanford.edu/projects/glove/ GloVe]
* [http://fasttext.cc/ Fasttext]
* [http://www.slideshare.net/chartbeat/mockup-infographicv4-27900399 News articles per day]
* [http://github.com/philipperemy/financial-news-dataset News data source]
* [http://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/ Word embeddings]
* [http://en.wikipedia.org/wiki/Natural-language_processing Natural Language Processing]
* [http://en.wikipedia.org/wiki/Sentiment_analysis Sentiment Analysis]
Revision as of 13:09, 23 October 2018
https://courses.nvidia.com/dashboard
Linguistic Concepts
- coreference - anaphors
- gang of four design
- null subject
- recursion
Word Embeddings
- HMMs, CRFs, PGMs
- CBoW / Bag of Words / n-grams - one feature per word or per n items
- 1-hot Sparse input - create a vector the size of the entire vocabulary
- Stop Words
- TF-IDF
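The count-based representations listed above (one-hot, bag of words, stop words, TF-IDF) can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the stop-word set, the helper names, and the two toy sentences are all made up for the example, and the IDF is the simple unsmoothed log(N / df) variant.

```python
import math
from collections import Counter

# Hypothetical, tiny stop-word list used only for this example.
STOP_WORDS = {"the", "a", "of"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def build_vocab(docs):
    """Sorted list of every distinct token across the tokenized documents."""
    return sorted({w for doc in docs for w in doc})


def one_hot(word, vocab):
    """Sparse vector the size of the whole vocabulary with a single 1."""
    return [1 if w == word else 0 for w in vocab]


def bag_of_words(doc, vocab):
    """One count feature per vocabulary word; word order is discarded."""
    counts = Counter(doc)
    return [counts[w] for w in vocab]


def tf_idf(doc, docs, vocab):
    """Term frequency times unsmoothed inverse document frequency."""
    counts = Counter(doc)
    vector = []
    for w in vocab:
        tf = counts[w] / len(doc) if doc else 0.0
        df = sum(1 for d in docs if w in d)  # >= 1: vocab comes from docs
        vector.append(tf * math.log(len(docs) / df))
    return vector


docs = [tokenize("the stock rose"), tokenize("the stock fell")]
vocab = build_vocab(docs)
```

A word that appears in every document ("stock" here) gets a TF-IDF weight of zero, which is the point of the IDF term: it down-weights words that do not discriminate between documents.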
Word2Vec
Skip-Gram
- Firth 1957 Distributional Hypothesis
- Word Cloud
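A rough sketch of how Skip-Gram training pairs are formed, in line with the distributional hypothesis (a word is characterized by the words around it). The function name and the window size are assumptions for illustration; real Word2Vec training also adds subsampling and negative sampling, which are left out here.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


pairs = skipgram_pairs(["markets", "fell", "sharply", "today"], window=1)
```

Skip-Gram predicts each context word from the center word; CBoW reverses the direction and predicts the center word from its combined context.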
Text Classification
Text/Machine Translation (NMT)
Financial News
Yuval
Tools:
- GloVe
- dot product
- FastText
- Skipgram
- Continuous bag of words
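The dot product in the list above is the basic way to compare two embedding vectors. A minimal sketch follows; the three-dimensional vectors are invented toy values, not real GloVe or FastText embeddings, and the function names are illustrative.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Dot product normalized by vector lengths; 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


# Toy vectors, made up for illustration only.
toy_vectors = {
    "profit": [0.9, 0.1, 0.0],
    "earnings": [0.8, 0.2, 0.1],
    "weather": [0.0, 0.1, 0.9],
}

close = cosine_similarity(toy_vectors["profit"], toy_vectors["earnings"])
far = cosine_similarity(toy_vectors["profit"], toy_vectors["weather"])
```

With these toy values, `close` is larger than `far`, which is the behavior expected of embeddings of related versus unrelated words.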
Multi-channel LSTM network in Keras with TensorFlow. Utilizing the GloVe and FastText Skipgram pretrained embeddings allows the underlying network to access a larger feature space on which to build complex features.
Can utilize combinations of various corpora and embedding methods for better performance.
A bidirectional LSTM network is used to encode sequential information on the embedding layers.
A dense layer projects the final output classification.
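One possible reading of the architecture described above, sketched with `tf.keras`. Everything specific here is an assumption: the vocabulary size, sequence length, class count, LSTM width, and the choice of one Bi-LSTM per embedding channel followed by concatenation. The embedding matrices are random stand-ins; in practice they would be filled from the pretrained GloVe and FastText files.

```python
import numpy as np
import tensorflow as tf

VOCAB_SIZE = 1000  # hypothetical vocabulary size
SEQ_LEN = 50       # hypothetical number of tokens per article
NUM_CLASSES = 3    # hypothetical number of labels

# Random placeholders standing in for pretrained embedding matrices.
rng = np.random.default_rng(0)
glove_matrix = rng.normal(size=(VOCAB_SIZE, 100)).astype("float32")
fasttext_matrix = rng.normal(size=(VOCAB_SIZE, 300)).astype("float32")


def embedding_channel(tokens, matrix, name):
    """Frozen pretrained embedding followed by a bidirectional LSTM encoder."""
    embedded = tf.keras.layers.Embedding(
        input_dim=matrix.shape[0],
        output_dim=matrix.shape[1],
        embeddings_initializer=tf.keras.initializers.Constant(matrix),
        trainable=False,  # reuse the pretrained vectors (transfer learning)
        name=f"{name}_embedding",
    )(tokens)
    return tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(64), name=f"{name}_bilstm"
    )(embedded)


def build_model():
    """Two embedding channels over the same tokens, merged and classified."""
    tokens = tf.keras.layers.Input(shape=(SEQ_LEN,), dtype="int32", name="tokens")
    glove_features = embedding_channel(tokens, glove_matrix, "glove")
    fasttext_features = embedding_channel(tokens, fasttext_matrix, "fasttext")
    merged = tf.keras.layers.Concatenate()([glove_features, fasttext_features])
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(merged)
    model = tf.keras.Model(tokens, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_model()
```

Freezing the embedding layers keeps the pretrained vectors intact, so only the LSTM and dense weights are learned from the (possibly small) labelled news set.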
Use embedding... embeddings = transfer learning
? CNN vs Bi-LSTM (RNN): with this approach, the Bi-LSTM does not need a lot of data
Attention mechanism -- translate ... you can look back
... not a fixed vector size
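A minimal sketch of the "look back" idea, using plain dot-product attention as one common scoring choice (the notes do not say which variant was covered). The decoder query scores every encoder state, so the source sentence can be any length instead of being squeezed into one fixed-size vector. Function names and the toy numbers are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, encoder_states):
    """Weight each encoder state by its dot product with the query.

    Returns (weights, context), where context is the weighted sum of states.
    """
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy encoder states for a three-word source sentence (values are made up).
states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights, context = attend([2.0, 0.0], states)
```

The number of weights follows the number of encoder states, while the context vector keeps the state dimension, so the same decoder works for short and long inputs.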