Continuous Bag-of-Words (CBoW)
- Scikit-learn: machine learning in Python; simple and efficient tools for data mining and data analysis, built on NumPy, SciPy, and matplotlib
- Term Frequency, Inverse Document Frequency (TF-IDF)
- Doc2Vec
The CBOW model architecture tries to predict the current target word (the center word) from its source context words (the surrounding words). Take a simple sentence, “the quick brown fox jumps over the lazy dog”. It can be broken into pairs of (context_window, target_word); with a context window of size 2 (one word on each side of the target), we get examples such as ([quick, fox], brown), ([the, brown], quick), ([the, dog], lazy), and so on. The model then tries to predict the target_word from the context_window words.
Source: A hands-on intuitive approach to Deep Learning Methods for Text Data — Word2Vec, GloVe and FastText | Dipanjan Sarkar - Towards Data Science
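To make the (context_window, target_word) pairs concrete, here is a minimal sketch in plain Python that generates them for the example sentence. The function name `cbow_pairs` and the parameter `half_window` are hypothetical names chosen for illustration, not part of any library. It assumes whitespace tokenization, and it assumes that a context window of size 2 means one word on each side of the target. Words at the sentence edges without a full context are skipped.

```python
def cbow_pairs(tokens, half_window=1):
    """Build (context_window, target_word) pairs for CBOW training.

    half_window is the number of context words taken on each side of
    the target (an assumption: a window of size 2 means 1 per side).
    Positions without a full context on both sides are skipped.
    """
    pairs = []
    for i in range(half_window, len(tokens) - half_window):
        left = tokens[i - half_window:i]            # words before the target
        right = tokens[i + 1:i + 1 + half_window]   # words after the target
        pairs.append((left + right, tokens[i]))
    return pairs


sentence = "the quick brown fox jumps over the lazy dog"
tokens = sentence.split()  # simple whitespace tokenization (assumption)
pairs = cbow_pairs(tokens)

for context, target in pairs:
    print(context, "->", target)
```

This only covers the data-preparation step. A real CBOW model would then map each context word to an embedding, average (or sum) those embeddings, and train a classifier to predict the target word from the result.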