Difference between revisions of "Natural Language Tools & Services"

(Capability (other))
(Capability (other))
Line 44: Line 44:
 
* [http://stanfordnlp.github.io/CoreNLP/ CoreNLP | Stanford] The Stanford Natural Language Processing Group Toolkit ([[Python]])
 
* [http://stanfordnlp.github.io/CoreNLP/ CoreNLP | Stanford] The Stanford Natural Language Processing Group Toolkit ([[Python]])
 
* [[Natural Language Toolkit (NLTK)]] ([[Python]])
 
* [[Natural Language Toolkit (NLTK)]] ([[Python]])
 +
* [[SpaCy]] ([[Python]] and Cython) 
 +
* [[Python#scikit-learn|scikit-learn]] NLP toolkit
 
* [http://opennlp.apache.org/ Apache OpenNLP]
 
* [http://opennlp.apache.org/ Apache OpenNLP]
 
* [http://fasttext.cc/ fastText | Facebook's AI Research] representations and text classifiers ([[Python]])
 
* [http://fasttext.cc/ fastText | Facebook's AI Research] representations and text classifiers ([[Python]])
 
* [http://mallet.cs.umass.edu/ MALLET] a Java-based package
 
* [http://mallet.cs.umass.edu/ MALLET] a Java-based package
 
* [http://www.intel.ai/nlp-architect-by-intel-ai-lab-release-0-2/ Intel NLP Architect] ([[Python]])
 
* [http://www.intel.ai/nlp-architect-by-intel-ai-lab-release-0-2/ Intel NLP Architect] ([[Python]])
* [[SpaCy]] ([[Python]] and Cython) 
 
* [[Python#scikit-learn|scikit-learn]] NLP toolkit
 
 
* [[Gensim]] fast Vector Space Modelling, Topic Modeling, LDA implementation ([[Python]])
 
* [[Gensim]] fast Vector Space Modelling, Topic Modeling, LDA implementation ([[Python]])
 
* [http://allennlp.org/ AllenNLP] an Apache 2.0-licensed NLP research library built on PyTorch
 
* [http://allennlp.org/ AllenNLP] an Apache 2.0-licensed NLP research library built on PyTorch
Line 55: Line 55:
 
* [[Sintelix]]
 
* [[Sintelix]]
 
* [[H2O]] Driverless AI
 
* [[H2O]] Driverless AI
 
 
* [http://dandelion.eu/ Dandelion API]  
 
* [http://dandelion.eu/ Dandelion API]  
 
* [http://www.programmableweb.com/api/voxsigma VoxSigma API]
 
* [http://www.programmableweb.com/api/voxsigma VoxSigma API]
Line 79: Line 78:
 
* [http://www.ibm.com/watson/services/natural-language-understanding/  Watson Natural Language Understanding | IBM]
 
* [http://www.ibm.com/watson/services/natural-language-understanding/  Watson Natural Language Understanding | IBM]
 
* [http://www.research.ibm.com/artificial-intelligence/project-debater/ Project Debater | IBM]
 
* [http://www.research.ibm.com/artificial-intelligence/project-debater/ Project Debater | IBM]
 +
 
===== Text Labeling =====
 
===== Text Labeling =====
 
* [http://github.com/dennybritz/bella Bella] an open tool for simplifying and speeding up text data labeling. A dataset labeled in a CSV file or a Google spreadsheet usually has to be converted to an appropriate format before model training. Bella’s features and simple interface make it a good substitute for spreadsheets and CSV files; its main features are a graphical user interface (GUI) and a database backend for managing labeled data.
 
* [http://github.com/dennybritz/bella Bella] an open tool for simplifying and speeding up text data labeling. A dataset labeled in a CSV file or a Google spreadsheet usually has to be converted to an appropriate format before model training. Bella’s features and simple interface make it a good substitute for spreadsheets and CSV files; its main features are a graphical user interface (GUI) and a database backend for managing labeled data.

Revision as of 14:44, 27 December 2019

Capability with JavaScript

  • TensorFlow.js for training and deploying ML models in the browser and on Node.js (formerly deeplearn.js)
    • Keras.js no longer active; its capability is now in TensorFlow.js
  • NLP.js NLP Manager: a tool able to manage several languages (Node.js)
  • Compromise modest natural-language processing (NLP) that interprets and pre-parses English and makes some reasonable decisions
  • Natural provides tokenizing, stemming (reducing a word to a root form that is not necessarily its morphological root), classification, phonetics, tf-idf, WordNet, string similarity, some inflections, and more (Node.js)
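
Two of the features listed for Natural, tokenizing and tf-idf, can be illustrated without any library. The sketch below is plain Python and is not the API of Natural or of any other package named on this page; the function names `tokenize` and `tf_idf`, the sample sentences, and the choice of idf = log(N / df) are all assumptions made for illustration.

```python
from __future__ import annotations

import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(documents: list[str]) -> list[dict[str, float]]:
    """Return one {term: weight} mapping per document.

    tf is the raw count of the term in the document; idf is log(N / df),
    where N is the number of documents and df is the number of documents
    that contain the term (an assumed, commonly used variant).
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Count each term once per document to get its document frequency.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append(
            {
                term: count * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return weights


# Hypothetical sample corpus.
docs = ["the cat sat", "the dog sat", "the cat ran"]
scores = tf_idf(docs)
```

With this weighting, a term that occurs in every document (here "the") receives a weight of zero, while a term confined to one document (here "dog") receives the largest weight.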

Capability (other)

Text Labeling
  • Bella an open tool for simplifying and speeding up text data labeling. A dataset labeled in a CSV file or a Google spreadsheet usually has to be converted to an appropriate format before model training. Bella’s features and simple interface make it a good substitute for spreadsheets and CSV files; its main features are a graphical user interface (GUI) and a database backend for managing labeled data.
  • Tagtog lets users choose among three approaches: annotate text manually, hire a team to label the data for them, or use machine learning models for automated annotation.
  • Dataturks provides training data preparation tools. Using its products, teams can perform tasks such as part-of-speech tagging, named-entity recognition tagging, text classification, moderation, and summarization.
  • Brat rapid annotation tool, a web-based tool for text annotation, that is, for adding notes to existing text documents. It is designed in particular for structured annotation, where the notes are not free-form text but have a fixed form that can be automatically processed and interpreted by a computer.
  • Yedda a lightweight collaborative text span annotation tool developed for annotating chunks, entities, and events in text (almost all languages, including English and Chinese), symbols, and even emoji.
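
The idea of structured annotation, where each note has a fixed form that a program can process, can be sketched as follows. This is a minimal illustration that assumes a tab-separated standoff layout of `ID<TAB>LABEL START END<TAB>TEXT`, modeled on the style used by span annotation tools such as those above. The class name `SpanAnnotation`, the helper functions, and the sample sentence are hypothetical, and only contiguous spans are handled.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SpanAnnotation:
    """One text-bound annotation: an ID, a label, and a character span."""

    ann_id: str
    label: str
    start: int
    end: int
    text: str


def parse_span_line(line: str) -> SpanAnnotation:
    """Parse one 'ID<TAB>LABEL START END<TAB>TEXT' line (assumed layout)."""
    ann_id, span_info, text = line.rstrip("\n").split("\t", 2)
    label, start, end = span_info.split(" ")
    return SpanAnnotation(ann_id, label, int(start), int(end), text)


def check_against_document(ann: SpanAnnotation, document: str) -> bool:
    """Return True if the character offsets really cover the annotated text."""
    return document[ann.start:ann.end] == ann.text


# Hypothetical document and annotation lines.
document = "Sony opened an office in Berlin."
lines = [
    "T1\tOrganization 0 4\tSony",
    "T2\tLocation 25 31\tBerlin",
]
annotations = [parse_span_line(line) for line in lines]
valid = all(check_against_document(a, document) for a in annotations)
```

Because every annotation carries explicit offsets, a consistency check like `check_against_document` can catch annotations that drifted out of sync with the source text before they reach model training.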