Difference between revisions of "Speech Recognition"

Line 12:
* [[Capabilities]]
** [[End-to-End Speech]] ... [[Synthesize Speech]] ... [[Speech Recognition]]
** [[Video]] ... [[Generated Image]] ... [[Colorize]] ... [[Image/Video Transfer Learning]]
* [[Assistants]] ... [[Hybrid Assistants]] ... [[Agents]] ... [[Negotiation]]
* [[Natural Language Processing (NLP)]] ... [[Natural Language Generation (NLG)|Generation]] ... [[Large Language Model (LLM)|LLM]] ... [[Natural Language Tools & Services|Tools & Services]]

Revision as of 13:26, 20 March 2023


Whisper Automatic Speech Recognition Service (ASR)

Whisper is an automatic speech recognition (ASR) system by OpenAI, trained on 680,000 hours of multilingual and multitask supervised data collected from the web. "We’ve trained and are open-sourcing a neural net called Whisper that approaches human level robustness and accuracy on English speech recognition." - OpenAI