Difference between revisions of "Speech Recognition"
Revision as of 21:54, 24 March 2023
- Capabilities
- Assistants ... Hybrid Assistants ... Agents ... Negotiation ... LangChain
- Natural Language Processing (NLP) ... Generation ... LLM ... Tools & Services
- Attention Mechanism ... Transformer Model ... Generative Pre-trained Transformer (GPT)
- Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM)
- Generative AI ... OpenAI's ChatGPT ... Perplexity ... Microsoft's BingAI ... You ... Google's Bard ... Baidu's Ernie
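- Iused (http://www.theverge.com/2022/9/23/23367296/openai-whisper-transcription-speech-recognition-open-source)
- Speechmatics Introduces Ursa: A Speech-To-Text System That Delivers Unprecedented Performance Across A Diverse Range of Voices | Tanushree Shenwai - MarkTechPost (https://www.marktechpost.com/2023/03/15/speechmatics-introduces-ursa-a-speech-to-text-system-that-delivers-unprecedented-performance-across-a-diverse-range-of-voices/)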
Whisper
Whisper is an automatic speech recognition (ASR) model from OpenAI, trained on 680,000 hours of multilingual and multitask supervised data collected from the web. In OpenAI's words: "We've trained and are open-sourcing a neural net called Whisper that approaches human level robustness and accuracy on English speech recognition."
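Because the model is open source, it can be run locally. The following is a minimal sketch, assuming the `openai-whisper` Python package and `ffmpeg` are installed. The helper names (`summarize_result`, `transcribe_file`, `demo`), the `base` checkpoint and the audio file name are illustrative choices, not something this page prescribes.

```python
from typing import Any, Dict


def summarize_result(result: Dict[str, Any]) -> str:
    """Format the dict returned by model.transcribe() as '[language] text'."""
    language = result.get("language", "unknown")
    text = result.get("text", "").strip()
    return f"[{language}] {text}"


def transcribe_file(model: Any, path: str, task: str = "transcribe") -> str:
    """Run an already-loaded Whisper model on one audio file.

    task="transcribe" keeps the spoken language;
    task="translate" asks the model for English output instead.
    """
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unsupported task: {task!r}")
    result = model.transcribe(path, task=task)
    return summarize_result(result)


def demo(path: str = "audio.mp3") -> None:
    """Hypothetical entry point; the file name is a placeholder."""
    import whisper  # imported lazily: pip install openai-whisper

    model = whisper.load_model("base")  # a small checkpoint, chosen here for speed
    print(transcribe_file(model, path))
```

Calling `demo()` downloads the checkpoint on first use and prints the detected language followed by the transcript. Larger checkpoints trade speed for accuracy.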
OpenAI describes the model and its API availability as follows: "Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification. The Whisper v2-large model is currently available through our API with the whisper-1 model name. Currently, there is no difference between the open source version of Whisper and the version available through our API. However, through our API, we offer an optimized inference process which makes running Whisper through our API much faster than doing it through other means. For more technical details on Whisper, you can read the paper."
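The hosted version can be reached with a plain HTTPS request. The sketch below uses only the Python standard library to build a multipart POST for the `whisper-1` model against OpenAI's audio transcription endpoint. The function names, the generic `application/octet-stream` part type and the separate build and send steps are assumptions made for illustration. OpenAI's own client library is the more usual route.

```python
import json
import urllib.request
import uuid

API_URL = "https://api.openai.com/v1/audio/transcriptions"


def build_transcription_request(audio: bytes, filename: str, api_key: str,
                                model: str = "whisper-1") -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode()
    parts = [
        delimiter,
        b'Content-Disposition: form-data; name="model"',
        b"",
        model.encode(),
        delimiter,
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",  # assumption: generic binary type
        b"",
        audio,
        delimiter + b"--",  # closing delimiter
        b"",
    ]
    body = b"\r\n".join(parts)
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def transcribe(audio: bytes, filename: str, api_key: str) -> str:
    """Send the request and return the transcript (needs network and a valid key)."""
    request = build_transcription_request(audio, filename, api_key)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["text"]
```

Keeping the request builder separate from the network call makes it possible to inspect the headers and body without spending API credit.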