Difference between revisions of "Kosmos-1"

Revision as of 06:58, 1 June 2023


Kosmos-1 | Microsoft
Multimodal Language Models
Large Language Model (LLM) ... Natural Language Processing (NLP) ... Generation ... Classification ... Understanding ... Translation ... Tools & Services
Assistants ... Agents ... Negotiation ... LangChain
Attention Mechanism ... Transformer ... Generative Pre-trained Transformer (GPT) ... GAN ... BERT
Generative AI ... Conversational AI ... OpenAI's ChatGPT ... Perplexity ... Microsoft's Bing ... You ... Google's Bard ... Baidu's Ernie
Capabilities
- Video/Image ... Vision ... Colorize ... Image/Video Transfer Learning
- End-to-End Speech ... Synthesize Speech ... Speech Recognition ... Music
Development ... AI Pair Programming Tools ... Analytics ... Visualization ... Diagrams for Business Analysis
Prompt Engineering (PE)
Foundation Models (FM)
Singularity ... Sentience ... AGI ... Curious Reasoning ... Emergence ... Moonshots ... Explainable AI ... Automated Learning

Kosmos-1 is a multimodal large language model from Microsoft that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot). It can analyze images for content, solve visual puzzles, perform visual text recognition, and pass visual IQ tests. The model has about 1.6B parameters.
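
The difference between the zero-shot and few-shot settings mentioned above can be illustrated with a minimal Python sketch. It only builds interleaved text-and-image prompts as plain lists; it does not call Kosmos-1 or any real library. The `Image` placeholder, the function names, and the file names are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Image:
    """Placeholder for an image input; `path` is a hypothetical file name."""
    path: str


# A prompt is an interleaved sequence of text strings and images.
Segment = Union[str, Image]


def zero_shot_prompt(instruction: str, query: Image) -> List[Segment]:
    """Instruction plus the query image, with no worked examples."""
    return [instruction, query]


def few_shot_prompt(
    instruction: str,
    examples: List[Tuple[Image, str]],
    query: Image,
) -> List[Segment]:
    """Instruction, then (image, answer) demonstrations, then the query image."""
    prompt: List[Segment] = [instruction]
    for image, answer in examples:
        prompt.extend([image, answer])
    prompt.append(query)
    return prompt


if __name__ == "__main__":
    demos = [(Image("a.png"), "A cat."), (Image("b.png"), "A dog.")]
    print(zero_shot_prompt("Describe the image.", Image("q.png")))
    print(few_shot_prompt("Describe the image.", demos, Image("q.png")))
```

In the zero-shot case the model sees only the instruction and the query; in the few-shot case the same query is preceded by demonstrations it can learn from in context.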

@@ Line 19: / Line 19: @@
 * [[Capabilities]]
 ** [[Video/Image]] ... [[Vision]] ... [[Colorize]] ... [[Image/Video Transfer Learning]]
-** [[End-to-End Speech]] ... [[Synthesize Speech]] ... [[Speech Recognition]]
+** [[End-to-End Speech]] ... [[Synthesize Speech]] ... [[Speech Recognition]] ... [[Music]]
 * [[Development]]  ...[[Development#AI Pair Programming Tools|AI Pair Programming Tools]] ... [[Analytics]]  ... [[Visualization]]  ... [[Diagrams for Business Analysis]]
 * [[Prompt Engineering (PE)]]
