- Amazon Machine Learning Documentation
- Development ... Notebooks ... AI Pair Programming ... Codeless ... Hugging Face ... AIOps/MLOps ... AIaaS/MLaaS
- Bedrock
- CodeWhisperer
- AWS with TensorFlow
- DeepLens - deep learning enabled video camera
- AWS Internet of Things (IoT)
- AmazonML
- Deep Learning (DL) Amazon Machine Image (AMI) - DLAMI
- Alexa Skills Kit - build skills for Alexa, Amazon's cloud-based voice service
- FloydHub - training and deploying your DL models
- On-Demand AWS Tech Talks
- AWS Training and Certification
- Agents ... Robotic Process Automation ... Assistants ... Personal Companions ... Productivity ... Email ... Negotiation ... LangChain
- Amazon Science
- Amazon Q brings generative AI-powered assistance to IT pros and developers (preview) | Amazon
- Amazon and Anthropic deepen their shared commitment to advancing generative AI | Amazon ... using Anthropic's Claude on Amazon Bedrock
Inferentia
ChatGPT
AWS Inferentia is a custom-designed machine learning inference chip developed by Amazon Web Services (AWS) to accelerate deep learning workloads. The chip is specifically optimized for high performance, low latency, and cost-effective inference, which is the process of running trained machine learning models to make predictions or classifications. By using AWS Inferentia, organizations can achieve faster and more cost-effective deployment of machine learning models for a variety of applications, including image and speech recognition, natural language processing, and recommendation engines. Key features and benefits of AWS Inferentia include:
- High Performance: Inferentia delivers high throughput and low latency, making it ideal for real-time applications. It supports multiple machine learning frameworks such as TensorFlow, PyTorch, and Apache MXNet.
- Cost Efficiency: By providing a dedicated hardware solution for inference, Inferentia can reduce the cost of inference operations compared to using general-purpose CPUs or GPUs.
- Compatibility: AWS Inferentia is integrated with Amazon SageMaker, AWS's fully managed machine learning service, and supports models trained on popular frameworks. This makes it easier for developers to deploy their existing models on Inferentia-based instances.
- Scalability: It can be scaled to handle large-scale machine learning workloads, allowing users to deploy multiple models simultaneously or to serve a high volume of inference requests.
- Availability: Inferentia-powered instances, such as the Inf1 instance type, are available on Amazon EC2. These instances are designed to provide optimal performance for inference applications.
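The SageMaker and framework integration described above comes down to a compile-then-load workflow. The sketch below is a minimal, illustrative example (not an official recipe) of compiling a PyTorch model for an Inf1 instance with the AWS Neuron SDK's torch-neuron package; the ResNet-50 model, input shape, and file name are placeholder assumptions.

    import torch
    import torch_neuron  # AWS Neuron SDK; adds the torch.neuron namespace
    from torchvision import models

    # Any trained PyTorch model in eval mode; ResNet-50 is only an example
    model = models.resnet50(pretrained=True)
    model.eval()

    # Example input with the shape the model will see at inference time
    example = torch.zeros([1, 3, 224, 224], dtype=torch.float32)

    # Compile (trace) the model for Inferentia NeuronCores; unsupported
    # operators fall back to the host CPU
    model_neuron = torch.neuron.trace(model, example_inputs=[example])
    model_neuron.save('resnet50_neuron.pt')

    # On an Inf1 instance with the Neuron runtime installed, reload and
    # run the artifact like any TorchScript module
    loaded = torch.jit.load('resnet50_neuron.pt')
    output = loaded(example)
    print(output.shape)

The compiled artifact is an ordinary TorchScript file, so serving it from Amazon SageMaker or from a plain EC2 Inf1 instance uses the same torch.jit.load() call.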
Integrated Components/Technologies
- Textract in the Elastic Stack Architecture
- Comprehend - natural language processing (NLP) service (see the boto3 sketch after this list)
- Kendra - adds natural language search capabilities to websites and applications, letting users ask a question and searching across content repositories for the answer
- Lex - conversational interfaces using voice and text
- SageMaker - build, train, and deploy machine learning models
- Polly - text to speech
- Rekognition - image and video analysis service
- Kinesis - collect, process, and analyze real-time, streaming data
- Lambda - run code without managing servers
- AWS Internet of Things (IoT) Services Overview - ingests device messages, then processes and routes them to AWS endpoints
- MQTT - a lightweight messaging protocol
- AWS IoT Button
- Greengrass - lets connected devices run AWS Lambda functions and keep device data in sync
- Intel® Compute Library for Deep Neural Networks (clDNN) & OpenVINO - deep learning primitives for computer vision
- Simple Queue Service (SQS) - message queuing
- Simple Notification Service (SNS) - pub/sub messaging and mobile notifications
- DynamoDB - NoSQL database
- Simple Storage Service (S3) - object storage
- Amazon Forecast - generates predictions by looking at a historical series of data, known as time series data
- Athena - interactive query service to analyze data in Amazon S3 using standard SQL
- Serverless - run applications and services without thinking about servers
- Glue - a fully managed extract, transform, and load (ETL) service to prepare and load data for analytics
- Crawlers - populate the AWS Glue Data Catalog with tables
- Management Console - manage web services
- Deep Learning (DL) Amazon Machine Image (AMI) - DLAMI
- SoftAP - software-enabled access point
- Ubuntu - operating system
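Many of the services in this list can be chained together with a few SDK calls. As a hedged illustration, the Python sketch below uses boto3 to run Comprehend sentiment analysis on a piece of text, has Polly synthesize the same text to speech, and stores the audio in S3; the region, bucket, and key names are placeholder assumptions.

    import boto3

    REGION = 'us-east-1'          # placeholder region
    BUCKET = 'my-example-bucket'  # placeholder S3 bucket (must already exist)

    comprehend = boto3.client('comprehend', region_name=REGION)
    polly = boto3.client('polly', region_name=REGION)
    s3 = boto3.client('s3', region_name=REGION)

    text = 'AWS Inferentia makes large-scale inference more affordable.'

    # Comprehend: detect the dominant sentiment of the text
    sentiment = comprehend.detect_sentiment(Text=text, LanguageCode='en')
    print(sentiment['Sentiment'], sentiment['SentimentScore'])

    # Polly: synthesize the same text to speech as an MP3 stream
    speech = polly.synthesize_speech(Text=text, OutputFormat='mp3', VoiceId='Joanna')

    # S3: store the resulting audio as an object
    s3.put_object(Bucket=BUCKET, Key='speech/demo.mp3',
                  Body=speech['AudioStream'].read())

The same pattern extends to the other services above; for example, a Lambda function triggered by an SQS message could make these calls without managing any servers.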
Libraries & Frameworks
Training
Business Decision Maker
Data Platform Engineer
Data Scientist
Developer
AWS Summit New York City 2023