Difference between revisions of "Hugging Face"

* [https://huggingface.co/amazon/LightGPT amazon/LightGPT]
* [https://huggingface.co/amazon/LightGPT/blob/main/README.md README.md · amazon/LightGPT]
* [https://huggingface.co/EleutherAI/gpt-j-6b EleutherAI/gpt-j-6b]
* [https://en.wikipedia.org/wiki/GPT-J GPT-J | Wikipedia]
* [https://huggingface.co/blog/gptj-sagemaker Deploy GPT-J 6B for inference using Hugging Face Transformers]
* [https://betterprogramming.pub/fine-tuning-gpt-j-6b-on-google-colab-or-equivalent-desktop-or-server-gpu-b6dc849cb205 Fine-tuning GPT-J 6B on Google Colab or Equivalent Desktop or Server]
  
 

Revision as of 05:18, 15 June 2023



Hugging Face is an American company that develops tools for building applications using machine learning. It is most notable for its Transformers library, built for natural language processing applications, and for its platform that allows users to share machine learning models and datasets. Hugging Face is also a community and platform for artificial intelligence and data science that aims to democratize AI knowledge and the assets used in AI models. The platform allows users to build, train, and deploy state-of-the-art models powered by open-source machine learning. It also provides a place where a broad community of data scientists, researchers, and ML engineers can come together to share ideas, get support, and contribute to open-source projects. - Wikipedia


Hugging Face Community

LightGPT

LightGPT is a language model developed by AWS Contributors. It is based on GPT-J 6B and was instruction fine-tuned on the high-quality, Apache-2.0 licensed OIG-small-chip instruction dataset, which contains ~200K training examples. The model is designed to generate text in response to a given instruction, and it can be deployed to Amazon SageMaker.
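As a rough illustration of what "generating text based on a given instruction" means in practice, the sketch below wraps a user instruction in a prompt before it would be sent to the model. The template text and the `build_prompt` helper are hypothetical and chosen only for illustration; the README.md of amazon/LightGPT is the place to check the prompt format the model actually expects.

```python
# Hypothetical instruction-style prompt template (an assumption made for
# illustration; it is not confirmed to be the template LightGPT was tuned on).
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_prompt(instruction: str) -> str:
    """Wrap a plain-text instruction in the illustrative template."""
    cleaned = instruction.strip()
    if not cleaned:
        raise ValueError("instruction must not be empty")
    return PROMPT_TEMPLATE.format(instruction=cleaned)


if __name__ == "__main__":
    print(build_prompt("Summarize what Hugging Face is in one sentence."))
```

The resulting string would be passed as the text input to whatever inference endpoint hosts the model; the text the model generates after the final marker is the response.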

GPT-J 6B is a transformer model trained using Ben Wang's Mesh Transformer JAX. "GPT-J" refers to the class of model, while "6B" represents the number of trainable parameters. The model consists of 28 layers with a model dimension of 4096 and a feedforward dimension of 16384. The model dimension is split into 16 heads, each with a dimension of 256. Rotary Position Embedding (RoPE) is applied to 64 dimensions of each head. The model is trained with a tokenization vocabulary of 50257, using the same set of BPEs as GPT-2/GPT-3.

GPT-J learns an inner representation of the English language that can be used to extract features useful for downstream tasks. However, the model is best at what it was pretrained for: generating text from a prompt. GPT-J-6B is not intended for deployment without fine-tuning, supervision, and/or moderation. It is not a product in itself and cannot be used for human-facing interactions. For example, the model may generate harmful or offensive text. Please evaluate the risks associated with your particular use case.
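The architecture figures quoted above can be checked against the "6B" in the name with some simple arithmetic. The sketch below is an approximation under stated assumptions, not an exact count: it counts weight matrices only (four attention projections and two feedforward matrices per layer), ignores biases and layer-norm parameters, and assumes separate input-embedding and output-projection matrices of size vocabulary × model dimension. The function and variable names are made up for this example.

```python
# Figures taken from the description of GPT-J 6B above.
N_LAYERS = 28
D_MODEL = 4096
D_FF = 16384
N_HEADS = 16
HEAD_DIM = 256
ROTARY_DIM = 64
VOCAB_SIZE = 50257


def approx_param_count() -> int:
    """Estimate trainable parameters, counting weight matrices only.

    Assumptions (not stated in the text above): biases and layer norms are
    ignored, and the input embedding and output projection are two separate
    VOCAB_SIZE x D_MODEL matrices.
    """
    attention = 4 * D_MODEL * D_MODEL      # query, key, value, output projections
    feedforward = 2 * D_MODEL * D_FF       # up-projection and down-projection
    per_layer = attention + feedforward
    embeddings = 2 * VOCAB_SIZE * D_MODEL  # input embedding + output projection
    return N_LAYERS * per_layer + embeddings


if __name__ == "__main__":
    # The heads must tile the model dimension exactly: 16 * 256 = 4096.
    assert N_HEADS * HEAD_DIM == D_MODEL
    total = approx_param_count()
    print(f"approximate parameters: {total:,} (~{total / 1e9:.2f}B)")
    print(f"fraction of each head using RoPE: {ROTARY_DIM / HEAD_DIM:.2f}")
```

Under these assumptions the estimate comes out at roughly 6.05 billion parameters, consistent with the "6B" label, and RoPE covers one quarter of each head's dimensions.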