Hugging Face

Revision as of 05:18, 15 June 2023 by BPeat (talk | contribs) (LightGPT)



Hugging Face is an American company that develops tools for building applications using machine learning. It is best known for its transformers library, built for natural language processing applications, and for its platform that allows users to share machine learning models and datasets. Hugging Face is both a community and a platform for artificial intelligence and data science that aims to democratize AI knowledge and the assets used in AI models. The platform allows users to build, train, and deploy state-of-the-art models powered by open-source machine learning. It also provides a place where a broad community of data scientists, researchers, and ML engineers can share ideas, get support, and contribute to open-source projects. - Wikipedia


Hugging Face Community

LightGPT

LightGPT is a language model developed by AWS Contributors. It is based on GPT-J 6B and was instruction fine-tuned on the high-quality, Apache-2.0 licensed OIG-small-chip2 instruction dataset, which contains ~200K training examples. The model is designed to generate text in response to a given instruction, and it can be deployed to Amazon SageMaker.

GPT-J 6B is a transformer model trained using Ben Wang's Mesh Transformer JAX. "GPT-J" refers to the class of model, while "6B" refers to the number of trainable parameters (roughly 6 billion). The model consists of 28 layers with a model dimension of 4096 and a feedforward dimension of 16384. The model dimension is split into 16 heads, each with a dimension of 256. Rotary Position Embedding (RoPE) is applied to 64 dimensions of each head. The model is trained with a tokenization vocabulary of 50257, using the same set of BPEs as GPT-2/GPT-3. GPT-J learns an inner representation of the English language that can be used to extract features useful for downstream tasks. The model is best, however, at what it was pretrained for: generating text from a prompt. GPT-J-6B is not intended for deployment without fine-tuning, supervision, and/or moderation. It is not a product in itself and cannot be used for human-facing interactions. For example, the model may generate harmful or offensive text. Please evaluate the risks associated with your particular use case.
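As a rough sanity check on the "6B" figure, the architecture numbers quoted above can be plugged into a back-of-the-envelope parameter count. The sketch below is only an estimate: it counts the large weight matrices, ignores biases and layer-norm parameters, and assumes the input embedding and output projection are separate matrices of size vocabulary × model dimension. The function name `estimate_params` is made up for illustration and is not part of any library.

```python
# Architecture figures quoted in the text above.
N_LAYERS = 28
D_MODEL = 4096
D_FF = 16384
N_HEADS = 16
HEAD_DIM = 256
ROTARY_DIM = 64
VOCAB = 50257


def estimate_params(n_layers: int, d_model: int, d_ff: int, vocab: int) -> int:
    """Rough weight-matrix count (hypothetical helper; biases and layer norms ignored)."""
    attention = 4 * d_model * d_model   # query, key, value and output projections
    feedforward = 2 * d_model * d_ff    # up-projection and down-projection
    embeddings = 2 * vocab * d_model    # assumption: untied input and output matrices
    return n_layers * (attention + feedforward) + embeddings


# The heads partition the model dimension exactly: 16 * 256 == 4096.
heads_cover_model_dim = N_HEADS * HEAD_DIM == D_MODEL

# Per head, RoPE touches 64 dimensions; the rest carry no rotary position signal.
non_rotary_dims = HEAD_DIM - ROTARY_DIM

total = estimate_params(N_LAYERS, D_MODEL, D_FF, VOCAB)

if __name__ == "__main__":
    print(f"heads cover model dimension: {heads_cover_model_dim}")
    print(f"non-rotary dimensions per head: {non_rotary_dims}")
    print(f"estimated parameters: {total / 1e9:.2f} billion")
```

Under these assumptions the estimate lands at about 6 billion, which is consistent with the model's name. The exact published count differs slightly because this sketch leaves out the smaller terms.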