Game Development with Generative AI

Reinforcement Learning from Human Feedback: From Zero to ChatGPT

In this talk, we will cover the basics of Reinforcement Learning from Human Feedback (RLHF) and how this technology is being used to enable state-of-the-art ML tools like ChatGPT. Most of the talk will be an overview of the interconnected ML models involved, covering the basics of Natural Language Processing (NLP) and Reinforcement Learning (RL) that one needs to understand how RLHF is used on large language models. It will conclude with open questions in RLHF.

Resources: RLHF blog post (https://huggingface.co/blog/rlhf), The Deep RL Course, slides from this talk
Nathan's Twitter: https://twitter.com/natolambert
Thomas's Twitter: https://twitter.com/thomassimonini

Nathan Lambert is a Research Scientist at Hugging Face. He received his PhD from the University of California, Berkeley, working at the intersection of machine learning and robotics. He was advised by Professor Kristofer Pister in the Berkeley Autonomous Microsystems Lab and Roberto Calandra at Meta AI Research. He was lucky to intern at Facebook AI and DeepMind during his PhD. Nathan was awarded the UC Berkeley EECS Demetri Angelakos Memorial Achievement Award for Altruism for his efforts to better community norms.

But How Does ChatGPT Actually Work?

In this video you will learn how ChatGPT works. Understanding this helps you use the model more effectively, evaluate its outputs more critically, and stay informed about the latest developments in the field, so you are better prepared to take advantage of new opportunities. ChatGPT is a Natural Language Processing (NLP) model built on the Generative Pre-trained Transformer (GPT) architecture developed by OpenAI. NLP and GPT are the two big terms we will focus on in this video. On top of that, you will get a basic understanding of common Machine Learning techniques, such as Supervised Learning and Reinforcement Learning (RL), which were used to make ChatGPT as good as it is.
