
Large Language Models (LLMs) continue to advance, contributing to economic and societal transformations. Popular LLMs such as ChatGPT, developed by OpenAI, are natural language processing models that can generate meaningful text, answer questions, summarize long passages, and write code and emails, among other tasks. Reinforcement Learning (RL), a feedback-driven Machine Learning method based on a reward system, is used to fine-tune LLMs. ChatGPT uses Reinforcement Learning from Human Feedback (RLHF) to align its responses with human preferences and reduce biases. Sebastian Raschka, an AI and ML researcher, shared reasons why Reinforcement Learning is used in fine-tuning instead of supervised learning.