Excellent talk from Karpathy defining the moment in LLM training, applications, tools/frameworks, and use cases.
LLM Training pipeline - Karpathy explains the significant engineering effort involved in building LLMs. The numbers in the slide below speak for themselves.
![](https://static.wixstatic.com/media/1dc0ca_3556e69f7cb44e4d974a76f8681f15e8~mv2.jpg/v1/fill/w_980,h_522,al_c,q_85,usm_0.66_1.00_0.01,enc_auto/1dc0ca_3556e69f7cb44e4d974a76f8681f15e8~mv2.jpg)
Reward Modeling - A small LLM training exercise in itself: a large set of prompts is assembled, the LLM produces responses, humans rate each response, and a reward model is trained to predict those ratings.
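For intuition, here is a minimal PyTorch sketch of the pairwise ranking loss often used to train reward models on human preference comparisons; the reward values below are made-up toy numbers, not real model outputs:

```python
import torch
import torch.nn.functional as F

def reward_ranking_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    """Pairwise ranking loss: push the scalar reward of the human-preferred
    response above the reward of the rejected response."""
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy values: scalar rewards the model assigned to two responses per prompt.
r_chosen = torch.tensor([1.2, 0.3, 0.9])
r_rejected = torch.tensor([0.4, 0.5, -0.1])
print(reward_ranking_loss(r_chosen, r_rejected))  # lower is better
```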
RLHF - Reinforcement Learning from Human Feedback is the magic that turns a base model into a human-like assistant with impressive abilities in language tasks and multi-modality. The reward model created earlier scores each response the LLM produces, and training pushes the model toward the higher-rewarded responses.
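To make "pushing toward higher-rewarded responses" concrete, here is a toy REINFORCE update over three canned responses with assumed reward-model scores. Real RLHF uses PPO over token sequences with a KL penalty to the base model, so treat this purely as a sketch of the update direction:

```python
import torch

# Toy "policy" over three canned responses; rewards are assumed reward-model scores.
logits = torch.zeros(3, requires_grad=True)              # policy parameters
rewards = torch.tensor([0.1, 0.9, 0.4])                  # assumed reward-model scores
opt = torch.optim.SGD([logits], lr=0.5)

for _ in range(200):
    dist = torch.distributions.Categorical(logits=logits)
    response = dist.sample()                              # sample a response
    loss = -dist.log_prob(response) * rewards[response]   # reinforce it by its reward
    opt.zero_grad(); loss.backward(); opt.step()

print(logits.softmax(dim=0))  # mass typically concentrates on the 0.9-reward response
```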
System 1 vs System 2 thinking - An RLHF-trained model is still raw power: it imitates what it saw during training (System 1 thinking). However, most tasks we encounter in the world need planning, reflection, correction, and reformulation (System 2 thinking). How do we achieve this?
The answer lies in prompting - instructing the LLM to produce its responses in a certain way. Techniques range from simple prompting to more involved augmented prompting:
Chain of Thought - Ask the model to reason through the problem step by step before giving the answer
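In its simplest form this is just a suffix on the prompt. Here, `llm` is a hypothetical prompt-to-string completion call, not a real API:

```python
# `llm` is a hypothetical prompt -> str completion call.
question = "A jug holds 4 liters. How many jugs do I need for 18 liters?"
cot_prompt = f"{question}\nLet's think step by step."
# answer = llm(cot_prompt)  # the model now spells out its reasoning before answering
```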
Self-consistency - Ensemble multiple attempts and choose the best response
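A common way to do this is to sample several chains of thought and majority-vote the final answers. A sketch, again assuming a hypothetical `llm` callable:

```python
from collections import Counter

def self_consistency(question: str, llm, n: int = 10) -> str:
    """Sample n chain-of-thought completions and return the majority answer.
    `llm` is a hypothetical prompt -> str callable."""
    finals = []
    for _ in range(n):
        out = llm(f"{question}\nLet's think step by step, then end with 'Answer: <x>'.")
        finals.append(out.rsplit("Answer:", 1)[-1].strip())
    return Counter(finals).most_common(1)[0][0]
```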
Ask for reflection - A second-round evaluation that prompts the LLM to double-check its answer
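A sketch of the two-round pattern, with the same hypothetical `llm` callable:

```python
def answer_with_reflection(question: str, llm) -> str:
    """Round 1 drafts an answer; round 2 asks the model to double-check it.
    `llm` is a hypothetical prompt -> str callable."""
    draft = llm(question)
    return llm(
        f"Question: {question}\nDraft answer: {draft}\n"
        "Check the draft for mistakes and give a corrected final answer."
    )
```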
ReAct (Thought/Action/Observation) - A prompting pattern that mirrors the sequence of human thinking: reason about what to do, take an action, observe the result, and repeat
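A sketch of what such a prompt can look like; the tool names `search` and `calculate` are placeholders to be wired up by the surrounding harness:

```python
# ReAct-style prompt template; `search`/`calculate` are placeholder tool names.
REACT_TEMPLATE = """Answer the question by interleaving these steps:
Thought: reason about what to do next
Action: search[query] or calculate[expression]
Observation: the result of the action (filled in by the harness)
(repeat Thought/Action/Observation as needed)
Final Answer: the answer

Question: {question}"""

print(REACT_TEMPLATE.format(question="Who was born first, Euler or Gauss?"))
```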
AutoGPT - The LLM itself breaks a prompt down into multiple subtasks and works through them
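A heavily simplified decomposition loop, again with a hypothetical `llm` callable (the real AutoGPT adds memory, tool use, and self-critique on top of this):

```python
def decompose_and_run(goal: str, llm) -> list[str]:
    """Ask the model for a numbered plan, then execute each subtask in turn.
    `llm` is a hypothetical prompt -> str callable."""
    plan = llm(f"Break this goal into short numbered subtasks, one per line:\n{goal}")
    subtasks = [ln.split(".", 1)[-1].strip() for ln in plan.splitlines() if ln.strip()]
    return [llm(f"Goal: {goal}\nSubtask: {t}\nComplete this subtask.") for t in subtasks]
```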
Condition for good performance - Explicitly encourage the LLM and assign it a competent persona so it produces its best answer (e.g., "You are a leading expert on this topic; take your time and answer correctly")
Retrieval Augmented Generation - Prompts are enriched with additional documents/context, which grounds the LLM in that context and yields a better response
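A minimal RAG sketch with cosine-similarity retrieval; `embed` (text to vector) and `llm` are hypothetical callables standing in for an embedding model and a chat model:

```python
import numpy as np

def rag_answer(question: str, docs: list[str], embed, llm, k: int = 3) -> str:
    """Embed the docs and the question, take the top-k docs by cosine
    similarity, and stuff them into the prompt as context.
    `embed` (str -> np.ndarray) and `llm` (str -> str) are hypothetical."""
    doc_vecs = np.stack([embed(d) for d in docs])
    q = embed(question)
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    context = "\n\n".join(docs[i] for i in np.argsort(-sims)[:k])
    return llm(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
```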
LLM plugins - Let the model talk to external APIs (e.g., a weather API) and tools (e.g., a calculator)
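A toy dispatch loop showing the idea: the model asks for a tool, the harness runs it and feeds the result back. The tool names and the `CALL` convention are made up for illustration:

```python
import re

# Toy tools; the weather function stands in for a real weather API call.
TOOLS = {
    "calculator": lambda expr: str(eval(expr)),  # toy only: never eval untrusted input
    "weather": lambda city: f"Sunny in {city}",
}

def run_with_tools(prompt: str, llm) -> str:
    """If the model replies `CALL tool(args)`, run the tool and re-prompt.
    `llm` is a hypothetical prompt -> str callable; `CALL` is a made-up convention."""
    reply = llm(prompt + "\nYou may reply with CALL tool(args) to use a tool.")
    m = re.match(r"CALL (\w+)\((.*)\)", reply.strip())
    if m and m.group(1) in TOOLS:
        result = TOOLS[m.group(1)](m.group(2))
        reply = llm(f"{prompt}\nTool result: {result}\nNow answer the question.")
    return reply
```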
Finally, finetuning - traditionally a highly involved exercise, but adapting a model to a specific context with additional training data has become much more accessible with new techniques such as PEFT (Parameter-Efficient Fine-Tuning)
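LoRA is the best-known PEFT technique: freeze the pretrained weights and train only a low-rank update. A minimal sketch in PyTorch:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Freeze a pretrained linear layer and learn a low-rank update B @ A,
    so only r * (in_features + out_features) parameters are trained."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                  # pretrained weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # starts as a no-op
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

layer = LoRALinear(nn.Linear(512, 512))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 8192 trainable vs 262656 frozen parameters
```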
Karpathy closes with critical limitations and then important recommendations
![](https://static.wixstatic.com/media/1dc0ca_9c3c37d2a3bf4776a4e91c3cf9d76375~mv2.jpeg/v1/fill/w_980,h_505,al_c,q_85,usm_0.66_1.00_0.01,enc_auto/1dc0ca_9c3c37d2a3bf4776a4e91c3cf9d76375~mv2.jpeg)