
ChatGPT, OpenAI's Newest AI Model, Engages In Intelligent Dialogue - Free To Use In Preview

OpenAI has released a new GPT model variant for dialogue called ChatGPT, currently available free to use while in research preview.
Created on November 30|Last edited on November 30
Today, OpenAI revealed a new model in the GPT lineup called ChatGPT, currently in research preview. The model is designed specifically for dialogue with the user: it can follow instructions or prompts and provide detailed responses. Luckily for us, it's free to use while in preview.


How ChatGPT works

ChatGPT is based on InstructGPT, a model that OpenAI released in January, almost a year ago, which was made to follow instructions better than basic GPT-3. ChatGPT stands apart from InstructGPT in its stronger ability to sustain an ongoing dialogue, rather than focusing only on responding to single prompts.
The training pipeline for ChatGPT is nearly identical to its predecessor's, and uses reinforcement learning from human feedback (RLHF):
  • The first step is collecting human data, which in the case of ChatGPT is dialogue between two participants - one acts as the user, and the other as the AI assistant. Responses from a real AI model were used to help the human participants compose their own responses. The collected data is then used to fine-tune a GPT-3.5 model.
  • The next step lets the fine-tuned model produce several outputs for a given prompt, and a human labeler ranks the responses from best to worst. This ranking data is then used to train a reward model, which automates the scoring needed in the final training step.
  • The final step further fine-tunes the GPT-3.5 model (ChatGPT) against that reward model using Proximal Policy Optimization, a reinforcement learning algorithm that OpenAI uses as a standard.
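To make the second and third steps a little more concrete, here is a minimal sketch in plain Python. It is not OpenAI's code: it assumes the reward model is trained with a pairwise logistic ranking loss (a common choice for learning from rankings) and shows the clipped surrogate term from the Proximal Policy Optimization paper for a single sample. The function names, the example scores, and the `epsilon=0.2` default are all illustrative assumptions.

```python
import math
from itertools import combinations


def pairwise_ranking_loss(scores_best_to_worst):
    """Average pairwise logistic loss over all (better, worse) pairs.

    `scores_best_to_worst` holds hypothetical reward-model scores for the
    responses to one prompt, in the order the human labeler ranked them.
    The loss is small when higher-ranked responses get higher scores.
    """
    # combinations() preserves input order, so the first item of each
    # pair is always the response the labeler ranked higher.
    pairs = list(combinations(scores_best_to_worst, 2))
    if not pairs:
        raise ValueError("need at least two ranked responses")
    total = 0.0
    for better, worse in pairs:
        # -log(sigmoid(better - worse)) == log(1 + exp(-(better - worse)))
        total += math.log1p(math.exp(-(better - worse)))
    return total / len(pairs)


def ppo_clipped_term(ratio, advantage, epsilon=0.2):
    """PPO clipped surrogate objective for one sample (to be maximized).

    `ratio` is new_policy_prob / old_policy_prob for the sampled output;
    `advantage` estimates how much better than expected that output was
    (in RLHF it is derived from the reward model's score).
    """
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # Scores that agree with the human ranking give a lower loss
    # than scores that contradict it.
    print(pairwise_ranking_loss([2.0, 1.0, 0.0]))
    print(pairwise_ranking_loss([0.0, 1.0, 2.0]))
    # A large policy change on a good sample is capped at (1 + epsilon).
    print(ppo_clipped_term(ratio=1.5, advantage=1.0))
```

The clipping is what keeps each update conservative: once the new policy has moved more than `epsilon` away from the old one on a sample, pushing further in the same direction earns no extra objective.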


ChatGPT's limitations

ChatGPT gives lengthy responses to any of your questions or comments - however, it still has many limitations that are common to NLP models.
Like other NLP models, ChatGPT can produce toxic, incorrect, and harmful content. Its knowledge base is limited to its training data, and it doesn't check its answers against any authoritative source of facts. It tries to steer clear of difficult subjects to avoid these problems, but harmful outputs can still occur. Because of this, ChatGPT's outputs are restricted by OpenAI's Moderation API.
ChatGPT's writing style is also informed by its training data, and it tends to write long-winded responses to everything, often repeating or re-using certain phrasing. It does not ask clarifying questions; instead, it guesses at the meaning of ambiguous phrases.


Use ChatGPT for yourself

While it's in research preview, ChatGPT is free to use for anyone with an OpenAI account. While using it, users can provide feedback on the responses they receive, which will help guide ChatGPT's improvement.
Head to this URL to try it out: https://chat.openai.com/chat
The release blog post also has examples if you just want to take a quick look.
Here's a conversation I had with ChatGPT about making an arcade racing video game:

Tags: ML News