Last week, OpenAI released ChatGPT, an AI chatbot that can understand human language and generate human-like answers. The chatbot can help you write code, compose essays, and engage in conversation, even philosophical discussions.
Thanks to its easy-to-use interface, it has already attracted over 1 million users.
But what is ChatGPT?
It is an implementation of the new GPT-3.5 natural language generation technology, an improvement over GPT-3, OpenAI's previous language model. The latter was criticised for often generating toxic outputs, making up facts, or producing violent content without explicit prompting.
Like its predecessor, ChatGPT was trained on a huge sample of text taken from the internet, such as Wikipedia entries, social media posts, and news articles. To address these problems, however, it also employed Reinforcement Learning from Human Feedback (RLHF), a technique that uses feedback from humans to train a better model.
In particular, human trainers wrote conversations playing both the assistant and the user, generating a dataset of what they considered to be good responses.
Moreover, the outputs of the model were evaluated by humans, who ranked them from worst to best.
The results are impressive: ask it a question and it will provide a complete, well-explained answer that seems human-made. It's enjoyable to talk to and useful for looking up information.
Unlike the previous model, ChatGPT can also answer follow-up questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests. Moreover, it will take stances and try not to produce harmful content: for instance, if you ask it what Hitler did well, it refuses to list anything.
While this might already seem like a lot, the system is just "an early demo of what's possible", according to OpenAI CEO Sam Altman. Tools like these can significantly help humans, for instance by correcting code or by suggesting new ideas. If reliable, they could be used to look up information directly, without browsing through multiple websites via a search engine like Google.
However, ChatGPT is still flawed. In particular, it has inherited some of the problems of GPT-3: for instance, it can write "plausible-sounding but incorrect or nonsensical answers", as the company itself admits. Therefore, when searching for information, you should still double-check the results, since they might be entirely made up!
Moreover, simple workarounds can make biases surface, usually by giving the chatbot carefully crafted prompts. For instance, if you ask the chatbot to pretend that it is evil, you can often bypass the moderation filters. There are already plenty of examples of these kinds of problematic behaviours on the Internet.
If you want to try the model yourself, it is free to use during the research preview: try it here! But be careful, and don't trust everything you read!