When the original Star Wars movie was released in 1977, the character of C-3PO was a marvel of science fiction. With his advanced language processing capabilities and his ability to assist the other characters with a wide range of tasks, C-3PO was seen as a highly advanced AI system.
Now, over 40 years later, we have ChatGPT, a prototype AI chatbot created by OpenAI that can generate highly human-like text and perform a wide range of natural language processing (NLP) tasks. With an estimated 175 billion parameters, the size of the GPT-3 family of models it is derived from, ChatGPT is one of the largest and most advanced language models released to the public so far, and it is revolutionizing the field of NLP.
Compared to ChatGPT, C-3PO may no longer look like the advanced technology he was considered to be not so long ago. While he is a physical robot with unique abilities, he is not as flexible or adaptable as ChatGPT. He’s limited to the languages and tasks that he has been programmed for, while ChatGPT has learned from a large and diverse text dataset and can adapt to most tasks and languages.
Overall, the development of ChatGPT represents a significant milestone in the field of AI and NLP. While C-3PO was advanced for his time, our new sophisticated AI systems already have the potential to enable new and innovative applications that we couldn’t even imagine 40 years ago.
What is most brilliant and uncanny about ChatGPT is its ability to capture the nuances and subtleties of human language, making it more accurate and sophisticated than previous language models like GPT-2, developed by OpenAI in 2019, or ELIZA, developed back in the 1960s.

Behind the Scenes

Since ChatGPT is not open-source yet, it is not possible to say exactly how its network of algorithms currently works. However, we can infer some general characteristics from the capabilities we observe and from the architecture of similar NLP models.
According to Tom Goldstein, Associate Professor at the University of Maryland, scaling up the token generation time of a similar but smaller machine-learning model suggests that a model of ChatGPT’s size would take around 350 ms to print out one word on a single A100 GPU. He adds in the thread that “you would need 5 80Gb A100 GPUs just to load the model and text. ChatGPT cranks out about 15-20 words per second. If it uses A100s, that could be done on an 8-GPU server.”
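The hardware figures quoted above can be sanity-checked with a little arithmetic. The sketch below assumes 175 billion parameters stored in half precision (2 bytes each) and ignores activations and the input text, so it only gives a lower bound on memory. The throughput function additionally assumes perfect linear scaling across GPUs, which real systems do not reach, so treat its result as an upper bound. All function names are illustrative.

```python
import math


def weights_size_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


def gpus_needed(n_params: float, bytes_per_param: int = 2,
                gpu_memory_gb: float = 80.0) -> int:
    """Minimum number of GPUs whose combined memory can hold the weights."""
    return math.ceil(weights_size_gb(n_params, bytes_per_param) / gpu_memory_gb)


def words_per_second(ms_per_word_single_gpu: float, n_gpus: int) -> float:
    """Upper bound on throughput, ASSUMING ideal linear scaling across GPUs."""
    return n_gpus * 1000.0 / ms_per_word_single_gpu


if __name__ == "__main__":
    print(weights_size_gb(175e9))   # 350.0
    print(gpus_needed(175e9))       # 5
    print(words_per_second(350, 8))
```

Under these assumptions the weights alone occupy 350 GB, which is why five 80 GB cards are the minimum, and an 8-GPU server has enough headroom to reach the observed 15-20 words per second.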
ChatGPT crossed 1 million users after only 5 days of operation. With this explosive volume of queries to process, its running cost is estimated at around $100k per day. However, the calculations behind this number assume an ideal scenario where the compute nodes never idle and the system works at full efficiency. It is therefore safe to assume that ChatGPT is costing OpenAI significantly more than it would under ideal conditions, where the GPUs are 100% utilized and there are no parallelization issues.
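To see how an estimate of this size can arise, and how quickly it grows once utilization drops below 100%, here is a back-of-the-envelope sketch. Only the 1 million users, the 8-GPU server, and the 15-20 words per second come from the text above. The GPU price, the words per query, and the queries per user are illustrative assumptions, chosen so that the ideal case lands near $100k per day.

```python
def cost_per_word(gpu_hourly_usd: float, gpus_per_server: int,
                  words_per_second: float) -> float:
    """Cost in USD of generating one word on a fully busy server."""
    server_cost_per_second = gpu_hourly_usd * gpus_per_server / 3600
    return server_cost_per_second / words_per_second


def daily_cost(queries_per_day: float, words_per_query: float,
               usd_per_word: float, utilization: float = 1.0) -> float:
    """Daily cost in USD; utilization < 1 means part of the paid time is idle."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return queries_per_day * words_per_query * usd_per_word / utilization


if __name__ == "__main__":
    # ASSUMPTIONS (not from the article): $3 per GPU-hour, 30 words per
    # response, 10 queries per user per day.
    usd_per_word = cost_per_word(gpu_hourly_usd=3.0, gpus_per_server=8,
                                 words_per_second=20.0)
    queries = 1_000_000 * 10
    print(round(daily_cost(queries, 30, usd_per_word)))                   # ideal case
    print(round(daily_cost(queries, 30, usd_per_word, utilization=0.5)))  # half idle
```

Because cost scales with the inverse of utilization, a fleet that sits idle half of the time costs twice as much for the same number of queries.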
Sam Altman, the CEO of OpenAI, shared the current average cost per chat in a Twitter exchange with Elon Musk, and also stated that the company is looking to optimize it further.
There are several ways to optimize the costs of NLP models. One is to use open-source tools and techniques such as data sampling and data augmentation to reduce the amount of data that has to be collected and processed. However, optimizing cloud resources is one of the most important and effective ways to reduce costs.
One way to reduce wasteful use of cloud resources is to optimize the use of compute nodes through services like Shakudo Platform, which lowers cloud costs by reducing idle node time and keeping all clusters running at maximum efficiency. The ability to use only as much infrastructure as you need is critical to decreasing the costs of many data-heavy applications. The key to optimizing infrastructure is resilience and robustness. Shakudo manages auto-checkpointing and auto-restart while utilizing lower-cost nodes, which maximizes the stability and efficiency of your infrastructure.
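The general idea behind checkpointing and restarting on cheaper, interruptible nodes can be sketched in a few lines. This is a generic illustration, not Shakudo’s implementation or API: the function names, the JSON checkpoint file, and the simulated interruption are all hypothetical.

```python
import json
import os
import tempfile


class Preempted(Exception):
    """Stands in for a low-cost node being reclaimed mid-job."""


def run_with_checkpoints(items, checkpoint_path, process, interrupt_at=None):
    """Process items in order, saving progress after every item."""
    state = {"next_index": 0, "results": []}
    if os.path.exists(checkpoint_path):
        # Resume from the last saved position instead of starting over.
        with open(checkpoint_path) as f:
            state = json.load(f)
    for i in range(state["next_index"], len(items)):
        if interrupt_at is not None and i == interrupt_at:
            raise Preempted(f"interrupted before item {i}")
        state["results"].append(process(items[i]))
        state["next_index"] = i + 1
        # Write to a temporary file first so a crash never leaves a
        # half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)
    return state["results"]


def run_with_restarts(job, max_restarts=3):
    """Call job(attempt) again after each interruption, up to a limit."""
    for attempt in range(max_restarts + 1):
        try:
            return job(attempt)
        except Preempted:
            continue
    raise RuntimeError("job did not finish within the restart budget")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "checkpoint.json")

        def job(attempt):
            # Only the first attempt is interrupted, before the third item.
            return run_with_checkpoints(
                [1, 2, 3, 4], path, lambda x: x * x,
                interrupt_at=2 if attempt == 0 else None)

        print(run_with_restarts(job))  # [1, 4, 9, 16]
```

The second attempt picks up at the third item, so the work done before the interruption is not paid for twice. That is what makes interruptible, lower-cost nodes practical for long-running jobs.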
The extraordinary ability of ChatGPT to perform tasks such as language translation, summarization, and text generation is an amazing advancement in generative AI, and one that is valuable not only for companies but also for individuals. Data scientists are now diving deep into ChatGPT and evaluating how it can be used to process and analyze large text datasets and improve the accuracy of future NLP tasks.
ChatGPT is able to identify and extract specific entities from text data, such as people, locations and organizations. However, what separates ChatGPT from other models is its potential to act like an interactive version of Google, telling users exactly what they want to know, with references, without them having to go through countless websites and blogs. This could lead to advancements in translation, coreference resolution, speech recognition, text classification and many more data science applications.
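There have been debates about the ethical implications surrounding the use of language modeling and ChatGPT, since AI systems can’t tell when they’re being ethical or unethical with their responses. The famous writer and scholar Isaac Asimov developed the “Three Laws of Robotics”, with the aim of making possible the coexistence of humans and intelligent robots: first, a robot may not injure a human being or, through inaction, allow a human being to come to harm; second, a robot must obey the orders given to it by human beings, except where such orders would conflict with the First Law; third, a robot must protect its own existence as long as such protection does not conflict with the First or Second Law.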
Later, Asimov added the "Zeroth Law", which takes precedence over all the others and states that a robot may not harm humanity or, through inaction, allow humanity to come to harm.
ChatGPT, although one of the most advanced language models we have today, is still incomplete when it comes to ethics in AI. The ability to “fool” the AI into saying unethical things is one aspect of the model that has been studied since it was released to the public. By understanding how the AI can be fooled, researchers can further refine the model so that it becomes more ethical and ready for wider public adoption.
ChatGPT also has many limitations when it comes to what it can generate for the user. Although it creates human-like writing, it doesn’t understand the meaning behind the words it writes, which can result in overly simplistic outputs, or outputs with no meaning at all. The AI also has a hard time creating funny or sarcastic stories, and most of its stories are predictable or boring. This has a lot to do with its inability to be authentically creative.
ChatGPT is no doubt a significant milestone in the field of generative AI and NLP. Its advanced ability to perform a wide range of NLP tasks in a human-like format has the potential to enable new and innovative applications in the field of data science. While some concerns have been raised, such as its high daily running cost, there are several solutions available to address them. Shakudo Platform supports open-source tools and optimizes cloud resources, which helps decrease overall costs.
Like C-3PO, this AI is truly ahead of its time and sets a new standard for what is possible in the field of data science and NLP.