ChatGPT is a leap forward but not the massive breakthrough everyone thinks it is, according to Elerian AI
Since the ChatGPT prototype launch in November, there’s been massive interest in its application across industries, including customer service. While OpenAI’s ChatGPT does seem to take a massive leap forward and continues to improve, Elerian AI CTO Alfredo Gemma disagrees that it’s the breakthrough everyone thinks it is, although he considers it an impressive milestone on the road to AGI.
Artificial general intelligence (AGI), the ability of an intelligent agent to understand and learn any intellectual task that a human being can, will only work if the underlying Deep Learning (DL) architecture can generalise effectively.
Says Gemma, “Large Language Models (LLMs), such as the one powering ChatGPT, remember everything up to the point at which their training stopped. The question then becomes whether the system is capable after that of the human-like generalisation needed to achieve an AGI, and the likely answer is no.”
Human intelligence can be considered a combination of specialised intelligences (linguistic, emotional, logical-mathematical, spatial, bodily-kinesthetic, musical, etc.), each leveraging memory in a particular way. The ability to generalise our knowledge is a fundamental aspect of human intelligence: we can extend and apply the knowledge acquired in a specific context to other contexts when we identify similarities. Generalisation is only possible if one can identify the context, and identifying the context depends on memory. Memory is therefore a requirement for intelligence.
To generalise, an intelligent system must be able to instantly repurpose its existing cognitive building blocks to perceive completely new objects or patterns without having to learn them, that is, without having to create new building blocks specifically for them.
In the end, the real problem with LLMs like ChatGPT is structural and stems from the underlying architecture of the neural networks: the Deep Learning (DL) architecture. The biggest problem with DL is its inherent inability to generalise effectively. Without generalisation, edge cases are an insurmountable problem, something the autonomous vehicle industry found out the hard way: after wasting more than $100 billion betting on DL, it has yet to produce a fully self-driving car.
Says Gemma, “Some in the AI community insist that DL’s failure to generalise can be circumvented by scaling (like it is done when LLMs are created), but this is not true. Scaling is precisely what researchers in the self-driving car sector have been doing, and it does not work. The cost and the many long years it would take to accumulate enough data become untenable because corner cases are infinite.”
ChatGPT has often been asked to write articles on various topics. In almost all these cases the result was decently well written, but it needed to be corrected. Every version of the story, even after repeated prompting, contained errors that the chatbot couldn’t identify when engaged in conversation. ChatGPT is also prone to fabricating answers if its knowledge doesn’t cover your request, even when you’re not asking it to write an article.
Concludes Gemma, “Bottom line, cracking generalised perception and the DL architecture needed to achieve that is still an open problem and would be a monumental achievement. For now, ChatGPT is exciting but not exactly a massive game-changer.”