Open Challenges for AI Research and Engineering With Chip Huyen
I go through Chip Huyen's article on LLMs and add my thoughts on the future of AI and LLMs.
Chip Huyen is a writer and computer scientist, known for co-founding Claypot AI, a platform for real-time machine learning. She has previously worked on machine learning tools at NVIDIA, Snorkel AI, Netflix, and Primer. Chip's article presents a list of open challenges for AI research and engineering. It addresses key topics in AI such as reducing AI hallucinations, optimizing context in AI responses, making large language models (LLMs) more efficient, exploring new AI architectures, and the potential of AI agents that can act autonomously. I figured I'd go through her article and add some of my thoughts. Feel free to read her full article, linked at the bottom of the page.
Open Challenge: Reduce Hallucinations
Chip’s Thoughts:
Hallucinations, or AI fabrications, are a concern for AI's mainstream adoption. While they can be seen as beneficial for creative tasks, they pose a risk in many other applications. Many companies and startups are working to address and measure hallucination in AI. Tips for reduction include adding context or aiming for model conciseness.
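To make the "adding context" tip concrete, here's a minimal sketch of a grounded prompt. Both `call_llm` and the prompt wording are placeholders of my own, not anything from Chip's article:

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for whatever chat-completion API you use."""
    raise NotImplementedError("plug in your LLM provider here")

def grounded_answer(question: str, context: str) -> str:
    # Constrain the model to the supplied context and give it an explicit
    # way out, so declining beats fabricating.
    prompt = (
        "Answer using ONLY the context below. If the answer is not in the "
        "context, reply exactly: I don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    return call_llm(prompt)
```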
My Thoughts:
The entire issue of hallucination is quite interesting. I do believe that for achieving some sort of “super intelligence”, hallucinations are more of a feature than a bug. The best ideas come out of nowhere and usually sound dumb or impossible at first, and this is mostly what hallucinations seem to be. The obvious issue is that this occurs when creativity is unnecessary and high precision is required. My guess is that this can be solved with some modification to the underlying architectural elements of the Transformer, or perhaps a new architecture paradigm entirely.
Here are a few questions I currently have:
Do large-scale LSTMs and RNNs hallucinate to the same degree?
So far in Transformer research, the MLP layers are mostly credited with storing memories. Are these “hallucinations” actually just incorrect or corrupted MLP memories, or is there something more going on?
Open Challenge: Optimize context length and context construction
Chip's Thoughts:
Context is pivotal for answering questions. A study by Zhang & Choi in 2021 indicated that about 16.5% of questions from a dataset were context-dependent. In business scenarios, the percentage is likely even higher.
My Thoughts:
For me personally, I see great potential for integrating LLMs into my programming environment, specifically to relay important code changes and debugging information directly to the LLM as I use it. I find that if the model has the information it needs, it's usually able to solve issues. One of the most time-consuming tasks for me as a programmer right now is simply copying and pasting debugging information between my codebase and ChatGPT. If a system could reduce my time spent here, I would absolutely use it!
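As a toy illustration, here's roughly what I have in mind: a helper that forwards a traceback to the model automatically instead of making me copy-paste it. `call_llm` is again a hypothetical placeholder:

```python
import json
import traceback

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for whatever chat-completion API you use."""
    raise NotImplementedError("plug in your LLM provider here")

def explain_crash(fn, *args, **kwargs):
    # Run a function and, on failure, ship the traceback plus the failing
    # call straight to the model instead of copy-pasting it by hand.
    try:
        return fn(*args, **kwargs)
    except Exception:
        prompt = (
            f"My Python program crashed calling {fn.__name__} "
            f"with args={args!r}, kwargs={kwargs!r}.\n\n"
            f"Traceback:\n{traceback.format_exc()}\n\n"
            "Explain the likely cause and suggest a fix."
        )
        print(call_llm(prompt))

# e.g. explain_crash(json.loads, "{not valid json}")
```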
Open Challenge: Make LLMs faster and cheaper
Chip's Thoughts:
After the GPT-3.5 release, the AI community managed to create similar-performing models with significantly reduced memory needs. Several techniques exist for model optimization, such as quantization, knowledge distillation, low-rank factorization, and pruning.
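Of these techniques, low-rank factorization is the easiest to demonstrate end to end. Here's a small sketch (my own illustration, not from the article) that compresses a weight matrix with a truncated SVD:

```python
import numpy as np

# Low-rank factorization sketch: approximate a dense weight matrix W
# (d_out x d_in) with two thin matrices A (d_out x r) and B (r x d_in),
# cutting the parameter count from d_out*d_in down to r*(d_out + d_in).
rng = np.random.default_rng(0)
W = rng.standard_normal((1024, 1024))  # stand-in for a trained weight matrix

r = 64  # target rank: the compression knob
U, S, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :r] * S[:r]   # shape (1024, 64)
B = Vt[:r, :]          # shape (64, 1024)

compressed = A.size + B.size
print(f"params: {W.size:,} -> {compressed:,} ({compressed / W.size:.1%})")
# Note: a random matrix has a flat spectrum and compresses poorly; trained
# weight matrices often have faster-decaying singular values, so in practice
# the reconstruction error can be much lower than it is here.
err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"relative reconstruction error: {err:.2f}")
```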
My Thoughts:
This is something I've thought a lot about. Our brain has around 100B neurons; however, not all of them are dedicated to intelligent reasoning. As a rough estimate, I'd guess that about 25 billion neurons are required for intelligent thought and reasoning (I just added up the neuron counts for critical regions of the brain, so this could be completely off). This requires a good number of GPUs to run (with our current artificial neurons trained with backpropagation). It's hard to say whether we can achieve human-level reasoning with fewer neurons than a human has, at least before our compute improves to the point where this is irrelevant. I think efficient LLMs could be a major tool for every developer. I view LLMs as a sort of “intelligent if statement,” capable of making decisions about data with little added training or hard-coded rules. Making LLMs more efficient will make integrating them into our code cheaper and ultimately more practical. In addition, making LLMs more efficient will allow us to make them larger, which will likely make them smarter, so it sort of turns into a virtuous cycle.
Edit: After thinking about it a bit, neurons and parameters are two separate things (artificial neural nets are usually benchmarked by parameter count). A single artificial neuron can have thousands of parameters associated with it, and biological neurons can have up to 200,000 synapses, so maybe this could be an interesting thing to research as well!
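Here's that back-of-envelope math as a quick script. The 25B figure is my guess from above, the 1,000-synapse lower bound is an assumption I'm adding, and GPT-3's 175B parameters are just there for scale:

```python
# Back-of-envelope math from the paragraphs above, loosely treating one
# synapse as one parameter (an analogy, not a rigorous equivalence).
neurons_for_reasoning = 25e9       # my rough guess from above; could be way off
synapse_counts = (1_000, 200_000)  # assumed lower bound, and the max cited above
gpt3_params = 175e9                # GPT-3's parameter count, for scale

for synapses in synapse_counts:
    params = neurons_for_reasoning * synapses
    print(f"{synapses:>7,} synapses/neuron -> {params:.1e} 'parameters' "
          f"(~{params / gpt3_params:,.0f}x GPT-3)")
```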
Open Challenge: Design a new model architecture
Chip's Thoughts:
While the Transformer architecture has dominated since 2017, there's anticipation for the next big architectural shift. The architecture needs to work efficiently on current hardware and at current scales. A promising direction is aiming for sub-quadratic complexity.
My Thoughts:
This is another interesting topic I've pondered. The true reasons Transformers emerged as the dominant architecture seem to revolve around easy scalability, predictable training, and the ability to handle long sequences of data. As Chip mentioned, the Transformer's major compute bottleneck is currently the quadratic nature of the attention mechanism. If we can achieve similar performance with less compute, we can make the model more efficient, and likely more intelligent as a result.
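To make the bottleneck concrete, here's a minimal NumPy sketch of scaled dot-product attention; the `scores` matrix is where the quadratic cost lives:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention; the (n x n) score matrix is the
    quadratic bottleneck discussed above."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # shape (n, n): compute/memory grow as n^2
    # Numerically stable row-wise softmax over the scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

n, d = 2048, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
print(attention(Q, K, V).shape)            # (2048, 64)
print(f"score matrix entries: {n * n:,}")  # doubling n quadruples this
```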
My Questions:
It seems like if we had just scaled the Transformer architecture earlier, we could have achieved something like ChatGPT much sooner. Have we really tried scaling all other model architectures? Although they may not show the same performance on smaller datasets, research like "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" makes me think that seemingly inadequate models may have more impressive capabilities at larger scale. I understand this would be expensive, but the potential payoff is huge, and I think it could give us a deep understanding of how these models really work.
Open Challenge: Make agents usable
Chip's Thoughts:
Agents can perform actions, from browsing online to sending emails. There's excitement around the potential of agents, evidenced by popular repositories like Auto-GPT. Yet concerns linger about their reliability. Startups like Adept are exploring their potential with internet browsing and other tasks.
My Thoughts:
LLMs are incredibly powerful when guided by a human. The issue right now is that it takes a large amount of instruction for the model to stay on task. I believe that as the models simply get larger and smarter, the need for human supervision will be reduced; however, I still think we will need a framework that guides these models to function autonomously (essentially loop themselves) and manage a todo list or set of goals. This is non-trivial, and it will be interesting to see what breakthroughs emerge.
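Here's a rough sketch of the kind of framework I mean: a plain todo queue looped around a model call. Everything in it (`call_llm`, the prompt format, the DONE convention) is invented for illustration:

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for whatever chat-completion API you use."""
    raise NotImplementedError("plug in your LLM provider here")

def run_agent(goal: str, max_steps: int = 10) -> list:
    # The "framework" is just a todo queue looped around a model call.
    todo, done = [goal], []
    for _ in range(max_steps):  # hard cap so the agent can't loop forever
        if not todo:
            break
        task = todo.pop(0)
        reply = call_llm(
            f"Overall goal: {goal}\n"
            f"Completed so far: {done}\n"
            f"Current task: {task}\n"
            "If you can finish this task, reply 'DONE: <result>'. "
            "Otherwise reply with one subtask per line."
        )
        if reply.startswith("DONE:"):
            done.append(f"{task} -> {reply[5:].strip()}")
        else:
            # Push the model's subtasks to the front of the todo queue.
            todo = [s.strip() for s in reply.splitlines() if s.strip()] + todo
    return done
```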
Overall, Chip had a lot of interesting points. I would definitely recommend checking out the article! https://huyenchip.com/2023/08/16/llm-research-open-challenges.html