Eight things to know about large language models: A summary
by Andrij David
2 min read

If you’re interested in the fascinating world of large language models like ChatGPT, you won’t want to miss this article by Sam Bowman. He lays out eight important things you should know about these models, whether you’re a technical expert or just curious about the field. Check out the full article here: https://cims.nyu.edu/~sbowman/eightthings.pdf. Here’s a summary of the key points.

  1. Large language models predictably improve as more is invested in them, in terms of training data, parameters, and total compute measured in floating-point operations (FLOPs). Scaling laws allow coarse but useful predictions of a model’s capability as it is scaled up. However, these predictions cover only pretraining test loss, which is not a reliable measure of a model’s specific skills or performance on particular tasks.
  2. Scaling up large language models can produce surprising and unpredictable behaviors and capabilities that scaling laws do not explain. While pretraining test loss tracks how well a model predicts how a sentence will be completed, and thus gives a rough gauge of its overall usefulness, it’s impossible to anticipate when an LLM will acquire a specific skill or capability.
  3. Large language models develop internal representations of the world, allowing them to reason at an abstract level that is not dependent on the exact language used in the text. This suggests that LLMs can learn and use representations of the outside world.
  4. At present, there are no foolproof methods to control the behavior of large language models. Even with fine-tuning techniques, there’s no assurance that an AI model will always behave appropriately in every deployment scenario. Unfortunately, there’s no agreement on how to address these challenges.
  5. As of early 2023, experts cannot interpret the inner workings of LLMs: the knowledge, reasoning, or goals a model uses when generating output. No currently available technique can satisfactorily explain why a model produced the output it did. This is a crucial issue, because understanding the reasoning behind a model’s predictions and decisions matters most when it’s used in critical applications such as healthcare or finance.
  6. Large language models appear to outperform humans in their pretraining task of predicting the next word in a sentence. Moreover, they can be trained to perform other tasks more accurately than humans, indicating that their performance is not limited by human capabilities. This highlights the tremendous potential of large language models to revolutionize various fields and domains.
  7. While a pre-trained large language model will produce text similar to the training data, further training and prompting can lead to different qualities of output that do not necessarily reflect the values of the creators or the values encoded in the training set. As a result, it’s crucial to be mindful of the potential biases and ethical considerations associated with training and using these models in various applications.
  8. Brief interactions with large language models can be misleading as an LLM’s failure to perform a task in one setting does not necessarily indicate that it cannot perform the task. The model may be able to complete the task correctly once the request is reworded slightly, suggesting that the language model’s performance is highly dependent on the specific phrasing of the task.
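The scaling laws mentioned in point 1 can be made concrete with a small sketch. The functional form below is the Chinchilla-style loss law from Hoffmann et al. (2022), L(N, D) = E + A/N^α + B/D^β, where N is parameter count and D is training tokens; the coefficients are taken from that paper’s published fit and should be treated as illustrative here, not as predictions for any particular model.

```python
# Sketch of a Chinchilla-style scaling law: predicted pretraining test loss
# as a function of parameter count N and training tokens D.
# Coefficients are the published Hoffmann et al. (2022) fit, used illustratively.

def predicted_loss(n_params: float, n_tokens: float,
                   e: float = 1.69, a: float = 406.4, b: float = 410.7,
                   alpha: float = 0.34, beta: float = 0.28) -> float:
    """L(N, D) = E + A / N**alpha + B / D**beta."""
    return e + a / n_params**alpha + b / n_tokens**beta

# Scaling up both parameters and data lowers the predicted loss,
# but the loss never drops below the irreducible term E.
small = predicted_loss(1e9, 2e10)     # roughly a 1B-parameter model, 20B tokens
large = predicted_loss(7e10, 1.4e12)  # roughly a 70B-parameter model, 1.4T tokens
assert large < small
```

Note what this sketch can and cannot do, which is exactly Bowman’s point: it predicts a single aggregate number (test loss) quite well, but says nothing about when a specific capability, like multi-step arithmetic or following instructions, will emerge.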