Phase Transitions in the Output Distribution of Large Language Models

In a physical system, changing parameters such as temperature can induce a phase transition: an abrupt change from one state of matter to another. Analogous phenomena have recently been observed in large language models. Typically, the task of identifying phase transitions requires human analysis and some prior understanding of the system to narrow down which low-dimensional properties to monitor and analyze. Statistical methods for the automated detection of phase transitions from data have recently been proposed within the physics community. These methods are largely system agnostic and, as shown here, can be adapted to study the behavior of large language models. In particular, we quantify distributional changes in the generated output via statistical distances, which can be efficiently estimated with access to the probability distribution over next-tokens. This versatile approach is capable of discovering new phases of behavior and unexplored transitions — an ability that is particularly exciting in light of the rapid development of language models and their emergent capabilities.
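The idea of quantifying distributional change with a statistical distance can be illustrated with a small sketch. It is not the paper's implementation. It assumes a toy "model" whose next-token logits are fixed (the hypothetical `TOY_LOGITS`), uses sampling temperature as the tuning parameter, and picks the Jensen-Shannon divergence as one possible statistical distance. The helper names `softmax`, `js_divergence`, and `scan` are made up for illustration. For each temperature T, the sketch compares the next-token distributions at T - delta and T + delta; a pronounced peak in the resulting signal would mark the region where the output distribution changes fastest.

```python
import math

def softmax(logits, temperature):
    """Next-token distribution for fixed logits at a given sampling temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded above by log(2)."""
    mix = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)

def scan(logits, temperatures, delta):
    """For each T, compare the distributions at T - delta and T + delta."""
    return [
        js_divergence(softmax(logits, t - delta), softmax(logits, t + delta))
        for t in temperatures
    ]

# Hypothetical toy logits standing in for a language model's next-token scores.
TOY_LOGITS = [4.0, 1.0, 0.5, 0.0]

temperatures = [0.2 + 0.1 * i for i in range(30)]
signal = scan(TOY_LOGITS, temperatures, delta=0.05)

# Temperature at which neighbouring distributions differ the most.
peak_t = temperatures[signal.index(max(signal))]
print(f"largest distributional change near T = {peak_t:.2f}")
```

With a real language model, the toy logits would be replaced by the model's next-token probabilities, and the comparison would have to account for whole generated sequences rather than a single token, which this sketch does not attempt.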

https://arxiv.org/abs/2405.17088

https://arxiv.org/pdf/2405.17088.pdf
