
Predictability Versus Surprise: The Paradox of Large Generative Models

Recent advances in large-scale pre-training have produced general-purpose, capable generative models such as GPT-3, Megatron-Turing NLG, and Gopher. In this paper we highlight a counterintuitive property of these models and discuss the policy implications of that property. Namely, these generative models combine predictable loss across a broad training distribution, as captured by their "scaling laws", with unpredictable specific capabilities, inputs, and outputs. We believe that the high-level predictability and the emergence of useful capabilities drive the rapid development of such models, while the unpredictable qualities make it difficult to anticipate the consequences of deploying them. We work through examples of how this combination can lead to socially harmful outcomes, drawing on the literature and real-world observations, and we conduct two original experiments to illustrate the potential for harm arising from unpredictability. We then analyze how these contrasting properties interact to shape developers' motivations for deploying these models, and note challenges that can stand in the way of deployment. Finally, we outline a set of possible interventions the AI community could pursue to increase the chance that these models have a beneficial impact. This paper is intended as a resource for policymakers who want to understand and regulate AI systems, for technologists who care about the potential policy impact of their work, and for academics who want to analyze, critique, and potentially build on large generative models.
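
For concreteness, the "scaling laws" referenced above are empirical power laws relating test loss to quantities such as model size, dataset size, or training compute. The form below is one commonly cited version (following Kaplan et al., 2020), shown here only as an illustrative sketch rather than this paper's own formulation; the constants are fitted empirically and vary by domain and setup.

$$ L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N} $$

Here $L$ is the test loss, $N$ is the number of model parameters, and $N_c$ and $\alpha_N$ are empirically fitted constants; analogous laws are typically reported for dataset size and compute. The key point for this paper is that such curves make aggregate loss predictable in advance, even though they say nothing about which specific capabilities will emerge at a given scale.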