The past 3 years of work in NLP have been characterized by the development and deployment of ever larger language models, especially for English. BERT, its variants, GPT-2/3, and others, most recently Switch-C, have pushed the boundaries of the possible both through architectural innovations and through sheer size. Using these pretrained models and the methodology of fine-tuning them for specific tasks, researchers have extended the state of the art on a wide array of tasks as measured by leaderboards on specific benchmarks for English. In this paper, we take a step back and ask: How big is too big? What are the possible risks associated with this technology and what paths are available for mitigating those risks? We provide recommendations including weighing the environmental and financial costs first, investing resources into curating and carefully documenting datasets rather than ingesting everything on the web, carrying out pre-development exercises evaluating how the planned approach fits into research and development goals and supports stakeholder values, and encouraging research directions beyond ever larger language models.