arXiv:2110.08207

Multitask Prompted Training Enables Zero-Shot Task Generalization

Published on Oct 15, 2021

Authors: Victor Sanh, Lintang Sutawika, Zaid Alyafeai, Teven Le Scao, Manan Dey, Gunjan Chhablani, Nihal Nayak

Abstract

Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks (Brown et al., 2020). It has been hypothesized that this is a consequence of implicit multitask learning in language models' pretraining (Radford et al., 2019). Can zero-shot generalization instead be directly induced by explicit multitask learning? To answer this question at scale, we develop a system for easily mapping any natural language task into a human-readable prompted form. We convert a large set of supervised datasets into this form, each with multiple prompts of diverse wording. These prompted datasets allow for benchmarking the ability of a model to perform completely held-out tasks. We fine-tune a pretrained encoder-decoder model (Raffel et al., 2020; Lester et al., 2021) on this multitask mixture covering a wide variety of tasks. The model attains strong zero-shot performance on several standard datasets, often outperforming models up to 16x its size. Further, our approach attains strong performance on a subset of tasks from the BIG-bench benchmark, outperforming models up to 6x its size. All trained models are available at https://github.com/bigscience-workshop/t-zero and all prompts are available at https://github.com/bigscience-workshop/promptsource.
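The idea of mapping a supervised dataset into a human-readable prompted form, with several differently worded prompts per dataset, can be illustrated with a minimal sketch. This is not the promptsource API: the `PromptTemplate` class, the `to_prompted` helper, the field names (`premise`, `hypothesis`, `label`), and the template wording are all hypothetical, chosen only to show the shape of the transformation for a natural language inference example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A raw supervised example: field name -> value (hypothetical schema).
Example = Dict[str, str]


@dataclass(frozen=True)
class PromptTemplate:
    """Hypothetical template: turns one example into an (input, target) text pair."""
    name: str
    render_input: Callable[[Example], str]
    render_target: Callable[[Example], str]

    def apply(self, example: Example) -> Tuple[str, str]:
        return self.render_input(example), self.render_target(example)


# Assumed mapping from class labels to natural language answers.
ANSWERS = {"entailment": "Yes", "neutral": "Maybe", "contradiction": "No"}

# Two prompts with different wording for the same underlying task.
NLI_TEMPLATES: List[PromptTemplate] = [
    PromptTemplate(
        name="does_it_follow",
        render_input=lambda ex: (
            f"{ex['premise']} Does it follow that \"{ex['hypothesis']}\"? "
            "Yes, no, or maybe?"
        ),
        render_target=lambda ex: ANSWERS[ex["label"]],
    ),
    PromptTemplate(
        name="can_we_infer",
        render_input=lambda ex: (
            f"Suppose \"{ex['premise']}\" Can we infer that "
            f"\"{ex['hypothesis']}\"? Yes, no, or maybe?"
        ),
        render_target=lambda ex: ANSWERS[ex["label"]],
    ),
]


def to_prompted(
    examples: List[Example], templates: List[PromptTemplate]
) -> List[Tuple[str, str]]:
    """Expand every example with every template into text-to-text pairs."""
    return [t.apply(ex) for ex in examples for t in templates]


example = {
    "premise": "A dog is running.",
    "hypothesis": "An animal is moving.",
    "label": "entailment",
}
pairs = to_prompted([example], NLI_TEMPLATES)
for prompt, target in pairs:
    print(prompt, "->", target)
```

Each resulting pair is plain text on both sides, which is the form a text-to-text encoder-decoder model can be fine-tuned on. Holding out every dataset of a given task type from such a mixture is what would make evaluation on that task zero-shot.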
