Introducing Aya

I am proud to be one of the 3,000 humans who built Aya, a new massively multilingual, generative LLM that outperforms existing open-source models and covers 101 languages. Check out Aya here, including the dataset, the model, two new papers on arXiv, and a nice documentary on the entire process.

https://cohere.com/research/aya


The paper here details exactly how we put together the dataset and relied on communities of speakers of 101 different languages around the world. Submitted to the Association for Computational Linguistics, 2024.

Are Emergent Abilities of Large Language Models a Mirage?

At NeurIPS in December I met Rylan Schaeffer from Stanford, author (with Brando Miranda and Sanmi Koyejo) of this fascinating paper about the benchmarks used to measure the capabilities of LLMs. He and his coauthors found that many of the most common benchmarks use nonlinear or discontinuous metrics to measure capabilities that should really be measured with linear metrics. The nonlinear metrics show sudden jumps in ability as models get bigger: the so-called emergent abilities. Exact-match accuracy on a multi-token answer is a good example of a discontinuous metric, since a model earns credit only when every token is correct, so steady per-token improvement registers as a sudden leap. But if you change the metric so it's linear, models show steady, predictable progress as they get bigger. Nothing magical about it.
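To make the idea concrete, here is a minimal sketch, not code from the paper, of how smooth improvement can masquerade as emergence. It assumes a hypothetical per-token accuracy that rises smoothly with model scale and compares a linear metric (mean per-token accuracy) against a discontinuous one (exact match over a 10-token answer); the scales and curve parameters are made up for illustration.

```python
import numpy as np

# Sketch of the Schaeffer/Miranda/Koyejo argument (illustrative, not their code).
# Assume per-token accuracy p improves smoothly with model scale. A
# discontinuous metric like exact match over an L-token answer scores roughly
# p**L, which stays near zero and then shoots up: an apparent "emergent"
# ability. A linear metric (mean per-token accuracy) shows the same underlying
# progress as a smooth curve.

L = 10  # hypothetical answer length in tokens

# Hypothetical model scales (parameter counts, 10M to 100B) and a per-token
# accuracy that rises smoothly with log-scale, saturating near 1.0.
scales = np.logspace(7, 11, 20)
per_token_acc = 1.0 / (1.0 + np.exp(-(np.log10(scales) - 9.0) * 2.0))

exact_match = per_token_acc ** L    # discontinuous-looking metric
linear_metric = per_token_acc       # smooth, linear metric

for n, em, lin in zip(scales, exact_match, linear_metric):
    print(f"{n:14.0f} params | per-token acc {lin:.3f} | exact match {em:.3f}")
```

Plotted against log scale, the exact-match curve hugs zero and then shoots up around a few billion parameters, while the per-token curve it is computed from climbs smoothly the whole way. The apparent emergence lives in the metric, not in the model.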

Click here for a reprint of an article I wrote for American Scientist, March-April 2024, Vol. 112.