babyLM
@babyLMchallenge
Train small large language models
We are expecting🫄 a 3rd BabyLM👶, as a workshop @emnlpmeeting
Kept: all
New: Interaction (education, agentic) track, workshop papers
More in 🧵
Who would you want to hear👂 at the BabyLM workshop day?
Self-nomination ✅ Big dreams ✅ Boring ❌
We want to invite whoever interests the community, not just us. Share, discuss, have fun, and also submit papers to the workshop.

Close your books, test time! The evaluation pipelines are out, the baselines are released, and the challenge is on. There is still time to join, and we are excited to learn from you about pretraining and the gaps between humans and models. *Don't forget to fast-eval on checkpoints
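For illustration only, here is a minimal sketch of what fast-evaluating intermediate checkpoints can look like, assuming HuggingFace-style checkpoint directories and BLiMP-style minimal pairs. This is not the official BabyLM evaluation pipeline; the checkpoint paths and the example pair are placeholders.

```python
# Sketch: score BLiMP-style minimal pairs on intermediate checkpoints.
# NOT the official BabyLM evaluation pipeline; paths and `minimal_pairs`
# below are placeholder assumptions for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def sentence_logprob(model, tokenizer, sentence):
    """Sum of token log-probabilities of a sentence under the model."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    logprobs = torch.log_softmax(logits[:, :-1], dim=-1)  # predict token t+1 from t
    targets = ids[:, 1:]
    return logprobs.gather(2, targets.unsqueeze(-1)).sum().item()

# Hypothetical checkpoints saved during pretraining.
checkpoints = ["run/checkpoint-1000", "run/checkpoint-5000", "run/checkpoint-10000"]
# Hypothetical (grammatical, ungrammatical) minimal pairs.
minimal_pairs = [
    ("The cats sleep on the sofa.", "The cats sleeps on the sofa."),
]

for ckpt in checkpoints:
    tok = AutoTokenizer.from_pretrained(ckpt)
    model = AutoModelForCausalLM.from_pretrained(ckpt).eval()
    correct = sum(
        sentence_logprob(model, tok, good) > sentence_logprob(model, tok, bad)
        for good, bad in minimal_pairs
    )
    print(f"{ckpt}: {correct / len(minimal_pairs):.2%} minimal-pair accuracy")
```

The idea is simply to prefer the checkpoint whose log-likelihood ranks the grammatical sentence above its ungrammatical twin most often; the real pipeline covers many more tasks.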

Understanding efficient and cognitively inspired pretraining helps linguistics. Have anything relevant? Remember, this year's challenge is on, and it also introduces interaction and a workshop accepting related papers!
How should the humanities leverage LLMs? > Domain-specific pretraining!
Pretraining a model can be a research tool: it's cheaper than LoRA and allows studying
- grammatical change
- emergent word senses
- and who knows what more…
Train on your data with our pipeline!
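As a rough picture of what "train on your data" involves, here is a generic sketch of domain-specific pretraining from scratch with HuggingFace Transformers. It is not the BabyLM pipeline itself; the corpus file name and the small model configuration are placeholder assumptions.

```python
# Sketch: pretrain a small causal LM from scratch on your own corpus.
# Generic illustration, not the BabyLM pipeline; "my_corpus.txt" and the
# model size are placeholders.
from datasets import load_dataset
from transformers import (AutoTokenizer, GPT2Config, GPT2LMHeadModel,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # needed for padding in the collator

# Your domain corpus, one document per line (placeholder file name).
dataset = load_dataset("text", data_files={"train": "my_corpus.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

# A small model initialized from scratch, so the weights reflect only your data.
config = GPT2Config(n_layer=6, n_head=8, n_embd=512, vocab_size=tokenizer.vocab_size)
model = GPT2LMHeadModel(config)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="run", per_device_train_batch_size=8,
                           num_train_epochs=1, save_steps=1000),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Because the model starts from random weights, anything it learns about grammatical change or word senses comes from your corpus alone, which is what makes this usable as a research instrument.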
Together with the @cambridgenlp BabyLM team, we extend the @babyLMchallenge task beyond English and introduce acquisition-inspired techniques to improve curriculum learning strategies. Check out our group's paper: arxiv.org/abs/2410.22886.
Of course, babies only take small naps.
See you at the BabyLM Workshop at EMNLP 2025 in Suzhou.
More details soon here, on Slack, and on babylm.github.io.
Until then, please share your suggestions for next year.
new dataset creation (not allowed last year) seems to be the best way to improve on BLiMP #EMNLP2024
👶I am happy to announce that our paper "BabyLM Challenge: Exploring the Effect of Variation Sets on Language Model Training Efficiency" received the ✨Outstanding Paper Award✨ at @babyLMchallenge !! #CoNLL2024 #EMNLP2024
👶Happy to share that our paper on the effects of variation sets on LMs received the 👑Outstanding Paper Award👑 @babyLMchallenge
Huge congrats to @_akari000, @akiyohukat_u, @AriannaBisazza, and Yohei Oseki.
Check out our paper👉: arxiv.org/abs/2411.09587 #EMNLP2024 #CoNLL2024
🌴We had a great time at #EMNLP2024 presenting our work, meeting old friends, getting to know new people, and winning some prizes (Best Social Impact Award at the #EMNLP2024 main conference and a @babyLMchallenge award at #CoNLL2024) 🤩