babyLM
@babyLMchallenge
Train small large language models
We are expecting🫄 a 3rd BabyLM👶, as a workshop @emnlpmeeting
Kept: all
New: Interaction (education, agentic) track, workshop papers
More in 🧵
Who would you want to hear👂 at the BabyLM workshop day?
Self-nomination ✅ Big dreams ✅ Boring ❌
We want to invite whoever interests the community, not just us. Share, discuss, have fun, and also submit papers to the workshop.

Close your books, test time! The evaluation pipelines are out, the baselines are released, and the challenge is on. There is still time to join, and we are excited to learn from you about pretraining and the gaps between humans and models. *Don't forget to fast-eval on checkpoints
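For illustration only, here is a minimal sketch of what fast-evaluating intermediate checkpoints can look like, assuming HuggingFace-style checkpoint directories and BLiMP-style minimal pairs. This is not the official BabyLM evaluation pipeline; the checkpoint paths and the example pair are placeholders.

```python
# Sketch: score BLiMP-style minimal pairs on intermediate checkpoints.
# NOT the official BabyLM evaluation pipeline; paths and `minimal_pairs`
# below are placeholder assumptions for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def sentence_logprob(model, tokenizer, sentence):
    """Sum of token log-probabilities of a sentence under the model."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    logprobs = torch.log_softmax(logits[:, :-1], dim=-1)  # predict token t+1 from t
    targets = ids[:, 1:]
    return logprobs.gather(2, targets.unsqueeze(-1)).sum().item()

# Hypothetical checkpoints saved during pretraining.
checkpoints = ["run/checkpoint-1000", "run/checkpoint-5000", "run/checkpoint-10000"]
# Hypothetical (grammatical, ungrammatical) minimal pairs.
minimal_pairs = [
    ("The cats sleep on the sofa.", "The cats sleeps on the sofa."),
]

for ckpt in checkpoints:
    tok = AutoTokenizer.from_pretrained(ckpt)
    model = AutoModelForCausalLM.from_pretrained(ckpt).eval()
    correct = sum(
        sentence_logprob(model, tok, good) > sentence_logprob(model, tok, bad)
        for good, bad in minimal_pairs
    )
    print(f"{ckpt}: {correct / len(minimal_pairs):.2%} minimal-pair accuracy")
```

The idea is simply to prefer the checkpoint whose log-likelihood ranks the grammatical sentence above its ungrammatical twin most often; the real pipeline covers many more tasks.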

Understanding efficient and cognitively inspired pretraining helps linguistics. Have anything relevant? Remember, this year's challenge is on, and it also introduces interaction and a workshop accepting related papers!
How should the humanities leverage LLMs? > Domain-specific pretraining!
Pretraining a model can be a research tool: it's cheaper than LoRA and allows studying
- grammatical change
- emergent word senses
- and who knows what more…
Train on your data with our pipeline!
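As a rough picture of what "train on your data" involves, here is a generic sketch of domain-specific pretraining from scratch with HuggingFace Transformers. It is not the BabyLM pipeline itself; the corpus file name and the small model configuration are placeholder assumptions.

```python
# Sketch: pretrain a small causal LM from scratch on your own corpus.
# Generic illustration, not the BabyLM pipeline; "my_corpus.txt" and the
# model size are placeholders.
from datasets import load_dataset
from transformers import (AutoTokenizer, GPT2Config, GPT2LMHeadModel,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # needed for padding in the collator

# Your domain corpus, one document per line (placeholder file name).
dataset = load_dataset("text", data_files={"train": "my_corpus.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

# A small model initialized from scratch, so the weights reflect only your data.
config = GPT2Config(n_layer=6, n_head=8, n_embd=512, vocab_size=tokenizer.vocab_size)
model = GPT2LMHeadModel(config)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="run", per_device_train_batch_size=8,
                           num_train_epochs=1, save_steps=1000),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Because the model starts from random weights, anything it learns about grammatical change or word senses comes from your corpus alone, which is what makes this usable as a research instrument.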
Together with the @cambridgenlp BabyLM team, we extend the @babyLMchallenge task beyond English and introduce acquisition-inspired techniques to improve curriculum learning strategies. Check out our group's paper: arxiv.org/abs/2410.22886.
Of course, babies only take small naps.
See you at the BabyLM Workshop at EMNLP 2025 in Suzhou.
More details soon here, on Slack, and on babylm.github.io.
Until then, please share your suggestions for next year.
new dataset creation (not allowed last year) seems to be the best way to improve on BLiMP #EMNLP2024
👶I am happy to announce that our paper "BabyLM Challenge: Exploring the Effect of Variation Sets on Language Model Training Efficiency" received the ✨Outstanding Paper Award✨ at @babyLMchallenge !! #CoNLL2024 #EMNLP2024
👶Happy to share that our paper on the effects of variation sets on LMs received the 👑Outstanding Paper Award👑 @babyLMchallenge
Huge congrats to @_akari000, @akiyohukat_u, @AriannaBisazza, and Yohei Oseki.
Check out our paper👉: arxiv.org/abs/2411.09587 #EMNLP2024 #CoNLL2024
🌴We had a great time at #EMNLP2024 presenting our work, meeting old friends, getting to know new people, and winning some prizes (Best Social Impact Award at the #EMNLP2024 main conference and a @babyLMchallenge award at #CoNLL2024) 🤩