David Ifeoluwa Adelani 🇳🇬
@davlanade
Assistant Professor @mcgillu, Core Academic Member @Mila_Quebec, Canada CIFAR AI Chair @CIFAR_News | interested in multilingual NLP | Disciple of Jesus
I'm happy to share the electronic copy of my PhD thesis "Natural language processing for African languages" publikationen.sulb.uni-saarland.de/handle/20.500.… . Happy reading, especially if you are new to AfricaNLP. A big thank you to my supervisor @dklakow & the examination committee members:
Thank you @NaijaNLP Hallelujah! I'm happy to share that I defended my PhD today on "Natural language processing for African languages". The journey of 8 years is finally over. A big thank you to my supervisor @dklakow and reviewers: Prof. Alex Fraser & Prof. Benoît Sagot.
A blizzard is raging in Montreal when your friend says “Wow, the weather is amazing!” Humans easily interpret irony, while LLMs struggle with it. We propose a 𝘳𝘩𝘦𝘵𝘰𝘳𝘪𝘤𝘢𝘭-𝘴𝘵𝘳𝘢𝘵𝘦𝘨𝘺-𝘢𝘸𝘢𝘳𝘦 probabilistic framework as a solution. arxiv.org/abs/2506.09301 at #acl2025
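The tweet doesn't spell out the framework, so here is one hedged reading, not the paper's actual formulation: a strategy-aware listener could marginalize over latent rhetorical strategies when inferring meaning.

```latex
% Hedged sketch, not the paper's equation: the listener infers the intended
% meaning m of utterance u by marginalizing over rhetorical strategies s.
P(m \mid u) \;=\; \sum_{s \in \mathcal{S}} P(m \mid u, s)\, P(s \mid u)
% Under s = irony, "the weather is amazing" flips to its opposite literal sense.
```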
🎉Excited to share new work with undergrads from @FASILKOM_UI! Multilingual interpretability is often done at the neuron level, but neurons are polysemantic. We show that first applying a sparse autoencoder yields more monosemantic, interpretable features. More: arxiv.org/pdf/2507.11230
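For readers new to the technique, a minimal sparse-autoencoder sketch in the spirit of the tweet; the dimensions, layer choice, and sparsity penalty here are illustrative assumptions, not the paper's setup.

```python
# Hedged sketch: a sparse autoencoder (SAE) over hidden activations.
# All hyperparameters are assumptions, not taken from the paper.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_features: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)  # overcomplete feature basis
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, h):
        f = torch.relu(self.encoder(h))  # sparse, non-negative feature activations
        return self.decoder(f), f

sae = SparseAutoencoder(d_model=768, d_features=8 * 768)
h = torch.randn(32, 768)  # a batch of hidden states from some transformer layer
recon, feats = sae(h)
# Reconstruction loss plus an L1 penalty pushes features toward monosemanticity.
loss = nn.functional.mse_loss(recon, h) + 1e-3 * feats.abs().mean()
loss.backward()
```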
Life update: I’m excited to share that I’ll be starting as faculty at the Max Planck Institute for Software Systems (@mpi_sws_) this Fall!🎉 I’ll be recruiting PhD students in the upcoming cycle, as well as research interns throughout the year: lasharavichander.github.io/contact.html
💥New Paper💥 #LLMs encode harmfulness and refusal separately! 1️⃣We found a harmfulness direction 2️⃣The model internally knows a prompt is harmless, but still refuses it🤯 3️⃣Implication for #AI #safety & #alignment? Let’s analyze the harmfulness direction and use Latent Guard 🛡️
1/ 🚨New Paper 🚨 LLMs are trained to refuse harmful instructions, but internally, do they see harmfulness and refusal as the same? ⚔️We find causal evidence that 👈”LLMs encode harmfulness and refusal separately” 👉. ✂️LLMs may know a prompt is harmful internally yet still…
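A minimal sketch of what a "harmfulness direction" could look like, assuming a simple difference-of-means probe over hidden states; the paper's actual causal method and the Latent Guard details may well differ.

```python
# Hedged sketch: a "harmfulness direction" as the difference of mean hidden
# states between harmful and harmless prompts. Illustrative only.
import torch

def harmfulness_direction(h_harmful: torch.Tensor, h_harmless: torch.Tensor):
    """h_*: (n_prompts, d_model) hidden states from a single layer."""
    direction = h_harmful.mean(0) - h_harmless.mean(0)
    return direction / direction.norm()

def harmfulness_score(h: torch.Tensor, direction: torch.Tensor):
    # Projection onto the direction; a latent-guard-style check could threshold
    # this score independently of whether the model refuses.
    return h @ direction

d = harmfulness_direction(torch.randn(100, 768) + 0.5, torch.randn(100, 768))
print(harmfulness_score(torch.randn(768), d))
```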
If you want to help us improve language and cultural coverage, and build an open source LangID system, please register for our shared task! 💬 Registering is easy! All the details are on the shared task webpage: wmdqs.org/shared-task/ Deadline: July 23, 2025 (AoE) ⏰
commoncrawl.org/blog/wmdqs-sha…
🤔 Have you ever wondered how good ModernBERT is compared to decoders like Llama? We made an open-data version of ModernBERT and used the same recipe for encoders and decoders. Turns out, our encoder model beats ModernBERT and our decoder model beats Llama 3.2 / SmolLM2 🤯 🧵
When it comes to AI models for daily use, users obviously prefer fast ones. But how can we quantify this? Preferences from global @yupp_ai users provide a few answers. Speed makes a big difference. An A/B test with Kimi K2 by @Kimi_Moonshot shows ~150 points VIBE score… 🧵
We’re thrilled to share our latest work: FLEXITOKENS! In this work, we introduce language models with learnable tokenizers that make tokenization truly flexible during adaptation. See example below ↓ 1/n
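As a hedged illustration of the "learnable tokenizer" idea (the actual FLEXITOKENS architecture and training objective are not given in the tweet): a byte-level boundary predictor whose segmentation can be trained jointly with the LM, so token boundaries adapt to new domains.

```python
# Hedged sketch of a learnable tokenizer: a boundary predictor over byte
# embeddings decides where segments end. Purely illustrative architecture.
import torch
import torch.nn as nn

class BoundaryPredictor(nn.Module):
    def __init__(self, d_model: int = 256, vocab: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        self.scorer = nn.Linear(d_model, 1)  # per-byte boundary logit

    def forward(self, byte_ids):  # (batch, seq_len) raw byte ids
        h = self.embed(byte_ids)
        # Boundary probabilities are differentiable, so segmentation can be
        # trained end-to-end with the language model during adaptation.
        return torch.sigmoid(self.scorer(h)).squeeze(-1)

bp = BoundaryPredictor()
probs = bp(torch.randint(0, 256, (2, 16)))
print(probs.shape)  # torch.Size([2, 16])
```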
Introducing T5Gemma: the next generation of encoder-decoder/T5 models! 🔧Decoder models adapted to be encoder-decoder 🔥32 models with different combinations 🤗Available in Hugging Face and Kaggle developers.googleblog.com/en/t5gemma
📢I am hiring a Postdoc to work on post-training methods for low-resource languages. Apply by August 15 employment.ku.dk/faculty/?show=…. Let's talk at #ACL2025NLP in Vienna if you want to know more about the position and life in Denmark.
Sparsity can make your LoRA fine-tuning go brrr 💨 Announcing SparseLoRA (ICML 2025): up to 1.6-1.9x faster LLM fine-tuning (2.2x less FLOPs) via contextual sparsity, while maintaining performance on tasks like math, coding, chat, and ARC-AGI 🤯 🧵1/ z-lab.ai/projects/spars…
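A toy sketch of contextual sparsity applied to a LoRA layer, purely illustrative: the low-rank update runs only on input channels selected per example. SparseLoRA's real sparsity predictor and fused kernels are more involved; all names and sizes here are assumptions.

```python
# Hedged sketch: LoRA with contextual sparsity. Only the top-k input channels
# (chosen per example from activation magnitudes) feed the low-rank update.
import torch
import torch.nn as nn

class SparseLoRALinear(nn.Module):
    def __init__(self, d_in: int, d_out: int, rank: int = 8, k: int = 64):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)               # frozen base weight
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, rank))  # zero-init, standard LoRA
        self.k = k

    def forward(self, x):  # (batch, d_in)
        idx = x.abs().topk(self.k, dim=-1).indices
        mask = torch.zeros_like(x).scatter_(-1, idx, 1.0)
        # FLOP savings come from kernels that skip the zeroed channels;
        # the dense mask here only demonstrates the selection logic.
        return self.base(x) + (mask * x) @ self.A.T @ self.B.T

layer = SparseLoRALinear(768, 768)
print(layer(torch.randn(4, 768)).shape)  # torch.Size([4, 768])
```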
We're organizing a shared task to develop a multilingual physical commonsense reasoning evaluation dataset! Details on how to submit are at: sigtyp.github.io/st2025-mrl.html
As part of the workshop, we are also organizing a shared task to develop a collaborative physical commonsense reasoning evaluation dataset. See the shared task page for more information: sigtyp.github.io/st2025-mrl.html.
🚨 Deadline Extended 🚨 The MELT Workshop Track submission deadline is now June 30, 2025 (AoE). We still welcome new & unpublished work: abstracts (≤2 p) | shorts (≤5 p) | longs (≤9 p) 📌 Non-archival 🔗 Submit here → openreview.net/group?id=colmw… #MELTWorkshop2025 #COLM2025
If you would like to sign up to be a reviewer, please fill in this form: forms.gle/fbizvGghD33cP3…
The Multilingual Team at @cohere is hiring! If this sounds like you, please apply: - strong coding skills and a keen eye for detail - experience working with the challenges & joys of multilingual data Help us bring AI to the world! 🌏🌍🌎 jobs.ashbyhq.com/cohere/a87be94…
In the beginning, there was BERT. Eventually BERT gave rise to RoBERTa. Then, DeBERTa. Later, ModernBERT. And now, NeoBERT. The new state-of-the-art small-sized encoder:
🚀 Tower+: our latest model in the Tower family — sets a new standard for open-weight multilingual models! We show how to go beyond sentence-level translation, striking a balance between translation quality and general multilingual capabilities. 1/5 arxiv.org/pdf/2506.17080