Assaf Ben Kish
@abk_tau
Deep Learning | Large Language Models | Reinforcement Learning
Today we're launching Subconscious: a new platform for building agents with long-horizon reasoning and tool use, backed by MIT research. One API call. Tool use. Context beyond existing limits. If you're building agents, let's talk.
OPRM is accepted to #COLM2025! See you in Montreal 🇨🇦 Big thanks to our great collaborators from TAU, MIT, and IBM! #LLM @COLM_conf
New work! 🚨 Recurrent LLMs like Mamba and RWKV can efficiently process millions of tokens, yet still underperform on real-world long-context tasks. What's holding them back? 🤔 And how can a lightweight fix boost their performance by 35% on LongBench? 👇🏼🧵 Github:…
📄🚨 New! Tired of waiting minutes for LLMs to "think"? Test-time scaling (O3, DeepSeek-R1) lets LLMs reason before answering — but users are left clueless, with no progress or control. Not anymore! We expose the LLM’s internal 🕰️, and show how to monitor 📊 & overclock it⚡ 🧵👇
We know Attention and its linear-time variants, such as linear attention and State Space Models. But what lies in between?

Introducing Log-Linear Attention with:
- Log-linear time training
- Log-time inference (in both time and memory)
- Hardware-efficient Triton kernels
Thanks @MIT_CSAIL for featuring our work!🖊️🎨 Huge thanks to the CSAIL news team for the fun article + video!! We'll be presenting SketchAgent at #CVPR2025 next week — come say hi if you're curious how LLMs can be used to collaboratively sketch!🖌️ 👉 bit.ly/43mTme1
Sometimes the best way to express an idea is by sketching it out. A system from MIT CSAIL & Stanford captures this iterative process by teaching LLMs to create sequential sketches. It could work w/users to visually communicate concepts: bit.ly/4kfXFhk
Overflow Prevention Enhances Long-Context Recurrent LLMs

OPRM chunk-based inference:
- Split the context into chunks
- Process chunks in parallel (speculative prefill)
- Select the best one (e.g., lowest entropy)
- Decode only from that chunk

Advantages:
- No training required…
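The selection step above can be sketched in a few lines. This is a minimal illustration only, not the paper's implementation: `select_chunk`, `toy_entropy`, and the sequential scoring loop are all assumptions for demonstration (the real system scores chunks in parallel with actual model output entropies).

```python
from typing import Callable, List

def select_chunk(
    context: str,
    query: str,
    chunk_size: int,
    entropy_fn: Callable[[str, str], float],
) -> int:
    """Split the context into fixed-size chunks, score each with the
    model's answer entropy (parallel in the real system), and return
    the index of the lowest-entropy chunk; decoding then conditions
    only on that chunk, so the recurrent state never overflows."""
    chunks: List[str] = [
        context[i:i + chunk_size] for i in range(0, len(context), chunk_size)
    ]
    scores = [entropy_fn(c, query) for c in chunks]
    return min(range(len(chunks)), key=scores.__getitem__)

# Toy scorer (hypothetical): pretend the model is confident — low
# entropy — only when the chunk actually contains the queried string.
def toy_entropy(chunk: str, query: str) -> float:
    return 0.5 if query in chunk else 3.0

ctx = "x" * 8 + "needle__" + "z" * 8
best = select_chunk(ctx, "needle", chunk_size=8, entropy_fn=toy_entropy)
# best is the index of the chunk containing "needle"
```

Because each chunk is scored independently, the memory footprint is bounded by the chunk size rather than the full context length, which is the point of the overflow-prevention design.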
Very nice deep dive explaining OPRM by @xiaolGo youtu.be/7_7kcJsJTLk?si…
The next frontier for AI shouldn’t just be generally helpful. It should be helpful for you! Our new paper shows how to personalize LLMs — efficiently, scalably, and without retraining. Meet PReF (arxiv.org/abs/2503.06358)
SketchAgent has been accepted to #CVPR2025! This is an early step toward new tools for visual thinking and richer interaction with LLMs 🎨 🔗 sketch-agent.csail.mit.edu
Excited to introduce SketchAgent!👩🎨 We leverage the prior of pretrained multimodal LLMs for language-driven, sequential sketch generation and human-agent collaborative sketching! ✨ Try our fun interface here: github.com/yael-vinker/Sk…