Trelis Research
@TrelisResearch
👷Work for Trelis: https://trelis.com/developer-collab 🎥 Watch on Youtube: https://youtube.com/@trelisresearch 💡 Book a Consultation: https://forms.gle/2VXzrB
+ April 2025 Channel Update + - Tutorials Updates - Grant Announcements - Trelis AI Collabs and AI Residency - Trelis ARC AGI Team TIMESTAMPS: 0:00 Trelis Research April 2025 Channel Update 1:13 Tutorial & Repo Updates. 1:20 Fine-tuning tutorials 1:39 Inference tutorials 2:05…
- This is a **pure neural transductive approach**. - **BUT**, it trains a planning type module that has **no direct access to the problem input-pairs**. - The net is trained only on **one input-output pair at a time**, but also takes in positional embeddings for the grids AND an…
🚀Introducing Hierarchical Reasoning Model🧠🤖 Inspired by brain's hierarchical processing, HRM delivers unprecedented reasoning power on complex tasks like ARC-AGI and expert-level Sudoku using just 1k examples, no pretraining or CoT! Unlock next AI breakthrough with…
Anyone have a connection at @Alibaba_Qwen? Trying to reproduce the results on @arcprize and getting different metrics Want to get a hold of them and find out how they tested
.@arcprize listed on the @Alibaba_Qwen model card 2nd model card for us in 2 weeks Excited for ARC-AGI to be seen as a supported way to measure model performance x.com/Alibaba_Qwen/s…
npx ccusage@latest (courtesy of @simonw's great blog) shows Claude Code usage. I've been on the 90 EUR per month plan; seems heavy users will be loss-making for Anthropic here. I downgraded to the 20 per month plan because I'm not using it that heavily (except if I need a big…

New video by community members @TrelisResearch and @lewishemens on their ARC Prize 2025 progress * Their approach to solving ARC-AGI * A call for sponsors * Research plan x.com/TrelisResearch…
- How to Beat ARC AGI 2? - --- Competing in the @arcprize with @lewishemens ! We lay out our thoughts on a winning approach. And, we're looking for further team members, and for sponsors. Reach out by DM or to arc [at] trelis [dot] com
🤖Train an ACT Policy for the SO-101 Robot🤖 --- This is the third video in the Trelis series on robotics! I describe how to collect data for training, and then I train the ACT policy for an SO-101 robot, using the @LeRobotHF library from @huggingface ! I then evaluate…
Excited to say that @RonanKMcGovern of @TrelisResearch and I have teamed up to work on ARC-AGI-2! Here's my latest on framing and approach, and a summary thread below: lewish.io/posts/how-to-b… Or in video form: x.com/TrelisResearch…
🤝Join the Team | Sponsor the Team: Trelis.com/arc-agi-2 Video Links: - Slides: docs.google.com/presentation/d… - How to Beat ARC AGI 2 Blog: lewish.io/posts/how-to-b… - ARC Prize Tasks: arcprize.org/play?task=e372… TIMESTAMPS: 0:00 Introduction and Team Formation 01:10 Overview of ARC AGI…
Program Synthesis approach breakthrough in ARC-AGI through Self-Play 📖 Read 201: « Self-Improving Language Models for Evolutionary Program Synthesis: A Case Study on Arc-AGI », by @PourcelJulien, @cedcolas and @pyoudeyer github.com/flowersteam/SO… The work of the authors is a…
Funny to say Grok is AGI, BUT def impressive from Grok 4
Grok 4 is basically AGI.
o3 / o4-mini causing bad engineering practices Hard enough to motivate myself to inspect traces when they exist. Impossible when they don’t
Trelis AI Grants Update - 2Q 2025 --- 1. Congratulations to Dima Yanovsky (@yanovskyd) for completing his grant - "Accelerating Robotics Imitation Learning via Simulation and AR Teleoperation." - Note: Grants are announced each quarter based on completion, so there will be some…

HUGE CAVEAT: Oof, this is a lot worse than I thought and conveyed. It's based on a 120-problem split from ARC-AGI-II, assuming pass@250 !!! That means it only needs to get one out of 250 correct and doesn't even need to know which one! That's much more lax than the…
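To make the pass@250 point concrete, here's a minimal sketch of how a pass@k score could be computed. The function name `pass_at_k` and the toy data are assumptions for illustration only, not the evaluation code behind the reported result.

```python
def pass_at_k(attempts_per_problem, k):
    """Fraction of problems where ANY of the first k attempts is correct.

    attempts_per_problem: list of lists of booleans (hypothetical format),
    one inner list per problem, one boolean per sampled attempt.
    """
    solved = sum(any(attempts[:k]) for attempts in attempts_per_problem)
    return solved / len(attempts_per_problem)


# Toy data (made up): 4 problems, 250 attempts each.
toy = [
    [True] + [False] * 249,                  # solved on the first attempt
    [False] * 249 + [True],                  # solved only on the last attempt
    [False] * 250,                           # never solved
    [False] * 100 + [True] + [False] * 149,  # solved somewhere in the middle
]

score_small_k = pass_at_k(toy, 2)    # strict: only the first 2 attempts count
score_large_k = pass_at_k(toy, 250)  # lax: any 1 of 250 attempts counts
```

On this toy data the same model output scores very differently depending on k, and the scorer never has to pick which attempt is the right one.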
it's MCTS but allowing for parent node re-sampling - In chess you would never re-sample because moves are finite/deterministic - In LLMs you can resample a lot and keep getting new examples. This allows wider branching at each node. Thought of differently, it's the existing…
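The re-sampling idea can be sketched as a toy tree search where any existing node may be sampled again to grow a new child ("go wider") instead of always descending ("go deeper"). This is a simplified illustration, not full MCTS and not the AB-MCTS algorithm: the fixed `widen_prob` coin flip, the `generate` stub standing in for an LLM call, and all names are assumptions.

```python
import random


class Node:
    def __init__(self, answer, score, parent=None):
        self.answer = answer
        self.score = score
        self.parent = parent
        self.children = []


def generate(parent_answer, rng):
    """Hypothetical stand-in for an LLM call: returns (answer, score in [0, 1])."""
    base = 0.0 if parent_answer is None else parent_answer
    answer = min(1.0, max(0.0, base + rng.uniform(-0.1, 0.3)))
    return answer, answer  # in this toy, the score is just the answer itself


def search(budget, seed=0, widen_prob=0.5):
    rng = random.Random(seed)
    root = Node(answer=None, score=0.0)
    best = root
    for _ in range(budget):
        node = root
        # At each node: with probability widen_prob stop and re-sample here
        # (wider), otherwise descend to the best-scoring child (deeper).
        while node.children and rng.random() > widen_prob:
            node = max(node.children, key=lambda c: c.score)
        answer, score = generate(node.answer, rng)
        child = Node(answer, score, parent=node)
        node.children.append(child)
        if child.score > best.score:
            best = child
    return root, best


def count_nodes(node):
    return 1 + sum(count_nodes(c) for c in node.children)


root, best = search(budget=30)
```

Because a node can be sampled again at any time, its number of children is not fixed in advance, which is the "wider branching" point made in the post.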
To wait for rate limits, claude just writes a python function to wait, pretty nice
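For reference, a wait-and-retry helper of this kind might look like the sketch below. It is not the function Claude wrote; `RateLimitError`, `call_with_backoff` and the exponential delay schedule are all assumptions for illustration.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error type raised when an API says 'slow down'."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on RateLimitError, wait and retry with doubling delays."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Passing `sleep` as a parameter keeps the helper easy to test without actually waiting.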

Inference-Time Scaling and Collective Intelligence for Frontier AI sakana.ai/ab-mcts/ We developed AB-MCTS, a new inference-time scaling algorithm that enables multiple frontier AI models to cooperate, achieving promising initial results on the ARC-AGI-2 benchmark.…