Rosmine

@rosmine_b

ML researcher. LLMs + RL + Code gen. Tweets express the views of my employer (myself). DM me ML questions

Joined October 2023

472 Following

2K Followers

Pinned

Rosmine@rosmine_b · Apr 9

I trained a model with GRPO to generate better SVG images. Here's the improvement over 120 steps. More details below. Prompt: Two tall giraffes are next to bare trees.

311

162

35.0K

Rosmine@rosmine_b · 23 h

Setting my batch size to 1 just so my loss graphs update faster

177

Rosmine@rosmine_b · Jul 23

Ok, I want to know the highest effort data cleaning that everyone did

Luke Heeney@heeney_luke · Jul 18

Academia must be the only industry where extremely high-skilled PhD students spend much of their time doing low value work (like data cleaning). A 1st year management consultant outsources this immediately. Imagine the productivity gains if PhDs could focus on thinking

606

Rosmine@rosmine_b · Jul 22

My friend @alignment_lab is selling prebuilt 2x 5090 machines for $5K. Great deal if you want to get started with your own gpus

Alignment Lab AI@alignment_lab · Jul 22

Introducing SENTER. We are announcing the availability of SENTER, a powerful workstation we built to perform research and train AI without the extreme costs of cloud and API fees. It's designed to put your intelligence, data, privacy, and productivity back into your hands.…

Rosmine@rosmine_b · Jul 22

I currently have 2 gpus doing training... and 4 running data cleaning

kalomaze@kalomaze · Jul 22

> low value work (like data cleaning)
uh...

782

Rosmine@rosmine_b · Jul 22

> low value work (like data cleaning)
uh...

Luke Heeney@heeney_luke · Jul 18

129

5.0K

509

269.0K

Rosmine@rosmine_b · Jul 21

Source: trust me bro

190

Rosmine@rosmine_b · Jul 10

Just talked to a Meta recruiter. I asked about comp and he said he didn't know lol. Of course he knows, he just doesn't want to say it because with Zuck's hiring spree, people will be disappointed with anything less than 10M

230

Rosmine@rosmine_b · Jul 9

My training runs kept becoming unstable late in training. Turns out a "clever" optimization hack I tried was effectively increasing the LR the closer it got to a local min, making convergence impossible

Yu Xiang@YuXiang_IRVL · Jul 8

A general research tip: pay attention to the details. When something strange happens—don’t ignore it. Especially in robotics, odd or incorrect robot behavior often reveals deeper insights. Dig in and ask why

559
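
The tweet above doesn't say what the "clever" hack actually was, so the following is only a toy illustration of the failure mode it describes, not the author's code. It assumes a hypothetical rule that divides the learning rate by the gradient magnitude, which makes the effective LR grow as the gradient shrinks near a minimum. On f(x) = x^2, plain gradient descent converges, while the hypothetical rule keeps taking fixed-size steps and bounces around the minimum forever.

```python
def grad(x):
    # Gradient of the toy objective f(x) = x**2
    return 2.0 * x


def plain_gd(x, lr=0.3, steps=50):
    # Standard gradient descent: the step shrinks as the gradient shrinks
    for _ in range(steps):
        x -= lr * grad(x)
    return x


def hacked_gd(x, base_lr=0.3, steps=50, eps=1e-12):
    # HYPOTHETICAL "hack": effective LR = base_lr / |grad|.
    # As |grad| -> 0 near the minimum, the effective LR blows up,
    # so every step has roughly the same size (base_lr) and x never settles.
    for _ in range(steps):
        g = grad(x)
        effective_lr = base_lr / (abs(g) + eps)
        x -= effective_lr * g
    return x


x_plain = plain_gd(1.0)
x_hacked = hacked_gd(1.0)
print(f"plain GD final x:  {x_plain:.2e}")
print(f"hacked GD final x: {x_hacked:.2e}")
```

In this toy, the loss curve for the second variant would look fine early on and then stall or oscillate late in training, which matches the symptom described.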

Rosmine@rosmine_b · May 30

Reminder: If your RL reward has multiple components on very different scales, the model’s going to take a loooooong time to learn from the smaller components

413
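
A minimal sketch of why the scale mismatch in the reminder above matters, and one common mitigation (standardizing each component before summing). The component names, values, and the `combine_rewards` helper are made up for illustration; this is not the author's setup.

```python
from statistics import mean, pstdev


def standardize(values, eps=1e-8):
    # Rescale one reward component to zero mean and (roughly) unit variance
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / (s + eps) for v in values]


def combine_rewards(components, weights=None):
    # components: dict of component name -> list of per-sample rewards.
    # Each component is standardized first so no single scale dominates the sum.
    names = list(components)
    weights = weights or {n: 1.0 for n in names}
    scaled = {n: standardize(components[n]) for n in names}
    n_samples = len(components[names[0]])
    return [sum(weights[n] * scaled[n][i] for n in names) for i in range(n_samples)]


# Hypothetical rewards for 4 samples: one component on a 0-100 scale, one on 0-1
rewards = {
    "big_scale_score": [10.0, 50.0, 90.0, 30.0],
    "small_scale_check": [0.0, 1.0, 0.0, 1.0],
}

# Naive sum: the 0-1 component barely changes the ranking of samples
raw_total = [a + b for a, b in zip(rewards["big_scale_score"], rewards["small_scale_check"])]

# Standardized sum: both components now influence the ranking
balanced = combine_rewards(rewards)

print("raw:     ", raw_total)
print("balanced:", [round(b, 3) for b in balanced])
```

With the naive sum, sample 2 wins purely on the large-scale component. After standardizing, sample 1 (decent score and passing the small-scale check) comes out ahead, so the smaller component actually produces a learning signal.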