Jenia Jitsev
@JJitsev
CLIP Interrogator infers: "Arbeitsrat für Kunst, AI Researcher, meet the actor behind the scenes, with curls" they/them. Co-founder & scientific lead LAION e.V.
Our new work uses scaling law derivation to enable robust model and dataset comparison, a step towards guided, reproducible progress in open foundation model research. Following the comparison, we also release openMaMMUT-L-14 with 80.34% zero-shot ImageNet-1k accuracy. arxiv.org/abs/2506.04598 1/n
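Not code from the paper, just a minimal sketch of the kind of scaling-law fit such a comparison rests on: fit a saturating power law err(C) = E + a·C^b to held-out error measured at several compute budgets for each model/dataset combination, then compare the extrapolated curves rather than single checkpoints. All names and numbers below are illustrative assumptions, not results from the paper.

```python
# Sketch of a scaling-law fit for model/dataset comparison.
# Assumption: error follows a saturating power law err(C) = E + a * C**b (b < 0);
# the data points below are synthetic placeholders, not measurements from the paper.
import numpy as np
from scipy.optimize import curve_fit

def power_law(compute, E, a, b):
    # Irreducible error E plus a power-law term that decays with compute.
    return E + a * np.power(compute, b)

def fit_scaling_curve(compute, error):
    # Bounds keep E >= 0 and the exponent b <= 0.
    popt, _ = curve_fit(
        power_law, compute, error,
        p0=(0.1, 0.3, -0.3),
        bounds=([0.0, 0.0, -2.0], [1.0, 10.0, 0.0]),
    )
    return popt

# Synthetic error at a few compute budgets (compute in arbitrary units, e.g. 1e18 FLOPs).
compute = np.array([1.0, 3.0, 10.0, 30.0, 100.0])
err_model_a = np.array([0.42, 0.37, 0.33, 0.30, 0.28])
err_model_b = np.array([0.45, 0.38, 0.32, 0.28, 0.25])

fit_a = fit_scaling_curve(compute, err_model_a)
fit_b = fit_scaling_curve(compute, err_model_b)

# Compare extrapolated error at a common larger budget, not at a single trained scale.
target = 1000.0
print("model A @ target:", power_law(target, *fit_a))
print("model B @ target:", power_law(target, *fit_b))
```

In this toy setup model B looks worse at small compute but wins on trend, which is exactly the kind of conclusion a single-scale comparison would miss.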
Matching/beating the closed-data DINOv2 and SigLIP2 with open data (ImageNet-21k and LAION-600M taken from Re-LAION-2B). Vision-only SSL, no language involved. Impressive work, and further progress for the open-source community.
Can open-data models beat DINOv2? Today we release Franca, a fully open-sourced vision foundation model. Franca with a ViT-G backbone matches (and often beats) proprietary models like SigLIPv2, CLIP, and DINOv2 on various benchmarks, setting a new standard for open-source research 🧵
If you would like to work on open foundation models and the datasets necessary for their creation, and to cooperate with LAION @laion_ai - here is an opportunity to have your own lab and work on the largest supercomputers in the EU together with LAION, doing open-source ML/AI: fz-juelich.de/en/careers/job…
Giving up fixed image resolution at the vision tower input is a very important step for open-source software stacks for studying and training SOTA vision and language-vision models. Great effort by @wightmanr, importantly including hacking efficient data loaders. timm and openCLIP users rejoice.
timm's got a new vision transformer (NaFlexVit), and it's flexible! I've been plugging away at this for a bit, integrating ideas from FlexiViT, NaViT, and NaFlex and finally ready to merge for initial exploration. The model supports: * variable aspect/size images of NaFlex (see…
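Not the NaFlexVit API itself (see the thread and timm docs for that), but a small sketch of what variable-resolution input already looks like in timm via the dynamic_img_size option for standard ViTs; the model name and resolutions here are illustrative assumptions, and NaFlexVit pushes this further towards native-aspect patch sequences.

```python
# Sketch: running one ViT on differently sized inputs using timm's
# dynamic_img_size option (position embeddings are interpolated per input size).
# Model name and resolutions are placeholders; weights are randomly initialized.
import torch
import timm

model = timm.create_model(
    "vit_base_patch16_224",   # placeholder architecture
    pretrained=False,
    dynamic_img_size=True,    # accept inputs that differ from the 224x224 default
)
model.eval()

with torch.no_grad():
    # Side lengths chosen to stay divisible by the 16-pixel patch size.
    for h, w in [(224, 224), (256, 192), (384, 384)]:
        x = torch.randn(1, 3, h, w)   # dummy batch at this resolution
        out = model(x)
        print(f"input {h}x{w} -> output {tuple(out.shape)}")
```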
For releases considered important, please compare with relevant reference models of similar compute and kind, not just with those against which the release looks better.
Oh there it is:
Magistral Medium:
AIME-24 - 73.6%
AIME-25 - 64.9%
GPQA - 70.8%
LiveCodeBench - 59.4%
Aider - 47.1%
vs Qwen3-30B-A3B:
AIME-24 - 80.4%
AIME-25 - 70.9%
GPQA - 65.8%
LiveCodeBench - 62.6%
Aider - idk, but Qwen32B gets 50.2%