Anastasis Germanidis
@agermanidis
Simple ideas, pursued maximally. Co-Founder & CTO @runwayml.
Towards Universal Simulation agermanidis.com/writings/unive…
We are beyond the tipping point. AI is becoming the foundational layer upon which most creative work will be built. Right now, the question is not whether most media will be generated, but how quickly. The progress we see is not slowing, quite the opposite. Real-time generation,…
Models just want to generalize. For the past few years, we’ve been pushing the frontier of controllability in video, releasing new models and techniques for inpainting, outpainting, segmentation, stylization, keyframing, motion and camera control. Aleph is a single in-context model…
Introducing Runway Aleph, a new way to edit, transform and generate video. Aleph is a state-of-the-art in-context video model, setting a new frontier for multi-task visual generation, with the ability to perform a wide range of edits on an input video such as adding, removing…
Introducing Act-Two, our next-generation motion capture model with major improvements in generation quality and support for head, face, body and hand tracking. Act-Two only requires a driving performance video and a reference character. Available now to all our Enterprise…
This summer has a similar feeling to that of 2022. Something has clicked with the latest generation of models and there’s suddenly so much low-hanging fruit. Back to making new prototypes every weekend.
Starting today, we're announcing an 84-hour open challenge for the most interesting app built with the Runway API. The winning app will receive $1,000 and 1,000,000 API credits. Apps will be assessed for originality, functionality and unexpected industry use cases. Submissions…
Lucid Dream Test
Imagine the following scenario. You enter a room and you are asked to wear a VR headset that has a camera and supports a passthrough mode (which can display a real-time feed of your surroundings). Once you put on the headset, you find yourself in what appears to…
Total Pixel Space, which won the Grand Prix at this year's AIFF, is a wonderful video essay and, by the way, one of the clearest descriptions of universal simulation (as search in the space of all possible universes) youtube.com/watch?v=zpAeyg…
Great work by @graceluo_ and @jongranskog on aligning diffusion models with VLM feedback in minutes, which can be used to improve commonsense reasoning and enable many kinds of visual prompting.
✨ New preprint: Dual-Process Image Generation! We distill *feedback from a VLM* into *feed-forward image generation* at inference time. The result is flexible control: parameterize tasks as multimodal inputs, visually inspect the images with the VLM, and update the generator. 🧵
What would you build with a programmable world simulator?
Expanding the Gen-4 API with generalist image capabilities:
Earlier this month, we released Gen-4 References, our most general and flexible image generation model yet. It became one of our most popular releases ever, with new use cases and workflows being discovered every minute. Today, we’re making it available via the Runway API,…
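For anyone picking up the API thread above, here is a minimal sketch of what a Gen-4 References request could look like from the runwayml Python SDK. The model identifier, the text_to_image endpoint, the reference_images parameter and the polling fields are assumptions based on the SDK's general task-then-poll pattern, not details confirmed in these posts.

```python
# Minimal sketch, not an official example. Assumptions: the `runwayml` Python SDK,
# a Gen-4 image endpoint that accepts reference images, and task polling;
# model and parameter names may differ from the real API.
import time

from runwayml import RunwayML

client = RunwayML()  # reads RUNWAYML_API_SECRET from the environment

# Hypothetical Gen-4 References request: generate an image guided by a tagged reference.
task = client.text_to_image.create(
    model="gen4_image",  # assumed model identifier
    prompt_text="@hero standing in a rain-soaked neon alley",
    ratio="1920:1080",
    reference_images=[
        {"uri": "https://example.com/hero.png", "tag": "hero"},  # hypothetical reference image
    ],
)

# Poll the task until it finishes, then print its status and any outputs.
while True:
    task = client.tasks.retrieve(task.id)
    if task.status in ("SUCCEEDED", "FAILED"):
        break
    time.sleep(5)

print(task.status, getattr(task, "output", None))
```

The video endpoints mentioned in the API challenge above should follow the same shape: create a task, then poll it until it succeeds or fails.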
As a field, we're just scratching the surface in building multimodal simulators of the universe. This will be a long journey, and the true purpose of deep learning.
CVPR returns this June. Join us Thursday, June 12th for our annual CVPR Friends Dinner. RSVP at the link below.
I'm having more fun with @runwayml's Gen-4 References than I've had with an AI model in a while. This evening I started with a single image and created a whole video sequence in a couple of hours. It feels like a natural way to create. Audio is a track I generated a while ago…
References has been the clearest demonstration yet, for me, that if you focus on the problems you really need to solve, rather than the problems that feel most solvable, deep learning will reward you for it.
Today we are releasing Gen-4 References to all paid plans. Now anyone can generate consistent characters, locations and more. With References, you can use photos, generated images, 3D models or selfies to place yourself or others into any scene you can imagine. More examples…
We have released early access to Gen-4 References to all Gen:48 participants. References allows you to create consistent worlds with consistent characters and locations. This early preview is already available to all teams participating in Gen:48 for free. Good luck with your…
A different framing: we're in one continuous era of simulation. The only thing that changes is what's being simulated: from toy worlds, to the world as perceived by humans, to the world beyond human perception.
The short paper "Welcome to the Era of Experience" was literally just released, like this week. Ultimately it will become a chapter in the book 'Designing an Intelligence', edited by George Konidaris and published by MIT Press. goo.gle/3EiRKIH
I’m excited to join Runway! There’s a lot to explore at the edges of media and creativity, and it’s great time to rethink storytelling which is core at Runway’s mission. I’ll be moving to NYC as well - if you wanna grab coffee, text me :)
weirdly often, people ask me 'do you miss the games industry?' and my answer is always 'i miss the lovely people! and the art-code-engineering-r&d cross-collaborations you get' - not being 100% tech. i recently joined @runwayml because it has both those things in spades...
Gen-4 Turbo is an amazing feat of research and engineering. To give a sense of the improvement, in our internal evals its outputs were preferred ~90% of the time compared to those of non-Turbo Gen-3 Alpha.
Today we’re introducing Gen-4 Turbo. The fastest way to generate with our most powerful video model yet. With Gen-4 Turbo it now takes just 30 seconds to generate a 10 second video, making it ideal for rapid iteration and creative exploration. Now rolling out across all plans.