Max Schwarzer
@max_a_schwarzer
Post-training @OpenAI
I have always believed that you don't need a GPT-6 quality base model to achieve human-level reasoning performance, and that reinforcement learning was the missing ingredient on the path to AGI. Today, we have the proof -- o1. x.com/OpenAI/status/…
We're releasing a preview of OpenAI o1—a new series of AI models designed to spend more time thinking before they respond. These models can reason through complex tasks and solve harder problems than previous models in science, coding, and math. openai.com/index/introduc…
Introducing OpenAI o3 and o4-mini—our smartest and most capable models to date. For the first time, our reasoning models can agentically use and combine every tool within ChatGPT, including web search, Python, image analysis, file interpretation, and image generation.
Here is a fun o1 test. I gave it this XKCD comic & the prompt: "make this a reality. i need a gui and clear instructions since i can't code. that means you need to give me full working software" It took less than 15 minutes, and it didn't get caught in any of the usual LLM loops
I have never seen a clearer case of near-enemy/far-enemy than tech people deciding to align with religious fundamentalists and Slavic nationalists because they’re annoyed with wokeness. It’s the exact analog of leftist Ivy League activists getting annoyed by their consultant…
We’re hosting an AMA for developers from 10–11 AM PT today. Reply to this thread with any questions and the OpenAI o1 team will answer as many as they can.
don't worry we're coming for your eval soon
tried o1-preview on @arcprize. result: 1 out of 2 tests correct, so o1-preview isn't going to solve 100% of ARC Prize tasks. tbd on what % it gets compared to SOTA approaches, still testing the rest of the lot
I really want to underline the IOI result in our blog post -- our model was as good as the median human contestant under IOI contest conditions, and scores among the best contestants with more test-time compute. Huge props to @markchen90 for setting such an ambitious goal!
As a coach for the US IOI team, I’ve been motivated for a long time to create models which can perform at the level of the most elite competitors in the world. Check out our research blog post - with enough samples, we achieve gold medal performance on this year’s IOI and ~14/15…
what it looks like when deep learning is hitting a wall:
Strawberry has landed. 𝗛𝗼𝘁 𝘁𝗮𝗸𝗲 𝗼𝗻 𝗚𝗣𝗧'𝘀 𝗻𝗲𝘄 𝗼𝟭 𝗺𝗼𝗱𝗲𝗹: It is definitely impressive. BUT 0. It’s not AGI, or even close. 1. There’s not a lot of detail about how it actually works, nor anything like full disclosure of what has been tested. 2. It is not…
Truly God is great
Donald Trump has been convicted on all 34 counts of falsifying business records to cover up a sex scandal that threatened his ascent to the White House in 2016. He is the first U.S. president to be declared a felon. Follow live updates. nyti.ms/4bRN7AE
Our team @Apple will be in Vienna next week for #ICLR2024, where we will be presenting our work on using large language models as RL policies. Come drop by our poster! Website: llm-rl.github.io Joint work with @max_a_schwarzer, @harsh_092, @alexttoshev and others
Introducing Sora, our text-to-video model. Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions. openai.com/sora Prompt: “Beautiful, snowy…
Exceptionally funny to see someone arguing that technological progress is slowing down rely on AIs created in the last 1-2 years to do his work for him while not suffering any cognitive dissonance as a result
The most important inventions of the decade of the 1900s vs the decade of the 2000s. Pretty good evidence for secular stagnation. Source: Mostly various LLMs but had to do a lot of verifying/vetting. Some inventions are hard to date precisely. Other suggestions welcome.