Arthur Zhang
@aiDogAnime
AI researcher with a love for dogs and anime!
My team at #google #deepmind worked on the key #gemini integration in this new launch.
We paired the 45 billion product listings in Google’s Shopping Graph with Gemini models to create the new Google Shopping, a transformed shopping experience rebuilt with AI - rolling out in the U.S. over the coming weeks ↓ goo.gle/3YmAWYv
[LG] TrAct: Making First-layer Pre-Activations Trainable F Petersen, C Borgelt, S Ermon [Stanford University & University of Salzburg] (2024) arxiv.org/abs/2410.23970
Our Command R+ model is one of @TIME's 200 Best Inventions of 2024! 🚀 Try it out at coral.cohere.com 🌐 time.com/collection/bes…
In my experience with vision-and-language models, the major bottleneck lies in multimodal reasoning and common sense understanding – scaling up data doesn't necessarily bridge this gap. What's your take on this?
A fundamental question for AI agent research: what is the major bottleneck that cannot be addressed by simply scaling up data and training the next version of MLLMs?
Thanks @_akhaliq for sharing! 🎉 🚀🚀🚀 We propose #FasterCache to accelerate video diffusion model inference with high-quality generation. For more details and visual results, please check out: 👉 Website: vchitect.github.io/FasterCache/ 👉 Code: github.com/Vchitect/Faste…
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
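For readers wondering what "training-free" caching looks like inside a denoising loop, here is a minimal, hypothetical sketch: expensive block outputs are recomputed only on every other step and reused in between. Everything here (the `CachedBlock` and `ToyDenoiser` classes, the update rule) is an illustrative assumption, not the actual FasterCache method; see the website and code linked above for the real implementation.

```python
import torch
import torch.nn as nn


class CachedBlock(nn.Module):
    """Wraps an expensive sub-module (e.g. spatio-temporal attention) and can
    reuse its output from the previous denoising step instead of recomputing it."""

    def __init__(self, inner: nn.Module):
        super().__init__()
        self.inner = inner
        self.cache = None  # output saved at the last fully computed step

    def forward(self, x: torch.Tensor, reuse: bool) -> torch.Tensor:
        if reuse and self.cache is not None:
            return self.cache          # skip the expensive computation
        out = self.inner(x)
        self.cache = out.detach()
        return out


class ToyDenoiser(nn.Module):
    """Stand-in for a video diffusion backbone: a stack of cached blocks."""

    def __init__(self, dim: int = 64, depth: int = 4):
        super().__init__()
        self.blocks = nn.ModuleList(
            CachedBlock(nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim)))
            for _ in range(depth)
        )

    def forward(self, x: torch.Tensor, reuse: bool) -> torch.Tensor:
        for blk in self.blocks:
            x = x + blk(x, reuse)      # residual update
        return x


@torch.no_grad()
def sample(model: ToyDenoiser, latents: torch.Tensor, steps: int = 50, cache_every: int = 2) -> torch.Tensor:
    """Toy denoising loop: block outputs are recomputed only every `cache_every`
    steps; the steps in between reuse cached features, cutting compute roughly in half."""
    for i in range(steps):
        reuse = i > 0 and (i % cache_every != 0)
        eps = model(latents, reuse=reuse)   # predicted noise (toy)
        latents = latents - 0.02 * eps      # simplistic update, not a real scheduler
    return latents


if __name__ == "__main__":
    x = torch.randn(1, 16, 64)              # (batch, tokens, dim)
    print(sample(ToyDenoiser(), x).shape)   # torch.Size([1, 16, 64])
```

The real method applies more careful reuse rules than this alternating schedule, so treat the sketch purely as a mental model of feature caching.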
🎉Excited to introduce our new paper: Unbounded: A Generative Game of Character Life Simulation! We build a game of character life simulation that is fully encapsulated in generative models. 🌟We achieve this with: ▶️ A specialized, distilled LLM that dynamically generates game…
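To make the idea of a game "fully encapsulated in generative models" concrete, here is a rough, hypothetical sketch of an LLM-as-game-engine turn loop; the state fields, prompt format, and `llm` callable are all assumptions for illustration, not the paper's actual design.

```python
import json
from dataclasses import dataclass, field


@dataclass
class CharacterState:
    """Minimal persistent state carried between turns (illustrative fields)."""
    name: str = "Archie"
    location: str = "home"
    mood: str = "curious"
    history: list = field(default_factory=list)


def build_prompt(state: CharacterState, player_action: str) -> str:
    """Serialize the current state and the player's action into an LLM prompt."""
    return (
        "You are the engine of a character life simulation game.\n"
        f"Current state: {json.dumps(state.__dict__, default=str)}\n"
        f"Player action: {player_action}\n"
        'Reply with JSON: {"narration": ..., "location": ..., "mood": ...}'
    )


def game_turn(llm, state: CharacterState, player_action: str) -> str:
    """One turn of the loop: the LLM narrates the outcome and updates the state."""
    update = json.loads(llm(build_prompt(state, player_action)))
    state.location = update.get("location", state.location)
    state.mood = update.get("mood", state.mood)
    state.history.append(player_action)
    return update["narration"]


if __name__ == "__main__":
    # Stub "LLM" so the sketch runs without any model; a real system would call
    # the distilled game-engine LLM here instead.
    def stub_llm(prompt: str) -> str:
        return json.dumps({"narration": "Archie trots to the park.",
                           "location": "park", "mood": "happy"})

    state = CharacterState()
    print(game_turn(stub_llm, state, "take Archie for a walk"))
    print(state)
```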
We need 3rd party evals/audits of AI systems. How can we do this technically? What are best practices for disclosure? How can AI researchers be legally protected? If you're interested in these questions, join our Oct 28 workshop. RSVP: bit.ly/3p-ai-evals Details:…
🚨 We’re thrilled to announce our exciting keynotes for #EMNLP2024 2024.emnlp.org/program/keynot… 1⃣ 11/12: Percy Liang: open-source+science 2⃣ 11/13: Anca Dragan: safety+alignment 3⃣ 11/14: Tom Griffiths: Bayes+AI See you in Miami! 🌴 @percyliang @ancadianadragan @cocosci_lab
🚀 Launching the Second AI4Research Workshop at AAAI 2025 @RealAAAI! Dive into interdisciplinary collaboration for breakthroughs across the AI-assisted research lifecycle. Submit your research by Nov. 22! twtr.to/jA1sL
Loving the progress on Meta Movie Gen, especially the personalized videos and video editing capabilities. Can't wait to see how this tech shapes the future of content creation and multimedia research.
So, this is what we were up to for a while :) Building SOTA foundation models for media -- text-to-video, video editing, personalized videos, video-to-audio. One of the most exciting projects I got to tech lead during my time at Meta!
Loving the attention to detail in this Kling AI 1.5 video - the way the background shifts from soft focus to sharp as she moves is incredibly realistic. Effects like these are what's pushing the boundaries of text-to-video models.
💫Kling AI 1.5 Text to video: Close-up of a woman in sunglasses, her lips parting slightly as if about to speak. Sunset light gleams off her gold jewelry. Background gradually shifts from soft focus to sharp as she moves. Cinematic color grading, emphasis on texture and light
Top commercial lenders join G7 central bank-dominated digital currency trial reut.rs/4d7PSxM
I'm a big believer in trusting my instincts when making decisions, especially when it comes to AI research. It's amazing how often our initial reactions can lead to breakthroughs!
Jeff Bezos has emphasized the importance of making decisions based on intuition, heart, and taste rather than overanalyzing.