Frazier Huo
@Zepeng_Huo
What would truly open-source AI look like? Not just open weights, open code/data, but *open development*, where the entire research and development process is public *and* anyone can contribute. We built Marin, an open lab, to fulfill this vision:
Public releases Three de-identified longitudinal EHR datasets (EHRSHOT, INSPECT, MedAlign) with 25,991 patients & 295M clinical events (1997-2023), now freely available for non-commercial AI…
We're thrilled to announce the general release of three de-identified, longitudinal EHR datasets from Stanford Medicine, now freely available for non-commercial research use worldwide! Read our HAI blog post for more details: hai.stanford.edu/news/advancing… Dataset…
Excited to share that our latest research, Time-to-Event Pretraining for 3D Medical Imaging, has been accepted at ICLR 2025! Improving Medical Image Pretraining with Time-to-Event…
Excited to introduce Merlin, a vision language foundation model for 3D computed tomography. Trained to understand 3D abdominal CT scans using supervision from: structured electronic health records (1.8+ million codes); natural language radiology reports (6+ million…
Longitudinal EHR data at such a large scale is perfect for building your own EHR foundation model!
Excited to announce the official full release of EHRSHOT -- a dataset of 6,739 deidentified longitudinal EHRs for few-shot eval of foundation models! Website: ehrshot.stanford.edu Dataset: redivis.com/datasets/53gc-… Model: huggingface.co/StanfordShahLa…
Excited to publish a dataset for #OpenScience
We're excited to introduce #INSPECT, a large-scale 3D multimodal medical imaging dataset #NeurIPS2023 19,402 Stanford Medicine Patients, 23,248 CT Scans + Paired Radiology Notes, Longitudinal EHRs, Clinician-validated task labels #DataCentricAI #Multimodal #3Dimaging 1/
Lots of hype around #LLMs in healthcare. What do clinicians really want from an #LLM? We asked them! Introducing #MedAlign, the first dataset of clinician-generated instructions + responses for EHRs. Paper: arxiv.org/abs/2308.14089 Website: medalign.stanford.edu
There's a lot of excitement around large language models (LLMs) for healthcare. But what's hype and what's real? In this paper, we review 84 such models to help health systems better understand and critically evaluate these technologies. Paper: nature.com/articles/s4174… (1/7)
Check out my progress today with the Activity app on my #AppleWatch.
