My name is Xiaoke Huang, and I am a Ph.D. student at UC Santa Cruz, advised by Prof. Yuyin Zhou and Prof. Cihang Xie. My research focuses on multi-modal reasoning, agentic models, and AI for healthcare. Previously, I received my Master’s degree from Tsinghua University, where I worked on vision–language learning and 3D reconstruction, and my Bachelor’s degree from Beijing Normal University. I have interned at Microsoft Research and Meta.

News

  • 2025-06 Started an internship at Meta.
  • 2025-03 A preprint about test-time scaling for medical reasoning LLMs.
  • 2025-01 A preprint on human DNA methylation prediction.
  • 2024-10 A preprint on text-to-image long story visualization.
  • 2024-09 Started my PhD journey at UCSC.
  • 2024-06 (Accepted to CVPR’25) A preprint on visual compression with LLMs, featured in Hugging Face 🤗 Daily Papers!
  • 2024-06 A preprint on unified in-context medical vision models.
  • 2023-12 (Accepted to CVPR’24) Excited to share a new preprint enhancing SAM with regional captioning capabilities (featured in Hugging Face 🤗 Daily Papers)! Had amazing days at Microsoft!
  • 2023-05 Started an internship at Microsoft Research Lab - Asia (MSRA).
  • 2023-03 A preprint on efficient human digitization.
  • 2022-09 A paper on language-guided ordinal regression accepted to NeurIPS’22.
  • 2021-03 A paper on uncertainty-aware ordinal regression accepted to CVPR’21.

Publications and Preprints

For more work, please check here.

* indicates equal contribution.

Vision-Language Learning

  • Segment and Caption Anything
    Xiaoke Huang, Jianfeng Wang, Yansong Tang, Zheng Zhang, Han Hu, Jiwen Lu, Lijuan Wang, Zicheng Liu
    Conference on Computer Vision and Pattern Recognition (CVPR), 2024
    [project page] [paper] [code]

  • OrdinalCLIP: Learning Rank Prompts for Language-Guided Ordinal Regression
    Wanhua Li*, Xiaoke Huang*, Zheng Zhu, Yansong Tang, Xiu Li, Jiwen Lu, Jie Zhou
    Conference on Neural Information Processing Systems (NeurIPS), 2022
    [project page] [paper] [code] [中文解读]


Human Digitization

Internship

Research Intern, Meta (MGenAI), London, UK. June-November 2025.

Research Intern, Microsoft Research Asia, Beijing, China. April-September 2023.

Misc

I enjoy reading non-fiction.

Some recent and highly recommended selections (Dec. 2025):

  • Exercised: Why Something We Never Evolved to Do Is Healthy and Rewarding,
  • The Story of the Human Body: Evolution, Health and Disease,
  • Fooled by Randomness: The Hidden Role of Chance in Life and in the Markets,
  • The Black Swan,
  • Antifragile: Things That Gain from Disorder,
  • Skin in the Game: Hidden Asymmetries in Daily Life,
  • and The Fabric of Reality: The Science of Parallel Universes and Its Implications.