Xiaoke Huang

Master's student at THU; Computer Vision & Computer Graphics

I am a third-year Master’s student at Tsinghua University, advised by Prof. Jiwen Lu and Prof. Yansong Tang. Before that, I received a B.S. in Computer Science from Beijing Normal University (BNU) in 2021. My current research focuses on multi-modal learning and human digitization.

My primary research interest is in developing multi-modal generative systems, encompassing modalities such as 2D/3D vision and language, to enable general-purpose understanding, reasoning, and planning in the physical world. I am also interested in the alignment of such systems. I have a wide range of interests in both computer vision and computer graphics.

I am applying for Fall 2024 Ph.D. programs and am open to potential collaborations. If you are interested in discussing opportunities or have any questions, please feel free to email me. I genuinely appreciate your consideration and look forward to connecting with you.


  • 2022-09: A paper on language-guided ordinal regression accepted to NeurIPS’22
  • 2021-03: A paper on uncertainty learning and ordinal regression accepted to CVPR’21

Publications and Preprints

For more of my work, please check here.

* indicates equal contribution.

Vision-Language Learning

  • Segment and Caption Anything
    Xiaoke Huang, Jianfeng Wang, Yansong Tang, Zheng Zhang, Han Hu, Jiwen Lu, Lijuan Wang, Zicheng Liu
    Preprint (accepted by CVPR 2024), 2023
    [project page] [paper] [code]

  • OrdinalCLIP: Learning Rank Prompts for Language-Guided Ordinal Regression
    Wanhua Li*, Xiaoke Huang*, Zheng Zhu, Yansong Tang, Xiu Li, Jiwen Lu, Jie Zhou
    Conference on Neural Information Processing Systems (NeurIPS), 2022
    [project page] [paper] [code] [explainer in Chinese]


Human Digitization

  • EMA: Efficient Meshy Neural Fields for Animatable Human Avatars
    Xiaoke Huang, Yiji Cheng, Yansong Tang, Xiu Li, Jiwen Lu, Jie Zhou
    Preprint, 2023
    [project page] [paper] [code] [demo video]

  • SD-NeRF: Lifelike Talking Head Animation via Spatially-adaptive Dual-driven NeRFs
    Shuai Shen*, Wanhua Li*, Xiaoke Huang*, Zheng Zhu, Jie Zhou, Jiwen Lu
    IEEE Transactions on Multimedia (TMM), 2023



Research Intern, Microsoft Research Asia (MSRA), Beijing, China. April-September 2023.



  • National Scholarship of China, March 2021
  • Huiyan Talent Second Prize of THU, Nov. 2022
  • JingShi First Prize of BNU, Oct. {2018, 2019, 2020}

Programming Contest

  • ACM-ICPC Contest Jiangsu, Silver Medal, June 2018
  • ICPC Asia Regional Contest {Nanchang, Xuzhou, Shanghai}, Bronze Medal, {June, Nov., Nov.} 2019


Reviewer for CVPR’{22,23,24}, ICCV’23, ECCV’22, FG’{23,24}, VCIP’22.



I enjoy reading non-fiction. A recent read that I highly recommend is Johann Hari’s “Stolen Focus”, which thoroughly examines the challenges of living in this information-saturated era.

Additionally, I often go hiking in the suburbs, where I find tranquility and inner peace.

Since June 2023, I have been passionate about bodybuilding, as it deeply connects me with the sensation of “being present”.

English Proficiency: IELTS 7.5 (L/R 8.5, W/S 6.5).

My Chinese name is 黄 小可 (Huang Xiaoke).