I am a PhD student in the VLAA Lab at UCSC, advised by Prof. Yuyin Zhou and Prof. Cihang Xie, working on training large-scale neural networks for biological discovery. Previously, I received my M.Eng. from Tsinghua University (THU), advised by Prof. Jiwen Lu and Prof. Yansong Tang, where I worked on vision-language models and human digitization. I received my B.S. in CS from Beijing Normal University (BNU) in 2021.
News
- 2024-10: A preprint on text-to-image long story visualization.
- 2024-09: Start my PhD journey at UCSC!
- 2024-06: A preprint on visual compression with LLM, featured in Hugging Face 🤗 Daily Papers!
- 2024-06: A preprint on unified in-context medical vision models.
- 2024-02, 2023-12: (Accepted by CVPR’24) Excited to share a new preprint enhancing SAM with regional captioning capabilities (featured in Hugging Face 🤗 Daily Papers)! Had amazing days at Microsoft!
- 2023-05: Start an internship at Microsoft Research Lab - Asia (MSRA)
- 2023-03: A preprint on efficient human digitization
- 2022-09: A paper on language-guided ordinal regression accepted by NeurIPS’22
- 2021-03: A paper on uncertainty-aware ordinal regression accepted by CVPR’21
Publications and Preprints
For more works please check here.
* indicates equal contribution.
Vision-Language Learning
Segment and Caption Anything
Xiaoke Huang, Jianfeng Wang, Yansong Tang, Zheng Zhang, Han Hu, Jiwen Lu, Lijuan Wang, Zicheng Liu
Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[project page] [paper] [code]
OrdinalCLIP: Learning Rank Prompts for Language-Guided Ordinal Regression
Wanhua Li*, Xiaoke Huang*, Zheng Zhu, Yansong Tang, Xiu Li, Jiwen Lu, Jie Zhou
Conference on Neural Information Processing Systems (NeurIPS), 2022
[project page] [paper] [code] [Chinese-language explainer]
Human Digitization
EMA: Efficient Meshy Neural Fields for Animatable Human Avatars
Xiaoke Huang, Yiji Cheng, Yansong Tang, Xiu Li, Jiwen Lu, Jie Zhou
Preprint, 2023
[project page] [paper] [code] [demo video]
SD-NeRF: Lifelike Talking Head Animation via Spatially-adaptive Dual-driven NeRFs
Shuai Shen*, Wanhua Li*, Xiaoke Huang*, Zheng Zhu, Jie Zhou, Jiwen Lu
IEEE Transactions on Multimedia (TMM), 2023
[paper]
Internship
Research Intern, Microsoft Research - Asia (MSRA), Beijing, China. April-September, 2023.
Project: Generative Regional Understanding with Vision and Language
Worked with Dr. Jianfeng Wang, Dr. Zheng Zhang, Dr. Han Hu, Dr. Lijuan Wang, and Dr. Zicheng Liu.
Awards
Scholarship
- National Scholarship of China, March 2021
- Huiyan Talent Second Prize of THU, Nov. 2022
- JingShi First Prize of BNU, Oct. {2018, 2019, 2020}
Programming Contest
- ACM-ICPC Contest Jiangsu, Silver Medal, June 2018
- ICPC Asia Regional Contest {Nanchang, Xuzhou, Shanghai}, Bronze Medal, {June, Nov., Nov.} 2019
Activities
Reviewer of CVPR’{22,23,24}, ICCV’23, ECCV’22, ICML’24, FG’{23,24}.
Links
- GitHub: https://www.github.com/xk-huang
- Google Scholar: https://scholar.google.com/citations?user=BD9AT04AAAAJ&hl=en
- Twitter: https://twitter.com/xiaoke_shawn_h
- LinkedIn: https://www.linkedin.com/in/xiaoke-huang-283470189/
- E-mail: click me
Misc
I take pleasure in reading non-fiction books. A recent and highly recommended read is Johann Hari’s “Stolen Focus”, which thoroughly examines the challenges of living in this information-saturated era.
Additionally, I often go hiking in the suburbs, where I find tranquility and inner peace.
Since June 2023, I have been passionate about bodybuilding, as it deeply connects me with the sensation of “being present”.
English Proficiency: IELTS 7.5 (L/R 8.5, W/S 6.5).
My Chinese name is 黄 小可 (Huang, Xiaoke).