I am a third-year Master’s student at Tsinghua University, advised by Prof. Jiwen Lu and Prof. Yansong Tang. Before that, I received my B.S. in Computer Science from Beijing Normal University (BNU) in 2021. My current research focuses on multi-modal learning and human digitization.
My primary research interest is in developing multi-modal generative systems, encompassing modalities such as 2D/3D vision and language, to enable general-purpose understanding, reasoning, and planning in the physical world. I am also interested in the alignment of such systems. I have a wide range of interests in both computer vision and computer graphics.
I am excited to apply for Fall 2024 Ph.D. programs and to explore potential collaborations. If you are interested in discussing opportunities or have any questions, please feel free to EMAIL me. I genuinely appreciate your consideration and look forward to connecting with you.
News
- 2024-02, 2023-12: Excited to share a new preprint (accepted by CVPR 2024) that enhances SAM with regional captioning capabilities (featured in Hugging Face 🤗 Daily Papers)! Had an amazing time at Microsoft!
- 2023-08: A paper on spontaneous expression-aware talking heads was accepted by IEEE Transactions on Multimedia
- 2023-06: A preprint on text-guided 3D face generation
- 2023-05: Started an internship at Microsoft Research Lab - Asia (MSRA)
- 2023-03: A preprint on efficient human digitization
- 2022-09: A paper on language-guided ordinal regression accepted by NeurIPS’22
- 2021-03: A paper on uncertainty learning and ordinal regression accepted by CVPR’21
Publications and Preprints
For more work, please check here.
* indicates equal contribution.
Vision-Language Learning
Segment and Caption Anything
Xiaoke Huang, Jianfeng Wang, Yansong Tang, Zheng Zhang, Han Hu, Jiwen Lu, Lijuan Wang, Zicheng Liu
Preprint (accepted by CVPR 2024), 2023
[project page] [paper] [code]
OrdinalCLIP: Learning Rank Prompts for Language-Guided Ordinal Regression
Wanhua Li*, Xiaoke Huang*, Zheng Zhu, Yansong Tang, Xiu Li, Jiwen Lu, Jie Zhou
Conference on Neural Information Processing Systems (NeurIPS), 2022
[project page] [paper] [code] [中文解读]
Human Digitization
EMA: Efficient Meshy Neural Fields for Animatable Human Avatars
Xiaoke Huang, Yiji Cheng, Yansong Tang, Xiu Li, Jiwen Lu, Jie Zhou
Preprint, 2023
[project page] [paper] [code] [demo video]
SD-NeRF: Lifelike Talking Head Animation via Spatially-adaptive Dual-driven NeRFs
Shuai Shen*, Wanhua Li*, Xiaoke Huang*, Zheng Zhu, Jie Zhou, Jiwen Lu
IEEE Transactions on Multimedia (TMM), 2023
[paper]
Internship
Research Intern, Microsoft Research - Asia (MSRA), Beijing, China. April-September, 2023.
Project: Generative Regional Understanding with Vision and Language
Worked with Dr. Jianfeng Wang, Dr. Zheng Zhang, Dr. Han Hu, Dr. Lijuan Wang, and Dr. Zicheng Liu.
Awards
Scholarship
- National Scholarship of China, March 2021
- Huiyan Talent Second Prize of THU, Nov. 2022
- JingShi First Prize of BNU, Oct. {2018, 2019, 2020}
Programming Contest
- ACM-ICPC Contest Jiangsu, Silver Medal, June 2018
- ICPC Asia Regional Contest {Nanchang, Xuzhou, Shanghai}, Bronze Medal, {June, Nov., Nov.} 2019
Activities
Reviewer for CVPR’{22,23,24}, ICCV’23, ECCV’22, FG’{23,24}, VCIP’22.
Links
- GitHub: https://www.github.com/xk-huang
- Google Scholar: https://scholar.google.com/citations?user=BD9AT04AAAAJ&hl=en
- Twitter: https://twitter.com/xiaoke_shawn_h
- LinkedIn: https://www.linkedin.com/in/xiaoke-huang-283470189/
- E-mail: click me
Misc
I take pleasure in reading non-fiction books. A recent and highly recommended read is Johann Hari’s “Stolen Focus”, which thoroughly examines the challenges of living in this information-saturated era.
Additionally, I often go hiking in the suburbs, where I find tranquility and inner peace.
Since June 2023, I have been passionate about bodybuilding, as it deeply connects me with the sensation of “being present”.
English Proficiency: IELTS 7.5 (L/R 8.5, W/S 6.5).
My Chinese name is 黄 小可 (Huang Xiaoke).