CVPR 2026
¹Meta AI, ²UC Santa Cruz (*work done at Meta)
February 7, 2026
We argue this is a domain-data gap: glyph programs are underrepresented in standard LLM corpora.
A practical reason is that most real font data is distributed in binary files (for example .ttf and
.otf) rather than directly exposed as text-like SVG path programs. This highlights a current
generalization boundary of LLMs.
VecGlypher is a single multimodal language model that generates editable SVG glyph outlines directly from either natural-language style prompts or reference glyph images. Instead of relying on raster intermediates and post-vectorization, it autoregressively emits path tokens in one pass. The training recipe combines large-scale continuation on noisy Envato fonts with post-training on expert-tagged Google Fonts, alongside typography-aware preprocessing for stable long-sequence decoding.
Text-Referenced
Input: style tags + target character.
"Active, Cute, Vintage" + "V"
Output: a valid SVG path that matches the requested style and content.
Image-Referenced
Input: 1-8 exemplar glyph images + target character.
[ref glyphs] + "V"
Output: a style-consistent target glyph with closed, editable outlines.
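Both modes promise a valid path with closed outlines, and that property is easy to check on generated output. The following is a minimal sketch of such a check, not the authors' evaluation code: the function name `has_closed_outlines` is hypothetical, and the check only confirms that the string contains SVG path commands and numbers, starts with a moveto, and closes every subpath with `Z`/`z`. It does not validate per-command argument counts.

```python
import re

# One SVG path token: a command letter or a number (sign, fraction, exponent).
_TOKEN = re.compile(
    r"[MmLlHhVvCcSsQqTtAaZz]|[-+]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][-+]?\d+)?"
)
_COMMANDS = set("MmLlHhVvCcSsQqTtAaZz")


def has_closed_outlines(d: str) -> bool:
    """Return True if every subpath in the path data `d` ends with Z/z."""
    # Anything left after removing tokens, commas and whitespace is not path data.
    leftover = _TOKEN.sub("", d).replace(",", "").strip()
    if leftover:
        return False

    commands = [tok for tok in _TOKEN.findall(d) if tok in _COMMANDS]
    if not commands or commands[0] not in "Mm":
        return False

    # A new moveto may only follow a closepath (or be the very first command).
    previous = "Z"
    for command in commands:
        if command in "Mm" and previous not in "Zz":
            return False
        previous = command
    return previous in "Zz"
```

A triangle such as `"M0 0L10 0L5 8Z"` passes, while the same path without the trailing `Z` is rejected as an open outline.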
Filter malformed and duplicate fonts, normalize coordinate systems, keep only SVG path geometry, and quantize coordinates to one decimal place.
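The quantization step can be illustrated with a short sketch. It assumes the glyph is already available as an SVG path `d` string; the helper name `quantize_path` and the space-separated output format are illustrative choices, not the paper's actual preprocessing code. Compacted arc flags (for example `011010`) are not handled.

```python
import re

# Capture either a path command letter or a number (sign, fraction, exponent).
_TOKEN = re.compile(
    r"([MmLlHhVvCcSsQqTtAaZz])|([-+]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][-+]?\d+)?)"
)


def quantize_path(d: str, decimals: int = 1) -> str:
    """Round every coordinate in the path data `d` to `decimals` places."""
    out = []
    for command, number in _TOKEN.findall(d):
        if command:
            out.append(command)
            continue
        text = f"{round(float(number), decimals):.{decimals}f}"
        if "." in text:
            # Drop trailing zeros so 10.0 is emitted as 10.
            text = text.rstrip("0").rstrip(".")
        # Normalize negative zero, which rounding can produce.
        out.append("0" if text in ("", "-0") else text)
    # Re-join with single spaces so adjacent numbers can never fuse.
    return " ".join(out)
```

Rounding to one decimal shortens each coordinate to a few characters, which reduces the length of the sequence the model has to decode for a glyph.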
Stage 1 (Envato, text-referenced): learn SVG syntax and long-horizon geometry.
Stage 2 (Google Fonts, text+image): align style conditioning with geometry.
A single multimodal LLM predicts SVG tokens directly. No raster denoiser or vector post-optimizer is required.

| Dataset | Fonts | Families | Glyphs |
|---|---|---|---|
| Google Fonts | 2,497 | 1,117 | 157,899 |
| Envato | 39,497 | 23,543 | 2,495,363 |

VecGlypher outperforms both general-purpose LLMs on text-referenced generation and dedicated vector-font baselines on image-referenced generation.

Text-referenced generation:

| Model | R-ACC ↑ | CD ↓ | DINO ↑ | FID ↓ |
|---|---|---|---|---|
| Claude Sonnet 4.5 | 46.65 | 5.28 | 88.31 | 19.59 |
| GPT-5 | 43.98 | 6.12 | 86.92 | 29.00 |
| VecGlypher 27B (T,I,A) | 100.5 | 1.72 | 94.22 | 3.46 |
| VecGlypher 70B (T,A) | 100.4 | 1.68 | 94.28 | 3.34 |

Image-referenced generation:

| Model | R-ACC ↑ | CD ↓ | DINO ↑ | FID ↓ |
|---|---|---|---|---|
| DeepVecFont-v2 | 37.86 | 14.58 | 79.41 | 115.5 |
| DualVector | 49.20 | 16.45 | 79.57 | 105.5 |
| VecGlypher 27B (T,I,A) | 99.12 | 1.18 | 95.82 | 2.32 |
@article{VecGlypher,
title = {VecGlypher: Unified Vector Glyph Generation with Language Models},
author = {Huang, Xiaoke and Gauri, Bhavul and Ng, Kam Woh and Ng, Tony and Xu, Mengmeng and Liu, Zhiheng and Ren, Weiming and An, Zhaochong and Zhou, Zijian and Qiu, Haonan and Zhou, Yuyin and He, Sen and Wang, Ziheng and Xiang, Tao and Han, Xiao},
journal = {arXiv preprint arXiv},
year = {2026}
}