Getting Started¶
TorchFont ships a small but opinionated toolbox for working with vector fonts in PyTorch. This page walks you through installation, validates the environment, and provides a short end-to-end example that loads glyphs, applies transforms, and feeds them into a dataloader.
Prerequisites¶
Python 3.10 or newer
PyTorch 2.6 or newer (CPU-only wheels are enough to get started)
fonttoolsfor parsing outlines (installed automatically as a dependency)gitwhen you plan to usetorchfont.datasets.GoogleFontsortorchfont.datasets.FontRepo
Installation¶
The recommended path is to manage dependencies with uv.
uv add torchfont
If you prefer pip:
pip install torchfont
You can confirm the installation by importing the package and printing the version number:
import torchfont
print(torchfont.__version__)
Downloading Fonts¶
TorchFont does not bundle any font assets. For a lightweight sandbox, clone a
subset of Google Fonts into data/google_fonts with the provided dataset
helper:
from torchfont.datasets import GoogleFonts
dataset = GoogleFonts(
root="data/google_fonts",
ref="main",
download=True, # performs a sparse checkout the first time
)
The dataset transparently keeps the sparse checkout up to date on subsequent
runs. You can point root to any writable cache directory.
First Glyph Dataset¶
The torchfont.datasets.FontFolder dataset turns a directory of .otf
or .ttf files into an indexable PyTorch-style dataset. Each item contains the
pen command types and normalized coordinates emitted by
torchfont.io.pens.TensorPen, plus style/content labels.
from torch.utils.data import DataLoader
from torchfont.datasets import FontFolder
from torchfont.transforms import Compose, LimitSequenceLength, Patchify
transform = Compose(
(
LimitSequenceLength(max_len=256),
Patchify(patch_size=32),
)
)
dataset = FontFolder(
root="~/fonts", # scans recursively
codepoint_filter=[ord("A"), ord("B"), ord("C")],
transform=transform,
)
def collate_fn(batch):
types, coords, styles, contents = [], [], [], []
for (sample, (style, content)) in batch:
t, c = sample
types.append(t)
coords.append(c)
styles.append(style)
contents.append(content)
return types, coords, styles, contents
loader = DataLoader(dataset, batch_size=32, shuffle=True, collate_fn=collate_fn)
types, coords, style_labels, content_labels = next(iter(loader))
Working with Google Fonts¶
The torchfont.datasets.GoogleFonts dataset mirrors the official
repository via sparse checkout. It exposes the same sample structure as
FontFolder, so you can reuse the exact transforms and dataloaders:
from torchfont.datasets import GoogleFonts
google_fonts = GoogleFonts(
root="data/google_fonts",
ref="main",
transform=transform,
download=True,
patterns=("ofl/*/*.ttf", "ofl/*/*.otf"),
)
sample, (style_label, content_label) = google_fonts[0]
Next Steps¶
Continue to the User Guide for a deeper dive into the dataset, transform, and I/O utilities.
Review Contributor Guide if you plan to submit fixes or new features.