Pen Plotter Autoresearch

An editorial field journal for an autonomous loop running on 22 generative art factories. 25,040 specimens, scored by six algorithmic signals and five vision-language judges. Set in Boska, printed on aged paper.

An autonomous research loop, in the spirit of Karpathy's nanochat experiments, applied to generative pen plotter art. Each cycle audits all 22 factories, identifies the single weakest metric across the catalog, edits the corresponding generator, generates a thousand new variants, scores them, and either commits the change or reverts it. The loop has been running since the third week of March 2026.
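One cycle of that loop can be sketched roughly as follows. This is a minimal illustration, not the project's code: the Factory class, the toy generate function, the 0-100 metric scale, and the random nudge standing in for the generator edit are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Factory:
    """Hypothetical stand-in for one generative art factory."""
    name: str
    params: dict  # metric name -> value that drives that metric's mean score


def generate(factory, n, rng):
    """Toy generator: each specimen is a dict of metric scores on an assumed 0-100 scale."""
    return [
        {metric: min(100.0, max(0.0, rng.gauss(mean, 5.0)))
         for metric, mean in factory.params.items()}
        for _ in range(n)
    ]


def audit(factories, n, rng):
    """Return {(factory name, metric): mean score} over n fresh specimens per factory."""
    report = {}
    for factory in factories:
        specimens = generate(factory, n, rng)
        for metric in factory.params:
            report[(factory.name, metric)] = sum(s[metric] for s in specimens) / n
    return report


def run_cycle(factories, rng, n=1000):
    """Audit, pick the weakest metric, edit, regenerate, then commit or revert."""
    report = audit(factories, n, rng)
    (name, metric), baseline = min(report.items(), key=lambda kv: kv[1])
    factory = next(f for f in factories if f.name == name)

    old_value = factory.params[metric]
    # Assumed "edit": a random nudge. The real loop edits generator code.
    factory.params[metric] = old_value + rng.uniform(-10.0, 10.0)

    variants = generate(factory, n, rng)
    new_score = sum(v[metric] for v in variants) / n

    if new_score > baseline:
        return name, metric, "commit"
    factory.params[metric] = old_value  # revert the edit
    return name, metric, "revert"
```

Calling run_cycle repeatedly with the same list of factories gives the ratchet behaviour described above: an edit survives only if the regenerated variants score better on the metric that triggered it.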

The most interesting finding is what the algorithms get wrong. A piece that scores 88.3 across the six algorithmic signals, placing it in the top five of 24,927, can score 1.8 from a vision-language judge, which calls it 'a degenerate parameter combination, no discernible composition, focal point, or visual interest.' The algorithmic signals cannot tell a starburst from a piece of woven fabric. That is what the judges are for.
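A disagreement like this can be surfaced mechanically. The sketch below is an assumption-laden illustration: the Specimen fields, the score scales (0-100 for the algorithmic composite, 0-10 for judges), the median as judge consensus, and both thresholds are hypothetical choices, not the project's scoring code.

```python
from dataclasses import dataclass
from statistics import median
from typing import List


@dataclass
class Specimen:
    """Hypothetical record for one scored plot."""
    specimen_id: str
    algo_score: float          # composite of the algorithmic signals, assumed 0-100
    judge_scores: List[float]  # one score per vision-language judge, assumed 0-10


def find_anomalies(specimens, algo_min=80.0, judge_max=3.0):
    """Return specimens the algorithm rates highly but the judges reject.

    Each result is (specimen_id, algo_score, judge_consensus), sorted so the
    highest algorithmic score comes first.
    """
    flagged = []
    for s in specimens:
        consensus = median(s.judge_scores)  # median resists one outlier judge
        if s.algo_score >= algo_min and consensus <= judge_max:
            flagged.append((s.specimen_id, s.algo_score, consensus))
    return sorted(flagged, key=lambda row: row[1], reverse=True)
```

Using the median rather than the mean is a design choice for the sketch: with five judges, a single generous or harsh score does not move the consensus.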

The published artifact is a five-section editorial: method, anomaly, catalog, best of run, colophon. It is set in Boska and JetBrains Mono and reads like a real catalog. An addendum page contains every kept specimen, all 24,927 of them, laid out as a single grid you can scroll through and click into.

Highlights

22 procedural factories · 25,040 scored specimens · top score 89.1
Scoring v3: CLIP aesthetic predictor + six algorithmic signals + five vision-language judges
Ten-spread editorial: method · anomaly · catalog · best of run (incl. new colour factories) · colophon
Full addendum page with all 24,927 kept specimens, click-to-zoom
Set in Boska + JetBrains Mono, served on warm specimen paper

Stack

HTML
Python
marimo
Boska
vpype
Claude · Gemini · Qwen · Cerebras

Related Project

Pen Plotter Art (Project)

Links

Live demo