Render PDFs safely in a long-running worker
At a glance
A long-running PHP (PHP: Hypertext Preprocessor) worker (RoadRunner, Swoole, Laravel Octane) keeps one process alive across many requests. If you parse the same fonts and decode the same images for every request, you waste processor time and increase resident memory. NextPDF avoids that cost by separating two lifetimes:
- Process-lifetime, shared: FontRegistry and ImageRegistry hold parsed font tables and decoded image caches. Create the registries once when the worker boots.
- Request-lifetime, disposable: the Document returned by DocumentFactory::create(). Build it, write it, and let it leave scope. PHP’s garbage collector can then reclaim the entire object graph.
This recipe shows you the worker boot sequence, the per-request body, and the per-cycle reset that keeps peak memory flat.
Install
composer require nextpdf/core:^3

The worker pattern does not require another extension, and a worker runtime
(RoadRunner / Swoole / Octane) is optional. You can run the same factory
pattern in a command-line interface (CLI) for loop, which is what the harness
tests.
Conceptual overview
For worker code, start with DocumentFactory. Construct it once with a shared
FontRegistry and ImageRegistry:
- FontRegistry::warmup() parses the font files you provide and caches the parsed tables.
- FontRegistry::lock() freezes the registry so per-request code cannot mutate the shared font set. isLocked() reports the current state. After you lock the registry, it is safe to share across concurrent coroutines.
- Construct ImageRegistry with a maxCacheBytes budget. When the budget is exceeded, it evicts least-recently-used entries. An image larger than the budget bypasses the cache instead of thrashing it.
- ImageRegistry::reset() evicts every cached image while the registry remains ready to use. The next request repopulates it on demand. Call it on a cadence (every N requests, or when memoryUsage() crosses a threshold) to bring the high-water mark back to baseline.
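The eviction and bypass rules described for the image cache can be illustrated with a small, self-contained sketch. ByteBudgetCache, its methods, and the $sizes map are hypothetical names invented for this illustration; this is not NextPDF's ImageRegistry implementation, only one plausible way to get the same observable behavior (least-recently-used eviction, oversized bypass, and a reset that leaves the cache usable). It assumes PHP 8.1 or later.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical byte-budgeted LRU cache (illustration only,
 * not NextPDF's ImageRegistry).
 */
final class ByteBudgetCache
{
    /** @var array<string, string> key => payload, least recently used first */
    private array $entries = [];
    private int $currentBytes = 0;

    public function __construct(private readonly int $maxCacheBytes)
    {
    }

    /** Returns the payload, calling $decode($key) on a cache miss. */
    public function get(string $key, callable $decode): string
    {
        if (isset($this->entries[$key])) {
            // Hit: re-insert so the entry moves to the most-recently-used end.
            $payload = $this->entries[$key];
            unset($this->entries[$key]);
            $this->entries[$key] = $payload;
            return $payload;
        }

        $payload = $decode($key);
        $size = \strlen($payload);

        // Oversized payloads bypass the cache so they cannot evict the working set.
        if ($size > $this->maxCacheBytes) {
            return $payload;
        }

        // Evict least-recently-used entries until the new payload fits.
        while ($this->currentBytes + $size > $this->maxCacheBytes) {
            $oldestKey = \array_key_first($this->entries);
            $this->currentBytes -= \strlen($this->entries[$oldestKey]);
            unset($this->entries[$oldestKey]);
        }

        $this->entries[$key] = $payload;
        $this->currentBytes += $size;
        return $payload;
    }

    /** Drops every entry; the cache stays usable afterwards. */
    public function reset(): void
    {
        $this->entries = [];
        $this->currentBytes = 0;
    }

    public function has(string $key): bool
    {
        return isset($this->entries[$key]);
    }

    public function entryCount(): int
    {
        return \count($this->entries);
    }

    public function currentBytes(): int
    {
        return $this->currentBytes;
    }
}

// Usage: a 10-byte budget and fake "decoded images" of known sizes.
$sizes = ['a' => 4, 'b' => 4, 'c' => 4, 'big' => 11];
$decode = static fn (string $key): string => \str_repeat('x', $sizes[$key]);

$cache = new ByteBudgetCache(maxCacheBytes: 10);
$cache->get('a', $decode);
$cache->get('b', $decode);
$cache->get('a', $decode);   // hit: 'a' becomes most recently used
$cache->get('c', $decode);   // does not fit: 'b' (least recently used) is evicted
$cache->get('big', $decode); // larger than the budget: returned but never cached
```

In this sketch the oversized entry leaves both the entry count and the byte total untouched, which is the property the overview describes: a rare large image cannot push the common images out of the cache.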
Each document the factory creates is an independent Portable Document Format
(PDF) file. ISO 32000-2 §7.5.5 defines the trailer of a never-updated file as
having no Prev entry, and each worker request emits that kind of
first-generation file. Requests therefore do not share document state, even
though they share the font and image caches. The subset-font BaseFont tag
(ISO 32000-2 §9.6.4) stays stable across requests because the parsed font lives
in the shared registry.
API surface
This recipe uses the API surface generated from PHPDoc on
NextPDF\Core\DocumentFactory, NextPDF\Typography\FontRegistry,
NextPDF\Graphics\ImageRegistry, and NextPDF\Support\MemoryReport. The key
members are DocumentFactory::create(), FontRegistry::warmup() /
lock() / isLocked() / memoryUsage(), ImageRegistry::reset() /
memoryUsage(), and MemoryReport::$currentBytes / $peakBytes /
$entryCount / utilizationPercent().
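To make the lock() / isLocked() contract concrete, here is a minimal sketch of a registry that refuses mutation once it is frozen. FreezableRegistry and its register() and get() methods are hypothetical and are not part of NextPDF. Throwing a LogicException on a late write is an assumption made for the illustration; the source only states that a locked registry cannot be mutated by per-request code.

```php
<?php
declare(strict_types=1);

/** Hypothetical freezable registry (illustration only, not NextPDF's FontRegistry). */
final class FreezableRegistry
{
    /** @var array<string, string> name => parsed data placeholder */
    private array $items = [];
    private bool $locked = false;

    public function register(string $name, string $value): void
    {
        // Assumption: a late write fails loudly instead of being ignored.
        if ($this->locked) {
            throw new \LogicException("Registry is locked; cannot register '{$name}'.");
        }
        $this->items[$name] = $value;
    }

    public function lock(): void
    {
        $this->locked = true;
    }

    public function isLocked(): bool
    {
        return $this->locked;
    }

    public function get(string $name): ?string
    {
        return $this->items[$name] ?? null;
    }
}

// Boot phase: populate, then freeze.
$registry = new FreezableRegistry();
$registry->register('helvetica', 'parsed-table-placeholder');
$registry->lock();

// Request phase: reads still work, writes are rejected.
$rejected = false;
try {
    $registry->register('courier', 'late-addition');
} catch (\LogicException $e) {
    $rejected = true;
}
```

Once the write path is closed, concurrent readers only ever see the state that existed at lock time, which is why a locked registry is safe to share.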
Code sample — Quick start
<?php
declare(strict_types=1);

require_once __DIR__ . '/vendor/autoload.php';

use NextPDF\Core\DocumentFactory;
use NextPDF\Graphics\ImageRegistry;
use NextPDF\Typography\FontRegistry;

// --- Worker boot (run ONCE, before the request loop) ---------------------
$fonts = new FontRegistry();
$fonts->lock(); // freeze the shared font set
$images = new ImageRegistry(maxCacheBytes: 50 * 1024 * 1024);
$factory = new DocumentFactory($fonts, $images);

// --- Per request ---------------------------------------------------------
$doc = $factory->create();
$doc->setTitle('Worker output');
$doc->addPage();
$doc->setFont('helvetica', 'B', 16);
$doc->cell(0, 12, 'Generated in a shared-registry worker', newLine: true);
$doc->save(getenv('NEXTPDF_COOKBOOK_OUTPUT') ?: __DIR__ . '/out.pdf');
// $doc leaves scope here → GC reclaims the whole document tree.

Code sample — Production
The complete example honors the harness output channel. It shows the boot
sequence, a bounded request loop, the per-cycle reset(), and a memory
high-water assertion. This is the script the reproducibility harness runs twice.

<?php
declare(strict_types=1);

require_once __DIR__ . '/vendor/autoload.php';

use NextPDF\Core\DocumentFactory;
use NextPDF\Graphics\ImageRegistry;
use NextPDF\Typography\FontRegistry;

// --- Worker boot: shared, process-lifetime registries --------------------
$fonts = new FontRegistry();
$fonts->lock(); // share-safe once locked
$images = new ImageRegistry(maxCacheBytes: 50 * 1024 * 1024);
$factory = new DocumentFactory($fonts, $images);

$resetEvery = 4; // reset cadence in requests
$peakAfterReset = 0;

// --- Simulated request loop ---------------------------------------------
for ($request = 1; $request <= 12; $request++) {
    $doc = $factory->create();
    $doc->setTitle("Worker Request #{$request}");
    $doc->addPage();
    $doc->setFont('helvetica', 'B', 16);
    $doc->cell(0, 12, "Worker Request #{$request}", newLine: true);
    $doc->setFont('helvetica', '', 11);
    $doc->cell(0, 8, 'Shared FontRegistry / ImageRegistry across requests.', newLine: true);

    // The harness captures the LAST request's PDF via the side channel.
    if ($request === 12) {
        $doc->save(getenv('NEXTPDF_COOKBOOK_OUTPUT') ?: __DIR__ . '/out.pdf');
    } else {
        $doc->getPdfData(); // force render, then drop
    }

    unset($doc); // explicit end-of-request

    // Bound the cache high-water mark on a fixed cadence.
    if ($request % $resetEvery === 0) {
        $images->reset();
        \gc_collect_cycles();
        $report = $images->memoryUsage();
        $peakAfterReset = \max($peakAfterReset, $report->currentBytes);
    }
}

$final = $images->memoryUsage();

fwrite(STDERR, \sprintf(
    "fonts.locked=%s images.entries=%d images.current=%dB peak_after_reset=%dB\n",
    $fonts->isLocked() ? 'yes' : 'no',
    $final->entryCount,
    $final->currentBytes,
    $peakAfterReset,
));

STDOUT stays free for the harness; progress text goes to STDERR. The PDF
is written only to NEXTPDF_COOKBOOK_OUTPUT; it is never echoed.
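The production loop resets on a fixed cadence only, while the overview also mentions a memory threshold as a trigger. The following is a minimal sketch of a decision helper that combines both. shouldResetCache() and its parameters are hypothetical and not part of NextPDF; the 80 percent default is an arbitrary illustration value, not a recommendation from the library.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns true when the image cache should be reset,
 * either because the request cadence fired or because cache usage reached
 * the given percentage of the byte budget.
 */
function shouldResetCache(
    int $requestCount,
    int $currentBytes,
    int $budgetBytes,
    int $resetEvery = 4,
    float $thresholdPercent = 80.0,
): bool {
    // Cadence trigger: every $resetEvery-th request (request counts start at 1).
    if ($resetEvery > 0 && $requestCount > 0 && $requestCount % $resetEvery === 0) {
        return true;
    }

    // Threshold trigger, written without division so a zero budget is harmless.
    return $budgetBytes > 0
        && $currentBytes * 100 >= $thresholdPercent * $budgetBytes;
}

// Example decisions for a 100-byte budget:
$onCadence   = shouldResetCache(requestCount: 4, currentBytes: 0, budgetBytes: 100);
$belowLimit  = shouldResetCache(requestCount: 3, currentBytes: 10, budgetBytes: 100);
$overLimit   = shouldResetCache(requestCount: 3, currentBytes: 90, budgetBytes: 100);
```

Inside the request loop, the current byte count would come from $images->memoryUsage()->currentBytes and the budget from the maxCacheBytes value passed at boot; when the helper returns true, call $images->reset() followed by gc_collect_cycles(), as the loop above already does on its cadence.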
Edge cases & gotchas
- Lock before you share. Call FontRegistry::lock() at boot. A registry that is still mutable when two coroutines touch it is a data race. Use isLocked() as the assertion in a health check.
- reset() is not unset(). ImageRegistry::reset() evicts cached binary data and keeps the registry usable, so it is the right periodic call. If you destroy and rebuild the registry for every request, you lose the benefit of the shared cache.
- Oversized image bypass. An image larger than maxCacheBytes is decoded per use and never cached, so it cannot evict the working set. This is intentional. Size the budget for your common images, not for the rare large one.
- The document must leave scope. If you hold the Document in a static, a long-lived container binding, or a closure captured by the worker, the entire object graph stays alive and per-request collection cannot work. An unset() call or a scope exit is mandatory.
- gc_collect_cycles() placement. PHP’s cycle collector does not know about request boundaries. Call it after the reset cadence, not on every request. This bounds the high-water mark without adding collection cost to the hot path.
- Determinism caveat. Document timestamps and the trailer /ID are regenerated per save (ISO 32000-2 §14.3). The captured PDF is therefore compared with the semantic profile (structural abstract syntax tree (AST) plus metadata, never volatile bytes). See “Conformance”.
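The scope and gc_collect_cycles() points above can be observed with plain PHP, without NextPDF. The sketch below uses stand-in classes (FakeDocument, FakePage, and Holder are hypothetical names, not library classes) and a WeakReference probe to show when an object graph with a parent/child cycle is actually freed. It assumes PHP 8.0 or later with the cycle collector enabled, which is the default.

```php
<?php
declare(strict_types=1);

/** Stand-in for a per-request document (hypothetical, not a NextPDF class). */
final class FakeDocument
{
    public ?FakePage $page = null;
}

/** Stand-in child object that points back at its owner, forming a cycle. */
final class FakePage
{
    public function __construct(public FakeDocument $owner)
    {
    }
}

/** Stand-in for a static or long-lived container binding. */
final class Holder
{
    public static ?FakeDocument $last = null;
}

function buildDocument(): FakeDocument
{
    $doc = new FakeDocument();
    $doc->page = new FakePage($doc); // document <-> page reference cycle
    return $doc;
}

// Case 1: the document leaves scope, then the cycle collector runs.
$doc = buildDocument();
$probe = WeakReference::create($doc);
unset($doc);
// Reference counting alone cannot free a cycle, so the graph is still there.
$aliveBeforeGc = $probe->get() !== null;
gc_collect_cycles();
$aliveAfterGc = $probe->get() !== null;

// Case 2: a long-lived holder keeps the whole graph reachable.
Holder::$last = buildDocument();
$leakProbe = WeakReference::create(Holder::$last);
gc_collect_cycles();
$leakedAlive = $leakProbe->get() !== null;

// Clearing the holder and collecting again finally releases the graph.
Holder::$last = null;
gc_collect_cycles();
$aliveAfterRelease = $leakProbe->get() !== null;
```

The same probe technique is one way to write a regression test for a worker: keep a WeakReference to a request's document, finish the request, collect cycles, and check that the probe returns null.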
Performance
- The shared registry makes repeated font parsing and image decoding a one-time boot cost. Per-request work then becomes layout and serialization.
- Peak resident memory is bounded by maxCacheBytes plus the working set of one in-flight document. The per-cycle reset() returns the cache to baseline, so a long-lived worker does not show an upward-trending sawtooth.
- The performance_budget front-matter (wall_ms: 4000, peak_mb: 192) bounds the harness run of the 12-request loop. The harness enforces this budget; it is not a guarantee for arbitrary documents.
- This recipe provides the §4.3 gap-list “memory/GC” coverage for #31. The backing examples/14-worker-factory.php exists, and tests/Cookbook/Php/WorkerSafeBatchRenderingRecipeTest.php adds the missing memory/GC assertion (peak does not grow across cycles after reset).
Security notes
- The worker pattern processes one document per request and shares only parsed-font and decoded-image caches. Document content does not cross the request boundary. A request cannot read another request’s document data through the shared registries.
- Untrusted input still flows through the normal NextPDF input boundaries, and the worker pattern does not relax validation. Treat each request’s HyperText Markup Language (HTML) and asset input as untrusted, just as you would in a per-request process.
Conformance
| Statement | Spec | Clause | reference_id |
|---|---|---|---|
| The document modification date is regenerated on each save, so per-request output is not byte-stable. | ISO 32000-2 | §14.3 | |
| Each worker document is a never-updated file (no Prev in the trailer); requests do not share document state. | ISO 32000-2 | §7.5.5 | |
| The subset-font tag prefix is stable across requests because the parsed font lives in the shared registry. | ISO 32000-2 | §9.6.4 | |
Because the trailer /ID and the modification date are regenerated per save,
this recipe is verified with the semantic reproducibility profile
(structural abstract syntax tree (AST) equality plus a metadata-only
comparison). A bitwise or structural claim would be inaccurate for worker
output.