Render PDFs safely in a long-running worker

A long-running PHP (PHP: Hypertext Preprocessor) worker (RoadRunner, Swoole, Laravel Octane) keeps one process alive across many requests. If you parse the same fonts and decode the same images for every request, you waste processor time and increase resident memory. NextPDF avoids that cost by separating two lifetimes:

  • Process-lifetime, shared: FontRegistry and ImageRegistry hold parsed font tables and decoded image caches. Create the registries once when the worker boots.
  • Request-lifetime, disposable: the Document returned by DocumentFactory::create(). Build it, write it, and let it leave scope. PHP’s garbage collector can then reclaim the entire object graph.

This recipe shows you the worker boot sequence, the per-request body, and the per-cycle reset that keeps peak memory flat.

composer require nextpdf/core:^3

The worker pattern does not require another extension, and a worker runtime (RoadRunner / Swoole / Octane) is optional. You can run the same factory pattern in a plain for loop from the command-line interface (CLI), which is the form the harness tests.

For worker code, start with DocumentFactory. Construct it once with a shared FontRegistry and ImageRegistry:

  • FontRegistry::warmup() parses the font files you provide and caches the parsed tables. FontRegistry::lock() freezes the registry so per-request code cannot mutate the shared font set. isLocked() reports the current state. After you lock the registry, it is safe to share across concurrent coroutines.
  • Construct ImageRegistry with a maxCacheBytes budget. When the budget is exceeded, it evicts least-recently-used entries. An image larger than the budget bypasses the cache instead of thrashing it.
  • ImageRegistry::reset() evicts every cached image while the registry remains ready to use. The next request repopulates it on demand. Call it on a cadence (every N requests, or when memoryUsage() crosses a threshold) to bring the high-water mark back to baseline.
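
As a rough mental model of the budget, eviction, bypass, and reset behavior described above, the following self-contained sketch may help. SketchImageCache and its methods (get(), has(), currentBytes()) are hypothetical stand-ins written for this illustration. They are not the NextPDF ImageRegistry, whose internals may differ. Only the ideas (a maxCacheBytes budget, least-recently-used eviction, oversized bypass, and reset()) come from the description above.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for illustration; NOT the NextPDF ImageRegistry.
final class SketchImageCache
{
    /** @var array<string, string> cache key => decoded bytes, least recent first */
    private array $entries = [];
    private int $currentBytes = 0;

    public function __construct(private int $maxCacheBytes)
    {
    }

    /** Return cached bytes, or decode, cache within the budget, and return. */
    public function get(string $key, callable $decode): string
    {
        if (isset($this->entries[$key])) {
            // Hit: re-insert so the key becomes the most recently used.
            $data = $this->entries[$key];
            unset($this->entries[$key]);
            $this->entries[$key] = $data;
            return $data;
        }

        $data = $decode();
        $size = \strlen($data);
        if ($size > $this->maxCacheBytes) {
            return $data; // oversized: bypass the cache, evict nothing
        }

        // Evict least-recently-used entries until the new one fits.
        while ($this->currentBytes + $size > $this->maxCacheBytes) {
            $oldest = \array_key_first($this->entries);
            $this->currentBytes -= \strlen($this->entries[$oldest]);
            unset($this->entries[$oldest]);
        }

        $this->entries[$key] = $data;
        $this->currentBytes += $size;
        return $data;
    }

    public function has(string $key): bool
    {
        return isset($this->entries[$key]);
    }

    public function currentBytes(): int
    {
        return $this->currentBytes;
    }

    /** Drop every entry; the instance stays usable. */
    public function reset(): void
    {
        $this->entries = [];
        $this->currentBytes = 0;
    }
}

$cache = new SketchImageCache(maxCacheBytes: 10);
$cache->get('a', fn (): string => \str_repeat('a', 4));
$cache->get('b', fn (): string => \str_repeat('b', 4));
$cache->get('a', fn (): string => '');                      // hit: 'a' becomes most recent
$cache->get('c', fn (): string => \str_repeat('c', 4));     // over budget: evicts 'b'
$cache->get('huge', fn (): string => \str_repeat('x', 11)); // larger than budget: not cached

$bytesBeforeReset = $cache->currentBytes(); // 'a' + 'c'
$cache->reset();
```

In this sketch the oversized entry never displaces 'a' or 'c', which is the property the bypass rule is meant to protect. After reset() the byte count is back to zero and the next get() repopulates the cache on demand.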

Each document the factory creates is an independent Portable Document Format (PDF) file. ISO 32000-2 §7.5.5 defines the trailer of a never-updated file as having no Prev entry, and each worker request emits that kind of first-generation file. Requests therefore do not share document state, even though they share the font and image caches. The subset-font BaseFont tag (ISO 32000-2 §9.6.4) stays stable across requests because the parsed font lives in the shared registry.

This recipe uses the API surface generated from PHPDoc on NextPDF\Core\DocumentFactory, NextPDF\Typography\FontRegistry, NextPDF\Graphics\ImageRegistry, and NextPDF\Support\MemoryReport. The key members are DocumentFactory::create(), FontRegistry::warmup() / lock() / isLocked() / memoryUsage(), ImageRegistry::reset() / memoryUsage(), and MemoryReport::$currentBytes / $peakBytes / $entryCount / utilizationPercent().

<?php
declare(strict_types=1);

require_once __DIR__ . '/vendor/autoload.php';

use NextPDF\Core\DocumentFactory;
use NextPDF\Graphics\ImageRegistry;
use NextPDF\Typography\FontRegistry;

// --- Worker boot (run ONCE, before the request loop) ---------------------
$fonts = new FontRegistry();
$fonts->lock(); // freeze the shared font set
$images = new ImageRegistry(maxCacheBytes: 50 * 1024 * 1024);
$factory = new DocumentFactory($fonts, $images);

// --- Per request ---------------------------------------------------------
$doc = $factory->create();
$doc->setTitle('Worker output');
$doc->addPage();
$doc->setFont('helvetica', 'B', 16);
$doc->cell(0, 12, 'Generated in a shared-registry worker', newLine: true);
$doc->save(getenv('NEXTPDF_COOKBOOK_OUTPUT') ?: __DIR__ . '/out.pdf');
// $doc leaves scope here → GC reclaims the whole document tree.
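
To make the effect of the lock() call in the boot sequence above more concrete, here is a minimal sketch of a freeze-after-boot registry. SketchFontRegistry, register(), and has() are hypothetical names invented for this illustration, and throwing a LogicException on a late write is an assumption. The source states only that a locked FontRegistry cannot be mutated by per-request code, not how a mutation attempt is rejected.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for illustration; NOT the NextPDF FontRegistry.
final class SketchFontRegistry
{
    /** @var array<string, string> font name => parsed table (placeholder string) */
    private array $fonts = [];
    private bool $locked = false;

    public function register(string $name, string $parsedTable): void
    {
        if ($this->locked) {
            // Assumption: a frozen registry rejects writes by throwing.
            throw new \LogicException("Registry is locked; cannot register '{$name}'.");
        }
        $this->fonts[$name] = $parsedTable;
    }

    public function lock(): void
    {
        $this->locked = true;
    }

    public function isLocked(): bool
    {
        return $this->locked;
    }

    public function has(string $name): bool
    {
        return isset($this->fonts[$name]);
    }
}

// Boot: populate the shared set, then freeze it.
$registry = new SketchFontRegistry();
$registry->register('body', 'parsed-table-placeholder');
$registry->lock();

// Per request: reads still work, writes are rejected.
$rejected = false;
try {
    $registry->register('late-font', 'parsed-table-placeholder');
} catch (\LogicException $e) {
    $rejected = true;
}
```

Once every write path is closed at boot, request code can only read the shared set, which is the property that makes sharing across requests reasonable in this model.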

The complete example honors the harness output channel. It shows the boot sequence, a bounded request loop, the per-cycle reset(), and a memory high-water assertion. This is the script the reproducibility harness runs twice.

<?php
declare(strict_types=1);

require_once __DIR__ . '/vendor/autoload.php';

use NextPDF\Core\DocumentFactory;
use NextPDF\Graphics\ImageRegistry;
use NextPDF\Typography\FontRegistry;

// --- Worker boot: shared, process-lifetime registries --------------------
$fonts = new FontRegistry();
$fonts->lock(); // share-safe once locked
$images = new ImageRegistry(maxCacheBytes: 50 * 1024 * 1024);
$factory = new DocumentFactory($fonts, $images);
$resetEvery = 4; // reset cadence in requests
$peakAfterReset = 0;

// --- Simulated request loop ---------------------------------------------
for ($request = 1; $request <= 12; $request++) {
    $doc = $factory->create();
    $doc->setTitle("Worker Request #{$request}");
    $doc->addPage();
    $doc->setFont('helvetica', 'B', 16);
    $doc->cell(0, 12, "Worker Request #{$request}", newLine: true);
    $doc->setFont('helvetica', '', 11);
    $doc->cell(0, 8, 'Shared FontRegistry / ImageRegistry across requests.', newLine: true);

    // The harness captures the LAST request's PDF via the side channel.
    if ($request === 12) {
        $doc->save(getenv('NEXTPDF_COOKBOOK_OUTPUT') ?: __DIR__ . '/out.pdf');
    } else {
        $doc->getPdfData(); // force render, then drop
    }
    unset($doc); // explicit end-of-request

    // Bound the cache high-water mark on a fixed cadence.
    if ($request % $resetEvery === 0) {
        $images->reset();
        \gc_collect_cycles();
        $report = $images->memoryUsage();
        $peakAfterReset = \max($peakAfterReset, $report->currentBytes);
    }
}

$final = $images->memoryUsage();
fwrite(STDERR, \sprintf(
    "fonts.locked=%s images.entries=%d images.current=%dB peak_after_reset=%dB\n",
    $fonts->isLocked() ? 'yes' : 'no',
    $final->entryCount,
    $final->currentBytes,
    $peakAfterReset,
));

STDOUT stays free for the harness; progress text goes to STDERR. The PDF is written only to NEXTPDF_COOKBOOK_OUTPUT; it is never echoed.
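
The loop above resets on a fixed cadence. The other trigger this recipe mentions is a size threshold, and the two can be combined. The sketch below shows one way to express that decision as a pure function. shouldResetCache(), the 40 MiB threshold, and the simulated byte counts are assumptions made for this illustration. In a real worker you would pass $images->memoryUsage()->currentBytes instead of the simulated values.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: decide whether this request should trigger a cache reset.
 * Resets on every $resetEvery-th request OR when the cache reaches $thresholdBytes.
 */
function shouldResetCache(
    int $requestNumber,
    int $currentBytes,
    int $resetEvery,
    int $thresholdBytes
): bool {
    $cadenceHit = $resetEvery > 0 && $requestNumber % $resetEvery === 0;
    $overThreshold = $currentBytes >= $thresholdBytes;

    return $cadenceHit || $overThreshold;
}

$resetEvery = 4;
$thresholdBytes = 40 * 1024 * 1024; // assumed: 80% of a 50 MiB budget

// Simulated cache sizes per request; stand-ins for memoryUsage()->currentBytes.
$simulatedBytes = [1 => 5_000_000, 2 => 45_000_000, 3 => 6_000_000, 4 => 7_000_000];

$resets = [];
foreach ($simulatedBytes as $request => $bytes) {
    if (shouldResetCache($request, $bytes, $resetEvery, $thresholdBytes)) {
        $resets[] = $request; // a real worker would call $images->reset() here
    }
}
```

With these simulated values the reset fires on request 2 (size threshold) and request 4 (cadence). Keeping the decision in a pure function makes the policy easy to unit-test without a running worker.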

  • Lock before you share. Call FontRegistry::lock() at boot. A registry that is still mutable when two coroutines touch it is a data race. Use isLocked() as the assertion in a health check.
  • reset() is not unset(). ImageRegistry::reset() evicts cached binary data and keeps the registry usable, so it is the right periodic call. If you destroy and rebuild the registry for every request, you lose the benefit of the shared cache.
  • Oversized image bypass. An image larger than maxCacheBytes is decoded per use and never cached, so it cannot evict the working set. This is intentional. Size the budget for your common images, not for the rare large one.
  • The document must leave scope. If you hold the Document in a static, a long-lived container binding, or a closure captured by the worker, the entire object graph stays alive and per-request collection cannot work. An unset() call or a scope exit is mandatory.
  • gc_collect_cycles() placement. PHP’s cycle collector does not know about request boundaries. Call it after the reset cadence, not on every request. This bounds the high-water mark without adding collection cost to the hot path.
  • Determinism caveat. Document timestamps and the trailer /ID are regenerated per save (ISO 32000-2 §14.3). The captured PDF is therefore compared with the semantic profile: structural abstract syntax tree (AST) plus metadata, never volatile bytes. See “Conformance”.
  • The shared registry makes repeated font parsing and image decoding a one-time boot cost. Per-request work then becomes layout and serialization.
  • Peak resident memory is bounded by maxCacheBytes plus the working set of one in-flight document. The per-cycle reset() returns the cache to baseline, so a long-lived worker does not show an upward-trending sawtooth.
  • The performance_budget front-matter (wall_ms: 4000, peak_mb: 192) bounds the harness run of the 12-request loop. The harness enforces this budget; it is not a guarantee for arbitrary documents.
  • This recipe provides the §4.3 gap-list “memory/GC” coverage for #31. The backing examples/14-worker-factory.php exists, and tests/Cookbook/Php/WorkerSafeBatchRenderingRecipeTest.php adds the missing memory/GC assertion (peak does not grow across cycles after reset).
  • The worker pattern processes one document per request and shares only parsed-font and decoded-image caches. Document content does not cross the request boundary. A request cannot read another request’s document data through the shared registries.
  • Untrusted input still flows through the normal NextPDF input boundaries, and the worker pattern does not relax validation. Treat each request’s HyperText Markup Language (HTML) and asset input as untrusted, just as you would in a per-request process.
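
The two garbage-collection points in the list above (the document must leave scope, and gc_collect_cycles() placement) can be checked with PHP's built-in WeakReference. The sketch below is a self-contained illustration. SketchDocument, SketchPage, and buildDocument() are hypothetical stand-ins, and the page-to-document back-reference is an assumption chosen to create a reference cycle; it says nothing about how NextPDF structures its own object graph.

```php
<?php
declare(strict_types=1);

\gc_enable(); // make sure the cycle collector is active for this demonstration

// Hypothetical stand-ins; the back-reference deliberately creates a cycle.
final class SketchDocument
{
    /** @var list<SketchPage> */
    public array $pages = [];
}

final class SketchPage
{
    public function __construct(public SketchDocument $document)
    {
    }
}

function buildDocument(): SketchDocument
{
    $doc = new SketchDocument();
    $doc->pages[] = new SketchPage($doc); // document -> page -> document cycle

    return $doc;
}

// Case 1: the document leaves scope, then the cycle collector runs.
$doc = buildDocument();
$probe = \WeakReference::create($doc);
unset($doc);
$aliveBeforeGc = $probe->get() !== null; // the cycle keeps it alive for now
\gc_collect_cycles();
$reclaimed = $probe->get() === null;

// Case 2: a long-lived holder (here a plain array) keeps the graph alive.
$longLived = [];
$doc = buildDocument();
$longLived['last'] = $doc;
$leakProbe = \WeakReference::create($doc);
unset($doc);
\gc_collect_cycles();
$stillAlive = $leakProbe->get() !== null;
```

In the first case the probe is empty after collection. In the second case the array plays the role of a static or container binding, and the probe still resolves, so the graph was not reclaimed. A probe like this can serve as a test-time assertion that request code does not retain the Document.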

| Statement | Spec | Clause | reference_id |
| --- | --- | --- | --- |
| The document modification date is regenerated on each save, so per-request output is not byte-stable. | ISO 32000-2 | §14.3 | |
| Each worker document is a never-updated file (no Prev in the trailer); requests do not share document state. | ISO 32000-2 | §7.5.5 | |
| The subset-font tag prefix is stable across requests because the parsed font lives in the shared registry. | ISO 32000-2 | §9.6.4 | |

Because the trailer /ID and the modification date are regenerated per save, this recipe is verified with the semantic reproducibility profile (structural abstract syntax tree (AST) equality plus a metadata-only comparison). A bitwise or structural claim would be inaccurate for worker output.