High-volume document generation

Spec: ISO 24495-1:2023, §5 | Spec: ISO 9241-112:2025, §6.1.2.3 | Evidence: Benchmark-backed

At a glance

Generating one PDF is a function call. Generating a hundred thousand on a schedule is a systems problem: memory that must stay bounded, work that must be parallel, and numbers that must mean something. This page walks through the batch-generation scenario, from the throughput question to a deployment that holds up. It says plainly that the honest answer is “measure it on your documents”, not a headline figure.

Why this matters

Batch generation fails in two characteristic ways. The first is memory creep. A long-lived worker accumulates retained state document by document until it is killed mid-batch, and the run is neither complete nor cleanly failed. The second is a confident but meaningless number: a benchmark from a trivial document is used to size a fleet that renders complex ones, and it turns out to be wrong only under production load.

You can avoid both, but only if you design the memory shape and measurement method from the start, instead of adding them after the first incident.

The short version

The unit of work is a disposable document, not a shared one. Keep process-lifetime data (fonts, image cache) in shared registries; create and discard the document per render.
Memory has two parts, and only one matters for a long-lived worker. Transient peak during a render is expected; retained memory that does not come back is the leak that ends a batch.
Throughput is parallelism plus bounded per-render cost. The shape that holds up is a queue feeding stateless workers, each rendering and releasing.
A number without its method is not a number. NextPDF reports per-render measurements as data you collect, and refuses unqualified speed claims. The most important figure is the one you measure on your own templates (ISO 24495-1:2023, §5: put the message that matters where the reader finds it).

How NextPDF approaches it

The architecture is built around a single decision: state that lives for the process is shared, and state that lives for a render is fresh and thrown away. Fonts are structural data parsed once and then locked, so no render can mutate them and pollute the next one. The image cache is the one shared store that stays mutable: it is a bounded least-recently-used store that is never locked, so it can keep evicting and memory stays capped without leaking across requests. The document factory is a stateless singleton; every document it creates is disposable.

That separation is what makes a worker safe to run for hours under Octane, RoadRunner, or Swoole. It removes the failure mode where “request N corrupts request N+1” by construction, rather than by hoping the document resets itself.
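The split between locked shared state and disposable per-render state can be sketched in a few lines. This is a minimal illustration under stated assumptions: `LockableRegistry` and `DisposableDocument` are hypothetical stand-ins written for this page, not NextPDF's `FontRegistry` or `Document`, and the real classes will differ in detail.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical warm-then-lock registry (illustration only).
 * Writes are allowed during warmup and rejected after lock().
 */
final class LockableRegistry
{
    /** @var array<string, string> */
    private array $entries = [];
    private bool $locked = false;

    public function register(string $name, string $parsedData): void
    {
        if ($this->locked) {
            throw new LogicException("Registry is locked; cannot register '{$name}'.");
        }
        $this->entries[$name] = $parsedData;
    }

    public function lock(): void
    {
        $this->locked = true;
    }

    public function get(string $name): string
    {
        return $this->entries[$name]
            ?? throw new OutOfBoundsException("Unknown entry '{$name}'.");
    }
}

/**
 * Hypothetical per-render object: it reads the shared registry but owns
 * its own content, so discarding it discards all per-render state.
 */
final class DisposableDocument
{
    /** @var list<string> */
    private array $lines = [];

    public function __construct(private readonly LockableRegistry $fonts)
    {
    }

    public function addLine(string $font, string $text): void
    {
        $this->fonts->get($font); // read-only lookup; throws if the font is unknown
        $this->lines[] = $text;
    }

    public function lineCount(): int
    {
        return count($this->lines);
    }
}
```

Two documents built from the same registry share the parsed fonts and nothing else: content added to one never appears in the other, and a late `register()` call fails loudly instead of silently changing what the next render sees.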

The scenario has four stages.

Warm the shared state once. On worker boot, parse and lock the font registry and size the image cache. This cost is paid once, not per document.
Enqueue the work. A queue holds the render jobs. The queue is the throughput dial: workers scale horizontally behind it.
Render on a disposable document. Each worker creates a fresh document from the factory, renders, emits the bytes, and lets the document go.
Measure, then size. Collect per-render time and peak memory. Size the fleet from measurements on your own templates, not a generic figure.

The high-volume scenario end to end: shared immutable state is warmed once; each job renders on a disposable document and releases; throughput scales by adding workers, not by enlarging one.
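The "measure, then size" stage is arithmetic once the measurements exist. The sketch below is one way to do it, not a NextPDF API: `percentile()` and `workersNeeded()` are hypothetical helpers, and the choice of the 95th percentile, the 25% headroom, and one render at a time per worker are all assumptions you should replace with your own.

```php
<?php

declare(strict_types=1);

/**
 * Nearest-rank percentile of a list of samples.
 *
 * @param list<float> $samples
 */
function percentile(array $samples, float $p): float
{
    if ($samples === [] || $p <= 0.0 || $p > 100.0) {
        throw new InvalidArgumentException('Need samples and 0 < p <= 100.');
    }
    sort($samples);
    $rank = (int) ceil($p / 100 * count($samples));

    return $samples[$rank - 1];
}

/**
 * Workers needed to finish $jobs renders within $windowSeconds, assuming
 * each worker renders one document at a time (an assumption of this sketch).
 *
 * @param list<float> $renderTimesMs per-render times measured on your templates
 */
function workersNeeded(
    array $renderTimesMs,
    int $jobs,
    float $windowSeconds,
    float $pct = 95.0,
    float $headroom = 1.25,
): int {
    if ($windowSeconds <= 0.0) {
        throw new InvalidArgumentException('Window must be positive.');
    }
    $perRenderSeconds = percentile($renderTimesMs, $pct) / 1000;
    $totalWorkSeconds = $perRenderSeconds * $jobs * $headroom;

    return max(1, (int) ceil($totalWorkSeconds / $windowSeconds));
}
```

For example, with measured samples of 100, 200, 300, and 400 ms, the nearest-rank 95th percentile is 400 ms. Rendering 100,000 documents in one hour with 25% headroom is 50,000 worker-seconds of work, so `workersNeeded([100.0, 200.0, 300.0, 400.0], 100_000, 3600.0)` returns 14. The input samples are the part that matters; four samples are far too few for a real plan.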

The framework bridges make this shape the default rather than something you assemble. The Laravel service provider registers the font registry as a warmed, locked singleton and binds the document as a fresh instance per resolve. It ships a queued job with bounded tries, a timeout, and exponential backoff. That job validates its output path on the worker side, because a serialized queue payload can be tampered with in transit. The Symfony and CodeIgniter integrations follow the same disposable-document, shared-registry discipline.
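Two of the job behaviours described above, worker-side path validation and exponential backoff, are easy to get subtly wrong in hand-rolled wiring. The sketch below shows one plausible shape for each. Both functions are hypothetical helpers written for this page; they are not the code inside `GeneratePdfJob`, whose actual checks may be stricter.

```php
<?php

declare(strict_types=1);

/**
 * Resolve $candidate and confirm it lands inside $baseDir.
 * Hypothetical helper for illustration; not NextPDF's validation code.
 */
function resolveOutputPath(string $baseDir, string $candidate): string
{
    $base = realpath($baseDir);
    if ($base === false) {
        throw new RuntimeException('Base output directory does not exist.');
    }

    // The file may not exist yet, so canonicalise its parent directory.
    $parent = realpath(dirname($candidate));
    $name = basename($candidate);
    if ($parent === false || $name === '' || $name === '.' || $name === '..') {
        throw new InvalidArgumentException('Output path is not a valid file location.');
    }

    // Compare with a trailing separator so "/out-evil" cannot pass for "/out".
    $sep = DIRECTORY_SEPARATOR;
    if (!str_starts_with(rtrim($parent, $sep) . $sep, rtrim($base, $sep) . $sep)) {
        throw new InvalidArgumentException('Output path escapes the base directory.');
    }

    return rtrim($parent, $sep) . $sep . $name;
}

/**
 * Delays in seconds between attempts: base, 2x, 4x, ... capped at $capSeconds.
 * The default base and cap are illustrative values, not NextPDF defaults.
 *
 * @return list<int>
 */
function backoffSchedule(int $tries, int $baseSeconds = 10, int $capSeconds = 300): array
{
    $delays = [];
    for ($retry = 0; $retry < $tries - 1; $retry++) {
        $delays[] = min($capSeconds, $baseSeconds * 2 ** $retry);
    }

    return $delays;
}
```

Validating on the worker, after deserialization, is the point: the check runs on the value the job will actually write to, not on the value the producer believed it enqueued. In Laravel, a list like the one `backoffSchedule()` returns is the kind of value a job's `backoff()` method can return.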

What the evidence says

The memory model is code-backed. The Laravel NextPdfServiceProvider registers the FontRegistry as a singleton that is warmed and then lock()-ed, the ImageRegistry as a bounded-LRU singleton that is deliberately not locked, and the Document as a per-resolve binding via a stateless factory. The disposable-document model is in the wiring, not in prose. The GeneratePdfJob carries tries, timeout, and backoff, and re-validates its output path inside handle().

The measurement surface is benchmark-backed. The engine emits an immutable RenderReport per generation carrying render time in milliseconds, peak memory in bytes, page count, warning counts, and fallback occurrences. These are the exact inputs you need to size a fleet. A separate memory-fragmentation analyzer distinguishes peak (transient) from retained memory. That distinction tells you whether a long-lived worker is healthy or slowly leaking. The benchmark harness itself is configured for repeated revolutions with warmup, because a single timing is noise.
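To see why warmup and repetition matter, it helps to write the smallest possible harness by hand. The sketch below is not NextPDF's benchmark harness; `measureRender()` is a hypothetical helper, and the warmup and revolution counts are arbitrary defaults. It discards the first few runs, times the rest, and reports the median with its spread rather than a single timing.

```php
<?php

declare(strict_types=1);

/**
 * Time a render callable over several revolutions after a warmup phase.
 *
 * @param callable(): void $render
 * @return array{revs: int, min_ms: float, median_ms: float, max_ms: float}
 */
function measureRender(callable $render, int $warmup = 3, int $revs = 20): array
{
    if ($revs < 1 || $warmup < 0) {
        throw new InvalidArgumentException('Need revs >= 1 and warmup >= 0.');
    }

    // Warmup runs are executed but not recorded (caches, autoloading, JIT).
    for ($i = 0; $i < $warmup; $i++) {
        $render();
    }

    $samples = [];
    for ($i = 0; $i < $revs; $i++) {
        $start = hrtime(true);
        $render();
        $samples[] = (hrtime(true) - $start) / 1e6; // nanoseconds to milliseconds
    }

    sort($samples);
    $mid = intdiv($revs, 2);
    $median = $revs % 2 === 1
        ? $samples[$mid]
        : ($samples[$mid - 1] + $samples[$mid]) / 2;

    return [
        'revs'      => $revs,
        'min_ms'    => $samples[0],
        'median_ms' => $median,
        'max_ms'    => $samples[$revs - 1],
    ];
}
```

Pass it a closure that renders one of your real templates. A wide gap between `min_ms` and `max_ms` is itself a finding: it tells you how much a single timing could have misled you.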

The discipline is a design principle. NextPDF reports performance with its method and refuses unqualified speed claims. That is consistent with how this documentation is written: ISO 24495-1:2023, §5 places the message that matters where the reader will find it. The message that matters here is “measure your own workload”.

Practical example

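The code below is the disposable-document loop with measurement. It times each render and reads memory by hand with PHP's own functions; the engine's RenderReport carries the same kind of figures per generation. The queue is your infrastructure.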

<?php

declare(strict_types=1);

use NextPDF\Contracts\DocumentFactoryInterface;
use NextPDF\Observability\RenderReport;
use Psr\Log\LoggerInterface;

/**
 * One batch worker iteration: render, emit, release, measure.
 *
 * The factory and its registries are process-lifetime singletons; the
 * document is disposable. Retained memory must return to baseline between
 * iterations or the worker is leaking.
 *
 * @param iterable<int, callable(\NextPDF\Core\Document): \NextPDF\Core\Document> $jobs
 */
function runBatch(
    DocumentFactoryInterface $factory,
    LoggerInterface $logger,
    iterable $jobs,
): void {
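    foreach ($jobs as $jobId => $build) {
        // Reset the high-water mark so the peak logged below is per-render,
        // not process-lifetime. memory_reset_peak_usage() needs PHP 8.2+.
        if (function_exists('memory_reset_peak_usage')) {
            memory_reset_peak_usage();
        }
        $startedAt = hrtime(true);

        // Fresh, disposable document; it shares the warmed registries.
        $doc = $factory->create();
        $doc = $build($doc);
        $bytes = $doc->getPdfData();

        // Hand $bytes off to your sink here (object store, response, etc.),
        // before the per-render state is released.
        unset($doc, $bytes); // let the per-render state go

        $elapsedMs = (hrtime(true) - $startedAt) / 1_000_000;

        $logger->info('pdf.render.complete', [
            'job_id'             => $jobId,
            'render_time_ms'     => round($elapsedMs, 2),
            'peak_memory_mb'     => round(memory_get_peak_usage(true) / 1_048_576, 2),
            // Memory still in use after release: the baseline to watch.
            'retained_memory_mb' => round(memory_get_usage() / 1_048_576, 2),
        ]);
    }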
}

The unset() is not cosmetic. The per-render state is meant to be released each iteration so retained memory returns to baseline. A worker whose baseline climbs across iterations is the failure this loop is designed to avoid.

Common misconception

The headline misconception is “how many PDFs per second can NextPDF do?” as if it had one answer. It does not, and quoting one is how fleets get mis-sized. Render cost is dominated by the document, so the only number worth acting on is the one measured on your own templates with the engine’s own per-render report. A figure without the document, the hardware, and the method behind it is decoration, not data.

The second misconception is that peak memory is the thing to watch. Peak is transient and expected — it returns. The number that ends a batch is retained memory that does not return. That is exactly why the engine separates the two.
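The difference between the two numbers is easy to observe with PHP's own memory functions. The sketch below is a rough, hand-rolled probe, not the engine's memory-fragmentation analyzer: `observeMemory()` is a hypothetical helper, and it relies on `memory_reset_peak_usage()`, which exists only in PHP 8.2 and later. On older versions the peak figure falls back to the process-lifetime high-water mark.

```php
<?php

declare(strict_types=1);

/**
 * Run one render and report transient peak and retained growth in bytes,
 * both relative to memory in use just before the render started.
 *
 * @param callable(): void $render
 * @return array{peak_bytes: int, retained_bytes: int}
 */
function observeMemory(callable $render): array
{
    gc_collect_cycles();

    // Without this reset (PHP < 8.2) the peak is the process-lifetime maximum.
    if (function_exists('memory_reset_peak_usage')) {
        memory_reset_peak_usage();
    }
    $before = memory_get_usage();

    $render();

    // Collect cycles first so unreachable garbage is not counted as retained.
    gc_collect_cycles();

    return [
        'peak_bytes'     => memory_get_peak_usage() - $before,
        'retained_bytes' => memory_get_usage() - $before,
    ];
}
```

A healthy render shows a large `peak_bytes` and a `retained_bytes` near zero. A render that appends to anything outliving it (a static cache, a collected log, a held document) shows `retained_bytes` growing by roughly the same amount on every call, which is the pattern that eventually kills a worker.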

Limits and boundaries

There is no universal throughput figure, and this page deliberately states none. Render cost depends on your documents; measure with the per-render report.
Bounded memory depends on the disposable-document model being used. Holding a document across many renders, or sharing mutable per-render state, reverts the guarantee. The framework bridges default to the safe shape. Hand-rolled wiring must replicate it.
The image cache is bounded, not unbounded. Under heavy unique-image workloads the LRU evicts. That is the design, not a regression.
Worker-pool sizing, queue choice, and autoscaling are deployment decisions outside the engine. NextPDF supplies the measurements and the bounded primitive. It does not run your queue.
RenderReport is data, not a verdict. It tells you what happened on a render. Turning that into a capacity plan is your analysis.
This page is benchmark-backed for the measurement surface and code-backed for the memory model. It asserts no specific rate.
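The image-cache limit above is easier to reason about with the eviction rule in front of you. The sketch below is a minimal count-bounded LRU, offered as an illustration only: `BoundedLruCache` is a hypothetical class, and NextPDF's ImageRegistry may bound by bytes, by entries, or by something else entirely.

```php
<?php

declare(strict_types=1);

/**
 * Minimal least-recently-used cache bounded by entry count.
 * PHP arrays preserve insertion order, so the first key is always the
 * least recently used one as long as every access re-inserts its key.
 */
final class BoundedLruCache
{
    /** @var array<string, string> */
    private array $items = [];

    public function __construct(private readonly int $capacity)
    {
        if ($capacity < 1) {
            throw new InvalidArgumentException('Capacity must be at least 1.');
        }
    }

    public function get(string $key): ?string
    {
        if (!array_key_exists($key, $this->items)) {
            return null;
        }
        $value = $this->items[$key];
        unset($this->items[$key]);
        $this->items[$key] = $value; // move to the most-recent end

        return $value;
    }

    public function put(string $key, string $value): void
    {
        unset($this->items[$key]);
        $this->items[$key] = $value;

        // Over capacity: drop the least recently used entry.
        if (count($this->items) > $this->capacity) {
            unset($this->items[array_key_first($this->items)]);
        }
    }

    public function count(): int
    {
        return count($this->items);
    }
}
```

With a capacity of two, putting `a`, putting `b`, reading `a`, then putting `c` evicts `b`, because `a` was touched more recently. A batch in which every document carries a unique image will evict on almost every render; the cache still caps memory, but it stops saving work.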

Queued high-volume generation primitives — edition availability
Edition	Availability
Core	The disposable-document model, shared immutable registries, the per-render `RenderReport`, and the memory-fragmentation analyzer are Core. Plain high-volume PDF generation needs no commercial tier.
Pro	Same primitives; commercial features (signing, PDF/A) add per-render cost you should measure, not assume.
Enterprise	Same primitives; structured-invoice and validation work adds further per-render cost that scales with payload and rule-set size.

Memory and streaming — how the engine keeps memory bounded on large documents and where it streams.
Honest benchmarking — what a benchmark number is worth without its method, and how NextPDF reports performance.
Operating NextPDF in production — turning per-render reports into health signals once the batch runs for real.

Glossary

Disposable document — a document instance created for a single render and discarded after, so no state leaks into the next render.
Shared registry — process-lifetime state (fonts, image cache) reused across renders without per-render cost. The font registry is immutable after warmup; the image cache stays mutable but bounded.
Peak memory — the transient high-water mark during a render; expected and returns to baseline.
Retained memory — memory still held after a render completes; a rising retained baseline across renders is a leak.
Worker — a long-lived process that pulls render jobs from a queue; must stay memory-bounded to survive a batch.
RenderReport — the engine’s immutable per-render metrics snapshot (time, peak memory, page count, warnings) used to size capacity from real data.