Writer: PDF 2.0 serializer + xref
At a glance
The Writer module serializes a document into Portable Document Format (PDF) bytes. It selects a version strategy, writes the object graph, and emits the cross-reference structure and trailer.
Install
composer require nextpdf/core:^3
Conceptual overview
Use PdfWriter as the entry point. Pass a DocumentData value object to
write(). The method returns the complete PDF as a byte string. The writer
assembles the object graph, assigns object numbers, records byte offsets, and
writes the cross-reference structure last.
For each call, the writer uses one serialization strategy. The
PdfSerializationStrategy interface defines four methods: writeHeader(),
getCatalogVersion(), writeXrefAndTrailer(), and usesXrefStream(). Three
strategies implement it. Pdf20StreamStrategy writes the %PDF-2.0 header,
sets the catalog version to /2.0, and emits a cross-reference stream.
Pdf17TableStrategy writes %PDF-1.7 and a classic cross-reference table.
Pdf14TableStrategy writes %PDF-1.4 and a cross-reference table. PdfWriter
picks the strategy with a match on DocumentData::$outputProfile. The
default is Pdf20StreamStrategy.
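The selection step can be sketched as follows. This is a minimal illustration, not the library's code: `OutputProfile`, `SerializationStrategy`, `StreamStrategy20`, `TableStrategy`, and `selectStrategy()` are hypothetical stand-ins, and the sketch folds the two table strategies into one class for brevity.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for illustration only; not the NextPDF classes.
enum OutputProfile
{
    case Pdf20;
    case Pdf17;
    case Pdf14;
}

interface SerializationStrategy
{
    public function writeHeader(): string;
    public function usesXrefStream(): bool;
}

final class StreamStrategy20 implements SerializationStrategy
{
    public function writeHeader(): string { return "%PDF-2.0\n"; }
    public function usesXrefStream(): bool { return true; }
}

final class TableStrategy implements SerializationStrategy
{
    public function __construct(private readonly string $version) {}
    public function writeHeader(): string { return "%PDF-{$this->version}\n"; }
    public function usesXrefStream(): bool { return false; }
}

// The profile alone decides the strategy, so nothing carries over between calls.
function selectStrategy(OutputProfile $profile): SerializationStrategy
{
    return match ($profile) {
        OutputProfile::Pdf20 => new StreamStrategy20(),
        OutputProfile::Pdf17 => new TableStrategy('1.7'),
        OutputProfile::Pdf14 => new TableStrategy('1.4'),
    };
}

$strategy = selectStrategy(OutputProfile::Pdf17);
echo $strategy->writeHeader();
```

A `match` over an enum is exhaustive: if a case has no arm, PHP throws `UnhandledMatchError` at run time instead of falling through to a default.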
The PdfOutputProfile enum carries the three target versions: Pdf20,
Pdf17, and Pdf14. It exposes headerVersion(), catalogVersion(),
allowsObjectStreams(), and usesXrefStream(). An archival conformance mode
overrides the chosen profile before strategy selection. Pdf14FeatureGuard
rejects PDF 2.0 features when the profile is Pdf14.
A cross-reference stream maps each object number to its byte offset, as defined
by ISO 32000-2 §7.
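The offset bookkeeping can be illustrated with a small sketch. It uses the classic cross-reference table rather than a cross-reference stream, because the table is plain text and easier to read. `buildMinimalPdf()` is a hypothetical helper, not part of the Writer API, and it assumes objects are numbered 1..n with generation 0.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: writes each object, records its byte offset, and
 * appends a classic cross-reference table and trailer last.
 *
 * @param list<string> $bodies object bodies; entry i becomes object i + 1
 */
function buildMinimalPdf(array $bodies, int $rootObject): string
{
    $out = "%PDF-1.7\n";
    $offsets = [];

    foreach (array_values($bodies) as $index => $body) {
        $number = $index + 1;
        $offsets[$number] = strlen($out);   // byte offset of "N 0 obj"
        $out .= "{$number} 0 obj\n{$body}\nendobj\n";
    }

    $xrefOffset = strlen($out);             // where the table itself starts
    $size = count($offsets) + 1;            // plus the free entry for object 0
    $out .= "xref\n0 {$size}\n";
    $out .= "0000000000 65535 f \n";        // head of the free list
    foreach ($offsets as $offset) {
        // Each entry is exactly 20 bytes: 10-digit offset, 5-digit generation,
        // the keyword n, and a two-byte end of line (space + LF).
        $out .= sprintf("%010d %05d n \n", $offset, 0);
    }

    $out .= "trailer\n<< /Size {$size} /Root {$rootObject} 0 R >>\n";
    $out .= "startxref\n{$xrefOffset}\n%%EOF\n";

    return $out;
}

$pdf = buildMinimalPdf([
    '<< /Type /Catalog /Pages 2 0 R >>',
    '<< /Type /Pages /Kids [] /Count 0 >>',
], rootObject: 1);

echo $pdf;
```

Because the table stores absolute byte offsets, any change to the bytes written before it shifts every later entry. That is why the cross-reference structure is written last.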
Incremental updates append new objects to the end of the file, as defined by
ISO 32000-2 §7.5.6.
The writer escapes every literal string through the canonical
PdfStringEscaper::escapeLiteral() path, which follows the normative escape
table in ISO 32000-2 §7.3.4.2 (ADR-015).
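The escape rules for literal strings can be sketched as below. `escapePdfLiteral()` is a hypothetical function written for this illustration; it is not `PdfStringEscaper::escapeLiteral()` and may differ from it in detail. The sketch escapes every parenthesis, which is always valid even though the format also accepts balanced, unescaped pairs.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical escaper for the body of a PDF literal string (the bytes
 * between the parentheses). Illustration only.
 */
function escapePdfLiteral(string $bytes): string
{
    // strtr() with an array scans the input once and never rescans replaced
    // text, so an inserted backslash is not escaped a second time.
    return strtr($bytes, [
        '\\'   => '\\\\',
        '('    => '\\(',
        ')'    => '\\)',
        "\n"   => '\\n',
        "\r"   => '\\r',
        "\t"   => '\\t',
        "\x08" => '\\b',
        "\f"   => '\\f',
    ]);
}

// The input holds one backslash, a pair of parentheses, and a line feed.
$token = '(' . escapePdfLiteral("Total (net)\\gross\n") . ')';
echo $token, "\n";
```

Escaping the backslash and both parentheses is what keeps untrusted text from closing the string token early.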
The writer supports deterministic output. setDeterministicMode() pins object
identifiers and dictionary key order. setReproducibleClock() pins the document
timestamp. With both pins set, a fixed input produces byte-identical output. The
writeChunked() method returns a generator that yields the PDF in fixed-size
chunks. Streaming\StreamingPdfWriter writes one page at a time to a
caller-supplied stream for documents that exceed the memory budget.
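The idea behind the two pins can be sketched without the library. In this minimal illustration, `serializeDictionary()` and `pdfDate()` are hypothetical helpers: the first removes any dependence on key insertion order, the second formats a pinned timestamp as a PDF date string in UTC.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical serializer for a flat PDF dictionary. Sorting the keys makes
 * the output independent of insertion order.
 *
 * @param array<string, string> $entries key name (no slash) => serialized value
 */
function serializeDictionary(array $entries): string
{
    ksort($entries, SORT_STRING);
    $parts = [];
    foreach ($entries as $key => $value) {
        $parts[] = "/{$key} {$value}";
    }

    return '<< ' . implode(' ', $parts) . ' >>';
}

/** Formats a timestamp as a PDF date string in UTC, e.g. D:20260101000000Z. */
function pdfDate(DateTimeImmutable $moment): string
{
    return 'D:' . $moment->setTimezone(new DateTimeZone('UTC'))->format('YmdHis') . 'Z';
}

$pinned = new DateTimeImmutable('2026-01-01T00:00:00Z');

// Same entries, different insertion order, same pinned clock.
$first = serializeDictionary([
    'Producer'     => '(sketch)',
    'CreationDate' => '(' . pdfDate($pinned) . ')',
]);
$second = serializeDictionary([
    'CreationDate' => '(' . pdfDate($pinned) . ')',
    'Producer'     => '(sketch)',
]);

var_dump($first === $second);
```

If the timestamp came from the wall clock instead of `$pinned`, two runs made in different seconds would produce different bytes even with sorted keys. The ordering pin does not help without the clock pin.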
Linearizer rewrites a finished PDF into a linearized layout. It places the
first page early, so a viewer can show it before the full download completes.
shadowValidate() checks the rewrite without changing the input.
Caution.
PdfWriter.php and Linearizer.php are critical to byte offsets and the object graph (manifest danger zones). Do not change object numbering or xref offset arithmetic without running the Writer golden suite.
API surface
| Class | Key methods | Role |
|---|---|---|
| PdfWriter | write(DocumentData): string, writeChunked(DocumentData, int): Generator, setDeterministicMode(), setReproducibleClock(), setOutputColorProfile(), getLastXrefOffset(), getFileId() | Primary serializer |
| PdfSerializationStrategy (interface) | writeHeader(), getCatalogVersion(), writeXrefAndTrailer(), usesXrefStream() | Version strategy contract |
| Pdf20StreamStrategy | writeHeader() → %PDF-2.0, getCatalogVersion() → /2.0, usesXrefStream() → true | PDF 2.0 xref-stream strategy |
| Pdf17TableStrategy | writeHeader() → %PDF-1.7, xref table | PDF 1.7 xref-table strategy |
| Pdf14TableStrategy | writeHeader() → %PDF-1.4, xref table | PDF 1.4 xref-table strategy |
| PdfOutputProfile (enum) | Pdf20, Pdf17, Pdf14; headerVersion(), catalogVersion(), allowsObjectStreams() | Target-version selector |
| PdfXrefWriter | generateFileId(), finalizeTrailerAndXref() | File ID + trailer/xref finalization |
| Linearizer | linearize(string): string, shadowValidate(string): array | Fast-web-view rewrite |
| Streaming\StreamingPdfWriter | open(), newPage(), close() | Single-pass streaming writer |
Run composer docs:generate-api-php -- --module=Writer to generate the full
PHPDoc table.
Code sample — Quick start
Source: examples/02-pdf-factory.php.
<?php
declare(strict_types=1);
require_once __DIR__ . '/../vendor/autoload.php';
use NextPDF\Writer\PdfWriter;
// $documentData: a DocumentData value object; its construction is omitted here.
$writer = new PdfWriter();
$pdfBytes = $writer->write($documentData);
file_put_contents('out.pdf', $pdfBytes);
The default profile is PDF 2.0. Output starts with %PDF-2.0 and ends with a
cross-reference stream.
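A quick way to check those two properties on any output is sketched below. `inspectPdf()` is a hypothetical helper, not part of the Writer API. It relies only on file-structure facts: the header names the version, the last `startxref` keyword is followed by the byte offset of the final cross-reference section, a classic table starts with the keyword `xref`, and a cross-reference stream starts with an indirect object header.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sanity check: reports the header version and the kind of the
 * last cross-reference section.
 *
 * @return array{version: string, xref: string}
 */
function inspectPdf(string $bytes): array
{
    if (preg_match('/\A%PDF-(\d\.\d)/', $bytes, $header) !== 1) {
        throw new InvalidArgumentException('Missing %PDF-x.y header');
    }

    // Find the last startxref keyword and read the offset that follows it.
    $pos = strrpos($bytes, 'startxref');
    if ($pos === false || preg_match('/\Gstartxref\s+(\d+)/', $bytes, $m, 0, $pos) !== 1) {
        throw new InvalidArgumentException('Missing startxref');
    }
    $offset = (int) $m[1];

    if (substr($bytes, $offset, 4) === 'xref') {
        $kind = 'table';
    } elseif (preg_match('/\G\d+\s+\d+\s+obj/', $bytes, $unused, 0, $offset) === 1) {
        $kind = 'stream';
    } else {
        $kind = 'unknown';
    }

    return ['version' => $header[1], 'xref' => $kind];
}

// A tiny hand-built file with a classic table, used only to exercise the check.
$body = "%PDF-1.7\n1 0 obj\n<< /Type /Catalog >>\nendobj\n";
$sample = $body
    . "xref\n0 2\n0000000000 65535 f \n0000000009 00000 n \n"
    . "trailer\n<< /Size 2 /Root 1 0 R >>\n"
    . "startxref\n" . strlen($body) . "\n%%EOF\n";

print_r(inspectPdf($sample));
```

This is a structural smoke test only. It does not parse the cross-reference data and says nothing about conformance.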
Code sample — Production
This sample pins the deterministic mode and a fixed clock for byte-identical output, then streams the result in fixed-size chunks.
<?php
declare(strict_types=1);
require_once __DIR__ . '/../vendor/autoload.php';
use DateTimeImmutable;
use NextPDF\Writer\PdfWriter;
use NextPDF\Writer\ReproducibleClock;
$pinned = new DateTimeImmutable('2026-01-01T00:00:00Z');
// $documentData: a DocumentData value object; its construction is omitted here.
$writer = new PdfWriter();
$writer->setDeterministicMode($pinned, 'nextpdf-fixed-file-id');
$writer->setReproducibleClock(new ReproducibleClock($pinned));
// Consume the generator to the end; a partial read leaves a truncated PDF.
$out = fopen('php://output', 'wb');
foreach ($writer->writeChunked($documentData, chunkSize: 65536) as $chunk) {
    fwrite($out, $chunk);
}
fclose($out);
Edge cases & gotchas
- Only one strategy runs per write() call. The writer resets the strategy from the profile on every call. A prior call does not leak its version.
- An archival conformance mode overrides the requested profile. A PDF/A-3 build forces PDF 1.7. A PDF/A-4 build forces PDF 2.0.
- Byte-identical output requires both pins. Set the deterministic mode and a reproducible clock. One pin alone is not enough.
- writeChunked() returns a generator. Consume it completely. A partial read produces a truncated, invalid PDF.
- Linearizer rewrites cross-reference offsets. In a pipeline that cannot tolerate a failed rewrite, run shadowValidate() first.
- Pdf14TableStrategy is final readonly. The PDF 1.4 path rejects PDF 2.0 features through Pdf14FeatureGuard; it does not degrade them.
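The generator gotcha above can be reproduced with a plain-PHP sketch. `chunkBytes()` is a hypothetical stand-in for a chunked writer. It only slices an already assembled string, which is an assumption about the general shape of such a method, not a description of `writeChunked()` internals.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical chunker: yields a byte string in slices of at most
 * $chunkSize bytes.
 *
 * @return Generator<int, string>
 */
function chunkBytes(string $bytes, int $chunkSize): Generator
{
    if ($chunkSize < 1) {
        throw new InvalidArgumentException('Chunk size must be positive');
    }

    $length = strlen($bytes);
    for ($offset = 0; $offset < $length; $offset += $chunkSize) {
        yield substr($bytes, $offset, $chunkSize);
    }
}

$document = "%PDF-2.0\n" . str_repeat('x', 100) . "\n%%EOF\n";

// Full consumption: every chunk is collected, so the copy matches the source.
$complete = '';
foreach (chunkBytes($document, 32) as $chunk) {
    $complete .= $chunk;
}

// Partial consumption: stopping early drops the tail, including the
// end-of-file marker and everything the reader needs to find the xref.
$partial = '';
foreach (chunkBytes($document, 32) as $index => $chunk) {
    if ($index === 2) {
        break;
    }
    $partial .= $chunk;
}

var_dump($complete === $document);
var_dump(str_ends_with($partial, "%%EOF\n"));
```

A PDF reader locates the cross-reference structure from the end of the file, so a copy that stops before the final chunk cannot be opened.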
Performance
Serialization is linear in the object count and total byte size. The
cross-reference stream adds one pass over the object table. writeChunked()
holds the assembled document but yields it in bounded slices, so peak memory is
the document size plus one chunk. Streaming\StreamingPdfWriter does not hold
the whole document; use it for inputs larger than the memory budget. The
reference workload budget is 1500 ms wall and 64 MB peak. Linearization adds a
second full pass and a measure pass. Budget for it explicitly.
Security notes
The writer serializes a trusted in-memory object graph. Its inputs are the main
threat boundary. Every literal string passes through the canonical
PdfStringEscaper::escapeLiteral() (ADR-015), so embedded control bytes cannot
break out of a string token. Encryption is wired through PdfEncryptionWriter
and the /Encrypt trailer entry. Public-key encryption is rejected with an
explicit exception rather than silently downgraded. The deterministic and
reproducible-clock modes remove timestamp and ordering side channels from the
output. See /modules/core/security/ for the document threat model and the
encryption trust boundary.
Conformance
The Writer produces PDF 2.0 file structures: the %PDF-2.0 header, a /2.0
catalog version, a cross-reference stream, and literal-string escaping per the
ISO 32000-2 §7.3.4.2 escape table. These are implementation facts. Evidence
lives in src/Writer/Pdf20StreamStrategy.php, src/Writer/PdfSerializationStrategy.php,
and the strategy selection in src/Writer/PdfWriter.php. The behavior is
exercised by tests/Unit/Writer/ (192 tests, including the
Pdf20StreamStrategy, PdfXrefWriter, and Linearizer* suites) and the
tests/Golden/PdfWriter/PdfWriterGoldenBaselineSmokeTest baseline.
This is not a claim of full PDF 2.0 conformance. Full ISO 32000-2
conformance is a property of a complete document validated by an external
oracle, not of the serializer alone. End-to-end conformance is asserted only
where an oracle confirms it: tests/Integration/Accessibility/VeraPdfUa2GoldenTest
validates a generated fixture against veraPDF for PDF/UA-2, and
tests/Standards/Profile/PdfRConformanceTest covers the PDF/R profile. The
veraPDF golden test skips when the veraPDF binary is absent from the runner,
so it is an opt-in oracle gate, not an unconditional one. Set VERAPDF_BINARY
to run it. Archival-profile selection (PDF/A-3 → PDF 1.7, PDF/A-4 → PDF 2.0) is
decided by ADR-011 and the conformance mode, and validated by the conformance
suites in /modules/core/conformance/. Outside those oracle-backed profiles,
state that the Writer “produces PDF 2.0 structures; conformance is validated by
veraPDF for the PDF/UA-2 profile” rather than asserting unqualified conformance.