
How signatures sit in a PDF

Specs: ETSI EN 319 142-1, RFC 5652. Evidence: standard-backed.

A PDF signature is not wrapped around the file. It is embedded inside it: a dictionary that names the signature, and a digest computed over a declared range of bytes that deliberately skips the signature value itself. This page explains that mechanism and, equally importantly, what it does not promise.

“The document is signed” is a sentence people use to make decisions. They wire it to a payment, an approval, a legal obligation. If you do not know precisely which bytes a signature covers, you cannot say what a valid result actually proves. A PDF can carry a perfectly valid signature and still show a reader content that the signer never saw, because that content was added after signing, in a region the signature never claimed. Knowing where the signature’s authority starts and stops is the difference between a defensible decision and a hopeful one.

  • A PDF signature lives in a signature dictionary and a signature field inside the document, not as an external envelope.
  • The signed bytes are declared by a ByteRange array: two (offset, length) segments that together cover the whole file except the hexadecimal signature value held in the Contents entry.
  • The digest of those two concatenated segments is what the cryptographic signature actually protects.
  • Anything appended later in a new revision is outside the original byte range. The original signature stays valid; it never made a claim about the new bytes.
  • An approval signature and a certification signature differ in scope: certification (DocMDP) constrains what later changes are allowed; approval does not.
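
The byte-range points above can be made concrete with a short sketch. It is not NextPDF code: `parseByteRange` and `digestOverByteRange` are hypothetical helpers, the pattern assumes a single signature whose ByteRange holds exactly four integers, and SHA-256 is an assumed digest algorithm.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: find the first /ByteRange array in raw PDF bytes.
 * Assumes exactly four integers, i.e. two (offset, length) segments.
 *
 * @return array{int, int, int, int}
 */
function parseByteRange(string $pdf): array
{
    $pattern = '~/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]~';
    if (preg_match($pattern, $pdf, $m) !== 1) {
        throw new RuntimeException('No four-integer ByteRange found.');
    }
    return [(int) $m[1], (int) $m[2], (int) $m[3], (int) $m[4]];
}

/** Hash segment 1 followed by segment 2; the gap between them is skipped. */
function digestOverByteRange(string $pdf, array $range, string $algo = 'sha256'): string
{
    [$offset1, $length1, $offset2, $length2] = $range;
    if ($offset1 + $length1 > $offset2 || $offset2 + $length2 > strlen($pdf)) {
        throw new RuntimeException('ByteRange does not fit the file.');
    }
    $ctx = hash_init($algo);
    hash_update($ctx, substr($pdf, $offset1, $length1));
    hash_update($ctx, substr($pdf, $offset2, $length2));
    return hash_final($ctx); // lowercase hex digest
}
```

Changing any byte inside the gap leaves this digest unchanged, while changing any byte in either segment alters it. That is the whole sense in which the signature "covers" the file.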

NextPDF builds the signature according to the format’s model, in a fixed order, so the byte range is exact rather than approximate.

When the engine writes a signature, it first reserves a fixed-size slot for the Contents value and writes a ByteRange placeholder of fixed width. It waits until the complete document is written, including the cross-reference table and end-of-file marker. Only then does it compute the two real offsets, write them back into the placeholder without shifting any byte, hash the two segments, and place the resulting CMS object into the reserved slot. The placeholder is zero-padded to a constant length precisely so filling in the real numbers cannot move the bytes being hashed. This is the only order that produces a self-consistent signature. The engine treats any failure in this sequence as a hard error rather than a silent fallback.
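
The ordering described above can be illustrated with a toy example. This is a sketch under stated assumptions, not the engine's implementation: the constants, the miniature "PDF" string, and the use of a raw SHA-256 digest as a stand-in for the CMS object are all invented for the demonstration.

```php
<?php
declare(strict_types=1);

const SLOT_HEX_CHARS = 64; // reserved hex digits for Contents (tiny, demo only)
const NUM_WIDTH      = 10; // fixed width of each ByteRange number

// 1. Write the whole document with fixed-size placeholders.
$rangePlaceholder = '[' . implode(' ', array_fill(0, 4, str_repeat('0', NUM_WIDTH))) . ']';
$pdf = "%PDF-2.0\n<< /Type /Sig /ByteRange " . $rangePlaceholder
     . ' /Contents <' . str_repeat('0', SLOT_HEX_CHARS) . "> >>\n%%EOF\n";
$lengthBefore = strlen($pdf);

// 2. Only now are the real offsets known.
$contentsStart = strpos($pdf, '/Contents <') + strlen('/Contents '); // offset of '<'
$contentsEnd   = $contentsStart + SLOT_HEX_CHARS + 2;                // just past '>'
$range = [0, $contentsStart, $contentsEnd, strlen($pdf) - $contentsEnd];

// 3. Patch the numbers in, zero-padded to the same width, so no byte shifts.
$rangeText = '[' . implode(' ', array_map(
    static fn (int $n): string => str_pad((string) $n, NUM_WIDTH, '0', STR_PAD_LEFT),
    $range
)) . ']';
$pdf = substr_replace($pdf, $rangeText, strpos($pdf, $rangePlaceholder), strlen($rangePlaceholder));

// 4. Hash the two segments (the patched ByteRange text lies inside segment 1).
$digest = hash('sha256', substr($pdf, 0, $range[1]) . substr($pdf, $range[2], $range[3]), true);

// 5. Fill the reserved slot. A real signer would place DER-encoded CMS here;
//    this demo stores the raw digest as a stand-in.
$hex = bin2hex($digest);
if (strlen($hex) > SLOT_HEX_CHARS) {
    throw new RuntimeException('Signature does not fit the reserved slot.');
}
$pdf = substr_replace($pdf, str_pad($hex, SLOT_HEX_CHARS, '0'), $contentsStart + 1, SLOT_HEX_CHARS);
```

The file length is the same before step 3 and after step 5, which is the property the fixed-width placeholders exist to guarantee. If step 3 ran before the document was complete, the offsets would describe a file that no longer exists.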

For the PDF 2.0 profile, the signature object itself is a detached CMS SignedData structure. The PDF dictionary says where and how; the CMS object carries the who and the cryptographic proof.

  1. ISO 32000-2 §12.8.1: ByteRange digest and signature dictionary
  2. ISO 32000-2 §12.8.3.3: ETSI.CAdES.detached SubFilter
  3. ETSI EN 319 142-1: PAdES baseline profile
  4. RFC 5652: CMS SignedData in Contents

Where a PDF signature is defined, from the container format down to the cryptographic object: ISO 32000-2 specifies the dictionary and byte-range mechanism, ETSI EN 319 142-1 profiles it for PAdES, and RFC 5652 defines the CMS SignedData object placed in Contents.

The mechanism is defined by ISO 32000-2, §12.8.1. A byte-range digest is computed over the range of bytes indicated by the ByteRange entry. That range should be the entire file, including the signature dictionary but excluding the signature value (the Contents entry). ByteRange is an array of integer pairs, each a starting offset and a length. Discontiguous ranges are used specifically so the digest can omit the signature value itself.

For the PDF 2.0 profile, ISO 32000-2, §12.8.3.3 specifies that when the SubFilter is ETSI.CAdES.detached, the Contents value is a DER-encoded CMS SignedData object, the same structure that RFC 5652 defines. The PAdES profile of that object is the one ETSI EN 319 142-1 describes.
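
When inspecting a signed file, a quick structural check can confirm that a Contents value at least looks like CMS SignedData. The following is a minimal sketch, assuming you have already extracted the hex string; `looksLikeCmsSignedData` is a hypothetical helper that reads only the outer ContentInfo header and verifies nothing cryptographic.

```php
<?php
declare(strict_types=1);

// DER encoding of the OID 1.2.840.113549.1.7.2 (id-signedData, RFC 5652).
const ID_SIGNED_DATA_DER = "\x06\x09\x2A\x86\x48\x86\xF7\x0D\x01\x07\x02";

/**
 * Hypothetical check: does a hex Contents value hold a ContentInfo whose
 * contentType is id-signedData? Trailing zero padding in the slot is
 * irrelevant here because only the first few bytes are examined.
 */
function looksLikeCmsSignedData(string $contentsHex): bool
{
    $hex = preg_replace('/[<>\s]/', '', $contentsHex) ?? '';
    if ($hex === '' || strlen($hex) % 2 !== 0 || !ctype_xdigit($hex)) {
        return false;
    }
    $der = (string) hex2bin($hex);
    if (strlen($der) < 2 || $der[0] !== "\x30") { // ContentInfo is a SEQUENCE
        return false;
    }
    $lengthByte = ord($der[1]);
    // Short form: one length byte. Long form: the low 7 bits count the
    // additional length bytes that follow.
    $headerSize = 2 + (($lengthByte & 0x80) !== 0 ? ($lengthByte & 0x7F) : 0);

    return substr($der, $headerSize, strlen(ID_SIGNED_DATA_DER)) === ID_SIGNED_DATA_DER;
}
```

A `true` result says only that the slot contains something shaped like a SignedData ContentInfo. Whether the signature inside is valid, and whether it meets the PAdES profile, needs a real CMS parser and validator.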

Scope is not uniform across signatures. ISO 32000-2, §12.7.4.5 defines the MDP permission: a value of 0 makes the signature an approval signature, while values 1 through 3 make it a certification signature that constrains which later modifications keep the document conformant. Same byte-range mechanism; different promise about the future.
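
The permission values can be summarized in a small lookup. This is an illustrative sketch: `describeMdpPermission` is a hypothetical function, and the one-line descriptions of levels 1 to 3 paraphrase the DocMDP transform parameters in ISO 32000 rather than anything stated on this page.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical classifier for the MDP permission value P.
 * 0 means approval; 1 through 3 mean certification with increasing latitude.
 */
function describeMdpPermission(int $p): string
{
    return match ($p) {
        0 => 'approval: no constraint on later changes',
        1 => 'certification: no changes permitted',
        2 => 'certification: form fill-in and signing permitted',
        3 => 'certification: form fill-in, signing, and annotations permitted',
        default => throw new InvalidArgumentException("MDP permission must be 0 to 3, got {$p}."),
    };
}

foreach ([0, 1, 2, 3] as $p) {
    echo $p, ' => ', describeMdpPermission($p), "\n";
}
```

The byte range is computed the same way in every case. The permission changes only how a validator should judge revisions appended afterwards.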

NextPDF’s engine implements exactly this: a fixed-width ByteRange placeholder, the two-segment concatenated digest, and a detached CMS object in a reserved Contents slot, finalized only after the file is complete.

You rarely hand-build a ByteRange. The point of the example is to show the shape of the result so it is recognizable when you inspect a signed file.

<?php
declare(strict_types=1);

use NextPDF\Security\Signature\ByteRangeCalculator;

// Offsets the engine knows only after the whole PDF is written:
//   $contentsStart: byte just before the '<' of the hex signature
//   $contentsEnd:   byte just after the '>' that closes it
//   $fileLength:    total file size in bytes
$range = ByteRangeCalculator::calculate(
    contentsStart: $contentsStart,
    contentsEnd: $contentsEnd,
    fileLength: $fileLength,
);

// $range === [0, $contentsStart, $contentsEnd, $fileLength - $contentsEnd]
//   Segment 1: file start → just before the signature value
//   Segment 2: just after the signature value → end of file
// The signature value itself is the gap. It is never hashed.
$signedMessage = ByteRangeCalculator::extractSignedData($pdfBytes, $range);

// $signedMessage is segment 1 concatenated with segment 2, exactly the
// bytes the cryptographic digest is computed over.

The gap between the two segments is the signature value. It cannot be part of its own digest, which is why the range is two pieces and not one.

The trap is believing a valid signature means the whole file you are looking at is what was signed. It does not. It means the bytes inside the declared range are intact. A later revision can legitimately append content — a second signature, form data, validation material — outside that range. The first signature stays valid, and it says nothing about the addition. A correct viewer tells you a signature covers “the document as it existed at signing,” not “every byte on screen.” Treating the two as the same is how a signed document acquires unsigned content that looks signed.
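
One way to make this trap visible is to compare the end of the signed range with the current file length. The sketch below assumes you already have the four ByteRange integers; `bytesOutsideSignedRange` is a hypothetical helper and the numbers are invented for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: how many bytes lie after the end of the signed range?
 * $range is [offset1, length1, offset2, length2].
 */
function bytesOutsideSignedRange(array $range, int $fileLength): int
{
    [, , $offset2, $length2] = $range;
    $signedEnd = $offset2 + $length2;
    if ($signedEnd > $fileLength) {
        throw new RuntimeException('ByteRange extends past the end of the file.');
    }
    return $fileLength - $signedEnd;
}

$range = [0, 840, 960, 240]; // illustrative: the file was 1200 bytes when signed

echo bytesOutsideSignedRange($range, 1200), "\n"; // 0: the range reaches the current end of file
echo bytesOutsideSignedRange($range, 1850), "\n"; // 650: bytes were appended after signing
```

A non-zero result is not an error in itself, since later signatures and validation material are legitimately appended this way. It tells you that there is content the signature makes no claim about, which then has to be judged on its own terms.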

This page explains the structure, not the trust. A correctly formed ByteRange and CMS object tell you the bytes are intact and which key signed them. They do not, by themselves, tell you whether that key belongs to who you think, whether its certificate was valid at signing, or whether it was later revoked. That is certificate-path and revocation work, covered in Validating a signature properly. Nor does this page cover how to establish, with any independent authority, when the signing happened. A self-asserted signing time is not trusted time; see Timestamps and trusted time. NextPDF builds the structure described here; the certificates, trust anchors, and timestamp authority are supplied by your deployment, not by the engine.

What the engine ships, by tier, is the structure-building capability:

PAdES signature structure (byte range, signature dictionary, detached CMS): edition availability

Edition | Availability
Core | PAdES B-B: the signature dictionary, the fixed-width ByteRange, and the detached CMS SignedData object described on this page.
Pro | Adds PAdES B-T, a trusted timestamp on the signature value, over the same structure.
Enterprise | Adds the long-term profiles (B-LT, B-LTA): embedded validation material and document timestamps layered on the same byte-range foundation.

  • Signature dictionary — the PDF dictionary that names the signature handler, the SubFilter, the ByteRange, and the Contents value.
  • ByteRange — an array of (offset, length) integer pairs declaring the exact bytes the signature digest covers.
  • Contents — the hexadecimal entry holding the signature value (for PDF 2.0, a detached CMS SignedData object); it is excluded from its own digest.
  • CMS SignedData — Cryptographic Message Syntax (RFC 5652) structure carrying the signer’s certificate and the signature bytes.
  • PAdES — PDF Advanced Electronic Signatures: the ETSI profile of CMS signatures for PDF, defined in the ETSI EN 319 142 series.
  • Approval signature — a signature with MDP permission 0; it asserts the content without constraining later changes.
  • Certification signature — a signature with a DocMDP permission (MDP 1 through 3) that limits which later modifications keep the document conformant.