CCITT compression is a lossless encoding algorithm that compresses purely
black-and-white (bitonal) images. Inside PDF files, it appears as the
CCITTFaxDecode filter. Group 4 — the most efficient variant —
compresses a typical scanned page by encoding runs of identical pixels and comparing lines to
eliminate redundancy. A document that weighs 500 KB as an uncompressed scan can shrink to under
25 KB with Group 4, with zero quality loss.
What Is CCITT Compression?
The name comes from the Comité Consultatif International Téléphonique et Télégraphique — the international committee that originally standardized it for fax machine transmission in the 1980s. Today, the standard is maintained by the ITU (International Telecommunication Union) and referred to as ITU-T T.4 (Group 3) and T.6 (Group 4).
The core insight behind CCITT is simple: most of a scanned black-and-white page is white. Rather than storing every individual white pixel, the algorithm records run lengths within each row, for example, "the next 2,400 pixels in this row are white." This makes the compressed data far smaller than the raw image data, with no information lost at all.
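The run-length idea can be sketched in a few lines of Python. This is only an illustration of the principle, not the actual CCITT bitstream, which writes each run as a variable-length code. The helper names `run_lengths` and `expand` and the sample row are assumptions made up for this example.

```python
from itertools import groupby

def run_lengths(row):
    """Return (pixel_value, run_length) pairs for one row of 0/1 pixels."""
    return [(value, len(list(group))) for value, group in groupby(row)]

def expand(runs):
    """Rebuild the original row from its runs (shows the step is lossless)."""
    row = []
    for value, length in runs:
        row.extend([value] * length)
    return row

# Hypothetical 2,480-pixel row: white margin, a short black stroke, white again.
row = [0] * 800 + [1] * 12 + [0] * 1668
runs = run_lengths(row)

print(runs)                 # [(0, 800), (1, 12), (0, 1668)]
print(expand(runs) == row)  # True
```

Three pairs stand in for 2,480 pixels, and expanding them gives back the identical row, which is the sense in which the method is lossless.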
In the PDF specification, CCITT compression is applied through the
CCITTFaxDecode filter. When a PDF viewer opens a page with scanned content, it reads
this filter, decompresses the bitonal stream, and renders the image. The filter is supported by
virtually every PDF reader, from Adobe Acrobat to browser-based viewers.
Bitonal only: CCITT works exclusively on images with exactly two colors — pure black and pure white. It cannot be applied to grayscale or color images. For those, PDF uses JPEG, JPEG 2000, or Flate compression instead.
How CCITT Group 4 Works
Group 4 uses two-dimensional coding, which is what makes it so efficient. Here is the process from scan to compressed PDF:
- The scanner produces a bitonal bitmap. Every pixel is either 0 (white) or 1 (black). A standard A4 page at 300 DPI (2480 × 3508 pixels) produces about 8.7 million pixels, roughly 1.1 MB of raw data at 1 bit per pixel before any compression.
- Run-length encoding handles each row. Instead of writing every pixel, the encoder writes the length of each alternating run. A row that starts with 800 white pixels then 12 black pixels is stored as two numbers: (800, 12).
- Two-dimensional coding compares adjacent rows. Group 4 looks at each scan line relative to the line above it. Because consecutive scan lines (rows of pixels) are often nearly identical, only the differences between lines need to be stored, which dramatically reduces the data.
- The result is stored as a CCITTFaxDecode stream in the PDF. The PDF stores the compressed bitonal data along with a filter dictionary specifying K = -1 (indicating Group 4), image dimensions, and color space.
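The row-comparison step can be illustrated with a deliberately simplified sketch. Real Group 4 (T.6) coding chooses between pass, vertical, and horizontal modes and writes variable-length codes. The code below only shows why a row that resembles the one above it can be described with a few small numbers. The function names and sample rows are assumptions for illustration.

```python
def transitions(row):
    """Positions where a pixel differs from the pixel to its left.

    An imaginary white (0) pixel is assumed to precede the row.
    """
    positions = []
    previous = 0
    for index, pixel in enumerate(row):
        if pixel != previous:
            positions.append(index)
            previous = pixel
    return positions

def row_offsets(reference, current):
    """Offset of each transition in `current` from the same-numbered
    transition in `reference`. Returns None when the counts differ,
    a case this simplified sketch does not handle."""
    ref_t = transitions(reference)
    cur_t = transitions(current)
    if len(ref_t) != len(cur_t):
        return None
    return [c - r for r, c in zip(ref_t, cur_t)]

# Two hypothetical 20-pixel rows: the black stroke shifts right by one pixel.
line_above = [0] * 5 + [1] * 4 + [0] * 11
line_below = [0] * 6 + [1] * 4 + [0] * 10

print(transitions(line_above))              # [5, 9]
print(transitions(line_below))              # [6, 10]
print(row_offsets(line_above, line_below))  # [1, 1]
```

When two rows are identical, every offset is zero. Small offsets like these are what make the two-dimensional scheme cheaper than encoding every row from scratch.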
The PDF Filter Dictionary
Here is what the CCITT filter parameters look like inside a PDF stream:

    <<
      /Type /XObject
      /Subtype /Image
      /Width 2480              % pixels wide (A4 @ 300 DPI)
      /Height 3508             % pixels tall
      /ColorSpace /DeviceGray
      /BitsPerComponent 1      % bitonal: 1 bit per pixel
      /Filter /CCITTFaxDecode
      /DecodeParms << /K -1 /Columns 2480 >>
    >>
    % K = -1 → Group 4 (T.6)
    % K = 0  → Group 3 1D (T.4)
    % K > 0  → Group 3 2D (T.4)

Group 3 vs. Group 4 at a Glance
| Feature | CCITT Group 3 | CCITT Group 4 |
|---|---|---|
| ITU Standard | T.4 | T.6 |
| Coding dimension | 1D (per row), optionally 2D | 2D (row vs. previous row) |
| Error resilience | Yes (end-of-line codes let a decoder resynchronize on noisy fax lines) | No (assumes clean channel) |
| Compression efficiency | Good | Excellent — typically 2–4× better |
| PDF K parameter | K=0 (1D) or K>0 (2D) | K=-1 |
| Best for | Legacy fax compatibility | PDF document archiving |
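The K values in the table can be turned into a small lookup, which is handy when inspecting or generating filter dictionaries. This is a sketch, not part of any PDF library. The function name `ccitt_variant` is made up for this example, and treating every negative K as Group 4 follows the usual reading of the PDF specification (writers conventionally emit -1).

```python
def ccitt_variant(k):
    """Map the /K entry of a CCITTFaxDecode /DecodeParms dictionary
    to the coding scheme it selects."""
    if k < 0:
        return "Group 4 (T.6, two-dimensional)"
    if k == 0:
        return "Group 3 (T.4, one-dimensional)"
    return "Group 3 (T.4, two-dimensional)"

for k in (-1, 0, 4):
    print(k, "->", ccitt_variant(k))
# -1 -> Group 4 (T.6, two-dimensional)
# 0 -> Group 3 (T.4, one-dimensional)
# 4 -> Group 3 (T.4, two-dimensional)
```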
Real-World Examples
Digitizing a Clinic's Paper Records
A medical clinic scans 5,000 patient history files, all black ink on white paper. Saved as color JPEG, the archive would consume 2.5 GB. Saved with CCITT Group 4, the entire archive fits into roughly 120 MB. That is small enough to store on a cheap USB drive or back up to the cloud in minutes. Every character remains perfectly sharp because not a single bit was lost in compression.
Court Filing a 1,000-Page Transcript
A court reporter scans a signed paper transcript using a document scanner set to CCITT Group 4 output. The resulting PDF reproduces the scanned page exactly: sharp, clean, high-contrast text. Because there is no gray or color noise, the file also works well with OCR software, which can make every word searchable. Many courts and e-filing systems accept or recommend this format for long-term electronic record keeping.
Archiving Historic Documents
A national archive scans millions of typed letters and government forms from the 1950s and
1960s. Using CCITT Group 4, each page averages 15–30 KB. The same page in TIFF without
compression would be 900 KB. The archive saves terabytes of storage while meeting the ISO
standard for long-term PDF archiving (PDF/A), which permits
CCITTFaxDecode.
Benefits of CCITT Group 4
Completely Lossless
Every pixel is reproduced perfectly after decompression. No blurring, no artifacts, no character distortion — ever.
Exceptional Compression Ratios
A typical scanned text page compresses at 15:1 to 30:1. A 1 MB raw scan becomes 30–70 KB in the PDF.
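The arithmetic behind these figures is easy to check. The sketch below assumes an A4 page at 300 DPI rounds to 2480 × 3508 pixels and that each row is padded to a whole byte. The function names are made up for this example, and the ratios are the 15:1 and 30:1 bounds quoted above rather than measurements.

```python
def raw_bitonal_bytes(width_px, height_px):
    """Uncompressed size at 1 bit per pixel, each row padded to a whole byte."""
    bytes_per_row = (width_px + 7) // 8
    return bytes_per_row * height_px

def compressed_kb(raw_bytes, ratio):
    """Size in KB (1 KB = 1000 bytes) after compressing at ratio:1."""
    return raw_bytes / ratio / 1000

raw = raw_bitonal_bytes(2480, 3508)       # A4 at 300 DPI (assumed dimensions)
print(raw)                                # 1087480
print(round(compressed_kb(raw, 15), 1))   # 72.5
print(round(compressed_kb(raw, 30), 1))   # 36.2
```

So a roughly 1.1 MB raw page lands in the range of about 36 to 73 KB at those ratios, in line with the figures quoted above.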
Universal Compatibility
Virtually every PDF viewer, printer, scanner, and e-filing system supports CCITTFaxDecode, so compatibility risk is minimal.
Fast Encode and Decode
The algorithm is computationally simple. Even embedded systems, old hardware, and network printers can run it at full speed.
Standards Compliant
Supported in PDF/A (ISO 19005) for archiving and in TIFF. Accepted or required by many government and legal e-filing standards.
OCR-Friendly
The strong black-to-white contrast of clean bitonal images suits optical character recognition well, provided the scan resolution is high enough to keep character shapes intact.
CCITT vs. JBIG2: Which to Use?
CCITT Group 4 has dominated bitonal PDF compression for decades, but JBIG2 — introduced in the PDF 1.4 specification — can sometimes produce even smaller files. Here is how they compare:
| Criteria | CCITT Group 4 | JBIG2 |
|---|---|---|
| Compression type | Lossless only | Lossless or lossy |
| Typical compression ratio | 15:1 – 30:1 on text | 30:1 – 100:1 on text (lossy mode) |
| How it works | Run-length + 2D diff coding | Symbol dictionary — reuses repeated glyphs |
| PDF viewer support | Universal (100%) | Modern viewers (PDF 1.4+) |
| Risk of quality loss | None (always lossless) | Possible in lossy mode — characters may look substituted |
| Best use case | Legal, medical, archival — anywhere accuracy is critical | Web delivery, size optimization, modern audiences |
Lossy JBIG2 controversy: In 2013, researchers found that Xerox scanners using lossy JBIG2 were silently substituting digits in scanned numbers — "6" became "8", for example. CCITT Group 4 leaves no room for this kind of error because it is always lossless. For legal and financial documents, CCITT Group 4 remains the safer choice.
Common Mistakes to Avoid
- Applying CCITT to grayscale or color scans. CCITT only works on bitonal images. Trying to use it on a grayscale photo will produce errors or incorrect output. Use Flate or JPEG for those.
- Scanning at too low a resolution before compression. CCITT is not a cure for a blurry scan. For readable text, scan at a minimum of 200 DPI — 300 DPI is the standard. CCITT will faithfully preserve whatever resolution you capture.
- Confusing the K parameter values. K=0 is Group 3 one-dimensional (not Group 4). For maximum compression, you need K=-1. This is a common source of confusion in custom PDF generators and libraries.
- Using CCITT for documents with mixed content. If a page contains both a black-and-white scan and a color photo or logo, CCITT can only handle the bitonal regions. Mixed-content PDFs typically combine multiple compression filters, one per image object.
- Assuming all scanners output true bitonal images. Many consumer scanners dither grayscale into pseudo-bitonal images. This reduces CCITT efficiency dramatically. Use a dedicated document scanner with a hardware bitonal (1-bit) output mode.
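Two of the mistakes above, feeding non-bitonal data to the encoder and compressing dithered output, can be illustrated with a short sketch. It does not call any real scanner or PDF API. The helper names, the 128 cutoff, and the sample rows are assumptions chosen for the example.

```python
from itertools import groupby

def is_bitonal(rows):
    """True if every sample in the image is exactly 0 or 1."""
    return all(pixel in (0, 1) for row in rows for pixel in row)

def threshold(gray_rows, cutoff=128):
    """Convert 8-bit gray (0 = black, 255 = white) to 1-bit, with 1 = black."""
    return [[1 if pixel < cutoff else 0 for pixel in row] for row in gray_rows]

def count_runs(row):
    """Number of same-colored runs a run-length encoder would have to store."""
    return sum(1 for _ in groupby(row))

# A clean 20-pixel black stroke versus the same region rendered as a dither pattern.
clean = [0] * 40 + [1] * 20 + [0] * 40
dithered = [0] * 40 + [i % 2 for i in range(20)] + [0] * 40

print(is_bitonal([[0, 128, 255]]))             # False: grayscale samples
print(is_bitonal(threshold([[0, 128, 255]])))  # True after thresholding
print(count_runs(clean))                       # 3
print(count_runs(dithered))                    # 21
```

The dithered row needs seven times as many runs for the same 100 pixels, which is why pseudo-bitonal scans compress so much worse than true 1-bit output.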
Frequently Asked Questions
- What is CCITT compression? CCITT compression is a lossless encoding method designed for bitonal (pure black-and-white) images. Originally developed for fax machines, it is now the standard for compressing scanned documents in PDF files via the CCITTFaxDecode filter. Group 4 is the most efficient variant and produces no quality loss whatsoever.
- What is CCITTFaxDecode? CCITTFaxDecode is the PDF filter name for CCITT compression. When a PDF reader encounters this filter, it decompresses the bitonal image data using the CCITT algorithm. The K parameter in the filter dictionary determines which variant is used: K=0 is Group 3 1D, K>0 is Group 3 2D, and K=-1 is Group 4.
- What is the difference between Group 3 and Group 4? Group 3 was designed for fax transmission over noisy phone lines. It uses one-dimensional run-length encoding (with an optional two-dimensional mode) and adds end-of-line codes so a decoder can recover from transmission errors, which costs some overhead. Group 4 uses two-dimensional encoding throughout, comparing each scan line to the previous one, and drops that error handling since it assumes a reliable channel. Group 4 produces significantly smaller files and is the standard for PDF documents.
- Is CCITT Group 4 lossless? Yes, completely. CCITT Group 4 is always lossless: every black and white pixel is reproduced identically after decompression. No information is discarded. This is critical for legal and medical documents where text sharpness and pixel-perfect accuracy may be required by law or regulation.
- When should I use CCITT Group 4 instead of JBIG2? Use CCITT Group 4 when maximum compatibility and guaranteed accuracy matter: legal filings, medical records, government archives, or any context where a single corrupted character is unacceptable. Use JBIG2 when file size is the top priority and you know your audience uses modern PDF software. JBIG2 can achieve better compression but carries compatibility risks with older systems.
- Can CCITT compress grayscale or color images? No. CCITT compression only works on bitonal images, with exactly two values per pixel: black or white. It cannot be applied to grayscale or color images. For those, use JPEG (DCTDecode), JPEG 2000 (JPXDecode), or Flate (FlateDecode) compression inside the PDF.