Standard PDF compression treats the entire scanned page as one giant image. Compress it hard enough to email and the text blurs. MRC (Mixed Raster Content) solves this by analysing the page and splitting it into three layers: a foreground layer for sharp text (JBIG2: tiny files, clean edges), a background layer for paper texture (a heavily compressed, low-resolution JPEG), and an image layer for photos (JPEG/JPX). A 10 MB scan can shrink to around 300 KB, and the text stays crisper than it would in a flat JPEG of similar size.
What Is MRC Compression?
Every scanned page contains three different types of content at once: sharp high-contrast text and line art; smooth colour background and paper texture; and full-colour photographs. These types have completely different visual characteristics — and completely different compression requirements.
Standard compression treats the whole page as one image, making it impossible to optimally compress text and photos simultaneously. MRC (Mixed Raster Content), standardised in ITU-T T.44, solves this by segmenting the scanned image into constituent layers before compressing each independently:
- The Foreground (Text) Layer is compressed with JBIG2 or CCITT Group 4, codecs designed for bi-level (black/white) content. Encoded losslessly, every pixel of the binarised text is reproduced exactly, so edges stay clean at any zoom level.
- The Background Layer is down-sampled and compressed aggressively with JPEG. Nobody needs to see individual paper fibres in high resolution.
- The Image Layer (photographs) is compressed with JPEG or JPX at a quality level appropriate for photos — high enough to look good, low enough to save space.
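
To see why splitting the page pays off even before any codec runs, it helps to compare raw pixel budgets. The sketch below is illustrative only: the A4 page size, the 300 DPI scan resolution, and the 100 DPI background resolution are assumptions chosen for the example, and `raw_bytes` is a hypothetical helper, not part of any MRC tool.

```python
import math

def raw_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of a raster covering the page at the given DPI."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return math.ceil(width_px * height_px * bits_per_pixel / 8)

# Assumed page: A4 (8.27 x 11.69 inches), scanned at 300 DPI in 24-bit colour.
A4 = (8.27, 11.69)

flat = raw_bytes(*A4, dpi=300, bits_per_pixel=24)        # one flat colour image
mask = raw_bytes(*A4, dpi=300, bits_per_pixel=1)         # bi-level text layer, full resolution
background = raw_bytes(*A4, dpi=100, bits_per_pixel=24)  # down-sampled colour background

for name, size in (("flat", flat), ("mask", mask), ("background", background)):
    print(f"{name:>10}: {size / 1_000_000:.1f} MB raw")
```

Under these assumptions the text layer and the down-sampled background together hold far fewer raw bytes than the flat colour image, and each is then handed to a codec suited to its content.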
Only for scanned (image-based) PDFs. MRC works on raster scans. PDFs created from Word, Excel, or InDesign already have separate text and image streams, so MRC is neither needed for nor applicable to them.
How MRC Segmentation Works
- Pixel classification. Each pixel is analysed — is it a sharp high-contrast edge (text/line art)? Or a gradual colour gradient (background/photo)?
- Layer separation. The engine builds three layers: binary foreground mask, background colour map, and photo regions.
- Per-layer compression. JBIG2/CCITT for foreground. Heavy JPEG for the background thumbnail. JPEG/JPX for photo regions.
- Mask encoding. A binary mask tells the PDF renderer which pixels come from foreground and which from background. The renderer composites layers on the fly during display.
- Optional OCR layer. An invisible text layer is added above the composited image, making the PDF fully searchable without changing visual quality.
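
The steps above can be sketched in a few lines. This is a deliberately simplified illustration, not how any particular MRC engine works: it assumes a greyscale page, classifies pixels with a single global threshold (real engines use edge and contrast analysis), and stores one representative value per layer instead of full foreground and background images. The names `segment` and `composite` are hypothetical.

```python
def segment(page, threshold=128):
    """Split a greyscale page (rows of 0-255 ints) into a mask and two layer values.

    mask[y][x] is 1 where the pixel is classified as text (darker than threshold).
    """
    mask = [[1 if px < threshold else 0 for px in row] for row in page]
    text = [px for row, mrow in zip(page, mask) for px, m in zip(row, mrow) if m]
    paper = [px for row, mrow in zip(page, mask) for px, m in zip(row, mrow) if not m]
    # Simplification: one average value per layer stands in for the spatially
    # varying foreground and background images a real encoder would keep.
    fg = sum(text) // len(text) if text else 0
    bg = sum(paper) // len(paper) if paper else 255
    return mask, fg, bg

def composite(mask, fg, bg):
    """Rebuild the page as a viewer would: the mask selects foreground or background."""
    return [[fg if m else bg for m in mrow] for mrow in mask]

# A tiny 4x3 "scan": dark strokes on slightly uneven light paper.
page = [
    [250, 248, 30, 252],
    [249, 25, 20, 251],
    [247, 250, 35, 246],
]

mask, fg, bg = segment(page)
rebuilt = composite(mask, fg, bg)
for row in rebuilt:
    print(row)
```

In the rebuilt page the stroke pixels share one dark value and the paper pixels share one light value, which is the effect behind text appearing cleaner after MRC processing.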
Real-World Examples
Law Firm: 50,000-Page Discovery Archive
A law firm digitises 50,000 pages of colour discovery documents. Standard compression would produce 500 GB, which is slow to transfer and expensive to host. With MRC the archive comes to 30 GB. Footnote text at 7 pt remains clearly legible, so no legal detail is lost to compression artefacts. The smaller files also open quickly on mobile devices, speeding document review.
University Library: 100-Year-Old Textbook
A university digitises a century-old textbook with yellowed paper, handwritten annotations, and complex diagrams. MRC separates the yellow background from the black ink. Students receive a PDF where text appears sharp on a clean white background, even though the original was stained and faded. The file is small enough to stream on a smartphone.
Hospital: Patient Record Digitisation
A hospital scans thousands of mixed-content patient records daily — forms with handwriting, printed labels, and small photos. MRC stores each page at under 30 KB while maintaining text readability at any zoom level. Storage costs drop by 80% compared to TIFF, with no loss in diagnostic readability.
Why MRC Compression Matters
Extreme Size Reduction
10–50× smaller than standard TIFF or flat-JPEG scans. A 500 MB invoice archive can shrink to 10 MB with no visible quality loss.
Superior OCR Accuracy
Isolating text on a clean binary layer removes background noise, dramatically improving word-level OCR accuracy for searchable and indexable archives.
Instant Web Loading
Tiny file sizes enable documents to open instantly in browsers and mobile apps, even on 3G — essential for document portals and remote access.
Archival-Grade Quality
A losslessly encoded text layer helps meet the legibility requirements of library and archival collections, and text looks sharper than it does in a flat-compressed copy of the same scan.
Storage Cost Reduction
80–95% storage savings versus TIFF — a direct and significant reduction in cloud hosting costs for high-volume document workflows.
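
Size reductions are quoted here both as ratios (10–50×) and as percentages (80–95%). The two are related by simple arithmetic, and the short sketch below converts between them. The function names are hypothetical and the figures fed in are the ones quoted in this article.

```python
def savings_percent(ratio):
    """Percentage of storage saved when files become `ratio` times smaller."""
    return (1 - 1 / ratio) * 100

def ratio_from_sizes(before, after):
    """Compression ratio from a before/after size pair in the same unit."""
    return before / after

for ratio in (5, 10, 20, 50):
    print(f"{ratio}x smaller -> {savings_percent(ratio):.0f}% saved")

# The 500 MB -> 10 MB invoice archive mentioned above.
print(f"{ratio_from_sizes(500, 10):.0f}x")
```

An 80–95% saving corresponds to files 5–20× smaller, while a 10–50× reduction corresponds to roughly 90–98% saved, so the two ranges overlap without being identical.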
Print-Ready Output
The lossless text layer means MRC PDFs can be printed at full resolution without re-scanning — suitable for reproduction as well as display.
MRC vs. Standard Compression
| Aspect | Standard Flat Compression | MRC Compression |
|---|---|---|
| Text quality at high compression | Blurry, pixelated | ✓ Sharp — lossless JBIG2 |
| Photo quality | Compromised | ✓ Optimised per region |
| Typical file size per page | 500 KB–5 MB | 20–80 KB |
| OCR accuracy | Limited — background noise | ✓ Excellent — clean text layer |
| Encoding complexity | Simple — single filter | Complex — segmentation + 3 layers |
| Best for | Simple grayscale text-only scans | Mixed-content colour scans |
Common Mistakes to Avoid
- Applying MRC to born-digital PDFs. MRC only helps when the source is a raster scan. PDFs from Word or InDesign already have separate text streams — no gain, just processing overhead.
- Using aggressive segmentation on poor-quality scans. A noisy 150 DPI scan can cause misclassification of text pixels as background. Always scan at 300+ DPI and visually verify output before committing to a batch workflow.
- Assuming MRC recovers sharpness. MRC optimises compression of whatever you give it. A blurred source scan remains blurred — MRC cannot add resolution that was never there.
- Not verifying OCR after MRC. Segmentation errors at foreground/background boundaries can affect OCR of characters near region edges. Spot-check critical documents after processing.
- Deploying in old lightweight PDF viewers without testing. MRC PDFs require the viewer to composite multiple layers. Modern viewers handle this correctly; very old or minimal renderers may not. Test in your target environment first.
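
A quick pre-flight check can catch under-resolved scans before they reach a batch workflow. The sketch below is a minimal example under stated assumptions: the physical page size is known in advance (US Letter here), the 300 DPI floor comes from the advice above, and `scan_ok_for_mrc` is a hypothetical helper that does not read real image files.

```python
def effective_dpi(pixels, inches):
    """Resolution implied by a pixel dimension and the physical size it covers."""
    return pixels / inches

def scan_ok_for_mrc(width_px, height_px, width_in, height_in, min_dpi=300):
    """Return (passes, dpi), using the lower of the horizontal and vertical DPI."""
    dpi = min(effective_dpi(width_px, width_in), effective_dpi(height_px, height_in))
    return dpi >= min_dpi, dpi

# Assumed page size: US Letter, 8.5 x 11 inches.
for width_px, height_px in ((2550, 3300), (1275, 1650)):
    ok, dpi = scan_ok_for_mrc(width_px, height_px, 8.5, 11)
    verdict = "ok" if ok else "rescan recommended"
    print(f"{width_px}x{height_px}: {dpi:.0f} DPI, {verdict}")
```

Passing this check says nothing about focus or noise, so a visual spot check of the segmented output is still worthwhile.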
Frequently Asked Questions
What is MRC compression?
MRC (Mixed Raster Content) segments each scanned page into foreground text (JBIG2), background paper (JPEG), and image regions (JPEG/JPX), then applies the most suitable compression to each. The result is files 10–50× smaller with text that stays sharp.
How is MRC different from standard compression?
Standard compression applies one algorithm to the whole page, so compressing hard blurs the text. MRC recognises text, photo, and background regions independently and applies the best algorithm to each, so text stays sharp even at extreme size reduction.
What are the layers in an MRC file?
1. Foreground: text and line art, losslessly compressed with JBIG2 or CCITT G4.
2. Background: paper texture, stored as an aggressively compressed low-resolution JPEG.
3. Mask: a binary map telling the renderer which pixels come from which layer.
Which documents benefit most from MRC?
Colour scanned documents with both text and photos: brochures, books, records, invoices, and mixed-content forms. MRC delivers the greatest benefit when sharp text coexists with colour backgrounds or photographs.
Does MRC affect OCR accuracy?
MRC generally improves OCR accuracy by isolating clean text on the foreground layer. Poor segmentation on low-quality source scans can occasionally misclassify pixels, so always validate OCR output on critical documents.
Is MRC the same as PDF/A?
No. MRC is a compression technique applied to raster image streams inside a PDF. PDF/A is an archival format standard. MRC is an advanced feature found in high-quality scanners and document management software.