PDF Accessibility Tagging: Structure Tags for Screen Readers

PDF accessibility tagging adds a logical structure tree to a PDF (headings, paragraphs, lists, tables, alt text, reading order) so that screen readers and other assistive technology can convey the content meaningfully to users with disabilities. Without tags, a screen reader can only present the PDF as a flat stream of characters, often in the wrong order.

Quick Answer

To a sighted reader, a PDF looks like a formatted document with clear headings, tables, and images. To a screen reader on an untagged PDF, it's a flat stream of characters — no hierarchy, no table structure, no image descriptions, often in the wrong order. PDF structure tags solve this by adding a hidden logical structure tree: <H1> for main headings, <P> for paragraphs, <Table>/<TR>/<TD> for data cells, <Figure Alt="..."> for images. A screen reader navigates this tree — letting a visually impaired user hear "Heading Level 2: Financial Summary" followed by table data read as rows and columns. This is what makes a PDF accessible rather than just visual.

What Is PDF Accessibility Tagging?

A standard PDF document stores content as a flat collection of drawing operations — text rendered at specific coordinates, images placed at positions on the page. This representation is purely visual: it tells a viewer where to draw things, not what those things mean. It cannot express "this is a main heading," "these cells form a table row," or "this image shows a bar chart of quarterly revenue."

PDF accessibility tagging adds a parallel logical structure called the document's structure tree (or tag tree). The structure tree assigns semantic roles to every content element using a set of standard tag types defined in the PDF specification and PDF/UA standard:

  • Document structure: Document, Part, Sect, Article, Div
  • Block-level elements: H1–H6 (headings), P (paragraph), BlockQuote, Caption
  • Lists: L (list), LI (list item), Lbl (list item label), LBody (list item body)
  • Tables: Table, TR, TH (header cell), TD (data cell)
  • Inline elements: Span, Link, Annot, Code
  • Illustration/media: Figure (with Alt attribute for alt text), Formula
  • Decorative content: Artifact — marks page numbers, decorative rules, backgrounds as content to be skipped by screen readers

📌 Tagged ≠ Accessible. A PDF can have tags that are incorrect, in the wrong reading order, or missing alt text, and still pass automated checks while being unusable with a screen reader. Tags must be accurate, complete, and in logical reading order.
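
To make the idea of a structure tree more concrete, here is a minimal Python sketch of how a screen reader might walk one. The `Tag` class, the `announcements` function, and the announcement wording are all hypothetical and exist only for illustration; no real screen reader or PDF library exposes this API. Artifact is modelled as a node for simplicity, whereas in a real PDF artifacts are marked in the page content and sit outside the structure tree.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Tag:
    """One node in a simplified, hypothetical model of a PDF structure tree."""
    role: str                      # standard structure type, e.g. "H1", "P", "Figure"
    text: str = ""                 # text content carried by this node, if any
    alt: Optional[str] = None      # alternate description (used for Figure)
    children: List["Tag"] = field(default_factory=list)


def announcements(node: Tag) -> Iterator[str]:
    """Yield announcements depth-first, which is the logical reading order."""
    if node.role == "Artifact":
        return  # decorative content is skipped entirely
    if node.role == "Figure":
        yield f"Graphic: {node.alt}" if node.alt else "Graphic"
    elif node.role.startswith("H") and node.role[1:].isdigit():
        yield f"Heading level {node.role[1:]}: {node.text}"
    elif node.text:
        yield node.text
    for child in node.children:
        yield from announcements(child)


doc = Tag("Document", children=[
    Tag("H1", "Annual Report"),
    Tag("Artifact", "Page 1"),
    Tag("H2", "Financial Summary"),
    Tag("P", "Revenue grew in every quarter."),
    Tag("Figure", alt="Bar chart of quarterly revenue, rising from Q1 to Q4"),
])

for line in announcements(doc):
    print(line)
```

The page number never appears in the output, and a Figure without `alt` would be announced only as "Graphic", which is the gap that alt text is meant to close.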

Key PDF Accessibility Standards

| Standard | What It Covers | Who It Applies To |
|---|---|---|
| PDF/UA (ISO 14289) | Technical specification for universally accessible PDFs: complete tag tree, alt text, reading order, keyboard navigation | Any organisation creating accessible PDFs |
| WCAG 2.1 (W3C) | Web Content Accessibility Guidelines; W3C documents PDF-specific techniques PDF1–PDF23 for meeting its success criteria | Public sector, regulated industries (EU Web Accessibility Directive) |
| Section 508 (US) | US federal law requiring ICT accessibility; PDF documents in US government must conform to the Section 508 standards, which incorporate WCAG 2.0 Level AA | US federal agencies and their contractors |
| EN 301 549 (EU) | European ICT accessibility standard referenced by the EU Web Accessibility Directive; it applies WCAG-based requirements to documents, and PDF/UA-conformant tagging is a common way to meet them | European public sector organisations |
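
Several of these requirements show up as entries in the PDF's document catalog. The sketch below checks a few catalog-level entries that PDF/UA validators typically look at. It is an illustration only: `catalog_problems` is a hypothetical helper, and the catalog is represented as a plain Python dict, so how you obtain it from a real file depends on the PDF library you use. The key names (`/MarkInfo`, `/Marked`, `/StructTreeRoot`, `/Lang`, `/ViewerPreferences`, `/DisplayDocTitle`) come from the PDF specification.

```python
from typing import Any, Dict, List


def catalog_problems(catalog: Dict[str, Any]) -> List[str]:
    """Return catalog-level problems that would block PDF/UA-style conformance."""
    problems = []
    mark_info = catalog.get("/MarkInfo") or {}
    if not (isinstance(mark_info, dict) and mark_info.get("/Marked") is True):
        problems.append("document is not marked as tagged (/MarkInfo /Marked)")
    if "/StructTreeRoot" not in catalog:
        problems.append("no structure tree (/StructTreeRoot)")
    if not catalog.get("/Lang"):
        problems.append("no default language (/Lang)")
    viewer_prefs = catalog.get("/ViewerPreferences") or {}
    if not (isinstance(viewer_prefs, dict)
            and viewer_prefs.get("/DisplayDocTitle") is True):
        problems.append("title is not shown in the title bar (/DisplayDocTitle)")
    return problems


# Hypothetical catalogs, reduced to the entries this sketch inspects.
untagged = {"/Pages": {}}
tagged = {
    "/MarkInfo": {"/Marked": True},
    "/StructTreeRoot": {},
    "/Lang": "en-GB",
    "/ViewerPreferences": {"/DisplayDocTitle": True},
}

print(catalog_problems(untagged))
print(catalog_problems(tagged))
```

An empty result only means these catalog entries are present. It says nothing about whether the tags themselves are accurate, which still needs a full validator and a manual screen reader test.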

Real-World Examples

🏛️ Government Scenario

EU public sector: Accessible Annual Report

A European government ministry is required by the EU Web Accessibility Directive to publish its PDFs as accessible documents. Its communications team exports the annual report from InDesign with full structure tags: heading hierarchy, data tables with row and column headers, figure alt text for charts, and all decorative borders marked as Artifacts. A screen reader user explores the 200-page report using heading navigation (H jumps to the next heading in NVDA and JAWS; Insert+F7 opens the Elements List in NVDA, and Insert+F6 opens the heading list in JAWS), jumping directly between chapter headings. They then move through a complex budget table cell by cell with the table navigation keys (Ctrl+Alt+arrow keys in NVDA), following rows and columns just as a sighted reader scans them visually.

🎓 Education Scenario

University: Accessible Exam Paper

A UK university is legally required under the Equality Act 2010 to provide accessible exam papers for students with visual impairments. The exam PDF is fully tagged: each question is a numbered list item, mathematical equations are tagged as Formula with alt text or an associated MathML representation, diagrams have detailed alt text describing what a sighted student would observe, and the reading order follows the logical sequence of the questions. A student using JAWS can complete the exam independently on their own computer without requiring a human reader, which reduces administrative burden and increases the student's privacy and autonomy.

💼 Corporate Scenario

Financial Institution: Section 508 Compliant Reports

A US bank's investor relations team publishes quarterly earnings reports as PDFs that must meet Section 508 requirements under US federal accessibility law. Each report is verified with PAC 2024 (PDF Accessibility Checker) and then manually tested with NVDA + Firefox before publication. The PDF passes automated validation and the manual test confirms: the earnings tables are navigable as proper data tables, all charts have text descriptions as alt text, and a screen reader user can follow the complete narrative without missing any data or context available to sighted readers.

Why PDF Accessibility Tagging Matters

Equal Access

Tagged PDFs are usable by people with visual impairments, dyslexia, motor disabilities, and cognitive differences — ensuring equal access to information for all users.

Legal Compliance

Required or expected under Section 508 (US), the EU Web Accessibility Directive (public sector), the Equality Act (UK), the ADA (US), and the AODA (Ontario, Canada). Non-compliant documents create legal exposure.

Better Search & Reflow

Tags improve the accuracy of copy-paste and text search, support Reflow mode on mobile, and give search engines and document management systems text in the correct order for indexing.

AI Parsing

Large language model systems and document AI tools that read the structure tree can extract information more accurately from tagged PDFs, because headings, tables, and lists arrive with their semantics rather than as flat character strings.

Workplace Inclusion

Tagged PDFs allow employees with disabilities to participate fully in internal document workflows — policy documents, training materials, legal contracts — without requiring accommodation workarounds.

Language & Translation

Declaring the document's natural language, and marking passages in other languages with their own Lang attribute, lets assistive technology switch pronunciation profiles correctly in multilingual PDFs and gives machine translation tools a reliable source language.

Tagged PDF vs. Untagged PDF

| Capability | Untagged PDF | Tagged PDF |
|---|---|---|
| Screen reader heading navigation | ✗ Not possible | ✓ Jump between H1–H6 with keystrokes |
| Table row/column navigation | ✗ Read as plain text | ✓ Navigate cells with table navigation keys |
| Image description (alt text) | ✗ "Graphic" or silence | ✓ Full alt text announced |
| Correct reading order (multi-column) | ✗ Often wrong | ✓ Defined by tag tree order |
| Reflow mode on mobile | ✗ Layout breaks | ✓ Content reflows cleanly |
| Text extraction accuracy | ⚠ Coordinate-guessed order | ✓ Semantic reading order |
| Legal compliance | ✗ Non-compliant | ✓ Can meet PDF/UA, WCAG, and Section 508 when tags are accurate and complete |
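
The table navigation row is easier to appreciate with a small example. The Python sketch below shows how header cells (TH) let a reader hear the context for each data cell (TD). The `announce_cell` function and the budget figures are hypothetical, and the sketch assumes a simple table whose headers sit in the first row and first column; real PDFs can express more complex relationships through the Scope and Headers attributes.

```python
from typing import List, Tuple

Cell = Tuple[str, str]  # (role, text), where role is "TH" or "TD"


def announce_cell(table: List[List[Cell]], row: int, col: int) -> str:
    """Describe one cell together with its column and row headers."""
    _, text = table[row][col]
    parts = []
    # Column header: the TH at the top of this column, if there is one.
    top_role, top_text = table[0][col]
    if row > 0 and top_role == "TH":
        parts.append(top_text)
    # Row header: the TH at the start of this row, if there is one.
    left_role, left_text = table[row][0]
    if col > 0 and left_role == "TH":
        parts.append(left_text)
    parts.append(text)
    return ", ".join(parts)


# Hypothetical budget table with made-up figures.
budget = [
    [("TH", "Department"), ("TH", "2023"), ("TH", "2024")],
    [("TH", "Health"), ("TD", "4.1m"), ("TD", "4.6m")],
    [("TH", "Transport"), ("TD", "2.3m"), ("TD", "2.0m")],
]

print(announce_cell(budget, 2, 2))  # 2024, Transport, 2.0m
```

In an untagged PDF the same cell would be read as a bare "2.0m" somewhere in a run of numbers, with nothing to say which year or department it belongs to.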

Common Mistakes to Avoid

  • Relying only on auto-tagging. Adobe Acrobat's "Autotag Document" is a starting point, not a finished accessible document. Auto-tagging frequently misclassifies table cells, misses alt text, and gets reading order wrong in multi-column layouts. Always follow auto-tagging with manual verification using a screen reader.
  • Not marking decorative content as Artifact. Page numbers, decorative rules, background images, repeating headers and footers that are not meaningful content must be marked as Artifacts. Without this, a screen reader announces every decorative element as real content — creating a confusing and exhausting experience.
  • Using generic alt text for informative images. Alt text like "image" or "chart" tells a screen reader user nothing. Alt text must convey the same information that a sighted reader gets from looking at the image — for a chart: the trend it shows; for a photo: what is actually depicted and why it matters.
  • Failing to check reading order in multi-column layouts. In a two-column layout, the PDF's internal content order often reflects how the page was drawn or laid out rather than the intended reading order, so lines from different columns can be read interleaved. Tags must explicitly define the correct reading sequence. Always verify this with a screen reader or the Tags panel.
  • Passing automated checks but not testing with a real screen reader. Automated tools (PAC, Acrobat's Accessibility Checker) catch structural issues but cannot judge whether the content is communicated meaningfully to a screen reader user. Always perform a final manual test with NVDA, JAWS, or VoiceOver before publishing.
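
The reading order problem in multi-column layouts can be illustrated with a short Python sketch. It compares a naive position-based ordering, as a tool might guess for an untagged page, with the order recorded in the tag tree. The `Fragment` type, the coordinates, and the `order` field are hypothetical values chosen for the example.

```python
from typing import List, NamedTuple


class Fragment(NamedTuple):
    text: str
    x: float      # left edge of the fragment on the page
    y: float      # distance from the top of the page
    order: int    # position in the (hypothetical) tag tree


# Two-column page: the left column should be read fully before the right one.
page: List[Fragment] = [
    Fragment("Left column, line 1.", x=50, y=100, order=0),
    Fragment("Right column, line 1.", x=320, y=100, order=2),
    Fragment("Left column, line 2.", x=50, y=120, order=1),
    Fragment("Right column, line 2.", x=320, y=120, order=3),
]

# Geometry-only guess: top to bottom, then left to right.
by_geometry = [f.text for f in sorted(page, key=lambda f: (f.y, f.x))]

# Tag-tree order: the sequence the author actually intended.
by_tags = [f.text for f in sorted(page, key=lambda f: f.order)]

print(by_geometry)
print(by_tags)
```

The geometry-based list alternates between the two columns, while the tag-based list reads the left column to the end before starting the right one.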

Frequently Asked Questions

  • What is PDF accessibility tagging?
    PDF accessibility tagging adds a logical structure tree to a PDF, assigning semantic roles (H1–H6 headings, paragraphs, list items, table cells, image alt text, reading order) to every content element. Screen readers and assistive technology use this tree to convey document structure and meaning to users with disabilities.

  • What is PDF/UA?
    PDF/UA (ISO 14289) is the international standard for universally accessible PDF files. It requires a complete structure tree, alt text on all images, keyboard-accessible interactive elements, correct reading order, and language declaration. It is the PDF-specific technical standard most often used alongside WCAG 2.x.

  • Which structure tags matter most?
    H1–H6 (headings), P (paragraph), L/LI/LBody (list structure), Table/TR/TH/TD (table cells), Figure with Alt attribute (image alt text), Link (hyperlinks), Artifact (decorative content to skip). These 7 tag groups cover the vast majority of document content.

  • Does tagging alone make a PDF WCAG compliant?
    Tagging is necessary but not sufficient. WCAG compliance also requires: sufficient colour contrast, meaningful link text, language declaration, correct reading order, keyboard navigation, no seizure-triggering content, and form field labels. Run an automated checker (PAC 2024) AND manual screen reader testing (NVDA/JAWS).

  • How do I tag a PDF?
    Best approach: tag in the source document (Word styles, InDesign paragraph styles) before exporting to PDF. In Adobe Acrobat: Tools > Accessibility > Autotag Document as a starting point, then verify and fix with the Tags panel and Reading Order tool. Always validate with veraPDF or PAC 2024 and test with a real screen reader.

  • What is an Artifact?
    An Artifact marks content as purely decorative (page numbers, decorative rules, background pattern images, headers and footers) and tells the screen reader to skip it. When such content is not marked as an Artifact, a screen reader announces every decorative element as real content, creating confusing noise for the user.
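
As a rough picture of what an automated checker does, the Python sketch below scans a tag tree for two of the problems discussed above: figures with missing or generic alt text, and tables with no header cells. The dict-based tree shape, the `GENERIC_ALT` word list, and the `lint` function are hypothetical and far simpler than a real validator such as PAC or veraPDF.

```python
from typing import Any, Dict, Iterator, List

Node = Dict[str, Any]  # hypothetical shape: {"role": str, "alt": str, "children": [Node]}

GENERIC_ALT = {"", "image", "chart", "graphic", "picture", "photo"}


def walk(node: Node) -> Iterator[Node]:
    """Yield every node in depth-first (reading) order."""
    yield node
    for child in node.get("children", []):
        yield from walk(child)


def lint(root: Node) -> List[str]:
    """Flag figures with weak alt text and tables without header cells."""
    issues = []
    for node in walk(root):
        role = node["role"]
        if role == "Figure":
            alt = str(node.get("alt") or "").strip().lower()
            if alt in GENERIC_ALT:
                issues.append("Figure has missing or generic alt text")
        elif role == "Table":
            if not any(n["role"] == "TH" for n in walk(node)):
                issues.append("Table has no TH header cells")
    return issues


report = {"role": "Document", "children": [
    {"role": "Figure", "alt": "chart"},
    {"role": "Figure", "alt": "Line chart: net income rose each quarter of 2024"},
    {"role": "Table", "children": [
        {"role": "TR", "children": [{"role": "TD"}, {"role": "TD"}]},
    ]},
]}

for issue in lint(report):
    print(issue)
```

A check like this can tell that alt text is present and not a placeholder word, but it cannot tell whether the description is accurate or useful. That judgement is why manual screen reader testing remains necessary.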
