Document Architecture

PDF Info Dictionary: Legacy Metadata

Before XMP brought XML-based metadata to PDF, document metadata lived in a fixed, flat object called the Document Information Dictionary (/Info). It holds the Title, Author, Keywords, and the timestamps recording when a PDF was created and last modified.
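To make the idea of a flat key/value container concrete, here is a minimal Python sketch that reads entries out of the raw text of an Info dictionary and converts a date entry. It is an illustration under simplifying assumptions, not a PDF parser: the names `parse_info_entries` and `parse_pdf_date` are hypothetical, only plain literal strings in parentheses are handled (no escapes, nested parentheses, hex strings, or encryption), and only full-precision dates are accepted.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches "/Key (literal string)" pairs. Simplifying assumption: values contain
# no nested or escaped parentheses and the file is not encrypted.
ENTRY_RE = re.compile(r"/([A-Za-z]+)\s*\(([^()]*)\)")

# Matches D:YYYYMMDDHHmmSS with an optional UTC offset such as -04'00' or Z.
DATE_RE = re.compile(
    r"D:(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})"
    r"(?:(Z)|([+-])(\d{2})'?(\d{2})'?)?"
)


def parse_info_entries(raw: str) -> dict[str, str]:
    """Return the key/value pairs found in a raw /Info dictionary body."""
    return {key: value for key, value in ENTRY_RE.findall(raw)}


def parse_pdf_date(value: str) -> datetime:
    """Convert a full-precision PDF date string into a datetime."""
    match = DATE_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"unsupported PDF date: {value!r}")
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    zulu, sign, off_hours, off_minutes = match.groups()[6:]
    if sign:
        offset = timedelta(hours=int(off_hours), minutes=int(off_minutes))
        tz = timezone(-offset if sign == "-" else offset)
    elif zulu:
        tz = timezone.utc
    else:
        tz = None  # no offset given: return a naive datetime
    return datetime(year, month, day, hour, minute, second, tzinfo=tz)


# Hypothetical sample data for illustration only.
sample = "<< /Title (Quarterly Earnings Report Q3) /CreationDate (D:20231015093000-04'00') >>"
info = parse_info_entries(sample)
created = parse_pdf_date(info["CreationDate"])
print(info["Title"], created.isoformat())
```

For real files, a dedicated PDF library is the safer choice, because the Info object may be compressed, encrypted, or written as hex strings that this sketch does not handle.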

Quick Answer

When you press `Ctrl + D` in Acrobat to view Document Properties, the Description fields you see come from the document's metadata, traditionally the `/Info` dictionary. That metadata lives completely independently of the visible text on the pages. A PDF might show the headline "Annual Final Report" on page 1, but if the author never filled out the Info Dictionary, the internal `/Title` might still read 'Microsoft Word - Doc1.docx', and that is the text browser tabs and search results may display.
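A leftover title like the one above can be caught before publishing. The following Python sketch flags titles that look inherited from a working file. The patterns are hypothetical heuristics chosen for illustration, not rules from any specification, and `looks_like_placeholder` is an assumed helper name.

```python
import re

# Hypothetical heuristics: patterns that often indicate a title inherited from
# a working file rather than one set deliberately by the author.
PLACEHOLDER_PATTERNS = [
    re.compile(r"^\s*$"),                                   # empty or whitespace only
    re.compile(r"^Microsoft Word - ", re.IGNORECASE),       # export prefix seen in the example above
    re.compile(r"\.(docx?|pptx?|xlsx?)$", re.IGNORECASE),   # a file name used as the title
    re.compile(r"^untitled", re.IGNORECASE),                # default name in many editors
]


def looks_like_placeholder(title):
    """Return True when a /Title value is missing or looks like a leftover."""
    if title is None:
        return True
    return any(pattern.search(title) for pattern in PLACEHOLDER_PATTERNS)


for candidate in ["Microsoft Word - Doc1.docx", "Annual Final Report", ""]:
    print(repr(candidate), looks_like_placeholder(candidate))
```

A check like this only inspects the metadata string. It cannot tell whether the title matches the headline printed on the page, which still needs a human look.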

Standard Keys of the Info Dictionary

Unlike the rest of a PDF, which is highly extensible, the traditional Info Dictionary relies on a small set of core keys defined in the first PDF versions of the early 1990s:

  • /Title: The actual name of the document. Important for accessibility (assistive technology can announce it as the document name) and for search engines, which typically treat it the way they treat an HTML page's title element.</li>
<li><strong>/Author &amp; /Subject:</strong> Descriptive strings naming the document's author and summarizing its purpose.</li>
<li><strong>/Keywords:</strong> A comma-separated list of strings. Historically used heavily by early enterprise search engines to index documents before full-text indexing was common.</li>
<li><strong>/Creator vs /Producer:</strong> <em>Creator</em> is the desktop application the user was typing in (e.g., Apple Pages). <em>Producer</em> is the library or engine that actually generated the PDF (e.g., macOS Quartz PDFContext).</li>
<li><strong>/CreationDate &amp; /ModDate:</strong> Machine-readable timestamps recording when the document was created and last modified.</li>
</ul> </section>
<section id="how-it-works" aria-labelledby="how-heading"> <h2><span class="h2-icon"><svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><polyline points="22 12 18 12 15 21 9 3 6 12 2 12"/></svg></span><span id="how-heading">The Shift to XMP Metadata</span></h2>
<div class="gloss-table-wrap"><table class="gloss-table"> <thead><tr><th>Feature</th><th>Legacy /Info Dictionary</th><th>Modern XMP Stream</th></tr></thead> <tbody>
<tr><td><strong>Format</strong></td><td>PDF-specific dictionary syntax (e.g., <code>/Title (Report)</code>)</td><td>Standard XML (RDF/XML).</td></tr>
<tr><td><strong>Extensibility</strong></td><td>Poor. Custom keys such as `/CopyrightStatus` are possible, but they are flat strings that other tools rarely recognize.</td><td>High. Can embed entirely custom XML schemas (Creative Commons, DRM tokens).</td></tr>
<tr><td><strong>Tool Agnostic</strong></td><td>No. A PDF parser must follow the trailer at the end of the file to locate it.</td><td>Yes.
A shell script or general-purpose web crawler can usually find the cleartext `&lt;?xpacket&gt;` XML packet without understanding PDF architecture.</td></tr>
<tr><td><strong>Adoption</strong></td><td>Deprecated in PDF 2.0 (apart from the date keys).</td><td>Required by archival and prepress standards such as PDF/A and PDF/X-4.</td></tr>
</tbody> </table></div> </section>
<section id="real-world-example" aria-labelledby="rwe-heading"> <h2><span class="h2-icon"><svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><rect x="2" y="7" width="20" height="14" rx="2"/><path d="M16 21V5a2 2 0 0 0-2-2h-4a2 2 0 0 0-2 2v16"/></svg></span><span id="rwe-heading">Real-World Scenarios</span></h2>
<div class="gloss-example-card"><span class="gloss-example-label">🔍 SEO &amp; Web Hosting</span><h3>The Browser Tab Disaster</h3><p>A marketing team spends thousands designing a beautiful "2024 Product Catalog" PDF, but the graphic designer forgets to update the PDF properties. The designer originally cloned an old file to save time. When thousands of customers open the link in Chrome, the browser reads the `/Title` from the document's metadata. The tab reads "2019 Internal Rough Draft V3", undermining the brand's credibility.</p></div>
<div class="gloss-example-card"><span class="gloss-example-label">⚖️ Legal Discovery (eDiscovery)</span><h3>Proving Document Origins</h3><p>In a lawsuit, a plaintiff claims they authored a critical contract on January 5th. Digital forensics experts extract the PDF and read the <code>/Info</code> dictionary. The <code>/CreationDate</code> shows `D:20240210...` (February 10).
Furthermore, the <code>/Creator</code> field shows the document was built in 'Adobe Photoshop' (suggesting image manipulation) rather than 'Microsoft Word' (standard drafting), a discrepancy that can change the legal strategy.</p></div>
<div class="gloss-example-card"><span class="gloss-example-label">♿ Accessibility Compliance (ADA)</span><h3>Screen Reader First Pass</h3><p>A government website uploads a tax form. A blind user navigates to the PDF using the JAWS screen reader. Accessibility guidelines such as WCAG, which are commonly used as the benchmark for ADA compliance, require a meaningful document title, and assistive technology uses the `/Title` to announce the document's context. If the `/Title` is blank, the user typically hears the raw file name or the URL the file was downloaded from instead, and the document fails compliance audits.</p></div> </section>
<section id="benefits" aria-labelledby="ben-heading"> <h2><span class="h2-icon"><svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><polyline points="20 6 9 17 4 12"/></svg></span><span id="ben-heading">Best Practices for Metadata</span></h2>
<div class="gloss-benefit-grid"> <div class="gloss-benefit-card"><span class="bc-icon">🧹</span><h3>Data Sanitization</h3><p>Always review and scrub the Info Dictionary before publishing documents externally, to avoid leaking internal author names or embarrassing working file titles.</p></div>
<div class="gloss-benefit-card"><span class="bc-icon">🔄</span><h3>Syncing Info and XMP</h3><p>Because many tools still read the deprecated `/Info` dictionary, well-behaved software keeps it in sync with XMP. If you change the Title in the XMP stream, the software also rewrites the legacy `/Info` Title to match, so that older and newer tools read the same values.</p></div>
<div class="gloss-benefit-card"><span class="bc-icon">📅</span><h3>Timestamp Integrity</h3><p>Do not rely on the "Date Modified" value your Windows or Mac file manager shows. Copying, downloading, or syncing a file can reset those filesystem timestamps.
<em>Only</em> the dates stored inside the PDF itself (the `/CreationDate` and `/ModDate` entries and their XMP equivalents) travel with the document contents, although anyone with a metadata editor can still change them.</p></div> </div> </section>
<section id="technical-breakdown" aria-labelledby="tech-heading"> <h2><span class="h2-icon"><svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><polyline points="16 18 22 12 16 6"/><polyline points="8 6 2 12 8 18"/></svg></span><span id="tech-heading">The Info Dictionary Syntax</span></h2>
<div class="gloss-code-block" role="region" aria-label="PDF Info Dictionary Example"> <div class="gloss-code-header"><span class="code-title">PDF OBJECT - Standard Trailer &amp; Info Definition</span><div class="gloss-code-dots"><span></span><span></span><span></span></div></div>
<pre class="gloss-code"><span class="c-comment">% 1. The Info dictionary exists as a standard indirect object</span>
90 0 obj
<span class="c-key">&lt;&lt;</span>
  <span class="c-key">/Title</span> <span class="c-val">(Quarterly Earnings Report Q3)</span>
  <span class="c-key">/Author</span> <span class="c-val">(Jane Doe - CFO Office)</span>
  <span class="c-key">/Subject</span> <span class="c-val">(Financials)</span>
  <span class="c-key">/Keywords</span> <span class="c-val">(finance, earnings, q3, public)</span>
  <span class="c-key">/Creator</span> <span class="c-val">(Microsoft Word)</span>
  <span class="c-key">/Producer</span> <span class="c-val">(Acrobat PDFMaker 21 for Word)</span>
  <span class="c-key">/CreationDate</span> <span class="c-val">(D:20231015093000-04'00')</span> <span class="c-comment">% Oct 15, 2023, 9:30 AM (UTC-4)</span>
  <span class="c-key">/ModDate</span> <span class="c-val">(D:20231016140500-04'00')</span> <span class="c-comment">% Oct 16, 2023, 2:05 PM</span>
<span class="c-key">&gt;&gt;</span>
endobj
...
<span class="c-comment">% 2. The trailer at the end of the file points to it,</span>
<span class="c-comment">% so parsers can locate it starting from the bottom of the file.</span>
<span class="c-comment">trailer</span>
<span class="c-key">&lt;&lt;</span>
  <span class="c-key">/Size</span> 91
  <span class="c-key">/Root</span> 1 0 R
  <span class="c-key">/Info</span> 90 0 R <span class="c-comment">% Points to Object 90 above</span>
<span class="c-key">&gt;&gt;</span>
<span class="c-comment">startxref</span>
112344
<span class="c-comment">%%EOF</span></pre> </div> </section>
<section id="common-mistakes" aria-labelledby="cm-heading"> <h2><span class="h2-icon"><svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><circle cx="12" cy="12" r="10"/><line x1="15" y1="9" x2="9" y2="15"/><line x1="9" y1="9" x2="15" y2="15"/></svg></span><span id="cm-heading">Common Metadata Pitfalls</span></h2>
<ul class="gloss-mistakes"> <li><strong>PDF 2.0 Confusion.</strong> The PDF 2.0 specification officially deprecated the `/Info` dictionary. However, many legacy scripts and enterprise search engines still rely on it. A common mistake is using a tool that writes only XMP metadata, so older recipient systems report the PDF title as missing or "Unknown."</li>
<li><strong>Inconsistent XMP Syncing.</strong> Overwriting the `/Title` in the dictionary with a quick Python script but forgetting to update the XMP stream in the same file. The PDF then carries contradictory metadata, and different software (for example, Acrobat versus Chrome) may display different values.</li>
<li><strong>Encrypting the Metadata Stream.</strong> Applying standard password security to a PDF encrypts the page content and the strings in the `/Info` dictionary, and by default the XMP stream as well. When the file is hosted on a web server, crawlers cannot read the Title or Keywords, which hurts search visibility.
Modern encryption settings explicitly offer options to "Leave metadata unencrypted."</li> </ul> </section> <section id="faq" aria-labelledby="faq-heading"> <h2><span class="h2-icon"><svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><circle cx="12" cy="12" r="10"/><path d="M9.09 9a3 3 0 0 1 5.83 1c0 2-3 3-3 3"/><line x1="12" y1="17" x2="12.01" y2="17"/></svg></span><span id="faq-heading">Frequently Asked Questions</span></h2> <ul class="gloss-faq" role="list"> <li class="gloss-faq-item" id="faq-1"><button class="gloss-faq-q" aria-expanded="false" aria-controls="faq-1-a" id="faq-1-btn" onclick="glossToggleFaq(this)"><span>Where do I find this data in Acrobat?</span><svg class="faq-chevron" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5"><polyline points="6 9 12 15 18 9"/></svg></button><div class="gloss-faq-a" id="faq-1-a" role="region" aria-labelledby="faq-1-btn"><p>Open any PDF, hit `Ctrl+D` (or go to File > Properties). The "Description" tab detailing the Title, Author, and Custom keys is a direct visual translation of the underlying `/Info` object.</p></div></li> <li class="gloss-faq-item" id="faq-2"><button class="gloss-faq-q" aria-expanded="false" aria-controls="faq-2-a" id="faq-2-btn" onclick="glossToggleFaq(this)"><span>Why is the Info Dictionary deprecated?</span><svg class="faq-chevron" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5"><polyline points="6 9 12 15 18 9"/></svg></button><div class="gloss-faq-a" id="faq-2-a" role="region" aria-labelledby="faq-2-btn"><p>It was too rigid. 
As documents evolved, publishers needed structured, nestable metadata for tracking licenses, DRM, and accessibility (which XML/XMP handles well), whereas the `/Info` dictionary is just a flat list of strings.</p></div></li>
<li class="gloss-faq-item" id="faq-3"><button class="gloss-faq-q" aria-expanded="false" aria-controls="faq-3-a" id="faq-3-btn" onclick="glossToggleFaq(this)"><span>What's the difference between Creator and Producer?</span><svg class="faq-chevron" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5"><polyline points="6 9 12 15 18 9"/></svg></button><div class="gloss-faq-a" id="faq-3-a" role="region" aria-labelledby="faq-3-btn"><p>`Creator` is the user-facing application where the content was authored (like Microsoft Word or Adobe Illustrator). `Producer` is the library or engine that performed the actual PDF conversion (like Acrobat Distiller).</p></div></li>
<li class="gloss-faq-item" id="faq-4"><button class="gloss-faq-q" aria-expanded="false" aria-controls="faq-4-a" id="faq-4-btn" onclick="glossToggleFaq(this)"><span>How is the CreationDate formatted?</span><svg class="faq-chevron" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5"><polyline points="6 9 12 15 18 9"/></svg></button><div class="gloss-faq-a" id="faq-4-a" role="region" aria-labelledby="faq-4-btn"><p>It uses a date string format modelled on ASN.1: `D:YYYYMMDDHHmmSSOHH'mm'`, where `O` is the relationship of local time to UTC (`+`, `-`, or `Z`).
So `D:20241024083000-05'00'` means October 24, 2024, at 8:30:00 AM, offset against UTC by -5 hours.</p></div></li> <li class="gloss-faq-item" id="faq-5"><button class="gloss-faq-q" aria-expanded="false" aria-controls="faq-5-a" id="faq-5-btn" onclick="glossToggleFaq(this)"><span>Do I still need to fill out the Info Dictionary today?</span><svg class="faq-chevron" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5"><polyline points="6 9 12 15 18 9"/></svg></button><div class="gloss-faq-a" id="faq-5-a" role="region" aria-labelledby="faq-5-btn"><p>Yes, for backward compatibility. While XMP is the modern standard, legacy web servers, email filtering systems, and older PDF readers only know how to look for the `/Info` block at the bottom of the file.</p></div></li> </ul> </section> <section id="related-terms" aria-labelledby="rt-heading"> <h2><span class="h2-icon"><svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"/><path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"/></svg></span><span id="rt-heading">Related Glossary Terms</span></h2> <div class="gloss-related-grid"> <a href="https://pdflyst.com/glossary-of-pdf-terms/pdf-2-0" class="gloss-related-card"><svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><circle cx="12" cy="12" r="10"/><path d="M12 8v4l3 3"/></svg><span class="rc-title">PDF 2.0 Specification</span><span class="rc-desc">The modern standard that officially deprecated the legacy Info dictionary.</span></a> <a href="https://pdflyst.com/glossary-of-pdf-terms/pdf-accessibility-tagging" class="gloss-related-card"><svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><circle cx="12" cy="12" r="10"/><path d="M12 8v4l3 3"/></svg><span class="rc-title">PDF Accessibility (/Tags)</span><span class="rc-desc">Compliance systems that heavily grade PDFs based on the existence of a valid 
`/Title` property.</span></a>
<a href="https://pdflyst.com/glossary-of-pdf-terms/pdf-catalog" class="gloss-related-card"><svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><path d="M4 19.5A2.5 2.5 0 0 1 6.5 17H20"/><path d="M6.5 2H20v20H6.5A2.5 2.5 0 0 1 4 19.5v-15A2.5 2.5 0 0 1 6.5 2z"/></svg><span class="rc-title">Document Catalog (/Root)</span><span class="rc-desc">The master object referenced alongside the Info object from the PDF trailer.</span></a>
<a href="https://pdflyst.com/glossary-of-pdf-terms/pdf-encryption-dictionary" class="gloss-related-card"><svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><rect x="3" y="11" width="18" height="11" rx="2" ry="2"/><path d="M7 11V7a5 5 0 0 1 10 0v4"/></svg><span class="rc-title">Encryption Dictionary</span><span class="rc-desc">Determines whether the metadata stream is encrypted along with the rest of the document.</span></a> </div> </section>
<section id="cta" class="gloss-cta" aria-label="Call to action"> <h2>Clean Up Your Document Data</h2> <p>Help your PDFs show the right title in search results and avoid leaking sensitive author names.
Edit your PDF metadata directly using PDFlyst.</p> <a href="https://pdflyst.com/pdf-editor" class="gloss-cta-btn" id="stream-cta-main"> <svg width="18" height="18" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5"><path d="M12 20h9"/><path d="M16.5 3.5a2.121 2.121 0 0 1 3 3L7 19l-4 1 1-4L16.5 3.5z"/></svg> Edit PDF Metadata </a> </section> </div></main>
<footer><div class="container"><div class="footer-bottom"><div class="copyright"><p>© 2026 PDFlyst. All rights reserved.</p></div><div class="footer-bottom-links"><a href="https://pdflyst.com/privacy">Privacy</a><a href="https://pdflyst.com/terms">Terms</a></div></div></div></footer>
<script src="/js/script.js"></script> <script src="/glossary-js/pdf-info-dictionary.js"></script> </body> </html>