University of Cambridge Home Computing Service

Metadata and PDFs

When you generate a PDF, information from the original file may be inserted into the description, keywords and title slots unless you tell the processor otherwise. You can change these values in the PDF using the full version of Acrobat (Document properties > Summary). Increasingly, however, users generate their own PDFs and do not know that these values are there, or how to influence them before the PDF is generated.

PDFs from Word

Many versions of Word are in use, and this advice does not apply to all of them. The various ways of producing a PDF file also handle the process differently. In general, the most information is passed from Word when a PostScript file is distilled, rather than when a PDF file is written directly. Word may take the information from the following fields in the 'Properties' information and use it as metadata:

Original File Properties    Mapped Meta Tags
Title                       title
Subject                     subject
Author                      author
Keywords                    keywords
Comments                    doccomment
Last Saved by               lastsavedby
Revision Number             revisionnumber
Category                    category
Abstract                    description
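
The mapping in the table above can be written down as a simple lookup, which is handy for predicting which values will surface as metadata before the PDF is generated. This is a minimal sketch only: the names WORD_TO_META and map_properties are hypothetical, the property values are assumed to have been copied by hand from Word's 'Properties' dialog, and, as noted above, not every version of Word or conversion route will pass every field.

```python
# Mapping taken from the table above: Word property name -> meta tag name.
WORD_TO_META = {
    "Title": "title",
    "Subject": "subject",
    "Author": "author",
    "Keywords": "keywords",
    "Comments": "doccomment",
    "Last Saved by": "lastsavedby",
    "Revision Number": "revisionnumber",
    "Category": "category",
    "Abstract": "description",
}


def map_properties(word_properties):
    """Return {meta_tag: value} for known, non-empty Word properties.

    Properties that are not in the table, or whose value is blank,
    are skipped. Values are converted to text and trimmed.
    """
    mapped = {}
    for name, value in word_properties.items():
        tag = WORD_TO_META.get(name)
        if tag is None or value is None:
            continue
        text = str(value).strip()
        if text:
            mapped[tag] = text
    return mapped


if __name__ == "__main__":
    # Illustrative values only, not taken from a real document.
    example = {"Title": " Annual report ", "Abstract": "Summary", "Subject": ""}
    for tag, value in map_properties(example).items():
        print(f"{tag}: {value}")
```

Running the lookup over a document's properties before conversion shows which slots are likely to be filled and which will be left empty.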

PDFs from scans

PDFs created from scans need careful checking before being made available on the web. A scan saved as a graphic will contain no metadata, and one processed by OCR will probably contain faulty metadata.
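
Part of that check can be sketched in a few lines. The example below assumes the metadata has already been read out of the PDF into a plain dictionary (for instance by copying the values from Acrobat); the names REQUIRED_FIELDS and metadata_problems are hypothetical, and the choice of required fields simply follows the title, description and keywords slots mentioned earlier. It only reports empty or missing slots; faulty values from OCR still need to be read by a person.

```python
# Slots that should be filled before a PDF goes on the web (an assumption
# based on the title, description and keywords slots discussed above).
REQUIRED_FIELDS = ("title", "description", "keywords")


def metadata_problems(metadata):
    """Return a list of messages for required slots that are missing or blank."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = str(metadata.get(field) or "").strip()
        if not value:
            problems.append(f"{field}: missing or empty")
    return problems


if __name__ == "__main__":
    # A scan saved as a graphic typically carries no metadata at all.
    for message in metadata_problems({}):
        print(message)
```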

Changing metadata

Edit the metadata using Acrobat and, if you are able to, reindex the document as soon as possible.

Further info

