The Metadata in Your Files Is Telling People More Than You Realise
You carefully draft an anonymous report, save it as a PDF, and send it from a burner email. Except the PDF contains your name in the author field, the registered organisation of your Office installation, a unique document identifier, and a revision history that includes the original filename.
Metadata — data about data — is embedded in virtually every file type. Most of it is generated automatically, without your awareness. For anyone thinking about privacy or operational security, it represents a significant and frequently overlooked exposure.
What Metadata Contains
Documents
Microsoft Word files typically contain: author name, company/organisation, creation and modification dates, last modified by, revision number, total editing time, template name, and sometimes previous filenames. PDFs carry similar metadata plus the creation tool used (e.g., “Microsoft Word 2021” or “Adobe Acrobat Pro DC”).
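To see these fields for yourself: a .docx file is a zip archive, and the properties listed above live in two small XML parts, docProps/core.xml and docProps/app.xml. The sketch below reads them with Python's standard library only. The function name and the dictionary labels are illustrative choices, not part of any Office API, and it assumes a standard .docx rather than the older binary .doc format.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the OOXML core and extended property parts.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}

# (part inside the zip, element to look up, label used in the result).
FIELDS = [
    ("docProps/core.xml", "dc:creator", "author"),
    ("docProps/core.xml", "cp:lastModifiedBy", "last_modified_by"),
    ("docProps/core.xml", "cp:revision", "revision"),
    ("docProps/core.xml", "dcterms:created", "created"),
    ("docProps/core.xml", "dcterms:modified", "modified"),
    ("docProps/app.xml", "ep:Company", "company"),
    ("docProps/app.xml", "ep:TotalTime", "total_editing_minutes"),
    ("docProps/app.xml", "ep:Template", "template"),
]


def read_docx_metadata(path):
    """Return a dict of the metadata fields present in a .docx file."""
    found = {}
    with zipfile.ZipFile(path) as zf:
        names = set(zf.namelist())
        roots = {}  # parsed XML per part, so each part is read only once
        for part, xpath, label in FIELDS:
            if part not in names:
                continue
            if part not in roots:
                roots[part] = ET.fromstring(zf.read(part))
            node = roots[part].find(xpath, NS)
            if node is not None and node.text:
                found[label] = node.text
    return found
```

Calling read_docx_metadata("report.docx") on a file of your own (the filename is a placeholder) returns only the fields that are actually present, which makes it a quick way to check what a document would reveal before it leaves your machine.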
Images
JPEG photographs from smartphones typically contain: GPS coordinates (if location services are enabled), date and time, camera make and model, lens information, editing software used, and an embedded thumbnail that may show the original uncropped image.
The GPS risk is well known but still catches people out. In 2012, John McAfee’s hiding place in Guatemala was revealed by GPS coordinates left in the EXIF data of a photo published by Vice magazine.
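EXIF does not store a location as a single number. Latitude and longitude are each stored as three rational numbers (degrees, minutes, seconds) plus a reference letter for the hemisphere. The sketch below shows the arithmetic that turns those values into the decimal coordinates a map accepts. The function name is illustrative and the sample values are made up, not taken from any real photo.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert EXIF-style (degrees, minutes, seconds) rationals to signed decimal degrees.

    dms is three (numerator, denominator) pairs; ref is 'N', 'S', 'E' or 'W'.
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative by convention.
    if ref in ("S", "W"):
        value = -value
    return float(value)


# Hypothetical tag values for illustration only.
lat = dms_to_decimal(((15, 1), (30, 1), (0, 1)), "N")
lon = dms_to_decimal(((90, 1), (15, 1), (0, 1)), "W")
print(lat, lon)  # prints: 15.5 -90.25
```

One second of arc of latitude is roughly 31 metres, so coordinates stored at this resolution are often precise enough to single out a building.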
Audio and Video
Audio files record the recording software and encoding settings. Video files can contain creation dates, GPS coordinates, camera model, and encoding tool. Screen recordings can also embed the OS version, the recording software, and the display resolution.
Real-World Consequences
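The Rathergate scandal (2004) involved memos purportedly typed in the early 1970s. Strictly speaking, what gave them away was not embedded metadata but typography: the proportional spacing and superscripts closely matched Microsoft Word’s defaults, which led many analysts to conclude the memos were modern fabrications. A document’s appearance can betray its origin just as its metadata can.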
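NSA contractor Reality Winner (2017) leaked a document whose published scan carried printer tracking dots: tiny yellow dots that many colour laser printers add to every page, encoding the printer’s serial number and the date and time of printing. Researchers decoded the dots shortly after publication, although the FBI’s affidavit attributed her identification mainly to internal print logs and to creases showing the document had been printed. The EFF has published a list of printers known to embed these dots.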
Criminal investigations routinely use photo EXIF data to establish location, timeline, and device ownership.
How to Strip Metadata
Documents
Microsoft Word: File > Info > Check for Issues > Inspect Document. The Document Inspector finds and removes personal information, comments, and revision history.
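PDFs: ExifTool can blank PDF metadata from the command line (exiftool -all= file.pdf), but its documentation warns that the edit is reversible: the old metadata is superseded, not erased, and stays in the file. Rewriting the result with qpdf --linearize in.pdf out.pdf discards the superseded data. On its own, qpdf --linearize does not remove metadata. Adobe Acrobat Pro has a “Sanitize Document” feature.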
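Whichever tool you use, it is worth checking the output rather than trusting it. The sketch below is a deliberately crude check: it scans the raw bytes of a PDF for entries such as /Author or /Producer written as plain literal strings. Because it ignores the file's structure, it also finds copies left behind in earlier revisions of the file. It is an assumption-laden heuristic, not a PDF parser: it will miss metadata stored as hex strings, inside compressed object streams, or in XMP packets, so an empty result is not proof that a file is clean. The function and constant names are illustrative.

```python
import re

# Keys of the PDF document information dictionary that tend to identify
# people or tools.
INFO_KEYS = (b"Author", b"Creator", b"Producer", b"Title", b"Subject", b"Keywords")

# Matches "/Key (literal string)", allowing backslash escapes inside the string.
# Nested unescaped parentheses are not handled.
_PATTERN = re.compile(
    rb"/(" + b"|".join(INFO_KEYS) + rb")\s*\(((?:\\.|[^\\()])*)\)"
)


def find_info_strings(pdf_bytes):
    """Return (key, value) pairs for every literal-string entry found in the raw bytes."""
    return [
        (m.group(1).decode("ascii"), m.group(2).decode("latin-1"))
        for m in _PATTERN.finditer(pdf_bytes)
    ]
```

Reading a file with open(path, "rb").read() and passing the bytes to find_info_strings before and after cleaning gives a quick comparison: anything still listed afterwards is still recoverable by whoever receives the file.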
Images
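ExifTool is the gold standard: free, open source, and cross-platform. exiftool -all= image.jpg strips the metadata it can safely remove. By default it keeps the untouched original next to the output as image.jpg_original, so delete that backup, or add -overwrite_original, before sharing the folder.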
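It helps to know what "stripping" means at the byte level. A JPEG is a sequence of marked segments, and most metadata sits in a few of them: APP1 (EXIF, including the embedded thumbnail, and XMP), APP13 (Photoshop/IPTC) and COM (comments). The sketch below copies a JPEG while skipping those segments. It is an illustration of the idea and not a replacement for ExifTool: it assumes a well-formed file with no padding bytes between segments, and it leaves other segments, such as ICC colour profiles, in place. The function and constant names are illustrative.

```python
import struct

# Marker bytes for segments that commonly hold metadata:
# APP1 (EXIF and XMP), APP13 (Photoshop/IPTC) and COM (free-text comment).
METADATA_MARKERS = {0xE1, 0xED, 0xFE}


def strip_jpeg_metadata(data):
    """Return JPEG bytes with APP1, APP13 and COM segments removed."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, copy the rest untouched.
            out += data[pos:]
            return bytes(out)
        # Segment length is big-endian and counts its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2 : pos + 4])
        end = pos + 2 + length
        if marker not in METADATA_MARKERS:
            out += data[pos:end]
        pos = end
    raise ValueError("no start-of-scan marker found")
```

The image data itself is never decoded or re-encoded, so the picture is unchanged. Only the header segments that describe it are dropped.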
On mobile: iOS lets you strip location data when sharing. Tap “Options” at the top of the share sheet and disable “Location.” On Android, the equivalent setting depends on the OS version and the gallery app.
Social media: Most major platforms strip EXIF data from the copies they serve to other users, but the platform itself still receives the original file and may retain its metadata internally.
System-Level
MAT2 (Metadata Anonymisation Toolkit 2) ships with the Tails operating system and handles documents, images, audio, video, and archives.
Metadata You Can’t Strip
Printer tracking dots are embedded during printing at the hardware level. You cannot remove them from a printed document.
Network metadata — IP addresses, connection timestamps — is generated by sending the file, not by the file itself. Stripping file metadata doesn’t hide your IP address.
Stylometric analysis can identify authors based on writing patterns — word choice, sentence structure, punctuation — without metadata. No removal tool protects against this.
Practical Takeaways
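- Check before sharing. Right-click any file and view its properties before sending. The properties dialog shows only some fields, so for sensitive files run exiftool on the file to list everything it can read.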
- Strip metadata from sensitive documents using ExifTool, Document Inspector, or MAT2.
- Disable location tagging in your camera app if you don’t want GPS in photos by default.
- Assume metadata exists until you’ve verified removal.
The gap between what people think they’re sharing and what they’re actually sharing is often filled with metadata they never knew was there.