Zero‑Install ExifTool in a GPT‑5 Sandbox
Executive Summary
This document describes a portable, zero‑installation method for running ExifTool entirely from a bundled Perl
distribution inside a GPT‑5 sandbox environment. The approach uses only the writable /mnt/data filesystem,
requires no CPAN, no system installation, and no PATH dependency, and is suitable for
forensic, compliance, and reproducible analysis workflows.
It also documents insights, outliers, anomalies, quirks, and environment‑specific considerations encountered when running ExifTool in this constrained execution model.
Objectives
- Run ExifTool without installing anything system‑wide
- Use bundled Perl modules shipped with Image‑ExifTool
- Operate fully within /mnt/data
- Produce complete, verifiable forensic outputs
- Ensure deterministic and auditable execution
Environment Model (GPT‑5 Sandbox)
Key Characteristics
- Writable filesystem limited to /mnt/data
- No package managers (apt, brew, yum, CPAN)
- No persistent system PATH modifications
- Execution via controlled interpreters (Perl available, but unmanaged)
Design Principle
Treat /mnt/data as a self‑contained toolchain root, not merely a scratch directory.
All binaries, libraries, outputs, and manifests live under /mnt/data and are referenced via
absolute paths.
Why Zero‑Install ExifTool Works
ExifTool is distributed as:
- A pure Perl launcher script (exiftool)
- A local Perl module tree (lib/Image/ExifTool/*.pm)
The launcher is intentionally written to work without installation as long as:
- The exiftool script and the lib/ directory remain siblings
- Perl can resolve modules via relative paths
Critical Rule: If you move the launcher, you must move the lib/ directory with it.
This rule is the foundation of portable ExifTool execution.
Canonical Directory Layout
/mnt/data/
├── Image-ExifTool-13.xx.tar.gz
├── exiftool_runtime/
│   └── Image-ExifTool-13.xx/
│       ├── exiftool
│       └── lib/
│           └── Image/
│               └── ExifTool.pm
├── exiftool_outputs/
└── exiftool_run_manifest.json
No files are written outside /mnt/data.
Execution Binding Strategy
Explicit Perl Invocation (Mandatory)
perl /absolute/path/to/exiftool [flags] TARGET_FILE
Why This Matters
- Avoids reliance on executable bits
- Avoids shebang interpretation differences
- Avoids PATH resolution failures
- Ensures consistent behavior across sandboxes
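The invocation rule above can be sketched as a small helper that refuses relative paths. This is a minimal illustration, not part of ExifTool: the function name exiftool_argv is hypothetical, and it assumes the interpreter is reachable as "perl" (pass an absolute interpreter path via the perl parameter if that assumption does not hold).

```python
import os

def exiftool_argv(launcher, flags, target, perl="perl"):
    """Build an argv list of the form: perl /abs/exiftool [flags] /abs/target.

    Hypothetical helper: rejects relative launcher or target paths so the
    command never depends on the current working directory.
    """
    if not os.path.isabs(launcher):
        raise ValueError("launcher must be an absolute path: %r" % launcher)
    if not os.path.isabs(target):
        raise ValueError("target must be an absolute path: %r" % target)
    # A list (not a shell string) avoids quoting problems with odd filenames.
    return [perl, launcher, *flags, target]
```

Passing the resulting list to subprocess.run without shell=True keeps the exact argv available for the manifest.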
Mandatory Forensic Extraction Matrix
The following execution modes are recommended to maximize coverage:
| Purpose | Flags | Notes |
|---|---|---|
| Human forensic (grouped) | -a -u -ee -g1 | Readable, comprehensive; groups by specific location (family 1) |
| Family 0 grouping | -a -u -ee -g | Coarser grouping (-g is shorthand for -g0) |
| Machine ingest (JSON) | -a -u -ee -json | SIEM / pipeline friendly |
| XML | -a -u -ee -X | RDF/XML, XMP‑style structure |
| Structured JSON | -a -u -ee -struct -json | Nested metadata |
All stdout and stderr must be preserved verbatim.
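One way to honor the verbatim requirement is to capture both streams as bytes and write them to separate files, so no decoding step can alter them. The sketch below is illustrative; the function name run_verbatim and the two-file layout are assumptions, not something ExifTool prescribes.

```python
import subprocess

def run_verbatim(argv, stdout_path, stderr_path):
    """Run argv and store stdout and stderr byte-for-byte in separate files.

    Returns the process exit code. Keeping stderr out of the stdout file
    means JSON and XML outputs stay parseable.
    """
    proc = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    with open(stdout_path, "wb") as f:
        f.write(proc.stdout)
    with open(stderr_path, "wb") as f:
        f.write(proc.stderr)
    return proc.returncode
```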
Insights from Running in GPT‑5 Sandbox
1. Perl Is Present but Unmanaged
- Perl is available, but:
  - No CPAN modules can be installed
  - System @INC is minimal and volatile
Implication: Bundled lib/ is non‑negotiable.
2. Can't locate Image/ExifTool.pm Is the Primary Failure Mode
This error always indicates one of:
- lib/ directory missing
- lib/ not sibling to exiftool
- Incorrect launcher path
It is not a Perl version problem.
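The three causes listed above can be checked before Perl is ever started. The following is a minimal pre-flight sketch; the function name diagnose_layout and its reason strings are hypothetical.

```python
import os

def diagnose_layout(launcher_path):
    """Return a short reason string, or None if the portable layout looks valid."""
    launcher_path = os.path.abspath(launcher_path)
    if not os.path.isfile(launcher_path):
        return "incorrect launcher path"
    # lib/ must sit in the same directory as the launcher script.
    lib_dir = os.path.join(os.path.dirname(launcher_path), "lib")
    if not os.path.isdir(lib_dir):
        return "lib/ directory missing or not a sibling of exiftool"
    if not os.path.isfile(os.path.join(lib_dir, "Image", "ExifTool.pm")):
        return "lib/ present but Image/ExifTool.pm not found inside it"
    return None
```

A non-None result is a reason to fail closed and record the layout error rather than attempt a run.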
3. Working Directory Is Irrelevant
ExifTool resolution is based on:
- Absolute launcher path
- lib/ discovery relative to the launcher's own location
You may execute from any cwd without impact.
4. PDF Files Exhibit Non‑Intuitive Metadata Density
- XMP metadata often contradicts Document Info dictionary
- Incremental updates can hide prior metadata revisions
- Embedded images and objects surface only with -ee
Recommendation: Always include -ee for documents.
5. Duplicate Tags Are Common and Meaningful
- PDFs and images often contain multiple instances of the same tag
- Without -a, critical historical values may be suppressed
Forensics rule: -a is mandatory.
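To make duplicates visible in machine output, one option is to compare the same tag name across groups. The sketch below assumes a record parsed from a run with -a -G1 -json, where keys take the form "Group:Tag" (for example "PDF:ModifyDate"); the function name conflicting_tags is hypothetical.

```python
import json
from collections import defaultdict

def conflicting_tags(record):
    """Map tag name -> {group: value} for tags whose values differ across groups.

    Assumes keys of the form "Group:Tag"; ungrouped keys are skipped.
    """
    by_tag = defaultdict(dict)
    for key, value in record.items():
        group, sep, tag = key.partition(":")
        if not sep:
            continue  # e.g. "SourceFile" has no group prefix
        by_tag[tag][group] = value
    conflicts = {}
    for tag, groups in by_tag.items():
        # Serialize values so lists and dicts can be compared as set members.
        distinct = {json.dumps(v, sort_keys=True) for v in groups.values()}
        if len(distinct) > 1:
            conflicts[tag] = groups
    return conflicts
```

A tag that appears here is not proof of tampering; it is a pointer to where the Document Info dictionary, XMP, or embedded items disagree.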
Outliers and Quirks
Quirk: Silent Partial Success
ExifTool may:
- Exit with code 0
- Still emit warnings on stderr
Mitigation: Capture stderr and summarize warnings in a manifest.
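A manifest-friendly warning summary can be as simple as counting stderr lines by prefix. This sketch assumes ExifTool's usual "Warning:" and "Error:" line prefixes; the function name summarize_stderr and the "other" bucket are hypothetical.

```python
from collections import Counter

def summarize_stderr(stderr_text):
    """Count stderr lines by kind so a zero exit code is never trusted alone."""
    counts = Counter()
    for line in stderr_text.splitlines():
        if line.startswith("Warning:"):
            counts["warnings"] += 1
        elif line.startswith("Error:"):
            counts["errors"] += 1
        elif line.strip():
            counts["other"] += 1  # anything unexpected, e.g. Perl diagnostics
    return dict(counts)
```

The summary complements the full stderr capture; it does not replace it.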
Outlier: XML Output Size Explosion
- -X output can be significantly larger than JSON
- Especially true for PDFs with embedded streams
Operational note: Ensure adequate disk space in /mnt/data.
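A quick pre-run check with the standard library can catch a nearly full volume. This is a minimal sketch; the function name has_free_space is hypothetical, and the right threshold depends on the target file.

```python
import shutil

def has_free_space(path, required_bytes):
    """Return True if the filesystem holding `path` has at least required_bytes free."""
    return shutil.disk_usage(path).free >= required_bytes
```

For example, has_free_space("/mnt/data", 500 * 1024 * 1024) asks for roughly 500 MiB of headroom before the XML run.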
Quirk: Encoding Variability
- Some metadata fields contain mixed encodings
- UTF‑8 replacement may be required when writing text files
Best practice: Preserve raw bytes where possible; document encoding assumptions.
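The best practice above can be sketched as writing two files: the untouched bytes and a readable UTF‑8 rendering, along with a count of substitutions. The function name save_raw_and_text is hypothetical, and UTF‑8 as the decoding guess is an assumption that should be recorded alongside the output.

```python
def save_raw_and_text(raw, raw_path, text_path, encoding="utf-8"):
    """Write raw bytes untouched, plus a decoded copy with undecodable bytes replaced.

    Returns the number of U+FFFD replacement characters in the decoded text.
    (If the source legitimately contains U+FFFD, the count overstates damage.)
    """
    with open(raw_path, "wb") as f:
        f.write(raw)
    text = raw.decode(encoding, errors="replace")
    with open(text_path, "w", encoding="utf-8") as f:
        f.write(text)
    return text.count("\ufffd")
```

A non-zero return value is worth noting in the manifest as an encoding irregularity.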
Failure Handling Strategy
Validation Failures
| Condition | Action |
|---|---|
| Missing Perl | Fail closed; recommend EXE build on Windows |
| Missing lib/ | Fail closed; document layout error |
| Unreadable target | Fail closed; no outputs invented |
Execution Failures
- Always write a manifest
- Never delete partial outputs
- Never suppress stderr
Manifest as a First‑Class Artifact
A run manifest should record:
- Tool version
- Absolute paths
- Exact argv for each invocation
- Exit codes
- Output file sizes and hashes
- Warning/error summaries
This enables auditability, reproducibility, and legal defensibility.
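Auditability implies the manifest can be checked later. The sketch below re-hashes outputs and compares them with recorded values; it assumes a manifest with an "outputs" mapping of file name to {"bytes", "sha256"}, and the function name verify_outputs is hypothetical.

```python
import hashlib
import os

def verify_outputs(manifest, out_dir):
    """Return names of outputs that are missing or no longer match the manifest."""
    mismatched = []
    for name, meta in manifest.get("outputs", {}).items():
        path = os.path.join(out_dir, name)
        if not os.path.isfile(path):
            mismatched.append(name)
            continue
        # Reads the whole file at once; chunked hashing suits very large outputs better.
        with open(path, "rb") as f:
            data = f.read()
        digest = hashlib.sha256(data).hexdigest()
        if len(data) != meta["bytes"] or digest != meta["sha256"]:
            mismatched.append(name)
    return mismatched
```

An empty list means every recorded output is still present and unchanged.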
Security & Compliance Considerations
- No system modification
- No network access required
- Deterministic execution
- Ideal for:
  - Digital forensics
  - E‑discovery
  - Regulated environments
  - Air‑gapped analysis
Authoritative References
- ExifTool installation & portable usage: https://exiftool.org/install.html
- ExifTool command‑line manual: https://exiftool.org/exiftool_pod.html
GPT‑5 Zero‑Shot Invocation (All‑in‑One)
Authoritative Download Link (tar.gz)
Always reference the official ExifTool distribution site:
- Image‑ExifTool Perl tarball: https://exiftool.org/Image-ExifTool-13.32.tar.gz
(Replace version if needed; always prefer the official exiftool.org domain.)
Zero‑Shot Prompt (Canonical)
You are operating in a GPT‑5 sandbox with write access to /mnt/data only.

GOAL: Run ExifTool with full forensic coverage WITHOUT installing anything.

INPUTS:
- TAR_GZ_PATH: /mnt/data/Image-ExifTool-13.32.tar.gz
- TARGET_FILE: /mnt/data/<TARGET_FILE>
- WORK_DIR: /mnt/data/exiftool_runtime

CONSTRAINTS:
- No CPAN, pip, apt, brew, or system installs
- Do not rely on PATH or shebangs
- Use Perl with absolute paths only
- Preserve stdout and stderr verbatim
- Fail closed if validation fails

REQUIRED STEPS:
1. Extract the tar.gz into WORK_DIR (do not run Makefile.PL)
2. Locate the exiftool launcher and verify sibling lib/Image/ExifTool.pm
3. Execute ExifTool only via: perl /absolute/path/to/exiftool [flags] TARGET_FILE
4. Run the following commands:
   - perl exiftool -a -u -ee -g1 TARGET_FILE
   - perl exiftool -a -u -ee -g TARGET_FILE
   - perl exiftool -a -u -ee -json TARGET_FILE
   - perl exiftool -a -u -ee -X TARGET_FILE
   - perl exiftool -a -u -ee -struct -json TARGET_FILE
5. Write outputs to /mnt/data:
   - exiftool_full_human.txt
   - exiftool_full_groups.txt
   - exiftool_full_json.json
   - exiftool_full_xml.xml
   - exiftool_full_struct.json
6. Write a run manifest recording:
   - status
   - exact argv
   - exit codes
   - file sizes and SHA‑256 hashes

SUCCESS CRITERIA:
- No installation performed
- All outputs non‑empty
- Manifest written
GPT‑5 Zero‑Shot Invocation (Variant)
Authoritative Download Link (tar.gz)
- Image‑ExifTool Perl tarball: https://exiftool.org/Image-ExifTool-13.32.tar.gz
(Replace version if needed; always prefer exiftool.org.)
Zero‑Shot Prompt (Variant with OUT_DIR)
You are operating in a GPT‑5 sandbox with write access to /mnt/data only.

GOAL: Run ExifTool with full forensic coverage WITHOUT installing anything.

INPUTS:
- TAR_GZ_PATH: /mnt/data/Image-ExifTool-13.32.tar.gz
- TARGET_FILE: /mnt/data/<TARGET_FILE>
- WORK_DIR: /mnt/data/exiftool_runtime
- OUT_DIR: /mnt/data/exiftool_outputs

CONSTRAINTS:
- No CPAN, pip, apt, brew, or system installs
- Do not rely on PATH or shebangs
- Use Perl with absolute paths only
- Preserve stdout and stderr verbatim
- Fail closed if validation fails

REQUIRED STEPS:
1) Extract TAR_GZ_PATH into WORK_DIR (do not run Makefile.PL)
2) Locate the exiftool launcher and verify sibling lib/Image/ExifTool.pm
3) Execute ExifTool only via: perl /absolute/path/to/exiftool [flags] TARGET_FILE
4) Run the mandatory matrix:
   - perl exiftool -a -u -ee -g1 TARGET_FILE
   - perl exiftool -a -u -ee -g TARGET_FILE
   - perl exiftool -a -u -ee -json TARGET_FILE
   - perl exiftool -a -u -ee -X TARGET_FILE
   - perl exiftool -a -u -ee -struct -json TARGET_FILE
5) Write outputs:
   - /mnt/data/exiftool_outputs/exiftool_full_human.txt
   - /mnt/data/exiftool_outputs/exiftool_full_groups.txt
   - /mnt/data/exiftool_outputs/exiftool_full_json.json
   - /mnt/data/exiftool_outputs/exiftool_full_xml.xml
   - /mnt/data/exiftool_outputs/exiftool_full_struct.json
6) Write /mnt/data/exiftool_run_manifest.json containing:
   - status
   - exact argv arrays
   - exit codes
   - output file sizes + SHA‑256
   - stderr summary (first/last 2KB) while keeping full stderr in output files

SUCCESS CRITERIA:
- No installation performed
- All outputs exist and are non‑empty
- At least one output includes ExifTool Version Number
- Manifest written and hashes present
Complete ExifTool Command Library, Recipes, and Advanced Techniques
This section is a practical, near‑exhaustive field reference.
Core Invocation Pattern (Never Deviate)
perl /absolute/path/to/exiftool [GLOBAL_FLAGS] [TAG_OPS] [OUTPUT_OPS] TARGET
Invariant rules: Always use absolute paths. Always prefer -a -u -ee unless you explicitly want reduced coverage.
Universal Baseline Commands (All File Types)
# Maximum forensic read (human)
perl exiftool -a -u -ee -g1 TARGET

# Maximum forensic read (machine)
perl exiftool -a -u -ee -json TARGET

# Structured machine output (nested)
perl exiftool -a -u -ee -struct -json TARGET

# Show everything including source groups
perl exiftool -a -u -ee -G -g TARGET
Image‑Specific Recipes (JPEG / PNG / TIFF / RAW)
# Camera + lens fingerprinting
perl exiftool -a -u -ee -EXIF:Make -EXIF:Model -EXIF:LensModel -EXIF:SerialNumber -MakerNotes:all IMAGE

# GPS deep dive (numeric + human)
perl exiftool -a -u -ee -gps:all -n IMAGE
perl exiftool -a -u -ee -gps:all IMAGE

# MakerNotes extraction
perl exiftool -a -u -ee -makernotes:all IMAGE

# Thumbnail and preview extraction
perl exiftool -a -u -ee -b -ThumbnailImage IMAGE > thumb.jpg
perl exiftool -a -u -ee -b -PreviewImage IMAGE > preview.jpg

# Detect editing software chains
perl exiftool -a -u -ee -Software -CreatorTool -History IMAGE
PDF‑Specific Recipes (High‑Value)
# Full PDF metadata + embedded content
perl exiftool -a -u -ee -g1 PDF

# Only timestamps (fast anomaly triage)
perl exiftool -a -u -ee -time:all -pdf:CreateDate -pdf:ModDate PDF

# Producer / creator chain
perl exiftool -a -u -ee -Producer -Creator -CreatorTool -History PDF

# Extract raw XMP packet
perl exiftool -a -u -ee -b -XMP PDF > xmp.bin

# Detect incremental updates
perl exiftool -a -u -ee -IncrementalUpdate -Linearized PDF

# Embedded files / attachments
perl exiftool -a -u -ee -embedded:all PDF
Writing, Copying, and Controlled Mutation (Advanced)
Forensics note: writing modifies evidence. Only do this on working copies.
# Copy all metadata from one file to another
perl exiftool -TagsFromFile SRC -all:all DST

# Remove all metadata (sanitization test)
perl exiftool -all= TARGET

# Remove only GPS
perl exiftool -gps:all= TARGET

# Shift all timestamps (timezone correction)
perl exiftool -AllDates+=01:00 TARGET
Obscure & Lesser‑Known Advanced Techniques (Hidden Gems)
# Show tag locations (where each value came from)
perl exiftool -a -u -ee -G -s -all TARGET

# Extract unknown binary blocks verbatim
perl exiftool -a -u -ee -b -Unknown TARGET > unknown.bin

# Disable print conversion (raw values)
perl exiftool -a -u -ee -n TARGET

# Large‑file hardening
perl exiftool -api LargeFileSupport=1 -a -u -ee TARGET

# Force UTF‑8 output
perl exiftool -charset utf8 -a -u -ee TARGET

# Diff two files (metadata only): dump both, then compare f1.json and f2.json with any diff tool
perl exiftool -a -u -ee -json file1 > f1.json
perl exiftool -a -u -ee -json file2 > f2.json

# Argument files (underrated power feature)
perl exiftool -@ args.txt TARGET
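An argument file for -@ holds one argument per line, with no shell quoting. A minimal sketch for generating one follows; the function name write_argfile is hypothetical, and it does not guard against arguments that begin with "#", which ExifTool would read as comments.

```python
def write_argfile(path, args):
    """Write ExifTool arguments to a file suitable for `-@ path`, one per line."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for arg in args:
            if "\n" in arg or "\r" in arg:
                # A line break would split one argument into two.
                raise ValueError("argument contains a line break: %r" % arg)
            f.write(arg + "\n")
```

Writing the mandatory flags once, for example write_argfile("/mnt/data/args.txt", ["-a", "-u", "-ee", "-json"]), keeps every run's options identical and easy to record in the manifest.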
Recursive & Bulk Operations
# Recursive read
perl exiftool -a -u -ee -r DIR

# Recursive with CSV output
perl exiftool -a -u -ee -csv -r DIR > out.csv

# Several commands in one process: arguments after -common_args apply to every -execute'd command
perl exiftool -a -u -ee -json -execute -a -u -ee -X -common_args -r DIR
Cheat Sheet (Quick Reference)
| Goal | Command |
|---|---|
| Everything | -a -u -ee -g1 |
| JSON ingest | -a -u -ee -json |
| XML | -a -u -ee -X |
| Duplicates | -a |
| Unknown tags | -u |
| Embedded | -ee |
| Raw numbers | -n |
| Large files | -api LargeFileSupport=1 |
Common Mistakes to Avoid (Hard‑Learned)
- Using -fast in forensic work
- Suppressing warnings with -m
- Running without -a
- Trusting exit code alone
- Forgetting embedded data (-ee)
Mental Model (How to Think About ExifTool)
- ExifTool is not:
  - a simple EXIF viewer
  - a single metadata parser
- ExifTool is:
  - a multi‑standard metadata correlator
  - a historical metadata extractor
  - a provenance discovery engine
If you can explain why a given flag is present, you are using ExifTool correctly. If you cannot, you are probably missing metadata.
Advanced Additions: Deep Dives, Automation, and Field Tools
Format‑Specific Deep Dives
MP4 / MOV (QuickTime) Atoms
Modern video files are container hierarchies, not flat metadata.
High‑value atoms ExifTool exposes:
- moov – top‑level movie container (holds mvhd, trak, and typically udta)
- mvhd – movie header with global creation/modification times (often rewritten by editors)
- trak/tkhd – per‑track timing (audio vs video mismatches matter)
- udta – user data (often contains software fingerprints)
- ©xyz atoms – vendor/private tags
Core commands:
perl exiftool -a -u -ee -G -QuickTime:all video.mp4
perl exiftool -a -u -ee -time:all -QuickTime:CreateDate -QuickTime:ModifyDate video.mov
Forensic insights:
- Creation times may differ per track
- Audio often survives re‑encoding while video does not
- Some editors update mvhd but not tkhd
Signal: mismatched track times indicate recomposition or partial edits.
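The track-time signal can be checked mechanically on JSON output. The sketch below assumes a record from a run with -a -u -ee -G1 -json in which per-track tags appear under keys such as "Track1:TrackCreateDate"; both that key shape and the function name track_time_mismatches are assumptions to verify against real output.

```python
def track_time_mismatches(record, field="TrackCreateDate"):
    """Return {track_group: value} when tracks disagree on `field`, else {}."""
    per_track = {}
    for key, value in record.items():
        group, sep, tag = key.partition(":")
        if sep and group.startswith("Track") and tag == field:
            per_track[group] = value
    # str() keeps the comparison safe if a value is not a plain string.
    distinct = {str(v) for v in per_track.values()}
    return per_track if len(distinct) > 1 else {}
```

An empty result means the tracks agree on that field or the field is absent; it does not rule out editing.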
Office Documents (DOCX / XLSX / PPTX)
Office formats are ZIP containers with XML internals.
Metadata layers:
- Core Properties (docProps/core.xml)
- Extended Properties (docProps/app.xml)
- Custom Properties (docProps/custom.xml)
- Embedded media (images with their own EXIF)
Core commands:
perl exiftool -a -u -ee -g1 document.docx
perl exiftool -a -u -ee -XMP:all -Office:all spreadsheet.xlsx
High‑yield patterns:
- LastModifiedBy differs from Creator
- Revision count resets after copy/save‑as
- Embedded images retain camera metadata
Hidden gem: embedded images often reveal more provenance than the document itself.
HEIC / AVIF Quirks
These modern image formats use HEIF containers, which are built on the ISO Base Media File Format (ISOBMFF).
What’s unusual:
- Multiple images (bursts, depth maps)
- Auxiliary images (thumbnails, alpha, HDR gain maps)
- Metadata duplicated across items
Core commands:
perl exiftool -a -u -ee -G -ItemList:all image.heic
perl exiftool -a -u -ee -EXIF:all -XMP:all image.avif
Forensic notes:
- Primary image may not be item 1
- Some tools edit only the display image, not auxiliaries
- GPS may exist in one item but not others
Automated Anomaly Scoring (JSON → Severity)
This script parses /mnt/data/exiftool_outputs/exiftool_full_struct.json and produces anomaly counts,
severity scores, and a bar chart for reporting.
Scoring heuristics (example):
| Signal | Weight |
|---|---|
| Timestamp conflicts | 3 |
| Toolchain mismatch | 2 |
| Duplicate critical tags | 2 |
| Encoding irregularities | 1 |
import json, collections
import matplotlib
matplotlib.use("Agg")  # headless backend: the sandbox has no display
import matplotlib.pyplot as plt

with open("/mnt/data/exiftool_outputs/exiftool_full_struct.json", encoding="utf-8") as f:
    data = json.load(f)[0]  # ExifTool emits a JSON array with one object per file

scores = collections.Counter()

# Timestamp anomalies: a date tag carrying more than one value
for k in data:
    if "Date" in k and isinstance(data[k], list) and len(data[k]) > 1:
        scores["timestamp_conflict"] += 3

# Toolchain mismatch: Producer and CreatorTool both present but different
if data.get("Producer") and data.get("CreatorTool"):
    if data["Producer"] != data["CreatorTool"]:
        scores["toolchain_mismatch"] += 2

# Duplicate tags: any tag whose value is a multi-element list
# (list-type tags such as keywords also match, so treat this as a coarse signal)
for k, v in data.items():
    if isinstance(v, list) and len(v) > 1:
        scores["duplicate_tags"] += 2

# Plot
plt.figure()
plt.bar(list(scores.keys()), list(scores.values()))
plt.title("Metadata Anomaly Severity")
plt.ylabel("Score")
plt.xticks(rotation=30)
plt.tight_layout()
plt.savefig("/mnt/data/metadata_anomaly_scores.png")
This produces a report‑ready graphic in /mnt/data.
Redaction Detection Playbook (PDF‑Focused)
ExifTool cannot prove redaction alone, but it can surface red flags.
Signals to correlate:
- Multiple ModDate values
- XMP history events without visible edits
- Producer switches (e.g., Word → PDF → Acrobat)
- Linearized PDFs with late modifications
- Embedded images with earlier timesta…
Reference Implementation (Python Runner, Zero‑Install, Deterministic)
Use this when you want a clean, reusable runner that:
- extracts the tarball
- validates the sibling lib/ invariant
- runs the mandatory matrix
- writes a manifest with hashes
import os, tarfile, subprocess, json, hashlib, time

TAR_GZ_PATH = "/mnt/data/Image-ExifTool-13.32.tar.gz"
TARGET_FILE = "/mnt/data/yourfile.pdf"
WORK_DIR = "/mnt/data/exiftool_runtime"
OUT_DIR = "/mnt/data/exiftool_outputs"
MANIFEST = "/mnt/data/exiftool_run_manifest.json"

os.makedirs(WORK_DIR, exist_ok=True)
os.makedirs(OUT_DIR, exist_ok=True)

def sha256(p):
    """Hex SHA-256 of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(p, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()

def utc_now():
    return time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())

manifest = {
    "status": "UNKNOWN",
    "start_time": utc_now(),
    "tar_gz": TAR_GZ_PATH,
    "target": TARGET_FILE,
    "exiftool": None,
    "runs": [],
    "outputs": {},
}

def write_manifest(status):
    """Always leave a manifest behind, including on validation failure."""
    manifest["status"] = status
    manifest["end_time"] = utc_now()
    with open(MANIFEST, "w") as f:
        json.dump(manifest, f, indent=2)

# 1) Extract (no install); only do this with the official, trusted tarball
with tarfile.open(TAR_GZ_PATH, "r:gz") as tf:
    tf.extractall(WORK_DIR)

# 2) Locate launcher + validate sibling lib/ and the target file
exiftool = None
for root, dirs, files in os.walk(WORK_DIR):
    if "exiftool" in files and "lib" in dirs:
        pm = os.path.join(root, "lib", "Image", "ExifTool.pm")
        if os.path.exists(pm):
            exiftool = os.path.join(root, "exiftool")
            break
if not exiftool or not os.path.isfile(TARGET_FILE):
    write_manifest("FAILED_VALIDATION")
    raise RuntimeError("FAILED_VALIDATION: exiftool, bundled lib, or target not found")
manifest["exiftool"] = exiftool

# 3) Mandatory matrix
runs = [
    (["perl", exiftool, "-a", "-u", "-ee", "-g1", TARGET_FILE], "exiftool_full_human.txt"),
    (["perl", exiftool, "-a", "-u", "-ee", "-g", TARGET_FILE], "exiftool_full_groups.txt"),
    (["perl", exiftool, "-a", "-u", "-ee", "-json", TARGET_FILE], "exiftool_full_json.json"),
    (["perl", exiftool, "-a", "-u", "-ee", "-X", TARGET_FILE], "exiftool_full_xml.xml"),
    (["perl", exiftool, "-a", "-u", "-ee", "-struct", "-json", TARGET_FILE], "exiftool_full_struct.json"),
]

ok = True
for argv, name in runs:
    out_path = os.path.join(OUT_DIR, name)
    err_path = out_path + ".stderr.txt"
    # Capture bytes (no text decoding) so both streams are stored verbatim;
    # stderr goes to its own file so JSON and XML outputs stay parseable.
    proc = subprocess.run(argv, capture_output=True)
    with open(out_path, "wb") as f:
        f.write(proc.stdout)
    with open(err_path, "wb") as f:
        f.write(proc.stderr)
    manifest["runs"].append({"argv": argv, "exit": proc.returncode,
                             "out": out_path, "stderr": err_path})
    if proc.returncode != 0 or os.path.getsize(out_path) == 0:
        ok = False

# 4) Hash every file in OUT_DIR, including partial outputs and stderr captures
for fn in sorted(os.listdir(OUT_DIR)):
    path = os.path.join(OUT_DIR, fn)
    manifest["outputs"][fn] = {"bytes": os.path.getsize(path), "sha256": sha256(path)}

write_manifest("SUCCESS" if ok else "FAILED_EXECUTION")
Visuals You Can Reuse (Docs, Training, Slides)
1) Portable Execution Flow (ASCII diagram)
┌─────────────────────────────┐
│ Image-ExifTool-13.xx.tar.gz │
└──────────────┬──────────────┘
│ extract
v
┌─────────────────────────────────────────────────┐
│ /mnt/data/exiftool_runtime/Image-ExifTool-13.xx │
│ ├─ exiftool (launcher) │
│ └─ lib/ (bundled Perl modules) │
└──────────────┬──────────────────────────────────┘
│ perl /abs/exiftool -a -u -ee ...
v
┌─────────────────────────────────────────────────┐
│ /mnt/data/exiftool_outputs/ │
│ ├─ human.txt ├─ groups.txt ├─ json.json │
│ ├─ xml.xml └─ struct.json │
└──────────────┬──────────────────────────────────┘
v
/mnt/data/exiftool_run_manifest.json
2) “Where PDF metadata comes from” (stack view)
PDF File
├─ Document Info Dictionary (legacy)
├─ XMP Packet (rich, can be stale)
├─ Incremental Updates (prior revisions persist)
└─ Embedded Objects (attachments/images/streams)
3) Simple anomaly scoring chart (template)
You can graph counts of warning types or timestamp conflicts (example axes):
X‑axis: anomaly class (timestamp, toolchain mismatch, duplicates, encoding).
Y‑axis: count or severity score.
Recommended source: parse exiftool_full_struct.json and compute counts.