A designer sends you a PDF. It's 50 megabytes. The body of the document is 12 pages of text and a few hero photos. Email rejects it, Slack truncates the preview, your client opens it on hotel wifi and gives up. You run it through an online compressor and it comes out 47 megabytes. You try a different compressor and it's 49. You start to suspect the tool is broken.
The tool is fine. The problem is that a PDF isn't really one file — it's a container holding fonts, images, and vector drawing instructions, and each of those layers has its own compression ceiling. To shrink a PDF you have to know which layer is fat, and most PDFs are fat for one of three reasons.
Reason one: somebody scanned it
If your PDF started life on a scanner, every page is a photograph of paper. A 12-page colour scan at 300 DPI is roughly 100 megapixels of image data, JPEG-compressed down to maybe 30-50 megabytes. The text you see in the document is pixels, not text. You can't even copy a sentence out of it without OCR.
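The pixel count is easy to check by hand. Here is a minimal sketch of the arithmetic, assuming US Letter pages (8.5 x 11 inches) and 8-bit RGB; the function names are made up for illustration. It lands in the same ballpark as the figure above, and shows how much work JPEG is already doing.

```python
def scan_megapixels(pages: int, dpi: int,
                    width_in: float = 8.5, height_in: float = 11.0) -> float:
    """Total pixel count of a scan, in millions of pixels.

    Page size defaults to US Letter, which is an assumption;
    A4 gives a slightly larger number.
    """
    pixels_per_page = round(width_in * dpi) * round(height_in * dpi)
    return pages * pixels_per_page / 1_000_000


def raw_rgb_megabytes(megapixels: float) -> float:
    """Uncompressed size at 3 bytes per pixel (8-bit RGB), in megabytes."""
    return megapixels * 3


at_300 = scan_megapixels(12, 300)   # about 101 megapixels
at_150 = scan_megapixels(12, 150)   # about 25 megapixels, a quarter of the above
print(f"300 DPI: {at_300:.0f} MP, {raw_rgb_megabytes(at_300):.0f} MB uncompressed")
print(f"150 DPI: {at_150:.0f} MP, {raw_rgb_megabytes(at_150):.0f} MB uncompressed")
```

Halving the DPI quarters the pixel count, which is why scans respond so well to downsampling.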
These PDFs are the dramatic case for compression. Downsampling the page images to 150 DPI and re-encoding them as JPEG at quality 75, which is perfectly readable on any screen, typically cuts the file size by 80-90%. A 50 MB scan becomes a 5 MB scan and looks identical at normal zoom levels. The downsampled file is what you should be sending. The 300 DPI version is for printing, and nobody is printing your contract.
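If you would rather do this locally, one option is Ghostscript. That is an assumption on my part, not the tool this article is built around. The sketch below builds a `gs` command that rewrites the PDF with colour and grey images downsampled to 150 DPI. The `/ebook` preset picks its own JPEG settings, so this does not pin quality to exactly 75. The function names and file paths are hypothetical.

```python
import subprocess
from pathlib import Path


def ghostscript_downsample_cmd(src: str, dst: str, dpi: int = 150) -> list[str]:
    """Build a Ghostscript command that rewrites src with downsampled images."""
    return [
        "gs",
        "-sDEVICE=pdfwrite",             # write a new PDF
        "-dPDFSETTINGS=/ebook",          # screen-oriented preset
        "-dDownsampleColorImages=true",
        f"-dColorImageResolution={dpi}",
        "-dDownsampleGrayImages=true",
        f"-dGrayImageResolution={dpi}",
        "-dNOPAUSE", "-dBATCH", "-dQUIET",
        f"-sOutputFile={dst}",
        src,
    ]


def downsample_pdf(src: str, dst: str, dpi: int = 150) -> float:
    """Run Ghostscript (must be installed as `gs`) and return the fraction saved."""
    subprocess.run(ghostscript_downsample_cmd(src, dst, dpi), check=True)
    before = Path(src).stat().st_size
    after = Path(dst).stat().st_size
    return 1 - after / before
```

Calling `downsample_pdf("scan.pdf", "scan-150dpi.pdf")` on a real scan should return something in the range discussed above; on a text-only PDF, expect far less.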
Reason two: somebody embedded the entire font library
Most PDFs embed the fonts they use, so they render identically on any device. Fonts come in two embed modes: full embed (the entire font file, often 200-500 KB per typeface) or subset (only the glyphs actually used, usually 20-50 KB).
Older versions of Word, InDesign with the wrong export preset, and a surprising number of corporate PDF generators full-embed by default. A marketing brochure that uses 12 different typefaces (three weights of the brand font, two display faces, a mono for code samples, a serif for pull quotes, a couple of icon fonts) can carry roughly 2-6 MB of font data before the first pixel of content lands on a page.
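The font budget is plain multiplication. A small sketch, using the typical per-typeface sizes quoted above (the names are illustrative): at those sizes, 12 fully embedded faces come to a few megabytes, and faces heavier than the typical range push the total higher.

```python
def font_payload_mb(typefaces: int,
                    kb_per_face: tuple[float, float]) -> tuple[float, float]:
    """Low and high estimate of total font data in MB (1 MB = 1000 KB)."""
    low_kb, high_kb = kb_per_face
    return typefaces * low_kb / 1000, typefaces * high_kb / 1000


FULL_EMBED_KB = (200, 500)   # whole font file per typeface
SUBSET_KB = (20, 50)         # only the glyphs actually used

full = font_payload_mb(12, FULL_EMBED_KB)    # (2.4, 6.0)
subset = font_payload_mb(12, SUBSET_KB)      # (0.24, 0.6)
print(f"full embed: {full[0]:.1f}-{full[1]:.1f} MB")
print(f"subset:     {subset[0]:.2f}-{subset[1]:.2f} MB")
```

Subsetting cuts the font payload by about a factor of ten, whatever the typeface count.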
Most modern PDF compressors (including the one on this site) automatically subset embedded fonts. A Word-exported PDF that shrinks 30-40% with no visible quality change is almost always a font-subsetting win, not an image win. There is also a design-time fix: stop using 12 typefaces. Two or three is plenty and cheaper everywhere.
Reason three: the images are unnecessarily huge
An InDesign user places a 24-megapixel hero photo on the cover. The print designer doesn't think about it — that's the file the photographer delivered, and InDesign keeps the full resolution in case the document gets printed at billboard size. PDF export inherits that full-size embed. Now the cover image alone is 8 megabytes inside a document that will only ever be viewed on a laptop.
The fix is downsampling at export, but if you've already received the bloated PDF you can do it after the fact. There are two paths. You can extract the images, resize them, and rebuild the PDF, which is overkill for most cases. Or you can run a compressor that downsamples each image automatically.
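How far should an image be downsampled? A rough sketch of the calculation, assuming you know the width the image occupies on the page and are targeting 150 DPI for screens. The function name and the 6000 x 4000 example dimensions are illustrative, not taken from any particular tool.

```python
def downsample_target(width_px: int, height_px: int,
                      placed_width_in: float,
                      target_dpi: int = 150) -> tuple[int, int]:
    """Pixel dimensions an image needs at target_dpi for its placed width.

    Never upsamples: an image already at or below the target is returned as-is.
    """
    effective_dpi = width_px / placed_width_in
    if effective_dpi <= target_dpi:
        return width_px, height_px
    scale = target_dpi / effective_dpi
    return round(width_px * scale), round(height_px * scale)


# A 24-megapixel photo (6000 x 4000) placed across an 8.5-inch page width
print(downsample_target(6000, 4000, placed_width_in=8.5))   # (1275, 850)
```

That is about 1.1 megapixels instead of 24, roughly a 22-fold drop in pixel count before JPEG quality is touched.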
When the PDF won't shrink — and that's the right answer
There's a fourth case worth naming. A 5 MB text-only PDF — say, a contract from a modern Word installation, or a LaTeX-generated thesis — won't compress more than 5-15% no matter what you throw at it. The fonts are already subsetted. The images don't exist. The content streams are already Flate-compressed. The file is at its floor.
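You can see the floor for yourself with Python's `zlib` module, which implements the same DEFLATE algorithm that PDF's Flate filter uses. The sketch below compresses some synthetic text once, then tries to compress the result again. The sample text is a made-up stand-in for a page content stream, not real PDF data.

```python
import random
import zlib


def flate_savings(data: bytes) -> float:
    """Fraction of bytes removed by one pass of DEFLATE at maximum level."""
    return 1 - len(zlib.compress(data, 9)) / len(data)


def make_sample_text(n_words: int = 20_000, seed: int = 0) -> bytes:
    """Hypothetical contract-like text: words drawn from a small vocabulary."""
    vocab = ("the party agrees shall hereby notice term clause payment "
             "liability notwithstanding pursuant section contract client "
             "supplier within days written consent terminate breach remedy "
             "governing law").split()
    rng = random.Random(seed)
    return " ".join(rng.choice(vocab) for _ in range(n_words)).encode("ascii")


raw = make_sample_text()
once = zlib.compress(raw, 9)

first_pass = flate_savings(raw)     # large: plain text compresses well
second_pass = flate_savings(once)   # near zero: nothing left to squeeze
print(f"first pass saves  {first_pass:.0%}")
print(f"second pass saves {second_pass:.0%}")
```

The second pass has almost nothing to work with, which is the position a compressor is in when handed an already-optimised text PDF.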
Any tool that promises to shrink such a file by 80% is either lying about the result or silently downsampling embedded images you didn't realise were there (the company logo on every footer, an embedded chart that's secretly a JPEG, page background watermarks). A small saving on a small text PDF means the tool is being honest. Adobe Acrobat won't do better.
The decision tree
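- Open the PDF and try to select a sentence. If only a rectangle highlights, it's a scan, and aggressive compression will shrink it dramatically. If text selects normally, it's probably digital, and expectations should be lower. One caveat: a scan that has been through OCR also lets you select text, so zoom in, and if the letters turn grainy, treat it as a scan.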
- If it's digital and over 10 MB, the bytes are probably in embedded images. Run it through a compressor with image downsampling, or extract the images to confirm.
- If it's digital, under 5 MB, and your compressor returned 'no significant savings' — that's the right answer. Send it as-is.
- If it's a scan, downsample to 150 DPI for screen-only use and 300 DPI only if it really will be printed. Most won't be.
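The same checklist, written as a function. This is a simplified sketch: the function name and the wording of the recommendations are mine, and the branch for digital files between 5 and 10 MB is an assumption, since the list above doesn't cover that range.

```python
def recommend(text_selectable: bool, size_mb: float,
              for_print: bool = False) -> str:
    """Rough compression advice based on the checklist above."""
    if not text_selectable:
        # Scan: pages are images, so downsampling pays off.
        dpi = 300 if for_print else 150
        return f"scan: downsample to {dpi} DPI"
    if size_mb > 10:
        # Digital and large: the bytes are probably in embedded images.
        return "digital: run a compressor with image downsampling"
    if size_mb < 5:
        # Digital and small: likely already at its floor.
        return "digital: expect little or no savings; send as-is"
    # Assumption: the checklist doesn't cover 5-10 MB, so hedge.
    return "digital: try a compressor, expect modest savings"


print(recommend(text_selectable=False, size_mb=50))
print(recommend(text_selectable=True, size_mb=3))
```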
PDF compression is one of those problems where understanding the file format is most of the job. The tool itself is mechanical: pick a level, click compress, wait. The skill is knowing what's reasonable to expect, and not blaming the tool when the file was already as small as it can be.
