In the news today: the Vault 7 series of leaks includes information about “Scribbles”, a watermark beacon inserted into CIA docs so as to track leakers.
Well, not to rain on that parade, but frankly, anyone who gets leaked docs from ANY agency (and most companies) ought to be bright enough to know they are likely to be watermarked (that is, each copy carries a hidden, unique pattern of bits in the binary part of the file, marking that copy and who checked it out) and to take steps to block that.
This isn’t all that hard.
On a sterile machine (new install, no network connection) you ‘display’ each document and copy out what it shows. Note this is a copy of the TEXT, not the BINARY, so anything hidden in the binary structure of the file gets left behind. You can even go so far as to print it out and OCR (Optical Character Recognition) it back into a pristine binary. When done, scrub the machine. Personally, I’d go for print to paper and OCR into a separate sterile machine, but I’m like that ;-)
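To make the “TEXT not the BINARY” idea a bit more concrete, here is a minimal Python sketch of what a plain text copy might look like once the text has been pulled out of the document. The function name plain_text_copy is my own invention for illustration, and the sketch assumes you already have the text as a string. It only deals with invisible characters and odd spacing; it does nothing about marks carried in the wording itself, and it will not catch look-alike letters borrowed from other alphabets.

```python
import re
import unicodedata


def plain_text_copy(text: str) -> str:
    """Return only the visible text, with invisible characters removed.

    Hypothetical helper for illustration; not a complete watermark remover.
    """
    # NFKC folds compatibility characters (non-breaking space, thin space,
    # ligatures, full-width letters) into their plain equivalents.
    text = unicodedata.normalize("NFKC", text)

    kept = []
    for ch in text:
        category = unicodedata.category(ch)
        # "Cf" covers format characters such as zero-width spaces and joiners.
        if category == "Cf":
            continue
        # "Cc" covers control characters; keep only newline and tab.
        if category == "Cc" and ch not in "\n\t":
            continue
        kept.append(ch)
    text = "".join(kept)

    # Collapse runs of spaces and tabs, and trim each line, so spacing
    # patterns do not survive the copy.
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.split("\n")]
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "Top\u200b secret\u00a0 memo \r\nline\u2009two"
    print(repr(plain_text_copy(sample)))
```

The print-and-OCR route gets you much the same result by brute force, since paper cannot carry a zero-width character.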
Now there are issues even with that. For example, you can put specific changes of text and/or spelling into a doc to mark it, and those survive a text copy. Countermeasures to that are a bit more complicated, but a good spell check is a start. Similarly, make sure to change things like font and margins so the text reflows and repaginates. If lives depend on it, run it through a translator to another language and back, then proofread for translation errors.
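The reflow step is the easy one to show. Below is a rough Python sketch, assuming the text is already a plain string with blank lines between paragraphs; the function name reflow and the default width of 72 are just my choices for the example. It rejoins each paragraph and wraps it to a new width, so the line breaks no longer match the original layout. It does not touch spelling or word choice, so the spell check and proofreading still have to be done separately.

```python
import re
import textwrap


def reflow(text: str, width: int = 72) -> str:
    """Re-wrap each paragraph to a new line width.

    Hypothetical helper for illustration; paragraphs are assumed to be
    separated by one or more blank lines.
    """
    paragraphs = re.split(r"\n\s*\n", text)
    wrapped = []
    for paragraph in paragraphs:
        # Join the words with single spaces, discarding the old line breaks.
        words = " ".join(paragraph.split())
        if words:
            wrapped.append(textwrap.fill(words, width=width))
    return "\n\n".join(wrapped)


if __name__ == "__main__":
    original = "one two\nthree\n\n\nfour"
    print(reflow(original, width=20))
```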
I know, lots of folks still get caught just on Microsoft metadata in docs and images. But if you are in the game of taking on TLAs (Three Letter Agencies), you really ought to be expecting watermarks and the need to remove them.