Removing Personal Information from MP3s bought on Amazon

When you buy MP3s on Amazon, it is likely that they contain a “unique purchase identifier” which can be used to link the MP3 file to your Amazon account.

Music file metadata contains unique purchase identifier.

Storage Format and Location

This special block of metadata is stored in private frames (PRIV). This makes it harder to detect and remove, as most MP3 tag editors simply ignore these frames.
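To check whether a file carries private frames without relying on a tag editor, the tag can be walked by hand. The following is only a minimal sketch, not the method used for the analysis in this post. It assumes an ID3v2.3 or ID3v2.4 tag without unsynchronisation or an extended header, and the function names are made up for illustration:

```python
def syncsafe(b):
    """Decode a 4-byte syncsafe integer (7 significant bits per byte)."""
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]


def priv_owners(data):
    """Return the owner identifier of every PRIV frame in an ID3v2 tag.

    Sketch only: handles ID3v2.3 and ID3v2.4 tags that use neither
    unsynchronisation nor an extended header.
    """
    if data[:3] != b"ID3":
        return []  # no ID3v2 tag at the start of the file
    major, flags = data[3], data[5]
    if major not in (3, 4) or flags & 0xC0:
        raise ValueError("tag layout not covered by this sketch")

    end = 10 + syncsafe(data[6:10])  # the tag size excludes the 10-byte header
    pos = 10
    owners = []
    while pos + 10 <= end and data[pos] != 0:  # a zero byte marks padding
        frame_id = data[pos:pos + 4]
        raw_size = data[pos + 4:pos + 8]
        # v2.4 stores frame sizes as syncsafe integers, v2.3 as plain big-endian
        size = syncsafe(raw_size) if major == 4 else int.from_bytes(raw_size, "big")
        body = data[pos + 10:pos + 10 + size]
        if frame_id == b"PRIV":
            # A PRIV body is a null-terminated owner identifier plus binary data
            owner = body.split(b"\x00", 1)[0]
            owners.append(owner.decode("latin-1"))
        pos += 10 + size
    return owners
```

Calling `priv_owners(open("song.mp3", "rb").read())` on a file (the file name is a placeholder) returns the list of owner identifiers found in it, or an empty list if there is no ID3v2 tag.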

The files I analyzed had the private frame right at the beginning. The frame starts with the identifier www.amazon.com, followed by some XML data (pretty-printed here):

<?xml version="1.0" encoding="UTF-8"?>
<uits:UITS xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:uits="http://www.udirector.net/schemas/2009/uits/1.1">
    <metadata>
        <nonce>XXXXXXXX</nonce>
        <Distributor>Amazon.com</Distributor>
        <Time>1970-01-01T00:00:00.000Z</Time>
        <ProductID type="UPC" completed="true">00889326362937</ProductID>
        <AssetID type="ISRC">GBK3W1000391</AssetID>
        <TID version="1">XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX</TID>
        <Media algorithm="SHA256">95571b74ab373a50dea2981b58c6712cd7e667e1664c3e7d0c06ff6547f7056f</Media>
    </metadata>
    <signature algorithm="RSA2048" canonicalization="none" keyID="dd0af29b41cd7d6d82593caf1ba9eaa6b756383f">XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX</signature>
</uits:UITS>

(In this listing, I have replaced all strings that were unique in my file with the letter X and altered the timestamp. Note that I did not apply those changes to my audio file.)
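Once the XML payload has been separated from the owner identifier, its fields can be read with any standard XML parser. Here is a small sketch using Python's standard library. It assumes the payload has the shape of the listing above; `SAMPLE` is a copy of that listing with the signature shortened, and `uits_fields` is an illustrative name:

```python
import xml.etree.ElementTree as ET

# Copy of the redacted listing above, with the signature shortened.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<uits:UITS xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:uits="http://www.udirector.net/schemas/2009/uits/1.1">
    <metadata>
        <nonce>XXXXXXXX</nonce>
        <Distributor>Amazon.com</Distributor>
        <Time>1970-01-01T00:00:00.000Z</Time>
        <ProductID type="UPC" completed="true">00889326362937</ProductID>
        <AssetID type="ISRC">GBK3W1000391</AssetID>
        <TID version="1">XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX</TID>
        <Media algorithm="SHA256">95571b74ab373a50dea2981b58c6712cd7e667e1664c3e7d0c06ff6547f7056f</Media>
    </metadata>
    <signature algorithm="RSA2048" canonicalization="none" keyID="dd0af29b41cd7d6d82593caf1ba9eaa6b756383f">XXXX</signature>
</uits:UITS>
"""


def uits_fields(xml_bytes):
    """Map each child of <metadata> to its text content."""
    root = ET.fromstring(xml_bytes)
    # Only the root element is namespaced; <metadata> and its children are not.
    metadata = root.find("metadata")
    if metadata is None:
        return {}
    return {child.tag: (child.text or "").strip() for child in metadata}


fields = uits_fields(SAMPLE)
for name, value in fields.items():
    print(f"{name}: {value}")
```

In my listing, the values I had to redact were nonce, TID and the signature, plus the timestamp I altered.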

It is worth noting that there’s no clear-text information about you in it, as would be the case if you had acquired the file from the iTunes Store. Apple’s AAC files contain your name and Apple ID in unencrypted, human-readable form. Luckily, Amazon is less privacy-invasive here, but the frame of course still contains enough information to link the MP3 back to your account.

Stripping Private Frames

There’s a Python-based command-line tool called eyeD3 that can remove any type of ID3 frame, including those evil PRIV frames. When I wrote this, eyeD3 required Python 2.6 or 2.7 (newer releases run on Python 3). It can be installed with pip:

❯ pip install eyeD3

Then you can run the following command to strip all PRIV frames and comments from each of the MP3s in the current folder:

❯ eyeD3 --remove-frame PRIV --remove-all-comments *.mp3

This leaves all the other metadata intact. Interestingly though, the --remove-all-comments option is required here. If you omit it, the MP3 will still differ from the same MP3 bought with a different account (see section Verification).

Verification

To make sure that there’s no additional information hidden somewhere else in the file (for example as a watermark embedded into the audio data), I bought the same MP3 again from a different account and compared the two files after removing the PRIV frames with eyeD3.

Et voilà:

❯ shasum Account_1/02\ -\ Cool\ Blue.mp3
a968cde43019a9e3c6747d750c4f7a8d4cba26da  Account_1/02 - Cool Blue.mp3
❯ shasum Account_2/02\ -\ Cool\ Blue.mp3
a968cde43019a9e3c6747d750c4f7a8d4cba26da  Account_2/02 - Cool Blue.mp3

Exact same file! 🎉
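The same comparison can be scripted, which is handy for a whole album. This is a small sketch using Python's hashlib. It uses SHA-1 because that is what shasum computes by default; the function names are illustrative, and the folder layout in the usage note is the one from the commands above:

```python
import hashlib


def sha1_of(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file(path_a, path_b):
    """True if both files have identical content (by SHA-1 digest)."""
    return sha1_of(path_a) == sha1_of(path_b)
```

For example, `same_file("Account_1/02 - Cool Blue.mp3", "Account_2/02 - Cool Blue.mp3")` should return True once both copies have been stripped.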

Other Uses of PRIV Frames

There are also uses of PRIV frames that are less evil. For example, Traktor by Native Instruments uses them to store further metadata such as BPM, cue points, beat grids, etc. directly within your audio files. Whether you like that is another story, of course, but one can argue that it is useful in many cases.

Further Reading