Bye, bye, Twitter

Executive summary

After being bought by the Space Man (a.k.a. Elon Musk), the bird site has become a site I no longer wish to associate with. My account, created in 2010 and by the end sporting over 9000 tweets and over 2000 followers, was still active at the time of this writing (I don’t know if it still is, as I stopped actively using that site), and I decided not to remove it myself. The problem is that shortly after an account is deactivated or deleted, its handle becomes available again and could be taken by somebody else to impersonate the previous account. So instead of deleting the account, I decided to delete all content created by me (or at least ask Twitter nicely to no longer display it, as I have few illusions about proper data cleanup processes at that company at this time).

Step 1: Download the archive

Thanks to GDPR, even Twitter still had to allow me to download content created by me. To download your archive, go to https://twitter.com/settings/download_your_data and request it. The last time I did this, it took over a day to get a link to the actual archive.

After downloading, unzip the archive to a directory of your choice.

Step 2: Fetch better versions of media files and convert to static markdown and HTML

Using twitter-archive-parser (I used commit a6a84a5113ae27aa92cf041725b306f09832be0f), run two scripts from within the unzipped archive: first parser.py, then download_better_images.py, which fetches higher-resolution versions of the embedded media files from the Twitter server (for some reason these are not included in the archive).

Newer versions of the parser.py script seem to include the download functionality already.

If the markdown files generated by twitter-archive-parser (with already sanitized URLs and proper quality media) and its simple monthly HTML pages are enough for your needs, jump to step 4.

Step 3: Convert to a more featureful static HTML archive with individual tweet permalinks

After downloading/cloning Tweetback, follow the usage instructions in that repo directory:

  • execute npm install (tested on Ubuntu 22.04 with the nodejs v16 package installed from a PPA)
  • sed 's/window.YTD.tweets.part0/module.exports/' < /data/tweets.js > ./database/tweets.js
  • npm run import
  • edit _data/metadata.js, including old username and link to the new webpage
  • execute npm run build
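To see what the sed line in the list above actually changes, here is a minimal sketch that applies the same substitution to a made-up miniature of tweets.js. The sample content and the temporary file names are assumptions for illustration only; a real archive file carries many more fields per tweet.

```shell
#!/bin/sh
# Sketch: apply the sed substitution from the list above to a tiny,
# hypothetical stand-in for the archive's tweets.js and show the result.
set -eu

workdir=$(mktemp -d)

# Made-up miniature of tweets.js (real entries have many more fields).
cat > "$workdir/tweets.js" <<'EOF'
window.YTD.tweets.part0 = [
  { "tweet" : { "id_str" : "1", "full_text" : "hello" } }
]
EOF

# Same substitution as above: replace the browser-global assignment with a
# CommonJS export, so that Node.js can load the file via require().
sed 's/window.YTD.tweets.part0/module.exports/' \
    < "$workdir/tweets.js" > "$workdir/converted.js"

# Only the first line differs; the tweet data itself is left untouched.
first_line=$(head -n 1 "$workdir/converted.js")
changed_lines=$(grep -c 'module.exports' "$workdir/converted.js")
echo "$first_line"

rm -rf "$workdir"
```

The substitution only touches the assignment on the first line, which is why a plain sed call is enough and no JSON tooling is needed for this step.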

Step 4: Delete tweets from Twitter

[NOTE: This is still on my TODO list, I have not actually executed it yet as I am still tweaking my offline archive site and might need to execute some of the above tools again.]

Using TweetDelete, just follow the process to delete up to 3200 tweets. If you have more, then applying for a Twitter developer account seems to be the way to go. Luckily, I already had one active, for an access key for the Twidere Android client. Using that developer account, I used delete-tweets to delete all my old tweets. Following its instructions is a bit involved, but it worked immediately.
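To decide which of the two routes applies, it helps to know how many tweets the archive holds. The following is a rough sketch, not an exact count: it assumes that tweets.js is pretty-printed with one line of the form "tweet" : { per entry, and the helper name count_tweets and the sample data are hypothetical. On a real archive you would pass the path to your own tweets.js instead of the sample.

```shell
#!/bin/sh
# Sketch: estimate the number of tweets in an archive file and compare it
# with the 3200-tweet limit mentioned above.
set -eu

# Print how many lines in FILE open a "tweet" object.
# Assumption: the file is pretty-printed with one such line per tweet.
count_tweets() {
    grep -c '^[[:space:]]*"tweet"[[:space:]]*:[[:space:]]*{' "$1" || true
}

limit=3200

# Made-up sample in the assumed layout, standing in for a real tweets.js.
sample=$(mktemp)
cat > "$sample" <<'EOF'
window.YTD.tweets.part0 = [
  {
    "tweet" : {
      "id_str" : "1"
    }
  },
  {
    "tweet" : {
      "id_str" : "2"
    }
  }
]
EOF

total=$(count_tweets "$sample")
rm -f "$sample"

if [ "$total" -gt "$limit" ]; then
    verdict="more than $limit tweets, so the developer account route applies"
else
    verdict="at most $limit tweets, so TweetDelete should be enough"
fi
echo "$total tweets: $verdict"
```

Treat the number as an estimate: if your archive is laid out differently, the pattern will miss entries, and the tweet count shown on your profile is a useful cross-check.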

Step 5: Preserve your exported archive for posterity

I pack the exported static site (from Tweetback and/or twitter-archive-parser) into a static nginx Docker container and make it available online as my personal Twitter archive.

René Mayrhofer
Professor of Networks and Security & Director of Engineering at Android Platform Security; pacifist, privacy fan, recovering hypocrite; generally here to question and learn