
Detecting when your blog posts get censored by Google (or any search engine)

Governments and companies keep approaching Google to "forget" certain URLs, with the result that millions of URLs are removed from the search index per month, according to Google itself (see links earlier). If you happen to blog about a risky topic, your blog posts (or any other kind of web page) may be removed from the Google search index without prior notice. So you may want to know whether (some of) your content can still be found easily. My approach would be to

  1. Generate some random checksum (e.g. a SHA1, see below)
  2. Make sure that this checksum does not yet get any hits on Google
  3. Embed the checksum in the post somewhere, maybe at the front or the very end
  4. Search for that checksum every few days
  5. If the results include your post, it is still in the search index, i.e. it has not been censored
  6. (Automate the previous two steps)
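
The steps above could be automated roughly as follows. This is only a sketch under assumptions, not a tested monitoring tool: it assumes you query Google's Custom Search JSON API (which needs an API key and a search engine ID, and whose results may not match the regular Google index exactly) and that hits come back under "items" with a "link" field each. The function names are made up for illustration.

```python
import hashlib
import json
import urllib.parse
import urllib.request
import uuid

# Assumption: Google's Custom Search JSON API endpoint. Any search API that
# returns a list of result URLs could be swapped in here.
SEARCH_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def make_marker():
    """Step 1: return a random 40-character hex string (SHA-1 of a random UUID)."""
    return hashlib.sha1(uuid.uuid4().bytes).hexdigest()


def result_links(response):
    """Extract result URLs from a parsed search response.

    Assumes hits are listed under "items", each carrying a "link" key.
    A response without "items" is treated as zero hits (useful for step 2).
    """
    return [item["link"] for item in response.get("items", []) if "link" in item]


def is_indexed(response, post_url):
    """Step 5: True if post_url is among the results for the marker search."""
    wanted = post_url.rstrip("/")
    return any(link.rstrip("/") == wanted for link in result_links(response))


def search(marker, api_key, engine_id):
    """Step 4: query the search API for the marker (needs network access)."""
    query = urllib.parse.urlencode({"key": api_key, "cx": engine_id, "q": marker})
    with urllib.request.urlopen(f"{SEARCH_ENDPOINT}?{query}", timeout=30) as resp:
        return json.load(resp)
```

Run from a cron job every few days, something like `is_indexed(search(marker, api_key, engine_id), post_url)` would cover steps 4 and 5. A `False` only tells you the post did not show up for that query, so it is worth checking by hand before concluding anything.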

On Linux I run

# cat /proc/sys/kernel/random/uuid | sha1sum
8f6a8cfc66bc3523eac19b1402568bc2ae7950ae -

to make a checksum for this very blog post. As it's already part of the post, I don't need to add it at the end once more. Neat :-) I hope this technique works for someone. Good luck.