ExtortionLetterInfo Forums
ELI Forums => Getty Images Letter Forum => Topic started by: igotletter on December 10, 2013, 04:38:24 AM
-
I'm doing all I can to remove the image (my site is removed, I believe I deleted the Google cache correctly, etc.). I've emailed the Wayback Machine. How long does it take for them to remove a site? Very stressful. Do I need to keep emailing them? Any suggestions? I'm all ears.
Also, you're all incredible. I've read several years' worth of threads at this point. I just keep shaking my head at this.
-
Add a robots.txt to your site to block the Internet Archive. I use the following snippet to block the Archive and other robots currently:
User-agent: ia_archiver
Disallow: /
User-agent: *
Disallow: /
This will get the crawlers to stop coming in the short term. The Internet Archive took about a week to respond to my email. They put up a bit of an argument, so it was another three days after that before they notified me that they had officially removed my backup.
So definitely use the robots.txt if you need to stop it fast. Then wait for the official removal.
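For anyone who wants to double-check the rules before uploading the file, here is a minimal sketch using Python's standard urllib.robotparser module. It only tests how a well-behaved crawler would read the snippet above; the helper name is_blocked is made up for illustration, and it says nothing about robots that ignore robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the snippet above.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, path: str = "/") -> bool:
    """Return True if a compliant crawler with this user agent may NOT fetch path.

    is_blocked is a hypothetical helper name, not part of any library.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for agent in ("ia_archiver", "Googlebot"):
        print(agent, "blocked:", is_blocked(ROBOTS_TXT, agent))
```

Both agents should come back as blocked, because the "User-agent: *" group applies to every robot. Dropping that second group would leave only the Internet Archive's crawler blocked.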
-
Thank you for your reply.
Photos of the past site are on the archive - good to know it takes a while for them to remove the past images. I definitely see the value of adding this snippet to prevent future archives (if I rebuild a new site). I'm not a techy - where would I add it?
-
robots.txt goes in the root of the server. You can also add a line of code to each of your pages in the head section, as follows:
<META NAME="ROBOTS" CONTENT="NOARCHIVE">
-
As Robert said, robots.txt is a file in the root of your server. His point about the meta tag is a great one too. I forgot that I had put that in my files as well. In my case, I wanted to make sure all robot activity on my site, including Google, stopped, so I used this variation:
<meta name="robots" content="noindex, nofollow, noarchive">
Also, there is a lot of information here about stopping PicScout and other robots that are looking to catch you infringing. I suggest you look up some of those topics as well. I used some of those techniques. A user here named lucia highly recommends this specific piece of code - http://www.spambotsecurity.com/zbblock.php
-
ok - I'll keep reading. Thank you for all of your help.
-
If you want Google to index your site, don't include the "noindex" portion of the snippet JLorimer provided:
<meta name="robots" content="noindex, nofollow, noarchive">
The "nofollow" part will also stop crawlers from following links within your site to other sites or other internal pages.
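To see which of these directives a page actually carries, here is a rough sketch using Python's standard html.parser module. The class name RobotsMetaParser and the function robots_directives are made up for illustration, and the sketch only reads the tag; whether a given crawler honors it is up to that crawler.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags.

    Hypothetical helper class, not part of any library.
    """

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META NAME=...> works too.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    page = '<head><meta name="robots" content="noindex, nofollow, noarchive"></head>'
    found = robots_directives(page)
    print("Kept out of search results:", "noindex" in found)
    print("Links not followed:", "nofollow" in found)
    print("No cached copy:", "noarchive" in found)
```

A page meant to stay in Google but out of caches would be expected to show only "noarchive" here.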
-
Good point. Also, this only stops the crawlers that respect the robots.txt file.
-
That makes sense.
After I had time to calm down from receiving this letter, I re-read the letter I received and noticed the website Getty indicates as "violating" is a domain name with no website on it.
Nonetheless, this is good info to have for other sites I make, as I need to remake the site I took down (in my panic over the letter I received - Geawd this sucks! Better ways to spend my time and resources. Vent over.).
-
It is rare that Getty finds the cached version on the Wayback Machine, but it happens, which is why, out of caution, we have always advised clearing those copies and the Google cache of your site. Even if Getty were to find those versions of your site, it would be hard for them to prove that this is a "use" under copyright law, so you would have that defense in addition to all the others discussed on this site.