As the owner of a small web design/development company that has been innocently caught up in this mess, I would like to post some helpful information for anyone interested. I currently host and maintain over 200 sites and am fearful that some of the images purchased through a template/image pack may be in use on some of my clients' sites (though I have only found 2 possibilities thus far).
The following steps are by no means a cure-all; they are geared toward several specific issues:
1. Not only remove the images from your pages immediately, but also be sure to delete the images from the server altogether!
2. Stop Getty's and PicScout's crawler from accessing your pages and downloading all images (which I consider theft and hacking in and of itself, not to mention using my bandwidth and server resources without permission).
3. Remove any cached images and pages elsewhere on the web, as Getty can simply refer to these cached pages even after the image in question has been removed from your servers.
Add this code to your meta tags inside the head section of each page to prevent spiders/crawlers from archiving (caching) your pages:
<meta name="robots" content="noarchive">
I would recommend re-submitting your site after adding this tag so that Google and other search engines/indexes pick up the new tag.
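If you maintain a lot of sites, it can help to confirm that the tag is really being served before you re-submit. Below is a minimal sketch in Python, assuming the tag in question is the standard robots meta tag with a noarchive value. The names RobotsMetaFinder and has_noarchive are my own and are only for illustration; you would feed it the HTML of each page however you normally fetch it.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # The content value may hold several comma-separated directives.
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noarchive(html):
    """Return True if the page carries a robots meta tag with noarchive."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noarchive" in finder.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noarchive"></head></html>'
    print(has_noarchive(page))
```

This only tells you the tag is present in the page source. Whether a given crawler honors it is a separate question.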
Block these IPs and IP ranges, as they are associated with Getty and PicScout. (This can easily be done through cPanel if your hosting provider offers it, or you can accomplish it with .htaccess code. The link below will help you generate this code.)
http://tools.dynamicdrive.com/userban/
82.80.248.0/24
82.80.249.0/24
82.80.250.0/24
82.80.251.0/24
82.80.252.0/24
82.80.253.0/24
82.80.254.0/24
82.80.255.0/24
62.0.8.0/24
206.28.72.1
62.0.8.2
82.80.249.195
82.80.249.196
82.80.249.197
82.80.249.201
82.80.249.202
82.80.249.203
82.80.249.204
82.80.252.130
66.147.242.156
207.241.229.39
82.80.249.199
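The .0/24 entries are wildcards covering the whole last octet, 0 to 255 (/24 is CIDR notation, the proper way to write such a range; for example, 82.80.248.0/24 means 82.80.248.0 through 82.80.248.255).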
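If you would rather generate the .htaccess rules yourself than use the online tool, here is a rough sketch in Python. It assumes the older Apache 2.2 style of rules (Order/Allow/Deny); on Apache 2.4 that syntax needs mod_access_compat, or the rules have to be rewritten in the newer Require style. The function names collapse and build_htaccess are my own. The script also merges duplicates and overlapping entries, so single addresses that already fall inside a listed range are dropped.

```python
import ipaddress

# Addresses and ranges copied from the list above.
BLOCKED = [
    "82.80.248.0/24", "82.80.249.0/24", "82.80.250.0/24", "82.80.251.0/24",
    "82.80.252.0/24", "82.80.253.0/24", "82.80.254.0/24", "82.80.255.0/24",
    "62.0.8.0/24", "206.28.72.1", "62.0.8.2",
    "82.80.249.195", "82.80.249.196", "82.80.249.197", "82.80.249.201",
    "82.80.249.202", "82.80.249.203", "82.80.249.204", "82.80.252.130",
    "66.147.242.156", "207.241.229.39", "82.80.249.199",
]


def collapse(entries):
    """Parse the entries and merge duplicates, overlaps, and adjacent ranges."""
    networks = [ipaddress.ip_network(entry) for entry in entries]
    return list(ipaddress.collapse_addresses(networks))


def build_htaccess(entries):
    """Return Apache 2.2-style deny rules as one string (assumed syntax)."""
    lines = ["Order Allow,Deny", "Allow from all"]
    lines += [f"Deny from {net}" for net in collapse(entries)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_htaccess(BLOCKED))
```

Paste the output into the .htaccess file at the top level of the site. Single addresses come out with a /32 suffix, which is just the CIDR way of writing one address.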
Remove your site from the Internet Archive (Wayback Machine)
To remove your site from the Wayback Machine, place a robots.txt file at the top level of your site (e.g. www.yourdomain.com/robots.txt) and then submit your site.
The robots.txt file will do two things:
It will remove all documents from your domain from the Wayback Machine.
It will tell us not to crawl your site in the future.
To exclude the Internet Archive’s crawler (and remove documents from the Wayback Machine) while allowing all other robots to crawl your site, your robots.txt file should say:
User-agent: ia_archiver
Disallow: /
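Before uploading, you can sanity-check that these two lines do what is described, using the robots.txt parser in Python's standard library. This is only a sketch: the helper name allowed and the example URL are my own, and it shows how a well-behaved robot would read the file, not what any particular crawler actually does.

```python
from urllib.robotparser import RobotFileParser

# The same rules as shown above.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def allowed(robots_txt, user_agent, url):
    """Return True if a compliant robot with this user agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://www.yourdomain.com/images/photo.jpg"
    for agent in ("ia_archiver", "Googlebot"):
        print(agent, allowed(ROBOTS_TXT, agent, url))
```

With the rules above, ia_archiver should be refused everywhere on the site while other robots remain allowed.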
Robots.txt is the most widely used method for controlling the behavior of automated robots on your site (all major robots, including those of Google, Alta Vista, etc. respect these exclusions). It can be used to block access to the whole domain, or any file or directory within. There are a large number of resources for webmasters and site owners describing this method and how to use it. Here are some:
• http://www.robotstxt.org/
• http://pageresource.com/zine/robotstxt.htm
Once you have put a robots.txt file up, submit your site (www.yourdomain.com) on the form on http://pages.alexa.com/help/webmasters/index.html#crawl_site.
As of this post there is no way to ban PicScout's crawler using the robots.txt method. It appears to ignore this file altogether, hence the blocking of the IPs above.