Click Official ELI Links
Get Help With Your Extortion Letter | ELI Phone Support | ELI Legal Representation Program
Show your support of the ELI website & ELI Forums through a PayPal Contribution. Thank you for supporting the ongoing fight and reporting of Extortion Settlement Demand Letters.

Author Topic: Is there a definitive IP range for PICSCOUT - i want to kill the spider!  (Read 46786 times)

GoGetter

  • Newbie
  • *
  • Posts: 10
    • View Profile
I am really enjoying getting to know all about GETTY IMAGES and their nasty practices (I have full respect for copyright - but none for this business model).

I do wish I knew earlier so I could have spent more of my life educating other people and trying to undermine them. The more people they piss off and the more ammunition those people get to fight them back the more we all win .. right? Do we have an up to date IP Range to block? If so I would appreciate it.

The sooner everyone starts defending themselves from this parasite the better. I know there is some info in other threads but I didn't find a 2013 list. Thanks.

Greg Troy (KeepFighting)

  • ELI Defense Team Member
  • Administrator
  • Hero Member
  • *****
  • Posts: 1859
    • View Profile
    • Yeah, We Do That.
Re: Is there a definitive IP range for PICSCOUT - i want to kill the spider!
« Reply #1 on: January 28, 2013, 05:47:29 PM »
Here is what I have currently on Pic-Scout and others.  I have not updated my file in a while so others may have additional info.  Lucia is who I would ask.

----------------------------------------------------------------------------------

Robots.txt and IP blocks

80legs

80legs is a fast Internet crawler which crawls approximately 40,000,000 URLs a day. Unlike a lot of web crawlers, 80legs cannot be stopped by blocking IP addresses, as it constantly rotates between thousands of IP addresses. The only effective way of stopping 80legs is with the robots.txt file. The following needs to be added to the robots.txt file to stop 80legs:
User-agent: 008
   Disallow: /

Archive.org

Archive.org, also known as the Wayback Machine, crawls the web taking snapshots of everything it finds to keep as an Internet archive. The problem with this is that web crawlers like Pic-Scout and others will crawl archive.org and find images on webpages that may no longer even be active or have since been changed, and Pic-Scout will pass this information along to Getty. Getty will then send a demand letter saying you were infringing on one of their images back on that date. For this reason I believe archive.org should be blocked from crawling your site. Archive.org is blocked with the robots.txt file; the following should be added to the file:

User-agent: ia_archiver
   Disallow: /
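
If you want to check that the rules above are written correctly before uploading them, here is a minimal sketch using Python's standard urllib.robotparser. The image path and the "SomeOtherBot" name are made up for the test. Note that this only shows what a well-behaved crawler would do with the file; it cannot tell you whether a given bot actually obeys it.

```python
from urllib.robotparser import RobotFileParser

# The two robots.txt entries quoted above.
ROBOTS_TXT = """\
User-agent: 008
   Disallow: /

User-agent: ia_archiver
   Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path="/images/photo.jpg"):
    """Return True if a crawler honouring robots.txt may fetch path.

    The default path is a hypothetical example, not a real file.
    """
    return parser.can_fetch(agent, path)


for agent in ("008", "ia_archiver", "SomeOtherBot"):
    print(agent, "allowed" if allowed(agent) else "blocked")
```

Agents that have no entry in the file stay allowed, so adding these lines should not affect ordinary search engines.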

Pic-Scout

Pic-Scout was developed to crawl the web searching for copyrighted images. It can identify copyrighted images even if they have been modified, as long as 5% of the original still remains. Getty started using Pic-Scout and liked it so much that they bought the company so they could control it. Unlike the majority of web crawlers, Pic-Scout will ignore robots.txt requests not to crawl a site, so Pic-Scout must be blocked by IP range. My current information shows that Pic-Scout operates within this IP range:

IP range 72.26.192.0 - 72.26.223.255
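
Most firewalls and .htaccess rules take a range in CIDR form rather than as a start and end address. A minimal sketch of the conversion with Python's standard ipaddress module, assuming the range quoted above is still accurate (it may well be out of date):

```python
import ipaddress

# Range quoted in the post above; it may be out of date.
first = ipaddress.IPv4Address("72.26.192.0")
last = ipaddress.IPv4Address("72.26.223.255")

# summarize_address_range yields the smallest set of CIDR blocks
# that covers exactly first..last.
cidr_blocks = list(ipaddress.summarize_address_range(first, last))

for block in cidr_blocks:
    print(block, "covers", block.num_addresses, "addresses")
# prints: 72.26.192.0/19 covers 8192 addresses
```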

Also, on May 13th of 2012, in what appears to be an attempt to hide their activity, Pic-Scout purchased a new domain name, 411images.org. This site's activity was traced back to Pic-Scout's IP addresses. It is interesting that one of the locations for these IP addresses traces back to Israel, where Getty has just been hit with a $12 million class-action lawsuit for sending out demand letters and attempting to collect on images that they have no legal right to collect on. Below is the IP address information for 411images.org:

DNS01.411IMAGES.ORG
IP Address 72.26.211.146
Location   UNITED STATES
Managed By  VOXEL DOT NET INC.
Domains  1

DNS02.411IMAGES.ORG
IP Address  82.80.249.150
Location  ISRAEL
Managed By  BEZEQ INTERNATIONAL-LTD
Domains  1


Pic-Scout also has a nasty little brother called Image Exchange. Image Exchange is an add-on that works with Firefox, Chrome and Internet Explorer, and is apparently designed to be run when you find an image that you like, to determine whether it is a copyrighted image. If Image Exchange recognizes the image as copyrighted, it will then take you to where you can license the image. There are two drawbacks with Image Exchange. The first is that it does not have a complete database of copyrighted images, so it should not be trusted as the definitive answer on whether an image is copyrighted or not. When this add-on was taken to Getty's website and run on a page of Getty's images, only a few images came back as copyrighted. The second, and the most important reason why Image Exchange should be blocked, is that when it does find a copyrighted image it immediately tattles back to Pic-Scout so they can notify the owner of the image to check the registration and possibly send out a demand letter. I do not recommend the use of the Image Exchange add-on or app, as it will not guarantee your images are copyright free and may end up putting somebody else on the infringement roller coaster.



TinEye.com

You may also wish to exclude TinEye.com. TinEye is a program like Pic-Scout that crawls the web taking samples of images off of webpages. It then stores these images, and you can go to the website, upload an image, and it will show you all other instances where it has found this image on the Internet. Getty has also been known to use TinEye as a quick and easy method of locating webpages to send demand letters to. TinEye.com may currently be blocked by use of the robots.txt file. To block TinEye, add the following to your robots.txt file:
User-agent: TinEye
   Disallow: /
Note: since writing this, it has come to my attention that TinEye may sometimes ignore the robots.txt file. Useragentstring.com has identified the following IP addresses as tracing back to TinEye; they should be blocked as an added layer of security.

It lists IPs as
204.15.199.142 - 142-199-15-204-static.prioritycolo.com
41.68.22.0 - 41.68.22.0
66.230.232.19 - mail.macrobright.com
67.202.44.125 - ec2-67-202-44-125.compute-1.amazonaws.com
67.202.48.109 - 0
75.101.176.194 - ec2-75-101-176-194.compute-1.amazonaws.com
75.101.238.112 - ec2-75-101-238-112.compute-1.amazonaws.com

Every situation is unique, any advice or opinions I offer are given for your consideration only. You must decide what is best for you and your particular situation. I am not a lawyer and do not offer legal advice.

--Greg Troy

GoGetter

  • Newbie
  • *
  • Posts: 10
    • View Profile
Re: Is there a definitive IP range for PICSCOUT - i want to kill the spider!
« Reply #2 on: January 28, 2013, 05:53:41 PM »
awesome...

Greg Troy (KeepFighting)

  • ELI Defense Team Member
  • Administrator
  • Hero Member
  • *****
  • Posts: 1859
    • View Profile
    • Yeah, We Do That.
Re: Is there a definitive IP range for PICSCOUT - i want to kill the spider!
« Reply #3 on: January 28, 2013, 06:00:59 PM »
This is all information I found here, and most of it came from Lucia. She would know about any updates or changes since the original posts.

lucia

  • Hero Member
  • *****
  • Posts: 767
    • View Profile
Re: Is there a definitive IP range for PICSCOUT - i want to kill the spider!
« Reply #4 on: January 28, 2013, 08:38:44 PM »
Oddly-- I don't know much on updates because I block so many things at Cloudflare. So few things scrape my images in obvious ways, and now very few agents visit with "no referrer/no user agent" pairs (which was a symptom of Image Search).  I think you have to ban lots of stuff because I think image groups are now likely using accounts on many popular servers (Go Daddy, BlueHost and so forth).
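
For anyone who wants to look for those "no referrer/no user agent" pairs in their own logs, here is a rough sketch. It assumes your server writes the common "combined" access log format, and the two sample lines are made up (they use documentation-only IP addresses). A blank pair is only a hint that a script is visiting, not proof of who sent it.

```python
import re

# "Combined" log format used by Apache and Nginx:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def blank_pair_hits(lines):
    """Yield (host, request) for hits with no referrer and no user agent."""
    for line in lines:
        match = COMBINED.match(line.strip())
        if not match:
            continue  # skip lines that are not in combined format
        if match["referer"] in ("", "-") and match["agent"] in ("", "-"):
            yield match["host"], match["request"]


# Made-up sample lines for illustration only.
sample = [
    '203.0.113.7 - - [28/Jan/2013:20:38:44 -0600] '
    '"GET /img/a.jpg HTTP/1.1" 200 5120 "-" "-"',
    '198.51.100.9 - - [28/Jan/2013:20:39:01 -0600] '
    '"GET /index.html HTTP/1.1" 200 900 "http://example.com/" "Mozilla/5.0"',
]

hits = list(blank_pair_hits(sample))
print(hits)
```

To run it on a real log you would pass an open file instead of the sample list; the log path depends on your host.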


Greg Troy (KeepFighting)

  • ELI Defense Team Member
  • Administrator
  • Hero Member
  • *****
  • Posts: 1859
    • View Profile
    • Yeah, We Do That.
Re: Is there a definitive IP range for PICSCOUT - i want to kill the spider!
« Reply #5 on: January 28, 2013, 09:14:15 PM »
Wow, sounds like you really have your setup dialed in!

Quote from lucia:
> Oddly-- I don't know much on updates because I block so many things at Cloudflare. So few things scrape my images in obvious ways, and now very few agents visit with "no referrer/no user agent" pairs (which was a symptom of Image Search).  I think you have to ban lots of stuff because I think image groups are now likely using accounts on many popular servers (Go Daddy, BlueHost and so forth).

Robert Plant

  • Newbie
  • *
  • Posts: 6
    • View Profile
Re: Is there a definitive IP range for PICSCOUT - i want to kill the spider!
« Reply #6 on: January 28, 2013, 09:20:27 PM »
all of these are going into .htaccess

GoGetter

  • Newbie
  • *
  • Posts: 10
    • View Profile
Re: Is there a definitive IP range for PICSCOUT - i want to kill the spider!
« Reply #7 on: January 29, 2013, 04:52:03 PM »
Some of them will go in .htaccess and some in robots.txt.

Where you see a structure like the following, add those lines to robots.txt:

User-agent: 008
   Disallow: /

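The IP addresses you add to .htaccess, like so (this is the Apache 2.2 syntax; Apache 2.4 only accepts it through mod_access_compat):

# Allow rules are evaluated first, then any matching Deny rule rejects the request
Order Allow,Deny
Deny from 82.80.249.150
Deny from 72.26.211.146
Allow from all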

etc
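
If the list gets long, it is easier to generate the .htaccess lines than to type them. Below is a minimal sketch, not a finished tool: apache_deny_rules is a name made up for this example, the addresses are the ones quoted in this thread (possibly stale), and the Apache 2.4 variant assumes the standard RequireAll / "Require not ip" directives. Overlapping entries are merged first, so an address already inside a blocked range is not listed twice.

```python
import ipaddress


def apache_deny_rules(entries, apache_24=False):
    """Return .htaccess lines that deny the given IPs or CIDR ranges.

    entries: strings such as "82.80.249.150" or "72.26.192.0/19".
    apache_24: emit Apache 2.4 "Require" syntax instead of 2.2 syntax.
    """
    # ip_network validates each entry; collapse_addresses merges
    # duplicates and drops addresses already covered by a wider range.
    networks = ipaddress.collapse_addresses(
        ipaddress.ip_network(entry) for entry in entries
    )

    def fmt(net):
        # Print single hosts without the /32 suffix.
        return str(net.network_address) if net.num_addresses == 1 else str(net)

    if apache_24:
        lines = ["<RequireAll>", "    Require all granted"]
        lines += [f"    Require not ip {fmt(net)}" for net in networks]
        lines.append("</RequireAll>")
    else:
        lines = ["Order Allow,Deny", "Allow from all"]
        lines += [f"Deny from {fmt(net)}" for net in networks]
    return lines


# Addresses and range quoted in this thread; they may be out of date.
blocklist = ["82.80.249.150", "72.26.211.146", "72.26.192.0/19"]

rules_22 = apache_deny_rules(blocklist)
rules_24 = apache_deny_rules(blocklist, apache_24=True)
print("\n".join(rules_22))
print()
print("\n".join(rules_24))
```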

Of course, the intention of blocking image scrapers is not to allow you to commit copyright infringement, but it would allow you to be better defended from bandwidth-eating image scrapers and devious companies like those sending extortion letters to people who should be getting takedown notices. I have never heard of a more disturbing business model than this and I encourage everyone vaguely concerned to take up arms to defend themselves.
« Last Edit: January 30, 2013, 03:14:11 PM by GoGetter »

lucia

  • Hero Member
  • *****
  • Posts: 767
    • View Profile
Re: Is there a definitive IP range for PICSCOUT - i want to kill the spider!
« Reply #8 on: January 30, 2013, 03:27:55 PM »
There is no method of blocking scrapers that would be sufficient to protect you if you were violating copyright.  If a resource is on the web, you can't be certain no one can get to it. 

The main reasons to block are (a) to raise their costs of scraping, (b) to lower your costs of hosting, (c) just to keep them off because they irritate the heck out of you or similar.

Robert Plant

  • Newbie
  • *
  • Posts: 6
    • View Profile
Re: Is there a definitive IP range for PICSCOUT - i want to kill the spider!
« Reply #9 on: January 31, 2013, 11:27:55 AM »
This isn't about copyright. These scumbags don’t care about copyright infringement. They're just interested in extorting cash under the guise of copyright infringement.

We stopped buying from them (as I'm sure most others like us have) due to their recent ridiculous cost increases. With this behaviour of theirs, they are not even worth the risk of buying and using their stock photos for any purpose. I don't even feel very safe buying and using pictures from bigstockphoto because of them. I'm encouraging customers to avoid bulk outlets like this as traps, and use custom art from local freelancers.

Artists trying to make a living should take notice.

I digress... would blocking these IPs in .htaccess prevent them from impacting server resources?  I assume they would completely ignore robots.txt.


Robert Krausankas (BuddhaPi)

  • ELI Defense Team Member
  • Administrator
  • Hero Member
  • *****
  • Posts: 3354
    • View Profile
    • ExtortionLetterInfo
Re: Is there a definitive IP range for PICSCOUT - i want to kill the spider!
« Reply #10 on: January 31, 2013, 12:06:58 PM »
Quote from Robert Plant:
> This isn't about copyright. These scumbags don’t care about copyright infringement. They're just interested in extorting cash under the guise of copyright infringement.
>
> We stopped buying from them (as I'm sure most others like us have) due to their recent ridiculous cost increases. With this behaviour of theirs, they are not even worth the risk of buying and using their stock photos for any purpose. I don't even feel very safe buying and using pictures from bigstockphoto because of them. I'm encouraging customers to avoid bulk outlets like this as traps, and use custom art from local freelancers.
>
> Artists trying to make a living should take notice.
>
> I digress... would blocking these IPs in .htaccess prevent them from impacting server resources?  I assume they would completely ignore robots.txt.

You assume correctly. robots.txt is a waste of time in terms of PicScout and other scrapers/bots, and it will be a never-ending process. Blocking the IP range PicScout uses via .htaccess will work until they switch up the IPs they use, and it also won't do any good if they use proxies.

Most questions have already been addressed in the forums, get yourself educated before making decisions.

Any advice is strictly that, and anything I may state is based on my opinions, and observations.
Robert Krausankas

I have a few friends around here..

jot

  • Jr. Member
  • **
  • Posts: 25
    • View Profile
Re: Is there a definitive IP range for PICSCOUT - i want to kill the spider!
« Reply #11 on: January 31, 2013, 06:30:13 PM »
I have noticed that Getty Images has been using proxies over the last few days.  Though I have a thorough listing of their IP addresses, the other day I noticed I could still access their site even though it is blocked by my firewall, and using nslookup I saw that they were using different IP addresses.  Five minutes later, it was back on the usual IP addresses.  I am now blocking by domain name, but they will figure out a way around that too.  :(

Robert Krausankas (BuddhaPi)

  • ELI Defense Team Member
  • Administrator
  • Hero Member
  • *****
  • Posts: 3354
    • View Profile
    • ExtortionLetterInfo
Re: Is there a definitive IP range for PICSCOUT - i want to kill the spider!
« Reply #12 on: January 31, 2013, 06:37:50 PM »
Quote from jot:
> I have noticed that Getty Images has been using proxies over the last few days.  Though I have a thorough listing of their IP addresses, the other day I noticed I could still access their site even though it is blocked by my firewall, and using nslookup I saw that they were using different IP addresses.  Five minutes later, it was back on the usual IP addresses.  I am now blocking by domain name, but they will figure out a way around that too.  :(

Did you mean PicScout or Getty Images?  Getty doesn't crawl sites as far as I know; PicScout is the culprit there, but Getty does own PicScout.

lucia

  • Hero Member
  • *****
  • Posts: 767
    • View Profile
Re: Is there a definitive IP range for PICSCOUT - i want to kill the spider!
« Reply #13 on: January 31, 2013, 11:41:46 PM »
Quote from Robert Plant:
> I digress... would blocking these IPs in .htaccess prevent them from impacting server resources?  I assume they would completely ignore robots.txt.

If blocked in .htaccess, blocking in robots.txt becomes superfluous. However, if you are blocking by IP and you miss an IP block or a robot changes IP ranges, it won't work. robots.txt might-- if the bot obeys it (which it may not.)
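
To make the "miss an IP block" case concrete, here is a small sketch that tests addresses against a block list with Python's standard ipaddress module. It assumes the block list holds only 72.26.192.0/19, which is the CIDR form of the 72.26.192.0 - 72.26.223.255 range quoted earlier in the thread, and it checks the two 411images.org name server addresses from the same post. All of these figures may be stale.

```python
import ipaddress

# Only the Pic-Scout range quoted earlier in the thread, in CIDR form.
BLOCKED = [ipaddress.ip_network("72.26.192.0/19")]


def is_blocked(address, blocked=BLOCKED):
    """Return True if address falls inside any blocked network."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in blocked)


print(is_blocked("72.26.211.146"))  # DNS01.411IMAGES.ORG -> True
print(is_blocked("82.80.249.150"))  # DNS02.411IMAGES.ORG -> False
```

The second address sits outside the range, so a rule that blocks only the range would let it through unless it is added separately.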

lucia

  • Hero Member
  • *****
  • Posts: 767
    • View Profile
Re: Is there a definitive IP range for PICSCOUT - i want to kill the spider!
« Reply #14 on: January 31, 2013, 11:50:05 PM »
Quote from jot:
> I have noticed that Getty Images has been using proxies over the last few days.  Though I have a thorough listing of their IP addresses, the other day I noticed I could still access their site even though it is blocked by my firewall, and using nslookup I saw that they were using different IP addresses.  Five minutes later, it was back on the usual IP addresses.  I am now blocking by domain name, but they will figure out a way around that too.  :(

Yes. That's why it is very difficult to block Getty. To an extent, if you really want to block Getty, you have to decide to block lots and lots and lots of stuff.  You'll end up wanting to block nearly all the server-based SEO/reputation management groups, hosting companies that welcome spammers, the Amazon range (used by lots of script kiddies) and loads of other stuff.   For many, many, many sysadmins, blocking these is a win/win situation because very little of that stuff has any great benefit to most web sites. (Oh.. you'll find people who tell you they do. But those people either a) don't know what they are talking about, b) are seriously over-rating the level of benefit of things like, for example, "shopping bots" to the vast majority of sites which list nothing for sale, or c) are lying.)

But a few web sites do benefit from some of those visits and those web sites need to know which of the server-supported sites visit them.

 
