Click Official ELI Links
Get Help With Your Extortion Letter | ELI Phone Support | ELI Legal Representation Program
Show your support of the ELI website & ELI Forums through a PayPal Contribution. Thank you for supporting the ongoing fight and reporting of Extortion Settlement Demand Letters.

Author Topic: Picscout sighting.  (Read 11651 times)

lucia

  • Hero Member
  • *****
  • Posts: 767
    • View Profile
Picscout sighting.
« on: November 15, 2012, 09:15:38 AM »
I thought some of you would be interested in reading a log in my "kill" file:

Quote
#: 111122 @: Wed, 14 Nov 2012 06:13:24 -0800 Running: 0.4.10a1
Host: mailptr.picscout.com
IP: 62.219.119.15
Score: 1
Violation count: 1 INSTA-BANNED
Why blocked:   ;   Image scraper, sharing or copyright enforcing host      INSTA-BAN. You have been instantly banned!  |-|   (1: pics )  check host   || ( ax=0)    [IL]  ; ( 0 )
Query:
Referer: http://stackoverflow.com/questions/11215963/how-to-block-picscout-bot
User Agent: Mozilla/5.0 (Windows NT 6.0; WOW64; rv:16.0) Gecko/20100101 Firefox/16.0
Reconstructed URL: http:// rankexploits.com /protect/2011/12/four-steps-to-slow-down-image-scrapers/

This is definitely picscout. Most likely, it's someone at picscout wanting to read my advice on how to slow down image scrapers.

Of course, one of the steps is: block any visitor whose host name contains the word 'pics'.  I do this-- which is why the host 'picscout' was blocked.
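
For anyone who wants to try the host-name rule, here is a minimal Python sketch. It is not the script that produced the log above; the function name and the substring list are made up for illustration, and it assumes you already have the visitor's reverse-DNS host name in hand.

```python
# Hypothetical helper: decide whether a host name should be banned.
# Assumption: the only rule is the one described above (host contains "pics").
BANNED_HOST_SUBSTRINGS = ("pics",)


def host_is_banned(hostname, banned=BANNED_HOST_SUBSTRINGS):
    """Return True if the host name contains any banned substring."""
    name = (hostname or "").lower()
    return any(fragment in name for fragment in banned)


print(host_is_banned("mailptr.picscout.com"))  # True
print(host_is_banned("example.org"))           # False
```

In Python the host name for an IP can usually be looked up with `socket.gethostbyaddr(ip)[0]`. A plain substring match is blunt and will also catch unrelated hosts that happen to contain 'pics', so expect some collateral blocking.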

Other useful information: This IP 62.219.119.15 is on bezeq servers. Specifically, it's on

Quote
inetnum:        62.219.110.0 - 62.219.155.255
netname:        BEZEQINT-BROADBAND
descr:          FIXED-IP
country:        IL
admin-c:        BNT1-RIPE
tech-c:         BHT2-RIPE
status:         ASSIGNED PA
remarks:        please send ABUSE complains to abuse@bezeqint.net
remarks:        INFRA-AW
mnt-by:         AS8551-MNT
mnt-lower:      AS8551-MNT
source:         RIPE # Filtered

role:           BEZEQINT HOSTMASTERS TEAM

I block many bezeqint sites-- mostly because way back around the time I received a Getty letter, I coincidentally had my site absolutely hammered by an IP on bezeqint. It took my blog to its knees-- crashing and restarting all day. Naturally, I will now be blocking the full range above-- at Cloudflare.
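
Most firewalls want CIDR blocks rather than a "first - last" range like the one in the whois record. A small sketch using Python's standard `ipaddress` module shows one way to convert the inetnum range above and to test an address against it. The helper name `in_blocked_range` is made up for this example, and nothing here talks to Cloudflare.

```python
import ipaddress

# The inetnum range from the whois record quoted above.
first = ipaddress.ip_address("62.219.110.0")
last = ipaddress.ip_address("62.219.155.255")

# Collapse the range into the smallest set of CIDR blocks that covers it.
cidrs = list(ipaddress.summarize_address_range(first, last))
for network in cidrs:
    print(network)


def in_blocked_range(ip, networks=cidrs):
    """Return True if the address falls inside any of the CIDR blocks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


print(in_blocked_range("62.219.119.15"))  # the IP from the log: True
```

The printed CIDR blocks are what you would paste into whatever block list you maintain.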

(Unfortunately, I have concluded it's very difficult to block image scrapers. It can be done-- but it can't be done by people with near zero programming skillz. It also needs to be custom based on one's subject matter. But there are some things that really help.)

Robert Krausankas (BuddhaPi)

  • ELI Defense Team Member
  • Administrator
  • Hero Member
  • *****
  • Posts: 3354
    • View Profile
    • ExtortionLetterInfo
Re: Picscout sighting.
« Reply #1 on: November 15, 2012, 10:08:55 AM »
Always good to see Lucia appear!! Yup, blocking scrapers is a never-ending issue.. for those without "programming skillz" or those that don't care if they get traffic from Israel, you can always take my approach and block the entire country.. ( not practical for Lucia )... Lucia, please do tell if you ever notice anything from picscout coming from the States.. that could be a game changer in many regards..
Most questions have already been addressed in the forums, get yourself educated before making decisions.

Any advice is strictly that, and anything I may state is based on my opinions, and observations.
Robert Krausankas

I have a few friends around here..

lucia

  • Hero Member
  • *****
  • Posts: 767
    • View Profile
Re: Picscout sighting.
« Reply #2 on: November 15, 2012, 11:23:00 AM »
Blocking Israel is also not enough.   At a minimum, one must also do this:

Block all image requests with blank user agents.  Picscout's Firefox add-on will visit with a blank user agent after someone uses it. But also, I see *many* visits to images using blank user agents only.  So images should never be served to these things.  This should very, very rarely interfere with any honest visits because everything should present a user agent.  (Some people who are both stupid and paranoid buy privacy software that presents a blank user agent. But it's very, very rare. And those people should be told to turn on their user agent.)

These visits do not come from Israel. They are generally coming from larger servers.
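
The blank-user-agent rule is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not the code running on the blog; the function name and the list of image extensions are assumptions.

```python
# Assumption: these extensions are what counts as "an image" on the site.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif")


def allow_request(path, user_agent):
    """Refuse image requests that arrive with a blank or missing user agent."""
    is_image = path.lower().endswith(IMAGE_EXTENSIONS)
    has_agent = bool(user_agent and user_agent.strip())
    # Non-image requests are not affected by this particular rule.
    return (not is_image) or has_agent


print(allow_request("/musings/graph.png", ""))             # False
print(allow_request("/musings/graph.png", "Mozilla/5.0"))  # True
print(allow_request("/musings/some-post/", ""))            # True
```

Treating a whitespace-only user agent as blank is a choice made for this sketch; the same rule is often written directly in the web server configuration instead.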

Oh...today... I saw something try to visit

rankexploits.com/protect/wp-content/themes/images

This is definitely an attempt to find images that might have been uploaded to the 'theme' folder. There is nothing at that location.  Because I partially 'broke' my WordPress redirects, that attempt was sent to the "file missing" bin (404). I made my 404 page dynamic, and also run a script to ban things that are looking for missing things in 'wp-content/themes/'. So, that IP got banned.
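
As a rough illustration of the 404 rule just described, here is a minimal Python sketch. It assumes the function is only ever called for requests that already ended in a 404, and it keeps bans in an in-memory set; the names are hypothetical and the real script is not shown in this thread.

```python
def handle_missing_file(ip, path, banned_ips):
    """Called for a request that produced a 404.

    Bans the IP if it was probing for a missing file under
    wp-content/themes/, and reports whether a ban was issued.
    """
    if "wp-content/themes/" in path.lower():
        banned_ips.add(ip)
        return True
    return False


banned = set()
handle_missing_file(
    "91.229.125.213",
    "/protect/wp-content/themes/images/arrows-ffffff.png",
    banned,
)
print(banned)  # {'91.229.125.213'}
```

A real setup would persist the ban somewhere (a file, a database, or the firewall itself) rather than a set that disappears when the process exits.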

Here's the record of the 'ban'

Quote
#: 111365 @: Thu, 15 Nov 2012 07:09:38 -0800 Running: 0.4.10a1
Host: 91.229.125.213
IP: 91.229.125.213
Score: 2
Violation count: 1
Why blocked:   ; It looks like you are trying to call the theme directly.   (404) Fingerprint, scrape or hack behavior.    (2: wp-content/theme )   || ( ax=0)   [GB]   ( 404=1 )  ; ( 0 )
Query:
Referer:
User Agent: Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_0 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Version/4.0.5 Mobile/8A293 Safari/6531.22.7
Reconstructed URL: http:// rankexploits.com /protect/wp-content/themes/images/arrows-ffffff.png

That image is not on my site.  But the fact that something tried to load it means either
a) They were trying to figure out if I have a particular image directory at that location or
b) They were hunting for a vulnerability -- that would be a threat unrelated to image sniffing.

I can't know which. But those worried about image scrapers should be worried about both. (To some extent, it is pointless to worry only about image scraping. You simply cannot block those things without blocking all the other stuff too. Leaving avenues for other types of scraping or hacking open will leave avenues for scraping open. Period.)

Anyway, here are some details on the IP:

IP=91.229.125.213 is associated with
netname:        ITLAB-NET
descr:          IT Lab Limited

That looked like an interesting name, so I googled and discovered:

http://www.itlab.co.uk/

That is
a) a cloud service and
b) one that provides custom coding http://www.itlab.co.uk/services/consultancy-services/codelab/

I think the probability that this IP was someone devoting a lot of effort to hunting for images is near 50%. But that's a guess.  The alternative is that it's a penetration tester-- which is just as bad.

Whois gives their IP range as
91.229.124.0 - 91.229.127.255
They are located in Great Britain and an Israel ban would not keep them out. 

Based on watching my logs, I have every reason to believe that crawlers looking for images are:
1) using a vast number of proxies-- both anonymous and transparent.  I often see blocked Israeli IPs using non-transparent proxies.
2) using specialized cloud services located all over the world.

I see lots of hits from "The Planet", "Hostgator", "BlueHost", "voxility" and all sorts of dedicated servers all over the world.  These are cheap services on which picscout or any picscout-like entity (e.g. tineye, idee etc.) could choose to run a scraper.  I suspect they do so.  I'm reasonably certain that if you think blocking Israel is enough, you are likely wrong.

I have a blog. I post lots of graphs-- most created by me-- but some by co-bloggers. All are constantly visited by agents with
a) referrers I know to be wrong.
b) blank or weird user agents.

To catch the scrapers, I'm doing rather complicated stuff. Specifically:
1) In .htaccess, I divert image requests with whacked out referrers or user agents to a script.  I also send requests for any images that are more than 3 months old to the script.
2) That script processes the request. If it fails quality checks, that IP is banned-- at Cloudflare. That means that IP cannot crawl anything at my site. If it passes quality checks, it is shown the image. If it's in between, it is shown a substitute image of a cat.
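
The three outcomes in step 2 (ban, real image, cat) can be sketched as a small triage function in Python. The specific "quality checks" below are invented for illustration, since the actual checks are not listed in this thread: a blank user agent bans, a referrer on the blog's own host passes, and everything else gets the substitute image. The Cloudflare ban itself is not shown.

```python
from enum import Enum
from urllib.parse import urlparse


class Verdict(Enum):
    SERVE = "serve the real image"
    SUBSTITUTE = "serve the substitute image (the cat)"
    BAN = "ban the IP"


MY_HOST = "rankexploits.com"  # the blog's host, as it appears in the logs above


def triage(user_agent, referer, my_host=MY_HOST):
    """Classify a diverted image request. The thresholds are assumptions."""
    if not (user_agent or "").strip():
        return Verdict.BAN  # blank user agent fails outright
    ref_host = urlparse(referer or "").hostname or ""
    if ref_host == my_host or ref_host.endswith("." + my_host):
        return Verdict.SERVE  # referrer is a page on the blog itself
    return Verdict.SUBSTITUTE  # missing or off-site referrer: in between


print(triage("", "http://rankexploits.com/musings/"))             # Verdict.BAN
print(triage("Mozilla/5.0", "http://rankexploits.com/musings/"))  # Verdict.SERVE
print(triage("Mozilla/5.0", None))                                # Verdict.SUBSTITUTE
```

Keeping an "in between" verdict means a borderline visitor sees a harmless picture instead of being locked out, which limits the damage from a false positive.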

(Note the potential downside of blocking a Hostgator/Bluehost/etc. IP at Cloudflare is that if a blog on those services sent me a 'ping' I wouldn't see it. I'm willing to sacrifice that-- as it's really the only reasonable incoming connection from those services my blog might expect. )

This system:
1) is resource intensive because you have to run a script rather than just deliver an image.
2) requires willingness to fiddle with your .htaccess and customize in a way that makes sense. (In fact, you must edit each month if you want to check all requests for images older than 3 months.)
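
If the age test is done inside a script rather than in .htaccess, the cutoff can be computed at request time instead of being edited by hand. The Python sketch below assumes image URLs contain a /YYYY/MM/ segment, which is how WordPress lays out uploads when month-based folders are enabled; that layout and the function name are assumptions, not something stated in this thread.

```python
import re
from datetime import date

# Assumption: paths look like /wp-content/uploads/2012/06/graph.png
UPLOAD_DATE = re.compile(r"/(\d{4})/(\d{2})/")


def is_older_than(path, today, months=3):
    """Return True if the /YYYY/MM/ in the path is more than `months` ago."""
    match = UPLOAD_DATE.search(path)
    if match is None:
        return False  # no date in the path, so this sketch cannot tell
    year, month = int(match.group(1)), int(match.group(2))
    age_in_months = (today.year - year) * 12 + (today.month - month)
    return age_in_months > months


today = date(2012, 11, 15)
print(is_older_than("/musings/wp-content/uploads/2012/06/graph.png", today))  # True
print(is_older_than("/musings/wp-content/uploads/2012/10/graph.png", today))  # False
```

The trade-off is that every image request then has to reach the script before the age is known, which adds to the resource cost mentioned in point 1.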

In addition, you need to make some decisions about what you are going to permit. For example: My /feed/ addresses do not display links to images if one loads those addresses directly in the browser. So, things that try to load images from http://myblogdomain/blogpost/feed  are banned. I've been doing this for months. I ban at least 20 things a day with that referrer and you know what? I have never, ever, ever had a human complain. That tells me those requests are not people.
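
The feed rule above boils down to one test on the referrer. A minimal Python sketch, with a made-up function name, might look like this:

```python
from urllib.parse import urlparse


def referer_is_feed(referer):
    """Return True if the referrer's path ends in /feed (with or without a slash)."""
    path = urlparse(referer or "").path.rstrip("/")
    return path.endswith("/feed")


print(referer_is_feed("http://myblogdomain/blogpost/feed"))   # True
print(referer_is_feed("http://myblogdomain/blogpost/feed/"))  # True
print(referer_is_feed("http://myblogdomain/blogpost/"))       # False
```

The rule only makes sense on a site where the feed pages really do not link to images when opened in a browser; on other sites it would ban legitimate readers.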

I don't think there is any other way to block image scrapers. Moreover, I doubt if I even succeed in blocking all of them.  If someone or something wants to visit each post promptly providing a not obviously wrong user agent, and a not-obviously-wrong referrer, that someone or something will be indistinguishable from a real visitor. My system will let them view the images.   So, an image scraper could see every single image. (You might wonder why their bot doesn't do this?  It's because it takes more resources to monitor a blog for posting, and load a whole page rather than just try to watch what shows up on google image search and then load that image without knowing what the "right" referrer would be.   My guess is they show some obviously wrong referrers because they don't know the right one for the image.)

Oh-- and I'm doing more stuff. For example: I'm blocking all connections from TOR, and I'm blocking lots of connections from free public proxies listed on various web sites.  Doing the latter has drastically cut into the rate at which I see requests for images with blank user agents. (This is why I feel rather certain that lots of the image scrapers are using public proxies.)
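
Checking a visitor against a proxy or TOR list is just a set lookup once the list is loaded. The Python sketch below assumes a plain text list with one IP per line and `#` comments; that file format, the function name, and the sample addresses (taken from the reserved documentation ranges) are all made up for illustration.

```python
def parse_ip_list(text):
    """Parse one-IP-per-line text, ignoring blank lines and # comments."""
    ips = set()
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()
        if entry:
            ips.add(entry)
    return ips


# Made-up sample list; a real one would be downloaded from a list publisher.
SAMPLE_LIST = """
# open proxies seen this week
192.0.2.10
198.51.100.7   # listed on two sites
"""

blocked = parse_ip_list(SAMPLE_LIST)
print("192.0.2.10" in blocked)   # True
print("203.0.113.5" in blocked)  # False
```

Such lists go stale quickly, so the list would need to be refreshed on a schedule for the lookup to stay useful.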

Anyway... unfortunately, I can't give good easy advice on how to block picscout and picscout-like scrapers. I know I've made it expensive for them to operate on my site-- that's the best anyone could do.


Greg Troy (KeepFighting)

  • ELI Defense Team Member
  • Administrator
  • Hero Member
  • *****
  • Posts: 1859
    • View Profile
    • Yeah, We Do That.
Re: Picscout sighting.
« Reply #3 on: November 15, 2012, 11:44:59 AM »
Great info as always Lucia!  Great to see you here again too!
Every situation is unique, any advice or opinions I offer are given for your consideration only. You must decide what is best for you and your particular situation. I am not a lawyer and do not offer legal advice.

--Greg Troy

jot

  • Jr. Member
  • **
  • Posts: 25
    • View Profile
Re: Picscout sighting.
« Reply #4 on: November 29, 2012, 09:31:02 PM »
From some of my research so far, I have found that PicScout has used a server at BlueHost.

I really hate how these annoying spiders and bots are eating up my bandwidth on our web server...and all this time I thought we were getting more valid traffic.  These "trolls" are nothing more than hackers bypassing security measures for monetary gain.

lucia

  • Hero Member
  • *****
  • Posts: 767
    • View Profile
Re: Picscout sighting.
« Reply #5 on: December 04, 2012, 06:06:52 PM »
My impression is picscout-- or other image scrapers-- are now using servers all over the place.  Blocking them from images would involve a heavy investment of time watching things that load images only. It can be done-- but it's not easy. It's also resource intensive for any blogger.

I do it. But.. nope. Not easy.  It's sufficiently difficult that I would have a difficult time sharing my method with anyone who isn't extremely motivated to keep the picscout-like scrapers off.  (And I probably fail anyway.)

Oscar Michelen

  • ELI Legal Warrior
  • Hero Member
  • *****
  • Posts: 1301
    • View Profile
    • Courtroom Strategy
Re: Picscout sighting.
« Reply #6 on: December 05, 2012, 06:50:56 PM »
Great information Lucia (I almost understood 30% of it!) Thanks for the post!

meontheweb

  • Jr. Member
  • **
  • Posts: 28
    • View Profile
Re: Picscout sighting.
« Reply #7 on: December 07, 2012, 02:29:55 PM »
Lucia - very interesting; I have a background in software development so what you're doing does make some sense but alas never kept up with it (I'm in a different career path now).

I did find this - slightly old and not sure if it is still relevant:

http://rankexploits.com/protect/2011/12/four-steps-to-slow-down-image-scrapers/

The post goes into quite a bit of detail on what you need to do to modify your .htaccess file (if you have access to it); much of it is simply cut & paste.

I will spend some time this weekend to implement this on the few blogs that I do have left...

Robert Krausankas (BuddhaPi)

  • ELI Defense Team Member
  • Administrator
  • Hero Member
  • *****
  • Posts: 3354
    • View Profile
    • ExtortionLetterInfo
Re: Picscout sighting.
« Reply #8 on: December 07, 2012, 09:45:50 PM »
hahhaa  rankexploits is none other than our very own lucia!!!    ;D

meontheweb

  • Jr. Member
  • **
  • Posts: 28
    • View Profile
Re: Picscout sighting.
« Reply #9 on: December 07, 2012, 10:01:41 PM »
Hah!  You know I saw a reply to a comment with the same name, and thought "naaah, too much of a coincidence".

lucia

  • Hero Member
  • *****
  • Posts: 767
    • View Profile
Re: Picscout sighting.
« Reply #10 on: December 07, 2012, 11:05:07 PM »
Yep.  That's me. :)

I can tell you that those things helped initially. But I'm pretty sure the humans programming the bots keep doing more to make the scraping harder to slow down. Last November, my site was *really* getting scraped like crazy! It was insane!  Things would just rip through and load every single image at my blog (at rankexploits.com/musings ) :)
