ExtortionLetterInfo Forums

ELI Forums => Getty Images Letter Forum => Topic started by: Moe Hacken on May 27, 2012, 03:21:41 PM

Title: How is this different from PicScout?
Post by: Moe Hacken on May 27, 2012, 03:21:41 PM: It's now all over the media, and I really don't see that much difference between this and the PicScout business model:

http://tinyurl.com/7rykvhf

http://tinyurl.com/78vf6ro

How is this different from the Google Street View people storming into these people's houses, sticking a USB stick in their laptop, and downloading anything they please?

How is PicScout's rudebot any different when they bang into your server to suck out information that's none of their business and it has been made clear that they are not to go there?

VKT loves to say that just because he leaves his keys in the car's ignition, that doesn't give you the right to climb in and drive off. Really? So just because our servers can't be easily secured from PicScout's gross intrusions, that gives THEM the right to come in and take whatever they want on our bandwidth nickel and sell it to anyone for their trollish purposes? The guy's trying to have it both ways and I'm not buying it for a second.

How is it that gathering "evidence" against a person in this manner does NOT require a warrant or court order, and how can it be argued that any evidence thus collected could be admissible in court?

Even the Department of Homeland Security has to get a warrant or court order to be able to collect electronic information, or any kind of private information, for that matter. That doesn't necessarily stop them from doing surveillance, but they are well aware of what would stick in court and what would get thrown out.

This needs to be made specific: PicScout evidence collected without an appropriate court order or warrant is NOT admissible in court. That won't stop them from rudely snooping, but it would make their business model change DRASTICALLY. They would only be worth hiring in the US if there is a very specific case where the surveillance can be justified and is approved by judicial oversight. No more gill netting for the trollbot.

Fair is fair. We have a Constitution and a Bill of Rights for some very good reasons. I'm not done using my civil rights so I think I'll stand up for them.
Title: Re: How is this different from PicScout?
Post by: Robert Krausankas (BuddhaPi) on May 27, 2012, 03:36:33 PM: I think what you're missing here is that Picscout is not operating from the US. If it were, this would be a different ballgame, as they would have to play by our rules and laws. Even though Getty Images owns Picscout, I doubt they will ever move that portion to US soil. As web hosts and developers, it is part of our ethical and moral responsibility to educate and protect our clients and any future clients from the trolls.
Title: Re: How is this different from PicScout?
Post by: SoylentGreen on May 27, 2012, 05:43:21 PM: I recall that this issue came up a little ways back. I guess that the controversy is still ongoing.

My opinion is that Picscout spiders content that's intentionally published for "public consumption".
So, it's practically impossible to nail them for that.

There have been reports of Picscout ignoring robots.txt, etc, and spidering hidden directories.
That's not technically illegal to my knowledge. However, one could litigate on the basis of "trespass to chattels".
But, a judge would have to decide on the merits, and Getty would defend its 20 million investment in Picscout (and ongoing revenue from it) vigorously.

S.G.
Title: Re: How is this different from PicScout?
Post by: lucia on May 27, 2012, 06:07:53 PM: The other difficulty is even if the judge thinks image scraping and ignoring robots.txt violates something or another, you still have to prove they did it. Many people just don't keep access logs forever-- and if picscout spoofs user agents and uses a range of IPs they might not be able to prove the access was picscout.
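
For anyone who does keep access logs, here is a minimal sketch of the kind of tally that could help. It assumes an Apache-style "combined" log format; the sample lines, the IP addresses and the `/images/` prefix are made up for illustration. It only counts requests per client IP, and it cannot tell you who is behind an address.

```python
import re
from collections import Counter

# Assumed layout (Apache "combined" format):
# ip ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def hits_by_ip(lines, path_prefix):
    """Count requests per client IP for paths that start with path_prefix."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that do not match the assumed format
        parts = m.group("request").split()
        if len(parts) >= 2 and parts[1].startswith(path_prefix):
            counts[m.group("ip")] += 1
    return counts

# Hypothetical sample lines, using documentation-only IP ranges.
sample = [
    '203.0.113.7 - - [27/May/2012:18:07:53 -0500] "GET /images/a.jpg HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [27/May/2012:18:07:54 -0500] "GET /images/b.jpg HTTP/1.1" 200 4096 "-" "Mozilla/5.0"',
    '198.51.100.2 - - [27/May/2012:18:08:01 -0500] "GET /index.html HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]

print(hits_by_ip(sample, "/images/"))
```

Since the user agent is whatever the client chooses to send, a tally like this is a starting point for questions, not proof of identity.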
Title: Re: How is this different from PicScout?
Post by: Moe Hacken on May 28, 2012, 10:37:14 PM: Buddhapi, I understand what you're saying, and I agree you can't stop them from trolling all they want from outside the fence. However, the cases are not being brought to courts by PicScout; it's their clients using PicScout's "evidence" who are taking people to court in the U.S.

I don't know how it works in other countries, but my layman's understanding is that illegally collected evidence will usually get thrown out of court here. If you have a precedent that makes PicScout's evidence illegal except when supported with a proper warrant, then any evidence they collect for a customer in the United States without one will be useless. Again, they would only be hired in the US in cases where the legal bases are covered properly, such as when a warrant has been issued.

I understand you can't shut down a foreign corporation, but you can regulate its products here as we see fit. France punished Google for their inadvertent snooping, for example. They're more concerned with individual privacy than with Google's right to create a better user experience by reading their email.

Soylent Green, I agree that website content is intended for "public consumption", and anything you publish on the internet where the public can see it is fair game. However, PicScout goes too far when they ignore the robots.txt convention because that is exactly where you are saying "this is no longer public, no looking". Most people are not savvy enough with their servers to password secure directories or block access by IPs, so the robots.txt method is the most likely solution they'll use to keep information private on their server. I don't think it's fair play for PicScout to say "just because my spybot is really studly I have the right to crash into your private directories." That's like saying that if someone burglarizes your house because you didn't invest in a really fancy security system, it's your bad and you shouldn't have anything to complain about. And if they happen to be the FBI and they find something illegal in your house, they can now use this evidence to bring criminal charges. I don't think it works that way at all.

Lucia makes a good point about how difficult it can be to even track these trollbots because they go around faking their identity and switching IP addresses. I'm sure we haven't even started to see their dirty little tricks. They're fundamentally hacking on a level that's currently a legal gray area.

However, a person being sued for copyright infringement doesn't need to prove PicScout did it. They just have to ask how the evidence was obtained and let the plaintiffs explain. If I were the defendant, I would vigorously encourage the plaintiff to be VERY specific about how PicScout got the evidence and would even ask for receipts, contracts, and communications between the plaintiff and the provider of the evidence. Also, I would ask for a lot of details about the actual crawl, such as dates, specific logs, filenames, file sizes, time spent, bandwidth sucked, technician's name and qualifications, and a very specific record of how many false positives PicScout has generated and how they arrived at these performance benchmarks.

Again, I'm just a layman and I don't know if any of this could hold any water in our court system. Just throwing some ideas around. The Google cases got me thinking because there are some analogies about the civil rights issues involved.
Title: Re: How is this different from PicScout?
Post by: SoylentGreen on May 28, 2012, 11:33:03 PM: Many good points made in this discussion. There are several prior discussions along these lines as well.

The issue is that ignoring "robots.txt" is not illegal under any US law.
A person could make an argument to a judge that "robots.txt was ignored", if it ever went that far. But, that's kind of weak.
But, there are front-line defenses that work much better than this.

Furthermore, it's my personal opinion that Picscout is not necessarily masking its presence.
I have recent evidence of visits using the same old IP addresses and provider.
I don't think that we can say with certainty who is actually visiting our sites without further evidence.

S.G.
Title: Re: How is this different from PicScout?
Post by: Moe Hacken on May 28, 2012, 11:48:23 PM: Quote from: SoylentGreen on May 28, 2012, 11:33:03 PM
The issue is that ignoring "robots.txt" is not illegal under any US law.

Exactly. I'm saying it should be, but that's probably an ACLU-size case. It's not a recourse at present, but it's something to think about if this tort abuse epidemic continues. Don't forget it's not just about pretty pictures: Music, software, fonts, video, and even text are items that are already being trolled from every angle imaginable.

I'll give you another example: if you have a secure database with people's medical records and PicScout is able to defeat the security and decrypt it, do they have the right to sell it to insurance companies that want to know who they should turn down? Is it legal just because they CAN break into it and decode it?

The notion of asking PicScout for performance accountability is analogous to challenging a speeding ticket by asking for the radar gun's maintenance records, model, technology, age of equipment, record of false positives, etc. The idea is to diminish the credibility of the technology they're using as much as possible, in order to weaken the weight of the evidence it produced.
Title: Re: How is this different from PicScout?
Post by: SoylentGreen on May 28, 2012, 11:56:04 PM: Breaking into a system/breaking passwords or even bypassing captcha protections is against the law in the US.

Here's an interesting link:
http://www.wired.com/threatlevel/2011/11/anti-hacking-law-too-broad/

S.G.
Title: Re: How is this different from PicScout?
Post by: Moe Hacken on May 29, 2012, 12:22:40 AM: That's a great link. Thank you Soylent Green!

So there we have a possible tactic. Password secured image directories that only the website's pages are authorized to access for display. The bot can't get evidence legally from the directory even if it can defeat the security.

On the other hand, if someone is committing an infringement on a public page, then you can look for it on the public part of the web like the Google image search engine does, or you can visit the page manually to collect the evidence legally.

The ideal goal is to have a level playing field. Those of us who own intellectual property need to be able to protect it and seek relief when we have been infringed upon, but I strongly disagree that these bully tactics should be legal.

For the most part we are seeing a whole lot of innocent infringement instances being totally overblown in these threatening letters (and calls!). This copyright troll approach is like treating dandruff by decapitation.
Title: Re: How is this different from PicScout?
Post by: SoylentGreen on May 29, 2012, 10:41:20 AM: I think that anything that's displayed on a web page (even dynamically) can be picked up by a bot.
Anything that's "publicly" displayed is hopelessly "fair game".

S.G.
Title: Re: How is this different from PicScout?
Post by: Moe Hacken on May 29, 2012, 11:21:24 AM: Quote from: SoylentGreen on May 29, 2012, 10:41:20 AM
I think that anything that's displayed on a web page (even dynamically) can be picked up by a bot.
Anything that's "publicly" displayed is hopelessly "fair game".

S.G.

Yes, I believe that any public page, even dynamic ones, can be picked up by a bot. Google has gotten very good at indexing dynamic pages, which used to be a problem for search engine optimization purposes when Google's crawlers weren't so good at it.

Google's crawlers will only crawl the pages that are public, though. There are many ways in which a dynamic page can exist and still not be intended for public view. Maybe one keeps a page private in one's blog because one doesn't want pictures of one's children on the free internets, as an example.

PicScout will go there too, and even if there was an infringement in the page, I have a problem with calling that a "published" page. Google's crawler would not have seen it. No human could have seen it except by ... well, HACKING into the server. Which IS against the law.
Title: Re: How is this different from PicScout?
Post by: Jerry Witt (mcfilms) on May 29, 2012, 12:55:00 PM: I would be curious to learn if any stock agency is pursuing claims against people with their images in a protected directory. I always presumed that Picscout would hunt down the images, pass the relevant "hits" to the stock company. From there the stock company would have an employee actually visit the public-facing page where these images appear, take screenshots and any other notes.

If that is the workflow, I think you would have a hard time making the case that the images were in a private directory if you were publishing them on an HTML page for the world to see. On the other hand, if you are producing a password protected page and the images are in a protected directory, I don't think PicScout is able to spider those images. They are not "hacking" into your system. They are "just" ignoring your robots.txt directive. And unfortunately at this stage, robots.txt is seen as more of a suggestion than an enforceable rule.
Title: Re: How is this different from PicScout?
Post by: SoylentGreen on May 29, 2012, 01:00:24 PM: Thanks for clarifying this, McFilms.
I wasn't going to add any more to the topic, as it seems to have gotten to the "herp-a-derp" stage.
Picscout and Getty et al are not "hacking" into anyone's "servers".

S.G.
Title: Re: How is this different from PicScout?
Post by: Robert Krausankas (BuddhaPi) on May 29, 2012, 01:34:46 PM: Exactly correct. In the nearly 15 years I have been in business, I have ALWAYS put jobs being developed into a password protected directory, and given access to my clients only. By looking through my log files I can confirm no bots, good or bad, have been in there.

Quote from: mcfilms on May 29, 2012, 12:55:00 PM
I would be curious to learn if any stock agency is pursuing claims against people with their images in a protected directory. I always presumed that Picscout would hunt down the images, pass the relevant "hits" to the stock company. From there the stock company would have an employee actually visit the public-facing page where these images appear, take screenshots and any other notes.

If that is the workflow, I think you would have a hard time making the case that the images were in a private directory if you were publishing them on an HTML page for the world to see. On the other hand, if you are producing a password protected page and the images are in a protected directory, I don't think PicScout is able to spider those images. They are not "hacking" into your system. They are "just" ignoring your robots.txt directive. And unfortunately at this stage, robots.txt is seen as more of a suggestion than an enforceable rule.
Title: Re: How is this different from PicScout?
Post by: lucia on May 29, 2012, 01:38:03 PM: Moe--
I don't know if the legal solution you seek is feasible. I think it's likely impractical and prefer to simply focus on what is required to keep a particular bot out of my images. With that in mind, I think there are a few things worth clarifying.

If you want to use technology to keep picscout or the public away from something on your server, you need to learn the difference between robots.txt and .htaccess. robots.txt is sort of like a suggestion. "Bad" robots won't obey it. Some robots won't read it. Some robots aren't programmed to let you tell them "I mean YOU". As far as I am aware, there is no law requiring anything to obey it. (That said, should you ever wish to pursue some legal theory that the bot should not have violated it, I'm sure placing "stay out" language in robots.txt is wise.)
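
To make the "suggestion" point concrete, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleBot` name and the example.com URLs are made up for illustration. It shows the check a well-behaved crawler runs on itself before requesting a URL; nothing on the server side performs this check, so a bot that never asks the question is not stopped by it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking all robots to stay out of /private/.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

def polite_fetch_allowed(agent, url):
    """The question a polite crawler asks itself before requesting url."""
    return parser.can_fetch(agent, url)

# A polite bot would skip the first URL and fetch the second.
print(polite_fetch_allowed("ExampleBot", "http://example.com/private/photo.jpg"))
print(polite_fetch_allowed("ExampleBot", "http://example.com/index.html"))
```

The whole mechanism lives in the crawler's own code, which is why it only works against robots that choose to cooperate.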

.htaccess is completely different from robots.txt: it is more like a lock. (Unfortunately, it can also be a bit difficult to handle and you can lock yourself out too.)

1) If your images are in a directory that is password protected by .htaccess, picscout, other bots and people aren't going to load those unless they discover your password. As far as I am aware, picscout's bot is not programmed to try to hunt down passwords, but getting access by hunting down a password might be illegal. (I think it is in fact.) If you monitored your logs, saved them and picscout tried to guess your password, you might be able to get proof. If you had sufficient proof, maybe you could get a judge to get them for hacking. But I don't think picscout does this.

2) There are other ways using .htaccess to keep picscout out of a directory. You can limit display to only certain IPs (like your own, your company's and so on.) If you do this correctly and thoroughly, picscout isn't going to see those images. If you do it incorrectly, picscout might get in anyway and likely their getting in would not be illegal.

3) If you put images in a .htaccess protected directory and then display them in a publicly visible web page, those images will only display to people who have permission to view the images. Other people and picscout will see broken images. You will rarely want to do this because site visitors will be put off by your broken web page.

4) If you create a password protected (i.e. private) web page which displays images in unprotected directories, people and bots (like picscout) will be able to see the images if they guess the image urls. This may be possible depending on how you have your server set up. It is wise to set things to not display the directory contents for directories with no index.html file. Otherwise, the bot will find the urls and see the images even though your web page is "private". Their viewing these images will not be "hacking" because your making the web page "private" is not the same as making the image "private". (Note: some services may be programmed to let you keep your whole micro-blog somewhat private-- as Facebook does. But that's because the service just applies the same level of privacy to each thing individually.)
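
The IP restriction in point 2 boils down to a yes/no decision the server makes on every request. Here is a rough sketch of that decision in Python using the standard `ipaddress` module. This is not .htaccess syntax, and the networks listed are hypothetical placeholders. It is only meant to show why "correctly and thoroughly" matters: any address that falls inside a range you list gets in, so a range that is too broad lets in visitors you never intended.

```python
import ipaddress

# Hypothetical allowlist: one office range and one single home address.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.17/32"),
]

def is_allowed(client_ip):
    """Return True if client_ip falls inside any allowed network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)

print(is_allowed("192.0.2.44"))    # inside the office range
print(is_allowed("203.0.113.9"))   # not listed anywhere
```

Anything not on the list is refused, which is the opposite of robots.txt, where everything is served unless the visitor decides to stay away.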

If you are worried about picscout, don't imagine that posting pictures on a "private" web page necessarily means the pictures are private. If you really want images to be private you need to make the images private as well. To do this: password protect the images-- not the web page. (Better yet-- password protect both. See above for the problem associated with protecting the images but not the web page.)

As I wrote, given current laws I don't know if it's possible to devise a legal mechanism that will "get" picscout if they load your images. But if it is possible, the method is going to require you to understand the difference between things like "robots.txt" and ".htaccess".
Title: Re: How is this different from PicScout?
Post by: Moe Hacken on May 30, 2012, 10:50:07 AM: Thanks everybody for your insights. Lucia, especially your great summary of options we have to protect our server data from unwelcome intrusions.

I'm trying to think of reasonable ways for a website developer to be able to continue a useful workflow without exposing themselves to the trolls. We have seen, for example, that even commercially licensed Wordpress templates can get a body in trouble.

I guess one practical way to protect oneself during the development stage of a website project is to completely password-protect the directory that the website is in so that only a select group of people, e.g. the clients and designers, can view the project online. In theory the bots won't be able to break into an .htaccess-protected directory. This gives the developer time to research the images well and make the best effort to clear all images (and other content, for that matter) before actually "publishing" the page. That would work, wouldn't it, Lucia?

A lot of the people getting caught in this were in the development stage or had employees or designers make an innocent infringement during development. The question of what constitutes "publishing" is also a gray area, but one can drown in that gray sea. The best protection is, of course, to only use images that are under full licensing and legal control of the responsible parties.
Title: Re: How is this different from PicScout?
Post by: Robert Krausankas (BuddhaPi) on May 30, 2012, 11:09:45 AM: Quote from: Moe Hacken on May 30, 2012, 10:50:07 AM
I'm trying to think of reasonable ways for a website developer to be able to continue a useful workflow without exposing themselves to the trolls.

I guess one practical way to protect oneself during the development stage of a website project is to completely password-protect the directory that the website is in so that only a select group of people, e.g. the clients and designers, can view the project online.

This gives the developer time to research the images well and make the best effort to clear all images (and other content, for that matter) before actually "publishing" the page.

The best protection is, of course, to only use images that are under full licensing and legal control of the responsible parties.

Fully agree with password protecting directories for projects that are under development.

The only time a developer should have to research any images is if the developer is supplying the images, which doesn't always work, we have seen examples of images being properly licensed by the developer, but the trolls still go after the end user.

What I have done in this regard is to educate my clients on the issues at hand, and I require my clients to supply me with images and content, knowing full well what the issues are. If they then choose to grab images from google, it's not my problem. This is clearly stated in my contract, as well as my hosting policy. I also go so far as to include in my contract a "hold harmless" clause, so my clients cannot come back at me for something they supplied to me AFTER it has been explained to them how they should acquire images/content.

You'd be shocked to hear how many of them say "I'll take my chances" or "what are the odds of getting nabbed"... After hearing this a few times is when I added the clause in my contract. Just like here on ELI we can't help everybody, especially those that are thick headed and don't "get it". I always do what is in the best interest of my clients, but sometimes you just have to raise your arms in surrender.
Title: Re: How is this different from PicScout?
Post by: SoylentGreen on May 30, 2012, 11:31:39 AM: Save your time and worry.
Just don't use any products from Getty, Masterfile, or anyone else that uses Picscout.
Saved everyone some time... you're welcome!!

S.G.
Title: Re: How is this different from PicScout?
Post by: Moe Hacken on May 30, 2012, 12:26:09 PM: Quote from: buddhapi on May 30, 2012, 11:09:45 AM
You'd be shocked to hear how many of them say "I'll take my chances" or "what are the odds of getting nabbed"... After hearing this a few times is when I added the clause in my contract. Just like here on ELI we can't help everybody, especially those that are thick headed and don't "get it". I always do what is in the best interest of my clients, but sometimes you just have to raise your arms in surrender.

Buddhapi, I wouldn't be shocked at all. I knew a hippie silkscreen artist in San Diego, CA, who was the ultimate Dead Head and a very talented graphic artist and silkscreener. He made these beautiful tie-dye shirts with the famous Grateful Dead "Steal Your Face" skull on the front, but his shirts had an interesting twist: He had one version with the San Diego Padres script overlapping the top of the skull, and another one with the Chargers Bolts on the skull, kind of like they wear it on the helmets. They were very cool. He approached me at a bar in Ocean Beach and offered to sell me one. I asked him why they didn't have his shirts at the ballpark. He laughed and said they don't want him using the logo, but he felt he had a constitutional right to free commerce, and the MLB is too big to go after a little silkscreener, yada yada. He also said "if the Dead don't care, why should the Padres."

Long story short, I later heard he was arrested inside the concourse of Qualcomm stadium while shamelessly hawking those shirts.

I don't know if it's true that he was arrested, but I can guarantee you I personally saw him on the OUTSIDE of the concourse at Qualcomm stadium, right outside the gate and in plain view of security personnel, totally hawking his shirts like a carnival barker.

My point is that I do believe there is a serious cultural problem with copyright-related issues. There are all kinds of misconceptions, misinformation and outright myths out there, and no shortage of outright defiance of copyright law.

I very much understand the frustration of people whose property is being taken. No one likes to have their stuff stolen and it's just not fair or just or right in any way. Taking an image that belongs to someone else is tantamount to shoplifting with your right-click button.
Title: Re: How is this different from PicScout?
Post by: lucia on May 30, 2012, 04:54:25 PM: Quote from: Moe Hacken on May 30, 2012, 10:50:07 AM
In theory the bots won't be able to break into an .htaccess-protected directory. This gives the developer time to research the images well and make the best effort to clear all images (and other content, for that matter) before actually "publishing" the page. That would work, wouldn't it, Lucia?

It should. Make sure the images are in the same protected directory as the web page. This minimizes the potential for screwups. The default for wordpress is to store the images in their own directory-- and blog posts and pages are protected differently. But you can protect the images-- you just need to remember to do so.

It's best to protect stuff during development for all sorts of reasons. Even aside from copyright-- you probably don't want competitors snooping around. (Ok... I know for lots of sites this is a silly idea. But really, it's best to protect during development.)