John Chow dot Com Online Investment Review - Make Money Online Investing in Businesses
 

Scraping The Scrapers With Feed Footer WordPress Plugin

written by John Chow on July 17, 2007

This is an update about Ankesh Kothari’s RSS Feed Footer WordPress plugin. The plugin allows you to add messages under all your new blog posts in your RSS feed. While the number one use of the plugin would be to sell RSS ads, I’ve found another great use for it – sending a message to all those lazy asses who try to scrape my RSS feed.

Because I offer a full feed RSS, it’s very simple for a scraper to take the feed and reproduce the information onto another site. Scrapers are really the lowest form of Internet life. They don’t want to put in the work it takes to build a successful site so they just steal the works of others by scraping their feeds. There are even get rich quick scams that will sell this type of service.

To help prevent some of the damage scrapers do to my blog, I’ve used the Feed Footer plugin to add this message at the end of the post:

Attention: Unless you are reading this from a RSS reader, you are reading a scraped feed. This site has violated copyright laws by stealing the content of John Chow dot Com. Please let us know where you read this so we can take legal action against the scraper.

The above message only shows up in the RSS feed (subscribe to it if you want to see it) – you will never see it on this blog. However, that message will show up on a scraper blog because they steal content via RSS. Once the scraper reads the above message in every new post, he’ll think twice about keeping the feed up. If he keeps the feed up, hopefully someone reading the message will alert me so I can take more action against the scraper.
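
The mechanism is simple enough to sketch. The snippet below is a minimal illustration in Python, not the plugin's actual code (the real thing is a WordPress plugin), and the names `FOOTER`, `render_for_blog`, and `render_for_feed` are made up for the example. The only point is that the notice is appended on the feed path and never on the site path, so it travels with any copy taken from the feed.

```python
# Hypothetical sketch: the notice is added only when content is rendered for the feed.
FOOTER = (
    "Attention: Unless you are reading this from a RSS reader, "
    "you are reading a scraped feed."
)

def render_for_blog(post_html: str) -> str:
    """Content as shown on the blog itself: no footer."""
    return post_html

def render_for_feed(post_html: str, footer: str = FOOTER) -> str:
    """Content as delivered in the RSS feed: the footer follows every item."""
    return f"{post_html}\n<p>{footer}</p>"

post = "<p>Today's post.</p>"
blog_version = render_for_blog(post)   # what visitors to the site see
feed_version = render_for_feed(post)   # what subscribers (and scrapers) receive
```

Anyone republishing `feed_version` wholesale republishes the notice along with it.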

You can download the Feed Footer plugin here or read more information about the plugin here. Ankesh did a great job on this plugin and I sent him a pitcher of beer for writing it.

{ 56 comments }

Stephen July 17, 2007 at 11:47 am

Burn in hell scrapers! :twisted:

Kevin July 17, 2007 at 12:39 pm

Amen to that. Even though some of them link back to your blog, which helps with page ranking and bringing in readers, most of them don’t, which annoys bloggers very much.

Debo Hobo July 17, 2007 at 1:15 pm

Outstanding! Stop them in their tracks. It would be pretty stupid for someone to do that now. Besides, your work is so distinctive. Why would they think they can just put their name on it and expect to be believed? They are idiots.

a video a day July 17, 2007 at 1:48 pm

the fact is that it’s bad for your seo if they scraped your contents

simon July 17, 2007 at 5:00 pm

If they don’t even give a backlink, then it’s very, very bad.

lionstarr July 18, 2007 at 3:19 am

It’s not only bad for SEO, it’s simply not OK. It’s your content, you took the time to write it, so it’s not OK if anybody simply scrapes it!

Ashwin July 17, 2007 at 1:39 pm

Damn Straight!!! You tell em man!!

Real Estate Editorial July 17, 2007 at 11:53 am

That is a great idea John, I actually would have not thought to do that. Thanks

a video a day July 17, 2007 at 1:49 pm

well, no one would steal my content, as there is none if you know what i mean?

web development blog July 17, 2007 at 9:43 pm

Yeah, thanks for the idea John. That’s a great way to use the plugin.

ouchs July 17, 2007 at 11:53 am

useful plugin!

Freebies July 17, 2007 at 12:01 pm

Scrapers really are a pain in the ass. They are exactly what John says they are and yet you see them being sold all the time. People just buy them and think they are going to cash in with no work involved. Good luck with that. :roll:

Bloggeries July 17, 2007 at 2:12 pm

Agreed. It’s also funny how a lot of these people seem to wonder why they aren’t getting any traffic… Maybe it’s because you have NOTHING original? :idea:

webd360 July 17, 2007 at 5:30 pm

Yea, I love when they ask questions like that, another popular one being “Why can’t I make any money with adsense?”

Dave July 17, 2007 at 12:17 pm

I actually welcome scrapers (sometimes). They helped me build massive inlinks, launching my site to high rankings. The internal linking within posts carries over to the scraper sites, creating more inbound links.

Freebies July 17, 2007 at 12:31 pm

The problem comes when they start passing off your work as their own. I don’t mind people linking to my work, but when they simply copy my site onto theirs, I stop being thankful for the few links I get out of it.

Steven Smethurst July 17, 2007 at 4:06 pm

Some scrapers even take it a step further by removing all internal links in a post.

So they steal your content, then remove all links that point back to you, so you get no benefit and there is no way for their readers to find out where the content came from in the first place.

web development blog July 17, 2007 at 9:46 pm

Yeah, unfortunately it’s pretty easy to remove links with programming. Scraping a page/feed is so easy, and I wish there was some way to stop it. I know there are some services out there that will proactively search out scraped content for you, but have never used any of them.

Matt July 17, 2007 at 12:30 pm

You might want to vary that message every once in a while, since the scraper could just write a line or 2 of code to take that out…

a video a day July 17, 2007 at 1:50 pm

hmm,good point

they’re clever asses

Steven Smethurst July 17, 2007 at 4:10 pm

Not only that, but I have seen scraper pages take only the first part of the post in the RSS feed.

The footer would never make it into the post in the first place.

Maybe alternate it between the header, the footer, and a random place after a page break in the middle… although that might be overkill.

Johan Cyprich July 17, 2007 at 12:32 pm

Thanks for another great plugin, John. I wonder how many plugins I can install in WordPress before it starts affecting performance? :)

Scrapers aren’t necessarily a bad thing. The plugin is useful because you can include your web site URL which would help your rank in Google and Technorati.

a video a day July 17, 2007 at 1:52 pm

johan, a very good point indeed

i think john should write a post about plugins slowing blogs down

duplicate contents are hated by google

bob cobb July 17, 2007 at 12:33 pm

This is one plugin that I really don’t have any interest in. Not a fan of scrapers.

pressmicro.com July 17, 2007 at 12:37 pm

Don’t you mean you ARE interested in this plugin if you are not a fan of scrapers?

Rhys July 17, 2007 at 12:40 pm

I got scraped once by a Christian Promotion Group. They demanded that I blog better or else I’d “Live In Sin” for the rest of eternity or something.

Read more about it here

Steven Smethurst July 17, 2007 at 4:13 pm

LOL, that’s funny stuff.
Damn the heathen christian

Dev July 17, 2007 at 2:04 pm

That’s a very nifty way to protect your IP John :razz:

Bloggeries July 17, 2007 at 2:10 pm

Awesome tool! Look who’s laughing now!! :twisted:

yuga July 17, 2007 at 2:53 pm

I don’t think this plugin will help to a major extent. You see, scrapers use software to pull content from your blog via your RSS feeds. However, this software can be programmed to ignore links or to take only the first, say, 50 words of the blog entry. In that case, it will completely miss the links in the footer of the feed.
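
To make that concrete, here is a rough Python sketch of the kind of truncation described in this comment. The function name `first_words` and the sample post are assumptions for illustration; they are not taken from any real scraper tool.

```python
def first_words(text: str, limit: int = 50) -> str:
    """Keep only the first `limit` whitespace-separated words of a feed item."""
    return " ".join(text.split()[:limit])

# Stand-in for a 200-word post followed by a footer notice.
body = " ".join(f"word{i}" for i in range(200))
footer = "Attention: you are reading a scraped feed."
feed_item = f"{body} {footer}"

# The excerpt stops long before the footer is reached.
excerpt = first_words(feed_item)
```

A footer at the very end of the item only survives if the scraper copies the whole item.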

cooliojones July 17, 2007 at 4:11 pm

The web is always changing, so you just gotta wonder how long it will be before they come up with something to combat this too!

But in the meantime, get yer own content! :)

simon July 17, 2007 at 5:15 pm

get yer own content!
That’s right :smile:

Zlatan July 17, 2007 at 4:15 pm

And to check if someone already stole your content just enter :

“Attention: Unless you are reading this from a RSS reader, you are reading a scraped feed. This site has violated copyright laws by stealing the content of John Chow dot Com. Please let us know where you read this so we can take legal action against the scraper.”

in google.

:mrgreen: :mrgreen:

Oz July 17, 2007 at 4:24 pm

Hey, good tip. Might I suggest that you put an auto-generated alphanumeric bit-o-text in the middle of that statement (or somewhere in it) so that it is different every time? That way the spammers can’t just regex-out that statement…

If there was a “49djhsdj5” somewhere, they wouldn’t be able to remove it as the regex will fail…

…or something like that!

web development blog July 17, 2007 at 9:51 pm

You’d still be able to make a regex to match the statement most of the time:

"~Attention:.*scraper\.~"
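
A quick way to check this exchange is to try it. The sketch below is in Python rather than PHP, and `footer_with_token` is a hypothetical helper that drops a random token into the middle of the notice, as suggested above. The pattern is a lazy variant of the one in this reply: it matches everything from "Attention:" to the first "scraper.", whatever sits in between.

```python
import re
import secrets

def footer_with_token() -> str:
    """Footer notice with a random hex token in the middle (hypothetical helper)."""
    token = secrets.token_hex(4)  # 8 random hex characters, different on each call
    return (
        "Attention: Unless you are reading this from a RSS reader, "
        f"you are reading a scraped feed. [{token}] "
        "Please let us know where you read this so we can take "
        "legal action against the scraper."
    )

# Lazy match from "Attention:" up to the first literal "scraper."
PATTERN = re.compile(r"Attention:.*?scraper\.", re.DOTALL)

item = "<p>Post body.</p>\n" + footer_with_token()
cleaned = PATTERN.sub("", item)  # the notice is gone, token and all
```

The token changes the text but not its start and end, so a pattern anchored on those two phrases still removes it. Changing the opening and closing words as well would make this particular pattern miss.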

Dog Information July 17, 2007 at 4:24 pm

I’ve found scraper software before. It’s ridiculous.

Dani July 17, 2007 at 4:31 pm

Found you through the Blogger’s Choice Awards website. Seems you’re in the running for Best Blog about Blogging. Looks like you put out a lot of good info. You should totally add a Brag Badge code. So your thankful subscribers can vote for this site. Good luck! :smile:

simon July 17, 2007 at 4:57 pm

Thanks John, I’ve used this plugin. I put my copyright notice in the feed footer. That really helps to stop the scrapers, or at least adds a backlink to my blog :mrgreen:

Carla July 17, 2007 at 5:22 pm

This is interesting. I’ve seen a few blogs that I read add this and wondered about it. For the most part I’m a knit blogger (there are thousands of us!) and many have problems with scrapers stealing copyrighted content, whether words, knitting patterns, or images.
Thanks for the heads up. I kept forgetting to ask fellow knit bloggers about it.

Sagar July 17, 2007 at 9:02 pm

Good one John :) the footer msg is really good :wink:

Willie July 17, 2007 at 11:12 pm

Awesome post. Thanks for the advice.
Too bad the internet has crashed.
Damn scrapers.

Lincoln July 18, 2007 at 12:15 am

Don’t Text Link Ads have a similar plugin for their feedvertising feature? Would it conflict with the aforementioned plugin if both are installed?

FWIW, I use a cool plugin called ©Feed which adds a digital fingerprint and the logged IP address of a possible scraper in the footer of the feed. It also aggregates searches for the fingerprint ID you use from various search engines, so you can find out pretty quickly if someone is scraping your content. Originally written in German, there’s an English version that I managed to get working. I LOVE it.

http://wordpress.org/extend/plugins/copyfeed/

:mrgreen:
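
A fingerprint of the kind described in this comment could be produced roughly as follows. This is a Python sketch under assumptions, not ©Feed's actual code: `SECRET_KEY`, `fingerprint`, and `feed_footer` are invented names, and a keyed hash of the post ID and requester IP is just one plausible way to build such an ID.

```python
import hashlib
import hmac

SECRET_KEY = b"change-me"  # hypothetical secret known only to the blog owner

def fingerprint(post_id: int, requester_ip: str) -> str:
    """Short keyed hash tying a feed copy to the post and the requesting IP."""
    message = f"{post_id}:{requester_ip}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()[:16]

def feed_footer(post_id: int, requester_ip: str) -> str:
    """Footer line carrying the fingerprint, easy to search for later."""
    return f"Feed copy ID: {fingerprint(post_id, requester_ip)}"

# Two different requesters get two different IDs for the same post.
footer_a = feed_footer(42, "203.0.113.5")
footer_b = feed_footer(42, "198.51.100.7")
```

Searching for a given ID would then point back to the request that fetched that copy of the feed, provided the scraper left the footer in place.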

lionstarr July 18, 2007 at 3:28 am

I will certainly try out that plugin!
Thanks for mentioning it!

Tylerdurden July 18, 2007 at 9:43 pm

This is no different from companies trying to DRM stuff and then people breaking them the next day. As smart as you think you are, believe me, the people in the black hat circles are smarter.

The good scrapers will use multiple proxies with different IPs, thereby defeating the purpose of a digital signature and footprint.

Mike Zak July 18, 2007 at 1:03 am

I don’t understand why you care so much about scrapers.
So yeah, they steal your content, but in nearly every post you make you have a few links to other posts you made, so you get some links out of it.
Plus, you have a few posts with affiliate links, so more people see your affiliate links.

I only wish my blog was scraped more often, I can use it in so many ways.

lionstarr July 18, 2007 at 3:26 am

I have no problem with people scraping, if they tell me about it and scrape only the first x words, with a link to the full article following!

Mybloggo July 18, 2007 at 2:31 am

Still no time to try out this plugin.

lionstarr July 18, 2007 at 3:24 am

I think the readers of your feed won’t like it if this message appears below every post.

sandossu July 18, 2007 at 4:55 am

When you first mentioned the plugin, it seemed useless to me, but now I understand how it can be used.

SEO Web Design July 18, 2007 at 7:14 am

Sounds like a good idea but probably doesn’t deter anyone from scraping your site.

Why?

Because most scraper sites contain 3 sets of google ads above the fold so I’m sure no one would even see your disclaimer.

What I think it does do is make your readers feel like they’ve done something wrong because you are making them read your “Rant” – even if they are reading it from their rss reader you are totally distracting people from your purpose by trying to police a few.

I think it’s a bad idea.

property investment blog July 18, 2007 at 8:14 am

i’m getting some backlinks from some of these weird sites that seem to have just nicked loads of content.

now i understand it, they’ve taken it from the rss feeds. i’m gonna install this plugin right away. thanks for this article.

leslie July 18, 2007 at 9:30 am

Scrapers are the lowest form of life. How true!!! John, you are a genius in coming up with this term to describe these people

Tylerdurden July 18, 2007 at 9:38 pm

That’s not going to do anything for anyone with even a little bit of regex knowledge.

For example, in php

preg_replace('/Attention:(.*)scraper\./', '', $string) will remove that quite easily.

You’re wasting your time, slowing down your rss feed (if only by a little) and adding in random characters isn’t going to help either.

Leslie: john didn’t come up with the term. It’s been around for ages.

Wahlau.NET July 19, 2007 at 2:37 am

nice trick to prevent people from copying your content for their blog

Saiful June 6, 2009 at 4:32 am

You are a genius!
The scraper will be scraped by himself!
A nice idea!
I will install the plugin on my blog. How can I?