
Scraping The Scrapers With Feed Footer WordPress Plugin

written by John Chow on July 17th, 2007

This is an update about Ankesh Kothari’s RSS Feed Footer WordPress plugin. The plugin allows you to add messages under all your new blog posts in your RSS feed. While the number one use of the plugin would be to sell RSS ads, I’ve found another great use for it – sending a message to all those lazy asses who try to scrape my RSS feed.

Because I offer a full RSS feed, it’s very simple for a scraper to take the feed and reproduce the information on another site. Scrapers are really the lowest form of Internet life. They don’t want to put in the work it takes to build a successful site, so they just steal the work of others by scraping their feeds. There are even get-rich-quick scams that sell this type of service.

To help prevent some of the damage scrapers do to my blog, I’ve used the Feed Footer plugin to add this message at the end of the post:

Attention: Unless you are reading this from a RSS reader, you are reading a scraped feed. This site has violated copyright laws by stealing the content of John Chow dot Com. Please let us know where you read this so we can take legal action against the scraper.

The above message only shows up in the RSS feed (subscribe to it if you want to see it) – you will never see it on this blog. However, that message will show up on a scraper blog because scrapers take their content from the RSS feed. Once the scraper reads the above message in every new post, he’ll think twice about keeping the feed up. If he keeps the feed up, hopefully someone reading the message will alert me so I can take more action against the scraper.
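
For anyone curious how a feed-only footer works in principle, here is a minimal sketch in Python. It is not the Feed Footer plugin's actual code (the plugin itself runs inside WordPress), and the function name `add_feed_footer` and the `is_feed` flag are made up for illustration. The idea is simply that the footer gets appended when the post is rendered for the feed and left off when it is rendered for the blog.

```python
# Hypothetical sketch: NOT the Feed Footer plugin's real code.
# The footer text is the message quoted in the post above.
FOOTER = (
    "Attention: Unless you are reading this from a RSS reader, "
    "you are reading a scraped feed. This site has violated copyright laws "
    "by stealing the content of John Chow dot Com. Please let us know where "
    "you read this so we can take legal action against the scraper."
)


def add_feed_footer(post_html: str, is_feed: bool, footer: str = FOOTER) -> str:
    """Return the post content, appending the footer only for feed output."""
    if not is_feed:
        # Regular blog page: visitors never see the notice.
        return post_html
    # Feed output: the notice rides along with every item, so anyone
    # republishing the feed verbatim republishes the notice too.
    return f"{post_html}\n<p>{footer}</p>"


if __name__ == "__main__":
    print(add_feed_footer("<p>Today's post.</p>", is_feed=True))
```

Calling it with `is_feed=False` returns the post untouched, which is why the message never appears on the blog itself.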

You can download the Feed Footer plugin here or read more information about the plugin here. Ankesh did a great job on this plugin and I sent him a pitcher of beer for writing it.

  1. Burn in hell scrapers! :twisted:

  2. That is a great idea John, I actually would have not thought to do that. Thanks

  3. Scrapers really are a pain in the ass. They are exactly what John says they are and yet you see them being sold all the time. People just buy them and think they are going to cash in with no work involved. Good luck with that. :roll:

  4. Dave

    I actually welcome scrapers (sometimes). They helped me build massive inlinks, launching my site to high rankings. The internal linking within posts carries over to the scraper sites, creating more inbound links.

    • The problem comes when they start passing off your work as their own. I don’t mind people linking to my work, but when they simply copy my site onto theirs, I stop being thankful for the few links I get out of it.

    • Some scrapers even take it a step further by removing all internal links in a post.

      So they steal your content, then remove all links that point back to you, so you get no benefit and there is no way for their readers to find out where the content came from in the first place.

    • Yeah, unfortunately it’s pretty easy to remove links with programming. Scraping a page/feed is so easy, and I wish there was some way to stop it. I know there are some services out there that will proactively search out scraped content for you, but have never used any of them.

  5. You might want to vary that message every once in a while, since the scraper could just write a line or two of code to take that out…

    • Hmm, good point

      they’re clever asses

    • Not only that, but I have seen scraper pages take only the first part of the post in the RSS feed.

      The footer would never make it into the post in the first place.

      Maybe alternating it from the header to the footer and a random place after a page break in the middle… although that might be overkill.

  6. Thanks for another great plugin, John. I wonder how many plugins I can install in WordPress before it starts affecting performance? :)

    Scrapers aren’t necessarily a bad thing. The plugin is useful because you can include your web site URL which would help your rank in Google and Technorati.

  7. This is one plugin that I really don’t have any interest in. Not a fan of scrapers.

  8. I got scraped once by a Christian Promotion Group, they demanded that I’d blog better or else I’d “Live In Sin” for the rest of eternity or something.

    Read more about it here

  9. Dev

    That’s a very nifty way to protect your IP, John :razz:

  10. Awesome tool! Look who’s laughing now!! :twisted:

  11. I don’t think this plugin will help to a major extent. You see, scrapers use software to pull content from your blog via your RSS feeds. However, this software can be programmed to ignore links or only take the first, say, 50 words of the blog entry. In that case, it will totally ignore the links in the footer of the feed.

  12. I don’t think this plugin will help to a major extent. You see, scrapers use software to pull content from your blog via your RSS feeds. However, this software can be programmed to ignore links or only take the first, say, 50 words of the blog entry. In that case, it will totally ignore the links in the footer of the feed.

  13. The web is always changing, so you just gotta wonder how long it will be before they come up with something to combat this too!

    But in the meantime, get yer own content! :)

  14. And to check if someone already stole your content just enter :

    “Attention: Unless you are reading this from a RSS reader, you are reading a scraped feed. This site has violated copyright laws by stealing the content of John Chow dot Com. Please let us know where you read this so we can take legal action against the scraper.”

    in google.

    :mrgreen: :mrgreen:

  15. Oz

    Hey, good tip. Might I suggest that you put an auto-generated alphanumeric bit-o-text in the middle (or somewhere) in that statement so that it is different every time? That way the spammers can’t just regex-out that statement…

    If there was a “49djhsdj5” somewhere, they wouldn’t be able to remove it as the regex will fail…

    …or something like that!

  16. I’ve found scraper software before. It’s ridiculous.

  17. Dani

    Found you through the Blogger’s Choice Awards website. Seems you’re in the running for Best Blog about Blogging. Looks like you put out a lot of good info. You should totally add a Brag Badge code. So your thankful subscribers can vote for this site. Good luck! :smile:

  18. Thanks John, I’ve used this plugin. I put my copyright notice in the feed footer. That really helps to stop the scrapes, or at least adds a backlink to my blog :mrgreen:

  19. This is interesting. I’ve seen a few blogs that I read add this and wondered about it. For the most part I’m a knit blogger (there are thousands of us!) and many have problems with scrapers stealing copyrighted content either words, knitting patterns or images.
    Thanks for the heads up. I kept forgetting to ask fellow knit bloggers about it.

  20. Good one John :) the footer msg is really good :wink:

  21. Awesome post. Thanks for the advice.
    Too bad the internet has crashed.
    Damn scrapers.

  22. Don’t Text Link Ads have a similar plugin for their feedvertising feature? Would it conflict with the aforementioned plugin if both are installed?

    FWIW, I use a cool plugin called ©Feed which adds a digital fingerprint and the logged IP address of a possible scraper in the footer of the feed. It also aggregates searches for the fingerprint ID you use from various search engines, so you can find out pretty quickly if someone is scraping your content. Originally written in German, there’s an English version that I managed to get working. I LOVE it.

    http://wordpress.org/extend/plugins/copyfeed/

    :mrgreen:

    • I will certainly try out that plugin!
      Thanks for mentioning it!

    • This is no different from companies trying to DRM stuff and then people breaking them the next day. As smart as you think you are, believe me, the people in the black hat circles are smarter.

      The good scrapers will use multiple proxies with different IPs, thereby defeating the purpose of a digital signature and footprint.

  23. I don’t understand why you care so much about scrapers.
    So yeah, they steal your content, but in nearly every post you make you have a few links to other posts you made, so you get some links out of it.
    Plus, you have a few posts with affiliate links, so more people see your affiliate links.

    I only wish my blog was scraped more often, I can use it in so many ways.

  24. Still no time to try out this plugin

  25. I think, the readers of your feed won’t like it, if this message appears below every post.

  26. When you first mentioned the plugin, it seemed useless to me, but now I understand how it can be used.

  27. Sounds like a good idea but probably doesn’t deter anyone from scraping your site.

    Why?

    Because most scraper sites contain 3 sets of google ads above the fold so I’m sure no one would even see your disclaimer.

    What I think it does do is make your readers feel like they’ve done something wrong because you are making them read your “Rant” – even if they are reading it from their rss reader you are totally distracting people from your purpose by trying to police a few.

    I think it’s a bad idea.

  28. i’m getting some backlinks from some of these weird sites that seem to have nicked loads of content.

    now i understand it, they’ve taken it from the rss feeds. i’m gonna install this plug in right away. thanks for this article.

  29. Scrapers are the lowest form of life. How true!!! John, you are a genius in coming up with this term to describe these people

  30. That’s not going to do anything for anyone with even a little bit of regex knowledge.

    For example, in php

    preg_replace('/Attention:(.*)scraper\./', '', $string) will remove that quite easily.

    You’re wasting your time, slowing down your rss feed (if only by a little) and adding in random characters isn’t going to help either.

    Leslie: john didn’t come up with the term. It’s been around for ages.

  31. nice trick to prevent people from copying your content for their blog

  32. You are a genius!
    The scraper will be scraped by himself!
    A nice idea!
    I will install the plugin on my blog. How can I?

    Saiful’s last blog post: Secrets to Online Money Vault using Blogger Blog
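
Two objections come up repeatedly in the comments above: a scraper can strip the notice with a regex even if a random token is added to it, and a scraper that republishes only the first 50 or so words never picks up the footer at all. The Python sketch below illustrates both points under stated assumptions. The helper names (`make_notice`, `strip_notice`, `first_words`) are hypothetical, the bracketed token format is invented for the example, and the pattern is a non-greedy variant of the PHP one quoted in comment 30, not a copy of any real scraper's code.

```python
import re
import secrets


def make_notice(token: str) -> str:
    """Build the footer notice with a random token in the middle (comment 15's idea)."""
    return (
        "Attention: Unless you are reading this from a RSS reader, "
        f"you are reading a scraped feed. [{token}] "
        "This site has violated copyright laws by stealing the content of "
        "John Chow dot Com. Please let us know where you read this so we "
        "can take legal action against the scraper."
    )


# The pattern anchors only on the fixed opening word and the fixed closing
# phrase, so whatever sits in between (including a random token) is matched.
NOTICE_PATTERN = re.compile(r"Attention:.*?scraper\.", re.DOTALL)


def strip_notice(text: str) -> str:
    """Remove the notice the way a scraper with a little regex knowledge might."""
    return NOTICE_PATTERN.sub("", text)


def first_words(text: str, limit: int) -> str:
    """Keep only the first `limit` words, as excerpt-only scrapers do."""
    return " ".join(text.split()[:limit])


if __name__ == "__main__":
    notice = make_notice(secrets.token_hex(4))  # different token on every run
    print(repr(strip_notice("Body text. " + notice)))
    long_post = " ".join(["word"] * 60) + " " + notice
    print("Attention:" in first_words(long_post, 50))
```

In this sketch the random token does not save the notice, because the pattern never looks at the middle of the message. Varying the opening and closing words would make the regex harder to write, but it would not help against a scraper that only takes an excerpt.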

Trackbacks

  1. Getting Rich By Stealing Contents | Charles Lau dot Com - July 17, 2007 at 12:56 pm
  2.   Formas de Combatir el Robo de Contenido por ProWeblogs - July 18, 2007 at 7:57 am
  3. Blog pirates aren’t anything new | WinExtra - July 31, 2007 at 3:03 pm
  4. Blog pirates aren’t anything new — Shooting at Bubbles - May 30, 2009 at 8:03 pm
