The Tangled Web of Link Spam

In my last column, I warned you to “Never watch sausage being made,” lest you find the process so unappetizing you’d never eat it again. But even if you find sausage links tasty, you’ll want to spit out those spam links every time.

Last time, we explored the consequences of content spam, which include bad publicity and getting banned from the search engines. This time around, we’ll explore link spam techniques so you can avoid them or notice when your competitor stoops to them.

Before we do, let’s review why legitimate links are so important to your organic search rankings. Suppose you have a page that you’d love to be the No. 1 result for the search query “digital cameras.” Tens of millions of Web pages contain the words “digital cameras,” with millions of those pages featuring those words in the title. Search engines distinguish the quality of each of these pages by checking how many other pages link to them. Think of each link as a vote for the quality of the content. To get your page ranked No. 1, you’d need to get as many links to your page from as many other high-quality pages as possible.

Links are extremely important in determining search rankings for “digital cameras” and other highly competitive queries. So it’s no surprise that spammers have come up with a bag of tricks to fool search engines about their link strength. Link farms are the most popular technique, so we’ll tackle them first.

Link Farms

A link farm is a spam technique in which spammers set up dozens or hundreds of ersatz sites to be crawled by search engines. Spammers create link farms just so they can put in thousands of links to other sites that they want to boost in search rankings. Search marketers need to be able to tell the difference between link farms and legitimate directories, so they can spend their time soliciting links from real directories, rather than from sites that will do them no good.

Here are a few ways you can spot a link farm:

Links R Us. Each directory category has dozens and dozens of links – more than any visitor could ever use. Your suspicions should grow if the URLs seem to be strings of hyphenated words, if an IP checker reveals that many of those URLs come from the same “C” block (IP addresses that share their first three numbers, which usually means the sites sit on the same network), or if the pages from these sites are all from companies you’ve never heard of and resemble each other.

Odd Lot. The sites linked seem irrelevant to the directory topic or seem like a set of odds and ends with no central idea. You see links about baby care and the petroleum industry on the same page. Link farms are often thrown together haphazardly, most often by automated programs that spew the links onto pages with no rhyme or reason. A cousin of a link farm, a “free for all” site, allows anyone to post a link on any topic. It’s similarly worthless for improving your search rankings.

Dollar Store. None of the links seem very valuable. They consist of pages with nothing but advertisements, or content that makes no sense. Don’t be fooled if these pages have high Google PageRank values shown in the Google toolbar. Some spammers can artificially inflate a site’s PageRank for a while, but Google eventually catches on and adjusts the value.

Before requesting a link to your site from a directory, look it over to see if it exhibits the tricky business listed above. If it does, it’s probably a link farm. Search engines recognize more and more link farms every day. When they do, they stop counting those links toward a page’s ranking, so there’s no point in you getting your site listed there.
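
If you’d rather not compare IP addresses by eye, the “C” block check described above can be roughed out in a few lines of Python. This is a minimal sketch, not a link farm detector. It assumes you have already looked up each site’s IPv4 address (with an IP checker, for example). The function names group_by_c_block and suspicious_blocks and the threshold of three sites are my own illustrative choices, and the host names and addresses are placeholders drawn from ranges reserved for documentation.

```python
import ipaddress
from collections import defaultdict


def group_by_c_block(host_ips):
    """Group host names by the /24 network ("C" block) of their IPv4 address."""
    blocks = defaultdict(list)
    for host, ip in host_ips.items():
        # strict=False lets us pass a single host address and get back
        # the /24 network that contains it (same first three numbers).
        network = ipaddress.ip_network(f"{ip}/24", strict=False)
        blocks[str(network)].append(host)
    return dict(blocks)


def suspicious_blocks(host_ips, threshold=3):
    """Return only the blocks shared by at least `threshold` hosts.

    The threshold is an arbitrary assumption; tune it to taste.
    """
    return {
        block: sorted(hosts)
        for block, hosts in group_by_c_block(host_ips).items()
        if len(hosts) >= threshold
    }


# Hypothetical sites listed in one directory category.
directory_links = {
    "cheap-digital-cameras.example": "192.0.2.10",
    "best-baby-care-tips.example": "192.0.2.77",
    "petroleum-industry-news.example": "192.0.2.200",
    "real-camera-shop.example": "198.51.100.5",
}

print(suspicious_blocks(directory_links))
```

A shared block is only a hint, because plenty of legitimate sites share a hosting company. Treat it as one more reason to look harder at the directory, alongside the other warning signs above.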

More Spammy Links

Although link farming is the most prevalent tactic for link spam, many other tricky techniques abound:

Hidden links. In my last column, we discussed hidden text, a spam technique that hides words from people but shows them to the search engines. Spammers hide links the same way, for example by overlaying the links with other content. That lets them boost the search rankings of their pages with hundreds or thousands of invisible links.

Blog and guest book spamming. Some spammers use programs to automatically add links to blog comments and trackbacks, or to guest books. Most sites have eliminated guest books in response. Many bloggers now block readers from posting comments, or they approve each comment and trackback manually.

Tricky two-way links. Some spammers try to trick you instead of the search engines. When people agree to trade links with you (linking to your site if you in turn link to theirs), make sure they are playing fair. Some spammers add the link to your site, but code that link using JavaScript to hide the link back to you from the search engines. So you see the link back to your site, but the search engines don’t. Why do spammers go to all that trouble? Because the search engines believe that you’ve added a far-more-valuable one-way link to the spammer’s site. Check out the linking site with JavaScript turned off to make sure the search engines see the link back to you.
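
Turning off JavaScript in your browser is the simplest way to run the two-way link check above, but the same idea can be sketched in Python. The sketch below reads a page’s raw HTML without running any scripts and lists the ordinary links that point at your domain. The class name LinkCollector, the function name static_links_to, and the sample pages are hypothetical, and I am assuming that a link written out by a script is one the search engines will not see, as described above.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag present in the raw HTML.

    HTMLParser treats the inside of <script> as plain text, so a link
    that only appears inside JavaScript is never reported here.
    """

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def static_links_to(html, my_domain):
    """Return the plain (script-free) links in `html` that point at `my_domain`."""
    collector = LinkCollector()
    collector.feed(html)
    found = []
    for href in collector.hrefs:
        host = urlparse(href).hostname or ""
        if host == my_domain or host.endswith("." + my_domain):
            found.append(href)
    return found


# Hypothetical partner pages: one links back honestly, one uses a script.
honest_page = '<p><a href="https://www.example.com/cameras">Cameras</a></p>'
tricky_page = (
    "<p><script>"
    "document.write('<a href=\"https://www.example.com/cameras\">Cameras</a>');"
    "</script></p>"
)

print(static_links_to(honest_page, "example.com"))
print(static_links_to(tricky_page, "example.com"))
```

An empty list for a page that shows you a link in the browser is the cue to ask your link partner some pointed questions.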

Paid links, where a site sells links to other sites, are not strictly a spam technique, but search engines are not big fans of them. Search engines ask that those links be tagged with a “nofollow” attribute, telling the search engines that these links are not unbiased votes for the quality of the content. My advice is that paying for links is fine, but you should do so for the traffic only. Pay for a link when the visitors who click on that link are worth the cost. (This is exactly the same calculation you make with paid placement ads.) Search engines work harder and harder each year to recognize paid links and to devalue them, so I don’t recommend buying links to improve your search rankings.
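
The “worth the cost” calculation above is simple arithmetic, and a short Python sketch makes it concrete. Everything here is an assumption for illustration: the function names, the idea of valuing visitors by conversion rate times profit per sale, and the sample figures, which are made up rather than taken from any real campaign.

```python
def monthly_link_profit(clicks, conversion_rate, profit_per_sale):
    """Estimate the profit the link's visitors bring in per month."""
    return clicks * conversion_rate * profit_per_sale


def link_is_worth_it(monthly_cost, clicks, conversion_rate, profit_per_sale):
    """True when the visitors are worth more than the link costs."""
    return monthly_link_profit(clicks, conversion_rate, profit_per_sale) > monthly_cost


# Hypothetical numbers: a $300-a-month link sending 500 visitors,
# 2 percent of whom buy, at $40 profit per sale.
print(link_is_worth_it(monthly_cost=300, clicks=500,
                       conversion_rate=0.02, profit_per_sale=40))
```

Note that search rankings appear nowhere in the formula. If the link pays for itself on traffic alone, any ranking benefit is a bonus you should not count on.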

This wraps up our three-part series on spam. If your site has been banned or penalized for using these techniques, you can clean up your site and request reinstatement, which is usually granted (although reinstatement sometimes requires an extended period of explanation and begging).

MIKE MORAN is an IBM Distinguished Engineer and product manager for IBM’s OmniFind search product. His books (Search Engine Marketing, Inc. and Do It Wrong Quickly) and his Biznology blog are found at