• All Things SEM / SEO Blog

SEO Theories and Research

Many search engine optimization (SEO) sites tell you what to do to optimize your site with such conviction that you assume their theories rest on solid analysis. That isn’t always the case: many theories lack enough supporting evidence to be considered fact, and others turn out to be outright myths. This section covers some of these theories and myths along with actual empirical research.

10 Graphs Reveal Web Spam Patterns

They say a picture is worth a thousand words. So here are 10,000 words of web spam data from a research paper titled Detecting Spam Web Pages through Content Analysis by Alexandros Ntoulas et al.

Detecting Cloaking Algorithmically Is Not Easy

Recently Matt Cutts posted a question on his blog asking what people thought Google's Web Spam team should focus on next. Mixed in among the answers were requests to eliminate cloaking. Some even went so far as to list offending sites. What interests me is this: since cloaking isn't new, is there something tricky about detecting it that has kept Google from eliminating it from its results?

Detecting Splogs with Self-Similarity Analysis

A while back Google's search results were littered with splogs (spam blogs). It was common to search for a term, click on the first result, and land on a page that had advertising fill the area above the fold and useless content below. I'm not sure when, but the problem must have reached critical mass because Google cleaned up the results. And as with most things Google, the cleanup was very likely algorithmic and automated. Ever wonder how they might have accomplished this feat? A research paper from May 8, 2007 titled Splog Detection Using Self-Similarity Analysis on Blog Temporal Dynamics may be the answer.

Using Link Structures to Classify Web Spam

In an earlier post I summarized content from a research paper that provided a Web Spam Taxonomy. That paper is a few years old, but I believe it still provides a good foundation for discussions regarding web spam. In this post, I'm going to walk through a document titled Improving Web Spam Classifiers Using Link Structure written by Qingqing Gan and Torsten Suel of the CIS Department at Polytechnic University in Brooklyn, NY. In the world of information retrieval research, this paper is quite current, having been published in May 2007.

Web Spam Taxonomy

I came across an interesting research paper the other day titled Web Spam Taxonomy. How could I resist that title!? The paper was written by Zoltan Gyongyi and Hector Garcia-Molina while in the Computer Science Department of Stanford University. The authors also acknowledge many discussions with an anonymous collaborator at a major search engine as a source of information.

Is BrowseRank The New PageRank?

I've been on a research paper reading binge recently. I've got about 10 or so under my belt now in just the last couple of weeks. I've discovered they make great reading on my train ride to work. Relatively short and to the point. Sure they're often full of crazy math formulas, but those are easy to gloss over and instead concentrate on the discussion. Many of the papers were written years ago. Despite their age, the information is... ummm you know... informative. I mean that. My most recent reading, BrowseRank: Letting Web Users Vote for Page Importance, is actually from 2008 which makes it both informative and relevant to future SEO efforts.

Statistically Speaking, That Page Is Spam

In a previous post I covered a Microsoft Research paper that discussed how static factors could be used to improve search ranking results above and beyond what PageRank alone could do. Rolled together, these factors formed what the authors of the paper called fRank to measure the quality of a web page. In this post I'm going to cover another research paper that looks at the other end of the quality spectrum. That is, what can be done algorithmically to identify a given page or domain as spam? Note that this post is based on a 2004 SIGIR paper titled Spam, Damn Spam, and Statistics.

fRank Takes on PageRank

Where have I been? That's the question that my readers (both of you, not including my brother) may have asked in the last couple of months. I've been where I've always been, but I've been reading much more than I've been writing. Some of that reading has been research papers of the sort put out by the International World Wide Web Conference Committee or the Special Interest Group on Information Retrieval. Fancy names for fancy groups putting out fancy research papers.

Updating Links: An SEO Red Flag?

Following on the heels of Eric Lander's NoFollow: An SEO Red Flag?, I thought I'd pose the question of whether updating inbound links may also be a red flag.

In You, Google Trusts

As the debate goes on about the decreasing importance of PageRank, another measure, commonly called TrustRank, continues to gain traction in the SEO world. The idea behind TrustRank is that Google (and other search engines) assigns a level of trust to a web site, or maybe even a web site owner, which in turn can help with index inclusion and rankings.

The History of Latent Semantic Indexing

It's sometimes fun (well, if you're involved with SEO) to look at how theories form and seem to be true, yet are still being debated years later. Such is the case with Latent Semantic Indexing, or LSI.

Copyright © 2010 All Things SEM / SEO Blog