Folder depth comes up every now and again as a ranking factor (albeit a small one). The most recent mention is in a list of 22 considerations for improving natural search performance from Search Insider (written by Rob Garner of iCrossing). Most of the items he listed are widely accepted, but one in particular names folder depth as an obstacle to search engines without offering any support or detail. And so I'm taking it upon myself to see just how search engines handle folder depth.

Ideally I'd like to set up an experiment to measure the ranking benefits of a root-level page vs. one buried within sub-folders, but with so many variables to hold constant, that wouldn't be easy. So to start, I thought I'd just focus on finding out if search engines will give up on files buried too deeply. My current take is that folder depth, within reason, won't stop a search engine from indexing content and that folder depth, again within reason, won't impact rankings all that much. I do believe that I once read that Yahoo cared about folder depth, but more current commentary suggests that the major search engines are more concerned about click depth, i.e. the number of clicks a page is away from the home page or some other authoritative page.
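To make the difference between folder depth and click depth concrete, here's a minimal sketch in Python. It's only an illustration: the function names, the example.com URLs, and the tiny link graph are all made up for this example, and the folder count assumes that a path not ending in "/" ends in a file name.

```python
from collections import deque
from urllib.parse import urlparse


def folder_depth(url):
    """Count the folders in a URL path, ignoring the file name.

    Assumption: a path that does not end in "/" ends in a file name.
    """
    path = urlparse(url).path
    segments = [s for s in path.split("/") if s]
    if segments and not path.endswith("/"):
        segments = segments[:-1]  # drop the file name
    return len(segments)


def click_depth(links, start, target):
    """Fewest clicks needed to get from start to target (breadth-first search).

    links maps each page to the pages it links to. Returns None if the
    target can't be reached from start.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, clicks = queue.popleft()
        if page == target:
            return clicks
        for nxt in links.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, clicks + 1))
    return None


# Hypothetical site: the home page links to a post, and the post links
# to a page buried 10 folders deep.
home = "http://example.com/"
post = "http://example.com/blog/post.html"
deep = "http://example.com/a/b/c/d/e/f/g/h/i/j/page.html"
links = {home: [post], post: [deep]}

deep_folder_depth = folder_depth(deep)
deep_click_depth = click_depth(links, home, deep)
```

In this made-up site the buried page is 10 folders deep but only 2 clicks from the home page, which is why the two measures can tell very different stories about the same URL.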

Some notes about the setup of this SEO experiment:

  • I've created several pages that I've linked to from here. Having the links all on one page should eliminate some confounding variables.
  • The folders are of varying depths, with the deepest being 10 levels. By the way, if you've got a site architecture that needs this many levels, you should really rethink things.
  • The deepest folder structure is listed first. I did this because I know that a 3-level-deep folder structure will get indexed, so if that one is indexed but the deeper ones listed above it aren't, I'll be able to say that the search engines didn't simply index the first couple of links in the list.
  • I've removed the usual related links that appear at the bottom of my posts.
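For anyone who wants to set up a similar test, here's a rough sketch of how a list of test URLs at varying folder depths could be generated, deepest first. The base URL, the folder names (level1, level2, ...), the file name, and the particular depths are all placeholders I've assumed for illustration; they aren't the actual paths used in this experiment.

```python
def build_test_urls(base, depths, filename="test.html"):
    """Return one URL per depth, deepest first.

    Folder names (level1, level2, ...) and the file name are placeholders.
    """
    urls = []
    for depth in sorted(depths, reverse=True):
        folders = "/".join(f"level{i}" for i in range(1, depth + 1))
        urls.append(f"{base.rstrip('/')}/{folders}/{filename}")
    return urls


# Hypothetical depths: only the 3-level and 10-level cases come from the notes above.
test_urls = build_test_urls("http://example.com", [3, 5, 7, 10])
for url in test_urls:
    print(url)
```

Listing the deepest URL first mirrors the ordering described in the notes above, so a crawler that only follows the first few links would hit the hardest case first.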

And here are the test links. There's really no point in you clicking on these, but I know some of you won't be able to resist.

I'll post updates about the indexing of the above URLs in Google, Yahoo, and MSN in a week or two, but I'll let things sit for longer than that before making any conclusions.

Results

Eleven days after this post went live, Google and Yahoo had indexed and cached my test pages including the one 10 folders deep. Live hasn't cached any of them, but who cares about Live, right?

11 Responses to “Do Deep Folders Stop Search Engines?”
  1. Cesar says:

    Great experiment. I would've liked you to have included a root-level page just to make a comparison with the rest of the folder depths. Also, is there a reason you varied the word count on the examples? Wouldn't it have made more sense to make the word counts as close as possible?

  2. Marios Alexandrou says:

    Cesar,

    I didn't bother with a root level page because we all know those get indexed.

    As for the word count, that may affect rankings, but I don't see it having any effect on whether a page is included in a search engine's index (except possibly for really, really short pages).

  3. Jordan Kasteler says:

    Excellent test. I look forward to the results.

  4. Marios Alexandrou says:

    @Cesar
    @Jordan

    Just wanted to let you both know that Google and Yahoo have indexed and cached the pages including the one 10 folders deep.

  5. Mathieu says:

    Thx for letting us know about this :)

  6. Ahmed Ibrahim says:

    It's a really good idea to do this test, because I've talked a lot with co-workers and no one could give proof of whether folder depth is good or bad, but now after your test I have proof.
    Thanks

  7. Mr. SEO Expirment says:

    I have found that the lower level pages are indexed just as well.

  8. passionflowers says:

    I appreciate your experiment; however, it doesn't address one of the points I read (with a degree of confusion, as it's a bit ambiguous) on Matt Cutts's (Google) blog: http://www.mattcutts.com/blog/subdomains-and-subdirectories/

    I can't quite figure out from his post HOW MANY pages within a subdirectory/folder will be indexed; he refers to Google's prior approach versus the "new" (written in Dec. 07) algorithm. There's an implication that Google may only index a limited number of pages WITHIN the folder. So I am interested in not only whether A PAGE in your tenth-level folder is being indexed, but whether multiple pages are being indexed, and how many.

    Thanks for the info…

  9. Marios Alexandrou says:

    I don't believe Google places a limit on the number of pages within a sub-folder it will crawl. Instead, I think Matt's article is saying that the number of sub-domains that appear in SERPs will be limited whereas in the past they weren't.

  10. Paul says:

    …what if for some reason I decided/needed to have a noindex on 1 of the folder levels - an example could be a noindexed archive with the indexed single post on the level below.

    …do noindexed pages pass PR/linkjuice (or whatever you want to call it)?

  11. Marios Alexandrou says:

    Paul,

    There's an interview with Matt Cutts that answers your question (so you don't have to take my word for it):

    Eric Enge: Can a NoIndex page accumulate PageRank?
    Matt Cutts: A NoIndex page can accumulate PageRank, because the links are still followed outwards from a NoIndex page.
    Eric Enge: So, it can accumulate and pass PageRank.
    Matt Cutts: Right, and it will still accumulate PageRank, but it won't be showing in our Index.

    Here's the original article: http://www.stonetemple.com/articles/interview-matt-cutts.shtml
