In a previous post, I described the PageRank formula and showed how PageRank flows in various linking structures, including the positive impact that breadcrumbs can have on your home page PR. In this post, I am responding to a question I received: how does putting a link to the sitemap in the footer affect the overall distribution of PageRank within a website? Since a sitemap link in the footer is common practice, I decided to test the conventional wisdom that says this is a good idea. The result may surprise you as it did me.
I start with the same 17-page website structure. The yellow boxes represent web pages, the arrows in the diagram represent links and the number in each yellow box is the linear PageRank of that page (see previous post for an explanation of linear PageRank). Notice that the pages with the most inbound links have the highest PR values.
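For readers who want to experiment with their own structures, here is a minimal sketch of how values like these could be computed. It assumes that linear PageRank means the classic formula PR(A) = (1 - d) + d * sum(PR(T) / C(T)) with a damping factor d of 0.85, scaled so the average page has a value of 1. The function name `linear_pagerank` and the 4-page site are made up for illustration and are not the 17-page structure from the diagram.

```python
def linear_pagerank(links, damping=0.85, iterations=100):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over pages T linking to A.

    `links` maps each page to the set of pages it links to. Because targets
    are stored in a set, repeated links from one page to the same target
    count only once. Assumes every page has at least one outbound link.
    """
    pages = list(links)
    pr = {page: 1.0 for page in pages}  # start every page at the average value
    for _ in range(iterations):
        new_pr = {page: 1.0 - damping for page in pages}
        for source, targets in links.items():
            # Each page splits its damped value evenly among its outbound links.
            share = damping * pr[source] / len(targets)
            for target in targets:
                new_pr[target] += share
        pr = new_pr
    return pr


# Hypothetical 4-page site used only to illustrate the calculation.
site = {
    "home": {"about", "products"},
    "about": {"home"},
    "products": {"home", "widget"},
    "widget": {"home", "products"},
}

ranks = linear_pagerank(site)
for page, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:10s} {value:.2f}")
```

In this toy site the home page has the most inbound links and ends up with the highest value, and the values add up to the number of pages, which is what makes it easy to see where PageRank moves when a link is added or removed.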
Now let’s add a sitemap to the site-wide footer and see what this does to the linear PR for the site. In the diagram below the new sitemap links are shown in red.
Sitemap Link In Footer = Bad Idea
Wow! Adding a sitemap link to the footer draws a large share of the PageRank into the sitemap page and away from the home page and top pages. Now let’s suppose that we don’t want this to happen. One way to mitigate the sitemap effect is to add a home link to every page, which is another common practice. I know the next diagram is a bit hairy, but bear with me. The new home page links are shown in green. Note that I am assuming that Google only counts the first link from page A to page B and ignores any other links on page A that go to page B. That’s why there are only 13 green links (the remaining pages already link to the home page).
We can see that putting the home link on every page really helps balance things out. The home page PR is now 2.29 instead of 1.22. But the second-level pages suffer, and the sitemap page, with a PR of 2.64 (the highest on the site), is still soaking up a lot of PageRank. So the next thing I tried was linking to the sitemap page only from the home page. Since I believe that the home page is the most crawled page on a website, I don’t think this will hurt much in terms of getting indexed (especially if you also put a pointer to an XML sitemap in your robots.txt file).
Now this looks much better. The home page has PR=6.46, nearly all the other pages have a higher PR, and the sitemap page is down to PR=1.52.
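The same comparison can be sketched in code. The example below is a rough illustration rather than a reproduction of the diagrams: it assumes the classic PageRank formula with a damping factor of 0.85, and it uses a made-up 10-page site (a home page, three category pages, and six leaf pages) plus a sitemap page. The helper names `linear_pagerank` and `build_site` are hypothetical.

```python
def linear_pagerank(links, damping=0.85, iterations=100):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)); average value is 1."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_pr = {page: 1.0 - damping for page in links}
        for source, targets in links.items():
            share = damping * pr[source] / len(targets)
            for target in targets:
                new_pr[target] += share
        pr = new_pr
    return pr


def build_site(sitemap_in_footer):
    """Build a hypothetical site: home, 3 categories, 6 leaves, and a sitemap.

    The home page always links to the sitemap. When `sitemap_in_footer` is
    True, every other content page links to the sitemap as well.
    """
    links = {"home": {"cat1", "cat2", "cat3", "sitemap"}}
    for number in (1, 2, 3):
        cat = f"cat{number}"
        leaves = {f"{cat}-a", f"{cat}-b"}
        links[cat] = {"home"} | leaves
        for leaf in leaves:
            links[leaf] = {cat}  # each leaf links back to its category
    content_pages = list(links)
    links["sitemap"] = set(content_pages)  # the sitemap links to every page
    if sitemap_in_footer:
        for page in content_pages:
            links[page].add("sitemap")
    return links


footer = linear_pagerank(build_site(sitemap_in_footer=True))
home_only = linear_pagerank(build_site(sitemap_in_footer=False))

for page in ("home", "cat1", "cat1-a", "sitemap"):
    print(f"{page:8s} footer={footer[page]:.2f}  home-only={home_only[page]:.2f}")
```

On this toy structure the footer variant concentrates more value on the sitemap page and less on the home page than the home-only variant, which is the same direction as the effect in the diagrams. The exact numbers depend on the link structure, so they will not match the 17-page site.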
So the conclusion from this little exercise is that following the conventional practice of putting a sitemap link in your footer can adversely impact the PR distribution for the rest of your website. Using linear PageRank calculations, I have shown that putting a single link from the home page to the sitemap produces a much more favorable PageRank distribution.
This brings up the question of whether Google somehow discounts footer links when distributing PageRank within a website. If not, then every link in your footer (except the home link, if you have one) is pulling PageRank away from the important pages on your site. For the purposes of this post, let’s assume that Google gives footer links the same weight as other links on the page. Since Matt Cutts has recently said not to use the rel="nofollow" attribute on internal links, we aren’t left with many choices. We can go ahead and use nofollow on the footer links anyway, but that means those links won’t be crawled either. Ah, but that is okay, since these pages will also be linked from the sitemap without nofollow on those links. So this might be one of those exceptions that Matt refers to in his statement. Having a single link to the sitemap from the home page and selectively using nofollow on footer links seems like the best way to address the issue I have uncovered with footer links.
What do you think? We’d love to hear your comments!
Author’s addendum: A clever reader pointed out a problem with my assumption that putting a nofollow attribute on a link causes that link’s share of PageRank to be redistributed to the rest of the links on that page. Read all about my subsequent analysis and conclusions in Part 3 of this series!