SEO Tips for XML Sitemaps

A sitemap is nothing more than a list of a website’s pages that are meant to be accessible to users or crawlers.  Yet, despite its simplicity, it’s actually quite rare to come across one that’s been done correctly.  The problem is that most folks, especially those who manage large websites, simply use tools to automatically generate their sitemaps.  Unfortunately, if these tools aren’t properly configured, the resulting sitemap is going to cause problems for any search engine spiders attempting to process it.

A proper XML sitemap can be a key factor in ensuring that the search engines have access to the content that you want to rank.  In his article on Search Engine Watch, Ben Goodsell examines the guidelines and limitations of search engine sitemaps and offers some tips for resolving common crawling and indexation issues.

Sitemap Guidelines and Limitations

The sitemap protocol in common use today was adopted as a search engine standard back in 2006.  Since then, Google and Bing have each released Webmaster Tools dashboards to help website owners locate and troubleshoot sitemap errors.

Bing is a bit vague about its sitemap requirements, although it has publicly stated that it will devalue any sitemap in which 1 percent or more of the URLs return errors (that is, anything other than a 200 status code).
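To make that 1 percent figure concrete, here is a minimal Python sketch that computes the share of non-200 responses in a sitemap.  It assumes you have already collected one status code per URL (for example, from a crawler export); the function names and the default threshold argument are illustrative and are not part of any Bing tool.

```python
def sitemap_error_rate(status_codes):
    """Return the fraction of URLs whose status code is not 200."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if code != 200)
    return errors / len(status_codes)


def exceeds_error_threshold(status_codes, threshold=0.01):
    """True when the error rate is at or above the threshold (1 percent by default)."""
    return sitemap_error_rate(status_codes) >= threshold


# Hypothetical crawl result: 99 healthy URLs and one 404.
codes = [200] * 99 + [404]
print(sitemap_error_rate(codes), exceeds_error_threshold(codes))
```

Treating every non-200 response as an error follows the wording above, so redirects (3XX) count against the sitemap as well.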

Google, on the other hand, provides more specific guidelines and robust error handling for those using its webmaster dashboard.  In particular, Google sets the following limits:

  • Sitemaps should contain no more than 50,000 URLs
  • Sitemap file size should be no more than 50MB (uncompressed)
  • Each account should have no more than 500 sitemaps
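One way to stay inside the URL and file-size limits is to split a long URL list into several smaller sitemaps.  The sketch below uses only Python's standard library; the constant names, the helper functions, and the reading of 50MB as 50 * 1024 * 1024 bytes are assumptions made for illustration, not a description of how any particular sitemap generator works.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000                 # URL limit per sitemap
MAX_BYTES = 50 * 1024 * 1024      # assumed meaning of "50MB", uncompressed


def chunk_urls(urls, size=MAX_URLS):
    """Yield consecutive slices of the URL list, each at most `size` long."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap(urls):
    """Serialize one list of URLs as sitemap XML (bytes)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


def build_sitemaps(urls):
    """Split the URLs into sitemaps and refuse any that exceed the size limit."""
    sitemaps = []
    for chunk in chunk_urls(urls):
        xml_bytes = build_sitemap(chunk)
        if len(xml_bytes) > MAX_BYTES:
            raise ValueError("sitemap exceeds 50MB; use a smaller chunk size")
        sitemaps.append(xml_bytes)
    return sitemaps
```

With 120,000 URLs, `chunk_urls` yields three lists (50,000, 50,000, and 20,000 URLs), each of which becomes its own sitemap file.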

Both Bing and Google support sitemap index files, which allow you to group multiple sitemaps together for submission rather than submit each one separately.
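A sitemap index is simply another XML file that lists the locations of your individual sitemaps.  As a rough sketch, the Python below builds one using the element names from the sitemaps.org protocol (`sitemapindex`, `sitemap`, `loc`); the function name and the example.com URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Serialize a list of sitemap locations as a sitemap index (bytes)."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


# Hypothetical sitemap locations grouped under one index file.
index_xml = build_sitemap_index([
    "https://www.example.com/sitemap-blog.xml",
    "https://www.example.com/sitemap-products.xml",
])
print(index_xml.decode("utf-8"))
```

You would then submit only the index file's URL rather than each sitemap individually.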

Optimizing Sitemaps

Basic sitemap optimization should include checking for pages that return redirect or error status codes (3XX, 4XX, or 5XX), as well as duplicate URLs within a single sitemap.  You should also check for any pages with noindex meta robots tags, or any whose rel="canonical" tag points to a URL other than the page itself.
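These checks are easy to script once you have crawl data for each URL in the sitemap.  The sketch below assumes each entry is a dictionary with `url`, `status`, `noindex`, and `canonical` keys; that record layout and the function name are invented for this example and do not match any specific crawler's export format.

```python
from collections import Counter


def audit_sitemap_entries(entries):
    """Map each problem URL to the list of issues found for it."""
    counts = Counter(entry["url"] for entry in entries)
    problems = {}
    for entry in entries:
        url = entry["url"]
        issues = []
        if entry["status"] != 200:
            issues.append("status %d" % entry["status"])
        if counts[url] > 1:
            issues.append("duplicate URL")
        if entry.get("noindex"):
            issues.append("noindex")
        canonical = entry.get("canonical")
        if canonical and canonical != url:
            issues.append("canonical points to %s" % canonical)
        if issues:
            problems[url] = issues
    return problems


# Hypothetical crawl data for three sitemap entries.
crawl = [
    {"url": "https://www.example.com/a", "status": 200, "noindex": False,
     "canonical": "https://www.example.com/a"},
    {"url": "https://www.example.com/b", "status": 301, "noindex": False,
     "canonical": None},
    {"url": "https://www.example.com/c", "status": 200, "noindex": True,
     "canonical": "https://www.example.com/a"},
]
for url, issues in audit_sitemap_entries(crawl).items():
    print(url, issues)
```

Any URL that shows up in the result is a candidate for removal from the sitemap or for a fix on the page itself.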

According to Goodsell, “There are tools to quickly parse URLs contained within XML files and find this information like the Screaming Frog SEO crawler.”
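If you would rather script the first step yourself, pulling the URLs out of a sitemap takes only a few lines of Python.  This is a minimal sketch using the standard library's XML parser, not a description of how Screaming Frog works; the function name and sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml):
    """Return the <loc> values from a sitemap XML document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip()
            for loc in root.findall("sm:url/sm:loc", NS)
            if loc.text]


# Hypothetical two-URL sitemap.
sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://www.example.com/</loc></url>"
    "<url><loc> https://www.example.com/blog/ </loc></url>"
    "</urlset>"
)
print(extract_urls(sample))
```

The resulting list can then be fed to whatever tool you use to check status codes, meta robots tags, and canonicals.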

Once you’ve submitted the sitemaps to the search engines, Goodsell recommends using Google Webmaster Tools to break the sitemaps up into categories to spot any additional crawling and indexation issues.

“Sitemaps can be diced up in any way that makes sense to provide feedback,” Goodsell writes, “the main goal being to expose any areas of a site with a low indexation rate.”
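As one illustration of "dicing," the Python sketch below groups URLs by their first path segment so that each group can be written to its own sitemap.  Splitting on the first path segment is just an assumption for this example, as are the function name and the "root" label; Goodsell's point is that any grouping that makes sense for your site will do.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_section(urls):
    """Group URLs by first path segment, e.g. /blog/post-1 -> 'blog'."""
    groups = defaultdict(list)
    for url in urls:
        segments = [part for part in urlparse(url).path.split("/") if part]
        section = segments[0] if segments else "root"
        groups[section].append(url)
    return dict(groups)


# Hypothetical URLs from a small site.
pages = [
    "https://www.example.com/",
    "https://www.example.com/blog/post-1",
    "https://www.example.com/blog/post-2",
    "https://www.example.com/products/widget",
]
for section, members in group_by_section(pages).items():
    print(section, len(members))
```

Each group then becomes a separate sitemap, which lets the webmaster dashboard report submitted and indexed counts per section.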

Once you identify the problem area, you can use Fetch as Googlebot to identify uncrawlable content, or use multiple sitemaps to identify excess indexation.  For example, if you realize that your website is having a problem getting blog posts indexed, you can create an XML sitemap that contains only blog posts.  From this, you can calculate the indexation rate (indexed URLs divided by submitted URLs) and use an advanced search to find out what other pages on your site are hindering the indexation of your post pages.
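The indexation rate is simple arithmetic: the number of indexed URLs divided by the number of submitted URLs for a given sitemap.  Here is a small Python sketch; the function names and the figures are made up for illustration, and in practice the counts would come from your webmaster dashboard.

```python
def indexation_rates(sitemap_stats):
    """Map each sitemap name to indexed / submitted (0.0 if nothing was submitted)."""
    rates = {}
    for name, (submitted, indexed) in sitemap_stats.items():
        rates[name] = indexed / submitted if submitted else 0.0
    return rates


def lowest_indexation(sitemap_stats):
    """Return the name of the sitemap with the lowest indexation rate."""
    rates = indexation_rates(sitemap_stats)
    return min(rates, key=rates.get)


# Hypothetical (submitted, indexed) counts per category sitemap.
stats = {
    "blog": (200, 50),
    "products": (1000, 900),
}
print(indexation_rates(stats))
print(lowest_indexation(stats))
```

In this made-up example the blog sitemap has only a quarter of its URLs indexed, so that is the section to investigate first.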

About Chris Irby

Chris is a professional writer who has been focused on search engine optimization and marketing since 2006. Over the past five years, Chris has done work for companies such as KeyRelevance, Koshkonong, and Idearc/SuperMedia, and his articles have appeared on the Multichannel Merchant and Practical eCommerce websites.
