Making Sitemaps Easier to Manage and Scale
- Posted February 27th, 2008 at 3:23 pm by Yahoo! Search
- Categories: Search
Today, Yahoo!, along with Google and Microsoft, is announcing Cross-host submission for Sitemaps, which will make it easier for webmasters to manage their Sitemap submissions to the major search engines. With this announcement, webmasters can now submit Sitemaps that correspond to several differently-hosted websites using a single mechanism.
For background, a Sitemap file contains the URLs for the pages on a site along with metadata, such as priority, last modified date and change frequency for the content. To ensure validity of this metadata, Sitemaps have previously been required to be on the same host and path as the URLs they contain. This requirement forced Sitemap files to be hosted on the same servers as the actual site content.
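To make the structure concrete, here is a minimal Python sketch that builds a small Sitemap file with the metadata fields mentioned above. The element names (loc, lastmod, changefreq, priority) and the namespace come from the Sitemaps protocol; the helper name build_sitemap, the example URLs and the dates are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return Sitemap XML as a string for a list of URL entries.

    Each entry is a dict with a required "loc" key and optional
    "lastmod", "changefreq" and "priority" keys (hypothetical input shape).
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit child elements in the order the protocol documents them.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


# Illustrative entries only; these pages and dates are assumptions.
example_entries = [
    {"loc": "http://www.example.com/", "lastmod": "2008-02-27",
     "changefreq": "daily", "priority": "1.0"},
    {"loc": "http://www.example.com/about.html", "changefreq": "monthly"},
]

sitemap_xml = build_sitemap(example_entries)
print(sitemap_xml)
```

Only loc is required for each URL; the other fields are optional hints to the crawler.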
With today’s announcement, a Sitemap can now be hosted on a different host and path than the URLs it contains. For example, say you have a Sitemap (sitemap-www.xml) for the URLs on http://www.example.com but you want to put that Sitemap on http://sitemaps.example.com. That is now possible. To make the Sitemap valid and preserve data security, you need to refer to it from the robots.txt file on the site where the URLs it contains are located. For example, add the following line to http://www.example.com/robots.txt:
Sitemap: http://sitemaps.example.com/sitemap-www.xml
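As a rough illustration of how a client might read that directive, the Python sketch below parses a robots.txt body with the standard library's urllib.robotparser (the site_maps() method needs Python 3.8 or later) and checks whether each listed Sitemap lives on a different host than the site. This is not a description of how any search engine implements the check; the function names authorized_sitemaps and is_cross_host and the sample robots.txt content are assumptions.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def authorized_sitemaps(robots_txt):
    """Return the Sitemap URLs declared in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap lines are present.
    return parser.site_maps() or []


def is_cross_host(site_url, sitemap_url):
    """True when the Sitemap is served from a different host than the site."""
    return urlsplit(site_url).hostname != urlsplit(sitemap_url).hostname


# Sample body standing in for http://www.example.com/robots.txt (assumed content).
robots_txt = """\
User-agent: *
Disallow: /private/
Sitemap: http://sitemaps.example.com/sitemap-www.xml
"""

sitemaps = authorized_sitemaps(robots_txt)
for sitemap_url in sitemaps:
    kind = "cross-host" if is_cross_host("http://www.example.com/", sitemap_url) else "same-host"
    print(sitemap_url, kind)
```

The point of the design is that the robots.txt file sits on the host that owns the URLs, so listing a Sitemap there signals that the site owner vouches for it, wherever the Sitemap file itself is stored.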
Our collaboration with Google and Microsoft began back in November 2006 when we announced joint support for the Sitemaps protocol. Since then, we’ve learned a lot about how webmasters and site owners manage their sites and feeds. We know that segregating user-facing content from feeds, like Sitemaps, is important. We’ve also learned that managing feeds for large websites or websites using third-party feed publishing services is critical. We hope this enhancement helps address those needs.
We’ll continue to work on addressing the needs of our webmasters through new standards and protocols. If you have other thoughts about how we can collaborate with other search engines on standards such as robots.txt, we’d love to hear from you. Leave us a comment below, or give us your feedback here.
Details about the Sitemaps protocol, including our recent addition, are available on the protocol website. Or, if you’re at SMXW this week, bring your questions to our panelists and speakers.
Priyank Garg
Director, Product Management, Yahoo! Search
Sean Suchter
Vice President, Engineering, Yahoo! Search
- 22 Comments
That’s a really good thing, because it took a long time to submit sitemaps to every search engine. I hope other search engines will join this!
OK, but when are you going to allow the sitemap entry in “/robots.txt” to be a RELATIVE URL – where the domain is taken from the one used to fetch the robots file in the first place?
Sitemaps.org still says that a sitemap declaration in “/robots.txt” SHOULD (not MUST) be an absolute URL, so you’ve still left this open.
Sounds like a good deal. We shall wait and see where the charges come from.
This seems to solve one of the problems I have been confronting, but I would prefer to have the sitemap entry in “/robots.txt” to be a URL that I get directly from the robots file.
This certainly simplifies sitemap management for larger sites that are typically powered by many servers. But I still find sitemaps hard to manage for larger sites. It just does not seem to scale well.
This is great news, as it will save webmasters much time and energy. Secondly, hosting sitemaps on separate servers sounds good. Has this been implemented, or do we have to wait a bit to make use of this feature?
This is a great development, at least in terms of allowing site maps to be hosted on different channels if you have more than one site. I am doing some Search Engine Optimisation for medical plasters and my worry is that having a sitemap located in a different place might compromise ranking.
I recently added my site map to Live webmaster tools and when I checked the robots.txt I was invited to add the site map “tag” in the file. It makes a lot of sense and I hope this will help all the search engines spider my site faster and better!
If the pages are linked together, do we still need a site map?
These new sitemaps are so easy to create and implement. Excellent solution.
I just wonder if there’s any way that we can make multiple sitemaps in the same domain within different folders to make each sitemap file smaller and more efficient. I think if we can implement this, it would be a great benefit for larger domains in managing their sitemaps.
Thanks for making a sitemap easier. You guys are on the ball.
It is easier said than done. I appreciate the great article.
This is a great development. We can make multiple sitemaps in the same domain within different folders to make each sitemap file smaller and more efficient. Excellent solution.
I hope other search engines will join this! We shall wait and see where the charges come from.
thanks.
One word: “Masterpiece”. I really hope all the other search engines follow suit. Keep up the good posts.
I have to agree this is a sensible move. Google’s sitemap setup is rather complicated.
Obviously, this makes sitemap management easier for larger sites that are typically powered by many servers. However, I find some sitemaps hard to manage for comparatively complex sites.