Yahoo! Search Support for X-Robots-Tag Directive to Simplify Webmaster’s Control and Weather Update
- Posted December 5th, 2007 at 10:44 am by Yahoo! Search
- Categories: Site Explorer
Today we’re announcing support for tags that give webmasters even more flexibility over which pages and documents are crawled and indexed by Yahoo! Search. Specifically, we’re extending our support of page-level exclusion tags (NOINDEX, NOARCHIVE, NOSNIPPET, NOFOLLOW) to provide additional control over archiving and summarization of ANY file type. Previously, these page-level tags could only be expressed within HTML pages through the META directive (e.g. <META NAME="Slurp" CONTENT="NOARCHIVE">). Based on feedback from our webmasters, Yahoo! now enables these tags to be expressed through the X-Robots-Tag directive in the HTTP header. This gives webmasters the flexibility to apply exclusions to PDF, Word, PowerPoint, video, and other file types, including HTML files, and to increase their coverage through a simplified process. Additionally, webmasters no longer need access to HTML templates in order to express exclusions for HTML files. To take advantage of this feature, simply add the following page-level tags to the X-Robots-Tag directive in the HTTP header. Here are a few examples:
- X-Robots-Tag: NOINDEX — If you don’t want to show the URL in the Yahoo! Search results.
Note: We’ll still need to crawl the page to see and apply the tag, so if you don’t wish to have the page crawled, use robots disallow on robots.txt.
- X-Robots-Tag: NOARCHIVE — If you don’t want to display a cache link in the search results page.
- X-Robots-Tag: NOSNIPPET — If you don’t want to display a summary in the search results page.
- X-Robots-Tag: NOFOLLOW — If you don’t want Yahoo! to crawl links in the page.
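For webmasters who serve files through an application rather than a static server, the directives above could be attached with a small piece of middleware. The following is only a minimal sketch in Python using the standard WSGI interface. The rule table `ROBOTS_RULES`, the helper names, and the stand-in `demo_app` are all hypothetical and chosen for illustration. Your own server or framework may offer a simpler way to add response headers.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical rule table: file extension -> directive to send.
# One directive per rule, matching the examples in the list above.
ROBOTS_RULES = {
    ".pdf": "NOARCHIVE",
    ".doc": "NOINDEX",
}


def robots_header_for(path):
    """Return the X-Robots-Tag value for a request path, or None if no rule applies."""
    lowered = path.lower()
    for extension, directive in ROBOTS_RULES.items():
        if lowered.endswith(extension):
            return directive
    return None


def add_robots_tag(app):
    """WSGI middleware that appends an X-Robots-Tag header to matching responses."""
    def wrapped(environ, start_response):
        value = robots_header_for(environ.get("PATH_INFO", ""))

        def patched_start_response(status, headers, exc_info=None):
            if value is not None:
                headers = list(headers) + [("X-Robots-Tag", value)]
            return start_response(status, headers, exc_info)

        return app(environ, patched_start_response)
    return wrapped


def demo_app(environ, start_response):
    """Stand-in application that pretends to serve a file (illustration only)."""
    start_response("200 OK", [("Content-Type", "application/octet-stream")])
    return [b"file bytes would go here"]


def response_headers(app, path):
    """Call a WSGI app for `path` and return its response headers as a dict."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured.update(headers)

    list(app(environ, start_response))
    return captured
```

Wrapping an application as `add_robots_tag(demo_app)` leaves HTML responses untouched and adds the header only to paths that match a rule, so no page templates need to change.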
Along with this change, we’ll be rolling out additional changes to our crawling, indexing and ranking algorithms over the next few days. We expect the update will be completed early next week, but you may see some changes in ranking as well as some shuffling of the pages in the index during this process.
We’re at SES in Chicago and WebmasterWorld’s PubCon in Las Vegas, participating in a few different panels this week. Please find us if you have any questions or suggestions, or drop us your feedback here.
Sharad Verma
Yahoo! Search
- 24 Comments
Silicon Valley, in weather updates we typically deploy our enhanced algorithms AND also add high quality sites that are relevant to the query into the search results. In short, we do both.
So… you’re saying that the tags need to be placed in the headers, and they can be displayed in other PDF documents. How would this work for PDFs?
Is there a difference between the weather reports that say there’s fresh data vs. the ones that just say the algorithm has been updated?
Is this update finished yet? The UK search results seem to have a lot of .ae, .nz, .ca, and .au sites showing up for a lot of search terms.
In fairness, the pages are some of the most relevant I’ve seen for a long time from Yahoo in terms of the information; there just seems to be a problem with the geo-targeting.
If a page is served with this tag at the HTTP level and also contains the META tag in its HTML content specifying a different set of directives, which set takes priority?
So, according to you, webmasters now have to write the meta tags you mentioned above in their web pages so that the Yahoo robots crawl the pages?
Will the pages get indexed with no instructions to the robots?
Hello sir, are you saying that we have to include X-Robots-Tag in the robots.txt file, or do we have to make an extra file for the Yahoo spider?
Please help me write the proper syntax. Is it something like what I have written below, or is that wrong?
If it is wrong, please provide a sample format. Thanks.
header('X-Robots-Tag: index, follow', true);
I’m trying to put the meta tag inside my index.html instead of creating another HTML file. How do I do that, and which would be served faster? Also, if my index.html has another confirmation meta tag (Google), will it work, and which one should go on top of the other?
Sorry for quite a long question; I hope you don’t get confused.
So what you’re saying is that we must include X-Robots-Tag in the robots.txt file; otherwise, we have to make an extra file for the Yahoo spider? Thank you!
It would be good to learn in more detail how to apply all of this in practice.
“X-Robots-Tag: NOINDEX — If you don’t want to show the URL in the Yahoo! Search results.
Note: We’ll still need to crawl the page to see and apply the tag, so if you don’t wish to have the page crawled, use robots disallow on robots.txt.”
————————————————–
Does it mean you’ll crawl the content too or the robot will stop at the headers level?
Also (I forgot to mention), why suddenly a completely new name, “x-robots-tag”?
What’s wrong with the old “robots” name?
And do you then support a custom “robots” header that would provide the old values (noindex, nofollow, etc.)?
I hope this news will be of further help to those who, like me, hope to see their sites indexed on Yahoo, because until now it has not been easy.
Hopefully good
Very convenient feature. I think providing an option to look for recipes in local languages would make this even better
No spamming, please. This is no-follow. Yeah, Yahoo always considers such stuff for indexing.
It’s a shame there aren’t special secret tags to tell Yahoo to visit a website more often. Or are there?