February 26, 2007
Keeping Ad Tracking and Dead URLs out of Yahoo! Search
We’re often asked how Yahoo! Search determines which pages get indexed and which pages are left un-crawled. First and foremost, we honor the industry-standard robots.txt file format, which gives Webmasters several layers of control over which sites, pages and specific URLs should be indexed. Lately we’ve heard from a number of Webmasters asking how best to prevent ad tracking URLs and dead URLs from getting indexed, so we thought we’d respond via this post.
Ad tracking URLs
Ad tracking URLs are used by Webmasters to help determine what traffic is coming in from advertisements (e.g., Yahoo! Sponsored Search and Yahoo! Publisher Network), but they don’t need to be included in the Yahoo! Search index. Sometimes you might notice that these URLs still appear in the index. That’s because they’ve appeared on pages that are “crawlable” or may have been copied over to crawlable pages by users. If you don’t want Yahoo! Slurp, our Web crawler, to index these URLs, you can use wildcards in robots.txt. For example, if you are using the parameter ‘ref’ to track ad sources, you can use a rule like the one below to keep your tracking URLs from being Slurped:
User-Agent: Yahoo! Slurp
Disallow: /*ref=YahooPublisherNetwork
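To see which paths a rule like this would cover, here is a minimal Python sketch. It assumes that ‘*’ matches any run of characters and that a rule is matched from the start of the URL path; the helper names and sample paths are hypothetical, and this is an illustration rather than Slurp’s actual matching logic.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a Disallow path rule containing '*' wildcards into a regex.

    Assumption: '*' matches any sequence of characters, and the rule is
    matched against the beginning of the URL path (prefix match).
    """
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + body)


def is_disallowed(path: str, rule: str) -> bool:
    """Return True if the path (including query string) is covered by the rule."""
    return rule_to_regex(rule).match(path) is not None


RULE = "/*ref=YahooPublisherNetwork"

# Hypothetical paths used only for illustration.
tracked = is_disallowed("/page.html?ref=YahooPublisherNetwork", RULE)
other_ref = is_disallowed("/page.html?ref=Newsletter", RULE)
plain = is_disallowed("/page.html", RULE)
```

Under these assumptions only the first path is blocked; the untracked version of the page stays crawlable.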
Dead URLs
The best way to remove dead URLs from the Yahoo! Search index is to return an HTTP 404 error when our crawler requests the page. If you want to act before the 404 discovery and URL removal process completes, you can use Site Explorer to quickly delete the URLs from the index. One advantage of using Site Explorer is that you can delete multiple URLs, including an entire subpath, so long as the URL prefix is the same. As Danny Sullivan points out in his deep-dive post on the delete function, if you delete http://domain.com/subarea1/, then all the pages that begin with “domain.com/subarea1” will get removed. For example:
http://domain.com/subarea1/page1.html
http://domain.com/subarea1/page45.html
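The prefix behavior can be sketched in a few lines of Python. This is only an illustration of the idea of prefix matching, not how Site Explorer is implemented; the function name and the third URL are hypothetical.

```python
def covered_by_delete(url: str, deleted_prefix: str) -> bool:
    """Return True if the URL begins with the deleted URL prefix."""
    return url.startswith(deleted_prefix)


indexed = [
    "http://domain.com/subarea1/page1.html",
    "http://domain.com/subarea1/page45.html",
    "http://domain.com/subarea2/page1.html",  # hypothetical URL outside the prefix
]

# Everything under the deleted prefix goes; other URLs are untouched.
removed = [u for u in indexed if covered_by_delete(u, "http://domain.com/subarea1/")]
kept = [u for u in indexed if u not in removed]
```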
We’ll continue to use the Yahoo! Search blog to give Webmasters like you pointers on how to better manage your sites in the Yahoo! Search index. Be sure to visit us at the Site Explorer Suggestion Board if there are specific areas that you’d like us to address in more detail.
Thanks,
Priyank Garg
Yahoo! Search
January 30, 2007
Yahoo! Site Explorer: Authenticate your site via a META tag and more goodies
We spend a lot of time listening to our users, and I am happy to say we’ve gotten better at it. We’ve been using feedback forms and message boards, and finally at the Chicago SES last December, we launched our new Site Explorer Suggestion Board. This is a new user-based ranking feedback tool, which was first introduced at an internal Yahoo! Hack Day and is currently being deployed across the Yahoo! network. It allows you to make suggestions for the product, vote for existing suggestions or simply comment on them.
Today, we launched a new version of Site Explorer that addresses some of the Top Rated suggestions from our users. The key features are:
- Site Authentication using META tags: For those of you who cannot upload an authentication file to your site, such as a blog, you will now be able to authenticate your site in Site Explorer by including an authentication key as part of a META tag on the home page of your site. This is in addition to the existing mechanism of putting a file on your site home directory.
- Detailed Authentication Errors: We now provide detailed errors on authentication failures, making it much easier to diagnose possible problems.
- Delete URLs: For your authenticated sites, you can now delete any URLs from the index. Simply locate the URL in Site Explorer and click on the ‘Delete URL’ button. The URL and all its subpaths will be deleted shortly thereafter. This is meant to work in conjunction with the robots.txt file while providing greater responsiveness. Please continue to use the robots.txt protocol to ensure that our crawler does not crawl pages you want to keep out of our index.
- Site Explorer Badge: Get a Site Explorer badge for your Website and retrieve the count of live links from the whole web. Go ahead, watch as your site becomes more popular, and show off your link wealth to your visitors.
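If you want to confirm that your home page actually carries the key before asking for authentication, a quick local check along these lines may help. It is a rough Python sketch: the META name "example-auth-key" and the key value are placeholders invented for illustration, so substitute the name and key that Site Explorer gives you.

```python
from html.parser import HTMLParser


class MetaKeyFinder(HTMLParser):
    """Collect the content of the first META tag with a given name."""

    def __init__(self, meta_name: str):
        super().__init__()
        self.meta_name = meta_name.lower()
        self.key = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta" or self.key is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == self.meta_name:
            self.key = attributes.get("content")


def find_auth_key(html: str, meta_name: str = "example-auth-key"):
    """Return the key found in the page's META tag, or None if absent."""
    finder = MetaKeyFinder(meta_name)
    finder.feed(html)
    return finder.key


# Hypothetical home page markup with a placeholder key.
page = (
    '<html><head><META name="example-auth-key" content="abc123">'
    "</head><body>My blog</body></html>"
)
found = find_auth_key(page)
missing = find_auth_key("<html><head></head><body></body></html>")
```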
These features address some of the most popular suggestions that we received on our new board. The full list of suggestions we will be able to address with this release is:
a) Allow removal of invalid or malformed URLs
b) Verification for blogs
c) Authentication Problem
d) More than 25 sitemaps
e) better labeling of TSV files
f) Site explorer should identify itself in the user agent string
g) https / ssl
h) Wait 1 day? (Speed of authentication)
Hope you’ll enjoy the improvements. Please share with us your experience using these features and continue to send us your feedback. It’s very valuable to us!
Priyank Garg, Amit Kumar, Apostolos ‘Lakis’ Karmirantzos, Di Chang, Judy Johnson
Yahoo! Search
November 15, 2006
Yahoo!, Google and Microsoft join forces (really !!) behind Sitemaps
The best part about to-do lists is when you get to cross something off, and today we can cross one more off the list of feedback we have collected from webmasters. You have asked us to support a single format for submission, and today we want to talk about how we are teaming up with Google and Microsoft to support Sitemaps 0.90.
Together we’re announcing www.sitemaps.org, which provides details of the current release of the Sitemaps protocol and will include future updates as we continue to collaborate on this common protocol. With an open standard for web sites, webmasters can use a single format to create a catalog of their site URLs and to notify the major search engines of changes. This should make it easier for web sites to provide search engines with content and metadata. In turn, search engines can spend less time crawling unchanged pages and can update indexes faster as new content is discovered. This will help us reflect changes more quickly and improve our ability to provide more timely and relevant search results for users. Sitemaps is available to any site owner who wishes to communicate more easily with participating search engines. Simply create and upload an XML Sitemap and submit the URL of the file to search engines.
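As a rough illustration of what creating an XML Sitemap can look like, here is a small Python sketch that builds one with the standard library. The namespace is the one published on sitemaps.org; the example.com URLs, dates and function name are placeholders, and the full set of supported tags is described in the protocol documentation.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries) -> bytes:
    """Build a Sitemap document from (loc, lastmod) pairs.

    lastmod may be None, in which case the optional tag is left out.
    """
    # Serialize the Sitemap namespace as the default xmlns.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        if lastmod:
            ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    declaration = b'<?xml version="1.0" encoding="UTF-8"?>\n'
    return declaration + ET.tostring(urlset, encoding="utf-8")


# Placeholder URLs and dates for illustration only.
sitemap = build_sitemap([
    ("http://www.example.com/", "2006-11-15"),
    ("http://www.example.com/about.html", None),
])
```

The resulting bytes can be written to a file, uploaded to your site, and the file’s URL submitted to the search engines.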
You can submit Sitemaps to Yahoo! Search through Site Explorer, just like you could add RSS feeds up to now. Just add the site to which the feed belongs, to your list of sites, and then add the feed for that site. We will retrieve the sitemap and use the data you provide us.
We are open to feedback and ideas on what more we can do with Site Explorer and Sitemaps. Share your thoughts in our forum; we’d love to hear from you.
Thanks and keep the list growing,
Priyank Garg
Product Manager, Yahoo! Search
October 30, 2006
Site Explorer Update: Authenticating Yahoo! Stores
First, thank you to those who are using Yahoo! Site Explorer to keep tabs on how your site is indexed by us, and especially to the folks who are using the forum to ask questions and suggest new features. Your feedback helps us prioritize features as well as come up with new features to add to the roadmap. One of these requests was for the ability to authenticate Yahoo! Stores in Site Explorer, which we have just introduced. Basically, that means you now have the ability to add your Site Explorer key to your Small Business site. For more, please head on over to the Yahoo! Store Blog for the complete rundown.
As always, please send us your feedback on Site Explorer, or visit our forum to share your thoughts with other users.
Thank you!
Priyank Garg
Product Manager, Yahoo! Site Explorer
October 05, 2006
Site Explorer Authentication – Some Improvements and Notes
We have had phenomenal response to the new version of Yahoo! Site Explorer we launched two months ago. Thanks to the many of you who have come by and used the new interface, authenticated your site, and asked us questions on the forum. We have been answering many questions on the board, and there are a few common themes that we want to respond to in more detail.
- Many of you have noted that the authentication filename is too long. We are changing the prefix we use so that the full filename is much shorter, 27 characters only, within the limits for filenames that we came across.
- For those who are unable to upload our authentication key as a text file, we have updated our key file to be HTML with a .html extension.
- Note that these changes are backward compatible, so if you have the old authentication file name, you don’t need to change or re-authenticate. We will look for both key files.
A few other tips we wanted to share regarding authentication:
- Do not remove the authentication key after the first site authentication. We periodically check for the presence of the key and your site will be unauthenticated if we can’t find the file.
- Authenticate at the level at which you have control over the content. For example, if you have a site on geocities, say http://www.geocities.com/my_site, authenticate within your site directory. To do this, add your site’s root path: http://www.geocities.com/my_site – not the site root (http://www.geocities.com). So your key file would be available at the path: http://www.geocities.com/my_site/y_key_abyahooauthkeyyz.html.
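When working out where the key file should live, it is easy to drop the site directory by accident. The Python sketch below shows one way to build the key file URL from the root path you registered; the function name is hypothetical, and the filename is simply the example used above.

```python
from urllib.parse import urljoin


def key_file_url(site_root: str, key_filename: str) -> str:
    """Return the URL where the key file is expected under the site root."""
    # Without a trailing slash, urljoin would replace the last path segment
    # ("my_site") instead of appending to it.
    base = site_root if site_root.endswith("/") else site_root + "/"
    return urljoin(base, key_filename)


KEY_FILENAME = "y_key_abyahooauthkeyyz.html"  # example filename from the post

expected = key_file_url("http://www.geocities.com/my_site", KEY_FILENAME)
# Joining without the trailing slash lands on the host root, which you
# do not control on a shared host:
wrong = urljoin("http://www.geocities.com/my_site", KEY_FILENAME)
```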
We have also made other minor updates to the interface designed to make Site Explorer easier to use.
We appreciate your feedback and are doing our best to address it. One of our goals is to make Site Explorer even easier to use. So please let us know if our tweaks help make the tool a bit more webmaster friendly and continue to share your thoughts with us!
Priyank Garg, Amit Kumar, Apostolos ‘Lakis’ Karmirantzos, Di Chang
Site Explorer Team
August 10, 2006
Pointing Webmaster Queries to Site Explorer
A lot of webmasters use Yahoo! Search to get page and inlink data about their site, using ‘site:’, ‘link:’ and ‘linkdomain:’ queries. Starting last night, we are redirecting all queries of this nature to the Site Explorer results pages, so that you can benefit from this tool’s additional features.
To reiterate, the following types of queries will be redirected:
- site:ysearchblog.com
- link:http://www.ysearchblog.com/archives/000341.html
- linkdomain:ysearchblog.com
All other queries, such as the ones below, will not be redirected:
- ysearchblog.com
- ysearchblog
- site:ysearchblog.com webmasters (looking for ysearchblog posts mentioning webmasters)
- link:http://www.ysearchblog.com/archives/000341.html Danny Sullivan (looking for links to the article mentioning Danny Sullivan)
- linkdomain:ysearchblog.com site:yahoo.com (looking for links to ysearchblog from within yahoo.com)
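The pattern in these examples can be summed up as: a query is redirected only when it consists of a single ‘site:’, ‘link:’ or ‘linkdomain:’ term and nothing else. The Python sketch below encodes that reading of the examples; it is an inference from the lists above, not the actual redirect logic, and the function name is hypothetical.

```python
OPERATORS = ("site:", "link:", "linkdomain:")


def is_redirected(query: str) -> bool:
    """Guess whether a query would be sent to Site Explorer.

    Assumption drawn from the examples: only a lone operator term
    (with a value after the colon) is redirected.
    """
    terms = query.split()
    if len(terms) != 1:
        return False  # extra keywords or operators keep it a normal search
    term = terms[0]
    return any(term.startswith(op) and len(term) > len(op) for op in OPERATORS)


redirected = [
    "site:ysearchblog.com",
    "link:http://www.ysearchblog.com/archives/000341.html",
    "linkdomain:ysearchblog.com",
]
not_redirected = [
    "ysearchblog.com",
    "ysearchblog",
    "site:ysearchblog.com webmasters",
    "link:http://www.ysearchblog.com/archives/000341.html Danny Sullivan",
    "linkdomain:ysearchblog.com site:yahoo.com",
]
```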
Site Explorer, since its launch last year, has had various features geared to serve webmaster needs for data about their web site, such as data downloads in TSV format and more accurate counts of results. On Tuesday we launched an upgraded version of the Site Explorer with several new features.
For those of you who will be seeing Site Explorer for the first time, we hope that you will find that these features make your lives easier.
If you want to extract this data programmatically, please use our Web Service APIs. The APIs provide the same data and will be more stable and easier to parse than our search page, which we regularly change to improve the user experience.
A hearty thank you to the many webmasters who have tried out Site Explorer’s new functionality since the Tuesday update. If you haven’t visited yet, stop by to register your site and let us know your thoughts on Site Explorer in our forum.
Priyank Garg
Product Manager, Yahoo! Search
August 08, 2006
Site Explorer Update
We opened a little window into Yahoo! Search last year, when we launched Site Explorer (http://www.ysearchblog.com/archives/000191.html). We hoped it would be useful to webmasters, providing you with information about the links to and from your site, neatly categorized and displayed in an easy-to-use interface. We’ve listened to your feedback, and are now ready with the next version of Site Explorer, our biggest update since December.
We’re now organized around sites you’d like to track. You can explore these, and add feeds to each site. Once you authenticate your site, you can see much more information about your URLs as you explore your site, and monitor feeds you’ve submitted.

So what’s new?
- More information about sites you own, including:
  - Last Crawled Date and Language for your Site URLs
  - Subdomains of your site
- Feed submissions are much smoother. You can submit RSS, Atom and URL lists, and manage all of them from one place. For authenticated sites, you can also track when they were submitted and processed.
- UpdateNotification Web Service (http://developer.yahoo.com/search/siteexplorer/V1/updateNotification.html) to notify us of feed or site updates, part of the suite of Site Explorer APIs you already know and love. Since these return the same data as the tool, we recommend using them for automated applications.

We hope you’ll like our new interface, with a lot of little details sprinkled all over, such as the expandable results to reduce clutter, the ability to download more URLs from sites you own, and robust authentication. Share your comments through our feedback form (http://add.yahoo.com/fast/help/us/ysearch/cgi_siteexplorer_feedback) or see what others are saying on the new Site Explorer forum (http://messages.next.yahoo.com/next/forumview?bn=SEA-YahooSiteExplorer).
We welcome you through the doors, and hope you’ll forgive our tacky metaphors! :-)
Amit Kumar, Priyank Garg
and the entire Yahoo! Site Explorer Team
July 26, 2006
It’s Search. It’s Site Explorer. It’s Webzari!
As Searchblog readers may remember, we launched a tool called Site Explorer last year that you can use to see what pages from a site are indexed in the Yahoo! Search engine. You can also use Site Explorer to see page links.
The Site Explorer interface is based on the search results page experience and returns lists of pages that are indexed, and inlinks to your site, as you can see for the Searchblog.
But the Yahoo! Korea team took the basic functionality and gave it an entirely new look, as you can see in the Webzari for the Searchblog. Sorry I can’t translate it for you. Here’s one screenshot that explains partly what the tool is showing:

If you mouse over the planets in the Webzari, it gives you more information about the links, and clicking on the planets returns the corresponding blog entry or other text. Try clicking around on it; even though you might not understand Korean, you’ll get the gist of things.
You can even save Webzari searches in My Hub, the Korean version of My Web.
Give Webzari a spin and leave us a comment to let us know what you think!
Arah Cho & Priyank Garg
Yahoo! Search
June 02, 2006
Reaching the Weatherman at Yahoo! Search
At conferences, webmasters always ask me how they can connect with us here at Yahoo! Search.
I usually have the time to flash a slide that shows a list of URLs to different forms and services for webmaster support and webmaster feedback. I also talk about Site Explorer, which webmasters can use to explore the Yahoo! Search index. I know most people don’t have the time to write these down, and hope that the information is disseminated via presentations made available to conference attendees.
I’ve also promised to make this information available via the Yahoo! Searchblog.
Information about Yahoo! Search can be obtained via a very handy URL – http://help.yahoo.com/search. Memorize, bookmark, email or copy it, whatever works…
We’ve also added a new link to this page, Webmaster Resources. This includes the list of resources that I usually pop up on screen at conferences. The URL is http://help.yahoo.com/search/resources. Save it and tag it on Delicious or MyWeb, whatever works…
Oh, and as we mentioned, we’re retiring the old feedback email address and replacing it with a simple form. It is included in the webmaster resource list noted above, and the URL is http://help.yahoo.com/search/feedback.
We hope this will make it easier for you to get answers and provide feedback. Although you may not hear from us directly, we do pay attention, and the product changes for the better based on your feedback.
Thanks,
Rajat Mukherjee
Yahoo! Search
December 06, 2005
Submitting Site Feeds and other Site Explorer updates
Two months ago we launched Site Explorer, a tool to explore the pages from your site in the Yahoo! Search index and the inlinks to those pages. Many of you have been using the tool actively and we appreciate the positive response and feedback we have received. It was gratifying to see the panelists at Webmaster World using the tool for site reviews. We have now launched it as a Beta on our International destinations, including Argentina, Australia & NZ, Brazil, Canada (English and French), France, Germany, India, Italy, Mexico, Singapore, Spain and UK, as part of the Services and Tools.
Site Explorer also tries to make it easy for you to tell us what we don’t know about your site. To make it even simpler, we now accept site submissions in the following formats.
Note that for any URL (submitted directly or obtained from a feed), we will extract links from it and find pages we have not discovered already.
We’ve also added something many of you have asked for, the ability to filter out internal inlinks when exploring the inlinks to your site or to particular pages. Please try out these new features and let us know, as many of you already have, what you think about Site Explorer. Even though we can’t respond to all your emails, every piece of feedback is appreciated.
Enjoy exploring!
Priyank Garg
Product Manager