October 15, 2009

Accessing SearchMonkey Structured Objects via BOSS

SearchMonkey and the structured Web

We’ve just announced an all-new Yahoo! Search experience, with many new features powered by SearchMonkey data.  Since launching our open developer platform in May 2008, Yahoo! Search has enabled thousands of developers to shape the search experience for millions of Yahoo! users. If you are interested in building semantic applications similar to what we’ve come up with at Yahoo! Search, here are some details to get you started.

What structured objects are available?

All of the objects listed on the SearchMonkey homepage are available to you. With the new “object refiners” feature, users can now restrict search results to specific object types. Site owners contribute the data for these objects by marking up their pages with RDF or microformats, or by providing DataRSS feeds. If you’re interested in the actual data behind these objects, use the Yahoo! Search BOSS API to request the SearchMonkey data as part of the search request.

How can I access these structured objects?

The SearchMonkey team has been encouraging developers to use our structured data to build semantic Web applications ever since we partnered with BOSS.  Using the BOSS API, you can access SearchMonkey structured objects.

To restrict the result set to pages with SearchMonkey objects, just add “searchmonkey:<objectType>” to your query. The result set from BOSS will only contain URLs that have objects of that type.

For example, the following query returns all of the documents in the Yahoo! Web index that contain the words “Sunnyvale” and “pizza” (about 3 million pages).

http://boss.yahooapis.com/ysearch/web/v1/sunnyvale+pizza?appid=wX7OZ3zV34Fy2Y4W4in_vsjFmRhruQNgCxdxn6RUke2c2JVDZdw6bfc1rcEjVnw-&format=xml

But if you only want pages with local business objects on them, you can add “searchmonkey:local” to the query:

http://boss.yahooapis.com/ysearch/web/v1/sunnyvale+pizza+searchmonkey:local?appid=wX7OZ3zV34Fy2Y4W4in_vsjFmRhruQNgCxdxn6RUke2c2JVDZdw6bfc1rcEjVnw-&format=xml

This query returns about 25,000 pages.

Yes, we’ve just thrown out over 99 percent of the result set, but we are after the most relevant results, not simply the greatest number of results. Our new object refiners use SearchMonkey’s structured data to narrow your query from “pizza+Sunnyvale” to actual local business listings within those results. You can use BOSS to retrieve the same structured data and construct any presentation you like.

You can take it a step further and add any of these terms to the query:

  • searchmonkey:video – restricts the result set to videos.
  • searchmonkey:product – restricts the result set to products.
  • searchmonkey:local – restricts the result set to local businesses.
  • searchmonkey:event – restricts the result set to events.
  • searchmonkey:document – restricts the result set to presentations, spreadsheets, and similar document formats.
  • searchmonkey:discussion – restricts the result set to blogs and forums.
  • searchmonkey:game – restricts the result set to Flash games.
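As a rough illustration, the query construction described above can be sketched in a few lines of Python. This is only a sketch: the function name `build_boss_url`, the `OBJECT_TYPES` set, and the `YOUR_APP_ID` placeholder are made up for this example, and the code only assembles a request URL in the same shape as the examples above without sending it.

```python
from urllib.parse import quote, urlencode

# Endpoint copied from the example URLs in this post.
BOSS_WEB_ENDPOINT = "http://boss.yahooapis.com/ysearch/web/v1/"

# Object types listed in this post (hypothetical constant name).
OBJECT_TYPES = {
    "video", "product", "local", "event",
    "document", "discussion", "game",
}


def build_boss_url(terms, appid, object_type=None, fmt="xml"):
    """Assemble a BOSS web search URL, optionally restricted to one object type."""
    terms = list(terms)
    if object_type is not None:
        if object_type not in OBJECT_TYPES:
            raise ValueError("unknown object type: %r" % object_type)
        # The refiner is just another query term: searchmonkey:<objectType>
        terms.append("searchmonkey:" + object_type)
    # Join terms with "+" and keep ":" unescaped, as in the example URLs.
    path = "+".join(quote(term, safe=":") for term in terms)
    return BOSS_WEB_ENDPOINT + path + "?" + urlencode({"appid": appid, "format": fmt})


if __name__ == "__main__":
    print(build_boss_url(["sunnyvale", "pizza"], "YOUR_APP_ID", object_type="local"))
```

Swapping `"local"` for any other entry in the list switches the refiner; an unrecognized type raises an error instead of silently producing an unrestricted query.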

What don’t I get?

Not all structured data we’ve collected is part of the BOSS API.  For example, some third parties who provide us with feeds have elected to keep that data outside of BOSS. Structured data annotations from technologies built by Yahoo! Research are also not available to third party developers via BOSS. However, we aim to include all data we find embedded in web pages that deploy microformats or RDFa.

Our goal is a successful semantic Web where we extract the semantics as we process Web content. Every page marked up with semantic data makes it that much easier for us to extract meaning from that page. And it’s not just us! Google Video Search has recently adopted the same video markup (RDFa and Facebook Share) that SearchMonkey supports.

What’s next?

We will make many more object types available to you soon. In the meantime, you can learn more about SearchMonkey and how we acquire structured data annotations from this post on the YDN Blog.

Kevin Haas

Senior engineering manager, Yahoo! SearchMonkey

August 28, 2009

See More SearchMonkey in Your Search Results

Want a little more SearchMonkey in your Yahoo! search results? Starting today, more enhanced results for product, local, entertainment, reference, social, and tech sites will appear automatically in your results, putting more information and answers right at your fingertips.

First, we’d like to thank everyone who deployed microformats, RDFa, and feeds in response to our blog post in May. Thanks to your efforts, we’ve finished user testing for the new enhanced results templates and have deployed these templates in production. This means that in addition to Video, Documents, and Games, you can now add Products, Local Businesses, Events, Discussions, or News items to your pages. Anyone who provides structured data according to the specified format will automatically gain SearchMonkey default-on status, as long as it adheres to our terms of use.

For example, here are some results you’ll see when you search for products:

Pop Art Toaster Enhanced Yahoo! Search Results

Lil Wayne Tha Carter Enhanced Results

Here are some results you’ll see when you search for local businesses:

The Capital Grille Tampa FL Enhanced Results

Gochi Reviews Enhanced Results

You can see the ratings, number of reviews, phone number, and address for local businesses right on the search results page. You can also see the ratings, price, and number of reviews for products, helping you decide which page fits your interest most, and find what you’re looking for faster.

In addition to these SearchMonkey templates, we’re also releasing a number of custom SearchMonkey default-on applications. Entertainment buffs will be particularly excited about a few of the Entertainment apps, including RottenTomatoes, Netflix, IMDB, and Yahoo! Movies.

The following are just a couple of examples you’ll see when you search for entertainment-related info:

Milk Movie Reviews Enhanced Results

Gran Torino Enhanced Results

Beyond entertainment, we’re also automatically turning on results for high-performing sites in the social networking, reference, and download categories. The list includes Friendster, Britannica, and FileHippo.

We laud the efforts that developers everywhere have put into developing the SearchMonkey ecosystem and their contribution toward improving the search experience for users.

Great apps are built every week, so look for even more SearchMonkey in your search results in the future. We’re looking forward to it, too.

Yi-An Lin and Nick Cox

Senior Product Manager

Yahoo! Search

June 25, 2009

VoCampers Converge at Yahoo! Headquarters in Sunnyvale

An enthusiastic group of data geeks and Semantic Web enthusiasts met last week at our Sunnyvale headquarters where we hosted the latest edition of VoCamp. VoCamps are a series of informal events that provide a small setting where the Semantic Web community can discuss issues related to semantic interoperability and creating, managing, and publishing vocabularies.

The format of VoCamp was conceived by Talis’ Tom Heath and Yahoo!’s Peter Mika, with the first installment organized in Oxford, England, in September 2008. Since then, VoCamps have grown into a real movement, with events organized in Galway, Ireland; Austin, Texas; Ibiza, Spain; and Washington, D.C., with more planned in New York and Bristol, England.

In Sunnyvale, we spent the first afternoon discussing three broad issues: ways of finding vocabularies on the Semantic Web, tools for mapping vocabularies and executing data transformations, and methods for lifting relational databases into the RDF world. Over pastries and pizza the next day, the campers worked in small groups on more specialized topics, including creating methodologies for vocabulary development, and developing a microformat for code documentation. (Many thanks to the microformat admins Tantek Çelik, Kevin Marks, and Ben Ward for bringing their perspectives to this discussion.) Other topics discussed included the Common Tag format and vocabulary visualization.

As Yahoo! Search moves toward a Web of Objects, we know that the developer community will be a critical component for creating a more robust Semantic Web. We were proud to play host to VoCamp Sunnyvale and look forward to future VoCamp gatherings.

Yahoo! Search

June 18, 2009

SearchMonkey Updates: New Enhanced Results and Support of Google Base Formatting

Today, we are announcing two updates that make it easier for site owners and developers to share and use structured data within Yahoo! Search:  new enhanced results and the support of Google Base formatting for structured data feeds. Let’s take a look at these two updates.

New Enhanced Results – Products, Events, News and More

Back in March 2009, we announced a simple way for site owners to embed video, games, and documents in Yahoo! Search results. Starting today, we are expanding this capability by giving site owners the power to display enhanced results for product pages, local information, events, news, and discussions.

If your site’s data falls into one of these categories, add a few lines of markup to your pages, and SearchMonkey will do the rest of the work. After we recrawl your page, we’ll extract the structured data and use it to display your data as an enhanced result.

For example, a retail website could add a few lines of code so that its product pages display as an enhanced result that includes the overall rating, price, reviews, and product photo directly on the search results page. Let’s say we have a fictional store called Sytore.com and the site owners have added the following code to their product pages:

<div typeof="product:Product"
xmlns:product="http://search.yahoo.com/searchmonkey/product/"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">

<span property="product:listPrice">49.99</span>
<span property="product:salePrice">39.99</span>
<span property="product:currency" content="USD" />

<span property="rdfs:label">Pinball Maven : Video Games : Electronics</span>

<span rel="rdfs:seeAlso media:image">
<img src="http://www.sytore.com/product.jpg"/>
</span>

<div rel="review:hasReview">
<span typeof="review:Review">
<span property="review:rating">4</span>
<span property="review:totalRatings">17</span>
</span>
</div>
</div>

Sytore.com’s product pages (such as its product page for “Pinball Maven”) would then display as an enhanced result:

SearchMonkey Enhanced Results
(*Example only)
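To give a feel for the extraction step, here is a minimal Python sketch that pulls the `property` values out of markup like the Sytore.com example. It is not how our crawler works and it is not a conforming RDFa processor: it ignores `typeof`, `rel`, and namespace resolution, and the names `PropertyCollector` and `extract_properties` are invented for this illustration.

```python
from html.parser import HTMLParser

# Shortened copy of the fictional Sytore.com markup above.
SAMPLE = """
<div typeof="product:Product">
<span property="product:listPrice">49.99</span>
<span property="product:salePrice">39.99</span>
<span property="product:currency" content="USD" />
<span property="review:rating">4</span>
</div>
"""


class PropertyCollector(HTMLParser):
    """Collect (property, value) pairs from elements with a `property` attribute."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # property name still waiting for its text value

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        if "content" in attrs:
            # An explicit content attribute takes the place of the element text.
            self.pairs.append((prop, attrs["content"]))
        else:
            self._pending = prop

    def handle_data(self, data):
        text = data.strip()
        if self._pending is not None and text:
            self.pairs.append((self._pending, text))
            self._pending = None

    def handle_endtag(self, tag):
        # Simplification: stop waiting for text once any element closes.
        self._pending = None


def extract_properties(markup):
    """Return a dict mapping each property name to its extracted value."""
    parser = PropertyCollector()
    parser.feed(markup)
    parser.close()
    return dict(parser.pairs)


if __name__ == "__main__":
    for name, value in extract_properties(SAMPLE).items():
        print(name, "=", value)
```

A real consumer would use a full RDFa parser so that prefixes such as `product:` are resolved to their namespace URLs and nested objects such as the review are kept together.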

Enhanced results bring users the information they need while helping site owners stand out on the search results page. You can add code to display local information as enhanced results with phone numbers and addresses. You can also display location and date for festivals, concerts, and other events.

A news website can take advantage of the SearchMonkey news object type and add code to enhance how their pages display in search. For example, a news website such as the (fictional) Thenewsy.com could add a few lines of code to its news article pages to display a photo and publication date. A query on “Obama Iraq” could display an enhanced result from Thenewsy.com:

SearchMonkey news enhanced result example
(*Example only)

If your site contains a forum, blog, or other types of online discussion, you can add some markup to display the number of comments and thread date.  You can learn about how to get started with each of these object types on our overview page.

Enhanced results for these new data types will appear in Yahoo! Search results a few weeks after you add the markup, and after we’ve crawled your pages to extract the necessary structured data. There is no sign-up process, so we encourage you to begin adding markup to your sites now so that your results can be visible to users.

From the beginning, SearchMonkey has been powered by open formats, which is why we are continuing to support the use of RDFa, microformats, and now NewsML for these additional object types.  With the help of site owners and developers, we are moving more rapidly towards structuring the Web and enabling new search experiences.  As we mentioned a few weeks ago, RDFa structured data collected by SearchMonkey has increased by 413% since October 2008. With the release of these new object types, we look forward to seeing that figure continue to climb.

Google Base

Since its launch in late 2005, there has been a growing community of tools and partners for Google Base, Google’s online repository for user-contributed structured data. Today, Yahoo! Search will accept five popular Google Base feed item types: Event, Product, Review, Job, and Personals.

Why is this important?  First, site owners who have Google Base feeds containing Event and Product information can now automatically have their enhanced results displayed in Yahoo! Search by submitting their existing feed through Yahoo! Site Explorer.

In addition, for all five item types, it’s now easier to use your Google Base feed within Yahoo! Search.  Site Explorer will convert your existing feed to DataRSS XML, allowing your data to be stored within Yahoo! and accessible to developers through BOSS for building third party search engines and the SearchMonkey Developer Tool for building applications.

For detailed instructions, refer to the full documentation within the Yahoo! Developer Network site.  For information about how to build your Google Base feed, refer to the Google Base feed documentation.

Please let us know if you have any questions or comments.  We welcome your feedback.

Yahoo! Search

February 26, 2009

Let SearchMonkey Feed Your Facebook Addiction

Starting today, Facebook enhanced results will automatically appear in search results. This means users can add a friend, poke, send a message, and view a person’s friends from the deep links on the search results page. Facebook shared the structured data for this SearchMonkey app by adding semantic markup to their public profile pages.

Here’s an example of the Facebook enhanced result with Alex Moskalyuk, a key Facebook engineer on this project.

Facebook Enhanced Result - Alex Moskalyuk

See the SearchMonkey app for Facebook in action yourself and try a search for your friends. Here at the Yahoo! Search Blog, we had fun checking out the Facebook profiles of marketing VP Raj Gossain and senior product marketing manager Graham Mudd.

We care about privacy as much as you do, so you’ll only see results for Facebook users who have enabled their profiles to be publicly searched and viewed. If you’re interested in “social-izing” your search results page further, check out other SearchMonkey apps for social networking sites such as StumbleUpon, Delicious, and MyBlogLog.

We hope the SearchMonkey app for Facebook and our other social apps make finding and connecting with friends on the Web easier than ever. Let us know what you think.

SearchMonkey Team

February 11, 2009

BOSS Update: Open Monetization, Pricing, Structured Data, and More

Today, we’re announcing a handful of new features for Yahoo! Search BOSS as well as important updates on our terms of service and pricing.

Three New Features
Perhaps the most important component of what we’re releasing today is access to SearchMonkey structured data through the BOSS API. The primary way in which SearchMonkey acquires structured data is by using the Yahoo! Web Crawler to scour the web for embedded semantic markup such as microformats or RDF. Starting today, all this data is available to BOSS API users.

BOSS site traffic

The structured data that site owners share with us through feeds will be openly available in the near future if site owners opt to participate. You can read more about how it all works here, but it’s pretty straightforward: just add the “view=searchmonkey_feed” parameter to your API request and we’ll return all available structured data name-value pairs in DataRSS XML. You can also return semantic data in RDF XML using “view=searchmonkey_rdf”.

Here’s an XML example of structured data from President Obama’s LinkedIn page:

BOSS XML example

We’re excited to see what can be built with this data, so please tag your mashups and products with bossmashup on Delicious.

Second, building on our release of Key Terms last November and SearchMonkey structured data today, we’re also making Long Abstracts available. This is all part of an effort to provide a rich set of document-level data to BOSS developers – in this case a longer description of the page (up to 300 characters compared to 170 previously). You can access these by appending the “abstract=long” parameter to your API request.
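For illustration, here is a small Python sketch of how the `view` and `abstract` parameters described above might be appended to a request. The endpoint shape is an assumption borrowed from BOSS web search example URLs published elsewhere on this blog, the helper name `boss_url_with_options` and the `YOUR_APP_ID` placeholder are hypothetical, and the code only builds the URL without calling the service.

```python
from urllib.parse import quote_plus, urlencode

# Assumed endpoint shape, taken from example URLs elsewhere on this blog.
BOSS_WEB_ENDPOINT = "http://boss.yahooapis.com/ysearch/web/v1/"

# The two view values named in this post.
SEARCHMONKEY_VIEWS = {"searchmonkey_feed", "searchmonkey_rdf"}


def boss_url_with_options(query, appid, view=None, long_abstract=False):
    """Build a BOSS request URL with optional structured-data and abstract options."""
    params = {"appid": appid, "format": "xml"}
    if view is not None:
        if view not in SEARCHMONKEY_VIEWS:
            raise ValueError("unsupported view: %r" % view)
        params["view"] = view  # DataRSS XML or RDF XML
    if long_abstract:
        params["abstract"] = "long"  # longer page description
    return BOSS_WEB_ENDPOINT + quote_plus(query) + "?" + urlencode(params)


if __name__ == "__main__":
    print(boss_url_with_options("obama linkedin", "YOUR_APP_ID",
                                view="searchmonkey_feed", long_abstract=True))
```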

Lastly, for years Site Explorer has been a valuable tool for webmasters to understand how Yahoo! Search is indexing their site. Site Explorer also allows users to obtain inlinks for domains and URLs, which are now available through two new BOSS services called se_inlink and se_pagedata.

Open Monetization & Pricing
Effective immediately, we have changed our terms of service to allow developers to use third party monetization platforms (ad-based or otherwise). For obvious reasons monetization is critical to the BOSS ecosystem, so to provide as many opportunities as possible we have decided to adjust our terms to provide developers with more flexibility.

Today we’re also announcing our plans for implementing usage fees for BOSS. We’re introducing fees for a couple of reasons. First and most importantly, we’re hard at work on a number of technologies that will enhance both the functionality and performance of BOSS, and usage fees will help support this development. For example, once we introduce pricing, developers will be able to request 1000 results in a single API call (instead of the current 50). We’ll also be introducing an SLA to ensure BOSS is a robust and stable service for developers. Second, we believe that introducing the proposed pricing structure will improve the ecosystem by optimizing capacity for our serious developers.

You can find all the details on how the fee structure will work here on the BOSS Usage Fees page. Instead of focusing on the particulars, we’ll share the principles we used in developing it. Our goal is to encourage adoption and usage with a low, but fair price – so as not to maximize revenue at the expense of trial and innovation. That is also why we’re going to provide up to 10,000 search queries per day (depending on the type of API call) free of charge to all developers. You’ll notice that the cost to developers is dependent not just on the number of queries requested, but also the type (i.e. how deep your query is). Rather than go with a simplistic “one size fits all” model, we feel that a “pay for what you use” approach is fairest for all types of users.

We’re announcing the fee structure months in advance of it taking effect (likely late Q2 of this year) because we want to give our developers as much advance notice as possible, and also because we’re as interested as ever in your feedback, so feel free to comment below or on the BOSS developer forum.

Ashim Chhabra
Yahoo! Search BOSS Team

July 31, 2008

Hola / Olá! SearchMonkey

Getting an international passport can be tough for the average monkey. But that’s not the case with this SearchMonkey. Today we’re proud to announce we’ve launched the Yahoo! Search Gallery in Latin America, so our friends in Argentina, Mexico and Brazil can now join the party.

Since our initial launch two short months ago, the team has been busy releasing major enhancements to the Gallery, supporting new microformats, putting on developer events in Sunnyvale, London and Paris, and picking the winners of the SearchMonkey Developer Challenge from hundreds of novel submissions. Now that the Gallery is extended to Latin America, we’re getting closer to finally seeing the output of an infinite group of monkeys with an infinite number of keyboards… well, maybe not quite, but it’ll certainly be fun to see the innovative enhancements they build.

And keep your eyes out for where SearchMonkey travels to next.

Nick Cox
Senior Product Manager
Yahoo! Search

June 24, 2008

Yahoo! Chats with Semantic Web Expert, Ben Adida

Yahoo!’s plans to “open up” really started circulating at the beginning of this year. Not long after, Yahoo! Search announced its plans to support semantic markup, specifically crawler support for formats like RDFa and eRDF, and provided a glimpse into our open approach to search.

As Yahoo! prepares to support standards, like RDFa for example, we’ve continued to work closely with the best and brightest in the semantic markup community. We were thrilled to have Ben Adida visit the Sunnyvale campus a few weeks ago. Ben is a member of the Faculty at Harvard Medical School and at the Children’s Hospital Informatics Program, as well as a research fellow with the Center for Research on Computation and Society with the Harvard School of Engineering and Applied Sciences. He is also the Creative Commons representative to the W3C and chair of the RDF-in-HTML task force, focusing on bridging the semantic and clickable webs.

Ben was kind enough to submit himself to a barrage of questions on RDFa, its development and the opportunities it provides. Take a look and feel free to drop questions you have in the comments. We’ll do our best to cycle them through to Ben.

Lawrence Kim, Yahoo! Search &
Peter Mika, Yahoo! Research

Yahoo! (Y!): RDFa has been long in the making… is it ready now?
Ben Adida (BA): Indeed it has been long in the making, and for good reason. We had to make sure we didn’t step on other specifications’ toes, that we respected existing design and uses of HTML, that we enabled the expression of enough flexible data to be useful in a number of current and future use cases, and that we had a valid processing model with test cases to help implementors.

We have all of that now. So yes, RDFa is ready. It has just been approved by the W3C as a Candidate Recommendation, with the specific text of the specification and a brand new Primer published on June 20th.

Y!: What can I do with RDFa?
BA: You can tell the world what various components on your web page mean by marking up things like:

  • The title of a photo
  • Your name and contact information
  • The license under which you’re distributing your latest MP3
  • The ingredients of a cooking recipe
  • The price of an item
  • A gene on which you recently wrote a paper
  • … Anything that you want to make more machine-readable

With RDFa, you can reuse existing concepts, e.g. the title and price of an item, no matter what that item is. If there’s a field you need that doesn’t exist, you can create it.

This level of granularity encourages you to mark up your content as fully as possible, while letting applications consume only as much of the data as they need.

Y!: Who is supporting RDFa?
BA: Creative Commons and Digg are two early adopters of RDFa, and there are a number of smaller web publishers who have begun adding RDFa markup to their pages. We’ve also just heard that the UK National Archives are committed to adopting RDFa.

Y!: What advantages does RDFa provide compared to microformats, eRDF and AB Meta?
BA: Microformats, eRDF and RDFa share a common goal: to make it easy for HTML authors to add machine-readable tags to express the meaning of their web data. So before we get into a fight, it’s important to realize that all three share this important common goal.

Microformats work well for well-defined items, such as contact information (hCard) and calendar items (hCal). They tend to become more complicated when the data gets more varied. Fields can’t easily be shared across microformats, and all microformats must go through a centralized approval process to make sure no conflicts arise.

RDFa doesn’t have vocabulary conflicts: data fields, e.g. “title” can be reused by anyone, and there’s never any confusion as to what a given field means, since fields are, in fact, URLs. Entirely different types of data can share fields, which is exactly what applications need for extensibility. Multiple data items can be published on a single web page and, in contrast with microformats, relationships between the data items can be easily expressed.

eRDF has a similar vocabulary approach to RDFa, but it cannot express nearly as much data as RDFa. In particular, expressing relations between multiple items on a page is more complicated, and describing inline PDFs or images is not always possible. Also, eRDF is not quite as modular: vocabularies can only be imported in the HEAD of a document, so a widget-ized page would have an easier time using RDFa over eRDF.

AB Meta, which is new to me, appears to be a small subset of the intersection between RDFa and eRDF. Because it is a limited subset, it suffers a bit from the limitations of microformats: who gets to extend AB Meta? I would recommend sticking to the collaborative efforts such as RDFa and eRDF.

If you need more complete expressivity and the modularity required in a widget-ized web world, then you need RDFa.
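Editor’s aside: Ben’s point that RDFa fields are really URLs can be sketched in Python. A prefixed name such as `dc:title` is shorthand for a full URL, so two unrelated kinds of data can share the field without ambiguity. The helper name `expand_curie` and the `PREFIXES` table are made up for this illustration. The two namespace URLs are the published Dublin Core and FOAF ones.

```python
# Prefix table for the illustration: Dublin Core elements and FOAF.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand a prefixed field name like 'dc:title' into its full URL."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError("unknown prefix in %r" % curie)
    return prefixes[prefix] + local


if __name__ == "__main__":
    # A photo and a recipe can both carry dc:title; it expands to the same URL.
    print(expand_curie("dc:title"))
    print(expand_curie("foaf:name"))
```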

Y!: What would you say to the critics who say that RDFa is too difficult to author?
BA: It’s a matter of taste and finding the right compromise.

In my opinion, RDFa and eRDF have similar levels of complexity as far as authors are concerned. I prefer writing RDFa, and I’m sure Ian Davis prefers writing eRDF. But I don’t think either one of us would seriously argue that one is much easier than the other.

It’s a little bit more complicated to write RDFa than it is to write microformats, but that’s not surprising given that microformats are more limited in scope, and there are notable extensibility costs to using microformats.

In general, we expect that web publishers will write RDFa in HTML templates, rather than every time they have an item to publish. Most microformat deployments work this way, too; few people write them by hand each time. So the increased complexity is negligible in the bigger picture.

Y!: Unlike microformats, RDFa depends on the availability of shared vocabularies (ontologies). Is that a problem?
BA: A number of vocabularies are already available and particularly stable: Dublin Core for documents, FOAF for people and their networks, Creative Commons for document licensing, hAudio and hVideo for online media. Then there are highly specialized vocabularies, like Uniprot and the Open Biomedical Ontologies (OBO) for the life sciences.

In my opinion, this is a huge win for RDFa. You really want vocabularies developed by experts in the appropriate field. Bio-informaticians develop vocabularies for biomedical research, musicians develop vocabularies for music, and lawyers develop vocabularies for copyright licensing.

Y!: What’s next for RDFa?
BA: For the next few months, we’re going to focus on helping publishers produce RDFa and tool builders parse it correctly. Yahoo! is playing a pivotal role in this space with SearchMonkey. We hope to see Yahoo! properties publish RDFa soon!

Y!: Where can I learn more about RDFa?
BA: Our wiki has all the relevant material: http://rdfa.info/wiki

And you should join our brand new users’ mailing list: http://lists.w3.org/Archives/Public/public-rdfa/

May 15, 2008

The Monkey is Out and the Challenge is On

It’s been three weeks since we began the limited preview of Yahoo! Search’s new open developer platform, SearchMonkey. Today, we’re officially opening up the doors to all developers — professionals and hobbyists — to begin building applications that enhance the usefulness and relevance of search results.

There are three components to this open ecosystem:

  • Site owners share structured data with Yahoo!, using semantic markup (microformats, RDF), standardized XML feeds, APIs (OpenSearch or other web services), and page extraction.
  • Third party developers build SearchMonkey applications.
  • Consumers customize their search experience.

So, what’s in it for third party developers?

With SearchMonkey, developers have a hand in shaping the next generation of search by building customized search results and mash-ups that users can add to their Yahoo! Search experience. By leveraging structured data from sites like CitySearch, StumbleUpon, eBay, or Epicurious.com, developers can add navigational links, reviews, contact information, and even locations to provide enhanced search listings.

Developers can build two types of applications using SearchMonkey: Enhanced Results and Infobars. Enhanced Results replace the current standard results with a richer display. All the links in the Enhanced Results must point to the site to which the result refers. Infobars are appended below search results and can include metadata about the result, related links or content, or links for user actions (such as adding a movie to a Netflix queue).

infobar-netflix

The process for building SearchMonkey applications is very straightforward:

    1) Application Type — Decide what type of app you want to build (Enhanced Result or Infobar) and enter basic info such as application name, description and icon.
    2) Trigger URLs — Decide the URL patterns that will trigger your app. For example, for the Enhanced Result above, the pattern would be “acmemovies.com/*”
    3) Data Services — Data Services are the structured data on which SearchMonkey apps are based. They can be created using data available in the Yahoo! Search index (via data feeds or page markup such as microformats or RDF) or by using APIs or page extraction.
    4) Appearance — Use PHP to configure how structured data should appear in the application.

DevTool Screenshot
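To make step 2 a little more concrete, here is a hedged Python sketch of how a trigger pattern such as “acmemovies.com/*” could be tested against a result URL. The exact matching rules SearchMonkey applies are not spelled out in this post, so the shell-style wildcard matching, the stripping of a leading “www.”, and the function name `matches_trigger` are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def matches_trigger(url, pattern):
    """Return True if the URL's host and path match a wildcard trigger pattern."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Assumption: a leading "www." is ignored when matching.
    if host.startswith("www."):
        host = host[4:]
    # Compare "host/path" (without the scheme) against the pattern.
    target = host + (parts.path or "/")
    return fnmatchcase(target, pattern)


if __name__ == "__main__":
    pattern = "acmemovies.com/*"
    for url in ("http://www.acmemovies.com/title/123",
                "http://example.com/acmemovies.com/x"):
        print(url, matches_trigger(url, pattern))
```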

Announcing the SearchMonkey Developer Challenge

To foster innovation and creativity on the SearchMonkey platform, we’re hosting a good old-fashioned competition. The SearchMonkey Developer Challenge will recognize innovative applications within four categories: Best Enhanced Result, Best Infobar, Most Innovative Use of Structured Data, and Best Data Service. There is also a Grand Prize for the best application across all categories. You have until June 14th to submit your applications for a chance to win up to $10,000.

And don’t forget to come kick things off with us this evening at the SearchMonkey Developer Launch Party. Catch live demos, meet the product team and enjoy free food, beer and, of course, schwag at Yahoo!’s Headquarters in Sunnyvale.

Whether you can join us for the party or not, keep in touch — visit our suggestion forum or drop us a comment below. We want to know how the tool is working out for you.

We look forward to evolving web search with you.

Amit Kumar
Director, Product Management
Yahoo! Search

March 13, 2008

The Yahoo! Search Open Ecosystem

A few weeks ago, we began talking about the new Yahoo! Search open platform. Today, we’re releasing more details about two important components of the initiative — the developer platform as well as our support of a number of semantic web standards.

The Data Web in Action
While there has been remarkable progress made toward understanding the semantics of web content, the benefits of a data web have not reached the mainstream consumer. Without a killer semantic web app for consumers, site owners have been reluctant to support standards like RDF, or even microformats. We believe that app can be web search.

By supporting semantic web standards, Yahoo! Search and site owners can bring a far richer and more useful search experience to consumers. For example, by marking up its profile pages with microformats, LinkedIn can allow Yahoo! Search and others to understand the semantic content and the relationships of the many components of its site. With a richer understanding of LinkedIn’s structured data included in our index, we will be able to present users with more compelling and useful search results for their site. The benefit to LinkedIn is, of course, increased traffic quality and quantity from sites like Yahoo! Search that utilize its structured data.

In the coming weeks, we’ll be releasing more detailed specifications that will describe our support of semantic web standards. Initially, we plan to support a number of microformats, including hCard, hCalendar, hReview, hAtom, and XFN. Yahoo! Search will work with the web community to evolve the vocabulary framework for embedding structured data. For starters, we plan to support vocabulary components from Dublin Core, Creative Commons, FOAF, GeoRSS, MediaRSS, and others based on feedback. And, we will support RDFa and eRDF markup to embed these into existing HTML pages. Finally, we are announcing support for the OpenSearch specification, with extensions for structured queries to deep web data sources.

We believe that our open approach will let each of these formats evolve within their own passionate communities, while providing the necessary incentive to site owners (increased traffic from search) for more widespread adoption. Site owners interested in learning more about the open search platform can sign up here.

A Developer Ecosystem for Search
We’re also announcing, today, that the Yahoo! Search open platform will be open to all third party developers. We will be kicking off this component of our open platform with a developer launch party at our Sunnyvale campus in the coming weeks. That day, we’ll launch a beta program for a tool that developers can use to build Enhanced Results applications for the Yahoo! Search platform. Enhanced Results apps built by developers can utilize the structured data available through public APIs and in our index (made available by site owners through either feeds or the semantic web standards discussed above).

Let us know what you think below and keep an eye on the Search Blog — we’ll be posting more info about the upcoming launch party.

Amit Kumar
Director, Product Management, Yahoo! Search