Inside Yahoo! Labs: A Chat with Dr. Ben Shahshahani
- Posted August 11th, 2010 at 7:00 am by Yahoo! Search
Dr. Ben Shahshahani is the head of Search Sciences at Yahoo! Labs, which focuses on the scientific areas of information retrieval, machine learning, data and text mining, and natural language processing to drive innovation in Yahoo!’s search experience. Ben talked to Yahoo! Search Blog’s Mireille Majoor about going beyond the 10 blue links, giving the user information before they search, and using social data to better understand user intent.
Yahoo! Search Blog: We’ve been hearing a lot about moving away from the 10 blue links and providing exciting new search experiences. Can you tell us more about how Yahoo!’s search sciences are contributing to that effort?
Ben Shahshahani: The whole concept of search being just a way of finding the top ten relevant articles has been changing for some time. What you see on the search results page now is a very rich blend of information coming from different sources, like back-ends of local databases or news, Twitter, real time, social information, images, and videos.
The other thing that has been happening is an integration of structured and unstructured data, where structured means that particular attributes are attached to different entities. We have a pretty active technology and science effort in trying to understand the main object, its attributes, and its relationships, not just the text on a web page. So, rather than users having to go through various content pages to find that information, we bring it up front for them.
Q: What can users look forward to with new developments in determining user intent?
A: We start very early in the process, trying to basically read the mind of the user and figure out what the user might be asking for so we can help them get to the information they want very quickly. So if you think of Yahoo!’s Search Assist, the minute you start typing the first few letters, the tray opens and we start recommending completions for the query.
Recently, we made a change where we promote time-sensitive queries in our Search Assist feature. For instance, if we think that an event that’s happening right now will be important to users, we may suggest the latest news about that query. So, if the World Cup is happening, and you start typing in “w-o-r,” you’re going to see queries that are related to World Cup instead of a less timely suggestion like “world map”.
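The World Cup example can be pictured with a small sketch. This is only an illustration of the general idea of boosting timely completions, not Yahoo!’s actual Search Assist implementation; the `Suggestion` fields, the `suggest` function, the `trend_weight` parameter, and all scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    query: str
    popularity: float      # hypothetical long-term popularity score
    trending: float = 0.0  # hypothetical signal for a current event


def suggest(prefix, candidates, limit=3, trend_weight=2.0):
    """Return up to `limit` completions for `prefix`, favoring timely queries."""
    prefix = prefix.lower()
    # Keep only candidates that complete what the user has typed so far.
    matches = [c for c in candidates if c.query.lower().startswith(prefix)]
    # Rank by popularity plus a weighted boost for time-sensitive queries.
    matches.sort(
        key=lambda c: c.popularity + trend_weight * c.trending,
        reverse=True,
    )
    return [c.query for c in matches[:limit]]


# Made-up candidates and scores, for illustration only.
candidates = [
    Suggestion("world map", popularity=0.9),
    Suggestion("world cup schedule", popularity=0.4, trending=0.8),
    Suggestion("word games", popularity=0.7),
    Suggestion("weather", popularity=1.0),
]

print(suggest("wor", candidates))
```

With the trending boost applied, "world cup schedule" ranks ahead of "world map"; setting `trend_weight=0.0` falls back to ranking by popularity alone.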
Once a query comes in, the question is: “what is the intent” or “what are the common intents of the users submitting this query?” To answer that question, we use a variety of ways to understand the query – a lot of the queries are about objects.
Q: What do you mean when you talk about objects in the search context?
A: Objects are things in the real world. They can be an event, a location, a person, or a product. Our active effort in understanding attributes and their relationships helps us find out the things you can do with those objects. Analysis of the data that we have tells us the kinds of things people do after they search for something. This helps us identify the possible intents when a user submits a query about an object. Now, on the other end of the spectrum, we are really looking towards trying to guess the intent of the user even before they start submitting a query, to give them the information they need before they even think to search.
Q: What are some trends you have observed about data sources and how users are getting content?
A: If you look at the way people discover content and information, we also see a change there. Nowadays content is often discovered through social networks. Your friends or people that you follow on Twitter may share a URL and as a result the traffic to those sites is going up.
Another trend has been that search is generated in the context of some other activity. For example, you may be reading an article, and that article might contain references to some things or objects that trigger your interest. We may extract information from our backend and then provide a blend of structured and unstructured data on the side for the user to see.
Q: Let’s talk about search and social. Now that we have more social data available to us than we did a few years ago, how are we making use of that data to determine search intent?
A: Social is a more recent topic. There are multiple facets to it. On one hand, social is a kind of data. We have social and real-time content that is generated on Twitter, on Facebook, on Yahoo! Pulse, the comments that people leave, the traces that they leave behind on various pages. To the extent that these are all content, in the context of search, you want to present the most relevant content to a user, including social content. We already trigger results from Twitter when you do a search. But we have algorithms that decide whether reaching out to Twitter as a source of content is interesting for that query, for that person.
The other aspect is the insights users leave behind through their social activities. The comments you leave, the “thumbs up” and “thumbs down” you give, and the various other kinds of sharing you do online all help enrich our insights about users and content.
Q: How does your research at Yahoo! Labs influence the way you view search?
A: One focus area for us is really about understanding users in different contexts. Search is the context where the intent is the most explicit. But even in the context of search, we like to consider search not as a stateless information-extraction task, but as an ongoing dialogue between the user and the system. User intent, even in the context of search, can go beyond the query that the user submitted; it could be in the context of the entire session. That’s from the perspective of understanding the user. We also have a strong effort in understanding content. What is the content about, in terms of the objects and relationships associated with the query? Bringing these two together can create awesome experiences for users.