
December 30, 2004

Ali Diab Interview, Part II

This is the second half of the interview with Ali Diab.

Q: Going back to the products you work on at Yahoo!. I know that Yahoo! Maps is about to introduce some new features. Can you talk about them yet?
A: Actually, we recently launched our new real-time traffic feature for Yahoo! Maps. It provides driving conditions, incident reports, the speed of traffic, the severity of each incident and a bunch of other things that help people get where they're going with less of a headache. In the future, I think you can expect Yahoo! Maps to become much more context-aware, user-friendly and inter-connected with other services and device interfaces, be they mobile phones or PDAs.

Q: Earlier you talked about the reviews feature of Yahoo! Local. This question came in from a blogger who questioned how Yahoo! would sell advertising to businesses whose customers might write negative reviews. He wanted to know why he'd pay to advertise and risk getting a bad review from an unhappy customer.
A: Well, the key concept behind Local is the idea of community. To build any community you have to allow people to hear and be heard. Ratings and reviews allow customers to voice their opinions about the businesses and services they've used. Of course, there will be a broad range of opinions from negative to positive and all the ones in between, but to add real value to the Local user who's trying to decide whether to use a certain business, you have to have an open forum.

In some ways, I also see the reviews as a kind of checks and balances for the business. It holds them accountable to their customers who now have an additional platform to voice their likes and dislikes. People are going to talk about what was right or wrong with a business anyway. At least this way, the business can monitor what's being said and maybe learn from it.

But ultimately, the businesses will have to weigh the pros and cons. When all is said and done, Local is still one of the best and easiest ways to reach a massive audience that would cost a bundle to target in the traditional marketing world.

Q: So the reviews aren't censored by Yahoo! to create a kind of biased, business-partial directory?
A: Again, it's about providing honest feedback from real customers. If we slanted them in any way, they wouldn't be reliable. We do provide guidelines for what to write and not write in the reviews, but for the most part, we trust that customers will use the platform responsibly.

Q: I heard you just got married a few months ago. Is there any similarity between planning a wedding and launching a product?
A: I did learn some very interesting lessons planning the wedding that I've applied to my job and vice versa. First you need to go with the people you trust will do a good job and who have a good track record. Then you have to let them do their job. You can't try to micromanage them because you can't control everything. If they enjoy what they're doing, they'll put their heart into it and go beyond the call of duty.

In terms of what I learned from the wedding, obviously Nora has great taste and is very strong-willed -- like me. (laughs) But our tastes differed on a lot of things so we came up with a rule that I think is very important in the workplace as well: if it matters more to that person than it does to you, then let them have the final say--obviously within reason. That's something I'm increasingly applying in my work and I find that the results are great because people who care passionately about something typically have thought about it a lot more than you have and may have more insight into it than you.

Q: I understand that you have a Math degree from Stanford and you've talked about always having a passion for computers and technology. When did you first realize that this was something you truly loved?
A: Well, my parents always say that when I was a toddler, my dad had a really old micro-computer and I was fascinated watching him work on it. In addition, I've always been interested in taking things apart and putting them back together again. I've also always been product oriented. I used to make furniture when I was in high school and build my own bikes (I've been racing motocross since I was a kid). I must have torn down and built up probably a half dozen cars and motorcycles.

I like building things that have a purpose or that lead to a certain outcome. I find it interesting to research the best materials and resources to figure out the most effective way to build something. It's really fun.

Q: So what are you working on right now? What are you building?
A: I'm an avid skier and I'm trying to get my hands on an old ski-press so I can create my own pair of skis. I bought these really phat powder skis for the season and they're nice but I don't think they're going to have the edge-holding characteristics I need. Skis that are very good for holding an edge on ice tend to have a lot of metal or a lot of wood in them. They also tend to be heavy and narrow so they can hold an edge against a mountain. Whereas powder skis tend to be wider and lighter to float and keep you above the powder so you don't sink.

I'd like to build these kind of hybrid skis that have the best of both worlds. I want to use a certain type of wood that is very porous and lightweight so that it'll float well on powder but at the same time have very good vibration dampening characteristics so that it will hold well on ice and chattery snow.

Q: As we wrap this interview up, what do you look for when you're recruiting for the Local team?
A: We recruit all the time and we're always looking for smart, self-motivated people. I especially like recent grads. I feel like they're hungry and I like watching people evolve and teach themselves how to do things. It's fun to see people just getting out of school. They make mistakes but it's seeing that energy and honesty and that effort that makes it really, really satisfying.

I also look for people who have a passion outside of work -- be it sporting, musical, philanthropic, whatever -- because I think that kind of balance is important.

But balance doesn't always come easily. At first you go from one extreme to the other -- it's like a pendulum -- and then eventually you find that happy medium, that harmonic frequency that works.

Yvette Irvin
Y! Profiler



December 22, 2004

An Interview with Ali Diab

Next up in my "quest to meet the most dynamic people in search" is Ali Diab. Ali doesn't sit on the sidelines. He's the kind of guy who likes to roll up his sleeves, figure out a problem, and dive in and make it happen. You see it in his passion for developing exceptional Search products as well as in his general zest for life.

In this two-part series we'll find out what Ali has to say about competition in the Local Search market, his love of technology, and his passion for helping those in need.

Q: Describe to me what you do for the Local team here at Yahoo!.
A: I oversee product management for Yahoo!'s local products, which include City Guides, Local, Maps, and Yellow Pages. I also work with other groups who want to leverage or add some type of a local aspect to their products. That obviously includes heavy integration with Yahoo! Search, as well as teams like Personals, HotJobs and all sorts of other areas of the network.

Q: What made you decide to join the Yahoo! Local team?
A: After meeting the team, I was really impressed. They are not only some of the top technologists and engineers I've ever worked with but they also have a very strong understanding of their consumers' needs. They are genuinely a good bunch of people to work with, which is always a huge plus. I'd actually say that's one of the most important things.

Q: What would you say is the biggest challenge working on a product that has such strategic importance to the organization?
A: I think the biggest challenge for us as a team is continuing to innovate and build products that our consumers really want. The level of competition in this area is immense and comes from many different directions. So staying ahead of the competition, so to speak, can be challenging.

Q: Speaking of competition, what's unique about Yahoo! Local compared to similar local offerings?
A: I think there're a lot of things that differentiate Yahoo! Local. Of course we have the ratings and reviews, which give people an idea of what others think of the business or service before they try it themselves.

But I think one of our biggest differentiators is the depth of structured content we offer. We provide the basics like business address, phone number, website and then we take it a step further: we let businesses provide information that's unique to them like hours of operation, payment methods, specialty, and ambiance.

Q: What if the business doesn't have a website? Can I still find them in Yahoo! Local?
A: Yes, you'll still find them. Most people assume that if the business doesn't have a website then it can't be listed, but that's not true. If they have a physical place of business, we'll include them in Yahoo! Local. This is important because more and more people are turning to the Internet for local information and they need to find more than just those businesses that have their own website. It also helps smaller businesses connect to a much larger audience than they could traditionally.

Related to that, we have a substantial number of businesses and services on Local and we're continuing to grow that content. We have over 15 million listings and we've made it easier for people to add or update businesses. This is something that our users said they wanted when we were in beta and now we've added it. It essentially lets business owners either add their business or edit their existing one and it also encourages non-business owners to suggest a business listing or alert us to business changes in their community. [See Search Engine Watch article]

We also provide things like our refine and sort, which are unique to Local. These features let you define the specific type of business you're looking for. For example, instead of just looking for any and all restaurants in San Francisco, I can specify that I want an elegant restaurant with entertainment, a great bar and within a certain price range.

Q: What do you think you uniquely bring to what you do?
A: I think I'm good at building teams and recruiting people and motivating people to perform. I believe you have to find the right people to do the job and then give them breathing room. You have to let them demonstrate that they can succeed.

Zod [Farzad Nazem] our CTO, is a really good role model in terms of how to run an organization. He's told me many times that you need to just give people the benefit of the doubt and you need to let them do their job even if it isn't always in the way that you think it should be. Even if you're right, it's kind of the nature of democracy; people have free will. You can probe and you can question, but at the end of the day, if people are in a role and you want them to be successful and the company to be successful, you have to let them do their job.

Q: At thirty, you've accomplished quite a lot. What's the biggest lesson you've learned from your experiences so far?
A: I guess I've learned through both my academic and professional experiences that you need to pace yourself. I feel like if I want to be happy long term, I really have to enjoy everything I do and that may require me to focus on doing fewer things.

I can't say that it [pacing] has been an easy thing for me. In some ways I've had to rework my wiring from being always driven, always pushing, to sometimes kind of laying back a little bit and letting things happen at their own pace. And being in a type A driven industry and a type A driven company in particular, it's sometimes hard to pace yourself because you often feel like you're forgoing opportunities or you're not rising as fast as your peer group or whatever, which may sometimes be the case. But I do believe ultimately, long term, if you're going at your own pace and you're doing the things that you really enjoy, you'll achieve the things that you want to achieve when it's right to achieve them. And you'll enjoy yourself along the way, which is more important.

Stay tuned for part II next week.

Yvette Irvin
Y! Profiler



December 16, 2004

Searching For...Gridlock?

Thought you all might find this as cool as we did (and very useful for us commuters). Yahoo! Maps now includes real-time traffic, construction & accident reports, and even speed/congestion information which can then be overlaid onto your map or driving directions. When you pull up a map or driving directions, click on the module on the right to activate traffic info. For example, San Francisco traffic:

[Image: traffic.gif]

Coverage is in 70 top metro areas right now. The data comes from many sources including government agencies responsible for traffic info collection in those cities and it works much the same way that SmartView does.

Play around with it and let us know what you think!

Jeremy Kreitler
Product Manager
Yahoo! Local



AV/ATW and Video Search

We've seen people commenting on the similarities between AllTheWeb & AltaVista video search and the just released Yahoo! Video Search beta. Those observations are correct: we've made a number of improvements to the Yahoo! Search infrastructure and index as part of the video search beta rollout (not to mention adding RSS support) and ATW, AV and other sites we supply video search results for are now benefiting from that.


Please keep the feedback coming!

Andy Volk
Product Manager
Yahoo! Video Search



December 15, 2004

Yahoo! Video Search Beta

There were some rumors a few weeks back about Video Search products coming in 2005. Well, we're ready to show you what we've got today--and to ask for your feedback. An early Yahoo! Video Search Beta is now up on Yahoo! Next, our preview site for new technology and applications. I've spent a few hours with it in the last few weeks. Go try it out and let us know what you think. Remember that it's a beta product.

Why Video?

The costs of producing video content have been steadily decreasing in recent years. Between the adoption of broadband Internet connections and easier-to-use video editing software, it's no surprise that we're seeing a lot more video content make its way onto the Internet. And what's out there today is just the tip of the iceberg.

The Backstory

But there's more to the story here than the blossoming world of on-line video and building a video search system: it's often not easy for a web crawler to find downloadable and streaming video content. Unlike web images and most audio files, videos aren't always easy to discover. In many cases, they're hidden behind complex JavaScript, Flash-based players, and other crawler-unfriendly obstacles. That's exactly why we've been talking to a lot of our existing media partners, many of whom have sizeable video assets which have yet to be indexed.

Enter RSS

When we started thinking about how to make it easier for anyone to expose video and other rich media content, one of the first things we thought of was podcasting and RSS. Podcasting uses RSS Enclosures to provide an audio file along with a news item or blog posting in an RSS feed.

So rather than build a completely new way to do this, we decided to see what it takes to make RSS Enclosures work for video content as well: video enclosures. It's not a new idea but we think it's one whose time has come.

At the most basic level, this is just a matter of pointing to a video instead of an MP3 file.

Instead of this:

<enclosure url="http://www.example.com/baby_walks.mp3"
 length="64358" type="audio/mpeg"/>

You could use this:

<enclosure url="http://www.example.com/baby_walks.mov"
 length="2144275" type="video/quicktime"/>

For many publishers, that's all it takes. The beauty of this is that there's existing infrastructure for handling simple enclosures. Many RSS readers already consume enclosures just fine.

In the very near future the Yahoo! Video Search crawler will support indexing video enclosures in RSS feeds.
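To make the idea concrete, here is a minimal sketch of how a feed consumer might pick video enclosures out of an RSS 2.0 feed. It is only an illustration and not how the Yahoo! Video Search crawler works; the sample feed, the item titles and the `video_enclosures` helper are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed, modeled on the two enclosure examples above.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example video blog</title>
    <link>http://www.example.com/</link>
    <description>Illustrative feed only</description>
    <item>
      <title>Baby walks (audio)</title>
      <enclosure url="http://www.example.com/baby_walks.mp3"
                 length="64358" type="audio/mpeg"/>
    </item>
    <item>
      <title>Baby walks (video)</title>
      <enclosure url="http://www.example.com/baby_walks.mov"
                 length="2144275" type="video/quicktime"/>
    </item>
  </channel>
</rss>"""


def video_enclosures(feed_xml):
    """Return (item title, url, length, MIME type) for each video enclosure."""
    root = ET.fromstring(feed_xml)
    found = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        for enc in item.findall("enclosure"):
            mime = enc.get("type", "")
            # Only the MIME type distinguishes a video enclosure from a podcast.
            if mime.startswith("video/"):
                found.append(
                    (title, enc.get("url"), int(enc.get("length", "0")), mime)
                )
    return found


if __name__ == "__main__":
    for entry in video_enclosures(FEED):
        print(entry)
```

The only thing separating the video item from the audio one is the `type` attribute, which is why existing enclosure-aware readers need so little work to handle video.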

Metadata Extensions

As Marc Canter has noticed, we could all benefit from a bit more metadata to go with this growing pool of media. Who published this video? What formats are available? How is it licensed?

From our point of view, it means we can build a much better video search. You might want to filter results based on some of that metadata (title, actor, file format, etc). But it also opens up so many more doors. For example, your news aggregator might use your preferences to figure out which videos to download: Windows Media or QuickTime? High bandwidth or low? Heck, we can see entirely new rich media aggregators and tools being built--something like the popular iPodder currently used for podcasting. And when they are, this metadata becomes all the more important.

To get this started, we're suggesting an optional set of metadata extensions that we've been calling "Media RSS" (yes, we're so creative with names). They're aimed at publishers who'd like to provide a rich set of metadata about the media being published. Our video search system will also support these Media RSS extensions in addition to video enclosures (see the FAQ and the draft spec).

In addition, we're working with several other companies and organizations to help refine these ideas. They include: AtomFilms, Creative Commons, Buzznet, Ourmedia, and Broadband Mechanics.

Evolution

This is all about helping RSS evolve to handle all the media types we might want to attach to RSS feeds. Early last month when InternetNews.com asked me about enclosures, I said two things that are worth repeating in this context:

Enclosures are more evolutionary than revolutionary.

That's exactly why we want to start by building on the foundation of enclosures rather than introducing a brand new format.

RSS is a reasonably flexible format for distributing content. Enclosures are to RSS what attachments are to e-mail.

In other words, enclosures aren't just for podcasting anymore. If you want to be a video blogger, enclosures should work for that too.

If you're a user, aggregator author, or content provider who'd like to get involved with the development of Media RSS please join the rss-media group. Of course, you can also leave a comment here or trackback this post to let us know what you think.

Jeremy Zawodny
Yahoo! Search

P.S. There's a lot of funny stuff out there like Monkey Karate :-)



Searching For Last Minute Gifts?

Around this time of year, I like to watch for trends in search as it relates to holiday shopping - what are the most popular items people are searching for?

1. Digital camera
2. Nintendo DS
3. ipod
4. Playstation 2
5. Flowers
6. Toys
7. Ugg
8. Terrain Twister
9. Xbox
10. Furreal Friends

That got me wondering what people are actually buying, so I checked our Yahoo! Shopping Holiday Gift Center for the most popular items being purchased:

Toys:
1. Nintendo DS
2. Barbie
3. Tamagotchi Connection
4. Leap Frog learning toys
5. Furreal Friends

Tech Gadgets:
1. Digital camera
2. MP3 players (especially iPod)
3. Mobile phones
4. PDAs
5. Home theater

Clothing & Accessories:
1. Burberry scarves
2. Ugg boots
3. Puma sneakers
4. Lacoste Pique Polos
5. Skagen watches

If you're like the rest of us and still haven't finished your holiday shopping, hopefully the above lists offer some ideas. You might also want to check out the Yahoo! Shopping Last Minute Gift Center. It refreshes in real-time so it only displays the merchants who can deliver products on or before Christmas.

If previous years are any indication, the really last minute shoppers will be sending gift baskets or flowers on the 24th. I'm guessing your brother would prefer a digital camera instead of a box of pears so -- start shopping!

Greg Gunwall
Sr. Product Manager
Yahoo! Shopping



December 13, 2004

Some New Folks at Yahoo!

It was announced today that Usama Fayyad and Bassel Ojjeh have joined Yahoo! as Chief Data Officer and VP of Technology-Data, respectively. You can read all about their pretty impressive experience and accolades here, but suffice it to say, it's great to see Yahoo!'s continuing commitment to recruit the best and brightest technical and scientific people!

Usama puts it nicely in the press release:
"The opportunity to bring a disciplined, scientific approach to data technologies at Yahoo! was irresistible on both a personal and technical level," said Fayyad. "I look forward to creating a scientific environment at Yahoo! that will continue to attract some of the best technical minds in the industry, while working to apply that knowledge in ways that will benefit all of our businesses."

Hm...Usama and Bassel would be great folks for Yvette to profile here! We'll have to bug them after they get settled. Stay tuned...

Nancy Evars
Yahoo! Search



December 10, 2004

Holiday Recipe Friday Buzz

Every day offers a new temptation. Cookies, candies, and brownies, oh my. The holiday season in the office brings a constant battle with a tenacious sweet tooth that's never fully sated.

Cooks everywhere are busy stirring up the sweet treats that make the holidays oh-so-delicious. When it's time to branch out and try a new confection, we see folks flocking to the search box for some sweet inspiration.

Over the last week, we've witnessed spikes on:

* Marzipan (+169%)
* Chestnuts (+155%)
* Sugar Cookie Recipes (+150%)

and even the dreaded and derided...

* Fruitcake (+136%)

But that's not all. We've seen search surges on "Christmas Cookie Recipes," "Christmas Candy," and "Christmas Recipes" over the past few days as home chefs have their ovens working overtime in advance of the 25th.

What's tops in holiday-related recipe searches? Well, we're sure you'll find something in this list that'll make your mouth water:

* Fudge Recipes
* Holiday Cookie Recipes
* Gingerbread House Recipes
* Hanukkah Recipes
* Chocolate Chip Cookie Recipes
* Biscotti Recipes
* Eggnog Recipes
* Easy Cookie Recipes
* Gingerbread Cookie Recipes
* Candy Cane Recipes

I prefer a tooth-challenging homemade toffee or perhaps a decadent, ultra-tart lemon bar, but it seems ginger and chocolate carry the day when it comes to holiday sweets. What's your guilty gustatory pleasure this holiday season? Will you be searching for a special recipe to make something yummy for family and friends? Is your favorite recipe missing from our current top 10? Let me know.

Erik Gunther
Yahoo! Buzz Index Editor



Desktop Search News

Well, it looks like this desktop search stuff is a pretty hot topic. Who'd have thought?! :-)

We've seen a lot of write-ups since last night. Here's a sampling:

As they've indicated, we'll have much more to say in the coming weeks and you can expect to see us talk more about it right here on the Yahoo! Search blog.

Stay tuned...

Jeremy Zawodny
Yahoo! Search



December 08, 2004

Questions for Ali Diab of Yahoo! Local Products

Ali Diab studied Economics, French Literature, and Math at Stanford and graduated business school at age 23 as part of Oxford's first MBA class. From there he joined Goldman Sachs where he worked on some famous mergers and IPOs (including the IPO of NTT DoCoMo, Japan's largest mobile phone company). He started his own Internet company, BuildPoint, worked for Microsoft and was offered the job of Chief Product Officer at T-Mobile Online before joining Yahoo!--all well before his 30th birthday.

As head of production for Yahoo! Local Products, Ali keeps that same momentum going. He provides strategic direction for the Local product line which includes City Guides, Yahoo! Local, Maps, and Yellow Pages.

I'll be sitting down with Ali soon to talk about all this and more. If you have questions you'd like me to ask him, just post them below.

Yvette Irvin
Y! Profiler



December 07, 2004

Yahoo!'s Year End Party Celebration

No -- I don't show up here only to post about the Yahoo! parties, but we did have one doozy of a year end party (YEP) this past Saturday night in SF. This was my fifth YEP and, as with all the previous years, I thoroughly enjoyed seeing friends and colleagues dressed in something other than t-shirts and jeans!

[Photo: The Girls All Dressed Up] [Photo: How Many Yahoo's Are Here??]

In addition to the ice sculptures, amazing food and great music, one of the reasons I love attending the YEP is the charity auction held every year. The auction starts a week in advance online through Yahoo! Auctions and culminates with in-person bidding at the YEP for the big ticket items. We all have the chance to bid on cool donated items and experiences ranging from a Stanford hoops game with Jerry & David or an LA movie premiere from Terry, to coveted parking spots or a weekend in Tahoe. Proceeds go to the Yahoo! Employee Foundation (YEF), our grassroots philanthropic organization that helps out non-profits in the community. Proceeds from both the online and live auction brought more than $120,000 to YEF, far more than any previous YEF auction.

A great way to start out the holiday season!

Nancy Evars
Yahoo! Search



December 06, 2004

Tim Converse Interview, part 2

As promised, this is the second half of the interview with Tim Converse.

JQ: There's been this sort of continuum evolving since the Alta Vista days, where we started with primarily static content and then the next generation was catalogue type shopping like Amazon. Now it's this whole micro-publishing independent thing. What do you consider the biggest challenge right now in terms of classifying content?

A: One of the big challenges for us is just understanding what's out there. It's almost like astronomy where you're just trying to catalogue all the different things out there and track how they're growing, to some extent. So it's important for us to know just how many sites and kinds of documents there are so that we can catch trends.

JQ: You talked about comprehensiveness. There's this perception that there's the web that most of us see and then this dark web: the stuff that the crawlers don't reach. How do we try to get that data into the index? Are there barriers that webmasters put up that they should avoid to help us better index the content?

A: At its simplest, webmasters aren't aware of robots.txt and its uses. Redirection can also be problematic if people create content by creating lots of domains or hosts, so we encourage people to organize their sites in many documents before they get a new host.
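For webmasters who haven't used robots.txt, here is a small sketch of how a well-behaved crawler might read one, using Python's standard `urllib.robotparser`. The rules, the `ExampleBot` user agent and the `allowed` helper are assumptions for illustration; they say nothing about how Yahoo!'s crawler is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep every crawler out of two directories.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""


def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/index.html", "/private/report.html"):
        url = "http://www.example.com" + path
        print(url, allowed(ROBOTS_TXT, "ExampleBot", url))
```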

And of course, there's also the issue of crawler traps which some people do intentionally but much more often, they've unintentionally created crawler traps....

JQ: ...and a crawler trap is...

A: A crawler trap is something where you crawl a page and it has a link, usually within the same site, that's dynamically created. You follow that link and it has another analogous link that's dynamically created. Often, just because people make mistakes, you're tacking on another directory every time, which doesn't exist and takes you back to an automatically generated error page that has the same link. So you can fall into traps where there are an infinite number of pages that don't have any content.
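As a rough illustration of the pattern Tim describes, a crawler could flag URLs whose paths keep growing or keep repeating the same directory. This is a simple heuristic sketch and not Yahoo!'s actual trap detection; the `looks_like_trap` function and both cutoffs are assumptions chosen for the example.

```python
from collections import Counter
from urllib.parse import urlsplit

MAX_DEPTH = 12    # assumed cutoff: more path segments than this is suspicious
MAX_REPEATS = 3   # assumed cutoff: the same segment appearing more often than this


def looks_like_trap(url, max_depth=MAX_DEPTH, max_repeats=MAX_REPEATS):
    """Heuristically flag URLs whose paths are very deep or repetitive."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if len(segments) > max_depth:
        return True
    counts = Counter(segments)
    return any(n > max_repeats for n in counts.values())


if __name__ == "__main__":
    print(looks_like_trap("http://www.example.com/products/widgets/index.html"))
    print(looks_like_trap(
        "http://www.example.com/missing/missing/missing/missing/error.html"))
```

A real crawler would combine signals like these with content checks, since a deep URL on its own is not proof of a trap.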

Another thing people can do to help us is, this is sort of geeky but, don't make "page not found" pages that return a status 200.

JQ: I was just about to ask that. 404 pages back in the day, were these ugly grey things with block text that all looked the same and now they're done up to look like regular pages to be more appealing to users.

A: We do actually have ways of detecting that but it's a lot easier for us if a web server just says, "this page doesn't exist" as opposed to creating a nice page for the user that to a crawler looks like any other page. In general, if the server tells us 404, then we discard it.
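A friendly error page and a correct status code are not mutually exclusive. The sketch below is a tiny WSGI application that serves a nicely worded "not found" page while still answering with 404, so users see the friendly page and crawlers see the status. The `PAGES` table, the page text and the `app` name are hypothetical.

```python
# Hypothetical site content, keyed by URL path.
PAGES = {
    "/": "<h1>Home</h1>",
    "/about": "<h1>About us</h1>",
}

FRIENDLY_404 = (
    "<h1>Sorry, we couldn't find that page</h1>"
    "<p><a href='/'>Back to the home page</a></p>"
)


def app(environ, start_response):
    """WSGI app: known paths get 200, everything else a styled page with 404."""
    path = environ.get("PATH_INFO", "/")
    if path in PAGES:
        status, body = "200 OK", PAGES[path]
    else:
        # The body is user-friendly, but the status line still says 404.
        status, body = "404 Not Found", FRIENDLY_404
    payload = body.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]
```

To try it locally, the app can be passed to `wsgiref.simple_server.make_server` from the standard library.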

YQ: I worked for a company that used CIDs instead of cookies to follow users through the site and it turned out to be a disaster. We went from having pretty much every page indexed to hardly any. So what about CIDs and how they affect the crawlers?

A: If you have differences in the URL that don't actually make a difference in the site, that can be hard for us to untangle. We're getting better at it. One of the scenarios you're talking about there would just create a lot of duplicates for us. So it's nicer for us if we have one URL per actual content but we understand that you're not designing this just for us. And we obviously do a lot of duplicate detection--actually, we do duplicate detection in a couple of different ways. Finding out if documents are the same; finding out if sites are mirrors of each other.
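One way to picture the "one URL per actual content" idea is a canonicalization step that drops parameters which don't change the page. The sketch below is an assumption-laden illustration, not a description of Yahoo!'s duplicate detection: the parameter names in `IGNORED_PARAMS` and the `canonicalize` helper are made up for this example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names of parameters that identify a visitor rather than a page.
IGNORED_PARAMS = {"cid", "sid", "sessionid"}


def canonicalize(url, ignored=IGNORED_PARAMS):
    """Drop visitor-tracking parameters and sort the rest for a stable URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ignored
    ]
    kept.sort()
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path or "/", urlencode(kept), "")
    )


if __name__ == "__main__":
    print(canonicalize("http://www.Example.com/shoes?cid=abc123&color=red"))
    print(canonicalize("http://www.example.com/shoes?color=red&cid=zzz999"))
```

Both sample URLs collapse to the same string, which is the property a site owner gets for free by keeping visitor identifiers in cookies instead of the URL.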

JQ: This question came up today on a mailing list that I'm on. The concern for this particular company is that they want to move their site to a new domain but they don't want to become invisible for the next six months or year or however long it'll take for people to point to their new website. What can we tell people like that?

A: We can tell them that in the future, if you actually want to move your site, you want to use a 301 redirect which will do as much of the right thing as we can.
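For a site that is moving, the old host can answer every request with a 301 that points at the same path on the new host. Here is a minimal WSGI sketch of that idea; the `NEW_BASE` address and the `redirect_app` name are hypothetical, and a production site would more likely configure this in its web server.

```python
# Hypothetical new home for the site.
NEW_BASE = "http://www.new-example.com"


def redirect_app(environ, start_response):
    """WSGI app for the old host: permanently redirect every URL."""
    path = environ.get("PATH_INFO", "/")
    query = environ.get("QUERY_STRING", "")
    # Preserve the path and query string so deep links keep working.
    target = NEW_BASE + path + ("?" + query if query else "")
    start_response("301 Moved Permanently", [
        ("Location", target),
        ("Content-Length", "0"),
    ])
    return [b""]
```

Keeping the path and query intact means each old page maps to exactly one new page, rather than everything landing on the new home page.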

YQ: What actually happens there? I've heard of companies who have used 301 redirects and yet their old pages continued to show up in the search engines anyway. Why is that?

A: The underlying problem is that people out there haven't changed their links and search engines do pay attention to links.

I can't give you a date, but we're changing how we deal with redirects. The thing about redirects is that everyone thinks it's obvious how a search engine should treat them and the obvious answer is not really that helpful. Any policy you develop with redirects is going to make someone unhappy, but with what we're about to roll out we will pay better attention to 301 redirects, and the exact problem you're talking about should be less of an issue.

[In the time since we met with Tim, the team has rolled out a fix for 301/302 redirects. Documents will be handled by the new redirect policy as they are re-crawled and re-indexed and webmasters will start to see many of the sites change in the next couple of weeks. The index should be fully propagated within a month. See Tim Mayer's Webmaster World presentation for details.]

YQ: You mentioned earlier that you'd just bought a piano. I read on your website that when you were eight, you ran away from home to escape piano lessons. Is that true and did you just hate piano back then?

A: Yeah, that's true but I never hated piano, just lessons. Now that I've bought the piano, I'm practicing again. Right now I'm learning a classical piece by Bach. I'm a slower learner now though--I should have stuck with it when I was eight. But when it comes to the kind of music I actually listen to, I like rock, hip hop and classical. I'm not too into jazz.

JQ: Which do you listen to when you're programming?

A: (laughs) I don't actually. I don't deal with headsets well.

JQ: In terms of freshness, there's a lot of talk about how quickly an RSS-watching engine will pick up new content as opposed to getting stuff into Yahoo! Search. The question they ask is why can't we just ping Yahoo! and get the crawler over here?

A: Well, that's not the only source of latency or possible delay. We build very large databases and it's kind of a large industrial process involving lots and lots of machines. There's some delay between the last document we heard about and the time we actually put something live. In some cases that delay matters more than any delay in finding out if something has changed. We also pay a lot of attention to whether something has changed. But I think you'll see us getting fresher and fresher.

YQ: When you're looking for things that are changing on the page, what are you specifically talking about? I'm sure it's not enough to just change a hidden date stamp in a footer.

A: Yes, it's more than that. Most of the web is just static, even without there being date stamps. We do have a more nuanced notion of what it means to change, so we can tell a trivial change from a significant one.
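
[Editor's illustration: the interview does not describe how Yahoo! actually tells a trivial change from a significant one. The sketch below is only one simple way to think about the idea, under assumptions of our own: it strips date and time stamps, collapses whitespace and case, and then measures how different the remaining text is. The function names, the regular expression, and the 0.1 threshold are all hypothetical.]

```python
import re
from difflib import SequenceMatcher

# Hypothetical patterns for content we treat as "trivial": date and time stamps.
DATE_PATTERN = re.compile(
    r"\b\d{4}-\d{2}-\d{2}\b"          # e.g. 2004-12-01
    r"|\b\d{1,2}/\d{1,2}/\d{2,4}\b"   # e.g. 12/01/2004
    r"|\b\d{1,2}:\d{2}(:\d{2})?\b"    # e.g. 14:05 or 14:05:33
)


def normalize(text: str) -> str:
    """Drop date/time stamps, lowercase, and collapse runs of whitespace."""
    text = DATE_PATTERN.sub(" ", text)
    return " ".join(text.lower().split())


def change_score(old: str, new: str) -> float:
    """Return 0.0 when the normalized texts match, approaching 1.0 as they diverge."""
    matcher = SequenceMatcher(None, normalize(old), normalize(new))
    return 1.0 - matcher.ratio()


def is_significant_change(old: str, new: str, threshold: float = 0.1) -> bool:
    """The threshold is an arbitrary cutoff chosen for illustration only."""
    return change_score(old, new) > threshold


if __name__ == "__main__":
    old = "Welcome to our store. Last updated 2004-11-30"
    stamp_only = "Welcome to our store. Last updated 2004-12-01"
    rewritten = "We have moved! Visit our new site for holiday deals and piano lessons."
    print(is_significant_change(old, stamp_only))
    print(is_significant_change(old, rewritten))
```

[With this sketch, a page whose only difference is its footer date stamp scores as unchanged, while a page with rewritten body text scores as changed. A production system would need far more than this, for example handling of markup, navigation, and advertising blocks.]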

YQ: You mentioned that you were between the scientist and the programmer. What do you think you uniquely bring to what you're doing?

A: Well, I'm hoping that one side thinks I'm the other side and that the other side thinks I'm the one side. (laughs) I'm hoping I've got them all fooled, but I have this feeling that I don't. But no, I think that what I uniquely bring is that I can talk to both sides. I've been a programmer, I've trained to some extent in the direction that the scientists have trained, and I went to grad school for a long time in related topics. So increasingly, though I never thought I would play this role and it's not what I envisioned, what I'm bringing to it is that I can talk to a lot of different sides and I can prioritize the stuff, and my grasp of the technology's not so bad either.

Yvette Irvin
Y! Profiler


December 03, 2004

All Streaks Must End

The Answer: Most of this firm's 70,000 seasonal white-collar employees work only four months a year.

The Question: What is H&R Block? -- The response that terminated the reign of Jeopardy darling Ken Jennings.

Word hit yesterday afternoon, and while the blogs had suspected as much for a while, we soon learned the rumors were true: America's favorite know-it-all, the greatest game-show winner of all time, had met his match. Searches on "Ken Jennings" skyrocketed 318% on the news. Even "Ken Jeopardy" jumped in search, suggesting the perfect name change for the record holder. Jeopardy-watchers went hunting for H&R Block (+132%) (the correct answer), FedEx (+194%) (the incorrect answer), and Nancy Zerg (who dethroned the ruler). The moment Jennings gave that sly, sideways smile, we knew he was sunk. In the weeks before, we'd seen gloomy searches on "Ken Jennings loses" and "Ken Jennings lost." Why never a "Ken Jennings winner"? Yes, the 74-game run is over, but the elfin wunderkind walked away with more than $2.5 million. Despite the loss, TV viewers weren't rid of him -- he popped up again later that night on David Letterman (reminding us of his touted Top 10 showing) and on Regis and Kelly on Wednesday morning. And Jennings junkies can always visit him at his own site, Ten Favorite Films by Year.

Molly McCall
Yahoo! Buzz Index Editor


December 01, 2004

An Interview with Tim Converse

Tim Converse isn't your average search geek or, for that matter, your average guy.

Though he's ostensibly shy and unassuming, he has a definite adventurous streak. Tim has tried everything from skydiving to African safaris; he likes rap (but only old school), and he once played keyboard in a punk rock band. But what really sets Tim apart is his knowledge of the inner workings of search. Unlike most of us who may have a decent understanding of the search world (or us novices who know just enough to type in a query and hit the "search" button), Tim understands the mysteries behind what makes it all work and how to make it better.

Tim is an engineering manager in Yahoo!'s Content group, where he and his team help make search results more relevant.

Jeremy and I spoke with Tim last week. Here's what we learned about content classification, what Tim likes to do for fun, and some little-known facts about Yahoo!'s obsession with foosball.

JQ: You've said that your group is charged with content classification. What exactly is content classification and why is it important to search?

A: Well, the more we know about documents the better. So part of what the Classification group does is label web pages and sites, or put them into categories. And while I can't get into specifics about the categories we use, a big part of this is trying to detect who's spamming us--or trying to trick us into ranking their sites higher in our search results.

Our classification code gets deployed in the Content system, which does the crawling and indexing to build the search indexes that we end up serving queries from. That's mainly for our own group, YST [Yahoo! Search Technology], which handles the back end of web search, but we also provide data to other groups, including Image Search.

My group also writes tools to interact with the Content system. We can query it in all sorts of ways to find out what's happening with particular sites or URLs. This is a challenge because the Content system is very distributed and heterogeneous.

YQ: It seems to me that if you're writing code for something, at some point, you've written it and it's done...

A: Well, we're never really "done." A few years ago my cousin asked me what I was working on and I told him, "Excite's web search engine." He said, "So that would imply it's not done? Or it needs work or what?" (laughs) And so yeah, especially with how competitive the market is, these things are always under development. There are far more ways we can think of to make it better than we have engineers to do it. So even just with our list of ideas right now, we could be going for five years, and there are always new ideas.

I should point out that although we're focused on deploying code for YST, there's a lot of expertise in the company and several different groups of scientists focused on classification. A lot of the challenge for me is just managing to benefit from that expertise for YST...

JQ: And connecting those dots in the company?

A: Yeah. It's kind of a cool job because I'm sort of in between scientists and programmers, and there's such a spectrum of roles and responsibilities. We have people all the way from kernel hackers to linguists and, needless to say, a kernel hacker can't really talk intelligently to a linguist or vice versa. But you have to have this long chain of people who can really talk to each other, so I've kind of got scientists on one side and programmers on the other.

YQ: What has been the biggest change in the way you approach writing the code and how you approach content classification?

A: I don't think the way we think about writing the code has changed. The way we're approaching search itself has changed a lot.

For instance, comprehensiveness is a much bigger deal these days. In the Inktomi days we wanted just one copy of anything that was good because serving documents costs so much. Now we'd really like to have everything.

So then the challenge is ranking everything appropriately. You really want to put everything out there but then...

JQ: ...that assumes there'll be a lot of junk?

A: Right, and so then the challenge is identifying and appropriately ranking it all.

The big things for us are "relevance," "comprehensiveness," "freshness," and "presentation." That's "RCFP" and it's kind of our mantra. I'm much more focused on the "R" part, relevance, although we have a whole group of scientists and modelers who are totally devoted to relevance too. My buddies in my group who work on crawling and indexing are focused on comprehensiveness and freshness as well.

YQ: Switching gears for a moment, Tim. What do you do for fun? What kinds of things are you into?

A: I like games. All kinds. Strategy games, pool, 8-ball, 9-ball, billiards. I'm a little worried about Yahoo!'s growth plans, because I think our pool table may not be scaling with the hiring we're doing. Is anyone looking into that? We had a nice one at Inktomi, but I think it's in storage somewhere.

I guess foosball is a Yahoo! game of choice, so I'm trying to catch up on it. It's a little-known fact that one of the game's experts, Phu Hoang, is here at Yahoo!, and the game was named after him. I'm also interested in music and I just recently bought a piano. I hadn't played piano in a long, long time.

JQ: When you're hiring someone for your team, what are you looking for?

A: We look for pretty senior engineers and, like I said before, it takes a lot of types of expertise to make a web search engine. In terms of skills, we're looking for C++ coders, strong problem solvers, and people who understand CS algorithms. Obviously there are particular roles that require particular expertise, like experience in classification and textual analysis.

Right now, we're hiring pretty aggressively.

JQ: When it comes to fighting spam, there's all kinds of software and many people trying to stop spam attempts. With all of us trying to detect this, is there a way to tell the search engines about it?

A: We get a lot of that data on our own. We have a pretty large view and we're approaching the spam problem from a lot of different directions. But nobody should expect to see any sudden change in spam just yet.

Take weblog comment spam, for example. Two things will have to happen for comment spam attempts to decrease: one is that spamming will have to not work for search engines, and the second is that comment spammers will have to realize it. (laughs) There could be a long lag there where, even if every search engine totally nailed them, spammers could still operate under the belief that it worked. What we can do from the search engine point of view is make spam not help.

Next week, Tim talks about redirects, indexable pages and why he doesn't listen to music while he's programming.

Yvette Irvin
Y! Profiler

