Archive for the ‘People’ Category

December 08, 2004

Questions for Ali Diab of Yahoo! Local Products

Ali Diab studied Economics, French Literature, and Math at Stanford and graduated from business school at age 23 as part of Oxford’s first MBA class. From there he joined Goldman Sachs where he worked on some famous mergers and IPOs (including the IPO of NTT DoCoMo, Japan’s largest mobile phone company). He started his own Internet company, BuildPoint, worked for Microsoft and was offered the job of Chief Product Officer at T-Mobile Online before joining Yahoo!–all well before his 30th birthday.

As head of production for Yahoo! Local Products, Ali keeps that same momentum going. He provides strategic direction for the Local product line which includes City Guides, Yahoo! Local, Maps, and Yellow Pages.

I’ll be sitting down with Ali soon to talk about all this and more. If you have questions you’d like me to ask him, just post them below.

Yvette Irvin
Y! Profiler

December 07, 2004

Yahoo!’s Year End Party Celebration

No — I don’t show up here only to post about the Yahoo! parties, but we did have one doozy of a year end party (YEP) this past Saturday night in SF. This was my fifth YEP and as with all the previous years, I thoroughly enjoy seeing friends and colleagues dressed in something other than t-shirts and jeans!

[Photos: “The Girls All Dressed Up” and “How Many Yahoo's Are Here??”]

In addition to the ice sculptures, amazing food and great music, one of the reasons I love attending the YEP is the charity auction held every year. The auction starts a week in advance online through Yahoo! Auctions and culminates with in-person bidding at the YEP for the big ticket items. We all have the chance to bid on cool donated items and experiences ranging from a Stanford hoops game with Jerry & David or an LA movie premiere from Terry, to coveted parking spots or a weekend in Tahoe. Proceeds go to the Yahoo! Employee Foundation (YEF), our grassroots philanthropic organization that helps out non-profits in the community. Together, the online and live auctions brought more than $120,000 to YEF, far more than any previous YEF auction.

A great way to start out the holiday season!

Nancy Evars
Yahoo! Search

December 06, 2004

Tim Converse Interview, part 2

As promised, this is the second half of the Interview with Tim Converse.

JQ: There’s been this sort of continuum evolving since the
Alta Vista days, where we started with primarily static content and
then the next generation was catalogue type shopping like Amazon. Now
it’s this whole micro-publishing independent thing. What do you
consider the biggest challenge right now in terms of classifying
content?

A: One of the big challenges for us is just understanding
what’s out there. It’s almost like astronomy where you’re just trying
to catalogue all the different things out there and track how they’re
growing, to some extent. So it’s important for us to know just how
many sites and kinds of documents there are so that we can catch
trends.

JQ: You talked about comprehensiveness. There’s this
perception that there’s the web that most of us see and then this dark
web: the stuff that the crawlers don’t reach. How do we try to get
that data into the index? Are there barriers that webmasters put up
that they should avoid to help us better index the content?

A: At its simplest, webmasters aren’t aware of robots.txt and
its uses. Redirection can also be problematic if people create
content by creating lots of domains or hosts so we encourage people to
organize their sites in many documents before they get a new host.
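
For readers who haven't worked with robots.txt, here is a minimal sketch of how a well-behaved crawler might consult it, using Python's standard urllib.robotparser module. The rules, the example.com URLs and the "ExampleBot" user agent are all made up for illustration; this is not a description of how Yahoo!'s crawler works.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site: keep crawlers out of
# the script directory, allow everything else.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is an invented user agent name.
allowed_article = parser.can_fetch(
    "ExampleBot", "http://www.example.com/articles/1.html"
)
allowed_script = parser.can_fetch(
    "ExampleBot", "http://www.example.com/cgi-bin/report"
)
```

A crawler would normally fetch the file from the site's root (/robots.txt) before requesting any other page, and skip any URL for which can_fetch returns False.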

And of course, there’s also the issue of crawler traps which some
people do intentionally but much more often, they’ve unintentionally
created crawler traps….

JQ: …and a crawler trap is…

A: A crawler trap is something where you crawl a page and it
has a link, usually in the same site that’s dynamically created and
then you follow that link and it has another analogous link that’s
dynamically created and often, just because people make mistakes,
you’re attaching on another directory every time which doesn’t exist
and takes you back to an automatically generated error page which has
the same link. So you can fall into traps where there are an infinite
number of pages that don’t have any content.
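
One simple guard a crawler could use against the kind of trap Tim describes is to be suspicious of URLs whose paths keep growing or keep repeating the same directory. The sketch below is only an illustration of that idea; the function name and the thresholds are assumptions, and real crawlers presumably use far more sophisticated signals.

```python
from collections import Counter
from urllib.parse import urlsplit


def looks_like_trap(url: str, max_depth: int = 10, max_repeats: int = 3) -> bool:
    """Heuristic check for runaway, auto-generated URLs.

    Flags a URL when its path is unusually deep or when the same
    path segment shows up more than max_repeats times. Both limits
    are arbitrary values chosen for this example.
    """
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if len(segments) > max_depth:
        return True
    counts = Counter(segments)
    return any(n > max_repeats for n in counts.values())


normal = looks_like_trap("http://www.example.com/articles/2004/piano.html")
trapped = looks_like_trap(
    "http://www.example.com/calendar/next/next/next/next/next"
)
```

Here the second URL is flagged because the "next" directory has been appended over and over, which is the pattern an accidental trap tends to produce.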

Another thing people can do to help us is, this is sort of geeky but,
don’t make page not found pages that return a status 200.

JQ: I was just about to ask that. 404 pages back in the day,
were these ugly grey things with block text that all looked the same
and now they’re done up to look like regular pages to be more
appealing to users.

A: We do actually have ways of detecting that but it’s a lot
easier for us if a web server just says, “this page doesn’t exist” as
opposed to creating a nice page for the user that to a crawler looks
like any other page. In general, if the server tells us 404, then we
discard it.
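
A friendly error page and a correct status code are not mutually exclusive. As a rough sketch (using Python's built-in http.server, with a made-up page table and error text), a server can send whatever HTML it likes to the user while still telling crawlers the page does not exist:

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical site content, keyed by request path.
PAGES = {"/": "<h1>Welcome</h1>"}

# A nicely designed error page can still carry a 404 status.
NOT_FOUND_HTML = "<h1>Sorry, we couldn't find that page</h1>"


def render(path: str) -> tuple[int, str]:
    """Return (status code, HTML body) for a request path."""
    if path in PAGES:
        return 200, PAGES[path]
    # The user sees the friendly page; the crawler sees the 404.
    return 404, NOT_FOUND_HTML


class FriendlyErrorHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, html = render(self.path)
        body = html.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

The mistake Tim is warning about would be returning 200 from the second branch of render, which makes the error page look like ordinary content.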

YQ: I worked for a company that used CIDs instead of cookies
to follow users through the site and it turned out to be a disaster.
We went from having pretty much every page indexed to hardly any. So
what about CIDs and how they affect the crawlers?

A: If you have differences in the URL that don’t actually make
a difference in the site, that can be hard for us to untangle. We’re
getting better at it. One of the scenarios you’re talking about there
would just create a lot of duplicates for us. So it’s nicer for us if
we have one URL per actual content but we understand that you’re not
designing this just for us. And we obviously do a lot of duplicate
detection–actually, we do duplicate detection in a couple of
different ways. Finding out if documents are the same; finding out if
sites are mirrors of each other.
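
To make the duplicate problem concrete, here is a small sketch of two techniques in that general spirit: dropping tracking parameters so that several URLs collapse to one, and fingerprinting page text so identical documents can be spotted. The parameter names ("cid", "sid", "sessionid") and both function names are assumptions for the example, and exact hashing is only the crudest form of duplicate detection; the interview does not say which methods Yahoo! uses.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names of query parameters that identify a visitor rather
# than a piece of content.
TRACKING_PARAMS = {"cid", "sid", "sessionid"}


def canonical_url(url: str) -> str:
    """Drop tracking parameters and sort the rest."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    kept.sort()
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def content_fingerprint(text: str) -> str:
    """Hash page text after collapsing whitespace and case."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


a = canonical_url("http://Example.com/item?id=7&cid=abc123")
b = canonical_url("http://example.com/item?cid=zzz999&id=7")
```

Both URLs reduce to the same canonical form, so a crawler using this approach would treat them as one document instead of two.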

JQ: This question came up today on a mailing list that I’m on.
The concern for this particular company is that they want to move
their site to a new domain but they don’t want to become invisible for
the next six months or year or however long it’ll take for people to
point to their new website. What can we tell people like that?

A: We can tell them that in the future, if you actually want
to move your site, you want to use a 301 redirect which will do as
much of the right thing as we can.
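
For anyone unsure what a 301 looks like on the wire: it is a "Moved Permanently" status plus a Location header pointing at the new address. A minimal sketch of an old-domain server that forwards every request, again with Python's http.server and an invented new domain, might look like this:

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical new home for the site.
NEW_ORIGIN = "http://www.new-example.com"


def new_location(path: str) -> str:
    """Map a path on the old site to the same path on the new one."""
    return NEW_ORIGIN + path


class MovedHandler(BaseHTTPRequestHandler):
    """Runs on the old domain and forwards everything permanently."""

    def do_GET(self):
        self.send_response(301)  # 301 = Moved Permanently
        self.send_header("Location", new_location(self.path))
        self.send_header("Content-Length", "0")
        self.end_headers()

    # HEAD requests get the same redirect.
    do_HEAD = do_GET


target = new_location("/about.html?lang=en")
```

Redirecting each old path to its counterpart, rather than sending everything to the new home page, keeps the mapping between old and new documents one-to-one.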

YQ: What actually happens there? I’ve heard of companies who have used 301 redirects and yet their old pages continued to show up in the search engines anyway. Why is that?

A: The underlying problem is that people out there haven’t
changed their links and search engines do pay attention to links.

I can’t give you a date, but we’re changing how we deal with
redirects. The thing about redirects is that everyone thinks it’s
obvious how a search engine should treat them and the obvious answer
is not really that helpful. Any policy you develop with redirects is
going to make someone unhappy but with what we’re about to roll out we will
pay better attention to 301 redirects and the exact problem you’re
talking about should happen less.

[In the time since we met with Tim, the team has rolled out a fix
for 301/302 redirects. Documents will be handled by the new redirect
policy as they are re-crawled and re-indexed and webmasters will start
to see many of the sites change in the next couple of weeks. The
index should be fully propagated within a month. See Tim Mayer's
Webmaster World presentation
(http://www.ysearchblog.com/files/wmw2004/search-friendly-design.ppt)
for details.]

YQ: You mentioned earlier that you’d just bought a piano. I
read on your website that when you were eight, you ran away from home
to escape piano lessons. Is that true and did you just hate piano
back then?

A: Yeah, that’s true but I never hated piano, just lessons.
Now that I’ve bought the piano, I’m practicing again. Right now I’m
learning a classical piece by Bach. I’m a slower learner now
though–I should have stuck with it when I was eight. But when it
comes to the kind of music I actually listen to, I like rock, hip hop
and classical. I’m not too into jazz.

JQ: Which do you listen to when you’re programming?

A: (laughs) I don’t actually. I don’t deal with headsets
well.

JQ: In terms of freshness, there’s a lot of talk about how
quickly an RSS-watching engine will pick up new content as opposed to
getting stuff into Yahoo! Search. The question they ask is why can’t
we just ping Yahoo! and get the crawler over here?

A: Well, that’s not the only source of latency or possible
delay. We build very large databases and it’s kind of a large
industrial process involving lots and lots of machines. There’s some
delay between the last document we heard about and the time we
actually put something live. In some cases that delay matters more
than any delay in finding out if something has changed. We also pay a
lot of attention to whether something has changed. But I think
you’ll see us getting fresher and fresher.

YQ: When you’re looking for things that are changing on the
page, what are you specifically talking about? I’m sure it’s not
enough to just change a hidden date stamp in a footer.

A: Yes, it’s more than that. Most of the web is just static
even without there being date stamps. We do have a more nuanced
notion of what it means to change so we can tell a trivial change
from a significant one.
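
Tim doesn't say how this is done, but one way to picture the idea is to ignore date stamps and then measure how much of the remaining text actually differs. The sketch below does that with Python's difflib; the date pattern, the 0.9 threshold and the function names are all assumptions for illustration, not Yahoo!'s method.

```python
import re
from difflib import SequenceMatcher

# Matches dates such as "2004-12-08" or "December 08, 2004".
DATE_PATTERN = re.compile(
    r"\b\d{4}-\d{2}-\d{2}\b"
    r"|\b(?:January|February|March|April|May|June|July|August"
    r"|September|October|November|December) \d{1,2}, \d{4}\b"
)


def significant_words(text: str) -> list[str]:
    """Lowercased words of the text with date stamps removed."""
    return DATE_PATTERN.sub(" ", text).lower().split()


def is_significant_change(old: str, new: str, threshold: float = 0.9) -> bool:
    """True when the two versions differ by more than a trivial edit.

    The threshold is an arbitrary similarity cutoff for this example.
    """
    matcher = SequenceMatcher(
        None, significant_words(old), significant_words(new)
    )
    return matcher.ratio() < threshold


old_page = "Welcome to our piano shop. Updated December 01, 2004"
restamped = "Welcome to our piano shop. Updated December 08, 2004"
rewritten = "We now sell guitars, drums and sheet music instead."

trivial = is_significant_change(old_page, restamped)
major = is_significant_change(old_page, rewritten)
```

Only the rewritten page counts as changed; the version where nothing but the date stamp moved is treated as the same document.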

YQ: You mentioned that you were between the scientist and the
programmer. What do you think you uniquely bring to what you’re
doing?

A: Well I’m hoping that one side thinks I’m the other side and
that the other side thinks I’m the one side. (laughs) I’m hoping I’ve
got them all fooled but I have this feeling that I don’t. But no, I
think that what I uniquely bring is that I can talk to both sides,
I’ve been a programmer; I’ve trained to some extent in the direction
that the scientists have trained and went to grad school for a long
time in related topics so increasingly, though I never thought I would
play this role and it’s not what I envisioned, what I’m bringing to it
is I can talk to a lot of different sides and I can prioritize the
stuff and my grasp of the technology’s not so bad either.

Yvette Irvin

Y! Profiler

December 01, 2004

An Interview with Tim Converse

Tim Converse isn’t your average search geek or for that matter, your
average guy.

Though he’s ostensibly shy and unassuming, he has a definite adventurous streak. Tim has tried everything from skydiving to African safaris, he likes rap (but only old school) and he once played keyboard in a punk rock band. But what really sets Tim apart is his knowledge of the inner workings of search. Unlike most of us who may have a decent understanding of the search world (or us novices who know just enough to type in a query and hit the “search” button), Tim understands the mysteries behind what makes it all work and how to make it better.

As an engineering manager in Yahoo!’s Content group, Tim and his group
help make search results more relevant.

Jeremy and I spoke with Tim last week. Here’s what we learned about content classification, what Tim likes to do for fun, and some little-known facts about Yahoo!’s obsession with foosball.

JQ: You’ve said that your group is charged with content classification. What exactly is content classification and why is it important to search?

A: Well, the more we know about documents the better. So part of what the Classification group does is label web pages and sites, or put them into categories. And while I can’t get into specifics about the categories we use, a big part of this is trying to detect who’s spamming us–or trying to trick us into ranking their sites higher in our search results.

Our classification code gets deployed in the Content system, which does the crawling and indexing to build search indexes that we end up serving queries from. That’s mainly for our own group YST [Yahoo! Search Technology], which handles the back end of web search, but we also provide data to other groups, including Image Search.

My group also writes tools to interact with the Content system. We can query it in all sorts of ways to find out what’s happening with particular sites or URLs. This is a challenge because the Content system is very distributed and heterogeneous.

YQ: It seems to me that if you’re writing code for something, at some point, you’ve written it and it’s done…

A: Well, we’re never really “done.” A few years ago my cousin asked me what I was working on and I told him “Excite’s web search engine.” He said, “so that would imply it’s not done? Or it needs work or what?” (laughs) And so yeah, especially with how competitive the market is, these things are always under development. There are far more ways we can think of to make it better than we have engineers to do it. So even just with our list of ideas right now, we could be going for five years and there are always new ideas.

I should point out that although we’re focused on deploying code for YST, there’s a lot of expertise in the company and several different groups of scientists focused on classification. A lot of the challenge for me is just managing to benefit from that expertise for YST…

JQ: And connecting those dots in the company?

A: Yeah. It’s kind of a cool job because I’m sort of in between scientists and programmers and there’s such a spectrum of roles and responsibilities. We have people all the way from kernel hackers to linguists and needless to say, a kernel hacker can’t really talk intelligently to a linguist or vice versa but you have to have this long chain of people who can really talk to each other so I’ve kind of got scientists on one side and programmers on the other.

YQ: What has been the biggest change in the way you approach writing the code and how you approach content classification?

A: I don’t think the way we think about writing the code has changed. The way we’re approaching search itself has changed a lot.

For instance, comprehensiveness is a much bigger deal these days. In the Inktomi days we wanted just one copy of anything that was good because serving documents costs so much. Now we’d really like to have everything.

So then the challenge is ranking everything appropriately. You really want to put everything out there but then…

JQ: …that assumes there’ll be a lot of junk?

A: Right and so then the challenge is identifying and appropriately ranking it all.

The big things for us are “relevance,” “comprehensiveness,” “freshness,” and “presentation.” That’s “RCFP” and it’s kind of our mantra. I’m much more focused on the “R” part of the relevance, although we have a whole group of scientists and modelers who are totally devoted to relevance too. My buddies in my group who work on crawling and indexing are focused on comprehensiveness and freshness as well.

YQ: Switching gears for a moment, Tim. What do you do for fun? What kinds of things are you into?

A: I like games. All kinds. Strategy games, pool, 8-ball, 9-ball, billiards. I’m a little worried about Yahoo!’s growth plans, because I think our pool table may not be scaling with the hiring we’re doing. Is anyone looking into that? We had a nice one at Inktomi, but I think it’s in storage somewhere.

I guess foosball is a Yahoo! game of choice, so I’m trying to catch up on it. It’s a little known fact that one of the game’s experts, Phu Hoang, is here at Yahoo!, and the game was named after him.

I’m also interested in music and I just recently bought a piano. I hadn’t played piano in a long long time.

JQ: When you’re hiring someone for your team, what are you looking for?

A: We look for pretty senior engineers and like I said before, it takes a lot of types of expertise to make a web search engine. In terms of skills, we’re looking for C++ coders, strong problem solvers, and people who understand CS algorithms. Obviously there are particular roles that require some particular expertise like experience in classification and textual analysis.

Right now, we’re hiring pretty aggressively.

JQ: When it comes to fighting spam, there’s all kinds of software and many people trying to stop spam attempts. With all of us trying to detect this, is there a way to tell the search engines about it?

A: We get a lot of that data on our own. We have a pretty large view and we’re approaching the spam problem from a lot of different directions. But nobody should expect to see any sudden change in spam just yet.

Take weblog comment spam, for example. Two things will have to happen for comment spam attempts to decrease; one is that spamming will have to not work for search engines and the second is that comment spammers will have to realize it. (laughs). There could be a long lag there where, even if every search engine totally nailed them, spammers could still operate under the belief that it worked. What we can do from the search engine point of view is make spam not help.

Next week Tim talks about redirects, index-able pages and why he doesn’t listen to music while he’s programming.

Yvette Irvin

Y! Profiler

November 02, 2004

Questions for Tim Converse about Content Classification?

Yvette and I are planning to sit down to chat with Tim Converse. Much like she did with Paulien, we’ll ask some questions about where Tim came from as well as what he and his group are up to these days.

I asked Tim for a description of what his group is all about so that we could solicit questions from those of you outside of Yahoo. He said:

I manage the Content Classification group within YST (that’s the backend of Yahoo search). The Content group does all the crawling, indexing, and webmapping of documents for web search, and my group is responsible for categorizing those docs. We write software to algorithmically classify web pages with a special focus on catching search engine spam. We also write software to help us understand what the Content system is doing.

So if there’s something you’d like us to ask Tim, leave a comment below.

Jeremy Zawodny

Technical Yahoo!

October 28, 2004

Jerry’s Take On What’s Next in Search

With Yahoo! approaching its 10th anniversary, the question I’m hearing a lot lately is “what’s next in the world of search?”

Ten years ago, we were focused on a simple yet vast problem: finding better ways to aggregate and organize information so people can find it. Today, the challenge is different. On the one hand, there’s a lot more information to aggregate and it’s not just more in terms of quantity; there’s a larger variety of content as well — from products and images to news and business information. In addition, we’re pulling content from more sources than ever before.

On the other hand, our users’ expectations have also changed. It’s no longer enough to simply provide a structure for users to find what they want on the Web. Today, people expect to find precisely what they’re looking for exactly as it relates to them. It’s the old example of the “Java” search query. Are you looking for coffee or for the programming language? People want to define what’s relevant to them in their own personal way. They also want to tap into the source of their information at will and they want to manage it all to personally suit their needs.

That’s what is exciting about where we are today. Search as a problem is still far from being solved. The user is in the driver’s seat: they want an experience that is increasingly personal, more relevant, and ties into their task more integrally. Search is just a way to get that integrated experience, but it’s all about what the users want – when they want it, how they want it, and who they want it from.

Jeremy hit on it in a recent blog entry; we have to “make search more relevant and personal.” Those two things are the natural progression for search and they are tightly connected to our concept of seamless integration. Search has to reach a higher bar: it has to enhance the user’s life on a daily basis. Integration of search, community, personalization and content builds the foundation for relevancy in people’s lives.

Because the Net is obviously a bigger part of people’s lives than 10 years ago, we at Yahoo! also have an opportunity to integrate into people’s lives more deeply than before. Yahoo! Local and the beta version of My Yahoo! Search are just two of the examples of how we’re enabling people to manage their search content, search within locations of their choice, and build personal communities online. Users can connect to people with similar interests and they can gather and share search information at will.

Fortunately, we’re also at a time when the technology is helping us plug into people’s lives even more richly. For instance, at this year’s Web 2.0 Conference there was a lot of talk about RSS and wireless technology. This is stuff we only dreamed about ten years ago and it’s helping redefine what we do with search today. RSS is allowing people to access exactly what they want and wireless is letting us deliver the information wherever you are. People aren’t chained to their PCs anymore and neither is search. Yesterday’s introduction of Yahoo! Search for Mobile is just one example of how technology is propelling search forward. Search is literally in your pocket and at your bus stop. It doesn’t get more integrated than that.

The question to ask now isn’t if or when; it’s “what else.” What else can we do to take search to the next level? What else can we do to make search even more useful and accessible to you?

These are the challenges that will keep us busy for at least another 10 years and we’re getting closer every day. At Yahoo! it’s our job to stay ahead of consumer needs and expectations and, based on the responses of our users, I believe we’re doing a really good job so far, but it’s still very early. It’s one of the reasons I remain really excited about how we can continue to provide real solutions to people’s problems, and make a difference. While I’m not nearly as technical as I was 10 years ago (I got my hint when David Filo changed the password on me so I can’t touch code anymore), I firmly believe that the technology we are building today makes the future of the Web even more useful, informative, and entertaining. As long as there’s a way to help people find more precise and more relevant information on the Web, you’ll find me in the thick of things searching for it.

Jerry Yang
Chief Yahoo

October 25, 2004

An Interview with Paulien Strijland of Yahoo! User Experience Design

Paulien Strijland is Yahoo!’s director of User Experience Design (UED) for Search and Marketplace and when you first meet her, you can tell that she’s creative. She is a striking figure at 6′1″ and wears expressive, flowing outfits and chunky, eclectic jewelry. She speaks enthusiastically about UED and she always seems to be in the middle of something interesting.

But what Paulien brings to Yahoo! is a lot more than creative energy. She is a business-savvy pragmatist who values collaboration tempered with practicality. But it may be her penchant for diplomacy, more than her pragmatism, that helps her provide unique direction for Yahoo! UED.

Here’s what I know about UED: you can build the best engineered product around but if no one understands how to use it, then who cares? It’s like the new cell phone Paulien was fiddling with when we spoke, ‘this phone’s got at least 100 features,’ she said. ‘But all I care about is getting to the two or three that I want. They’re randomly buried in with all the others so it’s hard to find them and get to them fast. That’s not good user design.’

I sat down with Paulien over coffee last week as she shared her thoughts on user design and the world beyond Yahoo!.


Q: You’ve been involved in user interface design for over ten years now. What changes have you seen in the direction of UED and how it’s perceived?

A: Years ago there was no formal training for UI (User Interface) design and it was a discipline that wasn’t really recognized or viewed as important. Most companies didn’t even have UI designers. These days, even the smallest organizations have an appreciation for the field. So you spend less time trying to explain how UED affects the bottom line and more time getting to the design.

On top of that, the number of people using computers has significantly increased. This means we’re now designing for new types of users with different perspectives and different levels of computer savvy. Our designs have to be easy enough for the novice to use but compelling enough for the power user.

Q: What’s the toughest aspect of your job?

A: Everyone has an opinion! Yahoo! is very collaborative and everyone is a user on some level or another. The toughest thing is understanding the value of hearing everyone’s feedback but knowing that everyone’s opinion can’t go into the product. If it did, we’d have a hodge-podge design that really served no one’s purpose. You have to be very diplomatic.

You also have to remember that we’re not the typical user. Our teams do a lot more searches in more ways and with more comparing than regular users. So we may not see things the same way they do. When you understand that distinction, you’re able to really hear what users are telling you about the product and about the design. It comes down to striking a balance between what your original product design goals might have been and what you’ve learned from the people who are going to use it.

For example, before Yahoo! Local was in beta, we’d received lots of very positive user response about the “view results on map” feature. The problem was that once we’d made the beta public, users weren’t even aware that a “view results on map” feature existed. We had designed a button for it and we thought it was very visible and very intuitive. In our minds, it was “right there.” But users still weren’t using it. They just weren’t registering it visually.

We ended up sitting back down and seriously rethinking how we’d treat that feature and it was very different from our original design concepts. When Yahoo! Local came out of beta, we’d found a much more effective way to call it out.

Q: But how do you really know that it’s effective? Maybe users still aren’t using it.

A: Well, we can tell that it’s a very used feature now. Our reports show that people are clicking on the button so they must be finding it. On top of that, they’re telling us themselves that they really like the feature in the feedback they send to us. So we know they’re using it. I’d say the redesign worked.

September 30, 2004

Search Geeks Gone Wild

We’re making the trek north today to Lake Tahoe for the 4th annual Gnomedex conference. In fact, Yahoo!’s the title sponsor, so if you’re attending, swing by our table and introduce yourself.

[UPDATE: Jeremy and friends created a really cool page where you can easily add to My Yahoo! the blogs of many of the people attending Gnomedex.]

Nancy Evars
Yahoo! Search

September 20, 2004

Meet the Yahoo!’s

I started working at Yahoo! in July and as with any new job, getting to know everyone can take a while. So when I was approached to start the “Y! People Profile” posts for this blog, I jumped at the chance to learn the organization while getting to know all the interesting and quirky people I see in the halls.

In the coming weeks, I’ll introduce you to some of the folks here on the search team as well as others from across the company. You’ll meet everyone from designers to researchers, to engineers and product managers. I’ll post my interviews here so you can get to know them at the same time as I do.

I am planning to focus on how they’re thinking about search, but I’ll ask them everything from the serious to the silly so you can see who they are on and off the Yahoo! campus (yes, some of us actually have lives outside of Yahoo!).

My first profile will be Paulien Strijland, Head of User Experience and Design for Yahoo! Search & Marketplace. Paulien hails from the Netherlands and has spent the better part of 15 years at Apple, PayPal and NetFlip.

I’m sitting down with Paulien shortly so if there is anything you’d like me to ask, post a comment and I’ll try to throw it her way.

’til then….

Yvette Irvin
Y! Profiler (and In-Product Marketer)

September 09, 2004

A Vacation From Search? Hardly!

Last Monday I was just back from a long-delayed one-week vacation in Mexico, and got two emails from two VPs at Yahoo! about some posts on a search forum. One thread complained about the quality/relevance of Yahoo! Search results for certain queries, and another was about the error 999 that a few users encountered while using Yahoo! Search.

Man, talk about back to reality: from basking in the sunshine on the sandy beaches of Cancun, to having to dig through all the software changes and log entries to investigate these two seemingly random posts. Hmmm…Now I know why search developers have no social life: having to do no evils at the cost of no vacations (or vacations immediately purged from one’s short/long-term memory). But I digress. The investigation actually turned out to be fruitful: one inadvertent bug (is there any other kind of bug?) and one intended behavior (error 999 would be served to machines sending “excessive, anomalous, or abusive traffic” – and on the heels of MyDoom, a few users might get this, not knowing that their machines might be infected with MyDoom. This should be a topic of another blog post.)

But I digress again. The point here is that we get feedback in nearly real-time now. Blogs, forums, message boards, you name it. I am impressed that Yahoo! Search cares enough about its product quality to monitor the posts from its users, especially when the monitoring was actually done by the “higher ups” (hey, we have QA and surfers here at Yahoo! Search too you know). Before having a search engine of our own, we relied on others and had to wait for escalation, feedback, service level agreements, etc … before an issue was resolved. Now it seems like everyone here helps keep an eye on things. For that, I am thrilled.

Whether that thrill qualifies to erase my annual vacation so fast from my memory, the jury is still out.

Nam Nguyen
Technical Yahoo!