Guest Bloggers Blog Posts

This is the Guest Bloggers archive of the Yahoo! Search blog.

January 23, 2006

Yahoo! Hacks. The Good Kind...

Hey all. Yahoo! turned the keys of their blog over to me for one post so I can tell you about a book I put together called Yahoo! Hacks. If you're not already familiar with O'Reilly's Hacks Series, don't worry, the book isn't about breaking into Yahoo! without permission. Instead, the Hacks Series is trying to reclaim the word "hacks" for the good guys, using the word in its original geeky sense of describing a cool technical shortcut, useful bit of code, or a clever use of an existing application. Each hack is a tip, trick, or project you can put together that uses Yahoo! in a unique way, ranging from using simple shortcuts to fine-tune searching to adding new features with Yahoo! Web Services.

Here's a quick example. I like to change my desktop background. Most of the time I spot something on the Web at random, right-click the image, and choose "Set as Desktop Background..." from the menu. This method isn't perfect though because most of the time the image isn't quite the right size for my desktop. So I have to go into my desktop settings and choose "stretch" or "center" to see which is the best way to display the background picture. Stretching the image usually means the picture will look blurry or distorted, so ideally I'd like to find images that are the right size for my desktop.

Yahoo! Image Search is great for spotting new desktop backgrounds. Yahoo! has already done the work of gathering thousands of images from across the Web into one interface for browsing. And if you know a few of what Yahoo! calls Search Meta Words, you can specify the size of images you'd like to see. Try a search for cityscape, and you'll find thousands of images of all shapes and sizes. If your screen resolution is set to 1024 x 768, you can specify that with the width: and height: Search Meta Words. So the query cityscape width:1024 height:768 will give you only images that are a perfect match for your desktop background.
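
As a small illustration of the query format described above, here is a minimal Python sketch that builds the search string from a term and a screen resolution. The function name `image_query` is hypothetical and not something from the book; only the `width:` and `height:` Search Meta Words come from the text.

```python
def image_query(term, width, height):
    """Build an image search query restricted to an exact pixel size.

    Only the width: and height: Search Meta Words come from the post;
    the function itself is an illustrative helper.
    """
    return f"{term} width:{width} height:{height}"


# Query for images that match a 1024 x 768 desktop.
print(image_query("cityscape", 1024, 768))
```

The resulting string can be pasted into the image search box as-is.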

In Yahoo! Hacks, Hack #83 shows how you can take this idea a step further with a bit of code to set a new desktop background automatically when you start your computer. So you specify a word like cityscape, and Yahoo! Web Services will deliver a random image to your desktop. It won't always give you a background image that you'd set for yourself, but it's a fun example of using Yahoo! to bring some randomness into your life.

Another of my favorite Search Meta Words is aspect: for Yahoo! Video Search. The term aspect ratio refers to a video's display width divided by its height. As HDTV is taking off, more and more video is available in a widescreen format which is represented by the 16:9 aspect ratio. Standard television video has a 4:3 aspect ratio—almost a square, and computer monitors 5:4. With a little math, you can tell Yahoo! Video Search which aspect ratio you're looking for. You take the video width multiplied by 100, then divide by the height and round down. That means widescreen movies and HDTV have the computed aspect value of 177.
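
The arithmetic above (width times 100, divided by height, rounded down) can be sketched in a few lines of Python. The function name `aspect_value` is made up for this illustration; only the formula and the `aspect:` Search Meta Word come from the text.

```python
def aspect_value(width, height):
    """Compute the aspect: value: width * 100 / height, rounded down."""
    # Integer floor division does the "round down" step.
    return (width * 100) // height


print(aspect_value(16, 9))   # widescreen / HDTV -> 177
print(aspect_value(4, 3))    # standard television -> 133
print(aspect_value(5, 4))    # 5:4 computer monitor -> 125

# Build a query string such as "Matrix aspect:177".
print(f"Matrix aspect:{aspect_value(16, 9)}")
```

Because the value is rounded down, a pixel size such as 1920 x 1080 gives the same result as the plain 16:9 ratio.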

Here's a quick way to see how it works. Try a search for Matrix at Yahoo! Video Search and you'll get over 20,000 results. Now specify the aspect ratio by searching for Matrix aspect:177 and you'll get under 100 results, but each result is in widescreen format. (Usually a bit higher quality than 4:3 videos.) There are quite a few Search Meta Words for Yahoo! Video Search, and Hack #11 in the book explains them all.

Many thanks to the fine folks here at Yahoo! Search Blog for letting me tell you a bit about Yahoo! Hacks. I hope you'll check it out. And happy Yahoo! hacking (the good kind).

Paul Bausch
Author

October 02, 2005

Announcing the Open Content Alliance

From time to time we've invited guest bloggers to write on the Yahoo! Search blog. Today we welcome Brewster Kahle, founder of the Internet Archive. We asked Brewster if he'd like to introduce the Open Content Alliance.


Is Open Content the next step in the traditions of Open Source and an Open Network? Many people seem to think so (and wouldn't it be great?). We are working with libraries, government institutions, archives, technology companies, and web companies, and we are all saying the same thing-- it is time to have more great material available on the Internet and to have it be open and free.

The opportunity before all of us is living up to the dream of the Library of Alexandria and then taking it a step further-- Universal access to all knowledge. Interestingly, it is now technically doable. Then the question became-- is it in the interest of enough people and institutions to get there? Some hang-ups have been around costs, rights, and guidelines for sharing. All of these things were worked out for their domains by Internet folks and open source folks in the last few decades. But how are we going to build a system that has everything available to everyone?

I am jazzed to say that a group of organizations is starting an Open Content Alliance to try out answers by joining new and existing collections. We are looking for more contributors and helpers. We are starting with a set of principles.

To kick this off, Internet Archive will host the material and sometimes help with digitization, Yahoo will index the content and is also funding the digitization of an initial corpus of American literature that the University of California system is selecting, Adobe and HP are helping with the processing software, University of Toronto and O'Reilly are adding books, Prelinger Archives and the National Archives of the UK are adding movies, etc. We hope to add more institutions and fine-tune the principles of working together.

Initial digitized material will be available by the end of the year.

So the costs are mostly being borne by the host institutions based on their own fundraising or business models. The cost of digitization is sometimes offset by a different party (in the case of American Lit-- Yahoo!). We think this can scale to millions of books, movies, and audio recordings.

Yahoo! has been great to work with on this because they get it, and have substantial abilities to cause things to happen. I find it interesting how enduring a company's culture is. Jerry Yang and David Filo's personalities are still quite evident in the company today.

The rights issues come in many flavors, but our guiding principle is to offer high-resolution, downloadable, reusable files of the public domain. When we are dealing with in-copyright materials, the Internet Archive has been leveraging the Creative Commons licenses to great effect. In-copyright issues remain, but at least we can get substantial work going on the public domain.

We believe that donors should have the option to restrict the bulk re-hosting of a substantial part of a collection. This seems fair and is similar to the Creative Commons Sampling license. Interestingly, University of California and Yahoo have decided to not put any restrictions. So if another library wants to re-host these on their website, or another search engine wants to integrate them into their page flipping system, they are welcome to. This is so great-- let's let the public domain stay public and build business models on in-print materials.

To be clear, the public domain works in the Open Content Alliance can be "borrowed" in bulk to build navigation services, do research on, and the like. Bits and pieces of the public domain collections can be re-used and re-interpreted. If someone wants to print and bind a book and sell it on Amazon.com-- go nuts. If they want to make it into an audio book and post it on the web-- go for it (we will even supply the hosting for this). Basically, let's have a blast building on the classics of humankind.

On October 25th we will be demonstrating some of the new bookscanning and partner technologies.

If anyone is interested in helping with this, please contact us at oca at archive dot org.

Thank you!

-brewster
Founder, Digital Librarian, Internet Archive

May 17, 2005

Marc Canter on Ourmedia's Support of Media RSS

From time to time we've invited guest bloggers to write on the Yahoo! Search blog. Today we have a post from Marc Canter, one of the main forces behind Ourmedia.org.


Since its launch in late March, Ourmedia.org has quickly gained a reputation as the place to upload and store works of personal media for a global audience. We're a nonprofit out to make it easy for the masses to publish their videos, audio files, podcasts, photos - any digital media.

And today we're announcing our support for Media RSS output.

This not only gets us into sync with state-of-the-art media publishing, but it also allows us to get all of the Ourmedia.org content indexed into the Yahoo media search engine - so that LOTS of people can find our material, whether it's under traditional copyright or a Creative Commons license.

As the open media world grows, having support from folks like Yahoo really helps. Yahoo brings cred, resources and a professional approach to a world rampant with hobbyists, amateurs and lovers of media - but devoid of code repositories, test rigs and good QA control.

We've been working hard to bring the vision of free storage and bandwidth to digital creators, and having Media RSS support now enables us to get that stuff to a wider world out there.

Marc Canter

April 07, 2005

Jimmy Wales on Wikipedia and Yahoo!

From time to time we've invited guest bloggers to write on the Yahoo! Search blog. Today we have a post from Jimmy Wales, president of the Wikimedia Foundation. We asked him to write a few words about our donation to the foundation and our efforts to better integrate Wikipedia content into Yahoo! Search worldwide.


Wikipedia is a global charitable effort to create and give away a freely licensed encyclopedia in every language of the world. We have achieved a remarkable amount in our short history (just over 4 years!), having built already the largest English language encyclopedia in history, and very large encyclopedias in French, German, and Japanese, as well as strong efforts underway in over 100 more languages.

In addition to Wikipedia, we have many spin-off projects of equal importance from Wiktionary (dictionary) to Wikibooks (textbooks) to Wikinews (news reporting) and more.

Our growth in web traffic continues to be staggering, doubling every few months. Yahoo's generous donation to our cause in the form of servers, hosting and bandwidth will have a huge impact on our ability to get our message of sharing knowledge out to the world. Yahoo's donation is purely charitable in nature with no requirements for us to show advertising, and no ownership or control of our work by Yahoo of any kind. Yahoo is simply enthusiastic about the goodness of our work.

As our relationship with Yahoo has grown over the past year, we began to talk about other ways that Yahoo could help us. One theme that made sense for both of us was to think about Yahoo's global reach and Wikipedia's global goals. As we have grown it has become apparent that we can better serve our visitors by adding data centers around the world.

With the growth of the many Asian Wikimedia communities, the location of a new datacenter for Wikimedia in Asia made a lot of sense to us both.

But as generous as the hosting is, we are even more excited about Yahoo's recognition of the value of our work in enhancing the experience of Yahoo visitors. This exposure will let even more people know about the great cultural things that are happening on the Internet and get even more people involved in helping us to help each other make the world a better place.

Jimmy Wales
Wikimedia Foundation

November 15, 2004

Find It in a Nearby Library

As we said back in September, every so often, non-Yahoos will be showing up here to blog about their views on search or share some news. Today, Andy Boyer from Open WorldCat is joining us to introduce the new Yahoo!/OCLC toolbar...


Want a quick way to find library resources online?

The OCLC -- Online Computer Library Center -- has teamed up with the Yahoo! Toolbar folks to make it easier for you to access two million of the most popular records found in WorldCat, a central catalog of library holdings. The Yahoo!/OCLC toolbar is a project associated with Open WorldCat, a new OCLC initiative designed to increase the online visibility of libraries and their collections.

The toolbar lets you restrict your search to just the WorldCat database and locate libraries in your area that house materials you'd expect to find in libraries: books, movies, and historical archives.

To find WorldCat records, enter a search term in the toolbar's search box, and click either the WorldCat logo or select "Libraries" from the drop-down menu next to the "Search Web" button.

The WorldCat bibliographic database was built by thousands of librarians over several decades, and is maintained by OCLC. It has 57 million catalog records for items in nearly 1 billion locations.

We're busy making the rest of the WorldCat database available to crawl and if you're interested in watching it grow (every 12 seconds, a new record is added to WorldCat), check out: http://www.oclc.org/worldcat/grow.htm.

Also, if you're attending the Internet Librarian conference in Monterey this week, stop by the Yahoo! Search-sponsored Internet Café where the Yahoo!/OCLC toolbar will be on every computer.

Andy Boyer
Open WorldCat Product Manager

September 13, 2004

Guest Blogger: Danny Sullivan, Editor, Search Engine Watch

From time to time, you'll see some non-Yahoo! folks talkin' it up on this blog about various topics related to search. We're excited to have Danny Sullivan, creator and editor of the popular Search Engine Watch newsletter, as our first guest blogger. As many of you know, Danny knows a thing or two about the search space. He's been covering search engines since '95 and is considered one of the top experts in the space. [Update: Just to put any doubt to rest, guest bloggers are not compensated in any way by Yahoo!. We'll be sure to make that clear with each guest blogger post.]
------------------------------------------------------------------------------------------

In 1989, drama in The OC wasn't on television but between two newspapers, the Orange County Register and the Orange County edition of the Los Angeles Times.

The LA Times wanted to win in Orange County, where it had half the circulation of the OC Register. It started pouring resources and people into its coverage of OC. That's how I got my start in journalism, as part of a new wave of hires helping the Times in its newspaper war.

The Register didn't sit still, of course. It ramped up its efforts and people as well. As a journalist, being in the middle of this battle was great fun. Everyone was charged and excited about beating the competition. As a result, OC had news coverage like it had never seen before.

Why this trip down my memory lane? Many readers of this blog no doubt know there's something of a search war going on right now. Like the newspaper war I experienced, the result is great for those the search engines serve. The resources, efforts and people being tossed into trumping the competition should ultimately benefit the searcher.

I got to see the energy being unleashed first hand a few weeks ago, when I made a regular trip out to California to meet in person with various search companies. I've come to Yahoo! each year since 1997, but this last visit was special. I saw more new and innovative things in the works than ever before. But more important, the people themselves were pumped up and energized to be first in search.

I can't talk about the new stuff yet, so I wanted to instead highlight the most important thing Yahoo!'s done this year. It's not super cool search commands (though my colleague Tara Calishain just did an excellent job of pointing out some of those, so check them out).

It's not Yahoo!'s search shortcuts that have been blogged about already (but those are great -- keep them coming). RSS feed searching? Travel search? Local search? All great. But above it all, I like that in February, Yahoo! returned to having its own unique "search voice."

Search voice? How can a search engine have its own voice? I need some search engine 101 to explain, and apologies to those who know this stuff already!

When we search, the search engine checks what's called its index. The index is a collection of documents the search engine has gathered from crawling the web. Think of it as a big book of answers it consults, every time you do a search. No search engine has exactly the same set of documents in its index as another.

Each search engine also uses its own unique method of flipping through the index to decide which documents should be top ranked. That method is called the search engine's algorithm.

The combination of having a unique index and a unique algorithm is what gives a search engine its own voice, as expressed in the results it presents. (Want to easily see this? Check out this new comparison tool).
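
To make the idea concrete, here is a toy Python sketch of two made-up engines. Everything in it is an assumption for illustration: the four sample documents, the two scoring functions, and the `search` helper are invented and do not describe how Yahoo! or any real search engine works. The point is only that a different index plus a different algorithm yields different results for the same query.

```python
# A tiny "web" of four documents (invented sample data).
DOCS = {
    "a": "cellular movie review thriller",
    "b": "cellular phone plans",
    "c": "movie review cellular cellular cellular",
    "d": "review of the movie cellular with cast list",
}


def search(index, score, query):
    """Return ids of indexed documents containing the query word, best first."""
    hits = [doc_id for doc_id in index if query in DOCS[doc_id].split()]
    return sorted(hits, key=lambda doc_id: score(DOCS[doc_id], query), reverse=True)


def by_frequency(text, query):
    # Hypothetical algorithm 1: more mentions of the word ranks higher.
    return text.split().count(query)


def by_brevity(text, query):
    # Hypothetical algorithm 2: shorter documents rank higher.
    return -len(text.split())


# Each engine has crawled a different subset of the documents.
engine_one = search(["a", "b", "c"], by_frequency, "cellular")
engine_two = search(["b", "c", "d"], by_brevity, "cellular")

print(engine_one)  # ['c', 'a', 'b']
print(engine_two)  # ['b', 'c', 'd']
```

Same query, two different top results: that difference is the "voice" in miniature.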

You can think of it like movie reviewers. Is the movie Cellular that just opened good? Checking Yahoo! Movies, I find that of 11 different reviews, grades ranged from C- to A. Each reviewer saw the movie differently, judging it against their own unique tastes and knowledge.

The same is true with search engines. We want a variety of opinions, alternatives we can consider. When Yahoo! unveiled its own search technology earlier this year, the company regained its unique search voice, and the web as a whole benefited from the greater diversity.

I fully expect Yahoo! to keep building on the progress it's made so far -- and its rivals are just as full of energy and ideas. I hope the competitiveness of the search wars keeps going strong. We all benefit from the search companies being kept on their toes.

Danny Sullivan
Editor, Search Engine Watch