
Wednesday, September 27, 2017

Yahoo is giving a critical piece of internal technology to the world -- just like it did with Hadoop

Yahoo is open-sourcing an internal tool called Vespa, which it uses for content recommendations, ad serving, and executing certain searches. Vespa is arguably Yahoo's biggest open-source software release since Hadoop in 2009, which formed the basis for two now-public companies, Hortonworks and Cloudera. Companies like Amazon, Facebook, and Google could find it useful.

Oath, the Verizon-owned parent company of Yahoo, is releasing for free some of its most important internal software, which the company has long used to make recommendations, target ads and execute searches.

The Vespa software solves a common but surprisingly difficult problem: quickly figuring out what to show a user in response to input, like when they type text into a box. Oath uses it in around 150 applications, including Flickr, Yahoo Mail and the main Yahoo search engine (specifically for components like entities, local results, images and answers to questions). It handles 3 billion native ad requests every day.

"The typical case is you don't know what you want to serve, but you have 20 billion pictures and you want to find the right ones," Jon Bratseth, a distinguished architect at Yahoo who led Vespa's development, told CNBC in an interview.

Vespa, which is now live on GitHub with an Apache 2.0 open-source license, can easily be added to different applications, making it suitable for use at big companies like Amazon, Facebook and Google that need to do different kinds of processing on different sets of data.

The release is the most important for Yahoo since it open-sourced the code for the Hadoop big data software in 2006. Hadoop has since come to be at the center of two public companies, Cloudera and Yahoo spin-off Hortonworks. Today people at lots of companies can contribute to technology that's still widely used at Yahoo, and build their own systems using Hadoop.

How Yahoo built it

Big tech companies regularly open-source their software. But if there's powerful software at the heart of a company's biggest revenue centers, it can take a while to come out into the open, and Vespa is no different.

Vespa dates back to the early 2000s. Yahoo already had web search technology, first through a partnership with Google and later through its 2002 Inktomi acquisition. What Yahoo didn't have was technology for delivering search results and recommendations on content that falls outside traditional web search results.

In 2003 Yahoo acquired Overture, which included its partner AltaVista as well as a lesser-known search engine called AllTheWeb.com. After the deal, the roughly 30 AllTheWeb people were given a year to build software that could perform certain functions quickly before web pages were shown to end users. The system also needed to be easy to set up, run and tweak, so that it could be applied to a variety of applications without much trouble.

Around 2005, the AllTheWeb team worked with Yahoo's shopping team to adopt the new system. It required less management time, freeing up staffers to build new features.

"After that, we had a proven use case -- and that was a complicated one," Bratseth said. "More and more teams in Yahoo started using our system by themselves, because it made business sense. They would offload a lot of the problems they had to take care of themselves."

So Bratseth's team started expanding the powers of Vespa. They made it capable of handling input other than users' strings of text; over time it could also personalize content based on what users had clicked on in the past, which is valuable in cases when users haven't typed in anything. They also changed Vespa so that it could take direction from machine-learning algorithms.

https://www.cnbc.com/2017/09/26/yahoo-open-sources-vespa-for-content-recommendations.html
