Wikipedia founder plans open-source search engine

31 July 2007

The building blocks for a community-driven internet search engine that will compete with the likes of Google and Yahoo are being put in place, according to Wikipedia founder Jimmy Wales.

Wales told a conference of software developers in Portland, Oregon, US, that his commercial start-up, Wikia, had acquired Grub, a distributed web crawler that scours the web and indexes relevant sites, from California-based company Looksmart.

Volunteers can download the Grub web crawler, which runs in the background on their PC, indexing web pages according to their content. The crawler will be used as the basis for Wikia's forthcoming search service. By contrast, search engines like Google run their own web crawlers and keep details of the way they work secret.

Search results for Wikia's search service will be generated using an open-source search platform called Lucene. Wales said he is looking at options to enhance Lucene, but would not reveal any details.

Untangled results

But, like Wikipedia, Wikia's search service will also seek to make use of human editors. When a public version of the search service launches, toward the end of 2007, users will be invited to help untangle results by, for example, identifying the correct site for terms with multiple meanings.

"If we can get good quality search results, I think it will really change the balance of power from the search companies back to the publishers," says Wales, chairman of Wikia, based in San Mateo, California. "I could be wrong about this, but it seems like a likely outcome."

Wales says Wikia will open up Grub to other developers so that they can make improvements or use the crawler for their own purposes.

Wales founded Wikipedia, a non-commercial project and one of the web's most popular sites, in 2001. He also co-founded Wikia, although the two organisations have no formal ties. Wikia lets users create specialised Wikipedia-style sites on topics ranging from popular TV shows to health or travel.

Explicit judgments

Open search is part of Wikia's broader push to promote the spread of free content publishing on the web, according to Wales. The objective, he says, is to make explicit the "editorial judgments" involved in modern web search systems. Proprietary search engines like Google keep key details of their search systems secret, both to prevent link spam and for competitive reasons.

Ultimately, Wales wants the Wikia search service to be available to other websites and smaller publishers, so that they can install a custom version of the service on their site. Target customers might include local newspapers, for example.

He detailed his plans at the O'Reilly Open Source Convention (OSCON), an annual gathering of open-source software developers.

http://technology.newscientist.com/article/dn12387-wikipedia-founder-plans-opensource-search-engine.html