Just a thought: free distributed search?


Every once in a while, I just get a hare-brained notion. Today's was: why do we use a central website for doing internet searches at all? Why Google?

Consider the success of SETI@home, the distributed SETI project sponsored by the Planetary Society, and the distributed computing architecture that resulted from it. Consider the success of swarming download technology like BitTorrent. Consider how simple a basic web spider could be. Consider the efficiency of spidering networks locally. Consider the architecture of DNS.

See a pattern?

What if we replaced the concept of a search engine site with a search engine protocol? What if we ran small spidering operations on thousands of sites around the world instead of putting a massively parallel supercomputer in one room somewhere to do it? The individual spiders would be intelligent applications that learned their immediate environment, and then shared that data with others. Each person using the software could send queries into it, and each query would propagate up through a series of spiders to find the best sources of information on the subject.
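To make the idea a bit more concrete, here is a minimal sketch of how a query might hop from spider to spider. Everything in it is an assumption for illustration: the `Spider` class, the `ttl` hop limit, the `seen` set used to avoid loops, and the example.com-style URLs are all made up, and a real protocol would need networking, ranking, and trust on top of this.

```python
from dataclasses import dataclass, field


@dataclass
class Spider:
    """Hypothetical peer: indexes nearby pages and knows a few other peers."""
    name: str
    index: dict = field(default_factory=dict)  # term -> set of URLs
    peers: list = field(default_factory=list)  # neighbouring Spider objects

    def add_page(self, url, text):
        # Index every lowercased word of the page under its URL.
        for term in text.lower().split():
            self.index.setdefault(term, set()).add(url)

    def query(self, term, ttl=3, seen=None):
        # Answer from the local index, then forward to peers until the
        # hop limit (ttl) runs out; `seen` stops queries looping forever.
        seen = set() if seen is None else seen
        if self.name in seen:
            return set()
        seen.add(self.name)
        hits = set(self.index.get(term.lower(), set()))
        if ttl > 0:
            for peer in self.peers:
                hits |= peer.query(term, ttl - 1, seen)
        return hits


# Three peers in a chain: a <-> b -> c
a, b, c = Spider("a"), Spider("b"), Spider("c")
a.peers, b.peers = [b], [a, c]
a.add_page("http://a.example/faq", "free software search")
c.add_page("http://c.example/p2p", "distributed search protocol")

results = a.query("search")         # reaches c through b
nearby = a.query("search", ttl=1)   # stops at b, so only a's own page
```

The hop limit is the knob that trades completeness against network load: a small `ttl` keeps traffic local, a large one floods more of the network.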

Probably, you'd still need central indexes somewhere. But what if the index servers were run by lots of people, and not just one company?

It would be a whole new architecture, of course, and there are probably some weaknesses to it. Still, the idea of a peer-to-peer search network, with peer applications sharing both the indexing and querying load with each other, does seem feasible. After all, distributed computing is able to capture more computing power more cost-effectively than just about any supercomputer architecture, so the power to do it is probably there.

Makes me wonder if someone is already building it.

Just a thought...

Comments

Submitted by shtylman on

The idea may sound good, but I think that you will run into a network slowdown. If you have every computer trying to search the network for the same things, it will be inherently redundant. I think the central index idea is better. Just think: if your computer network were being crawled at all times by various spiders looking for content, imagine the security concerns. Distributed things are good, but they do have their tradeoffs.

Terry Hancock's picture

My blog entries at Free Software Magazine may be reprinted with this notice:
Copyright (C)2004-2006 Terry Hancock / License CC-By-SA 2.5+
http://creativecommons.org/licenses/by-sa/2.5
Originally at http://www.FreeSoftwareMagazine.com

Submitted by Anonymous visitor (not verified) on

They have already done it:

http://www.majestic12.co.uk/

Submitted by Anonymous visitor (not verified) on

You'd have to have a large existing user base to pull this off. IMO, adding this as a component to Skype or, even better, Firefox would be a great way to go. Yes, there would be some security concerns... but how big would they be? I think that if 10 PCs in a cloud index the same item, and 9 give one answer and 1 gives another, you toss the odd data out.

Martin Tibbitts
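
The "9 out of 10 agree" idea in the comment above could be sketched roughly like this. It is only an illustration: the `majority_answer` function, the `quorum` threshold, and the `hash-A`/`hash-B` values standing in for what each PC reports are all hypothetical, and simple voting like this would not stop an attacker who controls most of the peers.

```python
from collections import Counter


def majority_answer(reports, quorum=0.5):
    """Return the answer most peers agree on, or None without a clear majority."""
    if not reports:
        return None
    # most_common(1) gives the single most frequent answer and its vote count.
    answer, votes = Counter(reports).most_common(1)[0]
    # Accept only if strictly more than `quorum` of the peers gave this answer.
    return answer if votes / len(reports) > quorum else None


# Ten peers index the same item: nine agree, one does not.
reports = ["hash-A"] * 9 + ["hash-B"]
agreed = majority_answer(reports)  # the odd report is outvoted
```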

Author information

Biography

Terry Hancock is co-owner and technical officer of Anansi Spaceworks. Currently he is working on a free-culture animated series project about space development, called Lunatics, as well as helping out with the Morevna Project.