Tuesday, June 16, 2009

Why and when does computer search fail?

Why does search fail sometimes (or even often)? And when does it fail most often?

Have you noticed that the more specific your request is, the more likely a search is to fail? Have you noticed that in some cases you would rather use an encyclopedia (or a portal) than a search engine? Have you wondered why that is? The answer is that today any search is word finding. The more general a term is, the more closely the word coincides with its meaning. When a request becomes more specific, everything depends on which words are used to express it. If you use words that occur in the text, and in the same order, the search will succeed. As soon as the words vary, you get nonsense, though perhaps with millions of pages in the results.

Words are not meaning; words express meaning. Therefore, any search today is just a guess. Some search engines guess better than others. Let's say it explicitly and precisely: this is not search but word search. What people really want to search for is meaning. There is no meaning search today.

How does search work today? Imagine you go to a library and ask for books on some topic. A librarian leads you into a room with thousands of books and suggests that you search a little more there. There is an alternative: look in a catalogue and find the corresponding topic, which contains books strictly on the theme you are interested in. But in fact there are still quite a lot of books, with thousands of pages. It is an easy task if you want to know something about stars and find a book on astronomy. It is harder to find a book about the Andromeda galaxy. Harder still is to find specifics on which planets have been found in the Andromeda galaxy, and so on.

The main difficulty in finding information is that any text is just a stream of words. The main relief is that this stream is ordered by specific rules. There are additional obstacles to overcome: context, which can change the meaning of words; different understandings of the same words and terms; and so on. How does contemporary search work in general? It collects words from texts and creates a way to find them again easily and quickly; optionally, it may also create some associations between them. Each search engine is, in fact, a creature that knows words but does not understand what they mean.
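To make the "collect words, then find them again quickly" idea concrete, here is a minimal sketch of an inverted index in Python. It is only an illustration, not how any particular search engine is built; the two sample documents and the function names (`tokenize`, `build_index`, `search`) are made up for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map every word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample documents, invented for this illustration.
docs = {
    "astro": "Astronomers look for planets in the Andromeda galaxy",
    "myth": "Andromeda was rescued by Perseus in Greek mythology",
}
index = build_index(docs)

print(sorted(search(index, "Andromeda")))          # both documents match
print(sorted(search(index, "Andromeda planets")))  # only the astronomy one
print(sorted(search(index, "Andromeda worlds")))   # nothing: "worlds" never occurs
```

The last query means the same thing to a person as the second one, but the index only knows which words occur where, so it finds nothing.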

But information is not just words. Any piece of information is a description of some space-time configuration. There are some objects (real or abstract) with some properties and attributes, which are introduced; they do something (in reality or in imagination); they are linked by some associations; and their time entities (actions, events, etc.) are linked by other associations. So the meaning of each word, phrase, sentence, paragraph, or text leads us to a space-time construction, which can in turn be linked with the whole space-time building that exists in our mind. When we search for something, we search not for words but for some part of a space-time construction. If we search for "Andromeda", we mean either a galaxy, or a music band, or a constellation, or something related to mythology, and so on. A search engine should know about these variants, and it should answer not with a million pages but with dozens of objects (or actions). If we give more specific words, like "Andromeda planets", it should give us pages or even paragraphs (for example, in an interview with an astronomer) about exactly the planets of Andromeda.
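As a rough sketch of what "dozens of objects instead of a million pages" could look like, the Python below keeps a tiny hand-written table of objects named "Andromeda" and narrows it with the extra words of the query. Everything here is an assumption made for illustration: the `Sense` record, the `SENSES` table, the related terms, and the rule that the first query word is the object name. A real meaning search would need far more than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    name: str           # the word people type
    kind: str           # what kind of object it denotes
    related: frozenset  # hypothetical terms associated with this object


# Hand-written, hypothetical table: one entry per object, not per page.
SENSES = [
    Sense("Andromeda", "galaxy", frozenset({"planets", "stars", "astronomy"})),
    Sense("Andromeda", "constellation", frozenset({"stars", "sky", "astronomy"})),
    Sense("Andromeda", "mythological figure", frozenset({"perseus", "myth"})),
    Sense("Andromeda", "music band", frozenset({"album", "songs"})),
]


def find_senses(query):
    """Return the objects the query may refer to.

    Simplifying assumption: the first word is the object name and the
    remaining words must all be terms related to the object.
    """
    words = query.lower().split()
    if not words:
        return []
    name, extra = words[0], set(words[1:])
    candidates = [s for s in SENSES if s.name.lower() == name]
    if not extra:
        return candidates
    return [s for s in candidates if extra <= s.related]


print([s.kind for s in find_senses("Andromeda")])          # all four objects
print([s.kind for s in find_senses("Andromeda planets")])  # only the galaxy
```

The point of the sketch is the shape of the answer: a short list of distinct objects to choose from, which then narrows as the request becomes more specific.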