Tuesday, January 10, 2012

Can humans treat semantics? Can machines treat natural language?

I constantly hear objections like "Humans are pretty bad at semantics: they miscommunicate meaning and attach different assumptions about relationships or meanings to a term", with references to Frege, Russell, Mill, etc. On the other hand, I hear "Natural language cannot be interpreted by computers; there were many such attempts in the past, and they all failed". Of course, I agree with both objections, because they are true.

Yes, these are impossible tasks. But everything we use today was an impossible task at some moment in the past. Mathematics and physics are full of unsolved (and sometimes unsolvable) problems, yet that does not prevent us from using particular cases of them. Impossible tasks are not always solved by intuition or a sudden burst of inspiration; some are solved only because they were restricted appropriately. Do not think "Humans are bad at semantics, because they miscommunicate constantly": despite such problems, we eventually understand each other. Do not think "Machines cannot learn natural language, therefore it is impossible forever": search engines can already fetch meaningful results in some cases. There is constant progress in both directions, and these directions converge at some point. That very point could be the turning point.

Machines prefer information to be as specific as possible (because they do only what they are instructed to do). Humans prefer information to be as general as possible (because we constantly abstract, and it saves time: for example, instead of "I am going to 21 Main St., Springfield" we say "I am going home"). Therefore, when machines meet natural language, which usually consists of terms that are as general as possible, they have problems with it. The same applies to humans: as soon as they need to specify something in detail, they start to make mistakes, lag, and misuse terms. To rephrase: machines can interpret only what they have been taught to interpret, but when they do, they do it thoroughly; humans can interpret anything, but when they do, they may do it unreliably.

One consequence of this difference in thinking models is that information is matched differently too. In the case of machines, a data format can define only what it was set up to define, no more than that. If you need to link data from different formats, you usually have to solve the problem of compatibility between one format and another. Natural language is quite different: it can define anything based on a small set of rules, and to link information you just need to match identifiers. For example, suppose you have separate databases (which are a sort of format too) and applications for movies, history, and geography. If you want to know more about the events and places that some historical movie is based on, you need to work with three applications and copy-paste information from one to another. Or you need one application with predefined linkage between these databases. There are no other ways. The same applies even to hypertext: if there are no links from the movie's page to the relevant events, you have to copy the text that relates to the events and open or search for them separately. Of course, copy-paste is one of the most important technologies of information exchange (no joke). But it is a forced solution. Natural language works differently: you can easily combine words that relate to different domains in one sentence.
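The idea of linking information just by matching identifiers can be sketched in a few lines. This is only an illustration under stated assumptions: the three dictionaries stand in for independent movie, history, and geography databases, and the identifier scheme ("movie:...", "event:...", "place:...") and the helper names `resolve` and `reachable` are invented for this example, not part of any existing system.

```python
# Three independent "databases"; none of them knows about the others.
# Identifiers and the "links" field are hypothetical, chosen for illustration.
MOVIES = {
    "movie:Waterloo (1970)": {"links": ["event:Battle of Waterloo"]},
}
HISTORY = {
    "event:Battle of Waterloo": {"year": 1815, "links": ["place:Waterloo, Belgium"]},
}
GEOGRAPHY = {
    "place:Waterloo, Belgium": {"country": "Belgium"},
}

SOURCES = [MOVIES, HISTORY, GEOGRAPHY]


def resolve(identifier):
    """Look an identifier up in every source; no source-specific glue code is needed."""
    for source in SOURCES:
        if identifier in source:
            return source[identifier]
    return None


def reachable(start):
    """Return the start identifier and everything reachable from it by following links."""
    found = []
    queue = [start]
    while queue:
        current = queue.pop(0)
        if current in found:
            continue
        found.append(current)
        record = resolve(current) or {}
        queue.extend(record.get("links", []))
    return found


print(reachable("movie:Waterloo (1970)"))
```

Starting from the movie, the walk reaches the event and then the place without any copy-paste and without one application that has the linkage between the three databases built in. The only shared convention is the identifiers themselves.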

What's worse, we are forced to comply with the machine-friendly way of processing information. That was appropriate back in the 1960s, when hardware performance was not sufficient to move far from machine commands. Slow progress in this direction is marked by the following events: (1) interpreting bytes as text, (2) the invention of the OS and the command line, (3) the introduction of the graphical user interface, (4) the use of search instead of portals, etc. In each case, a computer entity was replaced by a more human-friendly one. But computers (and mobile devices) are still far from being human-friendly. There are many cases where we are forced to use computer entities. We manipulate files and URLs according to where the computer stores information, we manipulate the GUI according to where the interface places its controls, and we manipulate emails, pages, and other information containers just because everyone does so out of habit, and because there is no convergence point between the human-inclined and machine-inclined ways of thinking.

At first, improving this situation seems an impossible task. All attempts to move from the opposite corners have failed or are very restricted (which at least makes them work). From one corner, the movement is toward the analysis of natural language. From the other, we move into the area of human-friendly ways of creating semantics. Why did these attempts fail? Because machine-driven analysis of natural language may cope with the rules of the language, but it surrenders in front of the language's identifiers. These identifiers (words, phrases, sentences, etc.) are generalized by humans themselves and have different and sometimes ambiguous meanings. What's worse, machines cannot match them, because machines work with a different type of compatibility.

Humans always fail to interpret the full meaning because it is always infinite. For example, "art" has infinitely many definitions, which talk about senses, emotions, intellect, music, literature, painting, or sculpture. There are many books that describe art very differently. There are infinitely many ways of interpreting and unwinding all the implications, because (1) the meaning of any word may ascend to "everything" (thanks to abstraction), (2) there are infinitely many paths between a thing or a concept and "everything" (like "Nature -> Humanities -> Art" or "Entertainment -> Art", etc.), (3) meaning may change with time and thought, (4) subjective meanings may differ significantly, etc. So a thorough definition of meaning is impossible (even machines can't do it), but should we discard meaning altogether? Apparently not.

Some consider tagging the best-case scenario for humans to treat meaning. However, tagging is only a continuation of the idea of keywords, which has been in use for a long time. It flourishes nowadays not because it treats meaning, but because it provides multiple entry points to information (which is impossible with a URL). But it fails constantly where meaning is concerned. Take, for example, the article "Full speed ahead for UK high speed rail network", which is tagged "Birmingham, London, Transport, United Kingdom". These are quite strange choices for tags. Why do Manchester and Leeds have no tags, though they are mentioned in the article? Why "Transport" and not "Railway"?

There are several problems with tagging (and keywords), because they are:
- arbitrary, because, as mentioned above, anything has a potentially infinite set of meanings
- fixed: an article classified as "Transport" won't fall into a "Railway" category that may be created later, once "Transport" covers too many articles (and becomes useless because of that)
- overlapping, because creators of information often want it to fall into all possible categories (that is, into "everything").
What about the title of the article? In fact, it also abstracts the article's meaning, but there is another problem: a title is usually full of metaphors and allusions meant to attract consumers of information, and these may distort the meaning.

However, everything changes if we lower the bar. We don't need a full analysis of natural language or a full unfolding of meaning. It is enough just to link them. Such linking exists today too: (a) developers may link computer entities and natural language in an interface, (b) anyone may mark up a word as a hyperlink, etc. However, developing an application for each case is too expensive, whereas a hyperlink can refer only to an information resource. But it is exactly the hyperlink that hints at what the sought solution should look like. This is the semantic link, which is just a further development of the idea of the hyperlink, with the following features:
- it relates to both natural language and computer entities
- it may refer to real things and abstract concepts (and, in particular, to computer entities)
- it is flexible (it can be as general or as specific as necessary).

As a model we may use another HTML tag, which relates to both natural language and computer entities, similarly to a URL. For example, "I was in New York" has several implicit semantic links: "I", "was", "New York", and "I was in New York" as an event. The explicit form of the semantic link for New York is <s-id="city"> New York </s-id> , and thus we have linked the words "New York" with the specific city, not a state, a hotel, or a cafe. The implicit form of a semantic link is any text (word, phrase, sentence, etc.) or data. A semantic link also implies an infrastructure behind it:
1. Identification with the help of globally resolvable identifiers. That is, "city:New York" should be resolvable on any computer.
2. Establishing relations (of identity, belonging, or association) between identifiers and complexes of identifiers. You should be able to link "I", "was", and "New York" explicitly, or it can be done by some automatic tool.
3. Identification routing, which allows the establishing of derivable relations to be delegated to identification routing servers. Thus, "city:New York" has derivable relations with "country:USA" or "USA:state:New York".
4. Usage of semantic context, which allows the area of meaning to be made narrower or wider. That is, if you have a lot of pictures of yourself in New York, you can limit their number by moving to the context of pictures of yourself in a specific district of New York, or extend their number by moving to the context of the USA. Context is rather a mix between a folder system (with a fixed number of scopes) and search (with top-level access to any item): it is top-level access to any item through a flexible scope.
5. Semantic wrapping for computer entities, which means that semantics may be attached to a file or other entity and accompanies it when it is transferred.
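Points 1, 3, and 4 of the list above can be sketched as follows. This is a minimal sketch under assumptions, not a definitive design: the dictionary `BELONGS_TO` is a hypothetical in-memory stand-in for an identification routing server, and the identifiers "district:Times Square", "city:Boston", and "person:me", the picture names, and the function names are invented for illustration. Only "city:New York", "USA:state:New York", and "country:USA" come from the text.

```python
# Hypothetical stand-in for an identification routing server:
# each identifier maps to the more general identifiers it belongs to.
BELONGS_TO = {
    "district:Times Square": ["city:New York"],
    "city:New York": ["USA:state:New York"],
    "USA:state:New York": ["country:USA"],
    "city:Boston": ["country:USA"],
}

# Pictures with semantics attached to them (illustrative data).
PICTURES = {
    "img1.jpg": ["person:me", "district:Times Square"],
    "img2.jpg": ["person:me", "city:New York"],
    "img3.jpg": ["person:me", "city:Boston"],
}


def generalizations(identifier):
    """Return every identifier derivable from `identifier` via 'belongs to'."""
    seen = []
    stack = [identifier]
    while stack:
        current = stack.pop()
        for parent in BELONGS_TO.get(current, []):
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen


def in_context(item_ids, context):
    """An item is in a context if it is linked to it directly or through routing."""
    return any(i == context or context in generalizations(i) for i in item_ids)


def pictures_in(context):
    """List the pictures visible in the given semantic context."""
    return sorted(name for name, ids in PICTURES.items() if in_context(ids, context))


print(pictures_in("district:Times Square"))  # narrowest context
print(pictures_in("city:New York"))          # wider
print(pictures_in("country:USA"))            # widest
```

No picture is tagged with "country:USA" directly; it appears in that context only because routing derives the relation. Moving between contexts changes the scope without moving any file, which is the difference from a fixed folder hierarchy.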

Although a semantic link is similar to a hyperlink, it differs in one aspect: it does not refer to computer entities only. Its purpose is to identify, associate, generalize, and specify meaning at any level, though ultimately it may refer to a thing, a concept, or a computer entity. For example, a picture of your family in New York can be named in very different ways: "Me and New York" (for personal use), "My family at Times Square" (for those who are acquainted with your family and New York), "John Doe in USA" (for someone who knows only one person from your family and is not aware of the difference between US cities), etc. Semantic links make these forms equivalent in some sense (not equal), because behind the scenes "New York" may route to "Times Square" and to "USA", and vice versa. That is, "New York" and "USA" here are not just names but rather semantics itself.

This is the key to understanding the semantic link: everything is semantics. That includes the semantic link itself, the destination of the link and the context the destination belongs to, and the subject who uses the link along with the subject's own context. This is a consequence not only of the nature of the semantic link but also of the possibility of attaching it to a computer entity such as a file. This avoids the need for additional hypertext to describe the entity and the need to redefine the semantics on transfer. For example, the picture will automatically appear in the corresponding context on your friend's side ("New York", "USA", or "you", depending on the necessary level of generalization and on intentions).

Usually I call this identification "human-friendly", which means it has to allow humans to define semantics. However, I emphasize human-friendliness only because most previous attempts were specifically machine-friendly. In fact, the proposed ways of identification and of establishing relations are both human- and machine-friendly. Not only will humans be able to mark up semantics, but it could also be done automatically or semi-automatically. For example, an application may help find all ambiguities and propose to resolve them in a human-friendly way (through GUI or SLI), for instance by offering "New York" the city and "New York" the state as variants. Is semantic markup needed for machines? Yes, because today data often contains unstructured information, stored in fields like "description" or "custom data". A semantic link makes it possible to handle exactly such information by generalizing or specifying it as much as necessary.

So, can humans treat semantics? Can they treat the meaning of "Full speed ahead for UK high speed rail network" with manually added semantic hints? Yes. For example, they can identify it as "High-speed rail in the United Kingdom" (though, of course, such a semantic link should refer not to the Wikipedia article but to high-speed rail in the United Kingdom itself). The difference from a hyperlink is that such an identifier implies (with the help of routing) that it relates to London, Birmingham, Transport, the United Kingdom, and many other items that we cannot predict. There are also many variants between full identification and no identification: for example, you can identify that the article is about "UK" and "railway", or about "high speed rail network", etc. This type of identification is quite affordable for an ordinary user, especially considering that the text itself gives hints about the semantics and that identification could be (semi-)automatic. Of course, you may ask: "But what is the difference if we treat it as plain text?" Try the queries "UK high-speed rail network", "UK rail network", and "UK railways": they give very different results. Articles similar to the one mentioned above show up only for the first query, and the third returns none of them at all.

Can humans differentiate simple relations? How are "UK" and "railway" linked? Is the UK a railway? Is the UK similar to a railway? The answer is evidently no, so identity, equality, or similarity cannot be applied here. Does the UK belong to the railway? No. Does the railway belong to the UK? Yes. What about "I was in New York"? How are "I" and "New York" linked? Identity? No. Neither do I belong to New York, nor does it belong to me. So we can always safely link them with a plain association. The four proposed relations are plain and easily recognized by users; in fact, users already apply them when working with files and directories. Identification is naming a file, generalizing/specifying is placing a file in a directory, and associating is not supported by all file systems but has already been introduced by the hyperlink.
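The questions asked above form a small decision procedure, which can be sketched like this. It is only an illustration: the enum `Relation`, the function names, and the wording of each relation are assumptions made for this example, and the yes/no answers are meant to come from a human (or from some tool), not to be computed.

```python
from enum import Enum


class Relation(Enum):
    """The four plain relations; the labels are illustrative wording."""
    IDENTITY = "is the same as"
    SPECIFYING = "belongs to"
    GENERALIZING = "contains"
    ASSOCIATION = "is associated with"


def choose_relation(same_thing, a_belongs_to_b, b_belongs_to_a):
    """Pick a relation between A and B from three yes/no answers.

    Falls back to plain association, which is always safe.
    """
    if same_thing:
        return Relation.IDENTITY
    if a_belongs_to_b:
        return Relation.SPECIFYING
    if b_belongs_to_a:
        return Relation.GENERALIZING
    return Relation.ASSOCIATION


def describe(a, b, relation):
    """Render the chosen relation as a readable statement."""
    return f'"{a}" {relation.value} "{b}"'


# Is the railway the UK? No. Does the railway belong to the UK? Yes.
print(describe("railway", "UK", choose_relation(False, True, False)))
# "I" and "New York": no identity, no belonging in either direction.
print(describe("I", "New York", choose_relation(False, False, False)))
```

The fallback branch reflects the point made in the text: when neither identity nor belonging applies, a plain association is still a valid link.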

Can machines treat natural language? Can they analyze natural language? No. But semantic markup gives a chance to improve the situation by setting semantic links only where they can be established automatically or set manually. Can machines convert data into natural language? Not quite. But you can link the two with semantics. For example, if you have a database of movies, you can mark up some movie data with semantic links to events without having to create databases of history or geography.

But, finally, who treats semantics and natural language better? Humans? Machines? In fact, the best result is achieved when their abilities are united. Yes, machines can identify some things and establish relations faster. However, they can't summarize them for a piece of text, they cannot abstract, and sometimes they cannot choose the correct variant based on statistical analysis. Don't get me wrong: statistics is very helpful when applied appropriately. Once a match is as precise as possible, statistics may help to choose one variant out of many equivalent ones. However, statistics cannot replace a precise match.

The purpose of the proposed approach is collaboration between humans and machines, made possible in a quite simple and straightforward way, here and now. Additionally, it may change how we interact with user interfaces and the Web. However, this approach does not resolve all problems of meaning. Yes, it is still possible to miscommunicate meaning. But humans make mistakes constantly; they miscommunicate meaning with plain text too. Does that mean we should discard all information? Definitely not. Nor does this approach make it possible to convert natural-language words and phrases into computer data. But machines are still too far from human-like thinking and from dealing with arbitrary abstractions. Does that mean we should wait for that? Certainly not. Just know the limits.
