Monday, December 12, 2011

Future of operating system and Web: Q&A

 Brief introduction (Google Docs)

 Q&A (Google Docs)

What can an OS look like?

How do we exchange information today?

How can we exchange information tomorrow?

Wednesday, September 28, 2011

Browser vs. Operating System: Paradigm Shift

Recently we have observed the rise of a pack of new operating systems that are considered competitors to traditional ones (which date back 30 years and more). Some of the new ones are oriented toward mobile devices; others are Web- and browser-centric. The latter, in part, assume the browser may replace the traditional user interface with a similar one (using a desktop, icons, etc.), though based on hypertext. Of course, the appearance of new operating systems is not accidental. A number of factors force the whole industry to think in this direction:

- the possibility (or even necessity) of lightweight solutions vs. the heavyweight solutions of traditional operating systems and applications;
- the deprecated legacy of the old graphical interface vs. the hypertext interface;
- the growing significance of hypertext itself.

But can a browser-centric operating system replace a traditional one completely? Or are we observing the rise of alternative lightweight systems that will exist in parallel with traditional heavyweight ones? Will the interface change in the coming years? Is the file deprecated as a unit of information storage? These and other questions are considered below.


There are several factors that create demand for lightweight solutions:

1. Lightweight environment. Sometimes we really need only standard tools for viewing and creating quite simple content.
2. Lightweight functions, or SOA (service-oriented architecture) in action. Sometimes we use only a few functions out of thousands. It is good to have a choice between lightweight functions (remote or local services) and heavyweight applications.
3. Transfer of heavyweight functions to remote servers or the cloud.
4. The impossibility of creating heavyweight solutions for all possible use cases.

Of course, the choice between lightweight and heavyweight functions did not appear today, or even yesterday. This choice is closely linked with the choice between loose and tight coupling, with the nuance of choosing between local and remote components (added by the Web), and between explicit and implicit location of processing (added by clouds).

The choice between a lightweight and a heavyweight solution is not simple and depends on circumstances. Usually any product is a combination of both. In the real world, tight coupling is preferred when efficiency is achieved by configuring dependencies between components in advance, or when composition of components is quite difficult. Thus, any car should be assembled before it is sold to a driver. Loose coupling is preferred when we need flexibility and replaceability of components. Thus, some parts of a car can be replaced; otherwise we would have to replace the whole car because of a small failure.

This choice is not simple in the computer environment either. That is precisely why thin clients cannot replace thick ones completely, Internet applications cannot replace OS-specific ones, and cloud computing cannot replace local computing. They are alternatives, which are usually combined as necessary. And any future operating system has to combine both lightweight and heavyweight approaches.


The graphical interface does not drive the evolution of operating systems anymore. The recent touch-sensing revolution enhanced the way of interacting with the interface, but not the interface itself. However, many companies and developers still try to improve the interface, which is due to the following factors:

1. Convergence of graphical interfaces. Almost any platform (including mobile ones) is uniform now: desktops, icons, windows, etc.
2. Growing significance of the hypertext interface. Most information today is represented as hypertext, so its interface is sometimes more important than that of the local environment.
3. Innate problems of the graphical interface, which follow from Goedel's incompleteness theorems. That is, a user deals either with an understandable interface with few functions, or with a more sophisticated interface (with a greater number of functions) that requires a significant learning curve.
4. Lesser flexibility of the graphical interface compared with the command-line interface (which allows reuse of commands and parameters, e.g. by shell scripting).
5. Forced (for a user) awareness of low-level aspects of the operating system (files, services, applications, etc.).

The tasks of an interface are not only to represent information in human-readable form, but also to order and filter this information. Of course, complete ordering of information is impossible (see Goedel's incompleteness theorems). Therefore, we have either ordered hierarchies in a GUI (where we cannot access all information, because it is hidden in different branches of the hierarchy), or the unordered interface of the command line (where all information is accessible from the top level, but you read a manual more often). The situation is similar on the Web, where ordered portals confront search engines. Another example of this duality is news sites, which either try to order a number of news items (but then some will be hidden) or show all of them (but then a user may get lost).

Can this duality be resolved with design (which is sometimes considered the solution)? Yes, but only partially, because design is just yet another (visual) way of ordering information (and is influenced by the same incompleteness theorems). Can this duality be resolved with some kind of simple interface? Yes, but only by dropping some functions (and again, see the theorems).

Can this situation be resolved with video and interactive lessons? It is true that some things may be easily explained with video. However, a simple explanation fits only simple things. The more complex the information, the more complex the explanation required. Video may convey more information, but less meaning (that is, ordered information), compared with text. Video is also more resource-consuming, both for creation and for usage, and not only for the computer but for the user too: the same explanation (especially for abstract conceptions) may be quicker to read than to watch on video. Actually, these facts were known centuries ago. Thus, a teacher gives "video" (in-person) lessons in real time, but these lessons cover only the basics, whereas full understanding is available only through self-education with books.

Can the problem be resolved by reading a manual? Yes, but here we have a sort of dilemma. The more complex an interface is, the more complex its description is and the more entities are introduced. Finally, the tail starts to wag the dog: instead of merely being used, the interface dictates how we should use it. The problem is aggravated by the separation of interface and documentation, which are often written by different people altogether. Therefore, reading manuals is an adequate solution only when the manuals are adequate too (which is not always the case).

However, looking deeper, it is clear that learning an interface may become even more difficult because of the very conceptions and principles of the interface. Have you tried to look for some option when you are forced to browse through a list of all (50+) options? Have you tried to look for some option in a graphical interface when it is available in only one specific way (like menu A -> dialog B -> button 1 -> checkbox 2), which must be remembered for each option? Have you tried to remember 30-50 keyboard shortcuts for each application you use? Have you tried to use some icon-laden application by remembering all the icons?

You may see that even the "read the manual" principle is limited. The cause of interface problems lies rather in understanding the interface as a layer between applications and humans. In fact, the interface is a representation of semantics in human-readable form. That is, it is a sort of human-friendly semantics (like natural language) vs. the computer-friendly semantics of data and formats.


The influence of hypertext is difficult to overestimate, but easy too. Hypertext long ago ceased to be hypertext (that is, text linked with other information sources by hyperreferences). Dynamic aspects, real-time updates, and video transformed hypertext into a mix of design and programming elements. That is, it evolved to be more or less similar to a standard graphical interface. This is not bad in itself, but unfortunately it also means hypertext has drifted away from text features and from conveying semantics. In general, the peak of hypertext evolution has not been reached yet, because of a number of factors:

1. Hypertext (which is used for describing information) has not yet transformed into hyperinformation (which is information itself).
2. Hypertext allows transparent integration of local and remote information sources, which, in its turn, facilitates the development of lightweight solutions.
3. The necessity of broad usage of semantics. The Semantic Web started this trend; however, it is restricted to the Web, though it should include operating systems too.
4. The necessity of human-friendly semantics (which may be possible thanks to a simplified Semantic Web, which may facilitate the creation and usage of understandable semantics).
5. The necessity of content ordering/filtering/aggregating (which may be applied not only with search engines, but with semantics itself).
6. Browsing is a part of semantics. Really, each act of referring changes the context of the information environment. And each modification of context may change the scope of references (by filtering them).

Everything is semantics, if it has some meaning for a human being. Not only the interface, but browsing itself, any data, any application, etc. Today, applications express meaning in data and formats, but usually it is revealed to a user only partially (through the graphical interface), because it is considered too complex or not secure to disclose. However, there is a lot of user data that by definition is neither complex nor insecure for the user (because it was often created by him or her). As a result, we have a situation where meaning inside the computer is detached from meaning inside humans. Of course, such detachment is an evident problem, which is afterwards fixed by other means.

For example, imagine some application starts and shows a dialog, which proposes to install an update and asks whether you want to update the application or not. Of course, such attention to a user is appropriate and necessary; however, there are a lot of questions which are ignored by the conception of "detached meaning" in general, and of "simple interface" in particular:
- What is included in the update and why is each change required?
- What size is the update and where is it downloaded to (is there any space on disk)?
- Are there any options (do not download updates, download only security updates, etc.)?
- Is there any video lesson which will help to configure the update more finely?

Of course, such advanced update management can be implemented with already existing tools and environments. However, advanced management requires more developer resources and time, which, in its turn, affects the user, because fewer resources and less time will be spent on the updates themselves. Can this vicious circle be broken? Yes, but for that we need new principles of interface, operating systems, and applications.

All the questions about the update asked above are, in fact, the semantics of the update. Therefore we need to attribute meaning, in a more advanced way, to applications and their parts, data, interface, text, video, files and their parts, as well as to real things and concepts. Today, we already have one way of attributing meaning: hypertext. But it can be enhanced as follows: (1) include semantics (identification and association via relations), (2) allow fragmented and detached semantic wrappers for any piece of information, which in minimal form could consist of one identifier. As a result, we would have not hypertext but rather hyperinformation, because not only text but any information (applications, data, any file, etc.) can be used by it. You may imagine the whole operating system (or rather, semantic ecosystem) as one semantic canvas, where any element (like a fragment of video or an interface control) may be easily linked with any other.

What technology should be used for hyperinformation? The traditional Web is too oriented toward the graphical interface. The Semantic Web is too oriented toward meaning in the form of data and formats. Both technologies are inappropriate here, because it is impossible to create a personal application for every personal use case of every user, just as it is impossible to create personal data and formats for every case. That is precisely why we need a technology which would allow creating personal meaning, that is, a simplified Semantic Web.

The ideas behind the simplified Semantic Web are simple: (1) human-friendly identification, as a balance between precise computer identification (e.g. by hyperreferences) and ambiguous natural-language identification, (2) human-friendly representation of semantic relations through the legacy of hypertext, (3) a restricted set of semantic relations, to be human-friendly too, (4) semantic wrappers for any data or code. From these ideas we come to understand that it is the browser that can play the key role in future operating systems.


As the base tool for surfing the traditional Web, the browser can easily migrate to hyperinformation. In its turn, it can be easily adapted for lightweight solutions and semantics. This does not mean it should operate with any format or be able to represent any graph of the Semantic Web. However, the browser should work with:
- the "Web of Things" (or, more correctly, the "Web of Things and Conceptions"), that is, allow navigation between things, conceptions, and information resources about them;
- legacy data and applications;
- interface;
- Big Data.

For that, the future browser should incorporate the following features:

1. To be semantics-aware.

The browser should be able to refer to anything (unlike hypertext, which can refer only to information resources). The only alternative (which can refer to anything) now is natural language, which, unfortunately, is ambiguous (mostly because it uses non-unique identifiers). For example, "Springfield", as a city, can be found in different states of the USA. When you are using a map application, the software proposes to resolve this ambiguity manually. However, a search engine cannot resolve it fully, because sometimes it is not clear from text and context which exact Springfield is mentioned. The only solution is to have a unique identifier which can be used explicitly. For example, "Springfield, IL".

Imagine you have a document describing the economy of Springfield, IL. Today, the trendiest way to facilitate access to it is keywords (or tags), because, evidently, traditional text search is considered not quite fruitful. We can guess what such a set of tags could look like: "Springfield, Illinois, USA, economy, business, trade, employment". Their shortcomings are: (a) they are only the associations you can recall (and, apparently, a set of associations can never be complete and is subjective in itself), (b) some tags are a part of other tags, which makes them redundant, (c) the meaning of the document is blurred, judging by these tags: is it "economy"? Is it "type of economy"? Is it "trade in USA and Illinois"?, etc.

The only alternative to keywords is more precise identification as a part of the simplified Semantic Web. In our example, if you really want to make access to the document faster, you should use an aggregated and precise identification of the document. That is, such identification should (a) convey the concise meaning of the document, (b) use identifiers as precise as possible (not text). Of course, you may object, because usually information has several meanings. However, this is the case for all things in the Universe; cities have many meanings too, yet they should be identified precisely. Similarly, any information should be precisely identified (and not by a title, which often tends to be attractive and full of metaphors), whereas all possible associations (that is, additional meaning) should be calculable.

In our case, the document should have an identifier which consists of two precise identifiers: "Springfield, IL", which refers to the city, and "economy" (which is a precise identifier, though with a vague area of definition of what is included in this term), linked with each other (which means this document is namely about "economy OF Springfield, IL", not about "Springfield, IL and economy" or "an economy document written in Springfield, IL", etc.). The role of the browser is to be able to use precise identifiers (of both computer entities and real things and conceptions) and the relations between them.
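Such a compound identifier might be sketched as follows. This is a minimal illustration in Python; the classes, keys like "us/il/springfield", and the relation name are hypothetical, not any existing standard:

```python
# A compound identifier: two precise identifiers joined by an explicit
# relation, so "economy OF Springfield, IL" is distinguishable from the
# mere tag set {"economy", "Springfield, IL"}.
from dataclasses import dataclass

@dataclass(frozen=True)
class Identifier:
    """A precise identifier of a thing or conception."""
    key: str           # unique key, e.g. "us/il/springfield" (invented scheme)
    label: str         # human-friendly form, e.g. "Springfield, IL"

@dataclass(frozen=True)
class CompoundIdentifier:
    """An aspect linked to a subject by a named relation."""
    aspect: Identifier     # e.g. "economy"
    relation: str          # e.g. "of"
    subject: Identifier    # e.g. "Springfield, IL"

    def label(self) -> str:
        return f"{self.aspect.label} {self.relation} {self.subject.label}"

springfield = Identifier("us/il/springfield", "Springfield, IL")
economy = Identifier("concept/economy", "economy")
doc_id = CompoundIdentifier(economy, "of", springfield)

print(doc_id.label())   # economy of Springfield, IL
```

The point of the explicit relation field is that the machine can tell the subject from the aspect, which a flat tag set cannot express.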

That is, when you have surfed to such a document, you can surf further to other information resources, or to real things or conceptions (and not only). Why do we need that? In fact, traditional surfing only helps to research random links to documents which describe the entity concerned and related things and conceptions. The situation can be improved with an article in an encyclopedia. However, no encyclopedia can cover all things in the world, because the number of such things and conceptions is much greater than the capacity of encyclopedia creators. That is precisely why surfing through real things and conceptions is needed. It would help to understand the role of the thing concerned and its associations with other things, and to surf between them.

2. To work with legacy data and applications.

Today, browsers are not used instead of file managers, mostly because hypertext support for files is resource- and time-consuming. For example, for a directory you would need to create a hypertext which describes it, and then synchronize it each time you create/move/delete some files. To semantize files and directories, we can wrap them with meaning. A semantic wrapper should identify content, which would allow including legacy data (files and their parts, applications and their functions and controls) in the semantic ecosystem. Semantic wrapping is the key principle of hyperinformation, which considers any piece of information as a separate entity available for referring.

This opens new perspectives, sometimes in unexpected areas. Today, a lot of information is duplicated across the Internet. Unfortunately, there is no way for a machine to determine whether information is duplicated or not (unless file identifiers coincide). Semantic wrapping may change that. Unlike arbitrary file identifiers (for example, the document about Springfield, IL may be named "Spr.doc" or "Sprinfield_IL.doc", etc.), semantic identifiers are more unequivocal. Therefore, it would become possible to download only a wrapper from an untrusted source, and after that to download the information (which is referred to by this wrapper) from a trusted source. We can do the same today, but in an explicit way, whereas a semantic wrapper could save our time. For example, you read about some book in a blog and click a reference to its semantic wrapper, which automatically redirects you to an e-book download from your favorite shop (or proposes several sources).
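A semantic wrapper of this kind might look like the following sketch. All names and fields are hypothetical, and the content hash stands in for whatever integrity mechanism a real ecosystem would use:

```python
# A semantic wrapper for a legacy file: it carries a semantic identifier
# plus a content hash, so two copies with arbitrary file names
# ("Spr.doc", "Sprinfield_IL.doc") can be recognized as the same
# information, regardless of where each copy was downloaded from.
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class SemanticWrapper:
    semantic_id: str      # e.g. "economy of us/il/springfield" (invented)
    content_sha256: str   # hash of the wrapped bytes
    file_name: str        # arbitrary legacy name, never used for identity

def wrap(semantic_id: str, file_name: str, content: bytes) -> SemanticWrapper:
    return SemanticWrapper(semantic_id,
                           hashlib.sha256(content).hexdigest(),
                           file_name)

def same_information(a: SemanticWrapper, b: SemanticWrapper) -> bool:
    # Identity is decided by semantics and content, not by file name.
    return (a.semantic_id == b.semantic_id
            and a.content_sha256 == b.content_sha256)

doc = b"Report on the economy of Springfield, IL..."
w1 = wrap("economy of us/il/springfield", "Spr.doc", doc)
w2 = wrap("economy of us/il/springfield", "Sprinfield_IL.doc", doc)
print(same_information(w1, w2))  # True: duplicates despite different names
```

Because the wrapper is small and separable from the content, it could indeed travel through an untrusted channel while the content is fetched from a trusted one and verified against the hash.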

3. To interact with semantics.

Semantics is based on references to things and conceptions, whereas the role of the interface is interaction with such references. Similarly, natural language is interaction with ambiguous word-references to everything; the command-line interface is one with (mostly) unique references to computer resources; the graphical interface is one with visual references (which, unfortunately, are hardly formalized in symbols) to computer resources; at last, the interface of hypertext is one with unique references to information resources. Evidently, we need references to computer resources, information resources, and real things and conceptions alike. Therefore, the question is: can we combine the advantages of all these interfaces?

In fact, hypertext already combines some features of the graphical interface and natural language. Any hyperreference is covered with an ambiguous reference of natural language and uses a unique reference to an information resource. The conceptions of hyperinformation, the simplified Semantic Web, and the semantic ecosystem add new tints. Hyperinformation can uniquely refer to things, conceptions, and computer resources. But changes in the interface are needed to extend the usage of hypertext beyond the "point-and-click" metaphor.

The phrase "interface is semantics" now sounds not so abstract: really, instead of the graph of web sites, interaction with semantics is interaction with the graph of everything (things, conceptions, sites, data, functions, etc.). And if "point-and-click" is enough for the graph of sites, the graph of everything requires a quite different approach. This is made evident by the success of search engines. Really, why are they needed? Because mere "point-and-click" is not appropriate for navigating the graph of everything, and in the case of search engines we use natural language for navigating.

The next step is using hyperinformation for navigating the graph. Each navigating operation is a comparison of the graph of a query with the graph of everything, or, simply put, a comparison of a question with available answers. Such navigating is very similar to the command-line interface, except that (1) not commands are used but hyperinformation (that is, natural language with unique identifiers behind it), and (2) the result is hyperinformation too (that is, references to things, conceptions, data, UI controls, functions, etc.). In this way, as you may note, all types of interfaces merge into one, because hyperinformation may wrap elements of all types of interface. As a result, even the graphical interface may change (though the semantic interface may operate in parallel with it). For example, hyperinformation allows atomic UI operations involving any separate UI control, which means any UI control may be reused in navigation through the graph of everything.

Returning to the example with the update dialog from above, a semantic interface may change it in the following ways:
- the dialog may refer to the semantic wrapper of the update, which can be downloaded to your computer;
- the update may refer to its own components;
- the update may refer to application options or even UI controls;
- a precise reference to the update can help to retrieve references to it;
- the update itself is a graph of hyperinformation, which may be queried through the semantic interface.
That is, for example, to turn off updates you may query the update with the "Turn off" string (which becomes an identifier as soon as it matches the update graph), which may respond with the UI control that allows turning off downloads.
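This query-by-string behavior can be illustrated with a toy sketch; the update graph, its labels, and the matching rule are all invented for illustration:

```python
# An update represented as a tiny hyperinformation graph: labels are
# nodes, edges point from a node to its parts. A plain string query is
# matched against node labels, and a hit on an option node leads to the
# UI control that implements it.
update_graph = {
    "update 2.1": ["component: security fix", "component: new icons",
                   "size: 12 MB", "option: turn off downloads"],
    "option: turn off downloads": ["ui-control: downloads-checkbox"],
}

def all_nodes(graph: dict) -> set:
    """Every label mentioned in the graph, as source or as target."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    return nodes

def query(graph: dict, text: str) -> list:
    """Nodes whose label contains the query text (case-insensitive)."""
    needle = text.lower()
    return sorted(n for n in all_nodes(graph) if needle in n.lower())

hits = query(update_graph, "turn off")
print(hits)                            # ['option: turn off downloads']
print(update_graph.get(hits[0], []))   # ['ui-control: downloads-checkbox']
```

The string "Turn off" works as an ad-hoc identifier exactly as described: once it matches a node of the update graph, the answer is a reference (here, to a UI control), not a page of text.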

4. To tame Big Data (micro and macro data management).

The problem of big data was not created by the computer age; in fact, it is known to any living creature: thinking is the processing of big data (the representation of billions of atoms in a receptor) and aggregating it into a compact inner representation (a memory or a word which refers to something). And if you want to live with any big data, the answer is always aggregation. Aggregate; if the data is still big, aggregate the aggregations; repeat as many times as needed until the result is appropriate.
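The "aggregate the aggregations" loop can be sketched as follows; the fan-out of 10 and the labeling scheme are arbitrary illustrative choices:

```python
# Repeated aggregation: items are grouped into chunks of at most FANOUT,
# each chunk is replaced by one summary node, and the step repeats until
# the top level is small enough to be taken in at a glance.
FANOUT = 10

def aggregate_once(items: list) -> list:
    """Replace each run of up to FANOUT items with a single summary label."""
    return [f"group({items[i]}..{items[min(i + FANOUT, len(items)) - 1]})"
            for i in range(0, len(items), FANOUT)]

def aggregate(items: list, limit: int = FANOUT):
    """Aggregate repeatedly; return the top level and the number of rounds."""
    levels = 0
    while len(items) > limit:
        items = aggregate_once(items)
        levels += 1
    return items, levels

data = [f"doc{i}" for i in range(1000)]
top, levels = aggregate(data)
print(len(top), levels)  # 10 top-level groups after 2 rounds
```

A thousand flat items become ten groups of groups: exactly the repeat-until-appropriate recipe, applied twice.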

Aggregation is crucial not only for really big data, but also for local data: whereas Big Data is hard for computers to swallow, local big data is usually a problem for users, who get lost even in their own personal data and cannot find the necessary information. Can the problem of locally big data be resolved by search? Not quite, because search uses its own algorithm of aggregation, whereas a user needs his or her own ordering, which can be supported by self-aggregation and by the interface.

Really, why should any information be categorized each time it is copied to a new computer? Imagine you download a document about Springfield, then save it to disk in the "Documents/Geography/USA" folder. It seems you will find this document easily in the future. This is true, but only until the moment you have thousands of documents in this folder. That is, you need to create new folders subdivided by states or by cities, but then you may have problems with the folders themselves (if their quantity grows big enough). What's worse, in some cases you may not have time to categorize new information, and then the chances of finding "Downloads/2011/Springfld_2.doc" are even smaller.

Now imagine that the document is pre-categorized (which should be supported by the semantic ecosystem) or pre-aggregated (which should be done by the document creators) before it reaches your computer. The aggregation of the document can be expressed in the reference to "Springfield, IL". Categorization is derivable from the aggregation; that is, the topic has derivable relations to geography, Earth, North America, USA, Illinois, Springfield itself, etc. Now, if you have a predefined folder for geography or USA, the document will be placed there. But are folders needed at all? In fact, a folder is a constant categorization of information, but if we can place information automatically, then only the identification of information matters, whereas categorization is always derivable.
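Derivable categorization can be sketched with a small relation table; the "broader topic" chain below is illustrative, and a real semantic ecosystem would supply such relations:

```python
# Deriving a folder-like category path from one precise identifier:
# each topic points to its broader topic, so the full path is computed
# on demand rather than stored with the file.
BROADER = {
    "Springfield, IL": "Illinois",
    "Illinois": "USA",
    "USA": "North America",
    "North America": "Earth",
    "Earth": "Geography",
}

def category_path(topic: str) -> str:
    """Walk the broader-topic chain, most general topic first."""
    chain = [topic]
    while chain[-1] in BROADER:
        chain.append(BROADER[chain[-1]])
    return "/".join(reversed(chain))

print(category_path("Springfield, IL"))
# Geography/Earth/North America/USA/Illinois/Springfield, IL
```

The document carries only "Springfield, IL"; everything a folder hierarchy would hard-code is derived from the relations, which is why the folder itself becomes optional.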

But aggregation should be supported by the interface. This is where context comes into play. What is a context? You can consider it as the following versions of the folder conception:
- dynamic folder: a context should cover all information which matches the given topic (however, because of possible performance issues with fully dynamic contexts, the number of contexts can be restricted, as is done with modern desktops);
- hyperinformation folder: a context should cover not only files and subfolders, but any information wrapped with semantics (a part of a file, an application, a function, an interface control, a web site, a web page, etc.);
- symbiotic folder: changing a context may help to reach information, while reaching information, in its turn, may change the context (to find a document about Springfield, IL you go to the context of Illinois, but upon reaching it, the context can change to that of Springfield, IL);
- filtering folder: because information may belong to multiple contexts, a context can be finely tuned to reach the information you need;
- associative folder: a context may store information shared between contexts (for example, a document about Springfield, IL goes to the USA context, the Illinois one, the Springfield one, the trade one, etc.);
- semantic folder: a context allows only information which relates to its meaning (that is, if you have a context of USA, you cannot put a file about France into it).
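Several of these folder conceptions can be shown in one toy sketch of a dynamic, semantic context; the items and relations below are invented for illustration:

```python
# A "dynamic folder": a context is a topic, and its content is computed
# by matching each item's semantic topic against the context, directly
# or through the broader-topic chain, instead of being a fixed listing.
BROADER = {"Springfield, IL": "Illinois", "Illinois": "USA", "Paris": "France"}

ITEMS = {  # item name -> its semantic topic
    "springfield_economy.doc": "Springfield, IL",
    "chicago_notes.txt": "Illinois",
    "paris_guide.pdf": "Paris",
}

def topics_of(topic: str) -> set:
    """The topic itself plus every broader topic derivable from it."""
    seen = [topic]
    while seen[-1] in BROADER:
        seen.append(BROADER[seen[-1]])
    return set(seen)

def context(topic: str) -> list:
    """All items whose derived topics include the given context topic."""
    return sorted(name for name, t in ITEMS.items() if topic in topics_of(t))

print(context("USA"))     # ['chicago_notes.txt', 'springfield_economy.doc']
print(context("France"))  # ['paris_guide.pdf']
```

The USA context gathers the Springfield and Chicago items automatically (dynamic, associative), and the Paris item can never land in it (semantic), with no folder maintained by hand.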


So, what answers can we give to the questions from the beginning of this article? Can a browser-centric operating system replace a traditional one completely? Of course not, because an operating system is an interface between hardware and software, whereas a browser is a part of the interface between software and users. On the other hand, the browser may surpass the traditional or low-level user interface (files, graphical interface, etc.) with a semantic user interface, which may change the principles of human-computer interaction.

Do we observe the rise of alternative lightweight operating systems, which will exist in parallel with traditional heavyweight operating systems? Possibly. Lightweightness may be forced if we talk about devices oriented toward media content or Web surfing. However, in general, any lightweight solution tends to acquire heavyweight features in the long run. We can observe the same effect in many successful applications, which became successful thanks to lightweightness, but turned into heavyweight monsters as more and more features were requested.

Will the interface change in the coming years? Definitely yes, and the changes should concern not so much new visual features as a combination of visual and semantic ones, which would dramatically improve the learnability of the interface.

Is the file deprecated as a unit of information storage? Yes and no. Yes, it is deprecated as a low-level feature of operating systems, which should be represented as a high-level entity for users. But no, it is not deprecated as an atomic unit of information (there is no reason to replace one atomic unit with another).

Then which paradigm shift are we talking about? We are talking about a new semantic level of operating system architecture, which is placed over the traditional user level. The role of the browser, as a "run anywhere" tool, is to navigate through the semantic level and complement the traditional user interface. That is, not browser vs. operating system, but browser over operating system. Apparently, it would not be the browser as we are used to seeing it: surfing the Web would be only one of many functions, like identification of things and conceptions, working with semantics in general and context in particular, efficient management of local information, etc.

Thursday, August 25, 2011

Simplified Semantic Web

The battle goes on. In heads and souls. While the whole world thought Google could find everything, the very search engine was enhanced with social features, not least because machine search is not perfect. While social networks and Wikipedia show the potential of communities in the technology world, the Semantic Web (or "Web of data", which focuses on processing by intelligent agents) advances. The Web is constantly rolled over by waves of technical and social overconfidence. The conventional Web, with its personal pages, blogs, and social networks, is the asylum for social overconfidence. The Semantic Web is for the technical one.

Any overconfidence (bordering on blind faith) is a way to nowhere; truth is always in the middle ground. Actually, processing of information by machines and by humans (as well as the worlds of data and text) are inseparable parts of the whole. Look at any application. The application lifecycle starts from requirements (written in natural language), which turn into design and implementation (in the form of data, models, and code), to be incarnated in user interface and documentation (which, again, use natural language).

However, even so, computer- and human-oriented structures are separated into different layers of abstraction. Separation of layers is one of the main principles of development, which makes development more efficient but destroys links between different layers of information. Afterwards, these links are emulated by manual synchronization of the different components of applications and documentation. As a result, we see numerous desynchronization problems between requirements ("what a user expects"), code ("what a developer does"), product ("what a user gets"), etc.

There are even deeper causes of this separation. Natural language (like human thinking in general) usually generalizes information, to reduce its volume and the time required for communication. Computer processing, on the contrary, usually specifies everything as precisely as possible. The problem is that you cannot solve both tasks (generalization and specification) at the same time. In fact, this has been known since the publication of Goedel's incompleteness theorems, which declare, simply put, that a system cannot be both complete and consistent at the same time. The same concerns any information: it cannot be both ordered (by generalization) and complete (by specification) at the same time. Complete information includes a lot of signals which we receive about some thing, but it makes no sense in this form. As soon as we start to order this information, we have to throw away a part of it, and therefore we lose completeness.

Look now at data and natural language. Data formats define a set of rules which guarantee the consistency (integrity) of data, but nobody can ensure completeness (moreover, data formats usually do not aim to provide full data and set restricted frames for data in advance). Natural language is based on a restricted set of rules which use an almost infinite set of identifiers. This cannot safeguard consistency (in some cases, you can understand a grammatically incorrect sentence), but usually we expect completeness (though it is not guaranteed by natural language itself, it can be achieved with it).

Data formats give an advantage when they order homogeneous information with similar characteristics, which allows generalized classes and properties of things to be distinguished. However, any generalized characteristic is always subjective, because we generalize according to some criteria, or even arbitrarily, and thereby throw away information that is "unnecessary" from our point of view. For example, a database of countries may store the capital in a single field as a plain string, although any capital could be described by at least a whole book.

The first problem of data formats is borders. Any data format has to stop somewhere and draw borders (say, "we define capital as a string"), which makes machine usage more efficient, though we lose completeness. This is especially evident when formats are applied to heterogeneous information: in such cases they usually resort to so-called "custom fields", which use... natural language. Which proves that this problem really exists and that we cannot order information at the speed it is created.
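The "borders" problem can be sketched in a few lines. The record layout and field names below are purely illustrative assumptions: the structured fields are machine-queryable, while everything beyond the drawn border falls into a free-text field in natural language.

```python
# Illustrative sketch: a country record format draws its borders in
# advance, so anything beyond them goes into an unstructured escape hatch.

def make_record(name, capital, population, notes=""):
    """Build a record: fixed structured fields plus a free-text field."""
    return {"name": name, "capital": capital,
            "population": population, "notes": notes}

france = make_record(
    "France", "Paris", 67_000_000,
    notes="The capital alone could be described by at least a book.")

# The structured field is directly queryable by a machine...
assert france["capital"] == "Paris"
# ...while the completeness lives in the notes, as opaque natural language.
assert "book" in france["notes"]
```

The point of the sketch is that the format stays consistent precisely because it refuses completeness: the moment the information does not fit the fields, it leaks into prose.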

The second problem is that data formats are monolithic: you can work with a format only if you can interpret it completely. For example, to retrieve a country list from a geographic database, you must know the database format or have an application that knows it. This exemplifies coarse-grained information compatibility, as opposed to the fine-grained compatibility of natural language (if you don't know one word in a sentence, only that word needs interpretation).
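The coarse-grained vs. fine-grained contrast can be shown with a toy example (the record layout is an assumption for illustration): reading even one field of a packed binary record requires knowing the whole layout, while a sentence degrades gracefully when one word is unknown.

```python
import struct

# Coarse-grained: to read even one field of this packed record you must
# know its whole layout (the format string) in advance.
LAYOUT = "<10s10si"          # country, capital, population (illustrative)
blob = struct.pack(LAYOUT, b"France", b"Paris", 67)
country, capital, population = struct.unpack(LAYOUT, blob)
capital = capital.rstrip(b"\x00")   # even the padding is a layout detail

# Fine-grained: a sentence stays mostly interpretable even if one word
# is unknown; only that word needs extra interpretation.
sentence = "India is larger than Bhutan"
known_words = {"India", "is", "larger", "than"}
unknown = [w for w in sentence.split() if w not in known_words]

assert capital == b"Paris"
assert unknown == ["Bhutan"]
```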

Despite these shortcomings, however, formats are efficient at ordering information in a way natural language cannot match: natural language does not order information so tightly and is usually redundant and full of ambiguities.

The same concerns search. The assumption that machine search can find everything is dubious, to say the least, because search (which is consistent) works with text (which is not). Natural language abounds with generalizations, which ignore or omit part of the information. Can we resolve such ambiguities with some context? More likely not: natural language has questions precisely because context cannot help in all cases. There are many situations in which you cannot understand information completely, and therefore you ask questions. Because of these restrictions, modern search succeeds only with simple or popular queries, but fails as soon as we ask questions involving several linked entities ("Is India larger than Bhutan?").
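The failure mode can be made concrete with a toy example. The corpus and the area figures below are illustrative assumptions: keyword matching returns documents that mention the entities, not the comparison itself, while ordered data answers the question directly.

```python
# Toy illustration: why a multi-entity question defeats plain text search.

docs = [
    "India is a country in South Asia.",
    "Bhutan is a landlocked country in the Eastern Himalayas.",
]

def keyword_search(query, docs):
    """Return documents sharing at least one word with the query."""
    words = set(query.lower().split())
    return [d for d in docs if words & set(d.lower().rstrip(".").split())]

# Both documents match, but neither contains the comparison itself.
hits = keyword_search("is india larger than bhutan", docs)
assert len(hits) == 2
assert all("larger" not in d for d in hits)

# Ordered data links the two entities and answers directly
# (approximate areas in thousand sq km, for illustration only).
area = {"India": 3287, "Bhutan": 38}
assert area["India"] > area["Bhutan"]
```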

Can social search help here? Only partially, because it covers only a fraction of the infinite questions we can ask. And don't forget that, unlike data, answers in natural language are unordered: if you find an indirect answer, you have to reorder the information yourself, which takes additional time.

This duality of machine versus human processing of information will remain for years ahead. The cause is that machines are faster at processing but can process only in a consistent way (which is not available in all areas of knowledge), while humans, whose capabilities are restricted, are better at ordering information. The two approaches rather complement each other.

If so, why are the conventional Web and the Semantic Web still separated? The latter is built on the good idea of providing semantics, but why does it concern only data and machines? We need semantics, here and now. And who said humans cannot deal with it? That may be true of its machine-oriented forms, but not of natural language. Quite possibly even an experienced user cannot work with complex classifications, taxonomies, and ontologies, but any human can easily identify things and conceptions (otherwise he or she could not communicate in natural language).

There is an abyss between the worlds of text and data, and it is the result of machine-friendly features. We work with cryptic file names (or rather abbreviations); we must remember file paths (if a computer is similar to a library, why is there no librarian who hands me a book on request instead of forcing me to remember a long path to it?); any user interface is rather a human-friendly patch exposing application functions; and any question about its peculiarities is answered with "read the manual" (why should I read it if I need one function of this application once in a lifetime?).

Humans are underestimated. Information is supposedly created by humans and for humans, yet there are still no simple tools for meaning. The Semantic Web focuses on data (to be processed by intelligent agents) and complex formats (to be created by experts). But what about an ordinary user? Unfortunately, even developers consider the Semantic Web too complex and cumbersome (which, by the way, also makes it too expensive).

Stop it. We need simpler forms of semantics. We need a simplified Semantic Web: an approach that mediates between the conventional Web and the Semantic Web, making natural language more precise and data more human-friendly. It should be based on several simple ideas: rich human-friendly identification, human-friendly representation of semantics, and a restricted set of human-friendly semantic rules.

1. The idea of human-friendly identification is simple: we should go beyond natural language identifiers and make them precise and unambiguous. Identification is the base of semantics: it answers the question "What is it?". All the rest, like hierarchies, associations, taxonomies, and ontologies, are derivables (which only help to order information). You may not know how your vehicle is classified, but you certainly know what identifier it has. The fact that your vehicle is an SUV is derivable from its identifier.

Human-friendly identification supposes a simple and compact way of specifying value. For example, the conventional Semantic Web proposes to use URIs, so to distinguish the different meanings of the word "opera" we would use URIs like http://en.wikipedia.org/wiki/Opera and http://en.wikipedia.org/wiki/Opera_(web_browser). The shortcoming of this approach is that we depend on computer resources (site and path), whereas humans usually know nothing about the source of an identifier (do you know where your favorite encyclopedia defines "opera"?). Instead, human-friendly identification should provide uniqueness of meaning, not uniqueness within a computer system.

That is, instead of an arbitrary URI like http://en.wikipedia.org/wiki/Opera, we need identifiers that are close to natural language but more precise: for example, opera (the art form) vs. Opera (the browser). Any natural language identifier may be made as precise as needed, without duplication. But how will even such an identifier be routed to a specific location or server? That is the responsibility of a semantic cloud, which will route identifying requests, retrieve an identifier's derivables, and so on. The role of the computer is nevertheless important: it may hint which identifier is ambiguous and how it can be resolved, but the result should be human-friendly.
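One way such a semantic cloud might behave can be sketched as a toy resolver. The registry and the sense identifiers below are assumptions made up for illustration: senses are registered by meaning, not by site or path, and an ambiguous name yields candidates that a qualifier narrows down.

```python
# Hypothetical registry: identifiers are unique by meaning, not by URL.
SENSES = {
    "opera (art form)": {"is-a": "performing art"},
    "Opera (web browser)": {"is-a": "software"},
}

def resolve(name):
    """Return all registered senses whose identifier starts with the name."""
    return [s for s in SENSES if s.lower().startswith(name.lower())]

# A bare natural-language word is ambiguous: the cloud offers candidates.
assert len(resolve("opera")) == 2
# A human-friendly qualifier disambiguates, with no site or path involved.
assert resolve("opera (art") == ["opera (art form)"]
```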

2. The idea of human-friendly representation of semantic relations is simple too: let's merge hypertext and semantics. It is quite a natural alliance: hypertext as a convenient form of representation, marked up with semantics. This way, everyone may use and add meaning to text with zero reconfiguration (well, maybe not zero, but close to it).

3. A restricted set of rules is necessary because, unfortunately, there is no way to precisely draw borders between groups of meaning in a sentence automatically. For example, a human can understand the difference between "I like a movie about medieval Scotland" (one linked meaning) and "I like movies, medieval, and Scotland" (three separate ones). The simplest rule, then, may allow identifiers to be explicitly linked into such groups. An extended set of rules may include rules for generalizing, specifying, and so on.
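The simplest linking rule can be sketched with an assumed bracket syntax (the markup convention is purely illustrative, not part of any standard): explicit grouping distinguishes one linked meaning from several separate ones.

```python
import re

def parse_groups(marked_up):
    """Split '[a b] [c]' style markup into groups of linked identifiers."""
    return [g.split() for g in re.findall(r"\[([^\]]+)\]", marked_up)]

# One linked group of identifiers:
linked = parse_groups("I like [movie about medieval Scotland]")
# Three separate identifiers:
separate = parse_groups("I like [movies], [medieval], and [Scotland]")

assert linked == [["movie", "about", "medieval", "Scotland"]]
assert len(separate) == 3
```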

Moreover, a simplified Semantic Web may enhance even the existing conventional Web, which has the following restrictions: (1) it links information loosely (especially within a computer or a network), (2) it is rarely used for binary data (the same goes for the Semantic Web, which proposes converting data to its own format but has no simple way of semantifying already existing data), (3) it provides restricted means for integrating heterogeneous applications and data, (4) creating and publishing hypertext is not easy enough.

Of course, you may say there are no such problems: why can't you describe a binary file with hypertext? First, hypertext and binary files are separate entities, though a description makes no sense without the information it describes. Second, a hypertext editor and a Web server are not a must for every computer (especially the latter, which must be correctly configured and maintained). Third, links in hypertext differ when used inside and outside a Web server (namely because they depend on computer entities like sites and paths).

What can a simplified Semantic Web do? (1) binary files may have semantic wrappers, coupled somehow with the files themselves (maintaining integrity of meaning), (2) semantifying through intuitive identification is simpler than hypertext editing, (3) identification should cover all entities of both the real and the computer world, (4) as soon as identification is not bound to computer-oriented resources, publishing is facilitated too (because a fact may be represented as plain text with fragments of semantic markup).
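A semantic wrapper for a binary file might look like a small sidecar structure. The file content and identifiers below are illustrative assumptions; the digest shows one possible way to keep the wrapper coupled with the file it describes (integrity of meaning).

```python
import hashlib

def wrap(payload: bytes, identifiers):
    """Describe binary content with identifiers plus an integrity digest."""
    return {
        "sha256": hashlib.sha256(payload).hexdigest(),
        "identifiers": list(identifiers),
    }

video = b"\x00fake-binary-movie-data"   # stand-in for a real binary file
wrapper = wrap(video, ["Braveheart (1995 film)", "medieval Scotland"])

# The wrapper travels with the file and can be checked against it.
assert wrapper["sha256"] == hashlib.sha256(video).hexdigest()
assert "medieval Scotland" in wrapper["identifiers"]
```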

Of course, both a human-friendly and a computer-friendly Web remain necessary. But the bridge between them is inevitable too, one way or another.

Friday, June 10, 2011

Wednesday, January 26, 2011

Great blunders of modern computer reality

Human thinking is an inseparable blend of two processes: simplifying the countless details of reality and then detailing the generalizations so created. Two simple quotations capture the complexity of this balance well: "All models are wrong, but some are useful" (George Box) and "Make everything as simple as possible, but not simpler" (Albert Einstein). The very essence of intellect lies in this perpetual balancing between simplicity and precision.

The same applies to any technology: it can be simple enough and neglect details, or precise enough and complex. Every technology has some shortcomings (restrictions), so its evolution can be almost endless (limited only by people's desire and abilities). But progress cannot be constant: evolution (in technology as in Nature) consists of short bursts of change alternating with relatively long periods of stability. Change consumes limited resources, so at first it produces something like chaos (changes make the system temporarily "open"), which sooner or later gives way to equilibrium and stability again.

Today we live in a period of technological stability bordering on the expectation of change. There are many reasons for this expectation: some technologies do not work quite as we expect, some still do not work at all, and we turn a blind eye to the shortcomings of others even as our needs keep growing. But the main reason is that existing computer technologies carry a number of fundamental contradictions, whose solutions may suggest an alternative path of development. These are what we consider below.

* Computer search can find everything

This statement can be true only if we restrict "everything". Machine search works well with simple queries (one to three words, which search help itself recommends) or with queries that identify a subject almost uniquely. But as soon as queries become ambiguous or consist of many words, search runs into problems. No wonder: modern search works over text, while we always tend to generalize and use ambiguous words, because we do not always have time to detail every fact. Search cannot restore information lost this way, so in some cases it is doomed to fail in advance.

Generalization causes many of search's problems, but can page ranking solve them? Ranking is a generalization of page importance according to some algorithm, so naturally it fails in some cases: for a query like "drink", a search engine may answer "Cola" or "Pepsi" only because these drinks matter to the majority of users. Some researchers pin their hopes on personalization, which would compute importance from personal preferences. But preferences are generalizations too: you may love orange juice yet want apple juice right now.

Solution: Precise semantic search

Larry Page once described the "perfect search engine" as one that "understands exactly what you mean and gives you back exactly what you want". Ironically, search engines so far do everything possible not to match this description. They rely on sophisticated mathematical methods even where those methods can yield nothing (without explicit clarification, no algorithm can deduce what "it" refers to). That is precisely why search should be based on meaning, whose precision only people themselves can supply.

* Semantics restricted to experts, machines, formats, and technologies

Semantics has long been partially available to people involved in programming: in essence, the objects and actions of an application constitute the semantics of some domain. Users can reach this semantics only through the user interface, which links objects and actions to images or words of natural language. The Semantic Web made semantics still more available, though unfortunately only to experts and machines. Its standards are too complex for an ordinary user to grasp (experts themselves note that the Semantic Web has no convenient user-facing representation). Another problem is that the Semantic Web, in essence, only continues what binary formats and XML began: it represents data in a more universal format. But that concerns data, which is usually ordered. The world also holds plenty of unordered information; what do we do with it?

Individual technologies and formats try to represent semantics by their own means, which restricts it even further. EXIF, for instance, can carry additional "metadata" for an image, but the set of fields is always limited. A Semantic Web ontology can describe a particular domain well, but we cross domain borders easily (an ontology of music albums may include artist names, but who can guarantee we will never need a query for artists who lived in a certain city during a certain period?). Metaweb (now part of Google) offers automatic recognition of entities, each of which may have many different spellings. Yet their site states that they operate on little more than a dozen million such entities, while several billion people are alive today (each of them such an "entity"), and we use millions of geographic names, titles of works of art, and so on. Is this technology ready for billions of entities, and can it provide an algorithm that reliably tells coinciding names apart automatically? And do we need such an algorithm at all, if humans can do this easily and explicitly?

Solution: Affordable semantics

We need a form that makes semantics affordable to ordinary users. This form should link semantics to representations an ordinary user understands. Today such links are established mostly by applications, but we cannot create a separate application for every idea. Moreover, semantics should be not a set of restricted formats but rather a set of atomic formats linked to one another. In fact, every word of natural language is such an atomic format. That is exactly why you can know what a "hare" is without deep knowledge of zoology or physiology.
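The idea of atomic, mutually linked formats can be sketched as follows. The concept entries and the single "is-a" link type are illustrative assumptions: each concept is a tiny self-contained "format", and deeper knowledge is reached only by following links, the way "hare" is usable without zoology.

```python
# Hypothetical store of atomic concepts, one entry per meaning.
ATOMS = {
    "hare":   {"is-a": "animal"},
    "animal": {"is-a": "living thing"},
}

def chain(concept):
    """Follow 'is-a' links as far as the atomic definitions go."""
    out = [concept]
    while concept in ATOMS:
        concept = ATOMS[concept]["is-a"]
        out.append(concept)
    return out

# Knowing "hare" does not require the whole chain, but the chain is there.
assert chain("hare") == ["hare", "animal", "living thing"]
```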

* A false analogy for representing information

Usually the file system is compared to documents or books in an office (hence terms like file, folder, directory). But this analogy matches the internal mechanism of document management, where you must know which folder holds which information. The situation changes when you stand outside the system: you simply request information and receive it (just as the mind does not know which part of the brain stores what it recalls). Search corresponds to this "external" scheme, but it is imperfect and quickly redirects us back to the internal one, handing us URIs or file paths that we must operate with afterwards. On the Internet we use the same scheme: Web surfing is, in essence, mere file copying. The situation is even worse because every file identifier is just text: we cannot use it meaningfully, so we invent names with implicit conventions like "Braveheart (1995).avi", and where application identifiers cannot be extended we resort to extra marks of meaning, prefixes or suffixes such as "[Ru] Braveheart (1995)".

Solution: Identification

The role of file names and URIs is to identify information. But since this identification aims at uniqueness only within a particular computer entity (a directory or the domain system), and no semantic restrictions are imposed on it (you can put music into a "Video" folder), it becomes meaningless. Instead we need identification that not only coincides with the identification we use in real life but also extends it (in real life, for example, we simply cannot identify individual episodes of a film or individual parts of an application, which we usually describe with several sentences). The main goal of identification is to answer the question "What is it?" for any information, be it a web page or a file. Moreover, the results of identification must not be lost when information is copied to another computer; consequently, identification must produce results that any other computer can understand.

* Interface, applications, documentation, and communications live in parallel worlds

Interface is a part of semantics: it abstracts meaning into a form convenient for a user, a programmer, and so on. At the moment, however, the user interface is not semantics itself; rather, it is linked to application semantics as a separate layer. A graphical interface, being a thing in itself, causes no trouble only while the number of visual elements is limited (and we can still memorize them). As their number grows, problems appear: (a) it is impossible to express everything with pure graphics, so natural language is always called in to help; (b) it is impossible to work with a large number of graphical elements (this is solved by nesting, which hides parts of the application, so we struggle to discover its features); (c) it is impossible to reuse elements of a graphical interface. Simplicity of a graphical interface is usually achieved either by (1) reducing the number of application features or (2) automating some of them (where some "optimal" algorithm, by generalizing, again reduces the features available).

Documentation is a part of semantics too, which application creators often forget. Though it is not their fault: no way has yet been proposed for different parts of an application to use one and the same semantics.

The fashion for blogs and social networks is one of the main trends of Web 2.0. Yet although they were conceived for collaboration, more often they, like other means of communication, become things in themselves: we use them only to communicate, while their collaborative aspect goes almost untouched. If so, is the trend itself unique? Is there a fundamental difference between the Web, email, instant messaging, and blogs? In all cases information is exchanged, in different ways, between a sender and a recipient. Web: the recipient (a machine) sends a request, the sender (a machine) sends a response (though one could say the recipient is the user's browser session). Email: the sender (a mailbox) sends a message, the recipient (a mailbox) checks for updates. Instant messaging: the sender (an account) sends a message and the recipient (an account) receives it immediately. Blog: the sender (an account) publishes a message, the recipient (a browser session) reads it via a link. All these are merely different kinds of transport for information, whose internal peculiarities should concern us only insofar as they deliver information differently (like the telegraph, the telephone, and ordinary mail).
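The claim that these channels are just different transports for the same exchange can be sketched as a minimal uniform interface. The class and its two toy instances are assumptions for illustration only; the `pull` flag stands in for the push-vs-pull internals mentioned above.

```python
class Transport:
    """A minimal sketch: any channel is a queue between sender and recipient."""

    def __init__(self, name, pull):
        self.name = name
        self.pull = pull      # True: recipient checks for updates (email, blog)
        self.queue = []       # False: message is delivered at once (IM)

    def send(self, message):
        self.queue.append(message)

    def receive(self):
        return self.queue.pop(0) if self.queue else None

email = Transport("email", pull=True)
im = Transport("im", pull=False)

for t in (email, im):
    t.send("hello")

# Internals differ (push vs. pull), but the exchange itself is identical.
assert email.receive() == im.receive() == "hello"
```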

Solution: Semantics as the driving force

Semantics is the core of any information, but it is not enough without its representations and communication. The first aspect is what we traditionally call interface (which includes natural languages, binary formats, graphical user interfaces, Web pages, and programming interfaces alike); the second aspect is transport (which includes the hypertext protocol, email, blogs, social networks, and so on). As a whole, the mechanism could work like this: (1) you discover a bug in a program; (2) first of all, you identify the bug itself and the area it applies to; (3) the description of the bug may be imprecise, but the identification from step 2 lets other users match this bug exactly against other bugs in the same application area, or against the documentation for that area; (4) you can also send the bug description to other people, and it will be merged automatically into their personal space of facts.
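The four-step flow can be sketched in code. The identifier strings and helper functions below are hypothetical: the point is that a report is identified first, so later reports match by identifiers even when the free-text descriptions differ.

```python
# A toy "space of facts": bug reports carry identifiers plus free text.
reports = []

def report_bug(identifiers, description):
    """Step 2: identify the bug and its area before describing it."""
    reports.append({"ids": frozenset(identifiers), "text": description})

def related(identifiers):
    """Step 3: match reports by shared identifiers, not by wording."""
    ids = frozenset(identifiers)
    return [r for r in reports if r["ids"] & ids]

report_bug(["app:editor", "feature:save"], "Save button does nothing")
report_bug(["app:editor", "feature:save"], "Ctrl+S loses my changes")

# The descriptions differ, but identification matches them exactly.
assert len(related(["feature:save"])) == 2
assert related(["feature:print"]) == []
```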

It is worth noting, though, that the problems of semantics are in turn tightly bound to the interface. One of the key functions of an interface is restricting meaning (each new word in a sentence narrows the context of the information, just as a list in a graphical interface restricts the possible values). Closely related to restricting meaning is the notion of context (the part of information within which we act at a given moment). Personalization, search, and a file catalog are merely special cases of context. Using context can solve the problem of a multitude of interlinked facts that people working with semantics usually face. In such cases semantic composition can either replace an application or ease the problem of semantics on a page.

As we can see, most existing problems are connected with meaning one way or another. Tellingly, we still have the notions of programming and design, that is, of representing information in computer-oriented form (data) and processing it (code), and of representing information in human-oriented form (interface), but no notion of semantic composition, that is, of how information is organized in itself. From the standpoint of semantic composition, programming and design could delegate part of their functions to it. Indeed, any piece of information (a web page, say) is a mini-application that can answer input (requests or questions) with output (answers), much like the simplest database-access applications. That same web page may have superb graphic design and still be a dreadful semantic jumble in which you cannot find what you need.

It is hard not to notice that the problems described above are tightly interrelated:
- Semantic composition depends on precise search, which must work equivalently across different systems
- Precise search depends on identification, which allows the meaning of a query and of information to be compared exactly
- The effectiveness of identification depends on the use of context, that is, on the restrictions imposed by the interface
- Context depends on the use of semantic relations
- The use of semantic relations, in turn, must be affordable enough for an ordinary user, which implies semantic composition

Therefore no single solution (an efficient search engine, an entity system for identification, the Semantic Web, and so on) can solve all these problems at once. A general approach, as outlined above, is needed. Specific ways to implement it can be found in the article "How to solve the problems of modern search?".

Great blunders of modern computereality

Human thinking is unseparable mix of simplification of multiple details of reality and reverse detailing of created generalizations. Two simple quotations express complexity of this process very well: "All models are wrong, but some are useful" (George Box) and "Make everything as simple as possible, but not simpler" (Albert Einstein). The very essence of intellect is constant balancing between simplicity and precision.

Similar balance concerns any technology too: it can be either simple enough and ignore some details, or precise enough and be complex. Therefore, any technology could have shortcomings (restrictions), which make its evolution almost infinite (restricted only with aspiration and abilities of human beings). But progress cannot be constant: evolution (both in technology and Nature) is alternating short periods of changes and quite long periods of stability. Changes cost some resources, therefore they result in a sort of chaos (because system becomes open for changes for a while), but which sooner or later lead to balance and stability.

Nowadays, we live in a periof of technological stability which borders with expectations of changes. Such expectations invoked by several causes: some technologies do not work as we want, some technologies still cannot work, some shortcomings could be ignored at the moment but our needs are growing every day. But the main cause is existing computer technologies have several fundamental contradictions, which will be considered further.

* Computer search can find everything

This statement can be true, if we restrict "everything". Machine searching works quite well with simple queries (which can be expressed with a few words and which are recommended to be used by search help) or which uniquely identifies something. But a search has problems as soon as it concerns ambiguous or quite complex queries. It is no wonder, because contemporary search deals with text. Using natural language, we always generalize and use ambiguous words, because we have no enough time for detailing all facts. A search cannot restore all lost information, therefore in some case it just cannot succeed.

Generalizing creates many problems of searching, but can page ranking solve these problems? Page ranking is generalizing of page importance, according with some algorithm. Of course, it won't work in some cases: for example, "drink" query can return "Cola" and "Pepsi" only because these drinks are important for many users. Some researchers hope personalization will be able to help in importance calculation basing on personal preferences. But these preferences are generalizations too: you can like orange juice but want apple one at this moment.

Solution: Precise semantic search

Larry Page once described the "perfect search engine" as one, which "understands exactly what you mean and gives you back exactly what you want". Ironically, search engines make everything possible to not match this definition. They based on sophisticated mathematical methods, which generalize things (and generalization is opposed to precision) and which could give incorrect result (for example, in some cases, any algorithm cannot deduct what "it" means). Namely therefore, a search should be based on exact meaning, which precision can be made mostly only by humans.

* Semantics restricted with experts, machines, formats, technologies

Semantics is available for programming long ago: in the essence, objects and actions of any application is semantics of some domain. This semantics may be used by humans only with the help of user interface, which links objects and actions with images or words of natural language. Semantic Web made semantics even more available. Unfortunately, only for experts and machines. Its standards are too complex for ordinary users (and experts underlines that Semantic Web has no usable representation for users). Also, you may note that Semantic Web just continues what was started by binary formats and XML: it is just yet another data format, though universal for whom accepted it. But it concerns only ordered data. What we should do with unordered information?

Some technologies and formats try to represent semantics by own means, which restricts it even more. Thus, EXIF may contain additional "metadata" for a picture, but a set of them is always restricted. Ontology of Semantic Web can describe some domain, but we often violate borders of domain (for example, ontology of musical albums may imply artists, but who can guarantee we won't need a query of artists which lived in certain city, in given period of time?) Metaweb (which is already a part of Google) offers automatic detection of entities, which may have multiple spellings. However, their site declares that they operates with dozens millions of such entities. At the same time, there is several billions people (each is such "entity"), we use millions of geographical names, names of art pieces, etc, etc. Is this technology ready to deal with billions of entities? Is it ready to provide an algorithm, which would exactly recognize similar names automatically? And do we need such algorithm if humans can easily do it explicitly?

Solution: Affordable semantics

We need a form, which will make semantics affordable for ordinary users. This form should link semantics with representations, which can be understood by an user. Today, such links are established mainly by applications, but we cannot create a separate application for each our idea. Moreover, semantics should be not a set of restricted formats, but rather a set of atomic formats, which can be linked with each other. Similarly, each word of natural language is such atomic formats. Namely therefore, we can know what "hare" is without deep knowledge in zoology and physiology.

* Implicit conventions

Usually, file system is compared with documents or books in office (therefore we use such terms as a file/folder). However, this analogy corresponds to internal mechanism of document exchange, when you should know which folder has concerned information. But the situation is different if we outside of the system: then, you request information (book) and just receive it (similarly mind does not know which part of brain contains requested information). A search corresponds to this "external" mechanism, however, it is not perfect and quickly redirects us to the same "internal" mechanism by providing URI or file path. We use the same mechanism in Internet too: Web surfing is, in fact, a mere file copying.

But the situation is even worse because each file identifier is just text. Which means, we cannot use it meaningfully, therefore we create names with some implicit conventions like "Braveheart (1995).avi". The same story is with all identifiers which used in applications, which cannot be extended: usually, then we use some additional signs of meaning like prefixes or suffixes like "[Ru] Braveheart (1995)".

Solution: Explicit identification

File name and URI is needed to identify information. But because this identification aims to make information unique only inside certain computer entity (like folder or domain system), and because it has no semantic restrictions (you can copy an audio file in "Video" folder), it is meaningless. Instead, we need identification which will coincide with that we use in real life, but also, which will extend it (for example, in real life, we cannot identify episodes of a movie or a part of an application, which usually identified with several sentences). The main goal of identification: to answer to "What is it?" relating to any information (an entire file or just a part of a web page). And results of this identification should be copied together with information to be shared. Therefore, identification mechanism should be equivalent and available at any computer system.

* Interface, applications, documentation, and communication live in parallel worlds

Interface is a part of semantics: it abstracts meaning into a form convenient for a user, a developer, etc. But today the user interface is not part of semantics; rather, it is linked to application semantics through code. Additionally, it has its own shortcomings. For example, a graphical interface works well only while the number of visual elements is limited (so we can memorize them). As soon as their number grows, we face the following: (a) it is impossible to express everything with pure graphics, so natural language is used; (b) it is impossible to work with a huge number of visual elements, so nested elements are used, which hide some elements and make application features difficult to find; (c) it is impossible to reuse graphical interface elements. The simplicity of a graphical interface is usually achieved by (1) reducing application features, or (2) automatic handling (some "optimal" algorithm restricts available features by generalizing).

Documentation is a part of semantics too, which is sometimes forgotten by application creators. Though it is not their fault: there is no way to share semantics between different parts of an application.

Social networks are one of the main trends of Web 2.0. However, though they were created for collaboration, they are often used only for communication, while collaboration is ignored. If so, is this trend unique? Is there a big difference between the Web, email, instant messaging, and social networks? In all cases, we see different ways of exchanging information between a sender and a recipient. Web: a recipient (browser session) sends a request, a sender (computer) responds. Email: a sender (mailbox) transmits a message, and a recipient (mailbox) checks for updates. Instant messaging: a sender (account) transfers a message to a recipient (account). Blog: a sender (account) publishes a message, a recipient (browser session) reads it by a link. All these are just different transports for information, and their internal features should concern us only because they deliver information differently (as with the real-life telegraph, phone, or regular mail).

Solution: Semantic-driven system

Semantics is the core of any information, but it is not enough without representation and communication. The first aspect is usually called an interface (which includes natural languages, binary formats, graphical user interfaces, Web pages, programming interfaces, etc.). The second aspect is a transport (which includes the hypertext protocol, email, social networks, etc.). Semantics and its aspects work together, for example, as follows: (1) you detect an error in an application; (2) you identify the error and the area of occurrence (so from then on you work with precise identifiers); (3) the error description may not be precise enough, but precise identifiers allow other users to correlate the error with other errors in the same area of the application, or with documentation for that area; (4) you can send the error description to other people, and it will be automatically merged with their personal space of facts (this is possible because you share the same identifiers with equivalent semantic links).
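The four steps above can be sketched as data: an error report built from precise identifiers merges cleanly into another user's space of facts, because the identifiers match even when the free-text descriptions differ (all names here are illustrative, not from any real system):

```python
# An error report: free text plus precise identifiers of the affected area.
report = {
    "app": ("PhotoEditor", "2.3"),
    "area": ("dialog", "Export", "button", "Save"),
    "text": "Nothing happens when I click it.",
}

# Another user's personal space of facts, keyed by the same identifiers.
facts = {
    (("PhotoEditor", "2.3"), ("dialog", "Export", "button", "Save")): [
        {"text": "Save is greyed out on my machine."},
    ],
}

def merge(facts, report):
    """Merge a report by identifier equality, not by matching prose."""
    key = (report["app"], report["area"])
    facts.setdefault(key, []).append({"text": report["text"]})
    return facts

merge(facts, report)
key = (("PhotoEditor", "2.3"), ("dialog", "Export", "button", "Save"))
print(len(facts[key]))  # two differently worded reports under one identifier
```

No text analysis is needed: the two descriptions end up under the same key purely because both users identified the same application area.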

But semantics would not work by itself: it should be supported by an interface. One of the key interface features is meaning restriction (for example, a word restricts the information context, a combo box restricts possible values), which implies the conception of context (the part of information which affects the current situation). Personalization, search, and file folders are particular cases of context. And it is precisely context usage that can solve the problem of the sheer breadth of facts, which confronts everyone who works with semantics.

As you may see, most existing problems relate to semantics one way or another. Strangely enough, today there is no discipline for dealing with semantics in the way programming deals with information in computer-oriented form (data) and processes it (code), or design deals with its human-oriented form (interface). Moreover, programming and design could delegate a part of their own functions to establish the conception of semantic composing. Indeed, any information (for example, a web page) can be envisioned as a mini-application which responds to input (queries or questions) with output (responses), similar to a simple application for accessing a database. But an application usually requires more effort, whereas identifying and establishing semantic links inside a piece of information can be much quicker. The same web page can have gorgeous graphical design but represent a terrific semantic mess, which could be resolved with stricter semantic composing.

Eventually, semantic development can influence information technologies even more. Today we are at the very beginning of semantics usage: usually, different web pages rival each other over the possibility to represent some general information. This is the result of the nature of modern search engines, which can find only general information. The real development of semantics will start only when they are able to find unique information. And this can make semantic development even more serious, because it will force us to develop any piece of information more strictly. You can imagine it as the design, development, and testing of even a quite simple Web page as a black box, which should give (unique) answers to certain (unique) questions.

The problems described above are mutually dependent:
- Semantic composing depends on exact search, which should be equivalent in different systems
- Exact search depends on identification, which allows a request and content to be compared exactly
- Efficiency of identification depends on the usage of contexts, which restrict the interface
- Context depends on semantic relations
- Usage of semantic relations should be affordable for an ordinary user through semantic composing

This is precisely why no separate solution affecting only one problem area (such as an efficient search engine, an entity system, or the Semantic Web) can solve all problems. An integral solution is needed, one which addresses all the points described above. A more specific description of the solution can be found in "How to solve problems of contemporary search?".

Tuesday, January 11, 2011

Critique of Semantic Web and Semantic Programming

Occasional surfing led me to the Semantic Programming page, where I found a quite interesting "Introduction to Semantic Programming". A new paradigm in programming? That is always exciting, though not all proposals are finally embodied in a real-life language or approach. Have you heard of Victoria Livschitz's proposals: "The Next Move in Programming" and "Envisioning a New Language"? Five years have passed, but unfortunately there is still no metaphor language and no metaphor approach. At that time, I had my own "pattern" approach, which in essence was "semantic markup" of any programming language, linking code with meaning.

But after several years of lukewarm activity I understood that the accent of "semantic markup" should shift from programming to a broader context (I had such thoughts even before, but at first I thought the accent on programming was more necessary). Of course, this idea is not new and can be traced back to ancient philosophers, but only now do we have the means to implement it. The most severe problem for such a language in the past was paper, which allows mainly linear arrangement of information. The computer allows much more, though only in the 1990s did hypertext prove successful. But it was clear that hypertext does not fit well for representing data, which led to the rise of XML and the Semantic Web. Even they have shortcomings, which made me propose an alternative (to XML at that moment), "Universal Abstract Language", back in 1999, though honestly it was not mature enough. My attempts continued, and I maintained my page at GeoCities, but after it was closed I did not restore it (though I kept a local copy). Finally, in 2009 I started a new period of activity in the "On meaning..." blog. Now my personal paradigm shift concerns any information in general and programming in particular (as programming is only one kind of information representation).

But you may ask: why do we need a new paradigm if we already have the Semantic Web (and microformats)? The Semantic Web claims to be a "Web of data" (and you can see this in the development of XML standards). But data is great for machines; what about human beings? The Semantic Web can help a company create an application for data, but personal information is often too unique to create applications (or even an ontology) for each kind of it. Actually, this is precisely why the Semantic Web still has no good human-facing representation of its own data, which only underlines its machine-oriented nature. Moreover, it is not clear whether the usage of triples is expedient at all. The creators of the Semantic Web claim that triples can represent any sort of semantic data, which was proven in AI research long before the era of the Internet. In fact, a triple is not the only abstraction which may claim this; the same applies to the object-oriented or relational model, which may represent just any data. Why put a lot of effort into creating, for example, new triple-based databases if relational ones may represent any data too?

Why do we need to transform unary, binary, and other kinds of relations into triples? Only because the usual order of a natural language sentence is "subject-predicate-object"? But this is not always true in natural language either: we can have a phrase of one word, or quite complex sentences with more than three elements. And why "subject-predicate-object" specifically? Why does natural language order usually include at least a "subject" and a "predicate"? This order reflects the space-time dualism of our reality. And because any action is, in fact, an interaction, we usually also have an "object" (to which the action is applied). But actions can be complex, so we may have more than one "object": "I've sent a letter to my friend by email". Do Semantic Web triples reflect this space-time dualism? Obviously not. The example from the Semantic Web specification includes "sky" as "subject", "has color" as "predicate", and "blue" as "object". But "has color" is not an action; it is a verb which does not occur (during some time period). A verb is needed here only to make the sentence comply with language rules, therefore we use "has color" (in fact, in some languages, like Russian, the verb can be omitted entirely, so literally such a sentence would sound like "Sky blue"). And why is "blue" an "object"? The truth is, such a triple is just a sort of abstraction which can be applied to any entity, even ignoring its real-life features (similarly, in object-oriented programming everything is an "object": objects, actions, attributes, etc.). Theoretically, there are many abstractions which can represent any kind of data, so why is the triple preferred? Because of its semblance to natural language rules? Why then do triples ignore all the richness of natural language, and even natural language rules themselves ("blue" should be an "attribute", not an "object"; these are different things)?
In fact, the Semantic Web specification states that "Much of this meaning will be inaccessible to machine processing and is mentioned here only to emphasize that the formal semantics described in this document is not intended to provide a full analysis of 'meaning' in this broad sense; that would be a large research topic." [RDF Semantics, Patrick Hayes, Editor, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-mt-20040210] Why then were triples chosen without a full analysis of meaning?
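The mismatch can be made concrete. The sentence "I've sent a letter to my friend by email" is a single four-place relation, but to express it in triples it must be decomposed around an invented intermediate node (the standard workaround for n-ary relations in RDF). A sketch of that decomposition (the role and predicate names are illustrative, not from any real ontology):

```python
# One natural n-ary relation...
event = {"action": "send", "agent": "I", "object": "letter",
         "recipient": "my friend", "instrument": "email"}

# ...becomes several triples hung off an invented intermediate node.
def to_triples(event, node="_:event1"):
    triples = [(node, "rdf:type", event["action"])]
    for role, value in event.items():
        if role != "action":
            triples.append((node, role, value))
    return triples

for triple in to_triples(event):
    print(triple)
```

The information survives, but the sentence's structure does not: the single act of sending is now visible only as a cluster of five triples sharing a blank node, which a consumer must reassemble to recover the original relation.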

Moreover, the Semantic Web uses URIs for identification. And again, we see an inexpedient usage of abstractions. A URI uniquely identifies computer resources; why use it for real-life objects too? Computer identifiers differ from natural language identifiers in several respects: computer identifiers aim to identify things precisely, whereas natural language identifiers identify things as briefly as possible, with precision attained through composite identifiers. Finally, the Semantic Web does not accentuate abstracting (generalizing and specifying), which at the very least looks strange.

The shortcomings of the Semantic Web are shown to prove there is a point in having an alternative way of handling meaning. But the next question is: can semantic programming propose such an alternative? To answer this question we will examine "Introduction to Semantic Programming", starting from the motivation indicated in it:
* Currently programs are thought of as text we could write with pen and paper.
* Programs are complex graphs we should manage with computers.
* Currently programs manage data in the local memory of one computer.
* All data should be managed as if it's persistent, remote data.
* Currently we're tempted to think about ever larger buckets of data.
* We should think of data as a graph distributed across many systems.
* Currently we mostly think of programs and data separately.
* Programs and data are just different parts of the same distributed graph.
* Currently we're tempted to build one ontology for the world.
* We should always expect to have to handle multiple ontologies.
Of course, all these points are reasonable, but do they relate only to programming? All or almost all of the motivation items apply not only to programs but to any information. Therefore, we should stop thinking separately not only about programs and data, but also about information and programs. Which means, for example, that requirements, use cases, help, and other information should be parts of the distributed graph too.

To understand what semantic programming is, we need to understand what meaning is. The introduction gives the answer that "Meaning arises from the relationships between things and the behaviours of things". The correctness of this answer depends on the meaning of the words it consists of. For example, what is a "thing"? Is it only a real-world thing, or are conceptions and ideas included? In fact, there is the real space-time and many layers of abstract space-times, which refer to other space-times (including the real one); so a "thing" of the real space-time is a thing, but any "thing" of other space-times is always a reference. Also, the introduction mentions "agents" similarly to the Semantic Web, but isn't it evident that semantics is needed for real-life human beings too (though later "agents" is applied to humans as well, but only abstractly)? Finally, there is no explicit statement that a "relationship" is a reference too.

Further, the article considers meaning as a set of more specific postulates:
  • A 'reference' has no atomic meaning in and of itself, but has meaning only in so far as an agent is able to manipulate, act upon or in some other way 'understand' the reference.
  • An agent understands the meaning of a reference if it is able to use it effectively.
  • The meaning of a given reference can depend on the context in which it's being used.
  • An agent can come to understand the meaning of a given reference in a given context in one of three ways:
    • Axiomatically, or in other words by having the meaning of the reference 'hard wired' for a given context.
    • Deductively, through the reference's relationships to other known references within the given context (e.g. if the agent knows that 'X is the same as Y' and already understand the meaning of 'Y' then it can derive the meaning of 'X')
    • Behaviourally, through the reference's behaviour in the given context (if we can probe the reference's referent and get responses back then we learn something new about the meaning of the reference)

This definition is not full (though it seems the author does not pretend to give a full definition and calls such a notion "naive") and has some inexactitudes behind it:
- Axiomatic understanding is, in fact, identification: the quickest way of understanding, by mere comparison of reference equality.
- Not all references indicate an exact meaning: "bird" can refer to a generalized representation of a bird, to the entire class of birds, etc. This is the main cause of ambiguity in meaning, therefore the notions of generalizing/specifying and similarity (as partial equality) should be part of any (even naive) semantic theory.
- Any sentence (like those of natural language) is a set of references, which represents a composite reference (or a graph of semantic relations).
- There is no definition of a context. In fact, a context is itself a composite reference (or a graph of semantic relations). And any agent is, in fact, a set of contexts.

You may notice that at least the third point creates a contradiction between how meaning is handled in natural language and in programming. Programming prefers unambiguous identifiers (like GUIDs), whereas humans prefer composite identifiers consisting of ambiguous identifiers (words). But, in fact, the purpose of constructing composite identifiers is uniqueness too, though not a hard "global uniqueness" but rather a balance between uniqueness and brevity. In the author's terminology: the more coinciding context two agents have, the fewer references need to be provided for communicating meaning. Or in other words: even a part of a composite identifier has meaning by itself. A URI is an example of a middle ground between a GUID and a composite identifier, because it consists of identifiers which may or may not have meaning, but which as a whole is globally unique. What should be preferred? At the very least, it is clear that applications are created for humans, so we cannot avoid the topic of composite identifiers. On the other hand, nothing prevents us from having a translation between GUIDs and composite identifiers.
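The last point can be sketched directly: a system can keep a stable GUID for machines while exposing a composite, human-readable identifier, and translate between the two on demand. The registry below is a toy illustration of such a two-way mapping, not a real API:

```python
import uuid

class IdentifierRegistry:
    """Toy two-way mapping between machine GUIDs and composite names."""
    def __init__(self):
        self.by_guid = {}
        self.by_name = {}

    def register(self, *name_parts):
        """Assign a fresh GUID to a composite human-readable identifier."""
        guid = str(uuid.uuid4())
        self.by_guid[guid] = name_parts
        self.by_name[name_parts] = guid
        return guid

reg = IdentifierRegistry()
guid = reg.register("movie", "Braveheart", "1995")

# Machines compare the opaque GUID; humans see the composite identifier.
print(reg.by_guid[guid])                                      # ('movie', 'Braveheart', '1995')
print(reg.by_name[("movie", "Braveheart", "1995")] == guid)   # True
```

Each part of the composite name ("movie", "Braveheart", "1995") carries meaning on its own, while the GUID carries none; the registry simply lets each audience use the form suited to it.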

Further, the introduction talks about specific traits of semantic programming, but we will stop here, because it is not quite clear whether it is worth raising the problem of integration between programming and information. On this depends how we see semantic programming: either it is yet another paradigm in programming proper, or it should be part of one semantic infrastructure with a broader vision:
- it should be a bridge between natural language and machine-oriented, domain-optimized formats/languages/protocols;
- we should use not only formats/ontologies which encompass entire domains, but also atomic formats, which can be restricted to even one word; this implies the usage of underrated identification (which is easier for humans to understand than even the simple language of advanced search);
- semantics should be available not only to experts and machines, but also to the general audience (which implies it should be accessible enough and supported by an interface usable by an ordinary user);
- semantics can be applied to any information (a web page, programming code or data, a binary file, etc.);
- semantics should be supported in such a way that it provides testability of semantics for any information (that is, the information on a web page can be tested with some questions, and a search engine or programming code should be able to find this information with equivalent queries in the corresponding languages).

Can semantic programming satisfy this broader vision? It seems not, but possibly it never intended to. Can the Semantic Web satisfy it? Unfortunately, no. For that, it would have to shift from "intelligent agents" to "intelligent humans", which is so far not envisaged even in theory. The problem of semantics concerns not only everything which starts with the word "semantic", but also the problems of search, which is still efficient only for simple and straightforward queries. But even here, we still see only quite humble attempts to implement parts of the broader approach.

Monday, January 10, 2011

Is perfect search engine possible?

Did you ever wonder why a search retrieves millions of pages even for a quite simple query? The answer is generalization. We always generalize things, for many reasons: we have no time for details, we assume that others know what we know, we cannot be more precise simply because we are unable to, etc. A search cannot avoid generalizations, therefore it cannot magically "understand exactly what you mean and give you back exactly what you want" (as Larry Page once described the "perfect search engine"). This is not the search engine's fault; its result is just a guess, based on complex mathematical methods for finding words in a database of billions of pages. We cannot avoid generalizations, and a search cannot avoid them either. Even if computers become many times more powerful than they are today, they won't be able to understand exactly what we mean and want, because generalization always loses a part of information, which cannot be recovered.

In essence, generalization is simplification. No surprise that search engines try to avoid this problem by forcing users to make queries simpler and shorter. This follows the beginning of the famous quotation but not its ending: "Make everything as simple as possible, but not simpler". Only reaching a goal can determine whether something was made simpler than necessary. The goal of search is to receive an answer to any question. But how can we do that if we are advised to simplify queries? It is far from reality, where we ask questions as complex as necessary. Such advice resembles the situation where you ask someone in quite general words, he or she answers in general words too, and each of you means something specific, which may coincide with the other's specifics or contradict them completely. The bottom line: you have some chance of getting a correct answer, but it will always be only a chance.

But what can be done about this situation? We need to be able to make information more precise (including questions like search queries). The solution should be both complex and simple simultaneously. In fact, it can be based on several simple ideas:

1. Anything has meaning.
2. Identified once, mean anywhere.
3. Generalize, specify, and combine information as necessary.

These ideas imply many things: (a) existing formats like email give meaning only to specific parts of information (subject, recipient, body), but there is meaning inside the text of an email body too; (b) identification is simpler than specification, but it can matter more; for example, you know that a bird is a bird even without deep knowledge of ornithology, so "bird" as a name gives you more information than specifics like "it has one head and two wings", which are not always precise, can have exceptions, and carry other implications; (c) anything should be a part of one knowledge space, versus the current situation where information is scattered across emails, information systems, files, etc.; (d) reusing information is no less important a concept than reusing code in programming.
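Idea (a) — meaning inside the body text, not only in fixed fields — can be illustrated by annotating a span of free text with an explicit identifier, so an exact query matches by identity regardless of wording (the annotation scheme here is purely illustrative):

```python
# An email body with one inline annotation: the span "the film"
# is explicitly identified, instead of leaving the reader to guess.
body = "Last night we watched the film you recommended."
annotations = [
    {"span": (22, 30), "id": ("movie", "Braveheart", "1995")},
]

def find(body, annotations, query_id):
    """Exact search: compare identifiers, not words."""
    return [body[start:end]
            for (start, end), ident in
            ((a["span"], a["id"]) for a in annotations)
            if ident == query_id]

# The query never mentions "the film", yet it finds the span exactly.
print(find(body, annotations, ("movie", "Braveheart", "1995")))
```

A word search for "Braveheart" would miss this email entirely; identification once, inside the body, makes the meaning findable anywhere the annotated text is copied.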

More details can be found at http://on-meaning.blogspot.com/2010/12/how-to-solve-problems-of-contemporary.html.