The Semantic Web

The traditional web as we know it may not be around in its current form for too much longer. With the invention and evolution of semantically-based processing algorithms and artificial intelligence, we may soon be entering an era in which computers intelligently process and deliver personalised information to us, instead of merely parsing and providing documents.

The traditional Web architecture that is predominantly in use today relies on the storage and retrieval of documents. A client-side Web browser requests a document from a server, which may make dynamic updates to that document as the client is viewing it, but the key medium of interchange remains a document. The computer knows nothing about the meaning of the data in each document – it simply parses the documents, identifying keywords and tags, and outputs the results in the specified human-readable format. This traditional view of the Web is known as the Syntactic Web.

Search engines in the Syntactic Web, which includes what is commonly called Web 2.0, perform searches on these documents by locating search words in each document and returning results according to relevance. A typical search engine ranks results based on how many search words were matched and the weight of each matched keyword. Search terms are often weighted by how rarely they appear across the entire document set: the more common the term, the less it counts. For example, a search for ‘Java Network Programming Book’ would assign far more weight to the words ‘Java’ and ‘Network’ than to the word ‘Book’, which is a relatively common word.
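The weighting idea above can be illustrated with a minimal sketch of inverse document frequency, one common way to make rare terms count for more. The four-document collection and the function names are invented for illustration; real search engines combine many more signals.

```python
import math

# Hypothetical four-document collection; the texts are invented for illustration.
documents = [
    "java network programming book",
    "cooking book for beginners",
    "gardening book with photos",
    "history book about rome",
]


def inverse_document_frequency(term, docs):
    """Return log(N / df), so terms found in fewer documents score higher."""
    containing = sum(1 for doc in docs if term in doc.split())
    if containing == 0:
        return 0.0
    return math.log(len(docs) / containing)


def score(query, doc, docs):
    """Sum the weights of the query terms that appear in the document."""
    words = set(doc.split())
    return sum(
        inverse_document_frequency(term, docs)
        for term in query.lower().split()
        if term in words
    )


query = "Java Network Programming Book"
weights = {term: inverse_document_frequency(term, documents)
           for term in query.lower().split()}
ranked = sorted(documents, key=lambda doc: score(query, doc, documents),
                reverse=True)

print(weights)
print(ranked[0])
```

In this toy collection ‘book’ appears in every document, so it contributes nothing to the ranking, while ‘java’ and ‘network’ appear in only one document and dominate the score.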

This kind of approach has its problems. One is that search results depend heavily on the exact terms used, so a search for ‘Archos graphics driver’ might miss, or rank as low priority, a highly relevant webpage that contains only the terms ‘Archos video driver’. The Semantic Web aims to reduce or eliminate such problems by allowing computers to understand complex human requests and provide the most relevant related data from sources all across the Internet.

In the Semantic Web, computers can find meaning in the data in documents by consulting a kind of ontological dictionary of key terms. Using artificial intelligence algorithms, they can then find, classify, and combine related data, creating ‘Webs’ of semantically related data.
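As a rough sketch of the dictionary idea, the snippet below maps different surface words to a shared concept before comparing a query with a page. The two-entry vocabulary and the ‘display’ concept label are assumptions made up for this example; a real ontology would be far larger and richer.

```python
# Hypothetical mini-vocabulary: both words are assumed to name the same concept.
CONCEPTS = {
    "graphics": "display",
    "video": "display",
}


def normalise(text):
    """Map each word to its concept label if known, else keep the lowercase word."""
    return {CONCEPTS.get(word, word) for word in text.lower().split()}


def matches(query, page):
    """True when every normalised query term also appears in the normalised page."""
    return normalise(query) <= normalise(page)


query = "Archos graphics driver"
page = "Archos video driver"

# Plain keyword matching compares the literal words only.
keyword_match = set(query.lower().split()) <= set(page.lower().split())
# Concept matching compares the words after the dictionary lookup.
concept_match = matches(query, page)

print(keyword_match, concept_match)
```

Literal keyword matching fails here because ‘graphics’ never appears on the page, whereas the concept lookup treats ‘graphics’ and ‘video’ as the same thing and finds the page.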

This process of ‘automated reasoning’ can be achieved by using a knowledge representation system in which the computer applies a set of rules to make inferences about the data it encounters. An example of the kind of rule that could be applied in such a system is a data classification rule such as ‘A hex-head bolt is a type of machine bolt’. By structuring data in this way, it is hoped that computers can construct webs of related information, perhaps spanning hundreds of databases, allowing human users to have computers perform complex tasks for them (such as searching for podiatry specialists within a 25 km radius of a given location) with a minimum of effort.
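A minimal sketch of this kind of inference follows. Only the first rule comes from the example above; the other two rules and the function names are hypothetical, added so that there is a chain to reason over.

```python
# "Is a type of" rules. Only the first is from the text; the rest are
# hypothetical additions for illustration.
IS_A = {
    "hex-head bolt": "machine bolt",
    "machine bolt": "bolt",
    "bolt": "fastener",
}


def ancestors(term, is_a):
    """Follow 'is a type of' links upward and list every broader category."""
    found = []
    seen = {term}
    while term in is_a:
        term = is_a[term]
        if term in seen:  # stop if the rules accidentally form a cycle
            break
        seen.add(term)
        found.append(term)
    return found


def is_kind_of(term, category, is_a):
    """Infer whether `term` belongs to `category`, directly or indirectly."""
    return category in ancestors(term, is_a)


print(ancestors("hex-head bolt", IS_A))
print(is_kind_of("hex-head bolt", "fastener", IS_A))
```

No rule states outright that a hex-head bolt is a fastener; the program derives it by chaining the rules, which is the simplest form of the inference described above.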

Some of the possible applications of the Semantic Web include data integration, in which related data stored in different locations and in different applications across the Internet are aggregated and integrated into a seamless whole. For example, a user might log into his online calendar to check whether anything is scheduled for a particular date. Using the same application, he could look up a past date and view photos uploaded to Facebook on that date, or even payments made from his bank account on that date.

Another possible application is resource discovery and classification: the ability to provide extremely accurate search engines within specific domains, such as a medical literature database.
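The calendar scenario can be sketched as grouping records from separate sources by a shared key, here the date. All of the records, field names, and the `integrate` function below are invented for illustration; real sources would need to agree on a shared vocabulary before they could be joined this simply.

```python
from datetime import date

# Hypothetical records from three unrelated sources.
calendar = [{"date": date(2012, 5, 4), "event": "Dentist"}]
photos = [
    {"date": date(2012, 5, 4), "file": "beach.jpg"},
    {"date": date(2012, 5, 5), "file": "dog.jpg"},
]
payments = [{"date": date(2012, 5, 4), "amount": 42.50}]


def integrate(**sources):
    """Group the records of every source by their shared 'date' field."""
    merged = {}
    for name, records in sources.items():
        for record in records:
            by_source = merged.setdefault(record["date"], {})
            by_source.setdefault(name, []).append(record)
    return merged


view = integrate(calendar=calendar, photos=photos, payments=payments)
day = view[date(2012, 5, 4)]

for source, records in day.items():
    print(source, records)
```

Looking up a single date then returns the calendar entry, the photo, and the payment together, even though each came from a different source.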

There are, however, critics of the practical functionality and likely success of the Semantic Web. One such critic, Luciano Floridi of the University of Hertfordshire, argues that, philosophically, the use of protocols and ontologies to describe relationships between data does not constitute ‘semantics’ in the real sense of the word – that is, it does not give the AI an ability to discover meaning in the data or allow it to process data intelligently the way a human does.

Critics also suggest that the ontologies required to support the Semantic Web are far too complex to be implemented realistically, given the almost infinite levels of abstraction at which humans can conceptualise data. For example, a person’s opinion of a restaurant is influenced not only by the types of food it offers, but also by the value for money, the location, the demeanour and personality of the staff, the décor, and an endless list of other abstractions about the restaurant as a system. Critics suggest that this kind of complexity cannot feasibly be catalogued with current technical solutions such as RDF and XML.

Article by Tom Sprudzans


photo credit: alles-schlumpf via photopin cc
