Knowledge Representation for the Web

Not all of the work in the Center is in the area of hardware agents. We are also interested in the behavior of software agents that carry out tasks in abstract spaces such as the World-Wide Web. A major issue emerging in this area is the format of “metadata,” that is, data about the content of web pages and other sources of information and computation. Such metadata must be expressed in formal languages, and it will be useless unless standard languages are developed that many different agents can process with reasonable efficiency. Drew McDermott and his students are focusing on the issues that arise in this area.
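
As a concrete, purely hypothetical illustration, the sketch below shows what such a metadata record might look like for a single web page, and how a software agent could read it. The vocabulary (page, topic, author, and so on) and the Python representation are invented for this example and are not a proposed standard.

```python
# A hypothetical metadata record for a web page, and a minimal agent
# that reads it.  The element names are invented purely for
# illustration; they are not part of any standard.
import xml.etree.ElementTree as ET

METADATA = """
<page url="http://www.example.org/reports/ai.html">
  <title>Annual Report: Artificial Intelligence</title>
  <topic>knowledge representation</topic>
  <topic>software agents</topic>
  <author>J. Doe</author>
  <last-modified>1999-06-30</last-modified>
</page>
"""

def read_metadata(xml_text):
    """Parse a metadata record into an ordinary Python dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "url": root.get("url"),
        "title": root.findtext("title"),
        "topics": [t.text for t in root.findall("topic")],
        "author": root.findtext("author"),
    }

if __name__ == "__main__":
    record = read_metadata(METADATA)
    # An agent deciding whether the page is relevant to its task:
    if "knowledge representation" in record["topics"]:
        print("Relevant page:", record["url"])
```

The point of the example is only that the record is machine-readable: any agent that knows the vocabulary can extract the same facts without interpreting the page's prose.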

A classic trade-off in representation systems is between expressivity and tractability: the more that can be said in a representation system, the harder it is to figure out what a particular expression in the system is saying. Hence there is pressure to specialize representation systems to particular domains. For example, the paper industry is likely to develop a notation for talking about attributes of paper, measurements of paper volume, ordering procedures for batches of paper, and so forth, while the real-estate industry develops notations for properties of houses, bidding systems, negotiation procedures, and so forth. Standards are now being hammered out to ensure that these notations share at least the same syntactic framework; the dominant syntax is likely to be XML, the eXtensible Markup Language. But work on semantics is still in its infancy. The need for coherent semantics is likely to become more urgent for two reasons: there may well be legal consequences of providing unclear metadata, and there will be serious interoperability problems as agents that speak different languages try to communicate. The main thrust of McDermott’s group is on translation techniques that address the second of these looming problems, but developing these techniques inevitably involves clarifying fundamental semantic issues.
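
The sketch below illustrates the interoperability problem and one naive way a translator might bridge it. The two vocabularies (a paper-supplier's order notation and a generic purchasing notation) and the mapping between them are entirely invented for this example; McDermott's group's actual translation techniques are aimed at something more general than this kind of hand-written, element-by-element mapping.

```python
# Two hypothetical XML vocabularies that share syntax but not
# vocabulary, plus a hand-written translator between them.  Everything
# here is invented for illustration; a real translator must also handle
# units, defaults, and terms with no direct counterpart.
import xml.etree.ElementTree as ET

PAPER_ORDER = """
<paper-order supplier="Acme Paper Co.">
  <grade>20lb bond</grade>
  <reams>500</reams>
  <delivery-date>1999-09-01</delivery-date>
</paper-order>
"""

def to_generic_purchase(paper_xml):
    """Translate a paper-industry order into a generic purchasing notation.

    Note the semantic judgment calls a translator must make: 'reams'
    becomes a quantity with an explicit unit, and 'grade' becomes a
    free-text item description.
    """
    src = ET.fromstring(paper_xml)
    dst = ET.Element("purchase")
    ET.SubElement(dst, "vendor").text = src.get("supplier")
    ET.SubElement(dst, "item").text = "paper, " + src.findtext("grade")
    qty = ET.SubElement(dst, "quantity", unit="ream")
    qty.text = src.findtext("reams")
    ET.SubElement(dst, "needed-by").text = src.findtext("delivery-date")
    return ET.tostring(dst, encoding="unicode")

if __name__ == "__main__":
    print(to_generic_purchase(PAPER_ORDER))
```

Even in this toy case, the hard part is not the syntax, which XML settles, but deciding what the terms of one notation mean in the other; that is exactly the semantic clarification the translation work requires.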