The need for interactive dialogue-based software
Computers can store very large quantities of information. But how can we access that information? The challenge is to find a system that anticipates your needs, taking into account:
- your level of knowledge, including what you have already learnt from the system
- your linguistic ability: when you search on the Internet, the language of what you find is sometimes inappropriate, or you may not have the level of understanding needed to read it.
One way to address these issues is simply to look into the database itself. There are two problems with doing that:
- Information in databases is not usually presented in a way that is easy to understand. It's usually a lot of facts and figures in a whole range of fields, and you might need special knowledge to be able to look at this information and put it together to make sense of it.
- If you are seeking specific information, you will see all the information the database has on that thing, instead of just what you want to know. For example, when you conduct an Internet search, even though your topic might be quite refined you will still get lots of irrelevant information. You'll have to wade through a lot of information to find what you're after.
What is needed is a way of asking computers for particular information, and having the computer then extract that information from the database and present it in a way that is easy to understand.
Interactive and dialogue-based software is needed to select the relevant information, tailor it to our needs, and then automatically generate documents presenting that information.
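The idea can be pictured with a small sketch: select only the fields the user asked for and phrase them as prose. The records, field names, and `describe` function below are invented for illustration; a real system would draw on a full database and far richer generation machinery.

```python
# Hypothetical database: a few records with named fields.
RECORDS = {
    "echidna": {"class": "mammal", "diet": "ants and termites", "habitat": "Australia"},
    "porcupine": {"class": "mammal", "diet": "bark and roots", "habitat": "the Americas"},
}

def describe(name, fields):
    """Select only the requested fields and render them as a sentence."""
    record = RECORDS[name]
    facts = [f"its {field} is {record[field]}" for field in fields if field in record]
    return f"The {name}: " + "; ".join(facts) + "."

print(describe("echidna", ["diet", "habitat"]))
# The echidna: its diet is ants and termites; its habitat is Australia.
```

Note how the user sees only what was asked for, rather than every field the database holds on the topic.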
- Predict the hardware requirements to do this.
- What are the issues in the relationship between the hardware and software?
Natural language generation
Natural language generation (NLG) techniques already exist that can take information from a database and dynamically produce and deliver documents to the user. Traditional NLG techniques are limited, but they can pick out particular information you ask for and automatically create a document showing you that information. Natural language generation systems have been built to:
- tailor the document, that is, give you some basic background information if you're new to the topic, or skip that if you already know a bit about it
- take into account user knowledge
...the problem with... all encyclopaedias is that the description you get of Beethoven is the same whoever it is that's reading it. So, if you are an expert in classical music or if you are a complete novice on classical music, it's still the same description. (Robert Dale)
- take into account previous dialogue, for example what the system showed you on a previous page
- generate the document in the language you need.
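The tailoring idea in the list above can be sketched in a few lines. The facts about Beethoven and the single-flag user model are stand-ins of my own; a real NLG system would maintain a much richer model of what the reader already knows.

```python
# Hypothetical underlying facts about one topic.
FACTS = {
    "subject": "Beethoven",
    "summary": "a German composer and pianist",
    "detail": "a central figure in the transition from the Classical to the Romantic era",
}

def tailor(facts, user_knows_topic):
    """Skip basic background for experts; include it for novices."""
    if user_knows_topic:
        return f"{facts['subject']} was {facts['detail']}."
    return f"{facts['subject']} was {facts['summary']}, {facts['detail']}."

print(tailor(FACTS, user_knows_topic=False))  # novice: background included
print(tailor(FACTS, user_knows_topic=True))   # expert: background skipped
```

The same stored facts yield two different descriptions, which is exactly the encyclopaedia problem Robert Dale describes.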
What we needed was to extend the functionality of NLG systems so that they could compare different items in the database, generate hypertext pages, and take into account what was shown on a previous page.
A limitation of traditional NLG systems is their reliance on data that is immediately amenable to use for language purposes. This was one of the major issues we wanted to tackle: without addressing it, traditional NLG techniques are much less flexible and useful than would be ideal.
The Power project challenge
The problem the Power project tackled was how to make a system that could take information out of a database and dynamically generate text presenting that information in a much more interactive way. This meant designing NLG software that went way beyond what the existing software could do.
Robert Dale explains:
The idea is that you have some underlying source of data, which might be numeric or symbolic but generally is not actually text; so it might be some database of some kind. So you've got an underlying source of data and you want to have technology that can take that data and present it to people in language, whether that be spoken or written. Just one benefit, amongst many, is that if your data is suitably well structured and abstract then you could consider having a machine produce descriptions of that data in different languages. So you could have an English description, a French description and so on. So that's what text generation technology is about; it's about being able to flexibly construct new text that has never been seen before, by a machine, on the basis of underlying symbolic data.
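The multilingual point in this quote can be pictured with a minimal template-based sketch. The templates, object name, and fact fields here are invented, and real NLG systems realise each language's grammar properly rather than, as here, leaving the object's name untranslated.

```python
# One symbolic fact, two language-specific templates (illustrative only).
TEMPLATES = {
    "en": "{name} was made in {place} in {year}.",
    "fr": "{name} a été fabriqué à {place} en {year}.",
}

fact = {"name": "The steam engine", "place": "Birmingham", "year": 1785}

def realise(fact, language):
    """Render the same underlying symbolic data in the chosen language."""
    return TEMPLATES[language].format(**fact)

print(realise(fact, "en"))  # The steam engine was made in Birmingham in 1785.
print(realise(fact, "fr"))
```

The key property is that the text is constructed fresh from symbolic data, not retrieved as a stored document.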
An earlier project carried out with the Scottish Museum in Edinburgh had built a system called ILEX. This was designed to write labels for museum objects; but not just the kind of basic labels that you currently see on museum exhibits. Robert Dale continues:
If you wanted the labels to be different depending on who read them, for example their age or where they came from, then you could do that using text generation technology. And if you wanted the labels to somehow be linked in terms of the history of objects that the visitor had seen or the objects they might see, then again the machine could do this in a much more flexible way than the human curator might be able to do.
These ideas were the starting point for the Power project.
Predict the goals of the project.
What effect will the following factors have on the finished program:
3. development process
Maria Milosavljevic was a student who joined us in '95 who was very interested in this area, one that Cécile and I had both worked in for a long time, says Robert. And so we started looking for a project in that general area, and the thing we came up with at the time was the idea of using that kind of technology to build descriptions of objects; in particular, where you compare those objects with other objects. The whole thing that drove this idea was that, if you have, say, a hundred objects that you know about and you wanted to compare each one with each other one, then you've got nearly ten thousand different possible comparisons. That's a lot of descriptions, and you couldn't conceivably write these by hand if you had a lot of objects to compare; comparing each object with every other object would be just too onerous a task to take on for an individual.
However, our idea was this: if the machine knows about all the objects then perhaps it could actually produce comparisons of these objects. You could just say to the machine, well I'm interested in how (a) this thing is different from (b) that thing, I'm interested in how (a) this thing is different from (x) some other thing, and in each case the machine can automatically come up with some comparative description. Maria's PhD work was focussed on that topic, and as a consequence we ended up with a demonstration system, PEBA, that was an attempt to provide what was effectively a virtual encyclopaedia, if you like, that presented information in a dynamically changing way.
That had come out of some thinking we'd been doing about these technologies and how you might use them. Around about that time, I had read a review somewhere of an encyclopaedia, where the reviewer had said the problem with this encyclopaedia, as with all encyclopaedias, is that the description you get of Beethoven is the same whoever it is that's reading it. So, if you are an expert in classical music or if you are a complete novice on classical music, it's still the same description. And so we thought: well, you know maybe five years, ten years down the road we might have encyclopaedias which are much more dynamic, that would change depending on what you might know. Maria's work really was driven by those sorts of ideas, and so then we developed the PEBA system, which described animals in an encyclopaedic context. And one of the things this system can do is compare animals; and so you could say, "OK, how is an echidna different from a porcupine?" and the system would try and give you a characterisation of the difference, perhaps taking into account where you come from, for example.
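Both points in this account can be sketched briefly: the near-ten-thousand figure comes from n * (n - 1) ordered comparisons among n objects, and a comparison can be derived mechanically from stored facts. The attribute data below is invented and far simpler than PEBA's actual knowledge base.

```python
# Hypothetical attribute data for two animals.
ANIMALS = {
    "echidna":   {"class": "monotreme", "covering": "spines", "home": "Australia"},
    "porcupine": {"class": "rodent",    "covering": "quills", "home": "the Americas"},
}

def comparison_count(n):
    """Ordered comparisons among n objects: each one with every other one."""
    return n * (n - 1)

def compare(a, b):
    """Generate a comparative description from attributes that differ."""
    diffs = [f"its {attr} is {ANIMALS[a][attr]} rather than {ANIMALS[b][attr]}"
             for attr in ANIMALS[a] if ANIMALS[a][attr] != ANIMALS[b][attr]]
    return f"Compared with the {b}, the {a} differs: " + "; ".join(diffs) + "."

print(comparison_count(100))            # 9900, "nearly ten thousand"
print(compare("echidna", "porcupine"))
```

Writing 9,900 comparisons by hand is infeasible; deriving each one from the knowledge base is routine.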
The Power project set out to take the ideas from ILEX and PEBA-II, and develop those ideas further. Visit the project's web site.
So, in summary, what were the goals of the Power project?
The overall goal of the Power project was to design a system that could take real data and automatically generate documents tailored to the user's needs.
To do this the Power research team decided to work with real data from an existing database. Therefore they would be designing a system that would take into consideration the problems of an existing database rather than an artificially constructed database developed in a laboratory. To do that the research team had to confront the issue of how to get data from an existing database and make it available for a hypertext system without having to re-enter all the data by hand.
They approached the Powerhouse Museum, Sydney, with a view to applying the software to a database of objects in the museum's collection. This meant taking knowledge contained in the Powerhouse Museum's database of objects, and developing an abstract representation of that knowledge about individual objects, their properties, and the relationships among them. It meant developing a system that could automatically generate hypertext descriptions of individual objects, and comparisons of pairs of objects, from this underlying representation.
We knew roughly where we wanted to go, says Robert Dale. We had a rough idea of the kinds of functionalities that we wanted to have.
We focused in on the objects, how to describe them, how to compare them. And we wanted, in particular, to use the real data to do that and we wanted to present interesting descriptions. Those were our kind of top level requirements, somewhat vague, but they were enough to guide us in what we were doing.
In summary the goals of the project were:
- to work out ways to dynamically generate documents that vary to suit the user's needs
- to design and develop an architecture to dynamically generate online hypertext documents
- to work out ways of dynamically generating documents in different languages
- to find a way of automatically converting information from existing databases into a form that could be utilised by the natural language software.
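The last of these goals, converting records from an existing database into a form the language software can use, can be pictured as mapping flat rows to attribute triples. The column names and example row below are invented for illustration, not taken from the Powerhouse Museum's actual schema.

```python
# A hypothetical raw row from an existing database, plus its column names.
ROW = ("OBJ-1042", "Locomotive No. 1", "1854", "steam locomotive")
COLUMNS = ("id", "title", "date", "category")

def to_knowledge(row, columns):
    """Turn a flat database row into (object, attribute, value) triples."""
    obj = dict(zip(columns, row))
    return [(obj["id"], attr, value)
            for attr, value in obj.items() if attr != "id"]

for triple in to_knowledge(ROW, COLUMNS):
    print(triple)
# ('OBJ-1042', 'title', 'Locomotive No. 1'), and so on
```

Once the data is in this abstract form, the same generation machinery can describe or compare any object without the data being re-entered by hand.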
Based on the information presented so far, explain what needs to be done for each of the following stages of the development process:
1. defining and understanding the problem
2. planning and designing the software solution
3. implementing the software solution
4. testing and evaluating the software solution
5. maintaining the software solution.