Evaluation

Examinations are formidable even to the best prepared, for the greatest fool may ask more than the wisest man can answer.
-Charles Caleb Colton

Observing a computer model reading a story might appear to be compelling evidence that the overall theory it instantiates is correct. Unfortunately, this is a false intuition. One problem is that the model may not be an accurate implementation of the theory being developed. In this case, the chapters describing the theory and the chapter giving the details of the implementation have shown that the ISAAC model is a close implementation of my theory of creative reading. Underdeveloped areas are clearly identified, while areas of theoretical importance are described and their design decisions are defended. But simply having an "accurate" implementation does not permit one to conclude that the theory is correct. One still needs to evaluate the performance of the model as it executes the behavior it is designed to accomplish.

Simply observing a running program is a subjective measure: it is possible to believe that the program is performing well, but there is no objective proof of this. In fact, it is dangerous to rely on this method of evaluation; the ELIZA program ([#!ai:weizenbaum1!#]) convinced many people of its "intelligence," which actually resulted from rather simple pattern-matching techniques. A more objective method of evaluation than observation of performance is needed. The next section establishes an empirical method of determining the model's baseline performance. Then, Section 8.2 uses this model evaluation to draw some conclusions about the validity of the theory itself. Taken together, these two methods of evaluating the research show the overall range of both theory and model.

Kenneth Moorman
11/4/1997