applications vs. underlying level
Comments for the Eagles evaluation group:
>A question from George Doddington:
>Several presenters have argued for moving evaluation from the technical
>(lower) level toward the application (user) level. If evaluation is
>viewed as a means of making progress, this implies that these presenters
>believe that it is more important (or easier?) to make progress at the
>application level than on the underlying problems. Is this a fair
>representation of the presenters' perspective?
I have no idea what the presenters' perspective(s) were on this point, but
here is mine:
It seems to me that George's question frames an unnecessary and potentially
confusing dichotomy. I think we all agree that if we can solve the
"underlying problems" then we'll be able to apply those solutions to a wide
variety of "applications". Our real difficulty is that we don't know how
to articulate what the "underlying problems" *are*. For example, in the
area of spoken language dialog systems the question of how to measure
whether or not a system is really "understanding" a user is still open. A
general measure of "understanding" quality has never been developed.
Everyone agrees that this is critical, but there is wide disagreement about
what to measure and when. This is not because picking a measure is hard,
but rather because we don't really know what we mean by "understanding".
If we can agree on what we mean by "understanding" then we'll be able to
agree on a measure for it. One good way to get clearer on what we mean by
"understanding" is to make and test definitions in a variety of application
contexts. This approach has the advantage of yielding both
intermediate-term concrete results (as in ATIS) and an increment in
long-term comprehension of the overall problem.
Getting application-level evaluation results is not necessarily at odds
with making important progress on the underlying issues.
Thanks for listening,