Access Restriction Open

Author Macherey, Klaus ♦ Och, Franz Josef ♦ Ney, Hermann
Source CiteSeerX
Content type Text
File Format PDF
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science
Subject Keyword German Train Timetable Information System ♦ Reusability Occurs ♦ Philips In-house TABA Corpus ♦ Stochastic Grammar ♦ NLU Component ♦ Statistical Machine Translation ♦ Manual Effort ♦ Past Year ♦ Text Corpus ♦ Different Alignment Model ♦ Application Scenario ♦ Sole Usage ♦ Telephone-based Machine Inquiry System ♦ Target Sentence ♦ Large Part ♦ Conceptual Annotated Corpus ♦ Rule-based Method ♦ Natural Language Understanding ♦ Automatic Speech Recognizer ♦ New Domain ♦ Automatic Dialogue System ♦ Source Sentence ♦ Dialogue Manager ♦ Formal-language Target Sentence ♦ Investigated Approach ♦ System Consist
Description In European Conf. on Speech Communication and Technology
Over the past years, automatic dialogue systems and telephone-based machine inquiry systems have received increasing attention. In addition to an automatic speech recognizer and a dialogue manager, such systems include a natural language understanding (NLU) component. Some of the most investigated approaches to NLU are rule-based methods such as stochastic grammars, which are often written manually. However, relying solely on rule-based methods can prove inflexible, and the problem of reusability arises: when the application scenario is extended or the application's domain itself is changed, a large part of the rule set often must be rewritten. Techniques that reduce the manual effort of building an NLU component for a new domain are therefore desirable. In this paper we investigate an approach to NLU that is derived from the field of statistical machine translation. Starting from a conceptually annotated corpus, we describe the problem of NLU as a translation from a source sentence to a formal-language target sentence. In doing so, we focus mainly on the quality of different alignment models between source and target sentences. Even though the use of grammars cannot be avoided entirely in NLU systems, our goal is to reduce their use and to learn the dependencies between words and their meaning automatically. Experiments were performed on the Philips in-house TABA corpus, a text corpus in the domain of a German train timetable information system.
Educational Role Student ♦ Teacher
Age Range above 22 years
Educational Use Research
Education Level UG and PG ♦ Career/Technical Study
Learning Resource Type Article
Publisher Date 2001-01-01