Description

Extracting factual information from news articles and other text sources is an important application of natural language processing (NLP) technology. Information extraction (IE) systems have been developed for government, corporate, and personal information needs, such as extracting facts about terrorism, plane crashes, vehicle launches, management succession, joint ventures, corporate acquisitions, job postings, rental ads, and web pages. Information extraction technology has improved dramatically in the last several years, but the accuracy of IE systems is still relatively low. A major problem faced by IE systems is that natural language texts often mix factual and non-factual information, so a system may extract an opinion or a speculation as though it were a fact.

Subjective language expresses an opinion, a judgement, or an assessment that is not certain. It is extremely common in news stories and web pages, which are exactly the text sources that information extraction applications use most often. Subjectivity analysis is also potentially useful for many other NLP applications, such as answering questions from multiple perspectives, filtering inflammatory messages from listservs and email, text categorization, and summarization. The resources developed for subjectivity analysis will be made freely available so that other researchers can experiment with subjectivity in NLP applications.

The goals of our research are twofold: to use subjectivity analysis to create more accurate information extraction systems, and to improve the state of the art in subjectivity analysis by using information extraction techniques.

This project is funded by the National Science Foundation.