Access Restriction Subscribed

Author Conlon, Sumali J. ♦ Simmons, Lakisha L.
Source ACM Digital Library
Content type Text
Publisher Association for Computing Machinery (ACM)
File Format PDF
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Computer programming, programs & data
Subject Keyword EDGAR ♦ Information extraction ♦ Sublanguage analysis ♦ Business intelligence
Abstract CAINES (Content Analysis and INformation Extraction System) employs a semantic-based information extraction (IE) methodology, developed through a design science approach, to extract unstructured text from the Web. Our system was knowledge-engineered and tested on an active business database by experts who use the database regularly to perform their job functions. We believe that by heavily involving business experts, we are able to advance our thinking about IS research. CAINES extracts information to meet three objectives that our experts deemed important: (1) understand which current market conditions affected the growth of certain balance sheets; (2) summarize management's discussion of potential risks and uncertainties; and (3) identify significant financial activities, including mergers, acquisitions, and new business segments. These objectives were developed on the advice of financial experts who regularly analyze financial reports. A total of 21 online business reports from the EDGAR database, each averaging about 100 pages, were used in this study. Extraction rules for the financial reports were created from the financial experts' opinions. Using CAINES, one can extract information about global and domestic market conditions, the impacts of those conditions, and the business outlook. User testing of CAINES resulted in recall of 85.91%, precision of 87.16%, and an F-measure of 86.46%. Extraction with CAINES was also faster than manual extraction. Users agreed that CAINES quickly and easily extracts unstructured information from financial reports in the EDGAR database. This study highlights the significance of creating a semantic-based IE system that addresses practical business issues and solves a real business problem using the knowledge of business experts.
Description Affiliation: University of Mississippi, Oxford, Mississippi, USA (Conlon, Sumali J.) || Belmont University, Nashville, Tennessee, USA (Simmons, Lakisha L.)
Age Range 18 to 22 years ♦ above 22 years
Educational Use Research
Education Level UG and PG
Learning Resource Type Article
Publisher Date 2007-02-28
Publisher Place New York
Journal ACM SIGMIS Database: the DATABASE for Advances in Information Systems (DATB)
Volume Number 44
Issue Number 3
Page Count 15
Starting Page 34
Ending Page 48