Thumbnail
Access Restriction
Subscribed

Author Lin, J.C.-W. ♦ Wensheng Gan ♦ Tzung-Pei Hong
Source IEEE Xplore Digital Library
Content type Text
Publisher Institute of Electrical and Electronics Engineers, Inc. (IEEE)
File Format PDF
Copyright Year ©2013
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science ♦ Social sciences ♦ Commerce, communications & transportation ♦ Natural sciences & mathematics ♦ Physics ♦ Electricity & electronics ♦ Modern physics ♦ Earth sciences ♦ Geology, hydrology & meteorology ♦ Technology ♦ Medicine & health ♦ Engineering & allied operations ♦ Applied physics ♦ Other branches of engineering
Subject Keyword Sequential analysis ♦ Data mining ♦ Database systems ♦ Pattern recognition ♦ Tree data structures ♦ Algorithm design and analysis ♦ data mining ♦ Sequence deletion ♦ pre-large concept ♦ dynamic databases ♦ FUSP-tree structure ♦ data mining ♦ Sequence deletion ♦ pre-large concept ♦ dynamic databases ♦ FUSP-tree structure
Abstract Among the discovered knowledge, sequential-pattern mining is used to discover the frequent subsequences from a sequence database. Most research handles the static database in batch mode to discover the desired sequential patterns. In the past, the fast updated (FUP) and Fast UPdated 2 (FUP2) concepts were adopted to, respectively, maintain and update the discovered sequential patterns with sequence insertion and sequence deletion based on the designed FUP sequential pattern (FUSP)-tree structure. Based on the FUP or FUP2 concepts, original customer sequences are required to be rescanned if it is necessary to maintain and update the unpromising (small) sequences from the original database. In the past, pre-large concept was designed to keep the prelarge itemsets as the buffer to avoid the database rescan each time whether transaction insertion or deletion in the dynamic databases. In this paper, the prelarge concept is adopted to handle the discovered sequential patterns with sequence deletion. An FUSP tree is first built to keep only the frequent 1-sequences from the original database. The prelarge 1-sequences are also kept in a set for later maintenance approach. When some sequences are deleted from the original database, the proposed algorithm is then performed to divide the kept frequent 1-sequences and prelarge 1-sequences from the original database and the mined 1-sequences from the deleted customer sequences into three parts with nine cases. Each case is then processed by the designed algorithm to maintain and update the built FUSP tree. When the number of deleted customer sequences is smaller than the safety bound of the prelarge concept, the original customer sequences are unnecessary to be rescanned, but the sequential patterns can still be actually maintained and updated. Experiments are conducted to show the performance of the proposed algorithm in terms of execution time and the number of tree nodes.
Description Author affiliation :: Dept. of Comput. Sci. & Inf. Eng., Nat. Univ. of Kaohsiung, Kaohsiung, Taiwan
Author affiliation :: Innovative Inf. Ind. Res. Center, Harbin Inst. of Technol., Shenzhen, China
ISSN 21693536
Education Level UG and PG
Learning Resource Type Article
Publisher Date 2014-01-01
Publisher Place U.S.A.
Rights Holder Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Volume Number 2
Size (in Bytes) 5.32 MB
Page Count 10
Starting Page 1374
Ending Page 1383


Source: IEEE Xplore Digital Library