Access Restriction

Author Tatti, N.
Sponsorship Toshiba
Source IEEE Xplore Digital Library
Content type Text
Publisher Institute of Electrical and Electronics Engineers, Inc. (IEEE)
File Format PDF
Copyright Year ©2013
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Computer programming, programs & data
Subject Keyword Itemsets ♦ Data mining ♦ Random variables ♦ Standards ♦ Vectors ♦ Reactive power ♦ real-valued itemsets ♦ pattern mining ♦ itemsets
Abstract Pattern mining is one of the most well-studied sub fields in exploratory data analysis. While there is a significant amount of literature on how to discover and rank item sets efficiently from binary data, there is surprisingly little research done in mining patterns from real-valued data. In this paper we propose a family of quality scores for real-valued item sets. We approach the problem by considering casting the dataset into a binary data and computing the support from this data. This naive approach requires us to select thresholds. To remedy this, instead of selecting one set of thresholds, we treat thresholds as random variables and compute the average support. We show that we can compute this support efficiently, and we also introduce two normalisations, namely comparing the support against the independence assumption and, more generally, against the partition assumption. Our experimental evaluation demonstrates that we can discover statistically significant patterns efficiently.
Description Author affiliation: Dept. of Inf. & Comput. Sci., Aalto Univ., Espoo, Finland (Tatti, N.)
ISBN 9780769551081
ISSN 15504786
Educational Role Student ♦ Teacher
Age Range above 22 year
Educational Use Research ♦ Reading
Education Level UG and PG
Learning Resource Type Article
Publisher Date 2013-12-07
Publisher Place USA
Rights Holder Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Size (in Bytes) 2.31 MB
Page Count 10
Starting Page 717
Ending Page 726

Source: IEEE Xplore Digital Library