Access Restriction

Author Stilo, G. ♦ Velardi, P.
Sponsorship IEEE Comput. Soc.
Source IEEE Xplore Digital Library
Content type Text
Publisher Institute of Electrical and Electronics Engineers, Inc. (IEEE)
File Format PDF
Copyright Year ©2014
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Special computer methods
Subject Keyword Algorithm design and analysis ♦ Symbolic Aggregate approXimation ♦ event discovery ♦ Aggregates ♦ Time series analysis ♦ Clustering algorithms ♦ Twitter mining ♦ Twitter ♦ temporal text mining ♦ Complexity theory ♦ Acceleration
Abstract Temporal text mining (TTM) has recently attracted the attention of scientists as a means to discover and track discussions in micro-blogs in real time. However, current approaches to temporal mining suffer from efficiency problems when applied to large micro-blog streams, like Twitter, now reaching an average of 500 million tweets per day. We propose a technique, named SAX (based on an algorithm named Symbolic Aggregate Approximation), to discretize the temporal series of terms into a small set of levels, leading to a string for each term. We then define a subset of "interesting" strings, i.e., those representing patterns of collective attention. Sliding temporal windows are used to detect clusters of terms with the same string. We show that SAX is more efficient (by orders of magnitude) than other approaches to temporal mining in the literature. In this paper, we experiment with SAX on the task of event discovery over a one-year 1% worldwide Twitter stream.
Description Author affiliation: Dipt. di Inf., Sapienza Univ. di Roma, Rome, Italy (Stilo, G.; Velardi, P.)
ISBN 9781479941438
Educational Role Student ♦ Teacher
Age Range above 22 years
Educational Use Research ♦ Reading
Education Level UG and PG
Learning Resource Type Article
Publisher Date 2014-08-11
Publisher Place Poland
Rights Holder Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Size 1.09 MB
Page Count 8
Starting Page 186
Ending Page 193
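
Illustrative sketch The abstract outlines three steps: discretize each term's time series into a short string, keep only "interesting" strings, and group terms that share a string within a window. The Python below is a minimal sketch of that idea, not the authors' implementation. The function names, the two-symbol alphabet, the use of z-normalization with equiprobable Gaussian breakpoints, and the single-burst pattern `a*b+a*` are all assumptions made for illustration; the paper may define its levels and patterns differently.

```python
import re
from collections import defaultdict
from statistics import NormalDist, mean, pstdev


def sax_string(series, alphabet="ab"):
    """Discretize one window of term counts into a string of symbols."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        # A flat series carries no burst; map it to the lowest level.
        return alphabet[0] * len(series)
    k = len(alphabet)
    # Breakpoints split the standard normal into k equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    symbols = []
    for x in series:
        z = (x - mu) / sigma
        # The symbol index is the number of breakpoints lying below z.
        symbols.append(alphabet[sum(z > b for b in breakpoints)])
    return "".join(symbols)


def interesting_clusters(term_series, pattern=r"a*b+a*"):
    """Group terms by identical string, keeping strings that match `pattern`.

    The default pattern (quiet, burst, quiet) is a hypothetical stand-in
    for the "interesting" strings mentioned in the abstract.
    """
    clusters = defaultdict(list)
    for term, series in term_series.items():
        s = sax_string(series)
        if re.fullmatch(pattern, s):
            clusters[s].append(term)
    return dict(clusters)


# One temporal window of made-up per-slot counts for three terms.
window = {
    "earthquake": [1, 1, 9, 10, 1, 1],
    "tsunami": [0, 0, 7, 8, 0, 0],
    "the": [5, 5, 5, 5, 5, 5],
}
print(interesting_clusters(window))
```

In this toy window the two bursting terms receive the same string and fall into one cluster, while the flat term is discarded. A sliding-window version would call `interesting_clusters` once per window position over the stream.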
