Thumbnail
Access Restriction
Subscribed

Author Hershberger, John ♦ Shrivastava, Nisheeth ♦ Suri, Subhash
Source ACM Digital Library
Content type Text
Publisher Association for Computing Machinery (ACM)
File Format PDF
Copyright Year ©2009
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Computer programming, programs & data
Subject Keyword Convex hull ♦ Data streams ♦ Geometric data
Abstract We consider the following problem: given an on-line, possibly unbounded stream of two-dimensional (2D) points, how can we summarize its spatial distribution or $\textit{shape}$ using a small, bounded amount of memory? We propose a novel scheme, called $\textit{ClusterHull},$ which represents the shape of the stream as a dynamic collection of convex hulls, with a total of at most $\textit{m}$ vertices, where $\textit{m}$ is the size of the memory. The algorithm dynamically adjusts both the number of hulls and the number of vertices in each hull to best represent the stream using its fixed-memory budget. This algorithm addresses a problem whose importance is increasingly recognized, namely, the problem of summarizing real-time data streams to enable on-line analytical processing. As a motivating example, consider habitat monitoring using wireless sensor networks. The sensors produce a steady stream of geographic data, namely, the locations of objects being tracked. In order to conserve their limited resources (power, bandwidth, and storage), the sensors can compute, store, and exchange ClusterHull summaries of their data, without losing important geometric information. We are not aware of other schemes specifically designed for capturing shape information in geometric data streams and so we compare ClusterHull with some of the best general-purpose clustering schemes, such as CURE, $\textit{k}-medians,$ and LSEARCH. We show through experiments that ClusterHull is able to represent the shape of two-dimensional data streams more faithfully and flexibly than the stream versions of these clustering algorithms.
ISSN 10846654
Age Range 18 to 22 years ♦ above 22 year
Educational Use Research
Education Level UG and PG
Learning Resource Type Article
Publisher Date 2009-02-01
Publisher Place New York
e-ISSN 10846654
Journal Journal of Experimental Algorithmics (JEA)
Volume Number 13
Page Count 25
Starting Page 2.4
Ending Page 2.28


Open content in new tab

   Open content in new tab
Source: ACM Digital Library