Thumbnail
Access Restriction
Subscribed

Author Marinov, Martin ♦ Nash, Nicholas ♦ Gregg, David
Source ACM Digital Library
Content type Text
Publisher Association for Computing Machinery (ACM)
File Format PDF
Copyright Year ©2016
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Computer programming, programs & data
Subject Keyword Algorithms ♦ Dataset ♦ Extremal sets ♦ Itemset ♦ Memoization ♦ Parallel ♦ Practical
Abstract The minimal sets within a collection of sets are defined as the ones that do not have a proper subset within the collection, and the maximal sets are the ones that do not have a proper superset within the collection. Identifying extremal sets is a fundamental problem with a wide range of applications in SAT solvers, data mining, and social network analysis. In this article, we present two novel improvements of the high-quality extremal set identification algorithm, $\textit{AMS-Lex},$ described by Bayardo and Panda. The first technique uses memoization to improve the execution time of the single-threaded variant of the AMS-Lex, while our second improvement uses parallel programming methods. In a subset of the presented experiments, our memoized algorithm executes more than 400 times faster than the highly efficient publicly available implementation of AMS-Lex. Moreover, we show that our modified algorithm's speedup is not bounded above by a constant and that it increases as the length of the common prefixes in successive input $\textit{itemsets}$ increases. We provide experimental results using both real-world and synthetic datasets, and show our multithreaded variant algorithm outperforming AMS-Lex by 3 to 6 times. We find that on synthetic input datasets, when executed using 16 CPU cores of a 32-core machine, our multithreaded program executes about as fast as the state-of-the-art parallel GPU-based program using an NVIDIA GTX 580 graphics processing unit.
Description Author Affiliation: Susquehanna International Group, Dublin, Ireland (Nash, Nicholas); Trinity College Dublin, Dublin, Ireland (Marinov, Martin); Lero, Trinity College Dublin, Dublin, Ireland (Gregg, David)
ISSN 10846654
Age Range 18 to 22 years ♦ above 22 year
Educational Use Research
Education Level UG and PG
Learning Resource Type Article
Publisher Date 2016-04-05
Publisher Place New York
e-ISSN 10846654
Journal Journal of Experimental Algorithmics (JEA)
Volume Number 21
Page Count 21
Starting Page 1
Ending Page 21


Open content in new tab

   Open content in new tab
Source: ACM Digital Library