
Author Yinglung Liang ♦ Yanyong Zhang ♦ Hui Xiong ♦ Sahoo, R.
Source IEEE Xplore Digital Library
Content type Text
Publisher Institute of Electrical and Electronics Engineers, Inc. (IEEE)
File Format PDF
Copyright Year ©2007
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Computer programming, programs & data
Subject Keyword Nearest neighbor searches ♦ Data mining ♦ Fault tolerance ♦ Support vector machines ♦ Support vector machine classification ♦ Runtime ♦ Predictive models ♦ Accuracy ♦ Supercomputers ♦ System performance
Abstract Frequent failures are becoming a serious concern to the community of high-end computing, especially when the applications and the underlying systems rapidly grow in size and complexity. In order to develop effective fault-tolerant strategies, there is a critical need to predict failure events. To this end, we have collected detailed event logs from IBM BlueGene/L, which has 128 K processors, and is currently the fastest supercomputer in the world. In this study, we first show how the event records can be converted into a data set that is appropriate for running classification techniques. Then we apply classifiers on the data, including RIPPER (a rule-based classifier), Support Vector Machines (SVMs), a traditional Nearest Neighbor method, and a customized Nearest Neighbor method. We show that the customized nearest neighbor approach can outperform RIPPER and SVMs in terms of both coverage and precision. The results suggest that the customized nearest neighbor approach can be used to alleviate the impact of failures.
Description Author affiliation: Rutgers Univ., Newark (Yinglung Liang; Yanyong Zhang)
ISBN 9780769530185
ISSN 1550-4786
Educational Role Student ♦ Teacher
Age Range Above 22 years
Educational Use Research ♦ Reading
Education Level UG and PG
Learning Resource Type Article
Publisher Date 2007-10-28
Publisher Place USA
Rights Holder Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Size 151.66 kB
Page Count 6
Starting Page 583
Ending Page 588
