Author Niculescu-Mizil, Alexandru ♦ Caruana, Rich
Source CiteSeerX
Content type Text
Publisher AUAI Press
File Format PDF
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science
Subject Keyword Roc Area ♦ Decision Tree ♦ Isotonic Regression ♦ Full Decision Tree ♦ Usual Exponential Loss ♦ Neural Net ♦ Good Accuracy ♦ Decision Stump ♦ Poor Performance ♦ Weak Model ♦ Logistic Correction ♦ Calibration Method ♦ Platt Scaling ♦ Log-loss Work ♦ Complex Model ♦ Posterior Probability
Description Boosted decision trees typically yield good accuracy, precision, and ROC area. However, because the outputs from boosting are not well calibrated posterior probabilities, boosting yields poor squared error and cross-entropy. We empirically demonstrate why AdaBoost predicts distorted probabilities and examine three calibration methods for correcting this distortion: Platt Scaling, Isotonic Regression, and Logistic Correction. We also experiment with boosting using log-loss instead of the usual exponential loss. Experiments show that Logistic Correction and boosting with log-loss work well when boosting weak models such as decision stumps, but yield poor performance when boosting more complex models such as full decision trees. Platt Scaling and Isotonic Regression, however, significantly improve the probabilities predicted by both boosted stumps and boosted trees. After calibration, boosted full decision trees predict better probabilities than other learning methods such as SVMs, neural nets, bagged decision trees, and KNNs, even after these methods are calibrated.
Educational Role Student ♦ Teacher
Age Range above 22 years
Educational Use Research
Education Level UG and PG ♦ Career/Technical Study
Learning Resource Type Article
Publisher Date 2005-01-01
Publisher Institution in Proc. 21st Conference on Uncertainty in Artificial Intelligence (UAI’05)
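
Illustrative note: the three calibration methods named in the Description can be sketched roughly as follows. This is a minimal sketch, not the authors' code. The record gives no formulas, so the forms here are assumptions: Logistic Correction is taken as 1 / (1 + exp(-2 f(x))), the form usually derived from the additive logistic regression view of AdaBoost; Platt Scaling is taken as a sigmoid fitted to the raw scores (here by plain gradient descent, for brevity); Isotonic Regression is taken as a step function fitted by pool-adjacent-violators. All function names and the toy data are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logistic_correction(score):
    """Assumed form: map a boosting score f(x) to 1 / (1 + exp(-2 f(x)))."""
    return sigmoid(2.0 * score)


def fit_platt(scores, labels, lr=0.1, steps=2000):
    """Fit p = sigmoid(a * s + b) by gradient descent on mean log-loss."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log-loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def fit_isotonic(scores, labels):
    """Pool-adjacent-violators: returns [(upper_score, probability), ...]."""
    blocks = []  # each block is [sum_of_labels, count, largest_score_in_block]
    for s, y in sorted(zip(scores, labels)):
        blocks.append([float(y), 1, s])
        # Merge backwards while the block means violate monotonicity.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, hi = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = hi
    return [(hi, total / count) for total, count, hi in blocks]


def predict_isotonic(model, score):
    """Step-function lookup; scores above the last block get its value."""
    for hi, value in model:
        if score <= hi:
            return value
    return model[-1][1]


# Toy held-out scores and 0/1 labels (made up for illustration only).
toy_scores = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
toy_labels = [0, 0, 1, 0, 1, 0, 1, 1]

a, b = fit_platt(toy_scores, toy_labels)
iso = fit_isotonic(toy_scores, toy_labels)
for s in (-1.0, 0.0, 1.0):
    print(s, logistic_correction(s), sigmoid(a * s + b), predict_isotonic(iso, s))
```

In practice the calibration map would be fitted on data held out from boosting, and a library implementation would usually be preferred over this hand-rolled version.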