Since its creation in 2001, random forest methodology has emerged as a powerful technique for making predictions in classification and regression problems. This nonparametric approach is especially popular when working with complex datasets that contain many predictor variables. In this talk, we examine how random forests are grown and discuss their use in predicting student retention in science, technology, engineering, and mathematics (STEM) majors at Iowa State University. We then explore the impact of outliers on random forest predictions and describe a new method for improving the robustness of random forest regression.
Refreshments at 3:00 pm outside Davis 216