
Short-term Prediction of Mortgage Default

Master's Thesis, Slippery Rock University


Reference
Title: Short-term Prediction of Mortgage Default using Ensembled Machine Learning Models
Date: July 2018
Link: ResearchGate
Author(s): Jesse Sealand

Abstract

The predictive ability of a machine learning model depends on the quality of the data on which it is trained. To be accurate and reliable, a model must be trained on data representative of the conditions under which the predicted events will occur. Current mortgage default models typically predict over a longer horizon than the period for which representative training data is available. Because the greatest number of mortgages default during the second year of the loan, we restrict our prediction window from the lifetime of the loan to just 12 months into the future. We find that this method is both effective and reliable, with the caveat that it is only useful if the models are reusable. We therefore determine how reusing pretrained machine learning models on new datasets affects their predictive capabilities over time.
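To make the 12-month prediction window concrete, the following is a minimal sketch of how a binary target could be built for such a model. It is not taken from the thesis: the function name `defaults_within_window`, the encoding of time as loan age in months, and the sample loans are all assumptions made for illustration.

```python
from typing import Optional

# The 12-month horizon described in the abstract.
PREDICTION_WINDOW_MONTHS = 12


def defaults_within_window(
    observation_month: int,
    default_month: Optional[int],
    window: int = PREDICTION_WINDOW_MONTHS,
) -> int:
    """Return 1 if the loan defaults within `window` months after observation.

    Both arguments are loan ages in months (a hypothetical encoding).
    `default_month` is None for loans with no recorded default.
    """
    if default_month is None:
        return 0
    months_ahead = default_month - observation_month
    # Only defaults strictly after the observation date and inside
    # the window count as positives.
    return 1 if 0 < months_ahead <= window else 0


# Made-up loans: (loan age at observation, loan age at default or None).
loans = [(6, 14), (6, 30), (12, None), (10, 22)]
labels = [defaults_within_window(obs, dflt) for obs, dflt in loans]
```

Under this encoding, a loan that defaults 24 months after observation is a negative example for the short-term model, even though a lifetime-of-loan model would label it as a default.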

Conclusion

Using short-term predictive models trained on annual datasets results in both high precision and high recall. The best-performing algorithms are a mix of boosting classifiers and ensembled decision trees. Parameter optimization yields significant performance increases, especially for boosting algorithms. Model ensembling produced higher F1 scores than single models in only a few select cases, with averaging class probabilities being the preferred method of combining class predictions. Ultimately, reusing pretrained models is shown to be useful in certain cases, but the effect of reuse on their predictive capabilities over time is irregular or inconsistent.
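Averaging class probabilities and scoring the result with precision, recall, and F1 can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the thesis code: the helper names, the three sets of model probabilities, the true labels, and the 0.5 decision threshold are all assumptions.

```python
from typing import List, Sequence, Tuple


def average_probabilities(model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average the positive-class probability across models, per sample."""
    n_models = len(model_probs)
    return [sum(sample_probs) / n_models for sample_probs in zip(*model_probs)]


def precision_recall_f1(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up default probabilities from three hypothetical models.
probs_a = [0.9, 0.2, 0.6, 0.4]
probs_b = [0.8, 0.4, 0.3, 0.7]
probs_c = [0.7, 0.3, 0.3, 0.7]
y_true = [1, 0, 1, 1]  # made-up ground truth

avg = average_probabilities([probs_a, probs_b, probs_c])
# Assumed decision threshold of 0.5 on the averaged probability.
y_pred = [1 if p >= 0.5 else 0 for p in avg]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Averaging probabilities keeps each model's confidence in the combined prediction, whereas majority voting on hard labels discards it. In this toy data the third sample is missed by the ensemble, so recall falls below precision.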