Skip to content

victorcai101/loan_delay_detection

Repository files navigation

This is my code for an in-class Kaggle Competetion. The competition was modified form a loan-delay-detection competition by erasing the column names and adding random noise as new features.

For preprocessing, I divide the features into categorical and numerical and implement the null values with either most frequent values (for categorical features) or mean values (for numerical features). I then dummy encoded the categorical features and standardize the the numerical features.

For feature engineering, I emplement backward elemination on a fine tuning random forest model. PCA was not quite useful in this case.

My final best model is a two-level stacking model. I used Extra Trees, AdaBoost, Gradient Boost and Random Forest as first level classifier, and used another Random Forest as second level classifier.

Releases

No releases published

Packages

No packages published

Languages