The Titanic competition is based on the infamous sinking of the Titanic in 1912. The goal is to build a model that predicts which passengers survived the shipwreck.
- I analyzed the input features in detail to understand what impact each feature has on the target.
- After selecting suitable features, I developed strategies to fill in missing values and created suitable encoding schemes for non-numerical features.
- I developed a complete pipeline to automatically perform the necessary data preprocessing.
- Then, I tested many commonly used classification models and found that the Support Vector Machine classifier and the Random Forest classifier were the most promising.
- Finally, I ran grid searches on these two classifiers to find the best set of parameters. The best model achieved roughly 82% accuracy on cross-validation, and I used this model to predict which passengers survived in the test set.
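
The steps above can be sketched roughly as follows with scikit-learn. This is a minimal illustration, not the project's actual code. The data is a small synthetic stand-in rather than the real Titanic data, and its labels come from an invented rule. The chosen columns (`Age`, `Fare`, `Sex`, `Embarked`), the imputation strategies, and the parameter grids are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the training data (assumption: NOT the real dataset).
rng = np.random.default_rng(0)
n = 120
train = pd.DataFrame({
    "Age": rng.uniform(1, 70, n),
    "Fare": rng.uniform(5, 100, n),
    "Sex": rng.choice(["male", "female"], n),
    "Embarked": rng.choice(["S", "C", "Q"], n),
})
train.loc[::10, "Age"] = np.nan       # simulate missing numeric values
train.loc[::25, "Embarked"] = np.nan  # simulate missing categorical values
# Invented labelling rule, used only so the example has a target to learn.
y = ((train["Sex"] == "female") | (train["Fare"] > 80)).astype(int)

# Preprocessing: fill missing values, then scale numbers / one-hot encode categories.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])
preprocess = ColumnTransformer([
    ("num", numeric, ["Age", "Fare"]),
    ("cat", categorical, ["Sex", "Embarked"]),
])

# Candidate models with hypothetical parameter grids.
candidates = {
    "svc": (SVC(), {"model__C": [0.1, 1, 10], "model__gamma": ["scale", 0.1]}),
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"model__n_estimators": [50, 100], "model__max_depth": [3, None]},
    ),
}

results = {}
for name, (model, grid) in candidates.items():
    pipe = Pipeline([("preprocess", preprocess), ("model", model)])
    # 5-fold cross-validated grid search, scored by accuracy.
    search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy")
    search.fit(train, y)
    results[name] = search
    print(name, round(search.best_score_, 3), search.best_params_)

# Keep the pipeline with the best cross-validation accuracy and predict with it.
best_name = max(results, key=lambda k: results[k].best_score_)
best_model = results[best_name].best_estimator_
predictions = best_model.predict(train.head())
```

Putting the preprocessing inside the same `Pipeline` as the classifier means the imputers, scaler, and encoder are refit on each cross-validation training fold, so no information from the held-out fold leaks into the score.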