Data analysis of taxi trips in New York City (NYC) and a Random Forest Regressor that predicts the duration of a taxi trip. This project was part of the course "Data Mining Techniques", as taught in 2022 by Professor Dimitris Gunopulos.
The project covers the following main topics:
- Data cleansing
- Analysis of individual variables (number of passengers, etc.) and visualization of the results
- Clustering (and finding the optimal number of clusters)
- Random Forest Regression
- Hyperparameter tuning with GridSearchCV
- Map creation with pickup and dropoff points
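As a rough illustration of how the Random Forest Regression and GridSearchCV topics fit together, here is a minimal sketch using scikit-learn. It is not the project's actual code: the data is synthetic, and the feature names (`distance_km`, `pickup_hour`, `passenger_count`), the assumed relationship between them and the duration, the parameter grid, and the helper functions `make_synthetic_trips` and `tune_random_forest` are all hypothetical choices made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split


def make_synthetic_trips(n_rows=400, seed=0):
    """Generate hypothetical trip features and durations (not the real NYC data)."""
    rng = np.random.default_rng(seed)
    distance_km = rng.uniform(0.5, 20.0, n_rows)
    pickup_hour = rng.integers(0, 24, n_rows)
    passenger_count = rng.integers(1, 7, n_rows)
    # Assumed relationship: duration grows with distance and is longer in rush hours.
    rush = np.isin(pickup_hour, [8, 9, 17, 18]).astype(float)
    duration_s = 120 * distance_km * (1 + 0.5 * rush) + rng.normal(0, 60, n_rows)
    X = np.column_stack([distance_km, pickup_hour, passenger_count])
    return X, duration_s


def tune_random_forest(X, y, seed=0):
    """Grid-search a small hyperparameter grid and score the best model on held-out data."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    # Illustrative grid only; a real search would likely cover more values.
    param_grid = {"n_estimators": [20, 50], "max_depth": [4, 8]}
    search = GridSearchCV(
        RandomForestRegressor(random_state=seed),
        param_grid,
        cv=3,
        scoring="neg_mean_absolute_error",
    )
    search.fit(X_train, y_train)
    # R^2 of the refitted best estimator on the held-out test split.
    test_r2 = search.best_estimator_.score(X_test, y_test)
    return search, test_r2


if __name__ == "__main__":
    X, y = make_synthetic_trips()
    search, test_r2 = tune_random_forest(X, y)
    print("Best parameters:", search.best_params_)
    print("Held-out R^2:", round(test_r2, 3))
```

With real trip data, the synthetic generator would be replaced by the cleaned dataset, and the grid would typically be widened at the cost of longer search time.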
This project is the result of a collaboration with Giorgos Nikolaou.