Efficient Modeling and Prediction of E-commerce Recommendation Systems: A Case Study on Amazon Product Reviews
This project explores the predictive potential of e-commerce review data by analyzing Amazon beauty product reviews (2004–2018). It employs machine learning to predict product ratings and builds a recommendation system to provide insights into customer preferences and product popularity.
- Rating Prediction: Binary classification of reviews (5-star vs. 1-4 stars).
- Recommendation System: Collaborative filtering based on reviewer-product engagement.
- Feature Engineering: TF-IDF vectorization, sentiment scoring, and datetime transformations.
- Dimensionality Reduction: Principal Component Analysis (PCA) to improve efficiency and reduce the risk of overfitting.
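As a rough illustration of the rating target and the datetime transformation listed above, here is a minimal sketch. It assumes the column names `overall` and `unixReviewTime` from the UCSD review schema; the sample rows and the derived column names (`is_five_star`, `review_year`, and so on) are hypothetical and not taken from the project code.

```python
import pandas as pd

# Hypothetical sample rows; `overall` and `unixReviewTime` are assumed
# column names, and the values are made up for illustration.
reviews = pd.DataFrame({
    "overall": [5, 4, 1, 5],
    "unixReviewTime": [1088640000, 1420070400, 1500000000, 1538352000],
})

# Binary target: 1 for a 5-star review, 0 for a 1-4 star review.
reviews["is_five_star"] = (reviews["overall"] == 5).astype(int)

# Convert Unix seconds to datetimes, then derive simple calendar features.
reviews["review_date"] = pd.to_datetime(reviews["unixReviewTime"], unit="s")
reviews["review_year"] = reviews["review_date"].dt.year
reviews["review_month"] = reviews["review_date"].dt.month
reviews["review_dayofweek"] = reviews["review_date"].dt.dayofweek

print(reviews[["overall", "is_five_star", "review_year", "review_month"]])
```

Calendar features like these let a model pick up seasonal or long-term trends that a raw timestamp would hide.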
- Yulong Dong
- Daniel Sitompul
- Khanh Thai
- Research Question: Can customer reviews predict product ratings?
- Models Used: Logistic Regression, Random Forest, and Stacking Models.
- Performance Metrics: ROC Curve, AUC, Confusion Matrix, and RMSE.
- Video Presentation: Watch here
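The classification metrics named above can be computed with scikit-learn roughly as follows. The labels and scores are hypothetical toy values, and the 0.5 decision threshold is an assumption for illustration, not necessarily the one used in the project.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve

# Hypothetical ground truth (1 = 5-star) and predicted probabilities.
y_true = np.array([0, 0, 1, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.9, 0.2])

# AUC summarizes ranking quality across all thresholds.
auc = roc_auc_score(y_true, y_score)

# Points of the ROC curve, suitable for plotting FPR against TPR.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# The confusion matrix needs hard predictions; 0.5 is an assumed cutoff.
y_pred = (y_score >= 0.5).astype(int)
cm = confusion_matrix(y_true, y_pred)  # rows: actual, columns: predicted

print(f"AUC: {auc:.3f}")
print(cm)
```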
The dataset contains about 5.2k reviews of Amazon beauty products, with the following fields:
- Review Text
- Summary
- Product Style
- Reviewer Information
- Ratings and Votes
Data Source: UCSD's open-source Amazon Review dataset.
- Exploratory Data Analysis: Identifying missing data and reviewing feature distributions.
- Feature Engineering:
  - Textual Data: Sentiment analysis and TF-IDF vectorization.
  - Temporal Data: Transforming Unix timestamps to datetime features.
- Model Training:
  - Logistic Regression as baseline.
  - Random Forest for non-linear classification.
  - Dimensionality reduction using PCA.
- Recommendation System: Built using collaborative filtering and Singular Value Decomposition (SVD).
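The text and modeling steps above can be sketched with scikit-learn as follows. This is an illustrative outline under stated assumptions, not the project's actual code: the toy reviews, the labels, the choice of 4 components, and the hyperparameters are all hypothetical, and sentiment scoring and stacking are omitted.

```python
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical review texts with binary labels (1 = 5-star, 0 = 1-4 stars).
texts = [
    "love this lotion, smells amazing",
    "best shampoo ever, perfect for my hair",
    "great product, works perfectly",
    "amazing color and lasts all day",
    "broke after one use, very disappointed",
    "smells bad and irritated my skin",
    "not worth the price, poor quality",
    "okay but the bottle leaked in shipping",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# TF-IDF turns each review into a sparse vector of weighted term counts.
tfidf = TfidfVectorizer()
X_sparse = tfidf.fit_transform(texts)

# scikit-learn's PCA expects a dense array, so densify first. This is fine
# for a small corpus; the component count here is an arbitrary choice.
pca = PCA(n_components=4, random_state=0)
X_reduced = pca.fit_transform(X_sparse.toarray())

# Baseline linear model and a non-linear ensemble on the reduced features.
baseline = LogisticRegression(max_iter=1000).fit(X_reduced, labels)
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_reduced, labels)

# Predicted probability of the 5-star class for each review.
baseline_proba = baseline.predict_proba(X_reduced)[:, 1]
forest_proba = forest.predict_proba(X_reduced)[:, 1]
print(X_sparse.shape, "->", X_reduced.shape)
```

The sketch fits and scores on the same rows only to stay short; a real evaluation would hold out a test split and fit the vectorizer and PCA on the training portion alone.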
- Model Performance:
  - Random Forest outperformed Logistic Regression, achieving an AUC of 0.99.
  - PCA reduced model complexity by 85% while preserving accuracy.
- Recommendation System:
  - Provided highly correlated product suggestions based on user-product engagement.
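One way correlated product suggestions can be derived from user-product engagement is sketched below with plain NumPy. The rating matrix, the product IDs, the rank `k = 2`, and the `recommend` helper are all hypothetical; the project's actual recommender may differ in both data preparation and scoring.

```python
import numpy as np

# Hypothetical user-by-product rating matrix (rows: reviewers, 0 = no review).
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
    [5, 5, 0, 0],
], dtype=float)
product_ids = ["P0", "P1", "P2", "P3"]

# Truncated SVD: keep the k strongest latent factors to smooth out noise.
k = 2
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]

# Pearson correlation between product columns of the low-rank approximation.
corr = np.corrcoef(approx.T)

def recommend(product_id, top_n=1):
    """Return the products most correlated with `product_id` (hypothetical helper)."""
    idx = product_ids.index(product_id)
    order = np.argsort(-corr[idx])  # highest correlation first
    return [product_ids[j] for j in order if j != idx][:top_n]

print(recommend("P0"))
print(recommend("P2"))
```

Correlating columns of the low-rank approximation rather than the raw matrix lets products with few shared reviewers still be compared through the latent factors.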
For questions or feedback, contact:
- Daniel Sitompul: daniel_sitompul@berkeley.edu