This repository contains the code I used for the challenge of the Recommender Systems course at Politecnico di Milano.
The goal of the competition was to create a recommender system for TV programs, providing 10 recommended products to each target user.
Link to the official website of the challenge
I placed 2nd in the competition out of 83 participating teams. My final MAP on the private leaderboard is 0.06110.
The dataset represents the interactions between the users and the items of a streaming platform. Items can be of different types and different lengths (movies, TV series, ...).
A full description of the data is available on the challenge webpage. The main source file is `interactions_and_impressions.csv`,
which contains the interactions of each user with the items, for example:
| UserID | ItemID | Impressions | Data |
|---|---|---|---|
| 0 | 11 | 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19 | 1 |
| 0 | 21 | | 0 |
| 0 | 21 | | 0 |
| 0 | 21 | 20,21,22,23,24,25,26,27,28,29 | 0 |
Where:
- `Data` is `0` if the user watched the item, `1` if the user opened the item details page.
- `Impressions`: string containing the items that were present on the screen when the user interacted with the item in column `ItemID`. Not all interactions have a corresponding impressions list.
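As a minimal sketch of how this file can be parsed, the snippet below loads a few sample rows (in the same format as the table above) with pandas and counts watches and details-page openings per user-item pair. The column handling is an assumption based on the description above, not the repository's actual loading code:

```python
import io

import pandas as pd

# A few rows in the format of interactions_and_impressions.csv (see table above).
sample = """UserID,ItemID,Impressions,Data
0,11,"0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19",1
0,21,,0
0,21,,0
0,21,"20,21,22,23,24,25,26,27,28,29",0
"""

# Impressions may be missing, so keep it as a (nullable) string column.
df = pd.read_csv(io.StringIO(sample), dtype={"Impressions": "string"})

# Data == 0 means the user watched the item, Data == 1 that they opened
# the item details page.
n_watches = (
    df[df["Data"] == 0].groupby(["UserID", "ItemID"]).size().rename("n_watches")
)
n_details = (
    df[df["Data"] == 1].groupby(["UserID", "ItemID"]).size().rename("n_details")
)

counts = pd.concat([n_watches, n_details], axis=1).fillna(0).astype(int)
print(counts)
```

These per-pair counts are exactly the quantities used later to build the weighted URMs.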
The recommender architecture is roughly the following:
The main body of the recommender is a linear hybrid that combines the item scores of its base recommenders (SLIM elastic net, item KNN, EASE_R, RP3beta, iALS).
The key point is that each base recommender uses a differently tuned URM. Each URM is built as a weighted combination of the number of views and the number of "opening the details page" events of each user-item pair, with the weights tuned separately for each base recommender.
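As an illustration of this weighting (the exact formula and the tuned weight values live in the tuning scripts; `w_views` and `w_details` below are hypothetical), a URM built as a weighted sum of the two interaction counts might look like:

```python
import numpy as np
import scipy.sparse as sps

# Hypothetical per-recommender weights; in the project they are tuned
# separately for each base recommender.
w_views, w_details = 1.0, 0.5

# Toy counts for 2 users and 3 items, in COO-style arrays.
users = np.array([0, 0, 1])
items = np.array([0, 2, 1])
views = np.array([3, 1, 2])      # times each pair was watched
details = np.array([1, 0, 4])    # details-page openings per pair

# Each URM entry is a weighted combination of the two counts.
ratings = w_views * views + w_details * details
urm = sps.csr_matrix((ratings, (users, items)), shape=(2, 3))
print(urm.toarray())
```

Tuning `w_views` and `w_details` per recommender lets each model weigh implicit signals (views vs. curiosity clicks) differently.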
Once the best URMs have been found, the models are combined in a linear hybrid and their weights are found by hyperparameter tuning.
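The linear hybrid step can be sketched as follows. The score matrices and the hybrid weights below are made up for illustration; in the project the scores come from the tuned base recommenders and the weights from hyperparameter tuning:

```python
import numpy as np

# Hypothetical item-score matrices (users x items) from three base
# recommenders; in the project these would come from SLIM, ItemKNN, EASE_R...
scores_slim = np.array([[0.9, 0.1, 0.4]])
scores_knn = np.array([[0.2, 0.8, 0.3]])
scores_ease = np.array([[0.5, 0.5, 0.9]])

# Hypothetical hybrid weights, found via hyperparameter tuning in the project.
weights = {"slim": 0.5, "knn": 0.3, "ease": 0.2}

# The hybrid score is a weighted sum of the base recommenders' scores.
hybrid = (
    weights["slim"] * scores_slim
    + weights["knn"] * scores_knn
    + weights["ease"] * scores_ease
)

# Recommend the top-k items per user by sorting the hybrid scores.
top_k = np.argsort(-hybrid, axis=1)[:, :2]
print(top_k)
```

In practice the base scores usually need to be normalized to comparable ranges before the weighted sum, otherwise one recommender can dominate the hybrid.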
The last step is to include impressions. This is done using the impression discounting technique, in which each item that was recommended but not chosen by the user is penalized; this is implemented by multiplying the score by an exponential function of the number of impressions.
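A sketch of the discounting step, assuming a penalty of the form `exp(-alpha * n_impressions)` for items that were shown but not chosen (`alpha` and the exact functional form are assumptions here; the tuned version is produced by `4_tune_impression_discounting.py`):

```python
import numpy as np

alpha = 0.1  # hypothetical discounting strength, tuned in the project

scores = np.array([0.8, 0.6, 0.4])     # hybrid scores for 3 candidate items
n_impressions = np.array([5, 0, 2])    # times each item was shown but not chosen

# The more often an item was shown without being picked, the stronger
# the exponential discount on its score.
discounted = scores * np.exp(-alpha * n_impressions)
print(discounted.round(3))
```

Note how the first item, despite the highest raw score, drops below the second one after discounting because it was shown five times without being chosen.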
- Python 3.8
- Poetry
- Cython configured (C compiler)
- Install packages with Poetry
- In the Poetry environment, compile the Cython modules with:

```shell
python run_compile_all_cython.py
```
To train and tune the recommender, the following files have to be run sequentially:
1. `1_tune_base_recommender_{RECOMMENDER_NAME}.py`: trains and tunes the base recommenders
2. `2_tune_URM_{RECOMMENDER_NAME}.py`: tunes the URM of each recommender
3. `3_tune_linear_hybrid.py`: tunes the linear hybrid of the base recommenders
4. `4_tune_impression_discounting.py`: tunes the parameters of the impression discounting step
5. `5_train_final_model_all_data.py`: trains the final model with all the data and creates the submission
The hyperparameter tuning was done on a CCX31 Hetzner cloud instance, which has 8 dedicated vCPUs and 32 GB of RAM.
The code in this repository is inspired by MaurizioFD/RecSys_Course_AT_PoliMi, a repository used during the Recommender Systems course at Politecnico di Milano.