This is our project for the Modern Information Retrieval course at Sharif University of Technology. It consists of three phases.
Each phase and the parts implemented in it are described below:
In the first phase, we implemented the basic information retrieval algorithms: `ltn.lnn`, `ltc.lnc`, and `okapi25`. We also implemented some compression schemes (Gamma Code and Variable Byte Code). Finally, we evaluated our implementations of these algorithms.
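As an illustration of the two compression schemes named above, here is a minimal sketch of Variable Byte and Gamma coding applied to posting-list gaps. It follows the common textbook convention (high bit set on the last byte of each Variable Byte number, unary length followed by offset for Gamma) and is not necessarily how the notebook implements them; all function names and the sample posting list are made up for the example.

```python
from typing import List


def to_gaps(postings: List[int]) -> List[int]:
    """Turn a sorted posting list into gaps (first ID, then differences)."""
    return [p - q for p, q in zip(postings, [0] + postings[:-1])]


def vb_encode(n: int) -> bytes:
    """Variable Byte code: 7 payload bits per byte, high bit set on the last byte."""
    if n < 0:
        raise ValueError("n must be non-negative")
    chunks = []
    while True:
        chunks.insert(0, n % 128)
        if n < 128:
            break
        n //= 128
    chunks[-1] += 128  # mark the final byte of this number
    return bytes(chunks)


def vb_decode(data: bytes) -> List[int]:
    """Decode a concatenation of Variable Byte codes."""
    numbers, n = [], 0
    for byte in data:
        if byte < 128:
            n = 128 * n + byte
        else:
            numbers.append(128 * n + (byte - 128))
            n = 0
    return numbers


def gamma_encode(n: int) -> str:
    """Gamma code as a bit string: unary length of the offset, then the offset."""
    if n < 1:
        raise ValueError("gamma code is defined for n >= 1")
    offset = bin(n)[3:]  # binary representation without its leading 1
    return "1" * len(offset) + "0" + offset


def gamma_decode(bits: str) -> List[int]:
    """Decode a concatenation of Gamma codes."""
    numbers, i = [], 0
    while i < len(bits):
        length = 0
        while bits[i] == "1":
            length += 1
            i += 1
        i += 1  # skip the 0 that terminates the unary part
        numbers.append(int("1" + bits[i:i + length], 2))
        i += length
    return numbers


# Hypothetical posting list, used only to exercise the functions.
gaps = to_gaps([824, 829, 215406])
vb_stream = b"".join(vb_encode(g) for g in gaps)
gamma_stream = "".join(gamma_encode(g) for g in gaps)
```

Both codes are applied to gaps rather than raw document IDs because gaps are smaller numbers and therefore need fewer bits.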
In the second phase, we implemented Naive Bayes classification, classification with neural networks, classification with language models, and transformer-based classification. We also enhanced our search engine and evaluated the classifiers.
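To make the first of these classifiers concrete, the following is a minimal sketch of multinomial Naive Bayes with add-one (Laplace) smoothing. It assumes documents are already tokenized; the class name, the toy training set, and the labels are invented for the example and do not come from the project's data.

```python
import math
from collections import Counter, defaultdict


class MultinomialNaiveBayes:
    """Multinomial Naive Bayes over token lists, with add-one smoothing."""

    def fit(self, docs, labels):
        self.n_docs = len(labels)
        self.class_counts = Counter(labels)      # documents per class
        self.term_counts = defaultdict(Counter)  # term frequencies per class
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.term_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        best_label, best_score = None, -math.inf
        vocab_size = len(self.vocab)
        for label, n_class in self.class_counts.items():
            total_terms = sum(self.term_counts[label].values())
            # Work in log space to avoid underflow on long documents.
            score = math.log(n_class / self.n_docs)
            for term in tokens:
                if term not in self.vocab:
                    continue  # terms never seen in training are ignored
                count = self.term_counts[label][term]
                score += math.log((count + 1) / (total_terms + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, made up for illustration.
train_docs = [
    ["chinese", "beijing", "chinese"],
    ["chinese", "chinese", "shanghai"],
    ["chinese", "macao"],
    ["tokyo", "japan", "chinese"],
]
train_labels = ["china", "china", "china", "japan"]

model = MultinomialNaiveBayes().fit(train_docs, train_labels)
prediction = model.predict(["chinese", "chinese", "chinese", "tokyo", "japan"])
```

In this toy case the three occurrences of "chinese" outweigh the two terms that point to the other class, so the document is assigned to "china".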
In the third phase, we first implemented a crawler for the semanticscholar.org website. We then implemented a personalized PageRank algorithm and used it to build a personalized search. After that, we ranked the authors of the crawled papers. The next part is a recommender system with collaborative filtering and content-based algorithms, along with their evaluations. Finally, we created a beautiful UI named Amoogle (a combination of Amir Mohammad and Google) for the search engine from the first phase.
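The personalized PageRank step can be sketched with plain power iteration, where the random surfer teleports to nodes in proportion to a user preference vector instead of uniformly. This is only an illustrative sketch: the function name, the `alpha` teleport probability of 0.15, and the tiny citation graph are assumptions, and the graph is assumed to list every node as a key.

```python
def personalized_pagerank(graph, preference, alpha=0.15, iterations=100, tol=1e-10):
    """Power iteration for PageRank with a personalized teleport vector.

    graph: dict mapping each node to the list of nodes it links to
           (every link target is assumed to also appear as a key).
    preference: dict mapping nodes to non-negative preference weights.
    alpha: probability of teleporting at each step.
    """
    nodes = list(graph)
    total = sum(preference.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("preference must give positive weight to some node")
    teleport = {n: preference.get(n, 0.0) / total for n in nodes}

    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {n: alpha * teleport[n] for n in nodes}
        for n in nodes:
            out_links = graph[n]
            if out_links:
                share = (1 - alpha) * rank[n] / len(out_links)
                for m in out_links:
                    new_rank[m] += share
            else:
                # Dangling node: hand its mass back via the teleport vector.
                for m in nodes:
                    new_rank[m] += (1 - alpha) * rank[n] * teleport[m]
        delta = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical citation graph: paper -> papers it cites.
citations = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = personalized_pagerank(citations, preference={"a": 1.0})
```

Because all teleport mass goes to the preferred node, papers that cannot be reached from it (here "d") end up with a score of zero, which is what makes the resulting ranking specific to the user.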
The UI is a React project. To run it, follow these steps in order:
- First, run the last 6 cells of the `Pahse 3/ir-phase-3.ipynb` file. They act as the backend of the UI.
- Then run `npm install` in the `Pahse 3/search_engin_ui` directory.
- Finally, run `npm start` in the `Pahse 3/search_engin_ui` directory.
The UI implemented in the third phase for the first phase's search engine is shown below: