Zoekmachines - Final Project

Organized by uva_zoekmachines

First phase

Milestone 01 Inverted Index and Evaluation
Sept. 18, 2023, midnight UTC


Competition Ends
Oct. 31, 2023, midnight UTC

Milestone 01 Inverted Index and Evaluation

Start: Sept. 18, 2023, midnight

Description: For this milestone, we ask each team to implement a disk-based inverted index and adapt the current Starting Kit to work with it. Each team should use this index to improve the existing frequency-based retrieval model. We also ask each team to implement additional metrics such as P@K, nDCG@K, and Recall@K, and to report the performance of their model on the validation set in terms of these metrics.
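As an illustration of the metrics named above, here is a minimal sketch in Python. It is not the Starting Kit's code: the function names and the input shapes (a ranked list of document ids, plus a set or dict of relevance judgments) are assumptions made for this example. The nDCG variant shown uses linear gain; an exponential gain of 2^rel - 1 is another common choice.

```python
import math

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked doc ids that are relevant."""
    return sum(1 for d in ranked[:k] if d in relevant) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant doc ids that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(1 for d in ranked[:k] if d in relevant) / len(relevant)

def dcg_at_k(gains, k):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

def ndcg_at_k(ranked, grades, k):
    """nDCG@k, where grades maps doc id -> relevance grade (missing = 0)."""
    gains = [grades.get(d, 0) for d in ranked]
    ideal = sorted(grades.values(), reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / idcg if idcg > 0 else 0.0

# Hypothetical example data: one query, four retrieved documents.
ranked = ["d1", "d2", "d3", "d4"]
grades = {"d1": 1, "d3": 1, "d5": 1}
print(precision_at_k(ranked, grades, 2))
print(recall_at_k(ranked, grades, 2))
print(ndcg_at_k(ranked, grades, 3))
```

In practice these per-query values would be averaged over all validation queries before being reported.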

Milestone 02 Pre-processing and Advanced Ranking

Start: Oct. 4, 2023, midnight

Description: In this milestone, we ask you to pre-process the documents (e.g., stemming, stopping, etc.) and rebuild your inverted index. First, we want to see the effect of pre-processing on the current ranking method. Next, we ask you to implement more advanced lexical ranking algorithms such as TF-IDF, BM25, and query likelihood (QL). Each team should implement at least two new ranking methods and compare their results. Experiments on the effect of model parameters are also required.
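To make one of the lexical rankers concrete, the following is a minimal sketch of BM25 scoring in Python. It assumes documents are already tokenized and held in memory as a dict, which is a simplification; a real submission would read term statistics from the disk-based index. The function name, the input format, and the default values k1=1.2 and b=0.75 are assumptions for this example, and the IDF form used is one of several common variants.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score every document against the query with BM25.

    docs maps doc id -> list of (pre-processed) tokens.
    Returns a dict doc id -> score.
    """
    if not docs:
        return {}
    n_docs = len(docs)
    avg_len = sum(len(tokens) for tokens in docs.values()) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))
    scores = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization controlled by b, saturation by k1.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score
    return scores

# Hypothetical toy collection.
docs = {"a": ["cat", "sat"], "b": ["dog", "sat", "sat"], "c": ["bird"]}
print(bm25_scores(["cat", "sat"], docs))
```

Varying k1 and b in a sketch like this is one way to run the parameter experiments the milestone asks for.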

Milestone 03 Learning to Rank

Start: Oct. 11, 2023, midnight

Description: For this milestone, you take the ranking from the previous milestone and re-rank it for better performance. There are various options to choose from. Teams can focus on better feature selection/engineering and use the RankNet implementation in the starting kit. Teams can also investigate the performance of more advanced learning-to-rank models such as ListNet and LambdaMART. You can also test the model’s performance with different hyper-parameter values.
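For intuition about what RankNet optimizes, here is a minimal sketch of its pairwise logistic loss in plain Python. This is not the starting kit's implementation: the function names, the sigma parameter, and the averaging over pairs are assumptions for this example. For a pair where document i is more relevant than document j, the loss is log(1 + exp(-sigma * (s_i - s_j))), which shrinks as the model scores i further above j.

```python
import math

def ranknet_pair_loss(s_pos, s_neg, sigma=1.0):
    """Loss for one pair where s_pos belongs to the more relevant doc."""
    diff = sigma * (s_pos - s_neg)
    # Numerically stable form of log(1 + exp(-diff)).
    if diff > 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))

def ranknet_query_loss(scores, labels, sigma=1.0):
    """Average pair loss over all pairs (i, j) with labels[i] > labels[j]."""
    total, n_pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                total += ranknet_pair_loss(scores[i], scores[j], sigma)
                n_pairs += 1
    return total / n_pairs if n_pairs else 0.0

# Hypothetical model scores and relevance labels for one query.
labels = [2, 1, 0]
print(ranknet_query_loss([3.0, 2.0, 1.0], labels))  # scores agree with labels
print(ranknet_query_loss([1.0, 2.0, 3.0], labels))  # scores reversed
```

The second call yields a larger loss than the first, because every pair is ordered incorrectly by the scores.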

Milestone 04 Hyper-parameter Tuning

Start: Oct. 18, 2023, midnight

Description: This is the last milestone, where you focus on improving your results further. Your pipeline is now complete, and it’s time to investigate where the largest improvements can be made. There are various options you can explore, such as:
- testing different pre-processing techniques and checking the result;
- tuning the hyper-parameters of your lexical rankers;
- testing different sets of features for your learning-to-rank model;
- tuning the hyper-parameters of your learning-to-rank model;
- testing the performance of your model with respect to different metrics.
At the end of this milestone, you will make the final submission.
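One simple way to tune hyper-parameters is an exhaustive grid search, sketched below in Python. The `grid_search` helper and the `toy_evaluate` function are hypothetical: in a real pipeline, the evaluation function would run your ranker on the validation set and return a metric such as nDCG@K, whereas the stand-in here just returns a made-up score that peaks at k1=1.2, b=0.75.

```python
from itertools import product

def grid_search(param_grid, evaluate):
    """Try every combination in param_grid and keep the best-scoring one.

    param_grid maps parameter name -> list of candidate values.
    evaluate is called with the parameters as keyword arguments and
    must return a score where higher is better.
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_evaluate(k1, b):
    """Stand-in for a validation run; NOT a real retrieval metric."""
    return -((k1 - 1.2) ** 2 + (b - 0.75) ** 2)

grid = {"k1": [0.9, 1.2, 1.5], "b": [0.5, 0.75, 1.0]}
best_params, best_score = grid_search(grid, toy_evaluate)
print(best_params, best_score)
```

Tuning on the validation set and reporting on held-out data helps avoid overfitting the hyper-parameters to the queries used for tuning.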

Competition Ends

Oct. 31, 2023, midnight
