Semestral Projects
Project Topics
Each group will be allocated a unique project topic. We strongly encourage groups to choose one of the topics listed below. You may also propose a topic of your own; it will be reviewed by the teachers' committee and may be rejected or amended.
- Generating personalised diets: the food dataset consists of culinary recipes and review texts from Food.com. The objective is to develop personalised diets based on the user’s preferences. (Rodrigo)
- Debiasing review-based explicit feedback: users rate items at their own discretion, and some users are known to be more generous than others. For instance, user A may give a product five stars with the comment "This is a nice product", whereas user B may give the same product three stars with the very same comment. This project aims to develop NLP algorithms that remove this bias from user ratings based on the users' reviews. (Rodrigo)
- (Deep) Inductive matrix factorization for cold start users (items) with implicit feedback: In a cold start scenario, a recommender must recommend items (users) for users (items) for which it has never observed any interaction. The objective of this project is to develop a (deep) matrix factorization model for cold start scenarios. (Rodrigo)
- Comparison of optimization algorithms: Optimization algorithms perform very differently across problems and neural network architectures. The objective of the project is to compare multiple optimization algorithms on different problems. Compare SGD, Adam, Lion (https://arxiv.org/abs/2302.06675), and one second-order method implemented with BackPACK (https://github.com/f-dangel/backpack). The comparison should cover a few problems: CIFAR-10, some recurrent networks, and transformers. We want to compare training speed, training and test accuracy, and memory requirements. (Petr)
- Image superresolution: Satellite data and weather predictions usually come in various spatial resolutions. Can we increase the resolution of weather predictions without a very difficult numerical procedure? The objective of this project is to try to improve the superresolution method applied to precipitation (https://gmd.copernicus.org/articles/14/6355/2021) using a method from https://arxiv.org/pdf/2209.13131.pdf. (Petr)
- Interpretable & Explainable Regression: Why should we stop using black-box models (https://hdsr.mitpress.mit.edu/pub/f9kuryi8/release/8)? Because they can fail spectacularly, with severe consequences, e.g. in healthcare or criminal justice, and no one will notice. The aim of this project is to pick suitable datasets and compare the performance of SOTA black-box models, their explanations, and interpretable models. (Vojtech)
- Causal Discovery from Observational Data: Understanding causal relationships can help us build more robust models that can make accurate predictions and provide better explanations for their predictions. In many fields, such as medicine, economics, and social science, understanding causal relationships is essential for making effective interventions and improving outcomes. The task of this project is to pick suitable datasets and try SOTA methods and tools to discover causal relations from observational data, i.e. data that was not collected with causal discovery in mind. (Vojtech)
- Reconstruction of 3D scans of human pelvic bones: The goal of this project is to design a neural network that can reconstruct the entire pelvic bone from bone fragments. We assume the neural network would be trained on a set of 3D models of complete pelvic bones and tested on virtually fragmented identical pelvic bones. In cooperation with the Faculty of Science, Charles University (full description in Czech). (Zdenek)
- Self-parking controller: Evolve an automatic, neural-network-based self-parking controller for the JetRacer platform (https://www.waveshare.com/jetracer-ai-kit.htm). The car has a Jetson Nano onboard computer and one front-facing RGB camera. The idea is to mark a parking spot with a high-contrast marker (e.g. a white sticker) and let the car automatically find the spot and park. (Zdenek)
- LLM: Explore and demonstrate the effects of fine-tuning on a selected large language model. Select one of the smaller language models and a corpus, and demonstrate the effects of fine-tuning on various metrics. (Mirek)
- Retrieval Augmented Generation (RAG): Explore options and approaches to fact-based text generation. Focus on the accuracy and veracity of the generated responses. Review and explore different options for accurate document subset selection. (Mirek)
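As a point of reference for the debiasing topic above, the generosity effect it describes (user A giving five stars where user B gives three) can be illustrated with a simple rating-only baseline: standardising each user's ratings by that user's own mean and spread. This is only a minimal sketch under stated assumptions, not the NLP approach the project asks for; the function name `debias_ratings` and the `(user, item, stars)` input format are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def debias_ratings(ratings):
    """Standardise ratings per user (hypothetical helper, illustration only).

    ratings: list of (user, item, stars) tuples.
    Returns a list of (user, item, z) tuples, where z is the rating
    expressed relative to that user's own mean and standard deviation.
    """
    # Collect every rating given by each user.
    by_user = defaultdict(list)
    for user, _item, stars in ratings:
        by_user[user].append(stars)

    # Per-user mean and population standard deviation.
    # A user who always gives the same score has zero spread;
    # fall back to 1.0 to avoid division by zero.
    stats = {}
    for user, values in by_user.items():
        sigma = pstdev(values)
        stats[user] = (mean(values), sigma if sigma > 0 else 1.0)

    return [
        (user, item, (stars - stats[user][0]) / stats[user][1])
        for user, item, stars in ratings
    ]


if __name__ == "__main__":
    # Assumed toy data: A is generous, B is strict, but both rank
    # the products in the same order.
    toy = [
        ("A", "p1", 5), ("A", "p2", 4), ("A", "p3", 5),
        ("B", "p1", 3), ("B", "p2", 2), ("B", "p3", 3),
    ]
    for row in debias_ratings(toy):
        print(row)
```

On the toy data, both users end up with matching standardised scores for each product even though their raw star ratings differ by two. A review-based method would go further by using the review text itself, which this baseline ignores.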
Formal Requirements and Dates
The final project submission consists of a report, a presentation, and the code.
The report, in 4-5 pages, should summarise your project and specifically answer the following:
- what problem you are solving and what data you are using,
- what approaches you tried and why,
- how they worked, with any supporting material (graphs, tables),
- what conclusions you can draw from the experiments.
The presentation will take place in the last week of the semester (or just before the first exam). It should summarise your report in 10-15 minutes.
Project Grades
The project will be graded as follows:
- Final report: 20 points
- Implementation / source code: 20 points
- Final presentation: 10 points