This repository contains my completed recruitment task, divided into three main sections. Each task has been broken down into subtasks, with detailed instructions and outputs documented in the individual README files within the respective task directories.
This task involved building computer vision and NLP models on the provided datasets for a Kaggle competition. Both tasks were completed successfully.
I developed a CNN model for DeepFake detection. I initially assigned the wrong labels to the images, but corrected this with simple post-prediction logic (inverting the predicted labels), avoiding the need for retraining. I also focused on weight initialization: at first, the model was stuck at 50% validation accuracy with no improvement in loss during training. After experimenting with weight initialization techniques, however, validation accuracy jumped to 95%, confirming how important initialization was in this project.
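The exact initialization scheme used is not recorded here; a common choice for ReLU-based CNNs is He (Kaiming) initialization, which scales weight variance to the layer's fan-in. A minimal NumPy sketch (the layer sizes are illustrative, not those of the actual model):

```python
import numpy as np

def he_init(fan_in, fan_out, seed=0):
    """He initialization: weights drawn from N(0, 2/fan_in), which keeps
    activation variance roughly stable through ReLU layers."""
    rng = np.random.default_rng(seed)
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))

# Hypothetical dense layer: 512 inputs, 256 outputs
W = he_init(512, 256)
print(W.std())  # close to sqrt(2/512) ≈ 0.0625
```

With a poor initialization (e.g. all zeros or too-large variance), gradients vanish or explode and accuracy can plateau at chance level, which matches the 50% behavior described above.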
For the NLP subtask, I built a Logistic Regression classifier and tuned its hyperparameters with GridSearchCV to improve performance.
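A minimal version of that tuning setup might look like the following; the synthetic dataset and parameter grid here are placeholders, not the actual competition data or search space:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the competition dataset
X, y = make_classification(n_samples=300, n_features=20, random_state=42)

# Illustrative grid over regularization strength
param_grid = {"C": [0.01, 0.1, 1, 10]}
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

GridSearchCV exhaustively cross-validates every grid point, so the chosen `C` is the one with the best mean fold accuracy rather than a single train/test split.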
This task involved developing models for image enhancement, with a comparison of different approaches based on evaluation metrics.
I trained a Variational Autoencoder (VAE) model to enhance underwater images, improving their clarity and quality.
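A VAE is trained on an ELBO-style objective: a reconstruction term plus a KL-divergence term that keeps the latent distribution close to a standard normal. A NumPy sketch of that loss, assuming a diagonal-Gaussian latent and mean-squared-error reconstruction (which may differ from the exact setup used here):

```python
import numpy as np

def vae_loss(x, x_recon, mu, logvar, beta=1.0):
    """ELBO-style VAE loss: MSE reconstruction + KL(q(z|x) || N(0, I))."""
    recon = np.mean((x - x_recon) ** 2)
    # Closed-form KL between N(mu, exp(logvar)) and the standard normal
    kl = -0.5 * np.mean(1 + logvar - mu**2 - np.exp(logvar))
    return recon + beta * kl

x = np.ones((4, 8))
# Perfect reconstruction with a standard-normal latent gives zero loss
print(vae_loss(x, x, np.zeros((4, 2)), np.zeros((4, 2))))  # 0.0
```

The KL term is what distinguishes a VAE from a plain autoencoder: it regularizes the latent space so that sampling from it produces coherent outputs.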
I implemented a Generative Adversarial Network (GAN) model on the MNIST dataset, successfully generating new digit samples as part of this experiment.
For this subtask, I implemented a GAN model on the provided dataset. One key aspect of this project was working with a specified pix2pix loss function, which was fascinating to learn and apply.
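The pix2pix generator objective augments the conditional-GAN loss with an L1 term that pulls outputs toward the ground-truth image. A NumPy sketch of that combined loss (λ = 100 follows the original pix2pix paper; the value actually specified for this task is not recorded here):

```python
import numpy as np

def pix2pix_generator_loss(d_fake, fake, target, lam=100.0):
    """Generator loss from pix2pix: adversarial BCE (wants D(fake) -> 1)
    plus a lambda-weighted L1 distance to the target image."""
    eps = 1e-8
    adv = -np.mean(np.log(d_fake + eps))   # BCE with labels = 1
    l1 = np.mean(np.abs(fake - target))    # L1 reconstruction term
    return adv + lam * l1

fake = np.full((2, 4, 4), 0.5)
target = np.full((2, 4, 4), 0.5)          # perfect match: L1 term is zero
d_fake = np.full((2, 1), 0.9)             # discriminator scores for fakes
print(pix2pix_generator_loss(d_fake, fake, target))
```

The L1 term discourages the blurring and mode-dropping that a purely adversarial loss can produce, which is why pix2pix works well for paired image-to-image tasks like enhancement.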
I implemented a Diffusion Model for image enhancement. The base architecture was provided, and I built on that to enhance underwater images.
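The forward (noising) process of a DDPM-style diffusion model has a closed form, x_t = √(ᾱ_t)·x_0 + √(1−ᾱ_t)·ε, so any timestep can be sampled in one step. A NumPy sketch using a generic linear noise schedule (the schedule and timestep count of the provided architecture may differ):

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)        # linear noise schedule (illustrative)
alphas_bar = np.cumprod(1.0 - betas)      # cumulative signal retention

def q_sample(x0, t, rng=None):
    """Sample x_t from q(x_t | x_0) in a single closed-form step."""
    rng = rng or np.random.default_rng(0)
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

x0 = np.zeros((8, 8))
x_late = q_sample(x0, T - 1)   # at large t, x_t is nearly pure noise
```

Training then amounts to teaching a network to predict the noise ε from x_t and t, and enhancement runs the learned reverse process from a noisy input.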
In this task, I developed a complete Retrieval-Augmented Generation (RAG) system. One challenge I faced was trying to access the Llama model through an API key, which required access to a gated repository. After realizing this approach would not work, I pulled the Llama 3 model locally using Ollama and integrated it with LangChain. This task was a deep dive into RAG systems, LangChain, and AI agents, and it significantly enhanced my understanding of these technologies.
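At its core, the retrieval step of a RAG system scores document chunks against a query embedding and passes the top matches to the LLM as context. A dependency-free sketch of that step using cosine similarity; the actual system used LangChain with a locally pulled Llama 3 via Ollama, and the random toy embeddings below are placeholders for a real embedding model:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_vec, doc_vecs, docs, k=2):
    """Return the k chunks whose embeddings are most similar to the query."""
    scores = [cosine(query_vec, v) for v in doc_vecs]
    ranked = sorted(zip(scores, docs), key=lambda p: p[0], reverse=True)
    return [doc for _, doc in ranked[:k]]

docs = ["chunk about VAEs", "chunk about GANs", "chunk about RAG"]
rng = np.random.default_rng(0)
doc_vecs = [rng.standard_normal(16) for _ in docs]
query_vec = doc_vecs[2] + 0.1 * rng.standard_normal(16)  # query near the RAG chunk
top = retrieve(query_vec, doc_vecs, docs)
print(top[0])  # "chunk about RAG"
```

In the full pipeline, the retrieved chunks are interpolated into the prompt, so the local Llama 3 model answers grounded in the indexed documents rather than from its weights alone.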
This recruitment task provided an incredible learning experience, allowing me to dive into various new topics. I gained practical coding experience and developed real-world models in just a week—an opportunity that would have otherwise taken much longer if I had tackled these topics individually. It also highlighted the importance of applying theoretical knowledge in a hands-on setting.
Thank you for this opportunity!