image_caption_multi-label_classification

COMP5329 Deep Learning - Assignment 2

15 Apr 2021 - 31 May 2021

The aim of this study is to apply deep learning algorithms that could achieve high performance to classify images based on image information and caption into multiple labels. (The size of the model required less than 100MB.)

Applied fusion techniques to combine a pre-trained image-based classification model EfficientNet-B4 with a text-based classification model Bi-LSTM to make the classifier gain information from both image and caption contents.

Achieved a mean F1-score of approximately 90 within a reasonable amount of time and the final model size is 80.1MB. This project obtained a High Distinction mark.(98/100)

Code Instructions

There are four Python notebook files:

Best_Model_Full_Version.ipynb
Best_Classifier_Model.ipynb
Best_Image_Model_EffNetB4.ipynb
Best_Text_Model_Bi_LSTM_5329A2.ipynb

The "Best_Model_Full_Version.ipynb" file can automatically download the relevant models which are pre-trained in the other three files.

Please run the code with the following steps:

Upload the "Best_Model_Full_Version.ipynb" file to the Google Colab and open it.
Click "Run all" in the "Runtime" tab or press "cmd/Ctrl+F9" to run code blocks from the beginning.
In the "load and process dataset" section under the "Preparation" section, click the given link and follow the instruction to sign in to your Google Account, then copy and paste the google verification code back to the input area, the whole dataset will be downloaded and imported automatically.
The following sections will be run automatically and the running process should be completed within approximate 50 minutes. After that, you supposed to see the generated CSV file with predicted labels called "Predicted_labels.csv" in the "Files" area which is on the left-hand side.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

image_caption_multi-label_classification

COMP5329 Deep Learning - Assignment 2

15 Apr 2021 - 31 May 2021

Code Instructions

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
Best_Classifier_Model.ipynb		Best_Classifier_Model.ipynb
Best_Image_Model_EffNetB4.ipynb		Best_Image_Model_EffNetB4.ipynb
Best_Model_Full_Version.ipynb		Best_Model_Full_Version.ipynb
Best_Text_Model_Bi_LSTM_5329A2.ipynb		Best_Text_Model_Bi_LSTM_5329A2.ipynb
README.md		README.md

janerjzou/image_caption_multi-label_classification

Folders and files

Latest commit

History

Repository files navigation

image_caption_multi-label_classification

COMP5329 Deep Learning - Assignment 2

15 Apr 2021 - 31 May 2021

Code Instructions

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages