Import Libraries:
- Import necessary libraries including TensorFlow, Keras layers and models, ImageDataGenerator for data augmentation, Matplotlib for plotting, and NumPy for numerical operations.
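A minimal sketch of these imports, assuming TensorFlow's bundled Keras; the aliases are illustrative choices:

```python
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator
```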
Load CIFAR-10 Dataset:
- Load the CIFAR-10 dataset, which consists of 60,000 32x32 color images in 10 classes (6,000 images per class), split into 50,000 training and 10,000 test images.
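A sketch of the loading step using the Keras dataset loader (variable names such as x_train are illustrative):

```python
# Keras downloads CIFAR-10 on first use and returns the standard train/test split.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
print(x_train.shape, x_test.shape)  # (50000, 32, 32, 3) (10000, 32, 32, 3)
```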
Normalize Images:
- Normalize pixel values of the images to be between 0 and 1 for better convergence during training. This is done by dividing the pixel values by 255.
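A short sketch of the normalization, continuing with the arrays loaded above:

```python
# Scale pixel values from [0, 255] to [0, 1].
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
```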
Split Training Data:
- Use train_test_split from scikit-learn to split the training data into training and validation sets. Set aside 20% of the training data for validation purposes; this helps in monitoring the model's performance on unseen data during training.
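A sketch of the split, assuming the arrays from the previous steps; the random_state value is an arbitrary choice for reproducibility:

```python
from sklearn.model_selection import train_test_split

# Hold out 20% of the training images (and their labels) for validation.
x_train, x_val, y_train, y_val = train_test_split(
    x_train, y_train, test_size=0.2, random_state=42
)
```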
Build the CNN Model:
- Initialize a Sequential model and add the following layers (a code sketch follows this list):
  - First Convolutional Layer: a Conv2D layer with 32 filters, a 3x3 kernel, ReLU activation, and an input shape of (32, 32, 3), followed by a MaxPooling2D layer with a 2x2 pool size.
  - Second Convolutional Layer: a Conv2D layer with 64 filters, a 3x3 kernel, and ReLU activation, followed by a MaxPooling2D layer with a 2x2 pool size.
  - Third Convolutional Layer: a Conv2D layer with 64 filters, a 3x3 kernel, and ReLU activation, followed by a MaxPooling2D layer with a 2x2 pool size.
  - Flatten Layer: flatten the output of the convolutional layers so it can be fed into the dense layers.
  - Dense Layer: a Dense layer with 64 units and ReLU activation.
  - Output Layer: a Dense layer with 10 units (one for each CIFAR-10 class) and softmax activation to output a probability for each class.
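A sketch of the architecture described above, using the layers and models modules imported earlier:

```python
model = models.Sequential([
    # First convolutional block: 32 filters, 3x3 kernel, ReLU, 2x2 max pooling.
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),
    # Second convolutional block: 64 filters.
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    # Third convolutional block: 64 filters.
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    # Classifier head.
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
```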
Compile the Model:
- Compile the model with the Adam optimizer, sparse categorical cross-entropy loss (suitable for integer labels), and accuracy as the evaluation metric.
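A sketch of the compile call matching this description:

```python
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",  # labels are integers 0-9, not one-hot vectors
    metrics=["accuracy"],
)
```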
Show Model Summary:
- Display the model architecture with model.summary() to visualize the layer structure and parameter counts.
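For completeness, the call itself:

```python
model.summary()  # prints each layer's output shape and parameter count
```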
Data Augmentation:
- Create an ImageDataGenerator instance to perform data augmentation. This includes random rotations, shifts, shear, zoom, and horizontal flips to artificially expand the training dataset and improve model generalization.
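A sketch of the generator; the specific ranges below are illustrative values, not prescribed settings:

```python
datagen = ImageDataGenerator(
    rotation_range=15,        # random rotations (degrees)
    width_shift_range=0.1,    # horizontal shifts (fraction of width)
    height_shift_range=0.1,   # vertical shifts (fraction of height)
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
)
```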
Train the Model:
- Fit the model using the augmented data generated by ImageDataGenerator. Train for 30 epochs, with steps per epoch calculated as the number of training samples divided by the batch size, and validate on the validation dataset created earlier.
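A sketch of the training call, assuming an illustrative batch size of 64:

```python
batch_size = 64  # assumed value; any reasonable batch size works

history = model.fit(
    datagen.flow(x_train, y_train, batch_size=batch_size),
    steps_per_epoch=len(x_train) // batch_size,
    epochs=30,
    validation_data=(x_val, y_val),
)
```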
Evaluate the Model:
- Evaluate the trained model on the test dataset to calculate the test loss and accuracy. Print the test accuracy.
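A sketch of the evaluation step:

```python
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print(f"Test accuracy: {test_acc:.4f}")
```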
Plot Training and Validation Accuracy/Loss:
- Plot the training and validation accuracy and loss over epochs using Matplotlib to visualize the model's performance and check for overfitting or underfitting.
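A sketch of the plots; the history object comes from model.fit() above, and the metric keys follow Keras' default naming:

```python
acc = history.history["accuracy"]
val_acc = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs_range = range(1, len(acc) + 1)

plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label="training accuracy")
plt.plot(epochs_range, val_acc, label="validation accuracy")
plt.xlabel("epoch")
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label="training loss")
plt.plot(epochs_range, val_loss, label="validation loss")
plt.xlabel("epoch")
plt.legend()
plt.show()
```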
Save the Trained Model:
- Save the trained model to a file named cnn_image_classification_model.h5 for later use.
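The save step is a single call:

```python
model.save("cnn_image_classification_model.h5")  # legacy HDF5 format, matching the filename above
```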
Load the Saved Model:
- Load the saved model from the file to make predictions on new data.
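And the corresponding load (the loaded_model name is illustrative):

```python
loaded_model = tf.keras.models.load_model("cnn_image_classification_model.h5")
```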
Preprocess and Predict New Images:
- Define functions to preprocess new images (resize to 32x32, convert to array, and normalize) and to predict the class of a new image using the trained model.
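A sketch of these helpers; the function names preprocess_image and predict_image, and the commented example path, are illustrative placeholders:

```python
def preprocess_image(image_path):
    """Load an image file, resize it to 32x32, and scale pixels to [0, 1]."""
    img = tf.keras.preprocessing.image.load_img(image_path, target_size=(32, 32))
    arr = tf.keras.preprocessing.image.img_to_array(img) / 255.0
    return np.expand_dims(arr, axis=0)  # add a batch dimension

# CIFAR-10 class names in label order.
class_names = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

def predict_image(image_path, model):
    """Return the predicted class name for a single image file."""
    probs = model.predict(preprocess_image(image_path))
    return class_names[int(np.argmax(probs))]

# Example usage (uncomment and point at a real file):
# print(predict_image("path/to/your_image.jpg", loaded_model))
```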
Example Predictions:
- Uncomment and modify the lines provided to test the prediction function on new images by specifying the image paths. The predicted class is printed for each image.
This step-by-step explanation covers the entire process, from data loading and preprocessing, through model building and training, to saving, loading, and using the trained model for predictions.