Computer Vision Model

Training and validating a model to make a prediction

Overview

This computer vision model is intended for agriculture: farmers can photograph their plants, upload the pictures to our website, and receive an assessment of their crops' health. We trained the image classification model on a public Kaggle dataset that contains images of healthy and infected plants. We chose a convolutional neural network, which reaches roughly 90% accuracy. Because the original images are large, we resized them to shorten the running time and to fit the data in memory. We then normalized the pixel values so that they all fall between 0 and 1.

Data

We found a corn leaf infection dataset on Kaggle, a public data-sharing platform. It contains images of healthy and infected plants in both portrait and landscape orientations. The dataset is 14 GB, so we resized every picture to shorten the running time, reducing each photo's width and height to about one twelfth of the original. The dataset has approximately 1000 healthy images and slightly over 1100 infected images.
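To make the effect of the resizing concrete, here is a minimal sketch of the size arithmetic. The function name, the example resolution of 4032 x 3024, and the exact factor of 12 are illustrative assumptions, since the text only says "about 12 times smaller" and does not give the original image dimensions.

```python
def downscaled_size(width, height, factor=12):
    """Return the (width, height) obtained by shrinking each side by `factor`.

    Hypothetical helper: the factor of 12 follows the description above,
    but the project's actual resizing code is not shown in this document.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    # Keep at least one pixel per side so very small images stay valid.
    return max(1, round(width / factor)), max(1, round(height / factor))


# Assumed example resolution for a landscape photo (not taken from the dataset).
original = (4032, 3024)
small = downscaled_size(*original)

# Shrinking both sides by 12 cuts the pixel count by 12 * 12 = 144.
pixel_ratio = (original[0] * original[1]) / (small[0] * small[1])
print(small, pixel_ratio)
```

Because both sides shrink, the memory needed per image falls by roughly the square of the factor, which is why resizing helps so much with running time. With an imaging library such as Pillow, the resulting tuple could be passed to `Image.resize`.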

Processing

The next step was normalizing the data, which scaled all pixel values to the range 0 to 1. The model is a convolutional neural network with three max-pooling layers and a dropout layer; adding the dropout layer addressed the overfitting we had observed. Dense layers then classify each image into one of two categories (infected or healthy). The data was split into three subsets: 50% training, 20% validation, and 30% scoring. Our final accuracy was 91%.
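The normalization and the 50/20/30 split described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's actual pipeline: the helper names, the file names, the random seed, the exact class counts of 1000 and 1100, and the assumption of 8-bit pixels (maximum value 255) are all hypothetical.

```python
import random


def normalize(pixels, max_value=255):
    """Scale raw pixel intensities to the range 0 to 1.

    Assumes 8-bit images, so the largest possible value is 255.
    """
    return [p / max_value for p in pixels]


def split_dataset(items, train_pct=50, val_pct=20, seed=0):
    """Shuffle the items and split them into training, validation, and scoring sets.

    The scoring set receives whatever remains after the first two subsets,
    which is 30% with the default percentages.
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    n_train = len(items) * train_pct // 100
    n_val = len(items) * val_pct // 100
    return (
        items[:n_train],
        items[n_train:n_train + n_val],
        items[n_train + n_val:],
    )


# Placeholder samples as (file name, label) pairs: 0 = healthy, 1 = infected.
samples = [(f"healthy_{i}.jpg", 0) for i in range(1000)]
samples += [(f"infected_{i}.jpg", 1) for i in range(1100)]

train_set, val_set, score_set = split_dataset(samples)
print(len(train_set), len(val_set), len(score_set))
print(normalize([0, 128, 255]))
```

Shuffling before splitting matters here because the placeholder list is ordered by class; without it, the training set would contain almost only healthy images.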