Flora.ai: Developed for Smart India Hackathon 2020


Flora.ai was developed for the Smart India Hackathon 2020. The problem statement we received from ISRO (Indian Space Research Organisation) was as follows:

Develop a mobile application that can identify crop using only field photo of a crop. The team must target at-least 10 different crops for demonstration. The application will allow the user to take photos and automatically identify the crop. The photo and crop information along with geolocation information should be stored in an internal database that can be exported/emailed.

The problem statement itself was clear and simple: we needed a mobile application that could identify crops from a field photo. The problems that came to mind were two-fold. First, a convolutional neural network (CNN) is the natural way to identify crops from photos, and training one needs a large enough dataset of field crops. For the result to make sense in the Indian context, which was the basis of the problem statement, that dataset needs many field photos of important crops such as rice and wheat. Second, there are the logistics of integrating a CNN into a mobile app. The application must be small enough for an agricultural worker to download and use conveniently, even with a poor internet connection and a phone that isn't running on the best hardware.

A block diagram
Our team of 6 split the work into 4 major tasks for most of the duration. The first was to build an API in Flask that acts as middleware between the frontend and the ML backend. The second was dataset collection: a mixture of looking for datasets on Kaggle, scraping images from Getty and Google Images, and working on data augmentation and pre-processing to build a robust, useful dataset for as many crops as possible. The third and fourth tasks were building the app UI in Flutter and building the neural network for the images, respectively.
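To make the middleware role concrete, here is a minimal Flask sketch rather than the project's actual code. The `/predict` route, the `image` form field, the `CROP_LABELS` list and the `classify_crop` stub are all assumptions made for illustration; a real backend would decode the upload and run the trained model inside `classify_crop`.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical label list: the post names only these three of the 10 crops.
CROP_LABELS = ["potato", "tomato", "bell pepper"]


def classify_crop(image_bytes: bytes):
    """Placeholder for the ML backend (assumption, not the real model).

    A real implementation would decode the image and run the trained CNN.
    This stub always returns the first label with zero confidence.
    """
    return CROP_LABELS[0], 0.0


@app.route("/predict", methods=["POST"])
def predict():
    # The app is assumed to upload the photo as a multipart form field
    # named "image".
    upload = request.files.get("image")
    if upload is None:
        return jsonify(error="missing 'image' file field"), 400

    crop, confidence = classify_crop(upload.read())
    return jsonify(crop=crop, confidence=confidence)
```

Keeping the model behind an HTTP endpoint like this would mean the app only has to send a photo and read back a small JSON response, which is one way to keep the download size small.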

The dataset we ended up using was a modification of the PlantVillage dataset. PlantVillage is primarily a disease detection dataset, so we pooled all the images of each crop, healthy or diseased, into a single class per crop. This worked just fine, and it leaves room in the project's future scope for detecting plant diseases. We also took the liberty of stretching the definition of "field images of crops" to include images of crop leaves. The dataset contained images of 10 different plants (potato, tomato, bell pepper, etc.), most of which are relevant to Indian agriculture and thus made the cut.
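Pooling disease-level classes into crop-level labels can be sketched as below. This assumes the `<crop>___<condition>` folder naming used by common PlantVillage releases (for example `Tomato___Late_blight`); the helper name `crop_from_class_dir` and the sample folder list are made up for illustration.

```python
from collections import Counter


def crop_from_class_dir(class_dir: str) -> str:
    """Map a disease-level class folder name to its crop name.

    Assumes names of the form '<crop>___<condition>'.
    """
    crop = class_dir.split("___", 1)[0]
    # Normalise punctuation, e.g. 'Pepper,_bell' becomes 'pepper bell'.
    return crop.replace(",", "").replace("_", " ").strip().lower()


# Illustrative folder names, not the full dataset listing.
class_dirs = [
    "Potato___Early_blight",
    "Potato___healthy",
    "Tomato___Late_blight",
    "Tomato___healthy",
    "Pepper,_bell___Bacterial_spot",
]

# Count how many disease-level folders collapse into each crop label.
pooled = Counter(crop_from_class_dir(d) for d in class_dirs)
```

Because the original folder name is only mapped, not discarded, the disease labels remain available if the project later moves on to disease detection.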

The neural network used 4 Conv2D + MaxPooling2D blocks, whose output was then fed through 2 Dense layers (ReLU, then softmax). The dataset was split 80-10-10 using a train-test split. The final accuracy obtained for the model as a whole was 93%. Our approach was loosely based on the work of Wu et al. [1], who used a probabilistic neural network with image and data processing techniques to implement general-purpose automated leaf recognition for plant classification (of course, our usage and implementation were a lot more diluted).
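An 80-10-10 split can be sketched in plain Python as follows. This assumes the three parts are the training, validation and test sets, and it uses a seeded shuffle in place of whatever splitting utility was actually used; the function name `split_80_10_10` is hypothetical.

```python
import random


def split_80_10_10(items, seed=42):
    """Shuffle items reproducibly and cut them into 80% / 10% / 10%."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    n_train = n * 8 // 10  # integer arithmetic avoids float rounding
    n_val = n // 10

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Example: split 1000 image indices.
train_idx, val_idx, test_idx = split_80_10_10(range(1000))
```

In practice the split would be applied per crop so that every class is represented in all three parts, but that detail is not stated in the post.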

Admittedly, the model does look like beginner's work. It was one of my first projects in which I implemented a CNN from scratch, which meant visualising the layers used for feature extraction and also dealing with the inconsistencies of the dataset we had scraped together from Google Images and Getty before we settled on the PlantVillage dataset.

Finally, the app ended up looking like this.

The Prototype UI

References

[1] S. G. Wu, F. S. Bao, E. Y. Xu, Y. Wang, Y. Chang and Q. Xiang, "A Leaf Recognition Algorithm for Plant Classification Using Probabilistic Neural Network," 2007 IEEE International Symposium on Signal Processing and Information Technology, Giza, 2007, pp. 11-16, doi: 10.1109/ISSPIT.2007.4458016