This publication has active red flags for:

  • Misrepresentation

Plant Disease Prediction using Deep Learning

Publication type: Research Problem
CC BY 4.0
Peer Reviews (This Version): (0)
Peer Reviews (All Versions): (0)

Tomato plant disease detection

Sahil, Anshu Anand, Shallu Bashambu

Maharaja Agrasen Institute of Technology, Rohini, Delhi, India

Abstract: Tomato is a widely grown crop in India. It plays a significant role in agriculture because it matures quickly and gives a high yield. Tomatoes are a great source of vitamin C, potassium, and vitamin K, and they are also the major dietary source of the antioxidant lycopene, which is linked to many health benefits, including a reduced risk of cancer and heart disease. To increase tomato production, it is necessary to detect disease in the plants. Tomato plants are affected by many diseases, such as early blight, late blight, and leaf mold. Deep learning is useful in many sectors, and plant disease detection is one of them: using deep learning, we can detect the disease present in plants at an early stage so that it can be treated. The deep learning models EfficientNetV2B3, ResNet50, and VGG16 are applied to this problem. The three architectures are compared, and EfficientNetV2B3 gave better results than the other two models.

  1. Introduction

Plant diseases not only have a negative impact on agricultural production and pose a large-scale risk to food security, they are also harmful to small-scale farmers whose livelihoods depend on their crops [1]. Tomato is one of the most important food crops of India. It is grown on an area of 0.458 million ha, with a production of 7.277 million metric tons and a productivity of 15.9 metric tons/ha. The tomato crop is cultivated in all seasons, but typically during the winter and summer seasons. The crop cannot resist severe frost. It grows well under an average monthly temperature of 21°C to 23°C, but commercially it may be grown at temperatures ranging from 18°C to 27°C [2]. A tree affected by disease has stunted growth and may die within 6 years. The effects of this issue are evident in parts of the Southern US, including Alabama and Georgia. Early detection could have made a significant difference in these situations. Currently, the primary method for detecting plant diseases is visual inspection by experts. This approach requires a substantial team of specialists and constant plant monitoring, which can be quite costly, especially for large-scale farms. Furthermore, in some countries, farmers lack the necessary resources or knowledge to seek expert advice. As a result, consulting with experts can be both expensive and time-consuming [3].

Machine learning and artificial intelligence algorithms such as deep convolutional neural networks are helpful in many sectors, and plant disease detection is one of them, because they can learn features from an image automatically and use them to predict the disease in the plant [4].

  2. Machine learning techniques

Machine learning enables computers to learn from data, much as humans learn from experience, without being explicitly programmed for each task. With the help of mathematical models and statistics, machine learning systems can perform tasks that are usually done by humans. In a neural network model, the layers are interconnected, and the output of each layer acts as the input to the next layer [5]. There are several types of machine learning problems:

  1. Supervised learning: In this kind of problem, the system is provided with both input and its output labels. This means that the computer is trained on pre-labeled data, enabling it to learn patterns and make correct predictions.

  2. Unsupervised learning: In this type of problem, the system is provided with input data, but without any corresponding output labels. This implies that the system must independently discern patterns or structures within the input data.

  3. Semi-supervised learning: It’s a type of problem that uses a combination of a small amount of labeled data and a large amount of unlabeled data for training. The goal is to use the unlabeled data to improve the learning accuracy of the model trained with the labeled data. This approach is particularly useful when labeling data is costly or time-consuming.

  4. Reinforcement learning: It’s a type of problem where an agent learns to make decisions by taking actions in an environment to achieve a goal. The agent is rewarded or penalized with a reward signal for its actions, with the aim of maximizing its total reward.

  3. Tools and techniques

Transfer learning is a powerful machine learning technique that leverages knowledge gained while solving one problem and applies it to a different but related problem. This approach is particularly beneficial in scenarios where labeled data is scarce or when training a large model from scratch is computationally intensive. By utilizing models pre-trained on extensive datasets, we can significantly reduce the training time and computational resources required.

In our study, we applied transfer learning to plant disease detection. We selected the models EfficientNetV2B3, ResNet50, and VGG16 because of their proven track record in image classification tasks. These models have been pre-trained on the ImageNet dataset, which contains millions of images across thousands of categories.
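
A minimal sketch of such a transfer-learning setup in Keras, assuming TensorFlow 2.8 or later is installed. The function name `build_transfer_model`, the dropout rate, the optimizer, and the number of classes are illustrative assumptions and are not reported in the study; the 128x128 input size and the sparse categorical cross-entropy loss follow the preprocessing described for this work.

```python
import tensorflow as tf


def build_transfer_model(num_classes, weights="imagenet", image_size=128):
    """Frozen EfficientNetV2B3 backbone with a new classification head."""
    # Backbone without the original 1000-class ImageNet head.
    # With the Keras default include_preprocessing=True, the backbone expects
    # raw pixel values in [0, 255] and rescales them internally, so inputs
    # already scaled to [0, 1] would need include_preprocessing=False and a
    # matching rescaling step (an assumption to check against the real setup).
    base = tf.keras.applications.EfficientNetV2B3(
        include_top=False,
        weights=weights,
        input_shape=(image_size, image_size, 3),
        pooling="avg",
    )
    base.trainable = False  # keep the pre-trained features fixed

    inputs = tf.keras.Input(shape=(image_size, image_size, 3))
    x = base(inputs, training=False)
    x = tf.keras.layers.Dropout(0.2)(x)  # assumed regularization, not from the study
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",  # assumed optimizer
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Only the new dense head is trained here. Replacing `EfficientNetV2B3` with `ResNet50` or `VGG16` from `tf.keras.applications` would give the other two backbones, although each has its own input preprocessing convention.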

(1) Deep learning models: To solve the plant disease problem we used the models EfficientNetV2B3, ResNet50 and VGG16.

(i) EfficientNetV2B3: It is part of the EfficientNetV2 family, which is known for smaller models and faster training. These models train faster than previous state-of-the-art models while being up to 6.8x smaller. They focus on parameter efficiency, aiming to achieve higher accuracy with fewer parameters. Some more recent works aim to improve training or inference speed instead of parameter efficiency; examples include RegNet, ResNeSt, TResNet and EfficientNet-X [6].

(ii) ResNet50: ResNet stands for residual network. ResNet50 is 50 layers deep and was introduced in 2015 by Kaiming He et al. The model has 48 convolutional layers in its bottleneck blocks, preceded by one initial convolutional layer and one max pooling layer, and followed by one average pooling layer and a fully connected layer. It uses a bottleneck design for its building blocks, which includes 1x1 convolutions that reduce the number of parameters and matrix multiplications.
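
The parameter saving from the 1x1 convolutions can be illustrated with a small calculation. This is only a sketch: the channel widths 256 and 64 are taken as an example of a bottleneck block, biases and batch normalization are ignored, and the helper name `conv_weights` is hypothetical.

```python
def conv_weights(kernel, c_in, c_out):
    """Number of weights in a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


# Plain block: two 3x3 convolutions operating on all 256 channels.
plain = 2 * conv_weights(3, 256, 256)

# Bottleneck block: a 1x1 convolution reduces 256 -> 64 channels, the 3x3
# convolution works on 64 channels, and a 1x1 convolution restores 256.
bottleneck = (
    conv_weights(1, 256, 64)
    + conv_weights(3, 64, 64)
    + conv_weights(1, 64, 256)
)

print(plain, bottleneck, round(plain / bottleneck, 1))
# prints: 1179648 69632 16.9
```

Under these assumptions the bottleneck block needs roughly 17 times fewer weights than two full-width 3x3 convolutions.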

(iii) VGG16: It consists of 16 weight layers: 13 convolutional layers and 3 fully connected layers. The convolutional layers use 3x3 filters with a stride of 1 pixel and a padding of 1 pixel, while the max pooling layers use 2x2 filters with a stride of 2. Its simple, uniform design makes it easy to understand and implement.
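
As a rough illustration of what these filter settings do to the image size, the sketch below applies the standard output-size formula, floor((n + 2p - k) / s) + 1, to a 128x128 input (the resolution used in this work). It assumes the usual VGG16 layout of five convolutional blocks, each ending in one max pooling layer; the helper name `out_size` is hypothetical.

```python
def out_size(n, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1


size = 128  # input resolution
sizes = []
for _ in range(5):  # five convolutional blocks
    # A 3x3 convolution with stride 1 and padding 1 keeps the size unchanged,
    # so stacking several of them inside a block does not alter it.
    size = out_size(size, kernel=3, stride=1, padding=1)
    # The 2x2 max pool with stride 2 halves the size.
    size = out_size(size, kernel=2, stride=2, padding=0)
    sizes.append(size)

print(sizes)
# prints: [64, 32, 16, 8, 4]
```

So only the pooling layers shrink the feature maps, and a 128x128 image reaches the fully connected layers as a 4x4 grid.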

(2) Data preprocessing: The images used in the project were resized to a uniform size of 128x128 pixels, because a consistent input size is needed for effective training of the model. To optimize the training process, a batch size of 32 was used. This means that the model updates its weights after processing each batch of 32 images. The loss function used was sparse categorical cross-entropy, which is particularly suitable for multi-class classification problems where the classes are mutually exclusive and the labels are given as integer class indices. Additionally, the pixel values of the images were normalized to a range of 0 to 1. This normalization aids the convergence of the model during training.
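
The three numeric steps described here (scaling pixels to the range 0 to 1, batching by 32, and the sparse categorical cross-entropy loss) can be sketched in plain Python. The function names and the toy inputs are hypothetical and only illustrate the arithmetic; a real pipeline would use a deep learning framework's own utilities.

```python
import math


def normalize(pixels):
    """Scale 8-bit pixel values from [0, 255] to [0, 1]."""
    return [p / 255.0 for p in pixels]


def batches(items, batch_size=32):
    """Yield consecutive batches; the last one may be smaller."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def sparse_categorical_crossentropy(probs, label):
    """Loss for one sample: -log of the probability assigned to the true class.

    `label` is an integer class index, not a one-hot vector.
    """
    return -math.log(probs[label])


# 70 toy samples split into batches of 32.
print([len(b) for b in batches(list(range(70)))])
# prints: [32, 32, 6]

# Predicted class probabilities for one image whose true class index is 1.
print(round(sparse_categorical_crossentropy([0.1, 0.7, 0.2], 1), 4))
# prints: 0.3567
```

The loss falls towards 0 as the probability assigned to the correct class approaches 1.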

(3) Performance metrics: The EfficientNetV2B3 model demonstrated strong performance across key metrics. It achieved an accuracy of 0.966, meaning its predictions were correct about 96.6% of the time. The recall and precision were approximately 0.965 and 0.966 respectively: the recall indicates that a high share of the actual cases of each class were identified, and the precision indicates that a high share of the predictions for each class were correct. The F1 score was 0.965, reflecting a balance between precision and recall. The confusion matrix further confirmed the model’s effective performance across all classes.
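
For reference, these metrics can all be derived from a confusion matrix. The sketch below assumes macro averaging over classes, which the text does not specify, and uses a made-up two-class matrix; the function name `macro_metrics` is hypothetical.

```python
def macro_metrics(cm):
    """Accuracy and macro-averaged precision, recall and F1.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)

    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy matrix, not data from the study.
toy = [[9, 1],
       [2, 8]]
accuracy, precision, recall, f1 = macro_metrics(toy)
print(accuracy, recall)
# prints: 0.85 0.85
```

Other averaging schemes, such as weighting each class by its number of samples, would give slightly different precision, recall and F1 values on imbalanced data.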

  4. Conclusion

This project demonstrated the effective application of the EfficientNetV2B3 model in predicting various tomato plant diseases, with a high test accuracy of 96.55%. The ability to accurately identify diseases such as mosaic virus and yellow leaf curl virus underscores the significant role of such technological advancements in preventing potential crop losses. The use of Convolutional Neural Networks (CNNs) and machine learning in this context marks a notable shift towards technologically driven agriculture. Looking ahead, there is considerable potential for further research in enhancing the robustness of the model, expanding the scope of the dataset, and developing real-time disease detection systems. This project thus stands as a testament to the transformative power of machine learning in modern agriculture.


No sources of funding have been specified for this Research Problem.

Conflict of interest

This Research Problem does not have any specified conflicts of interest.