CitDet: A Benchmark Dataset for
Citrus Fruit Detection

Jordan A. James1, Heather K. Manching2, Matthew R. Mattia3, Kim D. Bowman3,
Amanda M. Hulse-Kemp2,4, and William J. Beksi1
The University of Texas at Arlington1, Robotic Vision Laboratory
North Carolina State University2, Hulse-Kemp Laboratory
Subtropical Insects and Horticulture Research Unit, USDA Agricultural Research Service3
Genomics and Bioinformatics Research Unit, USDA Agricultural Research Service4

Abstract

Although significant progress has been made in solving the fruit detection problem, the lack of publicly available datasets has complicated direct comparison of results. For instance, citrus detection has long been of interest to the agricultural research community, yet little work exists in this area, particularly work involving public datasets of citrus affected by Huanglongbing (HLB). To address this issue, we enhance state-of-the-art object detection methods for use in typical orchard settings. Concretely, we provide high-resolution images of citrus trees located in an area known to be highly affected by HLB, along with high-quality bounding box annotations of citrus fruit. Fruit on both the trees and the ground are labeled to allow for identification of fruit location, which contributes to advances in yield estimation and provides a potential measure of HLB impact via fruit drop.


Dataset

The CitDet dataset is composed of images captured at the USDA Agricultural Research Service Subtropical Insects and Horticulture Research Unit in Fort Pierce, Florida, between October 2021 and October 2022. The orchard contains a large assortment of citrus tree species and is used for genomics and phenotyping research involving citrus infected with HLB. The dataset consists of over 32,000 bounding box annotations for fruit instances contained in 579 high-resolution images. While imaging the trees, we held the camera in portrait orientation, centered directly on the tree of interest. All images were taken at the edge of the soil in the tree row to simulate a ground-based robot imaging the tree while moving between two rows of trees.
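
Because fruit on the trees and fruit on the ground are labeled separately, the annotations for one image can be reduced to a rough fruit drop indicator. The following is a minimal sketch, not the authors' method. It assumes the annotations have already been loaded into (label, x_min, y_min, x_max, y_max) tuples, and the label strings `fruit_on_tree` and `fruit_on_ground` are hypothetical placeholders; the actual class names and on-disk annotation format of the CitDet files may differ.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical class names: the real label strings in CitDet may differ.
TREE_LABEL = "fruit_on_tree"
GROUND_LABEL = "fruit_on_ground"

# Assumed in-memory annotation layout: (label, x_min, y_min, x_max, y_max).
Box = Tuple[str, float, float, float, float]


def fruit_drop_fraction(boxes: Iterable[Box]) -> float:
    """Return the share of labeled fruit that lies on the ground.

    Returns 0.0 when the image contains no fruit of either class.
    """
    counts = Counter(label for label, *_ in boxes)
    total = counts[TREE_LABEL] + counts[GROUND_LABEL]
    if total == 0:
        return 0.0
    return counts[GROUND_LABEL] / total


# Toy annotations for one image: three fruit on the tree, one on the ground.
example = [
    (TREE_LABEL, 10, 20, 50, 60),
    (TREE_LABEL, 100, 40, 140, 85),
    (TREE_LABEL, 200, 30, 240, 70),
    (GROUND_LABEL, 30, 900, 70, 940),
]
print(fruit_drop_fraction(example))  # 0.25
```

The fraction counts only visible, labeled fruit, so it is best read as a per-image proxy and not as a calibrated measure of fruit drop.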

Data was collected over the course of one year to allow the fruit to be imaged at different stages of the ripening cycle during citrus production from October to March. Due to the nature of HLB and the mixture of fruit maturity, the dataset contains fruit of different colors and sizes. All trees were imaged from both the sunny and shady sides of the tree row, and imaging was done over multiple days to account for variations in weather and lighting conditions. The imaged trees consist of over 50 different varieties of citrus. The file name of each image includes a timestamp, the tree ID, and the side of the tree imaged.
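
Since the timestamp, tree ID, and imaged side are all encoded in the file name, they can be recovered without opening the image. The sketch below illustrates one way to do this. The file name layout it expects (`YYYYMMDD_HHMMSS_<treeID>_<side>.<ext>`, with side being `sunny` or `shady`) is an assumption made purely for illustration; the actual CitDet naming scheme is not specified here, so the pattern should be adjusted to match the downloaded files.

```python
import re
from datetime import datetime
from typing import NamedTuple


class ImageInfo(NamedTuple):
    timestamp: datetime
    tree_id: str
    side: str


# ASSUMED layout (hypothetical): 20211015_093000_T042_sunny.jpg
# Adjust this pattern to the real CitDet file names.
_PATTERN = re.compile(
    r"^(?P<stamp>\d{8}_\d{6})_(?P<tree>[^_]+)_(?P<side>sunny|shady)\.\w+$"
)


def parse_image_name(name: str) -> ImageInfo:
    """Split a file name into timestamp, tree ID, and imaged side."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized file name: {name!r}")
    stamp = datetime.strptime(match["stamp"], "%Y%m%d_%H%M%S")
    return ImageInfo(stamp, match["tree"], match["side"])


# Hypothetical file name used only to exercise the parser.
info = parse_image_name("20211015_093000_T042_sunny.jpg")
print(info.tree_id, info.side, info.timestamp.isoformat())
```

Parsed this way, images can be grouped by tree or by side, for example to pair the sunny and shady views of the same tree.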


Citation

If you find this project useful, then please consider citing both our paper and dataset.

@article{james2024citdet,
  title={CitDet: A Benchmark Dataset for Citrus Fruit Detection},
  author={James, Jordan A and Manching, Heather K and Mattia, Matthew R and Bowman, Kim D and Hulse-Kemp, Amanda M and Beksi, William J},
  journal={arXiv preprint arXiv:2309.05645},
  year={2024}
}
@data{T8/QFVHQ5_2024,
  title={{CitDet}},
  author={James, Jordan A and Manching, Heather K and Mattia, Matthew R and Bowman, Kim D and Hulse-Kemp, Amanda M and Beksi, William J},
  publisher={Texas Data Repository},
  version={V1},
  url={https://doi.org/10.18738/T8/QFVHQ5},
  doi={10.18738/T8/QFVHQ5},
  year={2024}
}

License

The source code associated with this project is licensed under the Apache License, Version 2.0. The CitDet dataset is available for non-commercial use under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License ("CC BY-NC-SA 4.0").