Presentation + Paper
7 June 2024
Combining simulated data, foundation models, and few real samples for training object detectors
Friso G. Heslinga, Thijs A. Eker, Ella P. Fokkinga, Jan Erik van Woerden, Frank A. Ruis, Richard J. M. den Hollander, Klamer Schutte
Abstract
Automatic object detection is increasingly important in the military domain, with potential applications including target identification, threat assessment, and strategic decision-making. Deep learning has become the standard methodology for developing object detectors, but obtaining the large set of training images it requires can be challenging due to the restricted nature of military data. Moreover, for meaningful deployment, an object detection model needs to work in a variety of environments and conditions in which prior data acquisition might not be possible. Simulated data can be an alternative to real images for model development, and recent work has shown the potential of training a military vehicle detector on simulated data. Nevertheless, fine-grained classification of detected military vehicles when training on simulated data remains an open challenge.

In this study, we develop an object detector for 15 vehicle classes, including similar-looking types such as multiple battle tanks and howitzers. We show that combining a few real data samples with a large amount of simulated data (12,000 images) leads to a significant improvement compared with using either source on its own. Adding just two real samples per class improves the mAP to 55.9 [±2.6], compared to 33.8 [±0.7] when only simulated data is used. Further improvements are achieved by adding more real samples and using Grounding DINO, a foundation model pretrained on vast amounts of data (mAP = 90.1 [±0.5]). In addition, we investigate the effect of simulation variation, which we find is important even when more real samples are available.
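As an illustration of the data-mixing idea described in the abstract, the following minimal Python sketch assembles a training list from a large simulated set plus a fixed number of real samples per class. It is not the authors' pipeline: the `(path, label)` item format, the file paths, the function names, and the `real_repeat` oversampling knob are all hypothetical, and a single image-level label is used as a simplification of full object detection annotations.

```python
import random
from collections import defaultdict


def sample_few_real(real_items, per_class, seed=0):
    """Pick up to `per_class` real samples for each class label."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for path, label in real_items:
        by_class[label].append((path, label))
    picked = []
    for label in sorted(by_class):  # sorted for reproducible ordering
        pool = by_class[label]
        picked.extend(rng.sample(pool, min(per_class, len(pool))))
    return picked


def build_training_set(sim_items, real_items, per_class=2, real_repeat=1, seed=0):
    """Combine all simulated items with a few real samples per class.

    `real_repeat` is a hypothetical knob that repeats the real samples
    so they are not drowned out by the much larger simulated set.
    """
    few_real = sample_few_real(real_items, per_class, seed)
    combined = list(sim_items) + few_real * real_repeat
    random.Random(seed).shuffle(combined)
    return combined


# Toy setup mirroring the numbers in the abstract:
# 15 classes, 12,000 simulated images, two real samples per class.
classes = [f"class_{i:02d}" for i in range(15)]
sim_items = [(f"sim/{c}/{i}.png", c) for c in classes for i in range(800)]
real_pool = [(f"real/{c}/{i}.png", c) for c in classes for i in range(10)]

train = build_training_set(sim_items, real_pool, per_class=2)
real_per_class = defaultdict(int)
for path, label in train:
    if path.startswith("real/"):
        real_per_class[label] += 1
```

In this toy setup the combined list holds the 12,000 simulated items plus 30 real ones. Whether and how strongly to oversample the real samples is a design choice that the abstract does not specify.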
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Friso G. Heslinga, Thijs A. Eker, Ella P. Fokkinga, Jan Erik van Woerden, Frank A. Ruis, Richard J. M. den Hollander, and Klamer Schutte "Combining simulated data, foundation models, and few real samples for training object detectors", Proc. SPIE 13035, Synthetic Data for Artificial Intelligence and Machine Learning: Tools, Techniques, and Applications II, 1303509 (7 June 2024); https://doi.org/10.1117/12.3013375
KEYWORDS
Computer simulations
Object detection
Detector development
Deep learning
Scene simulation
