Presentation + Paper
Analyzing a human-in-the-loop's decisions for the detection of data poisoning
12 April 2021
Samantha S. Carley, Stanton R. Price
Abstract
Human-in-the-loop (HITL) is the process of combining the power of a machine or computer system with human intelligence to develop human-aware machine learning (ML) models. The HITL process creates a continuous feedback loop between human and machine, enabling the trained model to improve continuously as edge cases present themselves, without the need to retrain the model from scratch. HITL ML systems offer several advantages: avoiding bias, ensuring consistency and accuracy, improving efficiency, and providing transparency. However, adding human involvement also invites human mistakes. Occasionally, a HITL system may actually degrade the algorithm rather than improve it: mislabeling an object in an object detection algorithm, incorrectly scoring an algorithm’s output, misclicks, typos, and other human errors cause the system to learn from faulty information. These user errors, intentional or not, can be considered a form of data poisoning. To understand the effects of a HITL’s choices on an ML model, several pieces of information can be observed during the HITL process, e.g., the time taken by the user to provide input on an output or on a specific object class, as well as an evaluation of the consistency of submitted valid inputs, among other factors. Information extracted from the HITL’s decision-making process can provide insights into whether poor choices are being made by the user (i.e., data poisoning) and identify where, when, and why these choices are being made. Many state-of-the-art models could be utilized for this work, such as ResNet-50, DarkNet-53, and Xception, among others. However, for this work, we are less focused on the model being used and more focused on the procedure for tracking HITL performance to maximize model improvement. Nevertheless, this work will consider a pretrained model, though the approach will be model agnostic.
The dataset used in this research is the publicly available “Flowers Recognition” dataset available on Kaggle.1
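The signals described in the abstract (per-annotation response time and label consistency) could be tracked with a small monitor alongside the annotation tool. The sketch below is illustrative only and is not from the paper; the class name `HITLMonitor`, the z-score threshold, and the repeat-pass consistency check are all assumptions about how such tracking might be implemented.

```python
import statistics

class HITLMonitor:
    """Hypothetical tracker for a human annotator's behavior, flagging
    inputs that may indicate data poisoning (accidental or deliberate)."""

    def __init__(self, z_threshold=2.0):
        self.z_threshold = z_threshold  # assumed cutoff for timing outliers
        self.times = []                 # seconds spent on each annotation
        self.labels = {}                # item_id -> labels given across passes

    def record(self, item_id, label, seconds):
        """Log one annotation event."""
        self.times.append(seconds)
        self.labels.setdefault(item_id, []).append(label)

    def is_time_outlier(self, seconds):
        """Flag annotations made unusually fast or slow relative to the
        annotator's own history (simple z-score test)."""
        if len(self.times) < 3:
            return False  # not enough history to judge
        mu = statistics.mean(self.times)
        sd = statistics.stdev(self.times)
        return sd > 0 and abs(seconds - mu) / sd > self.z_threshold

    def inconsistent_items(self):
        """Items the annotator labeled differently on repeated passes."""
        return [i for i, ls in self.labels.items() if len(set(ls)) > 1]
```

A half-second label submitted by an annotator who normally takes several seconds, or an image labeled "daisy" on one pass and "tulip" on the next, would both surface here as candidates for review rather than being fed directly back into training.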
Conference Presentation
© (2021) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Samantha S. Carley and Stanton R. Price "Analyzing a human-in-the-loop's decisions for the detection of data poisoning", Proc. SPIE 11746, Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications III, 1174616 (12 April 2021); https://doi.org/10.1117/12.2586260
KEYWORDS
Systems modeling
Computing systems
Detection and tracking algorithms
Intelligence systems
Performance modeling
Feedback loops
Machine learning