Presentation + Paper
7 June 2024
Integrating image-based LLMs on edge-devices for underwater robotics
Prabha Sundaravadivel, Preetha J. Roselyn, Vedachalam Narayanaswamy, Vincent I. Jeyaraj, Aishree Ramesh, Aaditya Khanal
Abstract
Image-based Large Language Models (LLMs) are AI models that can interpret captured images and generate text based on the analysis of images or other visual data. Incorporating LLMs for assessing water quality, pressure, and environmental conditions can help analyze historical data and predict potential risks and threats in underwater environments. This can improve the intervention of autonomous underwater vehicles (AUVs) and remotely operated vehicles (ROVs) during emergencies, where visual data must be interpreted to make informed decisions. While LLMs are primarily associated with processing and generating text, they can be integrated with images through multimodal learning, in which text and images are combined for tasks that involve both modalities. Implementing such frameworks is challenging on the low-power microcontrollers primarily used in monitoring systems. This research proposes evaluating multimodal tokens to enable edge computing in bio-inspired robots that monitor the underwater environment. This approach can help break down large real-time videos into tokens of text-based instructions associated with descriptions of the images. The mini-robots will transmit the collected “tokens” to the nearest AUV or ROV, where the image-based LLM will be deployed. We propose to evaluate this image-based LLM on our NVIDIA Jetson Nano-based AUV. In the proposed architecture, the mini-robots can move along the length of the water column to capture images of the underwater environment. Our proposed model is evaluated on generating text for boat and fish images. The proposed framework with integrated image-based tokens can significantly reduce the response time and data traffic in underwater real-time monitoring systems.
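The data-traffic argument in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `SceneToken` fields, the three-word `VOCAB`, the 8-byte wire format, and the 640x480 RGB frame size are all assumptions made here for illustration, since the abstract does not specify the token layout or the camera resolution. The sketch only shows how a mini-robot might pack one image description into a small fixed-size message and how an AUV might expand it back into text.

```python
import struct
from dataclasses import dataclass

# Hypothetical label vocabulary; the abstract only mentions boat and fish images.
VOCAB = ["fish", "boat", "unknown"]

# Assumed wire format: uint32 timestamp, uint16 depth, uint8 label, uint8 confidence.
TOKEN_FORMAT = "<IHBB"

# Assumed raw frame size (640x480 RGB, uncompressed), used only for comparison.
RAW_FRAME_BYTES = 640 * 480 * 3


@dataclass
class SceneToken:
    timestamp_s: int     # seconds since mission start
    depth_cm: int        # position in the water column, in centimetres
    label_id: int        # index into VOCAB
    confidence_pct: int  # classifier confidence, 0 to 100


def encode_token(token: SceneToken) -> bytes:
    """Pack a token into the fixed-size payload a mini-robot would transmit."""
    return struct.pack(
        TOKEN_FORMAT,
        token.timestamp_s,
        token.depth_cm,
        token.label_id,
        token.confidence_pct,
    )


def decode_token(payload: bytes) -> SceneToken:
    """Unpack a received payload on the AUV/ROV side."""
    return SceneToken(*struct.unpack(TOKEN_FORMAT, payload))


def describe(token: SceneToken) -> str:
    """Render a token as a short text description for a downstream language model."""
    depth_m = token.depth_cm / 100
    label = VOCAB[token.label_id]
    return f"t={token.timestamp_s}s depth={depth_m:.1f}m: {label} ({token.confidence_pct}%)"


if __name__ == "__main__":
    token = SceneToken(timestamp_s=42, depth_cm=1250, label_id=0, confidence_pct=91)
    payload = encode_token(token)
    print(describe(decode_token(payload)))
    print(f"{len(payload)} bytes per token vs {RAW_FRAME_BYTES} bytes per raw frame")
```

Under these assumptions each token occupies 8 bytes, compared with 921,600 bytes for one uncompressed frame. Real savings would depend on the actual token design, image compression, and link protocol, none of which are given in the abstract.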
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Prabha Sundaravadivel, Preetha J. Roselyn, Vedachalam Narayanaswamy, Vincent I. Jeyaraj, Aishree Ramesh, and Aaditya Khanal "Integrating image-based LLMs on edge-devices for underwater robotics", Proc. SPIE 13034, Real-Time Image Processing and Deep Learning 2024, 130340E (7 June 2024); https://doi.org/10.1117/12.3014446
KEYWORDS
Transformers
Data modeling
Visual process modeling
Image classification
Education and training
Image processing
Machine learning