10 October 2023 Large-scale taxi trajectory data processing based on spark distributed computing
Kai Sun, Changxin Song
Proceedings Volume 12799, Third International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2023); 1279915 (2023) https://doi.org/10.1117/12.3006166
Event: 3rd International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2023), 2023, Kuala Lumpur, Malaysia
Abstract
With the rapid development of urban transportation, taxi trajectory data has become an important resource. Processing this data at scale, however, raises issues such as analyzing and handling errors in the records, and traditional data processing techniques can no longer meet these requirements. To process such massive volumes of data efficiently, this paper proposes a preprocessing model for taxi trajectory data based on Spark distributed computing. The model defines a dataset built on Spark's Resilient Distributed Dataset (RDD) to record the structural information of taxi trajectory data. It also provides SQL statements for manipulating and managing the trajectory data and returns the processed results as a DataFrame. By exploiting Spark's parallel computing power and distributed storage, the model addresses the preprocessing problem for massive taxi trajectory data. Experimental results show that the model is reliable and efficient when processing massive taxi trajectory data, laying a solid foundation for subsequent trajectory data mining and analysis.
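The pipeline the abstract describes (structured records held in an RDD, cleaned with SQL, returned as a DataFrame) might look roughly like the sketch below. This is not the authors' model. The record layout (`taxi_id`, `ts`, `lon`, `lat`, `speed_kmh`), the comma-separated input format, the plausibility bounds, and the function names `parse_line` and `clean_with_spark` are all assumptions made for illustration, since the abstract does not specify them. The Spark part requires `pyspark` and is imported lazily so the parsing helper can be used on its own.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical record layout: (taxi_id, ts, lon, lat, speed_kmh).
# The abstract does not give the paper's actual schema.
Record = Tuple[str, int, float, float, float]
COLUMNS = ["taxi_id", "ts", "lon", "lat", "speed_kmh"]


def parse_line(line: str) -> Optional[Record]:
    """Parse 'taxi_id,ts,lon,lat,speed_kmh'; return None if malformed."""
    parts = line.strip().split(",")
    if len(parts) != 5:
        return None
    try:
        return (parts[0], int(parts[1]), float(parts[2]),
                float(parts[3]), float(parts[4]))
    except ValueError:
        return None


def clean_with_spark(lines: Iterable[str]):
    """Return a DataFrame of de-duplicated, plausible GPS records.

    Requires pyspark. The bounds in the SQL are illustrative assumptions.
    """
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .master("local[*]")
             .appName("taxi-clean-sketch")
             .getOrCreate())

    # RDD stage: parse raw text in parallel and drop malformed lines.
    rdd = (spark.sparkContext.parallelize(list(lines))
           .map(parse_line)
           .filter(lambda rec: rec is not None))

    # Attach column names so the records can be queried with SQL.
    df = spark.createDataFrame(rdd, schema=COLUMNS)
    df.createOrReplaceTempView("fixes")

    # SQL stage: remove duplicates and out-of-range coordinates or speeds.
    return spark.sql("""
        SELECT DISTINCT taxi_id, ts, lon, lat, speed_kmh
        FROM fixes
        WHERE lon BETWEEN -180 AND 180
          AND lat BETWEEN -90 AND 90
          AND speed_kmh BETWEEN 0 AND 200
        ORDER BY taxi_id, ts
    """)
```

Calling `clean_with_spark(open("fixes.csv"))` on a hypothetical input file would return a DataFrame that later mining steps could consume. Splitting the work this way keeps malformed-line handling in the RDD stage and expresses the error rules declaratively in SQL, which is one plausible reading of the division the abstract outlines.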
(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Kai Sun and Changxin Song "Large-scale taxi trajectory data processing based on spark distributed computing", Proc. SPIE 12799, Third International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2023), 1279915 (10 October 2023); https://doi.org/10.1117/12.3006166
KEYWORDS
Data processing
Distributed computing
Data modeling
Global Positioning System
Transportation
Parallel computing
Statistical analysis