Scalable Containerized Pipeline for Real-time Big Data Analytics
Date
2022
Author
Aurangzaib, Rana
Iqbal, Waheed
Abdullah, Muhammad
Bukhari, Faisal
Ullah, Faheem
Erradi, Abdelkarim
Abstract
With the widespread use of IoT, processing data streams in real time has become very important. Traditional data-stream processing systems are inefficient at processing big data for real-time anomaly detection, classification, clustering, and prediction using minimal resources. In this paper, we address this limitation by proposing a scalable pipeline for real-time processing of big data streams. Our proposed solution dynamically manages resources for the different components of the pipeline using automatic scaling. The pipeline is containerized and deployed on a Kubernetes cluster. The proposed scalable pipeline is evaluated using a case study of anomaly detection in IoT data. The proposed solution yields a 1.31x to 2.4x increase in throughput and a 32x to 80x decrease in latency compared with the commonly used static resource allocation strategy for data pipelines. © 2022 IEEE.
Collections
- Computer Science and Engineering [2382 items]