Title:
Scalable and Cost-Efficient ML Inference: Parallel Batch Processing with Serverless Functions
Source:
ICSOC 2024, 22nd International Conference on Service-Oriented Computing
Publication Year:
2025
Collection:
Computer Science
Document Type:
Report / Working Paper
Accession Number:
edsarx.2502.12017
Database:
arXiv

Further Information

As data-intensive applications grow, batch processing in resource-constrained environments faces scalability and resource-management challenges. Serverless computing offers a flexible alternative, enabling dynamic resource allocation and automatic scaling. This paper explores how serverless architectures can make large-scale ML inference tasks faster and more cost-effective by decomposing monolithic processes into parallel functions. Through a case study on sentiment analysis using the DistilBERT model and the IMDb dataset, we demonstrate that serverless parallel processing can reduce execution time by over 95% compared to a monolithic approach, at the same cost.
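The decomposition the abstract describes can be illustrated with a minimal sketch: split the batch into chunks and fan each chunk out to a concurrent worker standing in for one serverless function invocation. This is an assumption-laden illustration, not the paper's implementation; `infer_sentiment` is a hypothetical stand-in for a DistilBERT call, and a thread pool substitutes for real serverless invocations (e.g. cloud function calls).

```python
from concurrent.futures import ThreadPoolExecutor

def infer_sentiment(text):
    # Hypothetical stand-in for DistilBERT inference inside one function;
    # a real deployment would load the model and run the classifier here.
    return "positive" if "good" in text.lower() else "negative"

def invoke_function(chunk):
    # Stand-in for a single serverless invocation that processes one
    # chunk of the batch and returns its labels.
    return [infer_sentiment(t) for t in chunk]

def parallel_batch_inference(texts, chunk_size=2, max_workers=4):
    # Decompose the monolithic batch into chunks, dispatch them to
    # concurrent "function" invocations, and gather results in order.
    chunks = [texts[i:i + chunk_size] for i in range(0, len(texts), chunk_size)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_chunk = pool.map(invoke_function, chunks)
    return [label for chunk_labels in per_chunk for label in chunk_labels]

reviews = ["A good movie", "Terrible plot", "Really good acting", "Dull"]
print(parallel_batch_inference(reviews))
# → ['positive', 'negative', 'positive', 'negative']
```

The speedup claimed in the paper comes from running the chunk-level invocations concurrently rather than sequentially, while per-invocation billing keeps total compute cost roughly equal to the monolithic run.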