Result: OptiFlexSort: A Hybrid Sorting Algorithm for Efficient Large-Scale Data Processing

Title:
OptiFlexSort: A Hybrid Sorting Algorithm for Efficient Large-Scale Data Processing
Contributors:
Department of Computer Science, Regentropfen University College, Upper East, Ghana; Department of Computer Science, C. K. Tedam University of Technology and Applied Sciences, Navrongo, Ghana.
Source:
Journal of Advances in Mathematics and Computer Science. 40(2):67-81
Publisher Information:
CCSD; Journal of Advances in Mathematics and Computer Science, 2025.
Publication Year:
2025
Original Identifier:
HAL: hal-05049169
Document Type:
Journal article
Language:
English
ISSN:
2456-9968
Relation:
info:eu-repo/semantics/altIdentifier/doi/10.9734/jamcs/2025/v40i21970
DOI:
10.9734/jamcs/2025/v40i21970
Accession Number:
edshal.hal.05049169v1
Database:
HAL

Further Information

Efficient sorting of massive datasets is a cornerstone of data-intensive applications, yet traditional sorting algorithms often face scalability challenges as data volumes grow exponentially. This study introduces OptiFlexSort, a novel hybrid sorting algorithm designed to enhance scalability while maintaining the inherent efficiency of Quicksort. OptiFlexSort combines two techniques: an optimized last-element pivot selection strategy that uses median-of-three considerations to improve pivot quality, and an adaptive partitioning mechanism that adjusts partition sizes to the characteristics of the data distribution, using a threshold-based approach to balance partition efficiency. To evaluate its performance, comprehensive experiments were conducted on randomly generated integer datasets ranging from 1,000 to 1 million elements. Implemented in Python, OptiFlexSort was benchmarked against established algorithms, including Merge Sort, Heapsort, Radix Sort, and external merge sort implementations (STXXL and TPIE). Each test was repeated twenty times to ensure statistical consistency. The results show that OptiFlexSort achieves a 10–15% improvement in execution time over Merge Sort and Heapsort across all dataset sizes. For datasets of 50,000–100,000 elements, its performance was statistically indistinguishable from that of Radix Sort, with differences of less than 2%. For datasets exceeding 200,000 elements, OptiFlexSort achieved a 5–8% reduction in execution time. Notably, for datasets of several hundred thousand elements or more, it outperformed advanced external merge sort implementations, highlighting its robustness and scalability. This study contributes to the field of sorting algorithm design by presenting a highly efficient, scalable, and adaptive solution tailored to the demands of modern big data applications and large-scale data processing.
OptiFlexSort represents a significant step forward in addressing the challenges posed by exponentially growing datasets, offering a practical and efficient solution for large-scale data processing. While the algorithm excels on uniformly distributed integer datasets, future work will explore its adaptability to other data types and distributions, further broadening its applicability.
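
The abstract names the ingredients of the algorithm but not its details, so the following Python sketch is only an illustration of the general technique, not the authors' implementation. It shows a Quicksort variant that moves a median-of-three value into the last position before a last-element partition, and that switches to insertion sort below a size threshold. The names `hybrid_quicksort` and `SMALL_THRESHOLD`, the cutoff value of 16, and the insertion-sort fallback are all assumptions made for this example; the abstract does not say how OptiFlexSort's threshold or adaptive partitioning actually work.

```python
SMALL_THRESHOLD = 16  # assumed cutoff; the abstract does not give a value


def insertion_sort(a, lo, hi):
    """Sort a[lo..hi] in place (inclusive bounds)."""
    for i in range(lo + 1, hi + 1):
        key = a[i]
        j = i - 1
        while j >= lo and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key


def median_of_three_to_last(a, lo, hi):
    """Move the median of a[lo], a[mid], a[hi] into a[hi]."""
    mid = (lo + hi) // 2
    if a[mid] < a[lo]:
        a[lo], a[mid] = a[mid], a[lo]
    if a[hi] < a[lo]:
        a[lo], a[hi] = a[hi], a[lo]
    if a[hi] < a[mid]:
        a[mid], a[hi] = a[hi], a[mid]
    # Now a[lo] <= a[mid] <= a[hi]; put the median where the pivot is read.
    a[mid], a[hi] = a[hi], a[mid]


def partition(a, lo, hi):
    """Lomuto partition around the last element; returns the pivot index."""
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i


def _sort(a, lo, hi):
    # Recurse into the smaller side and loop on the larger one,
    # which keeps the recursion depth logarithmic in the input size.
    while hi - lo + 1 > SMALL_THRESHOLD:
        median_of_three_to_last(a, lo, hi)
        p = partition(a, lo, hi)
        if p - lo < hi - p:
            _sort(a, lo, p - 1)
            lo = p + 1
        else:
            _sort(a, p + 1, hi)
            hi = p - 1
    insertion_sort(a, lo, hi)  # small ranges are finished here


def hybrid_quicksort(values):
    """Return a sorted copy of values (hypothetical helper, not OptiFlexSort)."""
    a = list(values)
    _sort(a, 0, len(a) - 1)
    return a
```

A call such as `hybrid_quicksort(data)` returns a new sorted list and leaves `data` unchanged. Reproducing the timings reported in the abstract would also require the authors' exact partitioning rules and benchmark setup, which this record does not provide.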