Manuscript received October 20, 2022; revised December 7, 2022; accepted March 29, 2023.
Abstract—We have designed and developed an efficient priority queue data structure that utilizes buckets into which data elements are inserted and from which data elements are deleted. The data structure leverages hashing to determine the appropriate bucket to place a data element based on the data element’s key value. This allows the data structure to access data elements that are in the queue with an O(1) time complexity. Heaps access data elements that are in the queue with an O(log n) time complexity, where n is the number of nodes on the heap. Thus, the data structure improves the performance of applications that utilize a min/max heap. Targeted areas include big data applications, data science, artificial intelligence, and parallel processing. In this paper, we present results several applications. We demonstrate that the data structure when used to replace a min/max heap improves the performance applications by reducing the execution time. The performance improvement increases as the number of data elements placed in the queue increases. Also, in addition to being designed as a double-ended priority queue (DEPQ), the data structure can be configured to be a queue (FIFO), a stack (LIFO), and a set (which doesn’t allow duplicates).
Index Terms—Priority queue, Buckets data structure, big data, heap, performance
The authors are with Western Michigan University, Kalamazoo, MI 49008 USA.
Cite: James Rhodes* and Elise de Doncker, "An Efficient Priority Queue Data Structure for Big Data Applications," International Journal of Machine Learning vol. 13, no. 2, pp. 55-63, 2023.Copyright @ 2023 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).