Introduction
Running median, moving median, continuous median, or median from the dynamic stream of integers are all the names for the same and well-known coding problem. You are given with dynamic stream of integers, coming one after another randomly and unsorted and you have to find the median of the current received set of integers.
1. Let’s first define what is median
The median is the “middle” value of a sorted set of numbers. To find the median, you must first sort your set of integers in non-decreasing order. Then, if there is:
- odd number of integers, the middle element is the median. For example, in the ordered set:
2, 5, 6, 8, 10
the median is6
. - even number of integers, there’s no middle element; the median is computed as the average of the two middle elements. Example in the ordered set:
3, 4, 7, 8, 10, 15
the median is(7 + 8) / 2 = 7.5
.
2. Formalize the dynamic stream statement
We need to write a function to get a median number of dynamic stream. Let’s think about the dynamic stream (running/moving/continuous) median as an array of numbers that you are reading in one after another and after each number you want to print the median of all numbers.
How we are going to do this?
3. Heap Data Structure
One of the most effective ways of solving this is a Heap Data Structure.
A Heap is a special Tree-based data structure in which the tree is a complete binary tree. There are in general two types of heap Max-Heap and Min-Heap.
In a Min-Heap:
- The root node has the minimum value.
- The value of each node is equal to or greater than the value of its parent node.
In a Max-Heap:
- The root node has the maximum value.
- The value of each node is equal to or less than the value of its parent node.
Actually, the Heap approach is the perfect solution for our problem because it allows us to efficiently pull out the biggest element (maximum value) or smallest element (minimum value):
When a number comes we will first compare it with the current median and put it to the appropriate Heap. If the new integer value is less than the current median, we put it into the max-heap else we put it to the min-heap.
4. Let’s turn to the code
In Java, the PriorityQueue
class represents a heap. As per definition PriorityQueue in Java is a special type of queue wherein all the elements are ordered as per their natural ordering or based on a custom Comparator supplied at the time of creation. Let’s divide the solution into 4 main steps.
STEP 1. getMedians
function
That’s going to take an integer array and return an array of doubles like this:
STEP 2. addNumber
method
that will take in number, priorityQueue
of the lowers and highers like this:
STEP 3. rebalance
method
Rebalancing works by moving the largest element from the max-heap to the min-heap, or by moving the smallest element from the min-heap to the max-heap:
STEP 4. getMedian
method
This method will look into two Heap sizes, if they are different take the top element from the larger Heap. If they are the same size we will need to average them:
Thank you for reading!
Github repo can be found here.
To connect with me please check my Github, LinkedIn or Twitter.
暂无评论内容