What is an algorithm? I like to think that an algorithm is a set of rules specifying how to solve a problem. An algorithm can be defined outside of computer science, in Mathematics, nature or traffic. In computer science, an algorithm is abstract and does not have a machine implementation.
The most important ‘tool’ to manipulate while coding is data structures
, and precisely that is the starting point of the road to algorithms. Data structures are simply a way to organize data. However, to be more specific, who’d say better than Wikipedia, right?
A data structure is a collection of data values, the relationships amongst them, and the functions or operations that can be applied to the data.
It is the right time to tackle the topic of complexity analysis
. This is a process of determining how efficient an algorithm is. Therefore, we shall focus on two dimensions, time and space complexity.
Time complexity is a measure of how fast an algorithm runs, and it is expressed using Big O
notation.
Space complexity is a measure of how much auxiliary memory an algorithm takes, and it is also expressed by using Big O
notation.
Memory is a big topic, therefore we could define here some base-lines. Let’s say that memory is the place, or more precisely a layer where all data is stored. How? Well, there are the following points:
- Data stored in memory is stored in bits and bytes.
- Bytes are able to point to other bytes in memory. (Referencing other data).
- Byte sizes, like 4 bytes, or 8 bytes in the case of 32-bit or 64-bit integers.
Big O notation is the most important tool to generalize the time or space complexity of an algorithm. These values aren’t fixed, so they may vary depending on the input. Therefore, we measure from the most performant to the least performant as, O(1) as constant complexity, or O(log(n)), O(n), O(n^2), O(n^3)..etc..etc.
Data structures are generally presented in four forms:
- Linear: arrays, lists.
- Tree: binary, heaps, space partitioning etc.
- Hash: distributed hash table, hash tree etc.
- Graphs: decision, directed, acyclic etc.
Having these tools mentioned and collected in our ‘backpack’, we have pretty much everything to tackle, define, and then write our algorithms in the most efficient way to a given occasion.
Hands on..
How we are going to use all these items that we have in our ‘techy backpack’? Well, we need a problem to start. Fair enough, here is one, an easy one to begin.
Here is the problem from LeetCode . Exactly what we need.
Two Sum
Given an array of integers nums and an integer target, return indices of the two numbers such that they add up to target.
You may assume that each input would have exactly one solution, and you may not use the same element twice.
You can return the answer in any order.
For an instance:
Input: nums = [2,7,11,15], target = 9
Output: [0,1]
Output: Because nums[0] + nums[1] == 9, we return [0, 1].
Enter fullscreen mode Exit fullscreen mode
Now, when we have understood our problem, we shall cheer. ‘Yeeees!!!’
We know how to tackle the problem, we open our ‘backpack’ and choose our tools.
class Solution {
public int[] twoSum(int[] nums, int target) {
for (int i = 0; i < nums.length -1; i++) {
int firstNumber = nums[i];
for (int j = i + 1; j < nums.length; j++) {
int secondNumber = nums[j];
if (firstNumber + secondNumber == target) {
return new int[] {firstNumber, secondNumber};
}
}
}
return new int[0];
}
}
Enter fullscreen mode Exit fullscreen mode
‘WooooooW’, it works!
Hmmm… Yes, it works indeed, but as we are beginners on this road, we have used brute force
technics, which means that we were passing through our array with two integers, respectively i
and j
, and then we have tried to sum up every combination between the two. Therefore we have used, two nested for loops to iterate throughout the array. Shall we look at the already mentioned Big O
and its time and space complexity?
Alright, the space complexity we’ve got is O(1)
. Splendid, couldn’t be better, a constant space, which means no additional memory was used to store the data. Now then, we end up with an O(n^2)
time complexity. The reason for this was iterating through two nested loops in regards, to sum up, every combination. Well, we have to say that our algorithm seems to be very slow. What if we would have an n-sized array instead of just four integers in our example. Well, that would be very slow. So, can we do it better, now?
Let’s give it another try.
class Solution {
public static int[] twoSum(int[] nums, int target) {
Set<Integer> numbers = new HashSet<>();
for(int number : nums) {
int match = target - number;
if (numbers.contains(match)) {
return new int[] { match, number };
}
numbers.add(number);
}
return new int[0];
}
}
Enter fullscreen mode Exit fullscreen mode
It works again! We are doing fantastic, I must say. Let’s talk again to our Big O
friend. Now we have a slightly better situation. We have improved our time complexity from O(n^2)
quadratic time to the O(n)
time. We got rid of the inner for loop, and we iterate only once throughout an array. We have introduced the HashSet data structure, and it helps us to store the potential matching combinations, and after every iteration, we look up the hash table to see if we summed up to the target number. Way faster than the previous solution. We have improved the time complexity significantly, and our algorithm is way faster now. However, by using the HashSet, we allocated some memory. Remember, we had to store our matching combinations somewhere, and that means our space complexity disimproved. It is O(n)
right now, instead of O(1)
in the previous example.
So what’s the better approach now?
Well, in the first example we had O(n^2) time | O(1)
space complexity, and
now we have O(n) time | O(n)
space complexity. I think we have improved our algorithm in general. Now, let’s see if we can do even better.
class Solution {
public static int[] twoSum(int[] nums, int target) {
Arrays.sort(nums);
int left = 0;
int right = nums.length - 1;
while (left < right) {
int sum = nums[left] + nums[right];
if (sum == target) {
return new int[] { nums[left], nums[right] };
} else if (sum < target) {
left++;
} else if (sum > target) {
right--;
}
}
return new int[0];
}
}
Enter fullscreen mode Exit fullscreen mode
The third time success! Marvellous manoeuvre! Let’s go and talk to our Big O
friend there.
- Ok,
Big O
, what do you have for us this time? - Well, I have good news for you.
We might know that a good sorting algorithm like quicksort, mergesort or heapsort can run in O(n*log(n))
time. So that explains sorting an array straight on the top, and then we can find the sum in O(n)
time, and we do not use any additional space, so again we have improved the space complexity back to O(1)
. With this set up we are allowed to do the next thing. As the array has been immediately sorted, we can surely put the left pointer at the beginning of an array, and the right one at the very end of an array. We will sum up the elements in the first cycle, and compare if the sum is equal to the target number. In the case of truth, we simply return the positions of the pointers, accordingly, if the sum is less than the target number we will have to increase the left pointer for one position to increase the sum of pointers. If the sum is, however, bigger than the target number, that would mean that we have to move the right pointer to the one position to the left, in regards to decrease the sum of pointers.
To sum up, we have sorted an array immediately, and then we have iterated one time throughout the array. Finally, we were able to find our third solution in O(n*log(n)) time | O(1)
space complexity. This is a very cool solution but was it better than the previous one? Well, it depends on what are our needs. Do we want to care more about time or space complexity performance? I’ll just let you decide for yourself.
Final thoughts
This was a showcase of how we can apply various technics, memory, and data structures to write performant algorithms. We’ve started with brute force technics which is usually the least performant one, and then we have tried to see what could be the possible improvement, searching for the better tools inside of our ‘backpack’.
I like to use this step-by-step solution. “Try to improve!” An old adage in carpentry is to Measure Thrice, Check Twice and Cut Once!
暂无评论内容