Data Structures and Algorithms Walkthrough in Java
Part 2
In part one of this series we introduced search algorithms, linear and binary search over an input set, and concluded that binary search is the better approach but can only be used on a sorted array. That led us to a new topic, sorting algorithms, and that's exactly what we're going to cover in this part 2 of the series. If you missed part 1, you can check it here.
GOALS:
• To know the different ways of determining the run-time complexity of a program (timing, counting operations, and Big O)
• To understand why Big O is preferred
• To be able to form an equation from any given program and express it in asymptotic notation
• To be able to write selection sort in no time
Note:
This part contains exercises. If you're learning, you're encouraged to do the exercises.
"You only get it by doing and not by watching" – Fabala Dibbasey
Quick review:
In part 1 we saw things like O(1), O(n), and O(log n). But what really are they? Big O notation, exactly, you're right! And what is Big O notation? Maybe Google can help us.
Big O notation is used to express the growth of a program's run time as its input size grows. It does not need to be precise: it works with the order of growth, not exact counts. It measures an upper bound on the asymptotic growth, aka the order of growth.
Our goals for measuring program efficiency:
• To evaluate algorithms
• To evaluate scalability
• To evaluate in terms of input size
Why Big O?
Surely, we can use other approaches such as timing and counting operations to measure the efficiency of a program. So why Big O?
We're going to apply all three approaches to the sample program below.
public static long evaluateEfficiency(int inputSize) {
    long sum = 0;
    for (int i = 0; i <= inputSize; i++) {
        sum += i;
    }
    return sum;
}
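By the way, this particular sum has a well-known closed form, n * (n + 1) / 2, so the same answer can be computed in constant time. Here's a minimal sketch (the method name is mine, purely for illustration); we'll still use the loop version as our benchmark, since the point is to watch how the loop's run time grows:
public static long evaluateEfficiencyConstant(int inputSize) {
    long n = inputSize;
    // Gauss' formula: 0 + 1 + ... + n == n * (n + 1) / 2
    return n * (n + 1) / 2;
}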
Timing:
We can use a stopwatch to time our program. Learn how to use a stopwatch in Java here.
// call the method from your main method with different inputSize values
long startTime = System.currentTimeMillis();
evaluateEfficiency(30000000);
long endTime = System.currentTimeMillis();
long timeTaken = (endTime - startTime) / 1000; // elapsed time in seconds
We notice that the time varies with different inputSize values: it increases as inputSize increases. Good!
What else can you notice?
If nothing, not a big deal! But believe me, if you're using a machine different from mine, you're likely to get a different timeTaken for the very same inputSize. For example, when I passed 30000000 into the method it took 52 seconds to complete on my machine. If it took fewer seconds on your machine, that's exactly what I expected; otherwise, hahahah! I laugh at you because my machine is better than yours. Maybe not! The point here is:
Time varies across different machines. Bad!
Exercise: implement the same program using recursion and time it.
There will be a time difference for the same input. Hint: use milliseconds instead of seconds. (If you want a starting point, see the sketch below.)
That's to say, time varies with different implementations of the same program: another limitation of timing as a measure of program efficiency.
Limitations of the timing approach for measuring program efficiency:
◦ time varies across different machines or computers
◦ time varies with different implementations
◦ time cannot express the relationship between input size and running time
◦ time is not predictable based on small inputs
Due to these limitations, timing is discouraged. Now let's look at counting operations as a way of measuring program efficiency.
Counting Operations:
Let us assume that every operation costs 1.
Operations are + - * / % = == < <= > >=
Note: we'll also assume that System.out.println() and return each count as 1 operation.
public static long evaluateEfficiency(int inputSize) {
    long sum = 0;
    for (int i = 0; i <= inputSize; i++) {
        sum += i;
    }
    return sum;
}
From the method evaluateEfficiency:
Before the for loop:
long sum = 0;   1 operation
The for loop:
int i = 0;   1 operation
i <= inputSize;   1 operation
i++   2 operations: adding and assigning (i.e. i = i + 1)
Inside the for loop, sum += i; is 2 operations: adding and assigning (i.e. sum = sum + i)
All of these loop operations are carried out n times, where n is the inputSize. (Strictly speaking, int i = 0 runs only once and the loop test runs n + 1 times, but since Big O will discard the constants anyway, we'll keep the arithmetic simple.)
Total operations: 1 + 1 + 2 + 2 = 6, performed n times, aka 6n
Outside the for loop:
return sum;   1 operation
Overall operations:
operations before the for loop + operations in the for loop + operations after the for loop
= 1 + 6n + 1, and that's the total operation count of our program.
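To make this concrete, here's a sketch that mirrors our simplified accounting with an explicit counter (the method name countOperations is mine, purely for illustration), so you can check the 1 + 6n + 1 formula empirically:
public static long countOperations(int inputSize) {
    long ops = 1;                       // long sum = 0; (before the loop)
    for (int i = 0; i < inputSize; i++) {
        // we charge 6 operations per pass:
        // int i = 0 (1) + i <= inputSize (1) + i++ (2) + sum += i (2)
        ops += 6;
    }
    ops += 1;                           // return sum;
    return ops;                         // == 1 + 6n + 1
}
For example, countOperations(10) returns 62, which is exactly 1 + 6 * 10 + 1.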
Ohh yes! Counting is better because:
• It depends on the algorithm
• It is the same on any computer, and
• It varies with different inputSize values and can express the relationship between the input and the count.
But still:
• There's no real definition of which operations to count
• It varies for different implementations. Help me prove that, please: count the operations in the recursive implementation of the method evaluateEfficiency(int inputSize). If you have not implemented the recursive version of this method yet, do that right now and count the operations in it.
We still need a better way of measuring program efficiency, an approach which meets all of the goals for measuring program efficiency we specified above.
Big O Notation:
Unless you jumped straight to this section, Big O is not new to us anymore!
public static long evaluateEfficiency(int inputSize) {
    long sum = 0;
    for (int i = 0; i <= inputSize; i++) {
        sum += i;
    }
    return sum;
}
Recall that we counted this method's operations under Counting Operations, and 1 + 6n + 1 was our equation.
Big O notation measures an upper bound on the asymptotic growth. As we said, it does not need to be exact, only on the order of growth.
Ignore the additive and multiplicative constants and focus on the dominant term: 1 + 6n + 1 = 6n + 2, which leaves us with
O(n) ==> Linear time complexity
More examples:
n^2 + 2n + 2: ignore the additive and multiplicative constants and you're left with n^2 and n. n^2 dominates, hence O(n^2) ==> Quadratic time complexity
n^2 + 99999n + 72000 : O(n^2) ==> Quadratic time complexity
log(n) + n + 100 : O(n) ==> Linear time complexity
3*n*log(n) + 10n : O(n log n) ==> Log-linear or linearithmic time complexity
5n^30 + 2^n : O(2^n) ==> Exponential time complexity
434 + 600 : O(1) ==> Constant time complexity
Hope you get the idea: we ignore additive and multiplicative constants and focus on the dominant term.
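For the curious, here's the standard textbook definition that all of this rests on (you won't need it for the rest of the series, but it's good to have seen once):
f(n) = O(g(n)) if and only if there exist constants c > 0 and n0 > 0 such that f(n) <= c * g(n) for all n >= n0.
For example, 6n + 2 <= 7n for all n >= 2, so 6n + 2 = O(n) with c = 7 and n0 = 2.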
Exercise:
Arrange the asymptotic notations above in ascending order of running time (best to worst, without repeating any notation) and comment your answer.
Big O notation:
• describes the worst case, which is the bottleneck when a program runs
• evaluates algorithms
• is independent of machines
• is independent of implementations
• can express the rate of growth of a program relative to its input size
We’ve done a lot! Let’s take a 3-minute break here:
Welcome back!
Sorting algorithms:
int[] list = { 3, 2, 8, 1, 4, 7, 9, 2, 5, 10, 6 };
Suppose I asked you to sort the given array, list, above. How would you go about it?
Let me guess! You can rate my guessing in the comments, please.
- You will first try to find the smallest element in the list.
- You’ll either swap it with the first element of the list or move the smallest element to a new array; let’s call it mySortedArray.
- You’ll repeat this until you’re done. Ohhhh, hope I made good guesses?
Let’s translate the same idea into Java code:
public static int[] selectionSort(int[] arr) {
    for (int i = 0; i < arr.length; i++) {
        // We'll say the smallest element so far is the one at index i,
        // because that's the first unsorted element we've seen
        int smallest = arr[i];
        // search the rest of the array for a smaller element
        for (int j = i; j < arr.length; j++) {
            // found one?
            if (arr[j] < smallest) {
                // swap it into position i
                int temp = arr[i];
                arr[i] = arr[j];
                arr[j] = temp;
                smallest = arr[i];
            }
        }
        /* At the end of this inner for loop, the smallest remaining
         * element is positioned at index i of the outer for loop, then
         * i is incremented. The inner for loop starts its search at i,
         * that is, it ignores all elements at indices less than i.
         */
    }
    return arr;
}
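A quick usage sketch (a hypothetical main method, just to see it run on the list from above):
public static void main(String[] args) {
    int[] list = { 3, 2, 8, 1, 4, 7, 9, 2, 5, 10, 6 };
    // prints [1, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    System.out.println(java.util.Arrays.toString(selectionSort(list)));
}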
Two more exercises for you:
- Count the operations for this method, selectionSort, and from the equation determine the Big O notation (you should get O(n^2), quadratic time complexity).
- Implement the same algorithm, but find the smallest element and move it into a new sorted array to be returned.
If you’ve done these two exercises, congratulations!
Now, can you tell me the difference between the one you implemented and the one we implemented together?
The one we implemented together mutates the original array; in functional programming terms it’s called an impure function. The one you implemented yourself is a pure function in functional programming terms, because it returns a new array (in this case a sorted version) instead of changing the state of the original array that was passed into it.
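If you'd like to check your answer to the second exercise, here's one possible sketch of a pure version (one way among many; the name selectionSortPure is mine):
public static int[] selectionSortPure(int[] arr) {
    // work on a copy so the caller's array is left untouched
    int[] copy = java.util.Arrays.copyOf(arr, arr.length);
    int[] sorted = new int[copy.length];
    for (int i = 0; i < copy.length; i++) {
        // find the index of the smallest remaining element
        int minIndex = 0;
        for (int j = 1; j < copy.length; j++) {
            if (copy[j] < copy[minIndex]) {
                minIndex = j;
            }
        }
        sorted[i] = copy[minIndex];
        // mark it as used so it won't be picked again
        // (assumes the input doesn't contain Integer.MAX_VALUE)
        copy[minIndex] = Integer.MAX_VALUE;
    }
    return sorted;
}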
I don’t know about you, but I’m tired, so I’m going to terminate my recursion hahaaaa!
Next, part 3:
• Bubble sort
• Bogosort
• Merge sort
Remember the topics of the series:
• Searching and Sorting Algorithms
• Data Structures (Stacks, Queues, Linked list, Graphs and more).
Author: Fabala Dibbasey (Twitter, LinkedIn)