Friday, April 4, 2014

Well, it looks like the course is almost finished.
On a related note, there is quite a bit of time until the final exam, so hopefully by managing my time well I'll be well prepared for it. Our last lab was last week, and it dealt with sorting. I find sorting interesting because it deals with complexity, and complexity is a topic I find quite interesting.

In our last lecture, we learned about dynamic programming. Dynamic programming is useful when recursion causes the same steps to be repeated often, as it did when we were writing the fib function. We also learned about memoization, which lets us save the results of recursive calls and then look them up instead of recomputing them. The idea makes quite a bit of sense, especially when implementing the fib function.

We also saw in lecture that recursion has limits, or more specifically, that there is a limit on the number of times a function can call itself. We learned that we can actually change this limit, which is called the maximum recursion depth. One question I had is why there is even a limit to begin with. Perhaps the computer can't process that many calls and might just end up exploding? Even though it sounds a bit ridiculous, I don't think it's all that far-fetched. But it turns out the real reason is to avoid stack overflow. Which leads to the question: what exactly does stack overflow mean? Stack overflow occurs when a program's chain of function calls uses up more memory than the call stack has room for. Maybe not as hectic as the computer blowing up into smithereens, but a stack overflow can crash the program.
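Here's a small sketch of what memoization does for fib, along with how to inspect and change the recursion limit (the exact code from lecture may have differed):

```python
import sys
from functools import lru_cache

def fib(n):
    """Naive recursive Fibonacci: recomputes the same subproblems over and over."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    """Memoized Fibonacci: each result is computed once, then looked up."""
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

print(fib_memo(100))             # returns instantly; naive fib(100) would take ages
print(sys.getrecursionlimit())   # the maximum recursion depth, 1000 by default
sys.setrecursionlimit(5000)      # raising it risks a real stack overflow
```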
It looks like this is going to be my last post, but the adventures in comp sci definitely never stop! I'm really looking forward to taking more comp sci courses and learning more about computers, programming, and everything in between. I'm especially looking forward to learning C/C++. This was an exciting year for me, and I'm especially grateful for having such an amazing prof. Thanks Dan!
Sunday, March 30, 2014
Sorting!
This week we learned about sorting. There are many different ways to sort a list, some more efficient than others. It's no coincidence that our knowledge of complexity can be applied to sorting.
Among the most common are selection sort, insertion sort, quicksort, and mergesort.
Selection sort works by repeatedly finding the smallest item in the unsorted part of the list and swapping it with the item at the current position.
For example, if we had the list [4, 5, 2, 1], it would start by looking at the whole list, finding the smallest number and swapping it with the first number:
1, 5, 2, 4
Then it would look at the list from index 1 to 3 and repeat the process:
1, 2, 5, 4
Look at the list from index 2 to 3:
1, 2, 4, 5
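A minimal selection sort sketch in Python (just the idea, not the exact code from lab):

```python
def selection_sort(lst):
    """Sort lst in place by repeatedly selecting the smallest remaining item."""
    for i in range(len(lst)):
        # Find the index of the smallest item in the unsorted part lst[i:].
        smallest = i
        for j in range(i + 1, len(lst)):
            if lst[j] < lst[smallest]:
                smallest = j
        # Swap it into the current position.
        lst[i], lst[smallest] = lst[smallest], lst[i]

lst = [4, 5, 2, 1]
selection_sort(lst)
print(lst)  # [1, 2, 4, 5]
```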
Insertion sort works by taking one element at a time from the unsorted part of the list and placing it in the correct position in the sorted part.
For example, given the same list [4, 5, 2, 1], it would compare the first two numbers and swap them if necessary; here they are already sorted:
4, 5, 2, 1
Then it would compare the 2 to the 5 and to the 4, and insert it into the correct spot in the sorted part of the list:
2, 4, 5, 1
Then insert the 1 into the right place:
1, 2, 4, 5
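And a matching insertion sort sketch, again just to illustrate the idea:

```python
def insertion_sort(lst):
    """Sort lst in place by inserting each item into the sorted part before it."""
    for i in range(1, len(lst)):
        item = lst[i]
        j = i
        # Shift larger items in the sorted part lst[:i] one slot to the right.
        while j > 0 and lst[j - 1] > item:
            lst[j] = lst[j - 1]
            j -= 1
        lst[j] = item  # drop the item into its correct position

lst = [4, 5, 2, 1]
insertion_sort(lst)
print(lst)  # [1, 2, 4, 5]
```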
Both insertion sort and selection sort are O(n^2), which isn't all that efficient.
Quicksort is usually much faster than insertion and selection sort (O(n log n) on average), but in the worst case it, too, is O(n^2).
I originally had some trouble understanding quicksort. Quicksort relies on recursion: it picks a *pivot*, uses it to divide the list into two sublists, and then carries out the recursion on those two sublists. How do we decide on a good pivot? Does it matter which pivot we choose? It does: if we always choose the leftmost element as the pivot, an already-sorted list produces the worst case. For shorter lists, choosing a random pivot is fine, because the list is small and the choice of pivot won't really affect efficiency. On the other hand, for a really large list, a random index might divide the list quite unevenly, so it would be better to choose something close to the median of the list.
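A short quicksort sketch with a random pivot (one of many ways to write it, and not in-place like the version usually shown in lecture):

```python
import random

def quicksort(lst):
    """Return a sorted copy of lst, partitioning around a random pivot."""
    if len(lst) <= 1:
        return lst
    pivot = random.choice(lst)
    smaller = [x for x in lst if x < pivot]
    equal = [x for x in lst if x == pivot]
    larger = [x for x in lst if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)

print(quicksort([4, 5, 2, 1]))  # [1, 2, 4, 5]
```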
Saturday, March 22, 2014
How Complex is Complexity?
This week we learned about complexity. Before learning about it, the name alone might suggest that it is a complex subject (because who wouldn't want to explicitly state the difficulty of a topic in its name?). It turns out its purpose is not to scare computer science students into thinking it is an extremely hard part of the course, but to determine how complex certain functions are, i.e., how efficient they are.
In lecture, we wrote a function that was to return all the anagrams of a word, or rather of a sequence of letters, since the input didn't necessarily have to be a word itself. There were many approaches, some more efficient than others. One approach was to generate all the permutations of the sequence of letters and then check which permutations were actual words in the dictionary. This is obviously not very efficient: a sequence of just 7 letters already has 7! = 5040 permutations, and the count grows factorially from there. The other approach was the signature approach, which I thought was incredibly clever! The signature approach takes advantage of the fact that all anagrams have the same letters, so if each sequence is arranged in one canonical way (say, sorted), it becomes immediately clear which words are anagrams of each other. As far as efficiency is concerned, this is much better.
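A rough sketch of the signature idea (the word list here is made up for illustration):

```python
def signature(word):
    """A word's signature: its letters in sorted order. Anagrams share a signature."""
    return ''.join(sorted(word))

def anagrams(letters, words):
    """Return every word in `words` that is an anagram of `letters`."""
    target = signature(letters)
    return [w for w in words if signature(w) == target]

words = ['stop', 'pots', 'spot', 'tops', 'opts', 'past']
print(anagrams('post', words))  # ['stop', 'pots', 'spot', 'tops', 'opts']
```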
We know a function is more efficient than another when its runtime is shorter than that of the other function. But there are other factors, such as the computer itself, and the compiler... which made me ask: what exactly is a compiler? It turns out the compiler is the reason our code actually runs and produces any results at all. It is basically a program that takes code written in a programming language and translates it into instructions the computer's processor can execute. (Strictly speaking, Python is compiled to bytecode and then interpreted, but the idea is similar.) Because runtime depends on these factors, if we want to judge whether a function is efficient, we should look at how the function behaves as its input grows. We learned in lecture that a step is defined as a computation done in a fixed amount of time. So to determine a function's efficiency, we only need to figure out how many steps the function takes as a function of its input size.
Of all the subjects we've learned so far in the course, complexity really challenges what we know. What I mean is that it is not enough to write code that works; you should write code that is efficient. That can often prove quite challenging. So I guess complexity is complex after all.
Saturday, March 8, 2014
Hunting High and Low (in Trees, That Is)
This week we learned about trees, yet again. But not just trees: binary search trees. We already know what a binary tree is, a tree where each node has at most two children. So what is a binary search tree? A binary search tree helps us find values efficiently in a set of data without going through the whole tree! How does that work? It works because of the way binary search trees are organized: for every node, everything less than the node's value goes in its left subtree, and everything greater goes in its right subtree, starting at the root. Binary search trees are most efficient when they are balanced, so that most nodes have two children. This makes sense, because each comparison then eliminates about half of the remaining tree. But even though a binary search tree is supposed to be efficient, it doesn't always end up that way. For example, if every node has only a left child, or only a right child, searching degenerates into going through the entire set of data, which is exactly what we don't want!
It's clear why a binary search tree is an efficient way of searching through a set of data, but wouldn't the data first need to be set up as a binary search tree in order to search it efficiently? If so, how efficient is the process of converting a set of data to a BST? Given a list of data, one would insert the first element into the tree as the root, then insert each later item recursively: go left if it is smaller than the current node, right if it is greater, until an empty spot is found. So it doesn't seem like it would be too hard!
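Here's a minimal sketch of that insert-and-search idea (the class and function names are my own):

```python
class BSTNode:
    """A node in a binary search tree."""
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(node, value):
    """Insert value into the BST rooted at node; return the (possibly new) root."""
    if node is None:
        return BSTNode(value)
    if value < node.value:
        node.left = insert(node.left, value)
    else:
        node.right = insert(node.right, value)
    return node

def search(node, value):
    """Return True iff value is in the BST rooted at node."""
    if node is None:
        return False
    if value == node.value:
        return True
    if value < node.value:
        return search(node.left, value)
    return search(node.right, value)

root = None
for v in [8, 3, 10, 1, 6]:
    root = insert(root, v)
print(search(root, 6), search(root, 7))  # True False
```

Built this way, each insertion averages O(log n) on a reasonably balanced tree, so constructing the whole tree averages O(n log n), though feeding in already-sorted data degrades it to O(n^2).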
Sunday, March 2, 2014
Linked Lists... again... and again!
So far, everything we've learned after recursion has close ties with recursion. Coincidence? Probably not. As the title suggests, linked lists are quite closely related to recursion, similar to trees. Initially, I did not understand what exactly the purpose of linked lists was. But thinking back to an earlier lab, it all became very clear. In that lab, we needed to implement the Queue class so that its running time would not depend on the size of the queue (we had been using a Python list to implement it). The answer to this is linked lists. A linked list consists of a head node, which holds a value, plus a reference to the rest of the list, which is itself a linked list. This is similar to the nodes-and-references representation of trees that we learned about not so long ago. My only real question about linked lists is how they actually help keep the running time constant. Since a linked list is made of references, it is easy to insert and delete: you just redirect a reference rather than shifting the actual elements around. Looking into the matter, I found that with linked lists you start by allocating space for just one element (a block of unused memory of sufficient size), and then you can simply link on more elements without any copying or relocating, hence the reduction in time.
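A sketch of a linked-list-backed queue with constant-time operations (not the exact class from the lab; the names are my own):

```python
class LLNode:
    """A linked list node: a value plus a reference to the rest of the list."""
    def __init__(self, value, rest=None):
        self.value = value
        self.rest = rest

class Queue:
    """A queue with O(1) enqueue and dequeue, via references to both ends."""
    def __init__(self):
        self.front = None
        self.back = None

    def enqueue(self, value):
        node = LLNode(value)
        if self.back is None:        # empty queue
            self.front = self.back = node
        else:
            self.back.rest = node    # link the new node after the old tail
            self.back = node

    def dequeue(self):
        node = self.front
        self.front = node.rest
        if self.front is None:       # queue became empty
            self.back = None
        return node.value

q = Queue()
for i in range(3):
    q.enqueue(i)
print(q.dequeue(), q.dequeue())  # 0 1
```

The key design point is keeping a reference to both ends, so neither enqueue nor dequeue ever has to walk or shift the list.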
This week's lab proved to be quite challenging, most likely because of my lack of preparation, and in part due to the stress of the midterm. Since the lab was so challenging, I should probably redo it and reread the notes on linked lists.
The midterm was quite fair: it wasn't too hard or too easy. Although I studied quite a bit for it, I found that I didn't do as well as I had hoped. This is a clear indication that next time I should prepare a study schedule in advance and, instead of just going over terminology, do harder practice problems that really require me to think.
Sunday, February 23, 2014
Latest Adventures!
Last week, we learned more about trees. After covering the basics, such as what trees are and the various definitions that come along with the data type, we needed to learn how to implement them (though we specifically learned how to implement binary trees only). The implementation includes things such as attaching roots and children, obtaining the root value, etc. We learned that the two most common methods of implementing trees are list of lists and nodes and references. The list-of-lists approach represents each binary tree as a list of three elements, where the first element is the root value, the second is the left subtree, and the third is the right subtree. The left or right subtree can of course be a binary tree of its own if it has its own children, hence the list-of-lists representation. The nodes-and-references method is quite similar, only no lists are used; instead, objects of a BinaryTree class represent the individual subtrees inside a tree.
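Roughly how the two representations look side by side (a sketch, not our exact lecture code):

```python
# List-of-lists: [root value, left subtree, right subtree], with None for no child.
tree_as_lists = ['a', ['b', None, None], ['c', None, None]]

class BinaryTree:
    """Nodes-and-references: each node is an object holding its two subtrees."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

# The same tree as above, built out of objects instead of nested lists.
tree_as_nodes = BinaryTree('a', BinaryTree('b'), BinaryTree('c'))
```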
Learning about the implementation of trees perhaps isn't the funnest thing ever, but it is interesting to notice that trees are recursive by nature. So naturally we can use the skills we learned when learning about recursion and apply them here!
Assignment one was definitely the most challenging thing this week. It did not include anything about trees, only stacks and recursion. It is perhaps the most challenging assignment we've done to date (including the assignments from CSC108). The first part was pretty straightforward, but figuring out how to implement an EFFECTIVE solution with not 3 stools but 4 proved to be quite a challenge. My solution lacked efficiency: it worked well up to 6 cheeses, after which it drastically slowed down. I tried several strategies to make it simpler, such as using the GUIController to manually work out solutions and look for a pattern, but unfortunately that didn't help much. What I did notice was that if n - i happened to be more than 3, then you would need the help of the starting stool to move all n - i cheeses to an intermediate stool. This made a lot of sense, and I proposed having two cases specifically for this. But the case where n - i is greater than 3 was, again, harder to figure out. Hopefully, I can ensure my success on the next assignment by starting earlier and perhaps thinking a lot harder.
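For reference, here's the classic three-stool recursion that the four-stool version builds on (the function name and stool labels are my own, not the assignment's):

```python
def move_cheeses(n, source, intermediate, destination):
    """Move n cheeses from source to destination using one intermediate stool."""
    if n == 1:
        print('Move top cheese from stool', source, 'to stool', destination)
    else:
        move_cheeses(n - 1, source, destination, intermediate)  # clear the way
        move_cheeses(1, source, intermediate, destination)      # move the biggest
        move_cheeses(n - 1, intermediate, source, destination)  # restack on top

move_cheeses(3, 1, 2, 3)  # prints the 7 moves for 3 cheeses
```

The four-stool version adds the extra choice of how many cheeses to park on a spare stool first, which is exactly the n - i split described above.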
Sunday, February 16, 2014
These Are Not Your Regular Trees!
This week we learned about trees. Not about your regular trees, but ones we use to organize data. They are perhaps called trees because they, too, branch out.
What I particularly hate about trees is all the terminology we had to learn. Some of it is straightforward, such as parent, root, and perhaps siblings. Other terms, such as branching factor and internal node, not so much. But after revising the terminology, it does start to make sense. Perhaps the biggest challenge this week was understanding the traversal orders, since trees can be represented as lists. There's preorder, inorder, and postorder traversal, all defined relative to the root of the tree, of course.
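The three orders in miniature (the Tree class here is a minimal made-up stand-in):

```python
class Tree:
    """A minimal binary tree node (the name and fields are my own)."""
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def preorder(t):
    """Root first, then left subtree, then right subtree."""
    return [] if t is None else [t.value] + preorder(t.left) + preorder(t.right)

def inorder(t):
    """Left subtree, then root, then right subtree."""
    return [] if t is None else inorder(t.left) + [t.value] + inorder(t.right)

def postorder(t):
    """Left subtree, then right subtree, then root."""
    return [] if t is None else postorder(t.left) + postorder(t.right) + [t.value]

t = Tree(2, Tree(1), Tree(3))
print(preorder(t), inorder(t), postorder(t))  # [2, 1, 3] [1, 2, 3] [1, 3, 2]
```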
This week our lab was based on Python idioms, something we covered a while back. I hadn't revised them at all prior to the lab, but despite that, I did not find the lab very difficult. I was able to use the Python shell to my advantage and figure out what the idiom statements did.
How are trees different from the other data structures we've learned? Unlike everything we've seen before, namely stacks and queues, trees are not linear structures; they are hierarchical. We can still represent them linearly as lists, hence the different traversal orders.