Friday, December 18, 2015

MIT 6.006 Intro to Algorithms

Resources:

Thoughts on the lectures:


All in all, I felt really good about the lectures.  My immediate impression, particularly early on, was that there was a lot of overlap with Berkeley's 61B, and I worried the course might not be a good use of my time, but ultimately I think the two complement each other very well.  Taking 61B first was definitely helpful.


Very broadly, the topics covered were:

  • Data Structures:
    • Heaps
    • BST, AVL Trees
    • Directed Acyclic Graphs (DAGs)
  • Numerics:
    • Karatsuba
    • Newton's Method (a quick sketch follows this list)
  • Sorting:
    • Comparison model: Insertion sort, Quick sort, Merge sort
    • Non-Comparison:  Counting and Radix Sort
  • Graph search:
    • BFS (Breadth first search)
    • DFS (Depth first search)
  • Graphs:
    • Dijkstra (shortest path)
    • Bellman-Ford (shortest path with negative edge weights)
  • Dynamic Programming
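
The numerics lectures stuck with me, so here's a minimal sketch of the classic application covered there: computing square roots with Newton's method, iterating x ← (x + a/x)/2, which is Newton's iteration applied to f(x) = x² − a.  The class and method names are my own:

```java
// Newton's method for sqrt(a), a > 0: quadratic convergence means the
// number of correct digits roughly doubles with each iteration.
public class NewtonSqrt {
    static double sqrt(double a) {
        double x = Math.max(a, 1.0);            // any positive starting guess works
        for (int i = 0; i < 100; i++) {
            double next = 0.5 * (x + a / x);    // Newton step for f(x) = x^2 - a
            if (Math.abs(next - x) < 1e-12) break;
            x = next;
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(sqrt(2.0));          // 1.4142135623730951
    }
}
```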

Throughout the lectures there was continual discussion of asymptotic complexity and emphasis on how it is computed.  In addition to the 24 lectures, they also provided videos for the sections (recitations), which went over the concepts in more detail.  The recitations were conducted by a TA but were very valuable, as they worked through example problems to help bring home the higher-level concepts discussed in lecture.

Thoughts on the readings:

Portions of the following chapters were assigned as reading from CLRS (about 500 pages total):

1 - Role of Algorithms (11pp)
2 - Insertion sort, Design and Analysis (27pp)
3 - Asymptotic notation (22pp)
4 - Divide and conquer (49pp)
6 - Heapsort (19pp)
8 - Counting & Radix Sort (22pp)
10 - Stacks, Lists, Rooted trees (21pp)
11 - Hash tables (33pp)
12 - Binary Search Trees (22pp)
13 - Red-Black Trees (31pp)
14 - Augmented data structures (18pp)
15 - Dynamic Programming (55pp)
17 - Amortized analysis (30pp)
22 - BFS, DFS, Topo sort (35pp)
24 - Dijkstra, Bellman-Ford (41pp)
34 - NP-Completeness (58pp)

My first impression was a bit of a surprise at how easy the book was to read (it got plenty harder). I was worried it would be excessively dry and put me to sleep (it is a textbook, after all).  One of the early exercises (Problem 1-1, at the end of Chapter 1) asks you to fill out a table with the largest problem size n that can be solved within a given timeframe, for each of several running times f(n), assuming one operation per microsecond.  I found the results to be quite edifying, so I've shared them below:

[Table: the largest size of n that will complete within the specified time, for each running time f(n)]
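
In case it's useful, here's a small Java sketch that computes entries like these (the class and method names are my own): given a running time f(n) in microseconds and a time budget, it probes upward by doubling and then binary searches for the largest n that fits.

```java
import java.util.function.DoubleUnaryOperator;

// Largest problem size n solvable in a given time budget, assuming the
// algorithm takes f(n) microseconds (CLRS Problem 1-1 style).
public class LargestN {
    // Largest n with f(n) <= budget, assuming f is nondecreasing.
    static long largestN(DoubleUnaryOperator f, double budget) {
        long hi = 1;
        while (f.applyAsDouble(hi) <= budget) hi *= 2;   // exponential probe
        long lo = hi / 2;
        while (lo < hi - 1) {                            // binary search
            long mid = lo + (hi - lo) / 2;
            if (f.applyAsDouble(mid) <= budget) lo = mid; else hi = mid;
        }
        return lo;
    }

    public static void main(String[] args) {
        double second = 1e6;  // one op per microsecond => 10^6 ops in a second
        System.out.println("n      : " + largestN(n -> n, second));                              // 1000000
        System.out.println("n lg n : " + largestN(n -> n * Math.log(n) / Math.log(2), second));  // 62746
        System.out.println("n^2    : " + largestN(n -> n * n, second));                          // 1000
        System.out.println("2^n    : " + largestN(n -> Math.pow(2, n), second));                 // 19
    }
}
```
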
Reading Chapter 3, one of the exercise questions asked whether 2^(n+1) and 2^(2n) are O(2^n).  Truth be told, I initially misunderstood the math involved and thought both were true, but some googling and checking the instructor's manual revealed my mistake: 2^(n+1) = 2·2^n, which is O(2^n), but 2^(2n) = (2^n)^2 = 4^n, which outgrows any constant multiple of 2^n.  I had been assuming you could drop a constant anywhere, so I was playing around with the exponent n rather than a constant factor on the whole function.  I think the confusion came from seeing facts like 2n + 1 = O(n).  If the question had instead asked whether both functions were 2^(O(n)), then both answers would have been yes.  I have Stack Exchange to thank for that insight.  I found a function plotter online and did some experimenting, and what really made the relationship obvious was switching to a logarithmic scale for the y axis: on a log scale, 2^(n+1) is just 2^n shifted up by a constant, while 2^(2n) rises with twice the slope.


I think this excerpt from the instructor's manual is a useful nugget to keep in mind when thinking about asymptotic complexity:

  • Exponential functions grow faster than polynomial functions, which grow faster than polylogarithmic functions.
  • The base of a logarithm doesn't matter asymptotically, but the base of an exponential and the degree of a polynomial do matter.

Chapter 4, Divide and Conquer, opened with an algorithm for the maximum-subarray problem, which reminded me of a HackerRank challenge I hadn't tried yet.  I implemented the pseudo-code in Java, thinking I could tweak it to meet the requirements of the challenge (the task was to find the maximum subarray sum modulo some number).  That turned out to be a flawed approach, but I eventually found an algorithm that worked.
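
For reference, here's a minimal Java sketch of the divide-and-conquer maximum-subarray algorithm, following the structure of the CLRS pseudo-code (the class and method names are my own):

```java
// Divide-and-conquer maximum subarray sum, O(n log n).
public class MaxSubarray {

    // Largest sum of any subarray that crosses the midpoint mid.
    private static long maxCrossingSum(int[] a, int lo, int mid, int hi) {
        long leftBest = Long.MIN_VALUE, sum = 0;
        for (int i = mid; i >= lo; i--) {          // grow left from mid
            sum += a[i];
            leftBest = Math.max(leftBest, sum);
        }
        long rightBest = Long.MIN_VALUE;
        sum = 0;
        for (int i = mid + 1; i <= hi; i++) {      // grow right from mid+1
            sum += a[i];
            rightBest = Math.max(rightBest, sum);
        }
        return leftBest + rightBest;
    }

    // Largest sum of any non-empty subarray of a[lo..hi].
    public static long maxSubarraySum(int[] a, int lo, int hi) {
        if (lo == hi) return a[lo];                // base case: one element
        int mid = (lo + hi) / 2;
        long left = maxSubarraySum(a, lo, mid);
        long right = maxSubarraySum(a, mid + 1, hi);
        long cross = maxCrossingSum(a, lo, mid, hi);
        return Math.max(Math.max(left, right), cross);
    }

    public static void main(String[] args) {
        // The example array from CLRS Figure 4.3; best subarray sums to 43.
        int[] a = {13, -3, -25, 20, -3, -16, -23, 18, 20, -7, 12, -5, -22, 15, -4, 7};
        System.out.println(maxSubarraySum(a, 0, a.length - 1)); // 43
    }
}
```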

While reading Chapter 13 (Red-Black Trees), I wanted a better visualization of the process (the descriptions in the book are hard to picture), so I Googled "Red Black tree visualization" and found a page on the University of San Francisco CS department's website that has visualizations for many different algorithms.  I've also found the instructor's manual very handy when I'm curious about the answer to an exercise (though it's hit or miss).
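
The operation I most wanted to see animated was rotation, the primitive that red-black insertion and deletion are built on.  Here's a bare-bones left-rotate sketch along the lines of the CLRS pseudo-code (the Node class is a minimal stand-in of my own):

```java
// A bare-bones BST node with parent links, as rotation requires.
class Node {
    int key;
    Node left, right, parent;
    Node(int key) { this.key = key; }
}

public class RotateDemo {
    static Node root;

    // Left-rotate around x: x's right child y takes x's place in the tree,
    // and x becomes y's left child.  y must not be null.
    static void leftRotate(Node x) {
        Node y = x.right;
        x.right = y.left;                      // y's left subtree moves under x
        if (y.left != null) y.left.parent = x;
        y.parent = x.parent;                   // splice y into x's old position
        if (x.parent == null) root = y;
        else if (x == x.parent.left) x.parent.left = y;
        else x.parent.right = y;
        y.left = x;                            // x hangs off y's left
        x.parent = y;
    }

    public static void main(String[] args) {
        Node x = new Node(1);
        x.right = new Node(2);
        x.right.parent = x;
        root = x;
        leftRotate(x);
        System.out.println(root.key);          // 2: y is now the root
    }
}
```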

As I got toward the end of the reading, I found myself skipping longer and longer sections.  There are only so many proofs I could read.  I finished everything listed above except the NP-Completeness chapter, though I will probably go back at some point and finish that one too.

Khan Academy:

While reading CLRS, I visited the website and was surprised to see a link to the Khan Academy algorithms course.  Cormen (the "C" in CLRS) collaborated on the course.  It is more of a crash course (I finished it in about... oh, maybe 5-6 hours), but it had some interesting coding exercises and projects.  The fractal art project in particular was pretty cool.  I wanted to push myself, so I did the "dragon curve" L-system, and was pretty pleased with the result:


The course covered asymptotic notation, sorting (insertion, selection, merge, quick), recursion, graph representation (edge list, adjacency matrix, adjacency list), and BFS, though graphs and BFS got less attention than sorting and recursion.  Definitely a worthwhile exercise.
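
For the graph portion, here's a quick sketch of the adjacency-list representation plus BFS, the pairing both the Khan course and CLRS Chapter 22 use (the names are my own):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// BFS over an adjacency-list graph.
public class Bfs {
    // dist[v] = edges on a shortest path from s to v, or -1 if unreachable.
    static int[] bfs(List<List<Integer>> adj, int s) {
        int[] dist = new int[adj.size()];
        Arrays.fill(dist, -1);
        Queue<Integer> queue = new ArrayDeque<>();
        dist[s] = 0;
        queue.add(s);
        while (!queue.isEmpty()) {
            int u = queue.remove();
            for (int v : adj.get(u)) {
                if (dist[v] == -1) {           // first visit = shortest distance
                    dist[v] = dist[u] + 1;
                    queue.add(v);
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        // 4-node undirected graph with edges 0-1, 0-2, 1-3.
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < 4; i++) adj.add(new ArrayList<>());
        int[][] edges = {{0, 1}, {0, 2}, {1, 3}};
        for (int[] e : edges) { adj.get(e[0]).add(e[1]); adj.get(e[1]).add(e[0]); }
        System.out.println(Arrays.toString(bfs(adj, 0)));  // [0, 1, 1, 2]
    }
}
```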

Problem Sets: 


While I do plan on circling back around to the MIT problem sets, I am burned out on algorithms for now.  Between the lectures, the reading, Khan Academy, and something like 50+ HackerRank challenges in the Algorithms domain, I'm ready to try my hand at something else for a while... like maybe finishing the Berkeley Software Engineering course (I still haven't finished the reading or homework for that class lol).
