Peter Naur, a renowned Danish computer scientist, said that programming is not merely the creation of a program, but an activity by which the programmers form a theory of the matters at hand. If you haven’t read his famous paper, I encourage you to reserve some time to do that. In my opinion, complexity prevents us from reaching a complete theory of a particular program, and many software engineering communities are exploring ways to avoid or reduce it (for example, by applying functional programming techniques to app development).
I’ve been recently exploring the domain of program visualization, that is, software on top of a debugger, or even isolated programs, whose only purpose is to help the programmer understand their code better. I think that better tools in this area would help programmers form a better theory of a complex program, and may reduce the number of bugs that are introduced and the time we spend debugging. To give you a more concrete example, I think that Apple’s recent memory graph debugger is a very neat abstraction that helps immensely when you are trying to understand the resource ownership graph of your program. It beats raw source code exploration and documentation by any measure.
The other day, I decided to have some fun with Swift arrays. Swift arrays have standard sorting routines to sort them, either in-place or by returning a new instance. Imagine that you want to understand which sorting algorithm is implemented by the standard library, and what steps are followed for a particular array. What alternatives do you have?
- As the standard library is open source, you could simply read the implementation and try to understand the algorithms by yourself.
- You could use
sort(by:)and print the elements involved in each comparison.
The second bullet is what basic debugging is about, and I don’t know about you but I’ll probably need to have a pen and paper close by to draw the array, the elements, and understand how the sorting is happening, just like I did when I was studying sorting algorithms at university.
Can we do better? Any code that is heavily algorithmic certainly benefits from visualization. That’s why websites like https://visualgo.net are so popular among students and interested professionals. In order to show the contents of the array at each step, we may start searching GitHub for a good Swift image library. But there’s a poor man’s technique that can create images and even animated GIFs using plain
A very simple Swift program that sorts a random array of 64 elements is this:
The sorting closure is invoked with each pair of elements that the algorithm needs to compare so we could expand it to access the partially sorted array and create a PPM image that depicts its current state. The problem we will face is that the
samplearray is being modified in place, so any simultaneous access to it is going to be prevented by Swift’s runtime module that tracks exclusivity (read the memory ownership manifesto for more information). One way to avoid this is to use Swift 3’s compatibility model, which downgrades this exception into a warning, or use raw UnsafeMutableBufferPointers:
The program works as follows: Line 11 prints the required header for a PPM image in binary format, including the frame width, height, and “255”, which is the maximum intensity value.
Lines 12 to 27 and 14 to 26 iterate through the array and print each value in two possible ways:
- If the element is currently selected by the algorithm, print it in red.
- If not, print it in a shade of gray whose intensity depends on the element’s value (so that we can see how the array is being sorted).
Using ImageMagick to assemble a GIF from the sequence of images generated by the above program, the result is this:
Red squares represent the elements that are selected at each stage of the algorithm, and the different shades of gray represent the current value stored in each slot. The algorithm at the beginning is quicksort, a well-known sorting algorithm with a bad worst case whose time complexity is quadratic in the size of the array. To avoid this bad worst case, when the depth of the recursion tree is greater than a certain threshold the algorithm switches to heapsort, whose time complexity is O(n*logn) in the worst case. By inspecting the source code we can see that the threshold value is 2*floor(log(N)). Finally, when subarrays are small enough, the algorithm switches to insertion sort, which is very efficient for short arrays.
Swift’s sort is thus a hybrid algorithm that combines several well-known algorithms to avoid bad worst-case performance. The official name of this hybrid algorithm is introsort (invented in 1997), and the libc++ team is implementing the same algorithm for Clang. The number of frames in the GIF animation, 407, gives us the idea that the time it took to sort this random array of 64 elements is roughly proportional to O(n*logn), which is asymptotically optimal for general sorting algorithms that compare elements. However, there are instances that trigger bad performance from Swift’s sorting algorithm. Can you think of one? Do you want to play evil and use this technique to show what happens in those worst cases? 🙂