Thursday, March 11, 2010

Why Vim is great (Exhibit 1: macros across multiple buffers)

I'll give a short introduction to my Vim experience in the next paragraph, and in the paragraph after that I'll describe a real-life problem that I recently solved using this Vim "trick". If you don't care about that and just want to see the "trick", you can skip to paragraph four (it should be more or less self-contained).

I've been using Vim on a daily basis for about four years now, and I'm really happy that I took the time to learn it and stuck with it when it was rough. There used to be a first-semester class at my college that had a lab on Vi basics, but it wasn't taught in a meaningful way. Students were simply left to "try it out", which resulted in people trying to type something and "getting stuck" in some mode without knowing how to do anything. In the end, everyone (including me) hated it and thought it was an old editor that nobody should ever use. Later, I tried getting into it a few times and finally made it stick. I have by no means mastered it, but I feel pretty comfortable using it, and I hope I can show you a few tricks that are neither too basic nor too well known (I'm hoping to make a series of posts out of this).

I recently started maintaining a site (pro bono :) that was abandoned around 2007. The whole thing is mostly plain HTML with a few PHP scripts here and there. Unfortunately, it is structured in a way that requires a lot of duplicated effort, since most files are referenced in several places. Since updates are not very frequent (about one a week) and there is a lot of important existing content, I decided (for now) not to port it to a CMS and to stick with the old system. I've since developed a few scripts and tools that make the update process pretty simple. The central items of the site are trip reports, each of which includes some text and a link to a picture gallery. We now use Picasa to host the pictures. Every report includes a link to the appropriate album (or several) at the bottom. This consists of a clickable thumbnail of the album cover and a clickable album name below it. Both of these links open the album in the same window (no target attribute). I got complaints that once you start browsing through the gallery, there's no easy way to get back to the main site. I hadn't noticed this because I have a habit of always using right click->open in new tab on all links. Unfortunately, Picasa doesn't let you add a link back to your site from an album (from searching around, it seems to be a frequently requested feature). I decided my best option was to simply add target="_blank" to every gallery link so that it would open in a separate window. Now, since I'd uploaded two years' worth of content (a lot of reports), there were a lot of links to fix.

So on to Exhibit 1: I have a ton of files that all contain two links to an album hosted on Picasa, and I want those links to open in a new window, i.e. I need to add target="_blank" (there are other ways to fix this, but they all require at least some change to the link HTML). I remembered seeing a video a few months ago about a similar problem and how it could be solved with Vim. I downloaded all the files into a directory and opened them all with vim *. Then I started recording a macro into register a (with qa). The goal was to somehow identify the gallery links. Fortunately, this turned out to be really easy, since both of these links start with <a href="http://picasaweb. So I ran a simple substitute command:

:%s#<a \zs\zehref="http://picasaweb#target="_blank" #g

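The only nontrivial parts of this command are \zs and \ze, which mark where the match starts and ends. Placed back to back, they make the match empty, so the replacement text (the target attribute) is inserted at that exact spot and nothing is overwritten.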
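For readers more at home in a scripting language, here is a rough Python sketch of the same zero-width idea. It is not what I ran, just an illustration: the lookbehind and lookahead stand in for \zs and \ze, and the function name and the sample link are made up for the example.

```python
import re

# Zero-width match: the position right after "<a " and right before the
# Picasa href. The lookbehind/lookahead play the role of \zs and \ze.
GALLERY_LINK = re.compile(r'(?<=<a )(?=href="http://picasaweb)')


def add_target(html):
    """Insert target="_blank" into every Picasa gallery link in html."""
    return GALLERY_LINK.sub('target="_blank" ', html)


# Made-up sample link, for illustration only.
sample = '<a href="http://picasaweb.example/album">Album</a>'
print(add_target(sample))
```

Links that don't point to picasaweb are left untouched, because the lookahead fails at those positions.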

And now the "magic" part. After applying this command, I typed :wn to write the changes and move to the next file, and pressed q to stop recording. At this point, I looked at the number of remaining buffers and applied the macro that many times (50@a). The screen started blinking as Vim crunched away, and after a second or so it stopped with the message "Cannot go beyond last file", which is expected because :wn has no next file to move to after the last one.

And that was it. The whole thing took me about a minute. Sure, I could have written a script that did the same thing, but that would probably have taken a few more minutes, and, after all, Vim was made for this sort of thing. I tried searching for the video that taught me this (or rather mentioned it; once you hear about it, it is very obvious) to give due credit, and managed to find it here (it's by Derek Wyatt). Till next time, happy Vimming :).
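For comparison, a script version of the same batch edit might look roughly like the Python sketch below. This is an assumption about how one could do it, not something I actually used. The function name is invented, and it works on raw bytes so that file encodings and line endings are left exactly as they were.

```python
from pathlib import Path

# Plain byte-string replacement; no regular expression needed here.
OLD = b'<a href="http://picasaweb'
NEW = b'<a target="_blank" href="http://picasaweb'


def fix_gallery_links(directory):
    """Add target="_blank" to Picasa links in every file in directory.

    Returns the number of files that were actually modified.
    """
    changed = 0
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        data = path.read_bytes()
        fixed = data.replace(OLD, NEW)
        if fixed != data:
            path.write_bytes(fixed)
            changed += 1
    return changed
```

Calling fix_gallery_links on the download directory would do the whole job in one go. Running it a second time changes nothing, because the old link prefix no longer occurs in files that have already been fixed.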

Monday, February 22, 2010

Top 25 Most Dangerous Programming Errors and what we're doing about them in education

The other day I stumbled upon this list of the 25 most dangerous programming errors. The list was reportedly compiled based on data from 20 different organizations. Of course, that by itself doesn't tell us much and by no means implies that the list is "correct", but it is broadly in line with what is commonly discussed in the computer security literature.

The point of this post is not to discuss these errors in detail, but to see what we can do about them in college CS education. The common denominator of most of these errors is the assumption that user input is safe and friendly, yet I feel this is not talked about nearly enough during education. At the same time, the last two items on the list (broken/risky crypto algorithm and race conditions) are something we devote a lot of time to. Specifically, a large part of the operating systems course deals with avoiding race conditions, and we have a specialized course on cryptography. Additionally, there's a post-graduate computer security course that likewise fails to address any of the errors in the "user input" category and instead focuses on cryptography and authentication. To make matters worse, the "user input" errors are much easier to understand and deal with than these more "advanced" errors.
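To make the "user input" category a bit more concrete, here is a minimal Python sketch of one error of this kind, SQL injection, picked only as a representative example. The table, the user names, and the helper functions are all invented for illustration; the point is just how small the difference between the broken and the fixed version is.

```python
import sqlite3


def make_demo_db():
    """Create a throwaway in-memory database with two made-up users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: the input is pasted straight into the SQL text,
    # so a quote character in it can change the query itself.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # The ? placeholder makes the driver treat the input purely as data.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()
```

With an input such as x' OR '1'='1, the unsafe version returns every row in the table, while the safe version returns nothing, since no user has that literal name.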

There are other aspects to this that I won't go into now, but I firmly believe that every person with a CS college degree should know about all of these errors, so we need to start covering them all in class (this is too important to be left for self-study).