Geet Duggal

Explorations in the Computer and Natural Sciences

Git on that train

Warning: As the title suggests, I’m on a train.

For the last week and a half or so I’ve been using Git for a project with coworkers, and most recently for my own code and text files. I was a bit skeptical at first. But after going through their excellent documentation, watching the videos, and, most importantly, doing a lot of tinkering, I’m realizing that it’s making life better for me.

There are a lot of resources that compare SCMs, so I don’t want to worry here about which is better and why. But I’d like to share two things that I’ve really liked about using Git.

  1. First off, using something like Git doesn’t have to be for a big collaborative project. This may be a different take than the usual, but I like to see it as a tool that helps me see different “views” of my files depending on the job. Anyone who’s written a lot of text or code realizes that it’s actually quite hard to make things as modular as one would like, and that sometimes we’re relegated to grabbing snippets of text here and there rather than making black boxes. While this is not encouraged as an engineering practice, it’s sometimes very useful for play. With Git, I’m less concerned about making all my source code fit in a super-consistent modular framework and more concerned with doing a task cleanly. This is possible because I can branch and merge with relative ease (which I understand is a pain in the ass with other SCMs). In a branch I can move some files around and do whatever I want without affecting the branch I started from. I can then cherry-pick the commits I like from it into another branch. These branches can look wholly different, but the code can still be updated. This flexibility makes me worry less about organization and directory structure and more about just choosing the right, minimalistic view for the job. I now see things in terms of diffs and commits, and Git provides the machinery to do real work with them.
  2. Git is minimalistic, local, and fast, which is great. It’s a small source code base that compiles quickly and gives me handy command line utilities. Proper use of them = power (though even with good documentation there’s a learning curve involved). Unlike a lot of SCMs, Git is designed to be local. While it can do a lot of stuff over the network, its modus operandi is in a local repository (which is just one .git/ directory in your project’s root directory). I, in fact, don’t even use the SSH/SSL features layered on top. Git helps me see what I’ve changed and worked on, and I build patches from there. You can email them to whoever, and applying them is easy.

And if I’m frustrated with some apparent inadequacy, it’s likely I can find some post with Linus himself justifying it with a little intelligence (e.g.).

Written by Geet

February 14, 2009 at 12:26 am

R project map-reduce-filter tidbit

The R-project is a great tool set for statistical computing (this is its speciality) and even just to have around for quick calculations, plots [1], and data manipulations. The community is large and the source is open. It provides a nifty Unix-like environment to work in and is available on three major operating systems. </advertisement>

Because of the large community size and the highly interpreted nature of the language, there is definitely an “impure” feeling about using it: some packages have procedures that call their own C or Fortran code, while others use the language directly. I personally like to see it as a good platform for exploratory data analysis and prototyping ideas, and like to leave the heavier lifting to something... different.

That said, since its internals are kind of Scheme-like, the expressiveness of the language for data manipulation in particular can be quite handy [2]. The introduction describes a function “tapply” which is useful for applying functions to groups of items with the same category. In that neighborhood there is also “lapply/sapply/mapply” [3] which are like the traditional “map” function. “subset” is very much like the traditional “filter” function.
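To make the “apply a function to groups of items with the same category” idea concrete, here is a minimal sketch of it. It is written in Python rather than R, and it is only an illustration of the grouping idea, not R’s actual tapply implementation. The helper name group_apply and the sample data are made up for the example.

```python
from collections import defaultdict
from statistics import mean


def group_apply(values, categories, fn):
    """Apply fn to the values that share each category.

    Hypothetical helper: a rough analog of the grouping idea behind
    tapply, not a port of the R function.
    """
    groups = defaultdict(list)
    for value, category in zip(values, categories):
        groups[category].append(value)
    # Sort the category labels so the result order is reproducible.
    return {category: fn(groups[category]) for category in sorted(groups)}


# Made-up sample data: one measurement and one category label per item.
heights = [150, 160, 170, 180]
species = ["a", "b", "a", "b"]

group_means = group_apply(heights, species, mean)
print(group_means)
```

The function passed in (here mean) can be anything that accepts a list, which is what makes this style handy for quick data manipulation.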

Not-so-advertised are the “Map”, “Reduce”, and “Filter” functions (tricky little capital letters). The differences between these traditional FP functions and the two R analogs from the previous paragraph (the apply family and “subset”) are mostly conveniences for the way R treats its data.
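For readers who haven’t met the traditional versions, here is a small sketch of map, filter, and reduce chained together. It uses Python’s built-ins rather than R’s capitalized functions, so treat it as an illustration of the general pattern only; the variable names and numbers are invented for the example.

```python
from functools import reduce
from operator import add

numbers = [1, 2, 3, 4, 5, 6]

# filter: keep only the items that satisfy a predicate.
evens = list(filter(lambda n: n % 2 == 0, numbers))

# map: apply a function to every item.
squares = list(map(lambda n: n * n, evens))

# reduce: fold the items into a single value, starting from 0.
sum_of_squares = reduce(add, squares, 0)

# map over several sequences at once, pairing items by position
# (the multi-argument flavour that footnote [3] mentions for mapply).
pairwise_sums = list(map(add, [1, 2, 3], [10, 20, 30]))

print(evens, squares, sum_of_squares, pairwise_sums)
```

Three short steps go from a raw list to a summary value, which is the “few lines of code” appeal of this style.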

If you use R or are interested in experimenting with it, keep these functions in mind because they can make just a few lines of code do some pretty awesome things.

——
[1] Gnuplot and matplotlib are also pretty good open source alternatives for plotting, and there are of course many non-open source options. In my opinion, experimentation with R is definitely worth the time if you’re playing with plots and willing to side-step a bit from the Python bandwagon, even though Python does have some well-developed statistical and scientific computing tools.

[2] See their “Manuals” section for some good introductory documentation and language definition. Many scripting languages these days support higher order functions.

[3] “mapply” is a neat sort-of multivariate map: it applies a function across several vectors in parallel.

Written by Geet

June 21, 2008 at 6:39 pm

Jot down that melody

Have you ever conjured up a simple tune in your head but felt like big hammers were a bit inappropriate for jotting the melodic thought down? There exists a wonderful thing called LilyPond, which takes a TeX-like syntax for music composition and not only renders a neat little PDF or PostScript of the score, but can also create a MIDI file of the tune. You can play that file very easily with a program like Timidity, which lets you avoid the sometimes tedious configuring of hardware and software for MIDI synthesis that comes second nature to more seasoned PC audio folks. Here’s a simple example of how easy it is:

Write a super-sophisticated melody to some file called test.ly:

\score {
{ c' e' g' e' }
}

Now, just run:

$ lilypond test.ly

and it will produce a test.pdf (and PostScript) file containing the engraved four-note score.

Now to create a MIDI file simply add a directive to your file:

\score {
{ c' e' g' e' }
\layout { }  % keep the PDF output alongside the MIDI file
\midi { }
}

and rerun the command above. It will produce a MIDI file. If you’d like to hear it with software-synthesized goodness:

$ timidity test.midi

Timidity can, of course, output to WAV or OGG and whatnot for you.

Written by Geet

May 20, 2008 at 2:51 am