Friday, October 22, 2010

Quick but not dirty?

A fine write up on "Taco Bell" programming. What does that mean?
"Every item on the menu at Taco Bell is just a different configuration of roughly eight ingredients. With this simple periodic table of meat and produce, the company pulled down $1.9 billion last year. ... Taco Bell Programming is about developers knowing enough about Ops (and Unix in general) so that they don't overthink things, and arrive at simple, scalable solutions."
The general argument is that every line of code you write is a liability, since it's another line to understand and maintain in the future. The more functionality you get from fewer lines of code, the better - and modulo readability, I think that makes a lot of sense.

As an example, the author offers this snippet as a way to manage a web crawler:
find crawl_dir/ -type f -print0 | xargs -n1 -0 -P32 ./process
It works, but it's dense - much like regular expressions. And like regular expressions, I view this sort of coding as a great way to get things working, but it's a mistake to assume it will also be significantly easier to maintain. You're still compressing the same concepts into fewer lines, and as a result this single line may be harder for somebody to update than a 10 or 20 line script that does the same thing.
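
For comparison, here is a rough sketch of what such a longer script might look like, written in Python. It is not from the original write-up: the function names (find_files, run_one, process_all) are made up for illustration, and using a thread pool to stand in for xargs -P32 is just one reasonable choice among several.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def find_files(root):
    """Yield every regular file under root, roughly like `find root -type f`."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # find's -type f does not match symlinks, so skip them here too.
            if os.path.isfile(path) and not os.path.islink(path):
                yield path


def run_one(command, path):
    """Run the command with a single file argument, like `xargs -n1`."""
    return subprocess.run(command + [path]).returncode


def process_all(root, command, workers=32):
    """Run command once per file, at most `workers` at a time, like `xargs -P32`."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker thread just waits on a child process, so threads suffice.
        return list(pool.map(lambda p: run_one(command, p), find_files(root)))
```

Calling process_all("crawl_dir/", ["./process"]) should then do roughly what the one-liner does and hand back a list of exit codes. It is about twenty lines instead of one, but each of the three ideas packed into the pipeline (walk the tree, one file per invocation, 32 at a time) gets its own name and a place to hang a comment.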

Of course, it all really depends on who the "somebody" is - if you're a regular expression veteran, you can hack around on those just fine. If you grok xargs and awk and sed and tr and ..., then shell hacking will by definition be something you'll do and probably enjoy.

So what's the best way? I'm left with the banal tautology that all generalizations are false - this writeup makes an excellent argument in favor of using the standard Unix toolchain, and I agree strongly that any programmer would be well served to understand the stack on which they program.

But I also took a brief Python course where the instructor stated his goal as causing us to never write another shell script. In his view, Python had an adequately low entry barrier (in terms of boilerplate, documentation, etc.) but substantially higher readability than shell scripts, not to mention portability (Python is Python, but bash != tcsh != zsh != ...).

I guess I'm left to conclude with an even more banal tautology that hard work is hard, and there is no shortcut. There is, of course, good design, and a big part of good design is using tools appropriate for your job. Sometimes that's the toolchain, sometimes that's assembly - it really just depends on what you're doing and why.
