Nightmare before Christmas Code

A common sight in the enterprise world is the “implementation” of a new technology using the same methods as the old one.

It usually happens by doing things “the old way” with the new tech. One piece of “santa tech” that has been abused recently is “REST”. Many people rewrite their SOAP services and call the result “REST” without taking the time to understand the new technology’s paradigm. What’s left are long URLs with actions embedded in them, and GET calls that change data.

Let me tell you a story about a codebase that went through a “nightmare before christmas” situation.

The Nightmare

The team was elated. Another successful feature had been added to their ever more feature-rich codebase. As always, Jack, their lead architect, had led them to another wonderful success. Because they wanted their code to be “efficient”, it was all still in assembly. The team had to pull off 80-hour weeks in order to finish the change on time, but they did.

Let’s set this in the early 80’s.

Late one night, Jack ran across C. It was comparably performant to their assembly code, but its productivity gains were astonishing! No longer would developers have to worry about which registers held what. They could leave the pushing and popping to the compilers (and Madonna). Maybe those 80-hour weeks could fade back down to 40 and they could still compete with other products in their space.

Jack set out on his mission to request a migration to this new language. Given his reputation, management, although initially fearful, was quickly behind him. Jack could do no wrong, and this seemed like the right thing to do. Several developers were assigned to port the existing code over to the new technology. This had to be simple work, right? They just want the exact same functionality in a new language.

When this team of assembly programmers is assigned this migration task, they realize “C can be a lot like assembly”. You can take the data segment and turn it into variable declarations at the top of the module. All the subroutines that were defined in assembly can then be written as void functions that do actions on these variables. It’s almost a direct conversion!

They start by migrating a simple module from this assembly code:

    add     eax, 4
    mov     ebx, dword[eax + accounts]
    mov     eax, 0
    call    populate_accounts
    call    add_extra_account
    call    find_account_by_x
    sub     esp, 12
    mov     dword[esp], msg
    mov     dword[esp + 4], ebx
    call    _printf
    add     esp, 12
    mov     eax,0
section     .data
accounts    resd    30
msg         db      '%d',0xa, 0

Into C:

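#include <stdio.h>

int ax;
int accounts[30];
int account;
char * msg = "%d\n";
extern void populate_accounts();
void add_extra_account() {
    /* body not shown */
}
void find_account_by_x() {
    account = accounts[ax];
}

int main (int argc, char ** argv) {
    ax = 0;
    populate_accounts();
    add_extra_account();
    find_account_by_x();
    printf(msg, account);
    return 0;
}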

That didn’t seem so bad!

The team began to work, carefully creating global variables for all the things that were in the data segment before, and meticulously crafting functions that do exactly what the assembly had done before. And it worked exactly as before!

Management was excited! They finished the migration ahead of schedule! They’re working with a state-of-the-art language! But something seems wrong: they still have the same 80-hour weeks when they are adding new features to the code.

As the teams become more familiar with C, there are some measurable productivity gains in newer modules, but whenever parts require changes to this original code, timelines seem to expand to what they used to be before the migration. Oh well, they blame it on “legacy code” and hope for a rewrite.

Learn your Paradigm

In the story above, we can see how the essence of C is lost. All that’s left is the same spaghetti that the new programming language was supposed to save these developers from in the first place. I’m not saying this migration isn’t a step in the right direction; it saves them from a lot of tedious stack management. What it doesn’t do is actually make the final code all that much more readable.

It’s also a lie. This isn’t C code. It’s assembly code pretending to be C. Any C programmer is going to feel a little pang of sadness every time they have to touch this code.
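
For contrast, here is a rough sketch of what the same module might look like if it leaned on C’s own tools: parameters, return values, and local state instead of globals. The story never shows the real subroutine bodies, so everything inside these functions is an assumption for illustration, including the sample balances, the `MAX_ACCOUNTS` constant, and the names `find_account_by_index` and `report_account`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_ACCOUNTS 30  /* assumed: mirrors the 30-slot accounts array */

/* Hypothetical stand-in: fills in three made-up balances, returns the count. */
static size_t populate_accounts(int *accounts, size_t capacity) {
    assert(accounts != NULL);
    size_t count = 0;
    while (count < 3 && count < capacity) {
        accounts[count] = 100 * (int)(count + 1);
        count++;
    }
    return count;
}

/* Appends one account if there is room; returns the (possibly new) count. */
static size_t add_extra_account(int *accounts, size_t count,
                                size_t capacity, int value) {
    assert(accounts != NULL);
    if (count < capacity) {
        accounts[count++] = value;
    }
    return count;
}

/* Writes the account at `index` to *out. Returns 1 on success, 0 if the
   index is out of range. */
static int find_account_by_index(const int *accounts, size_t count,
                                 size_t index, int *out) {
    assert(accounts != NULL && out != NULL);
    if (index >= count) {
        return 0;
    }
    *out = accounts[index];
    return 1;
}

/* Runs the whole flow with local state only. Returns 0 on success. */
int report_account(size_t index) {
    int accounts[MAX_ACCOUNTS];
    size_t count = populate_accounts(accounts, MAX_ACCOUNTS);
    count = add_extra_account(accounts, count, MAX_ACCOUNTS, 42);

    int account;
    if (!find_account_by_index(accounts, count, index, &account)) {
        return 1;
    }
    printf("%d\n", account);
    return 0;
}
```

Calling `report_account(0)` from `main` prints `100` with the made-up data above. The point is not these particular functions. Each one now says what it needs and what it gives back, so a reader no longer has to trace a global `ax` through the whole module to know what a call does.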

The Dalai Lama once said, “Know the rules well, so you can break them effectively”. I used to think that this was a waste of time, but that was before I had the opportunity to be the victim of those who decide to not follow the rules; not out of malice, but out of ignorance. Trying to understand what is happening in a sizable codebase written in the subset of C shown above is just as hard as trying to understand the original assembly code.

It’s easy to spot these problems from the outside, but we who make these sorts of changes genuinely believe that we are doing things the right way. And we are, based on what we know of the world. It’s easy to ignore what others have done; we are so much smarter than them, anyways.

Save some time: slow down and learn about your new technology before integrating it into your current worldview. The results will pleasantly surprise you as your world expands.
