Core Application Redesign

This is one of several stories about cool stuff I’ve done. See the Portfolio Intro post for more info.

Ok, rule #1 of software development is “Don’t rewrite a working application.” Joel Spolsky, of joelonsoftware and stackoverflow fame, can tell you why. He’s absolutely right, for all the reasons he gives. In short, while the old software may be messy, it’s got years of hard-won business knowledge and performance optimizations packed into it. Starting over from scratch means having to re-learn all of that.

So yeah, I’ve broken rule #1, except that I actually pulled it off. It worked because I knew what I was doing, I did it for the right reasons, I was careful, and I was in a very unusual situation where a lot of the reasons behind rule #1 didn’t apply. I did it on my own time, and without putting the company at risk. I felt my managers’ pain, thought like the competition, and learned from the customer.

The Itch

The company in question had been founded about a year before I joined, and their core software had already been written. It was very sophisticated, which was cool at first, but over time its complexity became a real burden. The application was slow in ways that were inherent in its design. Setting up a new customer remained an ill-defined, highly custom, and error-prone process. New hires took several months to get up to speed, and even then would often run into problems they’d have to punt to the senior developers. And it was just flat-out frustrating to work with.

This all came to a crisis when I’d been there about three years. The company had grown and changed, and the original staff started leaving. They were a pretty close team, so it was a given that we would lose them all within a few months. That meant not only fewer people to work on projects, but also more overhead time that we’d need to spend hiring and training new people. And some institutional knowledge—quirks about the product—would be lost. On top of that, we were getting more and more pressure from our customers about its performance issues.

Faced with all that, I started thinking about how we might do this whole thing differently. The original concept had been “Information Sharing”: The user has an incomplete piece of information, and they send it out into a distributed, self-discovering network of intelligent agents which build onto it with relevant data from various sources, creating a unified structure that can then be sliced and diced for display back to the user. Very abstract and sophisticated. It involved a lot of clever code for pattern matching and coordinating the efforts of the agents.

Rather than just writing new software to do the same thing, I took a hard look at what our customers actually needed. I’d worked closely with several of them, and was familiar with all of our existing projects. The question I came up with, the one that really focused my efforts, was, “How would our competition do this?” If a new company were starting from scratch, rather than duplicating our technology, how would they design something to meet our customers’ needs?

The Fix

The more I thought about it, the more I realized that our software was doing a whole lot of stuff that didn’t provide any real business value to our customers. In truth, their needs were fairly simple. They had some identifying information for a person (or less frequently, an object) and they wanted to see all of the related documents from their various data sources. Put that way, the problem isn’t cutting-edge “Information Sharing”, it’s enterprise document search and retrieval. That’s a problem faced by almost any large organization, particularly companies which have grown by acquisition. There are a lot of people working on it, and a number of standard tools and technologies which can be applied to it.

In coming up with a new design, I was simultaneously trying to address three different sets of goals: technical issues, management concerns, and customer needs. Fortunately, these were all pointing in the same direction: a radically simpler design using standard tools and technologies.

Both the implementation difficulty and the performance problems came from the complexity of defining and coordinating the software agents. Since in practice the customers were only running a few well-defined searches, I specified them explicitly, rather than having them be intelligently derived from the data provided. Since new data sources were rarely added, if ever, I replaced the self-joining network with a simple list. I replaced our home-grown data structures with XML. This in turn let me replace custom display code with XSLT stylesheets.
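
To give a sense of how radically simpler this was, here’s a rough Java sketch of the idea. The names (DataSource, LightweightSearch) and the details are mine for illustration, not the actual code: an explicit list of sources, each answering a named search with an XML fragment, spliced into one results document and rendered through a customer-supplied XSLT stylesheet using the standard javax.xml.transform API.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.List;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Illustrative only: each data source answers a named, explicit search with an XML fragment. */
interface DataSource {
    String name();
    /** Returns an XML fragment of matching documents, e.g. <documents>...</documents>. */
    String search(String queryName, String subjectId);
}

public class LightweightSearch {

    // The "self-discovering network" becomes a plain, configured list of sources.
    private final List<DataSource> sources;
    private final Transformer display;   // XSLT stylesheet supplied per customer

    public LightweightSearch(List<DataSource> sources, StreamSource stylesheet) throws Exception {
        this.sources = sources;
        this.display = TransformerFactory.newInstance().newTransformer(stylesheet);
    }

    /** Run one explicit, well-defined search across every configured source and render the result. */
    public String run(String queryName, String subjectId) throws Exception {
        // 1. Gather: ask each source in turn and splice its fragment into one results document.
        StringBuilder xml = new StringBuilder("<results query=\"" + queryName + "\">");
        for (DataSource source : sources) {
            xml.append("<source name=\"").append(source.name()).append("\">")
               .append(source.search(queryName, subjectId))
               .append("</source>");
        }
        xml.append("</results>");

        // 2. Display: the customer-specific work is now an XSLT transform, not custom Java.
        StringWriter html = new StringWriter();
        display.transform(new StreamSource(new StringReader(xml.toString())),
                          new StreamResult(html));
        return html.toString();
    }
}
```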

As far as management was concerned, this would all add up to reduced customer implementation effort, less time needed to bring new hires up to speed, and less need for highly skilled developers. Since the core application logic was simpler, it not only was easier to train new staff on, but also required less intervention in the first place. A lot of the customer-specific work became straightforward document transforms, rather than custom Java programming.

The customers would mostly benefit from the reduced implementation time and cost, and the improved performance. Without all of the network overhead and complex processing required by the distributed agents, response time dropped from 20-30 seconds to about two.

From the time I hit on the question of “How would our competition do this?” I spent a couple of months kicking around ideas and researching in my spare time. I wasn’t sure that this was going to go anywhere, and the company certainly couldn’t spare my time to work on it. If I’d gone to management and asked for permission to re-write our core software from scratch, they would have quite rightly freaked out. I had to do this on my own dime, wait ’til I had something pretty solid, and pitch it to them very carefully.

I knew I had to have buy-in from the senior technical staff, so I got them involved very early on, sanity-checking the concept and critiquing the design. I came up with diagrams and some concise documentation to flesh it out before writing any code. I worked up performance assessments for key tools we were bringing in.

The Pitch

Even though management was well aware of the concerns driving this new design, I couldn’t just drop it on them. Replacing the software that you’ve built a business on is scary; no way around that. I had to head off as many of those fears as possible. The two big ones were, “How will we tell the customers?” and “What if this doesn’t work in practice?”

The key to both of these was to pitch the new software as a lightweight alternative to the old application, not a replacement. We wouldn’t have to commit to the new system until we were all comfortable with it. The old application would become the “enterprise-grade” version, and the new one would be the “lightweight” version. I worked up a comparison checklist showing all of the features that both applications had, plus the extra features included in the enterprise version, like “dynamic discovery of data sources”. A confidential attachment explained that none of our customers had ever used the “dynamic discovery” feature, or any of the other enterprise features.

Once I had that figured out, and the design was fairly solid, I spent a couple weeks of evenings and weekends working up a demo. I implemented all of the core query and document processing logic. The data sources were mocked up, but had realistic performance characteristics. The displays weren’t sophisticated or polished, but they were clean, simple and effective. It had a limited but representative selection of searches and document types. Minus the parts that would have to be customized for each customer, it was a working alpha release.
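
The mocked-up sources amounted to something along these lines, building on the DataSource sketch above (again, an invented illustration rather than the actual demo code): canned XML results plus a short sleep tuned to mimic the latency of the real backends.

```java
/** Illustrative demo-only source: canned results, realistic latency. */
class MockDataSource implements DataSource {
    private final String name;
    private final String cannedXml;     // e.g. "<documents><doc id='42'>...</doc></documents>"
    private final long latencyMillis;   // tuned to match the real system being imitated

    MockDataSource(String name, String cannedXml, long latencyMillis) {
        this.name = name;
        this.cannedXml = cannedXml;
        this.latencyMillis = latencyMillis;
    }

    @Override public String name() { return name; }

    @Override public String search(String queryName, String subjectId) {
        try {
            Thread.sleep(latencyMillis);   // fake the round-trip to a real data source
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return cannedXml;                  // same shape as a real source's response
    }
}
```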

I demoed it for the senior technical staff, and got their enthusiastic support. Then I went to the management team, and said, “There’s this idea I’ve been kicking around…” I talked them through the motivation behind the re-design, the hiring and training benefits, the positioning as a lightweight option, and the reality of the feature comparisons; I skimmed over the technical details and focused on the business benefits. I showed them the demo and fielded questions. I stoked their hopes and allayed their fears. In the end, I got the go-ahead to use it on our next project.

The Win

It turned out to be everything that I hoped and more. What I hadn’t anticipated was that it would completely change the way we estimated and managed projects. By replacing monolithic Java modules with an XML processing pipeline, I had broken the development effort into a number of discrete stages, which we were then able to budget and track separately. Instead of just guessing “X amount of work per data source,” we were able to break it down into the effort needed for each query, document type, and display, and then build up a much more realistic estimate. It also meant that the project tracking info we gathered was more detailed and useful. We were also able to work on some of the components in parallel, and had a clear understanding of which parts needed to be coordinated across data sources.
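
To illustrate what those discrete stages looked like in practice, here’s a rough sketch (names and structure invented for illustration): each query, document transform, and display is a named, self-contained XML-to-XML step, so it can be budgeted, tracked, and assigned on its own, and independent stages can be built in parallel.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Illustrative only: the pipeline as an ordered sequence of named XML-to-XML steps. */
public class EstimablePipeline {

    // One entry per separately budgeted piece of work.
    private final Map<String, UnaryOperator<String>> stages = new LinkedHashMap<>();

    public EstimablePipeline add(String name, UnaryOperator<String> stage) {
        stages.put(name, stage);   // e.g. "query:person-by-id", "transform:claims-doc", "display:summary"
        return this;
    }

    public String run(String inputXml) {
        String doc = inputXml;
        for (UnaryOperator<String> stage : stages.values()) {
            doc = stage.apply(doc);   // each stage hands a well-formed XML document to the next
        }
        return doc;
    }
}
```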

The training benefits showed themselves early on. With the old system, getting a development environment set up, even for experienced staff on an established project, could take a day or two. To get a developer at one of our sister companies set up with my demo system, I sent him a zip file and half a page of instructions. He got it up and running in well under an hour. He was just a good Java web application developer, someone who had never worked with either version of the app.

The last I heard from someone at the company, the new application was still in use and had been adopted for a number of projects. No new projects had been done with the old software. The new version was faster and more flexible in the ways that mattered to our customers. The punchline here is that the old software was about 60,000 lines of Java; the new version was about 1,500. It has a few tricky bits of code, but it’s not all that technically impressive. What I’m proud of is that I came up with such a simple design that successfully integrated our technical, business, and customer needs, and was able to pitch it to each of the stakeholders in their own terms.