Personal Information Management

July 24th, 2008

Originally posted to LiveJournal on July 4, 2008

Since getting my new MacBook, I’ve been trying to pull off the pure Mac play – sticking with the stock applications, choosing integration over individual features, trying to work with what I’m given. It’s mostly worked out pretty well, but it’s taken a bit of tweaking. At this point, I’m working on a combined Mac/Google solution.

The ingredients are: MacBook at home, Ubuntu Linux at work, iPod, cellphone, old Palm Vx. Of the last three, I figured I should be able to at least ditch the Palm. It’s a workhorse, but it’s old and doesn’t play well with the Mac. I could go for the iPhone 3G in a few days. That would kill off all three, plus my digital camera. Tempting at $199, but the $60 a month service plan is nuts for the amount I’d use it. It would take some sort of radical lifestyle or professional change to justify that. I’m not above throwing money at problems, but that’s just profligate.

Getting all my info off the Palm was pretty straightforward. Jpilot let me save all the calendar and to-do info as .ics files. After a small but annoying bit of tweaking to DOS-format and rename them, they uploaded smoothly into iCal. Similarly for the contacts and Address Book. That was one of the big wins – having all of my addresses, phone numbers, email and IM in the same place. Once that was all done, it all synched seamlessly to my iPod. Nice. The only snag (Apple developers, listen up!) is that the To Do list displays on the iPod in alphabetical order, not by priority. Come on guys; what are you thinking? Notes were also a bit awkward. You have to mount the iPod as a drive and save them as files in a “Notes” folder. No application integration there. Still, it might make sense to start storing To Do lists as Notes. If I work with them regularly, I may get used to them. So that’s awkward, but not a killer.
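For reference, the DOS-format-and-rename step could look something like this in Python. It’s a minimal sketch, not a record of the exact commands used: the function name and the file names are made up for illustration, and it assumes the only fixes needed are CRLF line endings and an .ics extension.

```python
from pathlib import Path


def to_dos_ics(src: Path) -> Path:
    """Write a copy of src with CRLF line endings and an .ics extension."""
    data = src.read_bytes()
    # Normalize any existing CRLF or bare CR to LF first, so nothing gets doubled.
    data = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    # Then convert every LF to the CRLF that DOS-format files use.
    data = data.replace(b"\n", b"\r\n")
    dest = src.with_suffix(".ics")
    dest.write_bytes(data)
    return dest
```

Calling `to_dos_ics(Path("datebook.txt"))` (a hypothetical export file) would leave the original alone and write `datebook.ics` next to it.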

The killer turned out to be that I can only sync the iPod to one machine. The whole philosophy of the Palm is that you can go from home to work and keep your core info in sync. Even on Linux, there was a decent client that I could run either at home or at work. The iPod will really only talk to my home machine. That’s kinda control-freakish. It means that when I’m away from my MacBook, I would need to write down appointments on paper or something, and enter them in when I got back. That’s annoying enough that it got me playing with Google Calendars.

Google Calendars are pretty slick. You can have multiple calendars, set up different access constraints for them, and view them in a single display. My girlfriend and I each have personal calendars that we share with each other, and then I have an “events” calendar that’s open to the public. There’s probably also a middle ground for friends: Stuff that isn’t really private, but I don’t care to have strangers knowing, like who I’m having dinner with tomorrow.

iCal lets you set up read-only calendar “subscriptions”. So I re-exported my calendars, uploaded them to Google, deleted them from iCal, and slurped them back down as subscriptions. From there, they synched to my iPod just like before. I was also able to set up a subscription to our calendaring software at work (Zimbra). That’s gravy. Normally, I only have to worry about meetings and such when I’m in the office and have access to the calendar directly. But it is nice to have them on the iPod with audible alarms and all.

While I was at it, I discovered that Google Calendars has a handy text messaging (SMS) interface. You register your phone with them, and then you just send messages to “GVENT” (48368). Sending “day” gets a response with today’s events. “happy hour at 5pm tomorrow” creates a new event. You can set up events (individually or by default) to send you a text message alert. So if I’m going out without my iPod, I can have my cell phone remind me when it’s time to move on to the next round of fun. I don’t actually see myself using that a lot, but it’s nice to have as backup.

I watched Merlin Mann’s “Inbox Zero” Google Tech Talk video the other weekend, and I’ve gotten all fired up on that. I’d been letting stuff just pile up in my Gmail inbox because I don’t actually get much mail there. Gmail doesn’t use the normal folder setup, but it has Labels, which let you do much the same thing, and a bit more besides. So I got everything tagged and archived in short order. Set up a few basic filters to handle list mail that I only need to check every day or two. They’re not super sophisticated, but they do the trick. An empty inbox is a nice feeling.

I also started playing with Google Docs. I haven’t done spreadsheets or presentations yet, but the documents are nice. They strike a good balance between features and simplicity. I’m a little paranoid about storing all this stuff on a free “beta” service, so it was nice to find that you can easily export them as HTML (and relatively clean HTML at that).

So I think that may be the winning combination. I could also look at the iPod Touch option, which would give me a way to enter data on the go. Right now, the only problem situation on the horizon is DragonCon, where I have to track a whole lot of events, and I’m not planning on lugging my MacBook along. It could be handy if they would publish an ics file with the complete schedule before we got down there. Maybe a separate one for each programming track or room. Hmm…

Java Logging

July 24th, 2008

Spent half the day fighting with Tomcat 5.5, trying to get log messages out of my application. I need to rant a bit here.

I’m deploying a vanilla webapp with its own log4j.xml in WEB-INF/classes. Why does this not just work out of the box? Because, hey, maybe you’d want to use some other logging library. Not that there are any out there that don’t suck, but y’know, maybe someday. So we need to leave this all open-ended and configurable.

You see, once upon a time, we had Log4J. It was good. In fact, it was perfectly sufficient. It solved the problem. It did just about everything you could want.

Unfortunately, Sun was also developing their java.util.logging (JUL) around that time. They went through the whole JSR community process, so they started earlier and finished later. It sucked, but Sun owns Java in a very literal sense, so we were pretty much stuck with it. Not in the sense of actually having to use it, thank god, but any poor bastard writing logging tools had to accommodate it.

So then we got Apache Commons Logging. It’s a diplomatic facade that stands in front of either of them so your code can be agnostic about whether you’re using Log4J or JUL under the hood. In practice, anyone who doesn’t have a paycheck from Sun and a gun to their head uses Log4J. Java Developer’s Journal did a comparison of the two. They tried to give a “fair and balanced” report, as in, “there are situations in which JUL might be the better choice.” If you read between the lines, those situations were essentially if you’re writing a toy application that would probably be fine with System.out.println().

There’s something of the paradox of choice here. Trying to leave all your options open is ultimately less effective. Your code has to be way more complex to support everything, and you almost always have some sort of mismatch that forces you into a limited-functionality common denominator. It’s not worth it. Pick something that works and go with it.

So today’s pain came about because rather than doing that, Tomcat 5.5 seems to have slapped yet another layer of indirection and indecision on things. It seems that one of the ways in which java.util.logging sucks is that you can’t have multiple configurations within an application. Why on earth would you want to do that? Well, if your application is a web application server, it needs to run a bunch of Java libraries as if they were standalone applications. Kinda killer flaw there, huh? But who can pass up the challenge of hacking around something that thoroughly broken? And hey, it’ll make it standards-compliant.

So you end up with these gems from the Tomcat 5.5 logging page:

“the default Tomcat configuration will use java.util.logging.”

And a bit later on:

“The default implementation of java.util.logging provided in the JDK is too limited to be useful.”

How can you say that and still build in support for it? Grow a spine, people! How much simpler would this world be if you just followed that up with, “…so Tomcat uses Log4J exclusively. Don’t like it? Write your own app server.”

But no, what did they do? They hacked together their own implementation of the java.util.logging API. Apparently, it’s not “too limited to be useful,” but it “is not intended to be a fully-featured logging library.” Why bother? You’ve had a fully-featured logging library for the last 6 years. Why would you make the default logging configuration one which by your own admission is really not adequate? Why not make Log4J the default? If you really have to, then you can provide instructions on how to plug in your half-assed substitute.

The Innovative Enterprise

May 20th, 2008

This is a grab-bag of notes and commentary on Harvard Business Review on the Innovative Enterprise, rather than a structured summary or review.

Apollo 13

Everybody brings up Apollo 13. That seems to be the role model for a high-performance, results-oriented team working under high pressure. Could we maybe pick something a little less dramatic? Remember, this is the same crew of people who put the first man on the moon less than a year earlier. They were working at the height of American Big Science. They were educated and trained as if the future of America, Democracy, and the world rested in their hands, which it arguably did. At the very least, three lives, millions of dollars of equipment, and the scientific prestige of the nation hung in the balance. I don’t care how cool your business plan is, or how much VC money you have, you’re not getting that kind of talent and motivation.

Cost of Creativity

There’s a rant in the “Creativity is not Enough” article about how organizations aren’t supposed to be creative. They are anti-creativity, anti-chaos. The whole point of an organization is to constrain and coordinate people’s actions to focus them on a task. Creativity is mostly a distraction.

It’s not that this guy is an idiot and you should just ignore him; they wouldn’t have published him. He was writing in a different time, with a different focus. For large industrial firms, the goal is well understood, and the main effort is in execution. He wouldn’t think of incremental process innovations as Creativity – that’s just doing things better/faster/cheaper. In a service economy, most of the effort is in figuring out what to do. Success is based more on adapting to the customers – understanding and meeting their needs. And the folks in the trenches have more insight into that than the big boss back in HQ.

But the main point that he’s trying to make is still valid – that creativity and change have a cost. The time you spend figuring out how to do things better is time not getting things done. The time you spend on one innovation is time not spent on others. The payoff has to justify that. And trying to do a dozen new things at once probably means that none of them will be well done.

Summary

Now a quick summary of the key points of the book. Since it’s a number of essays written by different people on the same topic, there’s a fair amount of overlap. There’s also a wealth of supporting detail for these principles.

Pressure always hinders creativity. Don’t confuse a surge of relief at dodging a bullet for the true rush of creativity. If you have real and meaningful causes of pressure, make sure they’re communicated. Meaningless or arbitrary pressure is even worse.

Focus helps. Pressure plus task switching equals stress without productivity. If you have a bunch of different things to do, sequence them; don’t multiplex. Meetings with more than one or two others are fragmented, undirected.

To motivate people, pick an enemy and cast yourselves as the underdogs. The enemy can be imaginary, or just a concept.

Innovation should be constant and pervasive. Everyone should always be thinking of how to improve, how to do their work better. Build a portfolio of ongoing experiments at different stages of maturity. A portfolio spreads the risk. You need both short-term/incremental and long-term/disruptive. You need both blue-sky research and radical solutions to known problems. You need to be willing both to take risks and to abandon unsuccessful projects. Allow for failure; make it cheap. Once an idea is proven, go all-out to make it happen.

Innovation needs to focus on competitive advantage. It has to either benefit your customers, expand your market (meet the new needs of your not-yet-customers), or give you an edge over your competition. Be alert for solutions in search of a problem, innovation for the sake of novelty. Measure performance, cost, benefits, risks. You compete on price, performance, or features. You may need to segment your customers so you can focus on the specialized needs of one group. Plan for your competitors’ response to your innovation.

Build networks of innovation through trusted third parties – investors, executive search firms.

Collaborate with your customers. Quick feedback is better than a polished release. Get them involved before the big investment. Get to know them better than they know themselves. Learn what they do, not what they say they do. Give them what they need, not what they ask for. Show them how it will be used – mock it up, tell a story.

Try to make technology invisible to normal users. Make failure transparent – give the user the information they need to solve the problem. Build tools for power users, so they can scratch their own itches and help others. People learn from experience and peers, not so much from formal training.

Each innovation needs four kinds of support: An on-the-ground champion, an executive logistical supporter, creative idea generators, and practical implementors.

Get outsiders’ perspectives. If you want people to think outside the box, hire from outside the box. Get fresh eyes on your business. Varied backgrounds produce better problem-solving skills. Focus on people who are bright, verbal, assertive, and creative. You’ll need enough outsiders to form a critical mass.

Look at your company through the eyes of your competitors. What are your limits, weaknesses and vulnerabilities? Your competitors are doing your market research for you. If they do better, it’s because they’re meeting a customer demand that you aren’t. If they don’t, you know not to try that.

Learn from the potential customers that don’t choose you. What are you failing to provide them?

Project groups focus on goals, not resources. They minimize turf battles. Align the organization’s goals with employees’ intrinsic motivations.

Software Pin Factory

January 19th, 2007

I started reading Adam Smith’s The Wealth of Nations. For fun. Because I’m a big dork. Right at the beginning, he talks about division of labor and pin factories. Pins: little bits of wire with a point on one end and a knob on the other. The point is that a single craftsman, without particular skill or tools, could only make a handful of pins a day. In a pin factory, the process is divided into more than a dozen stages, each simple and specific, with its own tools. This process is literally hundreds of times more productive. Businesses have been trying to pull off this trick ever since.

It worked fantastically well in the industrial age. As we moved into more of a service economy, the principle still applied, though not as strongly. In your typical office, you have specialization of skills: Managers, admin assistants, sales, marketing, and so on. In a one-person company, that one person can cover all those roles. Things will go smoother when the work can be divided up among several people, but not by a factor of hundreds.

As software development has become a bigger and bigger business, the obvious thing has been to try to apply the same assembly line model: analyze, subdivide, specialize. And so we come to Enterprise Software development, which pretty much follows the old waterfall model: Requirements, architecture, design, implementation, test, support. The business analysts don’t write code, the architect doesn’t write test plans, and the programmers don’t field support calls. The business analyst writes a requirements document, the architect writes an architecture document, and so on. Each stage does their piece and throws it over the wall to the next team. The principle is sound: The more narrowly you divide up the skill set, the more specialized each group will be, and the better at their particular task. Push this far enough and eventually you can replace them with trained monkeys. Very clean, very elegant. Very efficient? Umm… no.

The problem is that there’s a significant difference between an Enterprise Software solution and a pin. A pin is a very simple thing. Everyone knows how it works and can tell at a glance if a given pin will do its job. Unless it’s bent, blunt or missing its head, it’s good to go. If an end user finds that one pin is poorly made, they just throw it away and get another.

An Enterprise Software system is not simple or obvious. Aside from the fact that even the customer often doesn’t entirely understand how they want it to work until they start using it, the amount of information that has to be communicated from stage to stage is huge. Requirements documents can be hundreds of pages. None of the workers has any innate sense of whether the information handed to them from the previous stage is complete or accurate. Every misunderstanding is multiplied down the chain. It’s like playing “telephone” with War and Peace.

The key here is that when you divide up the labor in a process, you also incur a communications cost between each stage. It’s a trade-off: Each stage can be done more efficiently, but information about the work has to be passed between them. In the pin factory, the benefit of specialization is huge, and the communications transfer is tiny – as I said, the quality of a pin is obvious. In a software shop, the benefit of specialization may be significant, but the communications overhead is enormous. To come at it from a different angle, I’ve heard it said that a programmer is lucky if they get to spend 20% of their time actually writing code. The other 80% is communications overhead: reading documents, sitting in meetings, writing documents. Add in the time you spend on rework due to miscommunication, and that 20% starts to sound generous. Here, the assembly line is actually less efficient.

So what do you do? The development of “Agile” tools and techniques is a response to this, along with component or service oriented architectures. Break the system up into independent, well-defined pieces, and put them in front of the users as quickly as possible. The catch here is that you can’t use trained monkeys. You need people who can talk to the end users and write code, who can carry the product all the way through the process. Maybe they need help from experts like domain specialists or QA engineers, maybe they can divvy up some of the work to tech writers or programmers, but the key is to capture an end-to-end understanding of the product in as few heads as possible.

Under the Hood

April 22nd, 2005

Imagine that you work in a large and somewhat old-fashioned office building. If you want to send a message to your buddy Joe over at XYZ Corp, this is how it goes. You write out your letter on a piece of paper and put a sticky note on it saying, “Please send to Joe Smith at XYZ Corp,” and hand it to your secretary. She (I said this was an old-fashioned place) puts the letter in an envelope and puts Joe Smith’s name on it. Then she looks up the address for XYZ Corp and writes that on the envelope, along with your return address. Then she hands it off to the guys in the mail room.

What they do is interesting. They look at the address for XYZ Corp and say, “Hmm… that’s out of town. It needs to go to Central.” So they put your envelope inside another envelope and write “Central Post Office” on it.

When it gets to the central post office, they open the outer envelope and read the address on your letter. They say, “Oh, this is going to Chicago,” or wherever. So they put your envelope inside another envelope and write “Central Post Office, Chicago” on it.

Then it gets to the Central Post Office in Chicago. They open up the envelope addressed to them and see the address for XYZ Corp. So they put it in another envelope that just has the 9-digit zip code for the XYZ Corp building on it.

It shows up at XYZ Corp, and the guys in their mail room open up that envelope and see that it’s addressed to Joe Smith. Somebody runs it upstairs to Joe’s secretary, and she opens the envelope and hands Joe your letter. When Joe sends a reply back, it works the same way.

This is how the Internet works.

It’s actually more complicated – there are more middlemen – but that’s fundamentally how it all works. It’s all these little letters (called “packets”) flying around a very, very fast postal system. This is a pretty clear match for email, but it’s also how everything from web pages to streaming video to Voice Over IP works.

When you “go to” a web site, you’re really mailing out a request for a web page. It’s like writing off to a mail-order catalog company. There’s a standard form that defines how you ask for web pages. You fill it out and send it in. You’re sending this little form that says, “I want to see http://www.bluegraybox.com/index.html”. That request goes out through this metaphorical postal system to the bluegraybox.com server, and some little toiling minion there xeroxes off another copy of the index.html document and mails it back to you.
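Here’s roughly what that “standard form” looks like once it’s filled out. This is a sketch in Python that only builds the request text; it doesn’t send anything. The function name is hypothetical, and the particular fields (an HTTP/1.1 request line, a Host header, and a Connection header) are an assumption about a reasonable minimal form, not something spelled out above.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> str:
    """Fill out the 'standard form' for asking for a web page (an HTTP GET)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    lines = [
        f"GET {path} HTTP/1.1",    # which document we want
        f"Host: {parts.netloc}",   # which site we're asking
        "Connection: close",       # one request, then hang up
        "",                        # a blank line marks the end of the form
        "",
    ]
    # HTTP separates lines with CRLF.
    return "\r\n".join(lines)


print(build_get_request("http://www.bluegraybox.com/index.html"))
```

The whole form is plain text, a few lines long. Everything else about getting it there is the postal system’s problem.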

Like I said, it’s more complicated than that. How do you keep email and web pages and FTP sites all running on the same machine without tripping over each other? Imagine your office building has a bunch of different departments in it, but they all share the same mail room. Instead of a billing department and a sales department, you’ve got an email department and a web department and an FTP department. The way it works is that they each have different post office box numbers. Whenever a letter comes into your building, the mailroom guys just have to drop it in the right box. One of the rules of this postal system is that the addresses on letters have to have a box number. Furthermore, these box numbers are standardized, so that box 80 is the normal box number for the web department, box 25 is for email, etc. So when you send off your web page request, you know that it’s a web page request, and wherever it’s going, it should be going to box 80.
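In code, the mailroom is not much more than a lookup table. This is a toy sketch in Python, with made-up names, of sorting incoming letters by box number. Boxes 80 and 25 are the ones mentioned above; box 21 for FTP is the standard number, added here for illustration.

```python
# Box numbers from the text: 80 for the web, 25 for email.
# 21 is the standard box for FTP (not mentioned above).
DEPARTMENTS = {
    80: "web",
    25: "email",
    21: "ftp",
}


def deliver(box_number: int, letter: str) -> str:
    """Mailroom: drop an incoming letter in the right department's box."""
    department = DEPARTMENTS.get(box_number)
    if department is None:
        return f"return to sender: nobody at box {box_number}"
    return f"{department} department gets: {letter}"


print(deliver(80, "GET /index.html"))
print(deliver(9999, "anybody home?"))
```

On a real machine the operating system plays the mailroom, and the “departments” are whatever server programs have claimed those box numbers.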

This is also how you keep your responses straight. Even if you’re the only guy at your company, you could be downloading a couple of mp3s in the background while you’re popping up new browser windows right and left. If all that stuff is landing in the same inbox, you’ll never sort it out. So each time you ask for a web page, you set up a new post office box just for its responses. Your downloads go to boxes 5001 and 5002, and your web pages end up in 5003, 5004, and so on.
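You can watch the operating system hand out these throwaway box numbers. This is a small Python sketch, assuming a machine that lets you open sockets on the loopback address; the function name is made up. Normally the box is assigned automatically when you connect, and asking for port 0 is just the explicit way of saying “give me any free one.” The numbers you get will be whatever the OS picks, not 5001 and 5002.

```python
import socket


def open_reply_box() -> socket.socket:
    """Ask the operating system for a fresh, unused box number (port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 means "pick any free port for me".
    sock.bind(("127.0.0.1", 0))
    return sock


# Open three boxes at once, as if for three simultaneous downloads.
boxes = [open_reply_box() for _ in range(3)]
numbers = [box.getsockname()[1] for box in boxes]
print(numbers)  # three different numbers, chosen by the OS

for box in boxes:
    box.close()
```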

Here’s the next wrinkle. Say you send off a request for some big, fat mp3 file that won’t all fit in one envelope. So it gets broken up into a whole bunch of separate letters. Now on top of that, like the real post office, stuff can get lost: Mail trucks get stolen; some yutz in New Jersey cuts through a long-distance fiber-optic cable with a backhoe. Even if nothing gets lost, there’s no guarantee that everything is going to show up in the order you sent it.

So what do you do? First, you send a letter that effectively says, “Hey, I want to send a whole bunch of letters back and forth with you. Here’s the address you should send all the replies to.” This is where your web browser says, “Connecting to …” in the status bar. Once you’ve got an OK back, you say, “Send me that mp3 file.” You get a whole flood of letters back, numbered “1 of 23”, “2 of 23” and so on. You count through them and realize that you’re missing number 17, so you send another message saying to re-send it. When you finally have all the letters, you can put them in the right order, open them up, and glom the mp3 file together.
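The count-reorder-and-ask-again routine can be sketched in a few lines of Python. This is a toy version of the analogy, with made-up function names and a fake 230-byte “mp3.” Real TCP numbers the bytes rather than the letters and uses acknowledgements instead of “n of 23,” but the idea is the same.

```python
import random


def split_into_letters(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Break data into numbered letters of at most `size` bytes each."""
    return [(n, data[i:i + size])
            for n, i in enumerate(range(0, len(data), size), start=1)]


def missing_numbers(received: dict[int, bytes], total: int) -> list[int]:
    """Which letters out of 1..total haven't shown up yet?"""
    return [n for n in range(1, total + 1) if n not in received]


def reassemble(received: dict[int, bytes], total: int) -> bytes:
    """Put the letters back in order and glue the contents together."""
    return b"".join(received[n] for n in range(1, total + 1))


original = bytes(range(230))                # stand-in for the mp3 file
letters = split_into_letters(original, 10)  # 23 letters of 10 bytes each
total = len(letters)

# The postal system shuffles the letters and loses number 17.
in_transit = [letter for letter in letters if letter[0] != 17]
random.shuffle(in_transit)
received = dict(in_transit)

# Notice what's missing and ask the sender for it again.
sender_copy = dict(letters)
for n in missing_numbers(received, total):
    received[n] = sender_copy[n]            # the "re-send"

assert reassemble(received, total) == original
```

Because every letter carries its own number, the order they arrive in doesn’t matter; you only need to know the total and keep asking until nothing is missing.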

This little procedure we’re going through here is called a Protocol. It’s not part of the postal system itself, but it’s a set of rules that people have agreed on for how to use the postal system. In this case, it’s a way of communicating reliably through a system that isn’t reliable. It’s a layer of communications on top of a layer of communications. It’s very meta.

The internet is built up of layers of these protocols. Like the envelopes inside envelopes, at each stage, you’re only concerned with the outermost layer. You slip on or peel off your envelope, and everything else is just the stuff inside it, be it one layer or many. IP, the Internet Protocol, is the postal system – simple, but not entirely reliable. TCP, the Transmission Control Protocol, provides the reliable, ordered delivery on top of IP. Email (SMTP), the web (HTTP), and others are actually another layer on top of TCP. Essentially, they all define standard forms for different mail-order requests.
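The envelopes-inside-envelopes idea is easy to mock up. This Python sketch uses made-up helper names and plain dictionaries as envelopes; the hostname stands in for the numeric address a real IP packet would carry, and real headers hold a lot more than a “to” line.

```python
def wrap(layer, address, contents):
    """Slip the contents into a new envelope for the given layer."""
    return {"layer": layer, "to": address, "contents": contents}


def unwrap(envelope):
    """Peel off the outermost envelope and return what was inside."""
    return envelope["contents"]


# Build from the inside out: the HTTP form goes in a TCP envelope,
# which goes in an IP envelope.
request = "GET /index.html"
tcp = wrap("TCP", "box 80", request)
ip = wrap("IP", "www.bluegraybox.com", tcp)

# Each stage only looks at the outermost layer.
assert ip["layer"] == "IP"
assert unwrap(ip)["to"] == "box 80"
assert unwrap(unwrap(ip)) == request
```

Nothing that handles the IP envelope needs to know there’s a web page request two layers down, which is what lets the same postal system carry email, web pages, and everything else.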

Again, it’s even more complicated than that. But for now, lots and lots of little letters zipping back and forth across the world. That’s the way to think of it.