The other day I met with some ex-colleagues, one of whom pulled out a copy of ‘The Toyota Way’. I couldn’t resist asking a question which has been on my mind for some time, but that nobody seems to ask:

Does the entire Toyota Corporation (around 316,000 employees worldwide, according to Wikipedia) operate on lean and kanban principles, and if not, why not?

Unsurprisingly, we didn’t really come up with an answer, and after discussing the book, plus (at length) kanban and lean, we eventually changed the subject. BTW, I would be prepared to bet a substantial amount of money that Toyota doesn’t entirely operate on lean and kanban principles; but if that is the case, what are we to make of it?

I have always been curious, frankly, as to why the Toyota production process has been singled out for such interest within the IT profession. It’s obviously a good process – but, equally, there is little point in having a great production process if what comes out the other end isn’t wanted by the customers. So there is obviously much more going on inside Toyota that is worthy of scrutiny. Product development and design would be interesting areas to look at, and I would suggest there is some commonality with bespoke IT projects in this area.

So I wonder why these areas aren’t looked at?

This leads me to wonder whether looking at production processes is actually the best fit for IT. Production processes are, very roughly, based around repetition and optimisation, on the basis that the more you do the same thing (BTW this is quite different to a similar thing) the better you will get at it. Once you get very good, you can start taking an interest in the wider issues and have a say in how the work itself is organised. This is where issues around reducing waste and the like come in.

Don’t get me wrong: The Toyota Way is an interesting book and, like many others of its kind, makes very interesting observations and has good ideas – but it doesn’t really get to the core of some of the specific issues around bespoke development projects, or how to deal with the ‘unusual’. Maybe it’s not intended to, of course. A while back I set the cat amongst the pigeons in a meeting by asking if anybody actually had any knowledge of how Toyota run their IT operation. (It would be rather ironic if it turned out they run their development projects on waterfall, wouldn’t it…!)

So let me make another recommendation – a book I bought about 8 or 9 years ago called The Challenger Launch Decision. This talks, in considerable depth, about the Challenger space shuttle disaster of 1986. Now this might sound like a rather dramatic example to cite, but actually it covers many issues that will be familiar to people engaged in bespoke software development. Furthermore, it transpires that the reasons for this particular failure are rather less complicated than you might think.

I won’t spend too much time going over the disaster itself, as you probably know it well. Primarily, the disaster is remembered as a technical failure. The fault lay in the rubberlike O-rings. The primary O-ring and its backup, the secondary O-ring, were designed to seal a tiny gap created by pressure at ignition in the joints of the Solid Rocket Boosters. However, O-ring resiliency was impaired by the unprecedented cold temperature that prevailed on the morning of the launch. Upon ignition, hot propellant gases impinged on the O-rings, causing them to fail and creating a flame that penetrated the aft field joint of the right Solid Rocket Booster, then the External Tank containing liquid hydrogen and oxygen. The result was a catastrophic explosion.

Technology was not the only culprit, however. The NASA organisation was implicated. The Presidential Commission created to investigate the disaster revealed that the O-ring problem had a well-documented history at the agency. The first documentation on it appeared in 1977 – nearly 4 years before the first shuttle flights. Furthermore, the commission learned of a midnight-hour telephone conference on the night before the launch between NASA and Morton Thiokol in Utah (the contractor responsible for the Solid Rocket Boosters). Worried Thiokol engineers argued against launching on the grounds that the O-rings were a threat to flight safety. NASA managers decided to proceed.

The book explains the sociology of the disaster and the decision making around it. It shows how mistake, mishap and disaster are socially organised and systematically produced by social structures. No extraordinary actions by individuals explain what happened: there was no intentional managerial wrongdoing, no rule violations, no conspiracy. The cause of the disaster was a mistake embedded in (according to the book) the banality of organisational life and facilitated by an environment of scarcity and competition, elite bargaining, uncertain technology, incrementalism (interesting), patterns of information, routinization (interesting), organisational structures, and a complex culture. Among the issues the book deals with is scarce resources and compromised excellence: NASA’s environment of institutional consensus had changed to one requiring political allegiances to secure funding.

I point out the words incrementalism and routinization purposely because they will obviously strike a chord with the agile movement. In the Challenger case, these were cited as problems and causes of the disaster itself. I draw no conclusions from this, but it’s interesting. The ‘routinization’ point is especially interesting. The book describes a mindset called ‘production culture’ where – to cut a very long story short – people become so used to doing the same thing over and over that they start to become blind to its risks and dangers. This, in conjunction with another mindset called ‘normalisation of deviance’, means people get complacent. They also start to bend the rules to – again to cut a very long story short – accommodate what they ‘got away with last time’.

I will return to some of these topics later.
