Archive for July, 2010


We hear a lot about metrics and their importance. Often they are presented as if it’s the end of civilisation as we know it if we don’t have them. I agree with much of the underlying sentiment – that we need to demonstrate how we are doing, where we are improving, how long things take us, and so on. My issue is more with how we go about producing them.
 
In my department, time logging has always been a nightmare. I won’t bore you with the details, but it drives how we charge the customer and is pretty elaborate. Let’s leave it at that.
 
I once had a manager who went even further: in addition to doing our time logging (which was – ideally – daily), he wanted us to enter an estimate of how much effort was left on each task and a target date for completion. Someone had devised a ‘radar’ system that, based on this data, would work out when resources would become available and new work could be started, and all manner of statistics and metrics would be available. The group pointed out that this could mean us all spending up to an hour a day logging time – and where would we log that time to? So the idea gradually bit the dust.
 
The point, though, is that if you are going to produce meaningful metrics – or any information, come to that – people must be prepared to put meaningful data in, and we must look at how we interrogate it afterwards.
 
So, metrics aren’t free. If we’re not prepared to do these things, then it simply demonstrates that the metrics aren’t that important after all and we can live without them.
 
Interestingly, I have recently become involved in looking at how we can improve our time logging and metrics. We use Kanban to organise ourselves, and in the past have tried various methods of producing them: Excel sheets with macros, various Lean software tools, add-ons to other tools, you name it. The problem is always the same: keeping the tool in sync with the board. I am trying a slightly fresh approach, though. If we have to do time logging – and there is no likelihood of it going away any time soon – let’s adapt it, structure the data better, and get our metrics from that. It might not be perfect, but at least we are getting some value out of something we have to do anyway.

 

It will need a bit of development, time and effort; so let’s see if it happens and, more to the point, whether we really value the metrics…

I say ‘revisited’, but it never really went away. It’s always a challenge, and recently my team have re-opened the debate on how we do it. I have written previously about the issues, so I won’t restate them; suffice to say that in recent years we have tried various methods: Excel macros that arrive at a figure based on various criteria, to which confidence and risk levels are applied; T-shirt sizes with notional figures; T-shirt sizes with bandings; ‘estimation pairs’ – a group who look at all estimates; a single person who does all the estimates; the list goes on.

But it all comes back to the same issue: either the final cost turns out to be nothing like we originally predicted, or you spend ages on an estimate that the customer baulks and complains at, and the work doesn’t go ahead anyway.

Our specific situation isn’t helped by the fact that we have an internal recharging model, meaning we book time to work which the customer is then billed for. That means that if something goes awry, they will see the pain instantly. Some would say that is a good thing, and in a sense it is, but I can’t help thinking we are still missing some kind of ‘trick’ in how we deal with it. Even if we didn’t have a recharge model, we would still have to log and account for our time, so the issues of estimating would still be there. I’ll add that we use Kanban to help manage and schedule our work. This is significant, as I’ll explain in a moment.

We have started by revisiting our T-shirt sizes, and below is a table of what we have come up with. It was created by taking a fresh look at the tasks we need to perform and simply applying a notional figure based on experience.

| | XS | S | SM | M |
| --- | --- | --- | --- | --- |
| Initial analysis, up to and including Estimation Pair | 2 hours | 2 hours | 2 hours | 2 hours |
| Approval | Approval | Approval | Approval | Approval |
| Analysis and Acceptance Criteria (to Dev Ready) | 2 hours | 4 hours | 6 hours | 10 hours |
| Development (to QA Ready) | 4 hours | 7 hours | 14 hours | 35 hours |
| Dev Pair (to QA Ready) | 2 hours | 3.5 hours | 7 hours | 17.5 hours |
| QA (to UAT Deployment Ready) | 1 hour | 2 hours | 3 hours | 3 hours |
| UAT activities | 1 hour | 2 hours | 2 hours | 2 hours |
| Deployment (to UAT, Release plans, CAB, Releases etc) | 1 hour | 1 hour | 1 hour | 1 hour |

 
Personally, although these notional figures have actually gone up since the last time we created them, they seem reasonable to me.
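To sanity-check the table, the per-size figures can be totted up in a few lines of code. This is just my own sketch of the table as data – the stage names are abbreviations of the rows above, and the ‘Approval’ gate is left out since it carries no hours:

```python
# The T-shirt estimates from the table above, as (XS, S, SM, M) hours per stage.
# 'Approval' is omitted: it is a gate, not a timed activity.
ESTIMATES = {
    "Initial analysis":  (2, 2,   2,  2),
    "Analysis & AC":     (2, 4,   6,  10),
    "Development":       (4, 7,   14, 35),
    "Dev pair":          (2, 3.5, 7,  17.5),
    "QA":                (1, 2,   3,  3),
    "UAT activities":    (1, 2,   2,  2),
    "Deployment":        (1, 1,   1,  1),
}

SIZES = ("XS", "S", "SM", "M")

def total_hours(size: str) -> float:
    """Sum the stage estimates for one T-shirt size."""
    col = SIZES.index(size)
    return sum(row[col] for row in ESTIMATES.values())

for size in SIZES:
    print(size, total_hours(size))
```

So an XS is 13 hours end to end, and an M is 70.5 – a useful reminder of how steeply the sizes diverge once development dominates.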

It all looks good so far.

So now let’s look at where I believe the problems come in…

1. There is no notion of economies of scale

It is a reality of life that if someone does a series of very similar things together, productivity increases. If you are decorating one room of your house, you might as well do three at the same time: the whole exercise speeds up and you have less disruption in the long run. The same principle applies to estimating and managing tasks. If we group very similar tasks together, the effort should come down: so if we had five ‘extra smalls’ (XS) and decided to do them together, the development effort isn’t 4 hours × 5 but is reduced by some factor. It’s a little like a supplier giving you a bulk discount (but only if it is processed as one order – or, in our case, if we keep the work together).

Note: we need some caution before we get carried away with this. It depends on how much knowledge the people involved possess, whether we can keep them allocated to the work, and how similar the work itself is.

The problem is that Kanban has no concept of this. Every card or story is treated as an atomic, ‘independently schedulable’ piece of work, so the notion of economy of scale is alien. Personally I think this is a flaw in the Kanban technique itself. We can of course group things together ourselves, but that sometimes creates its own problems in making sure things are kept together and treated as a unit as they move through the process. When we’ve tried this before, it has gone quite badly wrong.

It might well be that an ‘extra small’ (XS) piece of work is actually deemed uneconomic to do in isolation. It only makes sense if it is progressed with other pieces of work and developed as part of something bigger. Again, Kanban doesn’t address this.

Scrum, of course, does attempt to deal with these issues by grouping similar work into the same sprint.

One option we are exploring is a final ‘re-estimation’ step immediately before the work goes into ‘Dev Ready’. At that point we review the estimate – and perhaps revise it downwards, as suggested above. This revision would have to follow a formula (which I haven’t quite worked out yet), with the judgement based on the entirety of what we now know, including how the work will be scheduled and resourced.
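As a starting point for that formula, here is one hypothetical shape it could take: charge the first card in a batch of similar work at full cost, and discount the development effort of each subsequent card by a flat factor. The 25% figure is invented purely for illustration – finding the real factor is exactly the open question:

```python
# Hypothetical batch re-estimation sketch: first card at full cost,
# each further card in the batch discounted by a flat factor.
# The default 0.25 discount is an assumption, not a measured value.
def batched_effort(single_estimate: float, count: int, discount: float = 0.25) -> float:
    if count < 1:
        return 0.0
    # One full-price card, plus (count - 1) discounted cards.
    return single_estimate + (count - 1) * single_estimate * (1 - discount)

# Five XS developments at 4 hours each: 4 + 4 * 4 * 0.75 = 16 hours, not 20.
print(batched_effort(4, 5))
```

Even a crude rule like this would at least make the ‘bulk discount’ explicit and reviewable, rather than leaving it to gut feel at the board.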

2. A system is greater than the sum of its parts

This is a related issue, also arising from the danger of treating each story as an independent entity: we can lose sight of the bigger picture. You need a coherent whole to refer back to. Again, I don’t think Kanban – or perhaps more accurately, Kanban as it is applied to software development – appreciates this. In manufacturing you wouldn’t put pieces of bespoke work through the same production line as everything else, and you have people on the line who are fully conversant with all aspects of one particular thing. This is quite rare in software development – especially business systems development, where there are large amounts of variation. This also impacts the estimates, since their accuracy is tied to how much variation you have in the work itself. The fact that we are constantly struggling to make different work ‘fit’ the system, and are always revising WIP limits as a consequence, is evidence of this.

I wrote a while back about ‘value for money’ and how, in reality, the concept doesn’t really exist – or at least only exists in people’s minds. It’s rather like a ‘good lasagna’: subjective.

A breathtaking example of how fruitless the idea of ‘value for money’ is, recently emerged here.

I rest my case.