I say ‘revisited’ but it never really went away. It’s always a challenge, and recently my team have re-opened the debate on how we do it. I have written previously about the issues so won’t restate them; suffice to say that in recent years we have tried various methods: Excel macros that arrive at a figure based on various criteria, to which confidence and risk levels are then applied; T-shirt sizes with notional figures; T-shirt sizes with bandings. We have tried ‘estimation pairs’ – a group who look at all estimates. We have considered identifying a single person who does all the estimates. The list goes on.

But it all comes back to the same issue. Either the final cost turns out to be nothing like we originally predicted, or you spend ages on an estimate that the customer baulks at, and the work doesn’t go ahead anyway.

Our specific situation isn’t helped by the fact that we have an internal recharging model, meaning we book time to work which the customer is then billed for. That means that if something goes awry, they will see the pain instantly. Some would say that is a good thing, and in a sense it is, but I can’t help thinking we are still missing some kind of ‘trick’ in how we deal with it. Even if we didn’t have a recharge model, we would still have to log and account for our time, so the issues of estimating would still be there. I’ll add that we use Kanban to help manage and schedule our work. This is significant, as I’ll explain in a moment.

We have started by revisiting our ‘T-shirt’ sizes, and below is a table of what we have come up with. This was created by taking a fresh look at the tasks we need to perform and simply applying a notional figure based on experience.

| | XS | S | SM | M |
|---|---|---|---|---|
| Initial analysis, up to and including Estimation Pair | 2 hours | 2 hours | 2 hours | 2 hours |
| | Approval | Approval | Approval | Approval |
| Analysis and Acceptance Criteria (to Dev Ready) | 2 hours | 4 hours | 6 hours | 10 hours |
| Development (to QA Ready) | 4 hours | 7 hours | 14 hours | 35 hours |
| Dev Pair (to QA Ready) | 2 hours | 3.5 hours | 7 hours | 17.5 hours |
| QA (to UAT Deployment Ready) | 1 hour | 2 hours | 3 hours | 3 hours |
| UAT activities | 1 hour | 2 hours | 2 hours | 2 hours |
| Deployment (to UAT, Release plans, CAB, Releases etc) | 1 hour | 1 hour | 1 hour | 1 hour |

 
Although these notional figures have actually gone up since the last time we created them, they seem reasonable to me.
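To see what each size adds up to overall, the rows of the table can simply be summed. The short Python sketch below does that. It rests on two assumptions of mine that the table itself does not state: that the phases are additive (so Development and Dev Pair hours are both counted), and that the Approval row is a gate with no hours attached. The names `SIZES`, `PHASE_HOURS` and `total_hours` are purely illustrative.

```python
# Notional hours per phase for each T-shirt size, copied from the table.
# Assumption: the "Approval" row is a gate with no hours, so it is omitted.
SIZES = ["XS", "S", "SM", "M"]

PHASE_HOURS = {
    "Initial analysis (incl. Estimation Pair)": [2, 2, 2, 2],
    "Analysis and Acceptance Criteria": [2, 4, 6, 10],
    "Development": [4, 7, 14, 35],
    "Dev Pair": [2, 3.5, 7, 17.5],
    "QA": [1, 2, 3, 3],
    "UAT activities": [1, 2, 2, 2],
    "Deployment": [1, 1, 1, 1],
}


def total_hours(size: str) -> float:
    """Sum every phase for one size (assumes the phases are additive)."""
    idx = SIZES.index(size)
    return float(sum(hours[idx] for hours in PHASE_HOURS.values()))


if __name__ == "__main__":
    for size in SIZES:
        print(f"{size}: {total_hours(size)} hours")
```

On those assumptions an XS comes to 13 hours, an S to 21.5, an SM to 35 and an M to 70.5. It is also worth noticing that only a few hours of each total are fixed overhead that does not change with size.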

It all looks good so far.

So now let’s look at where I believe the problems come in…

1. There is no notion of economies of scale

It is a reality of life that if someone is doing a series of very similar things together, productivity will increase. If you are decorating one room of your house, you might as well do three at the same time. The whole exercise speeds up and you have less disruption in the long run. The same principle can be applied to estimating and managing tasks: if we group very similar tasks together, the effort should come down. So, if we had 5 ‘extra smalls’ (XS) and we decided to do them together, the development effort wouldn’t be 4 hrs x 5 but would be reduced by some factor. It’s slightly like a supplier giving you a bulk discount (but only if it is processed as one order, or in our case, only if we keep the work together).

Note: we need some degree of caution before we get carried away with this. It depends on how much knowledge the people involved possess, whether we can keep them allocated to the work, and how similar the work itself is.
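To make the ‘bulk discount’ idea concrete, here is one possible way of expressing it in Python. This is only a sketch and not a formula we actually use: the function name `batched_dev_hours`, the 20% `discount` and the `similarity` weighting are all placeholders I have made up for illustration. The first item is charged at full price, and each further item is reduced by the discount, scaled by how similar the work really is (0 means unrelated work, 1 means near-identical work).

```python
def batched_dev_hours(unit_hours: float, count: int,
                      discount: float = 0.2, similarity: float = 1.0) -> float:
    """Development hours for `count` similar items done together.

    Hypothetical model: the first item costs the full `unit_hours`; every
    further item costs unit_hours * (1 - discount * similarity).
    Both `discount` and `similarity` are illustrative values in [0, 1].
    """
    if count < 0:
        raise ValueError("count must not be negative")
    if not (0.0 <= discount <= 1.0 and 0.0 <= similarity <= 1.0):
        raise ValueError("discount and similarity must be between 0 and 1")
    if count == 0:
        return 0.0
    follow_on_hours = unit_hours * (1.0 - discount * similarity)
    return unit_hours + (count - 1) * follow_on_hours


if __name__ == "__main__":
    # Five XS items at 4 development hours each.
    print(f"Separately:        {5 * 4:.1f} hours")
    print(f"Batched (similar): {batched_dev_hours(4, 5):.1f} hours")
    print(f"Batched (mixed):   {batched_dev_hours(4, 5, similarity=0.5):.1f} hours")
```

With these made-up numbers, five XS items done together come out at roughly 16.8 development hours instead of 20. Setting `similarity` to 0 brings the answer back to the full 20 hours, which is the caution in the note above: the discount only holds if the work really is alike and stays together.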

The problem is that Kanban has no concept of this. Every card or story is treated as an atomic, ‘independently schedulable’ piece of work, and therefore the notion of economy of scale is alien. Personally, I think this is a flaw in the Kanban technique itself. We can of course group things together ourselves, but that sometimes creates its own problems in making sure things are kept together as they move through the process and are treated as a unit. When we’ve tried this before, it has sometimes gone quite badly wrong.

It might well be that an ‘extra small’ (XS) piece of work is actually deemed uneconomic to do in isolation. It only makes sense if it is progressed with other pieces of work and developed as part of something bigger. Again, Kanban doesn’t address this.

Scrum, of course, does attempt to deal with these issues by grouping similar work into the same sprint.

One option we are exploring is a final ‘re-estimation’ step immediately before the work goes into ‘Dev Ready’. At that point we would review the estimate and perhaps revise it downwards, as I’ve suggested above. This revision would have to be done according to a formula (which I haven’t quite worked out myself yet). The judgement would be based on the entirety of what we know by then, including how the work will be scheduled and resourced.

2. A system is greater than the sum of its parts

This is a related issue, and it also arises from the danger of treating each story as an independent entity: we can lose sight of the bigger picture. You need a coherent whole to refer back to. Again, I don’t think Kanban, or perhaps more accurately Kanban as it is applied to software development, appreciates this. In manufacturing you wouldn’t have pieces of bespoke work going through the same production line as everything else. You would also have people on the line who are fully conversant with all aspects of one particular thing. This is quite rare in software development, especially in business systems development, where there are large amounts of variation. This also affects the estimates, since they are tied to how much difference and variation you have in the work itself. The fact that we are constantly struggling to make different work ‘fit’ the system, and are always revising ‘WIP’ limits as a consequence, is a symptom of this.
