Agile Estimation Primer

Traditional Estimation

In traditional estimation models, a great deal of effort is expended at the beginning of the project to determine how long it will take to deliver everything in the project plan.  These estimates are generally expressed in hours, and thus can be multiplied by a blended hourly rate to provide a cost estimate for the project.

There is a general feeling among project managers that, due to the rigor that went into the estimation process, these are ‘good’ or ‘accurate’ estimates.  It is not uncommon to see a high level of ‘precision’ in the estimate.  Even at a very high level of estimation, there is an attempt to create a relatively small ring of uncertainty around the estimate (see Figure 1).  Ironically, any illusion of precision is erased almost immediately as those estimates are padded with contingent time to account for any problems with the accuracy.  Another trick is to utilize future change requests as opportunities to ‘correct’ the highly precise, yet potentially inaccurate estimates.  The longer the project, the more opportunities there are to adjust.

Figure 1 – Accuracy vs. Precision

Unfortunately, it’s just as likely that on a longer project, there will be some work that is designed and estimated that will not be completed due to changing project needs.  There is little recourse to protect against this outcome. 

There are some recognized shortcomings in traditional estimation.  Most prominent among them is the fact that traditional estimates are not portable – an estimate in hours made on behalf of one person is very likely to change if the work is performed by another person.  Considering the amount of time we spend creating traditional estimates and the longer timeframe of these efforts, there is a fair amount of risk that the team makeup will change before the feature is built, further reducing the effective accuracy of these estimates.  In other words, we spent a lot of time making a very precise estimate that is wrong.  This is wasteful.

Agile Estimation

Agile development methods and models all evolved from a common origin in Lean manufacturing techniques.

One of the driving tenets of Lean is the relentless removal of waste from a system.  If traditional estimation models lead to waste, then Lean demands we find a way to reduce, if not outright eliminate that inefficiency.

One way to avoid waste in estimation is to reduce the amount of time spent making the estimate. As noted earlier, there are two factors when it comes to judging an estimate:  Accuracy and Precision.  Our first duty is to get the estimate into the right ballpark – to have it be accurate-ish.  Our second duty is to reduce the uncertainty – to make it more precise.  There is a cost to higher levels of precision.  The longer you spend working on an estimate, the more you can learn about what is needed and the higher the degree of precision – however, you will eventually reach a point of diminishing return.  Yes, spending time on the estimate will tighten the circle, but the degree of improvement will be less and less with each unit of time spent learning.  In effect, the cost of gaining precision will begin to exceed the cost of the variance in the estimate.

If we gather an estimate for something that we will never implement, the time spent gathering that estimate is by definition wasted.  We should try to reduce waste as much as possible.  The farther out on the planning horizon a deliverable is, the more likely something else will get in the way before we get to it.  As time passes it becomes less likely that something else will get in the way, and therefore more likely that the item will be implemented.  It stands to reason that estimates on things far out on the planning horizon should be as fast as possible, then as we get closer to actually implementing that item, we spend a little more time to get a more precise value.  We can repeat this procedure as it grows nearer, incrementally spending a little more time with the feature, adding detail to our understanding, and precision to our estimate. 

There are multiple levels of estimation, and each level provides an increasing level of precision in the estimate.  In the model being proposed, the three levels are shown in Figure 2.

Figure 2 – Agile Estimation Levels

For the first two levels of estimation (High Level and Size), we use a concept called ‘relative’ sizing.  Relative sizing is a practice that allows changes in understanding to be reflected in estimates without re-estimation being required.  ‘How is that possible’, you may ask? 

Relative Sizing

In general, people are not very good at judging the exact size of a thing just by observation, but they are very good at judging between the relative sizes of two things.  You’ve been doing this since you were young – especially if you have siblings.  Children spend a lot of time evaluating their place in the world relative to others.  If there was a choice between two pieces of chocolate cake, you immediately knew which was the bigger one, or whether the difference in size was so small as to be insignificant.  Relative sizing of features leverages that same innate skill.  You may not be able to tell me the exact size of a thing just by looking at it, but you sure can look at two things and identify whether one is bigger, smaller, or about the same as the other.

That same skill can be leveraged to gauge the relative magnitude of two things. If I give you a choice of two objects where one is clearly larger than the other, you won’t need a measuring tape to tell me that one of them is bigger than the other.  You could probably even tell me if one was more than twice as big as the other!

To illustrate why we use this relative sizing instead of absolute, consider this example:

Let’s say we’re estimating the size of an object…something innocuous, like a tennis ball.  I do all my measuring in inches, you do all your measurements in feet, and my offshore team does all their measuring in centimeters. Now, I might say it’s a “2.7”, you call it “a little under 1/4”, and the offshore team calls it a “7”, and as long as we’re only working in our own world, and have no need of passing work from one team to another, we’d be fine.  What happens when I ask the offshore team to take on this Size 2.7 item?  There is a real risk they could just say ‘yes’ without realizing our scales are different, and then be surprised to learn that this thing is more than double the effort of the 3 they thought they were getting.

To get around that problem, we need a different unit of measure, and in a shared workspace, we want that to be a common unit of measure as well (i.e., everyone agrees to use it).

When we compare a golf ball to a tennis ball, we could easily say the tennis ball is bigger than the golf ball, and if pressed, we could take it a step further and say it is twice as big as the golf ball.  If I have a container that holds a dozen golf balls, then based on our thumbnail comparison, I could probably surmise the container will hold half as many tennis balls – or six. 

Now here’s the interesting part.  If YOU have a container that holds ten tennis balls, do we need to re-estimate the ball sizes or can we infer that your container will hold close to twenty golf balls? Further, if my offshore team has a duffel bag that holds 50 tennis balls, I don’t need to convert the interior dimensions of their bag from cubic cm to cubic feet or inches to be able to estimate how many golf balls will fit in there.  We already have established that tennis balls take up twice as much space as golf balls. As long as we have an agreed upon frame of reference, this is pretty simple.  And it’s simple because the balls didn’t change size.  It is the capability of each container to carry the balls that is different!

Let’s take a moment to define a couple of terms. We’ll refer to the ‘size’ we estimated our balls in as a “Ball Point”. The ball point has nothing to do with the exact dimension of the balls. It is a new unit of measure. A golf ball is 1 ball point, and a tennis ball is 2 ball points, because it is twice as big as a golf ball. The number of ball points we believe will fit in our container is the container’s “Capacity”. Because a tennis ball is 2 ball points, and my container holds 6 tennis balls, the capacity of my container is 12. Keep in mind that you will always encounter people who believe your container can hold more balls, and they’ll encourage you to put more and more into your container. This can result in an unstable situation where balls may fall out of the container. The number of ball points that actually get delivered without being dropped or damaged is the “Velocity”. Note the distinction: Capacity is theoretical and forward-looking (Plan-based); Velocity is based on practical observation of results (Outcome-based).
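
A minimal sketch in Python may make these three terms more concrete.  The names used here (`BALL_POINTS`, `capacity_in_points`, `balls_that_fit`, `velocity`) are made up for illustration, and the "one ball fell out" outcome is a hypothetical; the ball sizes and container counts are the ones from the example above.

```python
# Relative sizes from the text: a golf ball is the reference (1 ball point)
# and a tennis ball is judged to be twice as big (2 ball points).
BALL_POINTS = {"golf": 1, "tennis": 2}


def capacity_in_points(ball: str, count: int) -> int:
    """Capacity of a container, given how many balls of one kind it holds."""
    return BALL_POINTS[ball] * count


def balls_that_fit(capacity: int, ball: str) -> int:
    """How many balls of a given kind a container of that capacity can take."""
    return capacity // BALL_POINTS[ball]


def velocity(delivered: list[str]) -> int:
    """Ball points actually delivered, i.e. not dropped or damaged."""
    return sum(BALL_POINTS[ball] for ball in delivered)


my_capacity = capacity_in_points("tennis", 6)         # my container
your_capacity = capacity_in_points("tennis", 10)      # your container
offshore_capacity = capacity_in_points("tennis", 50)  # the duffel bag

print(my_capacity)                                  # 12
print(balls_that_fit(your_capacity, "golf"))        # 20
print(balls_that_fit(offshore_capacity, "golf"))    # 100

# Hypothetical outcome: 6 tennis balls were planned, but one fell out.
print(velocity(["tennis"] * 5))                     # 10
```

Notice that the ball sizes are defined once and never touched again.  Only the containers differ, which is the whole point of the exercise.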

Relative sizing helps get us past a lot of hand-wringing about specific differences between two items.  What if you and I are talking, and I make the bold statement that I think a golf ball is 1 inch in diameter.  By my rough, relative sizing, that would mean a tennis ball (observably twice its size) must be around 2 inches in diameter.  You could counter that you think a tennis ball is closer to 3 inches in diameter; therefore a golf ball is closer to 1.5 inches.  Do we gain anything from this discussion?  We’re still basing our estimates on observation alone.  Unless something changes, more discussion isn’t going to improve our estimates.  And the fact that we made that estimate doesn’t change the capacity of our respective containers.

Let’s consider this:  As certain as we are in our estimates, we’re both objectively wrong, or at least objectively inaccurate (the Golf ball is actually 1.68 inches, and the Tennis ball is 2.7 inches).  But if you’ll excuse the pun, we’re in the ballpark!  Remember, we are providing an estimate, not an actual.  We don’t need to know the exact measurement of either ball to do estimation with relative sizes. We know from observation that my container holds six tennis balls, yours holds ten, and the offshore team’s bag holds fifty. The (really important) point here is that none of us needs to know the exact measurement of the tennis ball!

If we assume a margin of error to our estimate, then things get even better.  Instead of saying that we’re estimating an exact size, we are estimating a value that falls in a range… maybe +/- 25%.

The first two balls give us a decent starting point, but it’s hardly the full universe of balls.  Let’s put a bunch of balls out on the table, and group them by those that are close to each other in size.  If we did that, we’d probably group the Squash, Ping-Pong and Golf balls together.  We’d group the Tennis, Cricket, and Baseballs together.  Croquet, Softball and Bocce are together.  The volley, bowling and soccer balls are close enough to each other to form a group.  The Basketball is largest overall, but both the football and rugby ball are longer.  One could argue the football is more complicated, so we can use its oddness to justify going with the longer dimension as our estimate.

Now for the magic:  We can very quickly decide that the smallest group of balls will be “1’s”, and that the second grouping is roughly twice that size, so they’re “2’s”.  The next group are “3’s”, and the next group are “5’s”.  The last group is “8’s”.  By strict observation, these are reasonable groupings.  There’s also an odd, in-between size.  Racquetballs (and Billiard balls) are firmly between the Golf and Tennis groups.  For something on the line, we’ll err on the side of caution and promote these items to the higher group, in this case making them “2’s”.

Table 1 – Ball Points

Calibration

Let’s think back to our example with everyone using different units of measurement.  Nobody was really wrong; they just spoke different languages when it came to estimates.  What if we took all three of the individuals aside, threw a golf ball on the table, pointed to it and said, “One.  That Golf ball is a One.  We’re all agreeing to call it a One.  If the Golf ball is a One, what’s the Ping-Pong ball?”

We would expect everyone to say “One”. 

Then you toss a Tennis ball on the table.  “What’s that?”

“Two”

“Good.  From now on, we need you guys to shift your way of thinking to this scale, so we can more easily communicate and share work with each other.”

To put that another way, we must shift from thinking about size in terms of the physical measurement of a thing to thinking about its relative magnitude. It doesn’t matter how many inches, millimeters or hectares big a thing is if we can say it is 1, 2, 3, 5, or 8 times the size of a reference we all agree on.

Validation

Some of you reading this have intuitively recognized the value of relative sizing and therefore don’t require further proof. If you count yourself among the enlightened, feel free to skip this section. On the other hand, for the doubters among you…

I don’t normally advocate this next step, but it seems a lot of people prefer to validate their assumptions before accepting a new method.  So if we must break from the spirit of relative sizing by whipping out a measuring stick, I’d much rather you did it with the balls, than with actual user-stories. 

Let’s try checking our work and actually measure some of the balls.  If we take all the balls in each group, average their measurements, then throw a 25% window around that average, we create ranges that we can use to confirm the limits of each group.  The earlier hand-wringing we observed over whether a racquetball belongs as a 1 or 2 is seen here as well.  It clearly falls on the cusp between the two ranges.  In times like this it is faster to just err on the side of caution and round it up to the next larger size.

Table 2 – Ball Measures

Note: This is an unnecessary level of calculation. There is no absolute relationship between ball points and size. There is an approximate relationship. Don’t look for anything deeper than that.
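
For the doubters who want to see the arithmetic, here is a rough sketch of the averaging-and-window check in Python.  Treat it as an illustration under stated assumptions, not as the contents of Table 2: only the golf ball (1.68 in) and tennis ball (2.7 in) measurements come from this article, each group is represented by that single ball, the function names are made up, and the 2.05 in ball is purely hypothetical.

```python
from statistics import mean
from typing import Optional


def size_range(diameters: list[float], window: float = 0.25) -> tuple[float, float]:
    """Average a group's measurements and put a +/- window around the average."""
    average = mean(diameters)
    return average * (1 - window), average * (1 + window)


def ball_points(diameter: float, ranges: dict[int, tuple[float, float]]) -> Optional[int]:
    """Find the group whose range contains the ball; on a cusp, round up."""
    matches = [points for points, (low, high) in ranges.items()
               if low <= diameter <= high]
    return max(matches) if matches else None


# Assumption: one measured ball per group (golf for the 1's, tennis for the 2's).
ranges = {
    1: size_range([1.68]),   # roughly 1.26 to 2.10 inches
    2: size_range([2.7]),    # roughly 2.03 to 3.38 inches
}

print(ball_points(1.6, ranges))    # 1
print(ball_points(2.05, ranges))   # 2 (hypothetical ball on the cusp, rounded up)
print(ball_points(5.0, ranges))    # None (outside both of these ranges)
```

With these two measurements the ranges overlap slightly, so a ball on the cusp matches both groups and is promoted to the larger one, which is the same "err on the side of caution" rule used earlier.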

This was an interesting intellectual exercise, but its real value is in the fact that we verified that our relative groupings were “close enough”, and we don’t ever need to do this again.  (Trust me; it’s an agile anti-pattern for a reason!)  When we’re estimating a lot of things, the ability to do so quickly becomes important.  Hopefully, this exercise will reassure you that you don’t need to worry too much about the exactness of your estimates.  A range of uncertainty is an acceptable trade-off for the increase in speed!

High-Level Estimates

Remember that (most of the time) our ultimate goal in estimation is to provide a cost estimate, and/or validate our ability to meet a deadline. We are often asked to provide these data points before we even know a lot about the requested deliverables.  In the traditional world, high level estimates are often created by comparing a new piece of work with past pieces of work.  We figure out how big the old thing turned out to be, then assume this new thing is approximately the same size, plus or minus some pre-negotiated margin of error!

High Level estimation, a.k.a. Rough-Sizing or T-Shirt Sizing, is done when we know the least about the work being proposed.  Therefore, we accept that our Accuracy will probably be way off.  This level of estimation is generally presented as having a precision of +/- 50%.

What do you apply these estimates to?  Features or Epics – generally large units of work that we’d expect to take the team more than one iteration to complete. 

When is it appropriate to use this level of estimation?  At the beginning when we know the least. 

How will we use this level of estimation?  Well for one thing, we can use it to create a high-level roadmap.  The Product Roadmap provides a rough idea of whether it is possible to deliver a body of work in a given timeframe, and to provide a rough idea of the sequence of delivery.

Units of Measurement:  T-Shirt sizes (XS, S, M, L, XL, XXL, etc…)  <- Note, this is a shorthand nomenclature that doesn’t equate to cost or time (yet).  More on that in a bit…

As a measurement of relative size, it is more important to understand the relationship between the T-Shirt sizes.  We will employ a simple rule.  Each subsequent T-Shirt size doubles the previous one.  Thus, if we were to say that “Small” is equivalent to the amount of work in 1 Sprint, then “Medium” is twice that amount of work (2 Sprints), and a “Large” is double the medium (4 Sprints).  See Figure 3.

Figure 3 – High Level Estimates

This example assumes two-week Sprints
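
The doubling rule is simple enough to sketch in a few lines of Python.  This is only an illustration: the function names are made up, the values for XS and XXL are not stated above and simply follow from applying the same doubling rule in both directions, and “Small = 1 Sprint” is the same assumption used in the example.

```python
T_SHIRT_SIZES = ["XS", "S", "M", "L", "XL", "XXL"]


def sprints_for(size: str, small_in_sprints: float = 1.0) -> float:
    """Each T-shirt size doubles the one before it, anchored on Small."""
    steps = T_SHIRT_SIZES.index(size) - T_SHIRT_SIZES.index("S")
    return small_in_sprints * 2 ** steps


def rough_range(sprints: float, uncertainty: float = 0.5) -> tuple[float, float]:
    """High-level estimates are presented with a precision of +/- 50%."""
    return sprints * (1 - uncertainty), sprints * (1 + uncertainty)


for size in T_SHIRT_SIZES:
    estimate = sprints_for(size)
    low, high = rough_range(estimate)
    print(f"{size}: about {estimate} sprints (somewhere between {low} and {high})")
```

A Medium, for example, comes out as about 2 sprints, somewhere between 1 and 3.  The width of that range is a reminder of how little we know at this level.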

Mid-Level Estimates

Size estimation, a.k.a. Story Sizing, Story Pointing, or Fibonacci, is done when the team has had a chance to understand the Epics/Features, and is now taking part in the decomposition of these larger deliverables into well-understood, bite-sized units of work that can each be delivered within a single iteration.  These size estimates are generally accepted to have a precision of +/- 25%.

The most common scale used in Story Point sizing is the Fibonacci sequence.  If you remember back to your school days, the Fibonacci Sequence is defined as a sequence of numbers in which each number is the sum of the two previous values:  1, 1, 2, 3, 5, 8, 13, 21, 34, etc…  Fibonacci allows us to correct for another oddity in human perception.  When we start talking about relative size, identifying things that are double and triple in size is pretty straightforward.  But as things get bigger and bigger, we have a tendency to compress them in our minds, to the point that it is virtually impossible to distinguish whether something is 9, 10, or 11 times the size of another thing.  You can see how hard that would be, right?  Now imagine arguing whether something is 36 or 37 times the size of another thing!  By adopting the Fibonacci sequence, we’re introducing a distinction between values that’s pretty significant.  I may not be able to distinguish between 9 and 10, but I can still make a distinction between 8, 13 and 21.

Figure 4 – Relative Sizes are Not Absolutes

We can also leverage this phenomenon to keep our range of values contained.  Lower values are easier to comprehend, therefore have a lower level of uncertainty.  The greater the range between two consecutive options the greater the degree of uncertainty.  So, accuracy and precision can both improve by shifting to Story Points, AND selecting reference sizes that will keep your individual estimates on the smaller side.
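
To see how the Fibonacci scale absorbs those hard-to-distinguish values, here is a small Python sketch.  The function names are hypothetical, and rounding up to the next value is an assumption borrowed from the “err on the side of caution” rule used with the balls earlier; a team could just as reasonably agree on a different convention.

```python
def fibonacci_scale(limit: int) -> list[int]:
    """Distinct Fibonacci values up to and including limit: 1, 2, 3, 5, 8, ..."""
    scale = []
    a, b = 1, 2
    while a <= limit:
        scale.append(a)
        a, b = b, a + b
    return scale


def to_story_points(relative_size: float, scale: list[int]) -> int:
    """Snap a gut-feel 'times the reference' value up to the next scale value."""
    for value in scale:
        if relative_size <= value:
            return value
    raise ValueError("relative size is larger than the biggest value on the scale")


scale = fibonacci_scale(21)
print(scale)                                          # [1, 2, 3, 5, 8, 13, 21]
print([to_story_points(n, scale) for n in (9, 10, 11)])   # [13, 13, 13]
print(to_story_points(4, scale))                      # 5
```

The arguments over 9 versus 10 versus 11 disappear, because all three land on 13.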

Building a Product Roadmap

The purpose of a product roadmap is to establish a rough idea of what’s possible over a longer term.  To accomplish this, the Product Owner needs to gather the Development Team and any SMEs who would have expertise in the area being developed.

Sequence of Events

  1. The Product Owner presents the list of major features to the team.
  2. The team asks clarifying questions to gain an appreciation for the magnitude of the work involved in creating each feature.
  3. While that discussion is going on, an index card is created to represent each feature.
  4. The team places each feature card on the table in sorted order (smallest on the left, largest on the right).  The first card is simply placed in the center of the table; each later card is placed relative to the cards already there, according to whether it is bigger, smaller or about the same.  Cards are shifted to make room as necessary.
  5. We now have a simple sort of features from smallest to largest.
  6. Starting with the smallest Features, we discuss how big those items are.  “Does the team think those items could be completed in a couple of days?”  (No.)  “How about a couple of weeks?”  (Yes.)  “Since this is one of our smallest work items, and it will take one or more sprints, we appear to have a bunch of Epics.”
  7. This is the first point where the team injects their understanding of size into their estimate.
  8. Next, we ask about the cards that are near this one on the table.  If you think this smallest Epic is a couple of weeks of work, follow the sorted line until you get to one that is no longer in that neighborhood.  Everything to the left of that card is in the same size category.  Let’s call those “Smalls”.
  9. The determination of the Smallest thing is key to relative sizing.  At this point, it’s important to shift the team away from thinking explicitly about duration, and toward thinking about magnitude of work.  They are asked to think about the collection of Small Epics, and what they know about them.  Now we ask them to imagine what twice that amount of work would feel like.
  10. With this image of ‘twice as large’ firmly in mind, start with the card they identified as the first epic beyond Small.  “Does this epic feel like it is less than double the amount of work of the Smalls?”  (Yes.)  “Then this will be our first ‘Medium’ epic.  Let’s go back to the line of cards and identify the others that are close to that size.”  The first one that is more than double the Small is a Large.  Everything that falls between the last Small and the first Large is “Medium”.  We mark all the Mediums and, doubling again, find all the epics that are “Large”; then, doubling again, all those that are X-Large.

At that point, we had Epics grouped in doubling clusters.  We can summarize this as follows:

  • Small = 1 x (base) = 1 Sprint
  • Medium = 2 x (base) = 2 Sprints
  • Large = 4 x (base) = 4 Sprints
  • X-Large = 8 x (base) = 8 Sprints

(base) was assumed to be “a couple weeks”

  1. We need to acknowledge that these estimates were made very quickly, with minimal information.  We expect these estimates to have an uncertainty of +/- 50% 
  2. The only thing we’re really sure about is that each subsequent group is a double of the previous. 
  3. Assuming (base) is “a couple weeks” it is convenient to refer to (base) as “1 Sprint”.  Note that this is only an ASSUMPTION!  It is possible that everyone was uniformly optimistic.  We won’t know for sure until they start working.
  4. The Team has a WIP limit of 3, so will concentrate their focus on no more than 3 Epics at a time  — Dividing effort across multiple epics will increase the length of time it takes to deliver those Epics.  So, not finishing a Small in “1 Sprint” doesn’t necessarily mean the estimate was wrong (it’s certainly a reason to investigate further).
  5. How much rework for Alpha defects is being addressed?  Is it impacting development?
  6. How have we accounted for the loss of the tester?  I know the devs are picking up the slack, but it stands to reason that’s going to slow down their progress from our base assumption.
  7. We’ve added developers, should our WIP limit assumptions shift too?

But what if our assumption turns out to be false, and (base) is really 3 weeks?  Our doubling can still help.

  • Small = 1 x (3 weeks) = 1.5 Sprints
  • Medium = 2 x (3 weeks) = 3 Sprints
  • Large = 4 x (3 weeks) = 6 Sprints
  • X-Large = 8 x (3 weeks) = 12 Sprints
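
The re-scaling above can be sketched in Python.  The names here are hypothetical, and the two-week Sprint length is the assumption stated earlier; the multipliers are the doubling clusters from the summary.

```python
# Doubling clusters: the one thing we are reasonably sure about.
MULTIPLIERS = {"Small": 1, "Medium": 2, "Large": 4, "X-Large": 8}


def roadmap_in_sprints(base_weeks: float, sprint_weeks: float = 2.0) -> dict[str, float]:
    """Convert each size to Sprints for a given assumption about (base)."""
    return {size: multiplier * base_weeks / sprint_weeks
            for size, multiplier in MULTIPLIERS.items()}


print(roadmap_in_sprints(2))
# {'Small': 1.0, 'Medium': 2.0, 'Large': 4.0, 'X-Large': 8.0}

print(roadmap_in_sprints(3))
# {'Small': 1.5, 'Medium': 3.0, 'Large': 6.0, 'X-Large': 12.0}
```

Only `base_weeks` changes between the two calls.  The relative sizes of the Epics stay put, so nothing needs to be re-estimated.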

Improving the Assumptions

Break the Epics into Stories, Size the Stories, Total the Sizes. 

We tricked Jira into reflecting the Product Roadmap layout by saying that the Team’s assumed Velocity was 30 and, in order to align that with a WIP limit of 3, assigning sizes to our Epics in Jira in units of 10:

  • Small = 10 SP
  • Medium = 20 SP
  • Large = 40 SP
  • XLarge = 80 SP

Thus, 3 Small Epics with size of 10, worked on in a single Sprint, would deliver 30 points, matching our Velocity.   Note that this conversion made Jira look like the spreadsheet, but that didn’t make it right!

In fact we quickly ran against the rocks when we started decomposing stories, and those stories totaled more points than the table above would indicate.   That’s actually a good thing!  If we just replaced the Story Point “Size” of each Epic with the sum total of the Stories beneath it, the Portfolio plug-in would adjust to accommodate that.  Unfortunately, we made TWO tweaks … remember we adjusted expectations to provide a WIP limit.  If the team’s velocity goes from 30 to 45, then the (base) assumption we’re making would change from 10 SP to 15 SP.
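
The arithmetic behind those two tweaks can be sketched as follows.  This is plain Python and does not call Jira or the Portfolio plug-in in any way; the function names are made up, and the story sizes in the last example are hypothetical.

```python
MULTIPLIERS = {"Small": 1, "Medium": 2, "Large": 4, "XLarge": 8}


def placeholder_epic_sizes(velocity: float, wip_limit: int) -> dict[str, float]:
    """Placeholder Epic sizes chosen so that wip_limit Smalls fill one Sprint."""
    base = velocity / wip_limit
    return {size: base * multiplier for size, multiplier in MULTIPLIERS.items()}


def decomposed_size(story_points: list[int]) -> int:
    """Once an Epic is broken into Stories, its size is the sum of the Stories."""
    return sum(story_points)


print(placeholder_epic_sizes(30, 3))
# {'Small': 10.0, 'Medium': 20.0, 'Large': 40.0, 'XLarge': 80.0}

print(placeholder_epic_sizes(45, 3))
# {'Small': 15.0, 'Medium': 30.0, 'Large': 60.0, 'XLarge': 120.0}

# Hypothetical 'Small' Epic whose Stories total more than the placeholder.
print(decomposed_size([5, 8, 3]))   # 16
```

The placeholder sizes depend on both the assumed Velocity and the WIP limit, which is why a change in either one shifts the (base) value.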

If I had a tractor…

Agile Estimation and Planning

Introduction

With the advent of agile methods and their affinity for rapid delivery cycles, traditional estimation models are coming up wanting.  Agile estimation, while undoubtedly faster, is also difficult for some to grasp.  Expressing work in terms of time has been burned into the software community’s psyche.  But it doesn’t have to be that way.  I coach agile teams in making the switch from traditional development models to agile.  In doing that, I help them learn new ways of estimating.  This series of articles will hopefully explain, through metaphor, why the old way of estimation was problematic, and how the new way addresses those shortcomings, is faster, is no less accurate, and can even be fun!

If I were to stand up in front of a room full of people, and select individuals at random to answer this question:  “How big is your lawn?”, I would get a variety of responses.  Many common answers are “1/4 acre, 1/2 acre, 10000 square feet, or maybe 900 square yards”.  The answer I would not expect to get is “45 minutes”.

Why is that?  For one thing, I asked for a size estimate, not a time estimate.  Yet, all over the globe, if I ask a software developer or project manager to estimate the size of a feature, they will generally answer in a number of hours, days, weeks, months, etc.

Time instead of Size.

This behavior has been ingrained in the development community for decades.  Agile methods have attempted to break that habit by requesting effort estimates as a measurement of relative size (often story points).

Indeed, I have been coaching and guiding teams long enough to have seen a variety of attempts to shift teams to “how big?” rather than “how long?”. Admittedly, it’s a really tough transition to make. And a very common way of approaching that transition is to offer a crutch – usually a conversion factor … you know, so the team members can wrap their heads around the concept more easily.  A common one I’ve seen is 1 Story Point = 1 Ideal Day.  SAFe offers the crutch that 1 Story Point is about 8 hours.  I’ve seen someone recently amend that to a range from 1 to 10 hours.

I’ve adopted a lot of truisms in my years of observation and continuous improvement. One that has stuck with me consistently through my agile career is simply this:  “There is no direct relationship between Story Points and Time.”  If I want to leave some wiggle room, I might say “no reliable relationship” or “no consistent link”.  Of course, then I’ll toss in that the relationship between story points and time is not constant from team to team.  Any attempt to build that bridge will be fraught with drama, and lead to disappointment.

But nobody believes that.

And so, again and again, those who won’t acknowledge the past, repeat it over and over.

I’m hoping this series of articles on Agile Estimation and Planning will serve to save even a handful of you from following so many others in walking off that cliff.  If nothing else, it’s a chance for me to hone my explanation.

Feedback is welcome, please weigh in.

Part 1 – Size

Let’s go back to the lawn example, and start with a simple explanation as to why time is a terrible way to measure size…

I asked you how big your lawn is. I want you to think of your lawn. Think of the dimensions, think of how much of it is actually available for mowing.  Yes, that’s right. The grass.  Not the driveway.  Not the garden, nor the planting beds.  Not the patio or the stoop.  Got it?  First of all, notice that the size of your lawn is not the size of your lot.  The lot represents the dimensions of your property.  But not all of that is lawn.

Are we good so far?  I don’t want to lose you.

Now.  Picturing your lawn, and only your lawn:  How big is it?  Odds are, unless you’ve played this game before, you can’t give me an exact answer. You probably could if I gave you enough time and a tape measure.  You could probably get me a pretty exact measurement of the lawn part of your property.  But I have two problems with that.  First, it would take too long, and second, that wouldn’t be an estimate anymore, would it?  It would be an actual.  Traditional estimation models excel at doing this (pun intended), by giving you a really long time to lull yourself into a false sense of security in your ability to predict the future.  You may have noticed this activity is often followed later by a check of how accurate your estimate was — or more to the point, how wrong you were.  “We gave you months to predict the future! Why were you unable to give us a better estimate?”  That’s a hell of a way to start a conversation, isn’t it.  Let’s try to accentuate the positive a little, shall we?

In a moment we’ll get back to talking about the size of your lawn…by comparing it to something else!  Studies have shown that while humans are pretty bad at exact estimates, they are very good at comparing things.  This is a survival skill you learned as a small child – especially if you had siblings.  Remember family desserts?  Remember when there was one piece of cake left and both you and your pesky brother (or sister) wanted it? Remember how your mother, with the wisdom of Solomon, suggested you split the piece? The rules were simple: One of you sliced.  The other one chose – which gave you all sorts of incentive to slice well.  But no matter how carefully the slicing was carried out, one piece wound up slightly bigger than the other one, and I’ll wager my half of the cake that you could tell exactly which half was bigger (the one you wanted), just by looking at it.  Or perhaps they were actually close enough that there was no real distinction between them, and you both got to eat a piece of cake with the satisfaction of knowing nobody got cheated.  We are going to leverage this skill in Agile Estimation.  The ability to look at two things, and declare whether one is bigger or smaller will prove valuable here!

Ready?  Do you need some cake first?  We don’t have time for cake.  Cake is for closers.  Are you a closer?  Let’s see what you’ve got.

Think about your neighbor’s lawn.  Let’s see if you can answer a simple relative sizing question.  Is your neighbor’s lawn, bigger, smaller, or about the same size as yours?  If you live in the suburbs this part will be really easy. Odds are pretty good that your neighbor not only has the same size lot as you, they probably also have a similar amount of grass in their yard.

So, what’s your answer? Is one lawn bigger than the other? Or perhaps, they’re close enough that you’re willing to call them even.  We’re talking now about degree of precision. When we ask for estimates, we should not be asking for perfection.  We just want to know within a margin of error.

To make this next part easier, I’m going to switch to talking about myself for a sec, partly because I’m having trouble seeing your lawn, but also because my neighborhood provides a particularly compelling example.

I can say with a fairly good degree of certainty that not only do my two adjacent neighbors and I have similar-sized lots, but the amount of open grass is pretty similar among the three – close enough at any rate.  That means, in terms of relative sizing, our three lawns are the same size.  How big is that?  Not sure yet, let’s think a bit farther.

Now think about your neighborhood, and the finite universe of houses and lawns that surround you.  Can you think of any lawns smaller than yours?  Can you think of any that are bigger?  If you can answer ‘yes’ to either of those questions, then the next question will be “How Much Bigger, or How Much Smaller”, and because we haven’t established a unit of measurement yet, let’s just use relative sizes.

Returning to my yard, the houses in the middle of the block all have the same lot size.  I also can’t think of any houses in the area with a smaller yard; but I can think of a few yards on the corners that are bigger… almost twice as big!  And then if I think about all the houses on the block, there is one house whose lawn, due to the curve of the road, I’m pretty sure is closer to triple the size of mine.

I’m going to establish an arbitrary starting point now.  Based on my understanding so far, my yard and my neighbors’ are all 1-point yards, the corner houses have 2-point yards, and the odd-shaped yard is a 3-point yard.  The chief advantage at this point is that this measurement was FAST.  Not a lot of time spent agonizing over it.  It was a quick 1, 2, 3 and we’re ready to move on.

Figure 1.1

I now have three data points I can use to compare other work to.  If someone asks me over to their house and says, “Could you estimate my lawn?”, I need only look at it and compare it against the three data points in my head.  Is the new lawn bigger, smaller or about the same as one of those other lawns?  And if it is bigger or smaller, how much so?

The other advantage is that this estimate is PORTABLE – meaning it can be applied to the work of more than one team.  We’ll start to see why that’s important in the next section.
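For readers who think better in code than in grass, the comparison against my three reference lawns can be sketched roughly as follows.  This is only an illustration: the function name and the idea of passing a numeric "about how many of my lawns is this?" ratio are my own assumptions, not a prescribed technique.

```python
# The three reference lawns from the walkthrough, keyed by point value.
REFERENCE_LAWNS = {
    1: "mid-block lawn (mine and my neighbors')",
    2: "corner lot",
    3: "odd-shaped lot on the curve",
}


def estimate_points(relative_size: float) -> int:
    """Snap a gut-feel ratio to the nearest reference point value.

    relative_size is a hypothetical input: "about how many of my lawns
    would fit in this one?" It is a quick comparison, not a measurement.
    """
    return min(REFERENCE_LAWNS, key=lambda points: abs(points - relative_size))


print(estimate_points(1.1))  # 1: close enough to call it even with mine
print(estimate_points(1.8))  # 2: about the size of a corner lot
print(estimate_points(3.2))  # 3: about the size of the odd-shaped lot
```

Notice that nothing in the sketch mentions minutes or hours.  The estimate only says which known lawn the new one resembles.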

Part 2 – Scope

At this moment, all I have is a cursory understanding of how a few pieces of work relate to one another.  There are a myriad of activities that one could perform in support of a lawn.  We’re going to focus initially on mowing.

One weekend not so long ago, I looked at a long list of activities I needed to work on, and was trying to figure out how much I could do with the time I had.  One of the things that I have a pretty good handle on is mowing my lawn.  My definition of “Done” was pretty basic: gas up the mower, make sure it was set to “mulch”, and wander the property until all the grass was a nice, mostly uniform height. On most weekends, using my self-propelled lawnmower, I have been able to complete that job in about 45 minutes.

“Ah HA!”, I hear you cry, “Time!  Your one-point lawn takes 45 minutes.  I’ve cracked the code!”

Well.  That’s nice.  Good for you!  But before you pat yourself on the back and start dividing my backlog into 45-minute increments, I’m going to wrinkle this up for you a bit.

Last summer, I hired someone to mow my lawn for me.  Don’t judge.  Work was demanding, and it was hot, and once you hire one of those guys, they keep coming back.  So my point is this:  Ed didn’t have a measly self-propelled walk-behind mower.  No.  He had one of those zero-turn-radius, stand/ride-behind monster machines that goes zero to sixty in about 2 seconds.  He fired that bad boy up, and voilà! My lawn was freshly shorn in 10 minutes!

So, let me ask you this: Did my lawn change size?  If there is a universal constant between time and size, then Ed’s performance has reduced my 1-point story to something under a quarter point.

I submit to you that my lawn is in fact, the exact same postage stamp it has always been.  If it was 1 point before, it is 1 point now.  But something is clearly different.  Even I have to admit that 45 minutes vs. 10 is pretty compelling (or at least disheartening).

The difference has very little to do with the lawn itself.  The difference is that Ed, with his superior machine, is capable of delivering the same work in less time than I am.  How can I use this?  Well, if you take my 1-point lawn, and assume that every approximately 1-point lawn will take approximately the same amount of time (for me), then I can use that understanding to suggest how many 1-point lawns I believe I could fit into a weekend.  Given what I know, I think I could complete approximately 16 points’ worth of lawn.  But my friend with the impressive hardware can do many more!  The math whizzes out there have probably already concluded that Ed can do six 1-point lawns in an hour.  So given an eight-hour workday and two days in a weekend, he’s going to be pounding out 96 points’ worth of lawns every weekend.  Those same math whizzes will point out that my sixteen-lawn estimate for myself is obviously under-represented.  I should be able to do four lawns every 3 hours, so with two 8-hour days, I should be able to do 21 points’ worth of lawns in a weekend, and if I just did a little overtime, I could pull off 22!

Isn’t math awesome?

and Terrible?

Allow me to toss a little water on those flames of victory you’re dancing around.  Unless you are going to work things out so those 21 lawns are placed end-to-end next to each other and set up my mower to be perpetually full of gasoline, I can’t possibly attain the number you’ve so carefully assigned to me.  I have to move the mower from one location to another.  I have to refuel at every job.  I have to stop to take a drink of water, or eat lunch, or answer a call.  Your assumption of my velocity seems to be missing a few things.  Your assumption of Ed’s velocity is just as wrong, by the way.  To get that behemoth of his to the next job site, he needs to drive it up onto a trailer.  Where would we account for loading and unloading time?  Where do we lump the travel time between locations?  What about bathroom breaks?

The capability of a team is made up of a lot more than just the sum of the hours of work they perform.
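The math-whiz arithmetic above, and the hole in it, can be sketched in a few lines.  The mowing times (45 and 10 minutes) and the 16-hour weekend come from the story.  The per-lawn overhead figures (15 and 6 minutes) are made-up numbers of mine, chosen only to show how quickly travel, refueling and breaks eat into the "pure math" figure.

```python
WEEKEND_MINUTES = 2 * 8 * 60  # two 8-hour days, as in the story


def naive_lawns(minutes_per_lawn: float) -> int:
    """Whole 1-point lawns per weekend if nothing but mowing ever happens."""
    return int(WEEKEND_MINUTES // minutes_per_lawn)


def lawns_with_overhead(minutes_per_lawn: float, overhead_per_lawn: float) -> int:
    """Same calculation once every job also costs travel, refueling, breaks, etc.

    overhead_per_lawn is a hypothetical figure; the story gives no number.
    """
    return int(WEEKEND_MINUTES // (minutes_per_lawn + overhead_per_lawn))


print(naive_lawns(45))              # 21: the number the math whizzes assign to me
print(naive_lawns(10))              # 96: the number they assign to Ed
print(lawns_with_overhead(45, 15))  # 16: with an assumed 15 minutes of overhead per lawn
print(lawns_with_overhead(10, 6))   # 60: with an assumed 6 minutes of loading and travel
```

The exact overhead values do not matter.  What matters is that the naive division treats the work as if it were the only thing that takes time.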

Part 3 – Definition of Done

Something else happened when I hired Ed to mow my lawn.  He brought friends.  Not only was Ed mowing my lawn, but he had another guy with a gas-powered trimmer and yet another guy with a leaf-blower helping him.   So when he was “Done” with my lawn, not only had he completed the mowing job in less time, but he had DONE MORE than I ever did in my 45 minutes.  Trimming!  Edging!  Cleanup!  And once my wife realized those were on the table, those all became desirable additions to my lawn mowing regimen as well!  It was no longer sufficient for me to just gas up my mulching mower and try to get as close as I could to the flower beds.  I needed to pull out my electric trimmer, and follow myself around the yard, then edge the sidewalk, and then make sure I swept the sidewalk and driveway.  Guess what!  My 45-minute job wasn’t 45 minutes anymore!  Ed has a team of three people.  My team still only has me on it.  All that additional work added another hour to the time it takes me to complete a typical 1-point lawn (thanks a lot, Ed).

I ask again.  Think carefully.  Has the lawn changed size now?

No, the lawn is still the same lawn it always was.  But the definition of what constitutes a completed, quality job has certainly changed!  And because I now need to do more things to complete the same-sized lawn, my rate of delivery will drop.

Understanding this distinction is absolutely key!  The job didn’t change size, the level of expectation changed, and my team failed to adapt to that change.  The result was a measurable impact on my ability to deliver value (completed lawns).

Why would I insist on looking at it this way?  Consider this:  Recall, earlier I had estimated that I could deliver 16 points of lawns in a weekend – sixteen lawns just like mine.  In order to compete with Ed’s lawn service, I can no longer get away with a mow-only task list.  I need to perform the same quality tasks that Ed provides!  If I equated size to time, I would now have to re-estimate all of the lawns in my list!  Instead of 45 minutes, they’re taking 45+60=105 minutes.  By that magic conversion factor of 1 point = 45 minutes, those 1-point lawns in my neighborhood are now 2.33 points each (because math).  The corner lots are now 4.66 and the weird lot is a whopping 6.99.

Also, since they already had the calculator out, the project manager/mathematicians went even further: “By my calculations, in the same 16-hour work weekend, you can now do (16 hours x 60 minutes per hour) / 105 minutes per lawn = 9 lawns.”

See how easy this is?  You changed the scale and did math again, trying to get me to commit to something like (2.33 x 9 = 20.97 — aw heck, let’s say 21 points).  When a 1-point lawn was 45 minutes, you demanded 21 points.  Now that a 1-point lawn is 105 minutes, you still want 21 points.  Except that before, that 21 points would have delivered 21 pieces of customer value, and now that same 21 points is only delivering 9 pieces.  My team appears to have maintained velocity (that’s good, right?), but we’re not advancing anywhere near that rate!  Only 9 customers are satisfied by the same 21 points.

Also keep in mind, that I’m still not accepting your math.  I was only willing to commit to 16 points per weekend.  And I’m not going to re-estimate everything on my backlog, either.  THE LAWNS DIDN’T CHANGE SIZE!

With the impact the new Definition of Done is bringing to the table, unless I do something to correct my team makeup, you’re going to see my velocity drop like a stone to 7 points – seven lawns instead of sixteen.  And that very rightly should set off a red flag somewhere.
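Here is a small sketch of the two ways of looking at the same change.  All the inputs come from the story above.  The one assumption of mine is the last step: the story doesn’t show how the drop to 7 was worked out, so I scale my 16-point commitment by the ratio of old time to new time, which happens to land on the same number.

```python
MINUTES_PER_POINT = 45     # the "magic conversion factor": 1 point = 45 minutes
WEEKEND_MINUTES = 16 * 60  # the 16-hour work weekend

old_minutes = 45           # mow only
new_minutes = 45 + 60      # mow, then trim, edge and clean up

# Size-as-time: every lawn is re-estimated and the point scale inflates.
inflated_points = round(new_minutes / MINUTES_PER_POINT, 2)
lawns_finished = WEEKEND_MINUTES // new_minutes
inflated_velocity = round(inflated_points * lawns_finished, 2)

# Size-as-size: every lawn stays at 1 point, so velocity is what moves.
committed_velocity = 16    # my mow-only commitment
# Assumption: the commitment shrinks in proportion to the extra time per lawn.
new_velocity = round(committed_velocity * old_minutes / new_minutes)

print(inflated_points, lawns_finished, inflated_velocity)  # 2.33 9 20.97
print(new_velocity)                                        # 7
```

In the first view the velocity still reads about 21 while only 9 customers get a finished lawn.  In the second view the number falls from 16 to 7, and the red flag goes up where it should.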

Part 4 – Velocity

When we talk of delivering value to our customers, we need to be very careful that we only count things that we’ve actually completed. The Definition of Done says I must mow, trim, edge and clean to meet my customer’s expectations. If I fail to deliver on any single aspect of that definition, the customer could reject my work, and refuse to pay me.

Think about that.  Even if I spent every minute of my weekend pushing the lawnmower around, and managed to cut the grass on twenty lawns, if I never touched the trimmer or broom, then despite the fact that I was BUSY the whole time, I didn’t actually COMPLETE any of the work.  What do I do now?  What do my customers do?  Do I just come back next weekend and do the trimming, edging and sweeping then?  Nice theory, but the grass will all have grown back by then, so now I’ll spend all weekend finishing the jobs from last weekend, but the customer will look out and still see incomplete work, and they’d be well within their rights to withhold payment again.

This is why the Agile world is OUTCOME-DRIVEN. It doesn’t matter that we are keeping our people active and working (busy) if they aren’t finishing everything in the Definition of Done.  When you get right down to it, time estimates are not helpful.  We don’t deliver time.  Our customers don’t consume hours.  They really don’t care how busy we were.  They care about results.  So we need a measurement of our ability to deliver VALUE.  In Agile, we call this Velocity.

Velocity is a measurement of the number of points of value that a team is able to complete in an iteration.  Only work that meets the Definition of Done is counted toward Velocity.  Partially completed work may as well not exist.
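The rule that only finished work counts can be sketched like this.  The four tasks come from the Definition of Done above (mow, trim, edge, clean).  The `Lawn` class and the two example weekends are hypothetical names and data of mine, there purely for illustration.

```python
from dataclasses import dataclass, field

# The Definition of Done from the story: all four tasks, every lawn.
DEFINITION_OF_DONE = frozenset({"mow", "trim", "edge", "clean"})


@dataclass
class Lawn:
    points: int
    finished_tasks: set = field(default_factory=set)

    def is_done(self) -> bool:
        # Done means every task in the Definition of Done was finished.
        return DEFINITION_OF_DONE <= self.finished_tasks


def velocity(lawns) -> int:
    """Sum the points of lawns that meet the Definition of Done.

    Partially completed lawns contribute nothing.
    """
    return sum(lawn.points for lawn in lawns if lawn.is_done())


# Hypothetical weekend 1: twenty lawns mowed, nothing else touched.
busy_weekend = [Lawn(1, {"mow"}) for _ in range(20)]

# Hypothetical weekend 2: six lawns fully done, one corner lot half done.
good_weekend = [Lawn(1, {"mow", "trim", "edge", "clean"}) for _ in range(6)]
good_weekend.append(Lawn(2, {"mow", "trim"}))

print(velocity(busy_weekend))  # 0: busy all weekend, nothing completed
print(velocity(good_weekend))  # 6: the half-done corner lot does not count
```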

Velocity is a very useful metric, but it is not a universal constant.  It is a value that is unique to every team, based on their skills, tools, team structure, etc.

No matter what, Ed’s team is going to have a higher velocity than mine.  First, their equipment is better – I can’t compete with that monster mower.  And second, his cross-functional team can do tasks simultaneously.  In terms of clock time, from the moment they pull the equipment off the trailer until the time they roll it back up, something like 15 minutes goes by… (this includes Ed coming to the door for his payment).

Despite the fact that I’m telling you not to estimate in time, the concept of time keeps working its way into the conversation.  For instance, I just said Ed’s team takes 15 minutes from wheels down to wheels back up on the trailer.  That’s time, isn’t it?  You’re right, I did talk about the duration of the job.  But I didn’t ESTIMATE in time.  Do you understand why?

The reason we want to stay away from time estimation is that time estimation is not portable!  It cannot translate from one team to another.  When Ed’s team takes 15 minutes and I take 105 minutes for the same lawn, an estimate in minutes made by one of us tells you nothing about how long the other will need.

My hours are not your hours.  The way I work, and the experiences I’ve had will ultimately affect the amount of time it takes for me to deliver.  What if Jeremy, the kid across the street, grabs his dad’s lawnmower and decides to get into this lawn-cutting game for some extra pocket money?  As it turns out, Jeremy’s mower is exactly the same as mine because his father and I both bought them when the local big-box store had a killer overstock sale.  Jeremy and I have the same equipment at our disposal.  Does that mean he is going to take the same amount of time that I will to mow my lawn?  I will bet you the answer is NO.  Whether Jeremy and I have different levels of strength, speed, attention span, or even experience, I can virtually guarantee that we will perform the same amount of work at different speeds.

Furthermore, suppose Jeremy or I walk up to a house on the next block, intent on giving our estimate, and are instead informed that Ed was already there and told them it’s a 15-minute job.  Does that mean I’m required to accept that estimate?  How many times in your professional career have you found yourself being held to an estimate made by someone else?  It happens all the time – even more disturbing when that estimate isn’t even made by someone who does that work.

Part 5 – Cross-Functional Teams

Let’s assume you’re someone who’s going to fund the mowing of lawns.  Maybe you’re the head of the homeowners association, and you’re in the process of subcontracting the upkeep of lawns in the subdivision.  My team originally pledged 16 points, and my competition pledged 60 points.  But after you saw the additional services they were providing, you requested that we have a uniform Definition of Done across both teams.  This threatens to tank my performance!

How could we reconcile this?  “What,” you may ask, “do I need to do, to maintain the higher committed velocity?”

I suppose I could throw caution to the wind, and try running behind the mower!  I might be able to bring the time it takes to mow down a bit, but I’m probably going to miss some spots.  I don’t think that will ultimately solve the problem.

I could work overtime, mowing long into the evening hours.  That has potential, but now I’m going to get way more tired, and likely make mistakes.  The quality of my work will certainly suffer, and I now incur the risk of the customer rejecting the work. What would you do?

How about adding people to my team to handle these other tasks in the Definition of Done?  Ed has a cross-functional team (mower, trimmer, sweeper), where I had only a single generalist on my team.  If I got someone to run the weed whacker while I mowed, and whichever one of us finished first then swept the sidewalk, I might be able to bring my clock time down to under an hour.  The math would support us saying we could reach 8 points per day then, but I’d still feel a lot better calling it 6 or maybe 7.  Either way my Velocity for the weekend could increase from around 7 if I go it alone, to a real possibility of 12-14.  Not quite 16, but let’s face it: when I was just mowing the lawn, there was still long grass growing against the fence and the planting beds that the mower just couldn’t reach.  There were still grass clippings on the sidewalk, and let’s not even talk about the edging!  In short, my quality was low, and the other team revealed that fact to the stakeholder.  I had to adapt or risk losing the gig.  So I expanded my team.

This notion of a cross-functional team is very powerful.  It allows us to build teams with all the skills needed to achieve our Definition of Done in one iteration, without relying on outside help.
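To see why points travel between teams while hours don’t, here is a rough sketch that forecasts the same backlog for three teams.  The velocities are the ones from the story (7 for me alone under the new Definition of Done, the cautious end of 12-14 with a helper, 60 for Ed’s crew).  The 84-point backlog is a made-up number of mine, as are the function and variable names.

```python
import math


def weekends_to_finish(backlog_points: int, velocity: int) -> int:
    """Whole weekends a team needs to finish the backlog at its own velocity."""
    return math.ceil(backlog_points / velocity)


BACKLOG_POINTS = 84  # hypothetical: the subdivision's lawns, sized once in points

# Points per weekend for each team, taken from the story.
velocities = {
    "me, solo": 7,
    "me plus a helper": 12,
    "Ed's crew": 60,
}

for team, points_per_weekend in velocities.items():
    weekends = weekends_to_finish(BACKLOG_POINTS, points_per_weekend)
    print(f"{team}: {weekends} weekend(s)")
# me, solo: 12 weekend(s)
# me plus a helper: 7 weekend(s)
# Ed's crew: 2 weekend(s)
```

The lawns are sized once and never re-estimated.  Only the velocity changes from team to team, and that is the number each team owns.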

Part 6 – Multi-Tasking and Work in Process

How good are you at multi-tasking?

Part 7 – Stable Funding

…or more importantly, our value.  What if I want to find out when I can expect delivery on the 16 points of lawns in Figure 1.1?  It’s a valid question.  It happens all the time in business.