Agile Metrics

10 Powerful Agile Metrics – and 1 Missing Metric

What are Agile Metrics?

Agile metrics help agile development teams and their management measure the development process, gauging productivity, work quality, predictability, and health of the team and products being developed. A key focus of agile metrics is on value delivered to customers – instead of measuring “what” or “how much” we are doing, we measure how it impacted a customer.

Types of Agile Metrics

There are three important families of agile metrics:

  • Lean metrics – Focus on ensuring a flow of value from the organization to its customers and eliminating wasteful activities. Common metrics include lead time and cycle time.
  • Kanban metrics – Focus on workflow, organizing and prioritizing work and getting it done. A common metric is a cumulative flow.
  • Scrum metrics – Focus on the predictable delivery of working software to customers. Common metrics include the burndown chart and team velocity.

The Importance of Agile Testing Metrics

Agile methodologies place a special emphasis on quality because the end goal is delivering working software to users – buggy or unusable software is not working software. Quality is also manifested in internal aspects that are not directly visible to customers, such as code quality, maintainability and technical debt.

Agile testing metrics can help teams measure and visualize the effort spent in software quality, and to a certain extent, the results of this effort. For example, the escaped defects metric measures, across versions, sprints or product lines, how many bugs were discovered in production – whereas ideally bugs should be discovered and fixed during the development stage.

What makes for a powerful metric in an agile environment?

Agile environments require metrics that are well understood by teams and can help learn and improve processes.

Here are a few qualities that make a metric powerful, in the sense that it can help drive positive improvement in an agile team:

  • The metric is used by the team – Agile metrics should not be imposed or measured by management, they should be used voluntarily by agile teams to learn and improve.
  • The metric is surrounded by conversation – Metrics should not just be numbers, they should be the starting point of a conversation about process and roadblocks affecting the team.
  • The metric is part of a specific experiment – Metrics should be used to answer a specific question about agile processes, not just measured for the sake of measurement.
  • The metric is used in tandem with other metrics – Even a great metric, if used alone, might lead to tunnel vision, and incentivise teams to maximize that metric at the expense of all else. Using several metrics together provides a balanced picture of agile activity.
  • The metric is easy to calculate and understand – Metrics that are overly complex or not fully understood, even if they provide good insights about a team’s work, are not useful in guiding day-to-day activities.

These qualities were inspired by work by Leo Tranter and Joel Bancroft-Connors.

10 Powerful Agile Metrics

1. Sprint Burndown

The sprint burndown chart visualizes how many story points have been completed during the sprint and how many remain, and helps forecast if the sprint scope will be completed on time.

Why it is powerful: Makes it instantly clear how much value a sprint has already delivered and how close we are to completing our commitment to customers.

2. Agile Velocity

Velocity measures how many story points were completed by a team, on average, over the past few sprints. It can be used to predict the team’s output in the upcoming sprints.

Sprint velocity

Why it is powerful: Velocity is powerful because it’s a result metric – how much value was actually delivered to customers in a series of sprints. Be careful not to compare velocity across teams because story points and definition of done can vary between teams.

3. Lead Time

Lead time measures the total time from the moment a story enters the system (in the backlog), until it is completed as part of a sprint, or released to customers. It measures the total time for a requirement to be realized and start earning value – the speed of your value chain.

Image Source: Screenful

Why it is powerful: In a sense, lead time is more important than velocity because it measures the entire agile system from end to end. Reducing lead time means the entire development pipeline is becoming more efficient.

4. Cycle Time

As illustrated above, the cycle time is a subset of lead time – it measures the time for a task to go from “started” or “in progress” to “done”. Normally, cycle times should be around half the sprint length. If cycle times are longer than a sprint, teams are not completing work they committed to.

Why it is powerful: A very simple metric that can raise a red flag when items within sprints across your entire system are not moving forward.

5. Code Coverage

Code coverage measures the percentage of your code which is covered by unit tests. It can be measured by the number of methods, statements, branches or conditions which are executed as part of a unit test suite.

Why it is powerful: Code coverage can be run automatically as part of every build and gives a crude picture showing how much of the codebase has been tested. A low code coverage almost always indicates low code quality. However, a high coverage may not equal high quality, because there are other types of tests – such as UI or integration tests – which are not counted.

6. Static Code Analysis

While not exactly a metric, this is an automated process that can provide insights into code quality and clean code from simple errors redundancies. Code quality, while difficult to define and measure, is known to be a key contributor to software quality in general, and in particular, software maintainability.

Why is it powerful: Static code analysis provides a safe baseline for code quality. However, it is no substitute for human input into code quality, via manual code reviews, pair programming or other methods.

7. Release Net Promoter Score

Net Promoter Score (NPS), calculated for a software release, measures whether users would recommend the software to others, do nothing, or recommend against using it. It is an important gauge of customer satisfaction.

Image Source: Wootrick

Why it is powerful: The ultimate test of agile development is providing value to a customer. If customers are recommending this new release to others, that is a clear indication of success. If not, you can use this as a warning metric and use other data to understand what’s wrong.

8. Cumulative Flow

This is a kanban metric which shows the status of tasks – in a sprint, a release or across software teams. It can visualize bottlenecks in the process – a disproportionately large number of tasks in any of the workflow stages indicates a problem. For example, a big “bubble” in the chart in a verification or testing stage indicates this stage has insufficient resources.

Image Source: MicroTool

Why it is powerful: As with the burndown chart, the power of this metric is in its visual simplicity – you can grasp a process in one glance and immediately identify issues. Cumulative flow lets you catch problems in mid-process before they result in delayed delivery.

9. Failed Deployments

Measures the number of deployments (either to test, production environments, or both). Can help understand how solid environments are and whether teams are really building potentially shippable software.

Why it is powerful: Especially when applied to production environments, this metric can provide a clear indication that sprints or releases are production ready, or not.

10. Escaped Defects

The number of bugs discovered only after a build or release enters production. Escaped defects should ideally be zero. Measuring them across releases or teams provides a crude, but still highly relevant, a measure of deployed software quality.

Why it is powerful: Production bugs, especially if frequent, are a problem in the agile process. Just like in lean manufacturing, we should “stop the production line” and discover what’s wrong.

The Missing Metric: Quality Intelligence

We presented several powerful metrics that provide important insights into the agile process. However, there is no single metric as clear or powerful as the burndown chart or cycle time, which can tell us the most important thing: “how good” is the software being built by our developers.

A new category of tools called Software Quality Intelligence can provide this missing metric: a clear view of software quality. SeaLights is a platform which combines data about code changes, production uses and test execution, to provide the following quality metrics:

  • Test gap analytics—Identifying areas where the code was recently changed or executed in production but is untested. Test gaps are the best place to invest resources to improve quality.
  • Quality trend intelligence—Showing which parts of a system are improving in quality coverage, and which are getting worse—meaning more testing time should be invested.
  • Release quality analytics—SeaLights performs real-time analytics on hundreds of thousands of test executions, code changes, builds and production events to assess the readiness of a release. Which build is best and provides the highest quality for users?


PEARL III: Principles of Lean Software Development for Agile Methodology

PEARL III: Principles of Lean Software Development for Agile Methodology

The market for software is fast paced with frequently changing customer needs. In order to stay competitive companies have to be able to react to the changing needs in a rapid manner. Failing to do so often results in a higher risk of market lock-out , reduced probability of market dominance , and it is less likely that the product conforms to the needs of the market. In consequence software companies need to take action in order to be responsive whenever there is a shift in the customers’ needs on the market. That is, they need to meet the current requirements of the market, the requirements being function or quality related. Two development paradigms emerged in the last decade to address this challenge, namely agile and lean software development.

The term Lean Software Development was first coined as the title for a conference organized by the ESPRIT initiative of the European Union, in Stuttgart Germany, October 1992. Independently, the following year, Robert Charette in 1993 suggested the concept of “Lean Software Development” as part of his work exploring better ways of managing risk in software projects. The term “Lean” dates to 1991, suggested by James Womack, Daniel Jones, and Daniel Roos, in their book The Machine That Changed the World: The Story of Lean Production as the English language term to describe the management approach used at Toyota. The idea that Lean might be applicable in software development was established very early, only 1 to 2 years after the term was first used in association with trends in manufacturing processes and industrial engineering.

In their 2nd book, published in 1995, Womack and Jones defined five core pillars of Lean Thinking. These were:

  • Value
  • Value Stream
  • Flow
  • Pull
  • Perfection

This became the default working definition for Lean over most of the next decade. The pursuit of perfection, it was suggested, was achieved by eliminating waste. While there were 5 pillars, it was the 5th one, pursuit of perfection through the systemic identification of wasteful activities and their elimination, that really resonated with a wide audience. Lean became almost exclusively associated with the practice of elimination of waste through the late 1990s and the early part of the 21st Century.

The Womack and Jones definition for Lean is not shared universally. The principles of management at Toyota are far more subtle. The single word “waste” in English is described more richly with three Japanese terms:

  • Muda – literally meaning “waste” but implying non-value-added activity
  • Mura – meaning “unevenness” and interpreted as “variability in flow”
  • Muri – meaning “overburdening” or “unreasonableness”

Perfection is pursued through the reduction of non-value-added activity but also through the smoothing of flow and the elimination of overburdening. In addition, the Toyota approach was based in a foundational respect for people and heavily influenced by the teachings of 20th century quality assurance and statistical process control experts such as W. Edwards Deming.

Unfortunately, there are almost as many definitions for Lean as there are authors on the subject.

Lean and Agile

Bob Charette was invited but unable to attend the 2001 meeting at Snowbird, Utah, where the Manifesto for Agile Software Development was authored. Despite missing this historic meeting, Lean Software Development was considered as one of several Agile approaches to software development. Jim Highsmith dedicated a chapter of his 2002 book to an interview with Bob about the topic. Later, Mary & Tom Poppendieck went on to author a series of 3 books. During the first few years of the 21st Century, Lean principles were used to explain why Agile methods were better. Lean explained that Agile methods contained little “waste” and hence produced a better economic outcome. Lean principles were used as a “permission giver” to adopt Agile methods.
Early lean software ideas were developed by Poppendieck and Middleton and Sutton. These books explored how lean thinking could be transferred from manufacturing to the more intangible world and different culture of software engineers. Specific techniques on how the concept of kanban could be applied to software were also developed . Note
that the use of these methods is partly a metaphor rather than a direct copying. For example, kanban in factories literally is a binary signal to replenish an inventory buffer, based on what the customer has taken away. In software it performs a similar function, but more broadly displays information on the status of the process and potential problems.
Moving upstream and applying lean thinking to influence project selection and definition also creates great benefits .
The proceedings of the first Lean & Kanban Software conference and the work of Shalloway et al. show adoption is spreading.

Defining Lean Software Development

Defining Lean Software Development is challenging because there is no specific Lean Software Development method or process. Lean is not an equivalent of Personal Software Process, V-Model, Spiral Model, EVO, Feature-Driven Development, Extreme Programming, Scrum, or Test-Driven Development. A software development lifecycle process or a project management process could be said to be “lean” if it was observed to be aligned with the values of the Lean Software Development movement and the principles of Lean Software Development. So those anticipating a simple recipe that can be followed and named Lean Software Development will be disappointed. Individuals must fashion or tailor their own software development process by understanding Lean principles and adopting the core values of Lean.

There are several schools of thought within Lean Software Development. The largest, and arguably leading, school is the Lean Systems Society, which includes Donald Reinertsen, Jim Sutton, Alan Shalloway, Bob Charette, Mary Poppendeick, and David J. Anderson. Mary and Tom Poppendieck’s work developed prior to the formation of the Society and its credo stands separately, as does the work of Craig Larman, Bas Vodde , and, most recently, Jim Coplien . This section seeks to be broadly representative of the Lean Systems Society viewpoint as expressed in its credo and to provide a synthesis and summary of their ideas.


The Lean Systems Society published its credo at the 2012 Lean Software & Systems Conference . This was based on a set of values published a year earlier. Those values include:
  • Accept the human condition
  • Accept that complexity & uncertainty are natural to knowledge work
  • Work towards a better Economic Outcome
  • While enabling a better Sociological Outcome
  • Seek, embrace & question ideas from a wide range of disciplines
  • A values-based community enhances the speed & depth of positive change

Accept the Human Condition

Knowledge work such as software development is undertaken by human beings. We humans are inherently complex and, while logical thinkers, we are also led by our emotions and some inherent animalistic traits that can’t reasonably be overcome. Our psychology and neuro-psychology must be taken into account when designing systems or processes within which we work. Our social behavior must also be accommodated. Humans are inherently emotional, social, and tribal, and our behavior changes with fatigue and stress. Successful processes will be those that embrace and accommodate the human condition rather than those that try to deny it and assume logical, machine-like behavior.

Accept that Complexity & Uncertainty are Natural to Knowledge Work

The behavior of customers and markets are unpredictable. The flow of work through a process and a collection of workers is unpredictable. Defects and required rework are unpredictable. There is inherent chance or seemingly random behavior at many levels within software development. The purpose, goals, and scope of projects tend to change while they are being delivered. Some of this uncertainty and variability, though initially unknown, is knowable in the sense that it can be studied and quantified and its risks managed, but some variability is unknowable in advance and cannot be adequately anticipated. As a result, systems of Lean Software Development must be able to react to unfolding events, and the system must be able to adapt to changing circumstances. Hence any Lean Software Development process must exist within a framework that permits adaptation (of the process) to unfolding events.

Work towards a better Economic Outcome

Human activities such as Lean Software Development should be focused on producing a better economic outcome. Capitalism is acceptable when it contributes both to the value of the business and the benefit of the customer. Investors and owners of businesses deserve a return on investment. Employees and workers deserve a fair rate of pay for a fair effort in performing the work. Customers deserve a good product or service that delivers on its promised benefits in exchange for a fair price paid. Better economic outcomes will involve delivery of more value to the customer, at lower cost, while managing the capital deployed by the investors or owners in the most effective way possible.

Enable a better Sociological Outcome

Better economic outcomes should not be delivered at the expense of those performing the work. Creating a workplace that respects people by accepting the human condition and provides systems of work that respect the psychological and sociological nature of people is essential. Creating a great place to do great work is a core value of the Lean Software Development community.

Principles of Lean Software Development for Scaling Agile

The Lean Software & Systems community seems to agree on a few principles that underpin Lean Software Development processes. These are the principles of Lean software development for Agile Methodology.
  • Follow a Systems Thinking & Design Approach
  • Emergent Outcomes can be Influenced by Architecting the Context of a Complex Adaptive System
  • Respect People (as part of the system)
  • Use the Scientific Method (to drive improvements)
  • Encourage Leadership
  • Generate Visibility (into work, workflow, and system operation)
  • Reduce Flow Time
  • Reduce Waste to Improve Efficiency

Follow a Systems Thinking & Design Approach

This is often referred to in Lean literature as “optimize the whole,” which implies that it is the output from the entire system (or process) that we desire to optimize, and we shouldn’t mistakenly optimize parts in the hope that it will magically optimize the whole. Most practitioners believe the corollary to be true, that optimizing parts (local optimization) will lead to a suboptimal outcome.

A Lean Systems Thinking and Design Approach requires that we consider the demands on the system made by external stakeholders, such as customers, and the desired outcome required by those stakeholders. We must study the nature of demand and compare it with the capability of our system to deliver. Demand will include so-called “value demand,” for which customers are willing to pay, and “failure demand,” which is typically rework or additional demand caused by a failure in the supply of value demand. Failure demand often takes two forms: rework on previously delivered value demand and additional services or support due to a failure in supplying value demand. In software development, failure demand is typically requests for bug fixes and requests to a customer care or help desk function.

A systems design approach requires that we also follow the Plan-Do-Study-Act (PDSA) approach to process design and improvement. W. Edwards Deming used the words “study” and “capability” to imply that we study the natural philosophy of our system’s behavior. This system consists of our software development process and all the people operating it. It will have an observable behavior in terms of lead time, quality, quantity of features or functions delivered (referred to in Agile literature as “velocity”), and so forth.

Velocity : At the end of each iteration, the team adds up effort estimates associated with user stories that were completed during that iteration. This total is called velocity.
Knowing velocity, the team can compute (or revise) an estimate of how long the project will take to complete, based on the estimates associated with remaining user stories and assuming that velocity over the remaining iterations will remain approximately the same. This is generally an accurate prediction, even though rarely a precise one.

“Lead time” is a term borrowed from the manufacturing method known as Lean or Toyota Production System, where it is defined as the time elapsed between a customer placing an order and receiving the product ordered.
Translated to the software domain, lead time can be described more abstractly as the time elapsed between the identification of a requirement and its fulfillment. Defining a more concrete measurement depends on the situation being examined: for instance, when focusing on the software development process, the “lead time” elapsed between the formulation of a user story and that story being used “in production”, that is, by actual users under normal conditions.
Teams opting for the kanban approach favor this measure, over the better known velocity. Instead of aiming at increasing velocity, improvement initiatives intend to reduce lead time.

These metrics will exhibit variability and, by studying the mean and spread of variation, Individuals can develop an understanding of their capability. If this is mismatched with the demand and customer expectations, then the system will need to be redesigned to close the gap.

Deming also taught that capability is 95% influenced by system design, and only 5% is contributed by the performance of individuals. In other words, we can respect people by not blaming them for a gap in capability compared to demand and by redesigning the system to enable them to be successful.

To understand system design, we must have a scientific understanding of the dynamics of system capability and how it might be affected. Models are developed to predict the dynamics of the system. While there are many possible models, several popular ones are in common usage: the understanding of economic costs; so-called transaction and coordination costs that relate to production of customer-valued products or services; the Theory of Constraints – the understanding of bottlenecks; and The Theory of Profound Knowledge – the study and recognition of variability as either common to the system design or special and external to the system design.

Emergent Outcomes can be Influenced by ARCHITECTURE of the Context for a Complex Adaptive System

Complex systems have starting conditions and simple rules that, when run iteratively, produce an emergent outcome. Emergent outcomes are difficult or impossible to predict given the starting conditions. The computer science experiment “The Game of Life” is an example of a complex system. A complex adaptive system has within it some self-awareness and an internal method of reflection that enables it to consider how well its current set of rules is enabling it to achieve a desired outcome. The complex adaptive system may then choose to adapt itself – to change its simple rules – to close the gap between the current outcome and the desired outcome. The Game of Life adapted such that the rules could be re-written during play would be a complex adaptive system.

In software development processes, the “simple rules” of complex adaptive systems are the policies that make up the process definition. The core principle here is based in the belief that developing software products and services is not a deterministic activity, and hence a defined process that cannot adapt itself will not be an adequate response to unforeseeable events. Hence, the process designed as part of our system thinking and design approach must be adaptable. It adapts through the modification of the policies of which it is made.

The Kanban approach to Lean Software Development utilizes this concept by treating the policies of the kanban pull system as the “simple rules,” and the starting conditions are that work and workflow is visualized, that flow is managed using an understanding of system dynamics, and that the organization uses a scientific approach to understanding, proposing, and implementing process improvements.

The term “kanban” with the sense of a sign, poster or billboard, and derived from roots which literally translate as “visual board”.
Its meaning within the Agile context is borrowed from the Toyota Production System, where it designates a system to control the inventory levels of various parts. It is analogous to (and in fact inspired by) cards placed behind products on supermarket shelves to signal “out of stock” items and trigger a resupply “just in time”.
The Toyota system affords a precise accounting of inventory or “work in process”, and strives for a reduction of inventory levels, considered wasteful and harmful to performance.
The phrase “Kanban method” also refers to an approach to continuous improvement which relies on visualizing the current system of work scheduling, managing “flow” as the primary measure of performance, and whole-system optimization – as a process improvement approach, it does not prescribe any particular practices.

Respect People

The Lean community adopts Peter Drucker’s definition of knowledge work that states that workers are knowledge workers if they are more knowledgeable about the work they perform than their bosses. This creates the implication that workers are best placed to make decisions about how to perform work and how to modify processes to improve how work is performed. So the voice of the worker should be respected. Workers should be empowered to self-organize to complete work and achieve desired outcomes. They should also be empowered to suggest and implement process improvement opportunities or “kaizen events” as they are referred to in Lean literature. Making process policies explicit so that workers are aware of the rules that constrain them is another way of respecting them. Clearly defined rules encourage self-organization by removing fear and the need for courage. Respecting people by empowering them and giving them a set of explicitly declared policies holds true with the core value of respecting the human condition.
SAP has been using SCRUM and other Agile methodologies for several years at the team level. Herbert Illgner, COO Business Solutions and Technology at SAP, who has been involved with the effort, says that team empowerment and faster feedback cycles with customers are two significant benefits. Illgner added that SAP is expanding application of Agile methods to the entire product creation process using a Lean framework that includes empowered cross-functional teams, continuous improvement process and managers as support and teachers..

Use the Scientific Method

Seek to use models to understand the dynamics of how work is done and how the system of Lean Software Development is operating. Observe and study the system and its capability, and then develop and apply models for predicting its behavior. Collect quantitative data in the applicable studies, and use that data to understand how the system is performing and to predict how it might change when the process is changed.

The Lean Software & Systems community uses statistical methods such as statistical process control charts and spectral analysis histograms of raw data for lead time and velocity to understand system capability. They also use models such as: the Theory of Constraints to understand bottlenecks; The System of Profound Knowledge to understand variation that is internal to the system design versus that which is externally influenced; and an analysis of economic costs in the form of tasks performed to merely coordinate, set up, deliver, or clean up after customer-valued product or services are created. Some other models are coming into use, such as Real Option Theory, which seeks to apply financial option theory from financial risk management to real-world decision making.

The scientific method suggests: we study; we postulate an outcome based on a model; we perturb the system based on that prediction; and we observe again to see if the perturbation produced the results the model predicted. If it doesn’t, then we check our data and reconsider whether our model is accurate. Using models to drive process improvements moves it to a scientific activity and elevates it from a superstitious activity based on intuition.

Encourage Leadership

Leadership and management are not the same. Management is the activity of designing processes, creating, modifying, and deleting policy, making strategic and operational decisions, gathering resources, providing finance and facilities, and communicating information about context such as strategy, goals, and desired outcomes. Leadership is about vision, strategy, tactics, courage, innovation, judgment, advocacy, and many more attributes. Leadership can and should come from anyone within an organization. Small acts of leadership from workers will create a cascade of improvements that will deliver the changes needed to create a Lean Software Development process.

Generate Visibility

Knowledge work is invisible. If you can’t see something, it is (almost) impossible to manage it. It is necessary to generate visibility into the work being undertaken and the flow of that work through a network of individuals, skills, and departments until it is complete. It is necessary to create visibility into the process design by finding ways of visualizing the flow of the process and by making the policies of the process explicit for everyone to see and consider. When all of these  are visible, then the use of the scientific method is possible, and conversations about potential improvements can be collaborative and objective. Collaborative process improvement is almost impossible if work and workflow are invisible and if process policies are not explicit.

Reduce Flow Time

The software development profession and the academics who study software engineering have traditionally focused on measuring time spent working on an activity. The Lean Software Development community has discovered that it might be more useful to measure the actual elapsed calendar time something takes to be processed. This is typically referred to as Cycle Time and is usually qualified by the boundaries of the activities performed. For example, Cycle Time through Analysis to Ready for Deployment would measure the total elapsed time for a work item, such as a user story, to be analyzed, designed, developed, tested in several ways, and queued ready for deployment to a production environment. In consultation with the customer or product owner, the team divides up the work to be done into functional increments called “user stories”.

Lead time clock starts when the request is made and ends at delivery. Cycle time clock starts when work begins on the request and ends when the item is ready for delivery. Cycle time is a more mechanical measure of process capability. Lead time is what the customer sees.

Lead time depends on cycle time, but also depends on your willingness to keep a backlog, the customer’s patience, and the customer’s readiness for delivery.

Focusing on the time work takes to flow through the process is important in several ways. Longer cycle times have been shown to correlate with a non-linear growth in bug rates. Hence shorter cycle times lead to higher quality. This is counter-intuitive as it seems ridiculous that bugs could be inserted in code while it is queuing and no human is actually touching it. Traditionally, the software engineering profession and academics who study it have ignored this idle time. However, empirical evidence suggests that cycle time is important to initial quality.

Alan Shalloway has also talked about the concept of “induced work.” His observation is that a lag in performing a task can lead to that task taking a lot more effort than it may have done. For example, a bug found and fixed immediately may only take 20 minutes to fix, but if that bug is triaged, is queued and then waits for several days or weeks to be fixed, it may involve several or many hours to make the fix. Hence, the cycle time delay has “induced” additional work. As this work is avoidable, in Lean terms, it must be seen as “waste.”

The third reason for focusing on cycle time is a business related reason. Every feature, function, or user story has a value. That value may be uncertain but, nevertheless, there is a value. The value may vary over time. The concept of value varying over time can be expressed economically as a market payoff function. When the market payoff function for a work item is understood, even if the function exhibits a spread of values to model uncertainty, it is possible to evaluate a “cost of delay.” The cost of delay allows us to put a value on reducing cycle time.

With some work items, the market payoff function does not start until a known date in the future. For example, a feature designed to be used during the 4th of July holiday in the United States has no value prior to that date. Shortening cycle time and being capable of predicting cycle time with some certainty is still useful in such an example. Ideally, we want to start the work so that the feature is delivered “just in time” when it is needed and not significantly prior to the desired date, nor late, as late delivery incurs a cost of delay. Just-in-time delivery ensures that optimal use was made of available resources. Early delivery implies that we might have worked on something else and have, by implication, incurred an opportunity cost of delay.

As a result of these three reasons, Lean Software Development seeks to minimize flow time and to record data that enables predictions about flow time. The objective is to minimize failure demand from bugs, waste from over-burdening due to delay in fixing bugs, and to maximize value delivered by avoiding both cost of delay and opportunity cost of delay.

Reduce Waste to Improve Efficiency

A value stream mapping technique is used to identify waste. The second step is to point out sources of waste and to eliminate them. Waste-removal should take place iteratively until even essential-seeming processes and procedures are liquidated.

For every valued-added activity, there are setup, cleanup and delivery activities that are necessary but do not add value in their own right. For example, a project iteration that develops an increment of working software requires planning (a setup activity), an environment and perhaps a code branch in version control (collectively known as configuration management and also a setup activity), a release plan and performing the actual release (a delivery activity), a demonstration to the customer (a delivery activity), and perhaps an environment teardown or reconfiguration (a cleanup activity.) In economic terms, the setup, cleanup, and delivery activities are transaction costs on performing the value-added work. These costs (or overheads) are considered waste in Lean.

Any form of communication overhead can be considered waste. Meetings to determine project status and to schedule or assign work to team members would be considered a coordination cost in economic language. All coordination costs are waste in Lean thinking. Lean software development methods seek to eliminate or reduce coordination costs through the use of colocation of team members, short face-to-face meetings such as standups, and visual controls such as card walls.

The third common form of waste in Lean Software Development is failure demand. Failure demand is a burden on the system of software development. Failure demand is typically rework or new forms of work generated as a side-effect of poor quality. The most typical forms of failure demand in software development are bugs, production defects, and customer support activities driven out of a failure to use the software as intended. The percentage of work-in-progress that is failure demand is often referred to as Failure Load. The percentage of value-adding work against failure demand is a measure of the efficiency of the system.

The percentage of value-added work against the total work, including all the non-value adding transaction and coordination costs, determines the level of efficiency. A system with no transaction and coordination costs and no failure load would be considered 100% efficient.

Traditionally, Western management science has taught that efficiency can be improved by increasing the batch size of work. Typically, transaction and coordination costs are fixed or rise only slightly with an increase in batch size. As a result, large batches of work are more efficient. This concept is known as “economy of scale.” However, in knowledge work problems, coordination costs tend to rise non-linearly with batch size, while transaction costs can often exhibit a linear growth. As a result, the traditional 20th Century approach to efficiency is not appropriate for knowledge work problems like software development.

It is better to focus on reducing the overheads while keeping batch sizes small in order to improve efficiency. Hence, the Lean way to be efficient is to reduce waste. Lean software development methods focus on fast, cheap, and quick planning methods; low communication overhead; and effective low overhead coordination mechanisms, such as visual controls in kanban systems. They also encourage automated testing and automated deployment to reduce the transaction costs of delivery. Modern tools for minimizing the costs of environment setup and teardown, such as modern version control systems and use of virtualization, also help to improve efficiency of small batches of software development.

Lean Software Development Practices for Agile 

Lean software development are viewed as a set of thinking tools that could easily blend in with any agile approach.So as you can see, lean and agile are deeply intertwined in the software world.
In practice, Agile seems to be changing for the better by adopting Lean thinking in a large way. Rally Development says that its customers get to market 50% faster and are 25% more productive when they employ a hybrid of Lean and Agile development methods.

Lean Software Development does not prescribe practices. It is more important to demonstrate that actual process definitions are aligned with the principles and values. However, a number of practices are being commonly adopted. This section provides a brief overview of some of these.

Continuous learning

Software development is a continuous learning process with the additional challenge of development teams and end product sizes. The best approach for improving a software development environment is to amplify learning. The accumulation of defects should be prevented by running tests as soon as the code is written. Instead of adding more documentation or detailed planning, different ideas could be tried by writing code and building. The process of user requirements gathering could be simplified by presenting screens to the end-users and getting their input.

The learning process is sped up by usage of short iteration cycles – each one coupled with refactoring and integration testing. Refactoring consists of improving the internal structure of an existing program’s source code, while preserving its external behavior. The noun “refactoring” refers to one particular behaviour-preserving transformation, such as “Extract Method” or “Introduce Parameter”. Refactoring does “not” mean: rewriting code, fixing bugs or improve observable aspects of software such as its interface

Refactoring in the absence of safeguards against introducing defects (i.e. violating the “behaviour preserving” condition) is risky. Safeguards include aids to regression testing including automated unit tests or automated acceptance tests, and aids to formal reasoning such as type systems.

Increasing feedback via short feedback sessions with customers helps when determining the current phase of development and adjusting efforts for future improvements. During those short sessions both customer representatives and the development team learn more about the domain problem and figure out possible solutions for further development. Thus the customers better understand their needs, based on the existing result of development efforts, and the developers learn how to better satisfy those needs. Another idea in the communication and learning process with a customer is set-based development – this concentrates on communicating the constraints of the future solution and not the possible solutions, thus promoting the birth of the solution via dialogue with the customer.

Decide as late as possible

As software development is always associated with some uncertainty, better results should be achieved with an options-based approach, delaying decisions as much as possible until they can be made based on facts and not on uncertain assumptions and predictions. The more complex a system is, the more capacity for change should be built into it, thus enabling the delay of important and crucial commitments. The iterative approach promotes this principle – the ability to adapt to changes and correct mistakes, which might be very costly if discovered after the release of the system.

An agile software development approach can move the building of options earlier for customers, thus delaying certain crucial decisions until customers have realized their needs better. This also allows later adaptation to changes and the prevention of costly earlier technology-bounded decisions. This does not mean that no planning should be involved – on the contrary, planning activities should be concentrated on the different options and adapting to the current situation, as well as clarifying confusing situations by establishing patterns for rapid action. Evaluating different options is effective as soon as it is realized that they are not free, but provide the needed flexibility for late decision making.

Deliver as fast as possible

In the era of rapid technology evolution, it is not the biggest that survives, but the fastest. The sooner the end product is delivered without major defects, the sooner feedback can be received, and incorporated into the next iteration. The shorter the iterations, the better the learning and communication within the team. With speed, decisions can be delayed. Speed assures the fulfilling of the customer’s present needs and not what they required yesterday. This gives them the opportunity to delay making up their minds about what they really require until they gain better knowledge. Customers value rapid delivery of a quality product.

The just-in-time production ideology could be applied to software development, recognizing its specific requirements and environment. This is achieved by presenting the needed result and letting the team organize itself and divide the tasks for accomplishing the needed result for a specific iteration. At the beginning, the customer provides the needed input. This could be simply presented in small cards or stories – the developers estimate the time needed for the implementation of each card. Thus the work organization changes into self-pulling system – each morning during a stand-up meeting, each member of the team reviews what has been done yesterday, what is to be done today and tomorrow, and prompts for any inputs needed from colleagues or the customer. This requires transparency of the process, which is also beneficial for team communication.

Another key idea in Toyota’s Product Development System is set-based design. If a new brake system is needed for a car, for example, three teams may design solutions to the same problem. Each team learns about the problem space and designs a potential solution. As a solution is deemed unreasonable, it is cut. At the end of a period, the surviving designs are compared and one is chosen, perhaps with some modifications based on learning from the others – a great example of deferring commitment until the last possible moment. Software decisions could also benefit from this practice to minimize the risk brought on by big up-front design

Empower the team

There has been a traditional belief in most businesses about the decision-making in the organization – the managers tell the workers how to do their own job. In a “Work-Out technique”, the roles are turned – the managers are taught how to listen to the developers, so they can explain better what actions might be taken, as well as provide suggestions for improvements. The lean approach favors the aphorism “find good people and let them do their own job,” encouraging progress, catching errors, and removing impediments, but not micro-managing.

Another mistaken belief has been the consideration of people as resources. People might be resources from the point of view of a statistical data sheet, but in software development, as well as any organizational business, people do need something more than just the list of tasks and the assurance that they will not be disturbed during the completion of the tasks. People need motivation and a higher purpose to work for – purpose within the reachable reality, with the assurance that the team might choose its own commitments. The developers should be given access to the customer; the team leader should provide support and help in difficult situations, as well as ensure that skepticism does not ruin the team’s spirit.

Build integrity in

The customer needs to have an overall experience of the System – this is the so-called perceived integrity: how it is being advertised, delivered, deployed, accessed, how intuitive its use is, price and how well it solves problems.

Conceptual integrity means that the system’s separate components work well together as a whole with balance between flexibility, maintainability, efficiency, and responsiveness. This could be achieved by understanding the problem domain and solving it at the same time, not sequentially. The needed information is received in small batch pieces – not in one vast chunk with preferable face-to-face communication and not any written documentation. The information flow should be constant in both directions – from customer to developers and back, thus avoiding the large stressful amount of information after long development in isolation.

One of the healthy ways towards integral architecture is refactoring. As more features are added to the original code base, the harder it becomes to add further improvements. Refactoring is about keeping simplicity, clarity, minimum amount of features in the code. Repetitions in the code are signs for bad code designs and should be avoided. The complete and automated building process should be accompanied by a complete and automated suite of developer and customer tests, having the same versioning, synchronization and semantics as the current state of the System. At the end the integrity should be verified with thorough testing, thus ensuring the System does what the customer expects it to. Automated tests are also considered part of the production process, and therefore if they do not add value they should be considered waste. Automated testing should not be a goal, but rather a means to an end, specifically the reduction of defects.

See the whole

Software systems nowadays are not simply the sum of their parts, but also the product of their interactions. Defects in software tend to accumulate during the development process – by decomposing the big tasks into smaller tasks, and by standardizing different stages of development, the root causes of defects should be found and eliminated. The larger the system, the more organizations that are involved in its development and the more parts are developed by different teams, the greater the importance of having well defined relationships between different vendors, in order to produce a system with smoothly interacting components. During a longer period of development, a stronger subcontractor network is far more beneficial than short-term profit optimizing, which does not enable win-win relationships.

Lean thinking has to be understood well by all members of a project, before implementing in a concrete, real-life situation. “Think big, act small, fail fast; learn rapidly” – these slogans summarize the importance of understanding the field and the suitability of implementing lean principles along the whole software development process. Only when all of the lean principles are implemented together, combined with strong “common sense” with respect to the working environment, is there a basis for success in software development.

Model Storming: 
Agile Modeling’s practices of light weight, initial requirements envisioning followed by iteration modeling and just-in-time (JIT) model storming work because they reflect deferment of commitment regarding what needs to be built until it’s actually needed, and the practices help eliminate waste because you’re only modeling what needs to be built.
Agility by Self Organization :
It is possible to deliver high-quality systems quickly. By limiting the work of a team to its capacity, which is reflected by the team’s velocity (this is the number of “points” of functionality which a team delivers each iteration), you can establish a reliable and repeatable flow of work. An effective organization doesn’t demand teams do more than they are capable of, but instead asks them to self-organize and determine what they can accomplish. Constraining these teams to delivering potentially shippable solutions on a regular basis motivates them to stay focused on continuously adding value.

Cumulative Flow Diagrams

Cumulative Flow Diagrams have been a standard part of reporting in Team Foundation Server since 2005. Cumulative flow diagrams plot an area graph of cumulative work items in each state of a workflow. They are rich in information and can be used to derive the mean cycle time between steps in a process as well as the throughput rate (or “velocity”). Different software development lifecycle processes produce different visual signatures on cumulative flow diagrams. Practitioners can learn to recognize patterns of dysfunction in the process displayed in the area graph. A truly Lean process will show evenly distributed areas of color, smoothly rising at a steady pace. The picture will appear smooth without jagged steps or visible blocks of color.

In their most basic form, cumulative flow diagrams are used to visualize the quantity of work-in-progress at any given step in the work item lifecycle. This can be used to detect bottlenecks and observe the effects of “mura” (variability in flow).

Visual Controls

In addition to the use of cumulative flow diagrams, Lean Software Development teams use physical boards, or projections of electronic visualization systems, to visualize work and observe its flow. Such visualizations help team members observe work-in-progress accumulating and enable them to see bottlenecks and the effects of “mura.” Visual controls also enable team members to self-organize to pick work and collaborate together without planning or specific management direction or intervention. These visual controls are often referred to as “card walls” or sometimes (incorrectly) as “kanban boards.”

Virtual Kanban Systems

A kanban system is a practice adopted from Lean manufacturing. It uses a system of physical cards to limit the quantity of work-in-progress at any given stage in the workflow. Such work-in-progress limited systems create a “pull” where new work is started only when there are free kanban indicating that new work can be “pulled” into a particular state and work can progress on it.

In Lean Software Development, the kanban are virtual and often tracked by setting a maximum number for a given step in the workflow of a work item type. In some implementations, electronic systems keep track of the virtual kanban and provide a signal when new work can be started. The signal can be visual or in the form of an alert such as an email.

Virtual kanban systems are often combined with visual controls to provide a visual virtual kanban system representing the workflow of one or several work item types. Such systems are often referred to as “kanban boards” or “electronic kanban systems.” A visual virtual kanban system is available as a plug-in for Team Foundation Server, called Visual WIP[20]. This project was developed as open source by Hakan Forss in Sweden.

Small Batch Sizes / Single-piece Flow

Lean Software Development requires that work is either undertaken in small batches, often referred to as “iterations” or “increments,” or that work items flow independently, referred to as “single-piece flow.” Single-piece flow requires a sophisticated configuration management strategy to enable completed work to be delivered while incomplete work is not released accidentally. This is typically achieved using branching strategies in the version control system. A small batch of work would typically be considered a batch that can be undertaken by a small team of 8 people or less in under 2 weeks.

Small batches and single-piece flow require frequent interaction with business owners to replenish the backlog or queue or work. They also require a capability to release frequently. To enable frequent interaction with business people and frequent delivery, it is necessary to shrink the transaction and coordination costs of both activities. A common way to achieve this is the use of automation.


Lean Software Development expects a high level of automation to economically enable single-piece flow and to encourage high quality and the reduction of failure demand. The use of automated testing, automated deployment, and software factories to automate the deployment of design patterns and creation of repetitive low variability sections of source code will all be commonplace in Lean Software Development processes.

Kaizen Events

In Lean literature, the term kaizen means “continuous improvement” and a kaizen event is the act of making a change to a process or tool that hopefully results in an improvement.

The Lean concept of Kaizen also has a strong influence on the way Agile is being practiced, filling a gap relating to continuous improvement

Lean Software Development processes use several different activities to generate kaizen events. These are listed here. Each of these activities is designed to stimulate a conversation about problems that adversely affect capability and, consequently, ability to deliver against demand. The essence of kaizen in knowledge work is that we must provoke conversations about problems across groups of people from different teams and with different skills.

The evolution of Agile is primarily focused on evolving the product toward a better fit with requirements. In Agile, both the product and the requirements are refined as more is known through experience. Kaizen, a continuous improvement method used in Lean, focuses on the development process itself. When Kaizen is practiced in an Agile project, the participants not only suggest ways to improve the fit between the product and the requirements but also offer ways to improve the process being used, something usually not emphasized in Agile methods. Eckfeldt described the use of Kaizen snakes and project thermometers to capture process improvement feedback.

Daily standup meetings

Teams of software developers, often up to 50, typically meet in front of a visual control system such as a whiteboard displaying a visualization of their work-in-progress. They discuss the dynamics of flow and factors affecting the flow of work. Particular focus is made to externally blocked work and work delayed due to bugs. Problems with the process often become evident over a series of standup meetings. The result is that a smaller group may remain after the meeting to discuss the problem and propose a solution or process change. A kaizen event will follow. These spontaneous meetings are often referred to as spontaneous quality circles in older literature. Such spontaneous meetings are at the heart of a truly kaizen culture. Managers will encourage the emergence of kaizen events after daily standup meetings in order to drive adoption of Lean within their organization.


Project teams may schedule regular meetings to reflect on recent performance. These are often done after specific project deliverables are complete or after time-boxed increments of development known as iterations or sprints in Agile software development.

Retrospectives typically use an anecdotal approach to reflection by asking questions like “what went well?”, “what would we do differently?”, and “what should we stop doing?”

Retrospectives typically produce a backlog of suggestions for kaizen events. The team may then prioritize some of these for implementation.

A retrospective is intended to reveal facts or feelings which have measurable effects on the team’s performance, and to construct ideas for improvement based on these observations. It will not be useful if it devolves into a verbal joust, or a whining session.

On the other hand, an effective retrospective requires that each participant feel comfortable speaking up. The facilitator is responsible for creating the conditions of mutual trust; this may require taking into accounts such factors as hierarchical relationships, the presence of a manager for instance may inhibit discussion of performance issues.
Being an all-hands meeting, a retrospective comes at a significant cost in person-hours. Poor execution, either from the usual causes of bad meetings (lack of preparation, tardiness, inattention) or from causes specific to this format (lack of trust and safety, taboo topics), will result in the practice being discredited, even though a vast majority of the Agile community views it as valuable.
An effective retrospective will normally result in decisions, leading to action items; it’s a mistake to have too few (there is always room for improvement) or too many (it would be impractical to address “all” issues in the next iteration). One or two improvement ideas per iteration retrospective may well be enough.
Identical issues coming up at each retrospective, without measurable improvement over time, may signal that the retrospective has become an empty ritual

Operations Reviews

An operations review is typically larger than a retrospective and includes representatives from a whole value stream. It is common for as many as 12 departments to present objective, quantitative data that show the demand they received and reflect their capability to deliver against the demand. Operations reviews are typically held monthly. The key differences between an operations review and a retrospective is that operations reviews span a wider set of functions, typically span a portfolio of projects and other initiatives, and use objective, quantitative data. Retrospectives, in comparison, tend to be scoped to a single project; involve just a few teams such as analysis, development, and test; and are generally anecdotal in nature.

An operations review will provoke discussions about the dynamics affecting performance between teams. Perhaps one team generates failure demand that is processed by another team? Perhaps that failure demand is disruptive and causes the second team to miss their commitments and fail to deliver against expectations? An operations review provides an opportunity to discuss such issues and propose changes. Operations reviews typically produce a small backlog of potential kaizen events that can be prioritized and scheduled for future implementation.

There is no such thing as a single Lean Software Development process. A process could be said to be Lean if it is clearly aligned with the values and principles of Lean Software Development. Lean Software Development does not prescribe any practices, but some activities have become common. Lean organizations seek to encourage kaizen through visualization of workflow and work-in-progress and through an understanding of the dynamics of flow and the factors (such as bottlenecks, non-instant availability, variability, and waste) that affect it. Process improvements are suggested and justified as ways to reduce sources of variability, eliminate waste, improve flow, or improve value delivery or risk management. As such, Lean Software Development processes will always be evolving and uniquely tailored to the organization within which they evolve. It will not be natural to simply copy a process definition from one organization to another and expect it to work in a different context. It will also be unlikely that returning to an organization after a few weeks or months to find the process in use to be the same as was observed earlier. It will always be evolving.

The organization using a Lean software development process could be said to be Lean if it exhibited only small amounts of waste in all three forms (“mura,” “muri,” and “muda”) and could be shown to be optimizing the delivery of value through effective management of risk. The pursuit of perfection in Lean is always a journey. There is no destination. True Lean organizations are always seeking further improvement.

Lean Software Development is still an emerging field, and we can expect it to continue to evolve over the next decade.

Lean software development at BBC WorldWide

The lean ideas behind the Toyota production system can be applied to software project
management. This paragraph explains the  investigation of the performance of a nine-person software development team employed by BBC Worldwide based in London. The data collected in 2009 involved direct observations of the development team, the kanban boards, the daily stand-up meetings, semistructured interviews with a wide variety of staff, and statistical analysis. The evidence shows that over the 12-month period, lead time to deliver software improved by 37%, consistency of delivery rose by 47%, and defects reported by customers fell 24%. The significance of this work is showing that the use of lean methods including visual management, team-based problem solving, smaller batch sizes, and statistical process control can improve software development. It also summarizes key differences between agile and lean approaches to software development.
The conclusion is that the performance of the software development team was improved by adopting a lean approach. The faster delivery with a focus on creating the highest value to the customer also reduced both technical and market risks.

Lean software development at IMVU INc

IMVU Inc. ( is a virtual company where users meet as personalized avatars in 3D digital rooms. Founded in 2004, IMVU has 25 million  registered users, 100,000 registered developers and  reached $1 million in monthly revenue. Over 90 percent of IMVU‘s revenue is from the direct sale of  virtual credits (a form of currency) to users who purchase digital products from its 1.8 million item digital  catalog. IMVU has won the 2008 Virtual Worlds  Innovation Award and was also named a Rising Star  in the 2008 Silicon Valley Technology Fast 50 program. IMVU receives funding from top venture investors Menlo Ventures, Allegis Capital and Bridgescale Partners. Its offices are located in Palo Alto,  CA. (

Software Development at IMVU
IMVU‘s founders had previously founded—a virtual world startup that took three  years to build, burned through a ton of money, and  was an abysmal failure after launch. However, from  an engineering perspective, was an  amazing success, as they built it ahead of schedule,  maintained tight quality standards, and solved multiple difficult technical problems. Still, it wasn‘t a  commercial success—and large amounts of time and  money were wasted. As a result, IMVU‘s founding  team decided to build the minimum viable product  and then test it with users—even if the product  seemed only half-built (an engineer‘s nightmare).
As a result, IMVU was one of the startups that pioneered the ―build-just-a-little-and-get-customer-feedback model. This model was only possible because  of the application of several lean principles at the  technical level in the development process.
Lean Principle #1: Specify Value in the Eyes of the  Customer
From the beginning, IMVU‘s founders decided they  wanted to build a culture of ―ship, ship, ship. From a  business perspective, this makes a lot of sense, but  from an engineering perspective, it‘s like pulling fingernails with a pair of rusty pliers:―Bugs were all over the place, extremely ugly looking, and only the most rudimentary features.
In essence, releasing a sub-par product allowed  IMVU to avoid over-production wastes by putting  man hours only into features their customers liked.
Lean Principle #2: Identify Value Stream and  Eliminate Waste
The IMVU team worked hard to cultivate the ―ship,  ship, ship mentality. For example, on their very first  day, most developers were expected to write some  code and push it into production. Even though it was  generally just a small bug fix or a miniscule feature,  this ―release-code-on-the-first-day-of-work idea  seemed revolutionary for most new hires.
Continuous deployment reduced the wastes of overproduction, waiting, and processing. In a traditional  development process, multiple engineers are busy  building multiple features based on the last bit of  stable code. When they try to deploy their feature  after two weeks of work, they find that someone else  deployed a different feature the previous day and the
two features don‘t play well together. Continuous  deployment allows engineers to upload their work  instantaneously – thus ensuring engineers are always  working from the same base code. This avoids  spending extra weeks making the feature code  compatible.
Lean Principle #3: Make Value Flow at the Pull of the Customer
IMVU projects have an eight week Return on Investment (ROI) target. Whenever someone suggests a  small project, they are asked to provide a general  roadmap showing that the project could repay the  time investment in eight weeks. Projects are continuously tested on small numbers of IMVU users –who often had no idea they were part of a bucket test.  If a project shows success, they keep working on it.
After a few weeks, if the numbers shows the project  had zero chance of positive ROI, it is shut down immediately. Over time, as IMVU matures, this project  ROI target is expanded:
Lean Principle #4: Involve and Empower
Employees IMVU implemented the 5 Why‘s process, also known as ―Root cause analysis, to involve and empower its employees during trouble-shooting  processes. 5 Why‘s process is the technique of asking  why five times to get to the root cause of a problem  when it occurs.
in blog posts by Ries, each IMVU engineer has his/her own sandbox which mimicked
production as close as possible. IMVU has a comprehensive set of unit, acceptance, functional, and performance tests, and practiced Test-Drive-Development across the whole team. Engineers build a series of test tags, and quickly run a subset of tests in their sandboxes. Revisions are required if a test fails. To keep developers on the same code before it passed the various tests, IMVU created the equivalent of a Kanban system plus an Andon cord (automated testing and immediate rollback system). Developers are assigned a single task, and not allowed to move onto the next task until their code not only successfully passes the automated testing, but also has successfully deployed. Only then, they can pull the next task from the queue. This means that developers have a little bit of idle time while the tests were running. It also means that code is fully completed before a developer moves on. As a result, engineering is optimized for productivity rather than activity:

Lean Principle #5: Continuously Improve in Pursuit of Perfection
The problem with all this emphasis on ―ship, ship, ship, was different bugs in the code kept taking the site down. Sometimes it was simply a scaling issue— new upgrades worked fine on an engineer‘s computer, but crashed when hundreds of thousands of users tried it. Other times, it was a new employee releasing some feature without understanding how the previous code base worked. From a business perspective, it didn‘t matter what the problem was; if the site was down, IMVU was losing money.
From a technical perspective, each new problem required a different solution. Solving scaling issues is very different than solving a single infinite loop problem. The only practical fix was either cease continuous deployment or institute automated tests that checked the code, plus allowed for immediate code rollbacks if any server started to crash.

Eventually, IMVU architected a series of automated tests that looked at every new code check-in, tested it, and then pushed it onto the live servers. If at any point the code crashed—either during testing or once it started running in the wild—the automated tests instituted a rollback to the last verified good version, and sent a nice little e-mail back to the engineer that said ―We‘re sorry, but it looks like your code ABC caused a problem at XYZ. Afraid we can‘t let your code go live until this is fixed. As a result, the automated testing caught an amazing number of errors, and IMVU management started pursuing massively high quality expectations.

IMVU Successfully implemented lean principles at the technical level in the software development  process. They encountered many common challenges that software companies face: choosing the  right product feature, long development cycle, endless testing and debugging. IMVU found solutions  by sticking with the basic lean principles. They were  able to identify and reduce common wastes in software development process, specifically, overproduction, waiting, process, and defects.

IMVU clearly demonstrated the importance of lean  implementation in the software development process.
The implementation of lean principles cannot turn  software development into a production line environment, with scientific methods for each step of the  way. However, it can help turn a chaotic, constantly  changing process, into a much more predictable, fast  moving, and streamlined process. Lean implementation coupled with brilliant designs and fully engaged  intellectual team can help deliver great software  products.
It would seem that the rapid release cycles called for  by lean principles can only be effective if there is a  comprehensive and rigorous testing environment. An  interesting question is whether IMVU‘s practices  (such as daily release online) would be applicable to  software companies that focus on packaged rather  than online products. In this case, the ―customer is a  combination of the other developers and the ultimate  consumer. IMVU‘s experience challenges the conventional wisdom in software development. Can it be  beneficial to all software companies striving to deliver the right product, at the right time, and at the  right price? Middleton and Sutton  believe that the benefits work across different types of software. Yet,  they also recognize that lean software is far too early  in its evolution.

Lean Beyond Agile

In recent years, Lean Software Development has really emerged as its own discipline related to, but not specifically a subset of the Agile movement. This evolution started with the synthesis of ideas from Lean Product Development and the work of Donald G. Reinertsen and ideas emerging from the non-Agile world of large scale system engineering and the writing of James Sutton and Peter Middleton. David J. Anderson also synthesized the work of Eli Goldratt and W. Edwards Deming and developed a focus on flow rather than waste reduction . At the behest of Reinertsen around 2005,  David J. Anderson introduced the use of kanban systems that limit work-in-progress and “pull” new work only when the system is ready to process it. Alan Shalloway added his thoughts on Lean software development in his 2009 book on the topic. Since 2007, the emergence of Lean as a new force in the progress of the software development profession has been focused on improving flow, managing risk, and improving (management) decision making. Kanban has become a major enabler for Lean initiatives in IT-related work. It appears that a focus on flow, rather than a focus on waste elimination, is proving a better catalyst for continuous improvement within knowledge work activities such as software development.

PEARL XVIII : Elucidation on ATDD – Acceptance Test Driven Development

PEARL XVIII : Elucidation on ATDD – Acceptance Test Driven Development 

TDD helps software developers produce working, high-quality code that’s maintainable and, most of all, reliable. Customers are rarely, however, interested in buying code. Customers want software that helps them to be more productive, make more money, maintain or improve operational capability, take over a market, and so forth. This is what we need to deliver with our software—functionality to support business function or market needs. Acceptance test-driven development (acceptance TDD) is what helps developers build high-quality software that fulfills the business’s needs as reliably as TDD helps ensure the software’s technical quality.

Acceptance Test Driven Development (ATDD) is a practice in which the whole team collaboratively discusses acceptance criteria, with examples, and then distills them into a set of concrete acceptance tests before development begins. It’s the best way to ensure that we all have the same shared understanding of what it is we’re actually building. It’s also the best way to ensure we have a shared definition of Done. .

Acceptance TDD helps coordinate software projects in a way that helps us deliver exactly what the customer wants when they want it, and that doesn’t let us implement the required functionality only half way.

An essential property of acceptance TDD is that it’s a team activity and a team process.

Acceptance tests are specifications for the desired behavior and functionality of a system. They tell us, for a given user story, how the system handles certain conditions and inputs and with what kinds of outcomes. There are a number of properties that an acceptance test should exhibit; 

An important property of acceptance tests is that they use the language of the domain and the customer instead of geek-speak only the programmer understands. This is the fundamental requirement for having the customer involved in the creation of acceptance tests and helps enormously with the job of validating that the tests are correct and sufficient. Scattering too much technical lingo into our tests makes us more vulnerable to having a requirement bug sneak into a production release—because the customer’s eyes glaze over when reading geek-speak and the developers are drawn to the technology rather than the real issue of specifying the right thing.

By using a domain language in specifying our tests, we are also not unnecessarily tied to the implementation, which is useful since we need to be able to refactor our system effectively. By using domain language, the changes we need to make to our existing tests when refactoring are typically non-existent or at most trivial.

Concise, precise, and unambiguous

Largely for the same reasons we write our acceptance tests using the domain’s own language, we want to keep our tests simple and concise. We write each of our acceptance tests to verify a single aspect or scenario relevant to the user story at hand. We keep our tests uncluttered, easy to understand, and easy to translate to executable tests. The less ambiguity involved, the better we are at avoiding mistakes and the working with our tests.

We might write our stories as simple reminders in the form of a bulleted list, or we might opt to spell them out as complete sentences describing the expected behavior. In either case, the goal is to provide just enough information for us to remember the important things we need to discuss and test for, rather than documenting those details beforehand. Card, conversation, confirmation—these are the three Cs that make up a user story. Those same three Cs could be applied to acceptance tests as well.

Yet another common property of acceptance tests is that they might not be implemented (translation: automated) using the same programming language as the system they are testing. Whether this is the case depends on the technologies involved and on the overall architecture of the system under test. For example, some programming languages are easier to inter-operate with than others. Similarly, it is easy to write acceptance tests for a web application through the HTTP protocol with practically any language we want, but it’s often impossible to run acceptance tests for embedded software written in any language other than that of the system itself.

The main reason for choosing a different programming language for implementing acceptance tests than the one we’re using for our production code (and, often, unit tests) is that the needs of acceptance tests are often radically different from the properties of the programming language we use for implementing our system. To give you an example, a particular real-time system might be feasible to implement only with native C code, whereas it would be rather verbose, slow, and error-prone to express tests for the same real-time system in C compared to, for example, a scripting language.

The ideal syntax for expressing our acceptance tests could be a declarative, tabular structure such as a spreadsheet, or it could be something closer to a sequence of higher-level actions written in plain English. If we want to have our customer collaborate with developers on our acceptance tests, a full-blown programming language such as Java, C/C++, or C# is likely not an option. “Best tool for the job” means more than technically best, because the programmer’s job description also includes collaborating with the customer.

The acceptance TDD cycle

In its simplest form, the process of acceptance test-driven development can be expressed as the simple cycle illustrated by figure 1

Figure 1. The acceptance TDD cycle

This cycle continues throughout the iteration as long as we have more stories to implement, starting over again from picking a user story; then writing tests for the chosen story, then turning those tests into automated, executable tests; and finally implementing the functionality to make our acceptance tests pass.

In practice, of course, things aren’t always that simple. We might not yet have user stories, the stories might be ambiguous or even contradictory, the stories might not have been prioritized, the stories might have dependencies that affect their scheduling, and so on.

Step 1: Pick a user story

The first step is to decide which story to work on next. Not always an easy job; but, fortunately, most of the time we’ll already have some relative priorities in place for all the stories in our iteration’s work backlog. Assuming that we have such priorities, the simplest way to go is to always pick the story that’s on top of the stack—that is, the story that’s considered the most important of those remaining. Again, sometimes, it’s not that simple.

Generally speaking, the stories are coming from the various planning meetings held throughout the project where the customer informally describes new features, providing examples to illustrate how the system should work in each situation. In those meetings, the developers and testers typically ask questions about the features, making them a medium for intense learning and discussion. Some of that information gets documented on a story card (whether virtual or physical), and some of it remains as tacit knowledge. In those same planning meetings, the customer prioritizes the stack of user stories by their business value (including business risk) and technical risk (as estimated by the team).

There are times when the highest-priority story requires skills that we don’t possess, or we consider not having enough of. In those situations, we might want to skip to the next task to see whether it makes more sense for us to work on it. Teams that have adopted pair programming don’t suffer from this problem as often. When working in pairs, even the most cross-functional team can usually accommodate by adjusting their current pairs in a way that frees the necessary skills for picking the highest priority task from the pile.

The least qualified person

The traditional way of dividing work on a team is for everyone to do what they do best. It’s intuitive. It’s safe. But it might not be the best way of completing the task. Arlo Belshee presented an experience report at the Agile 2005 conference, where he described how his company had started consciously tweaking the way they work and measuring what works and what doesn’t. Among their findings about stuff that worked was a practice of giving tasks to the least qualified person.

There can be more issues to deal with regarding picking user stories, but most of the time the solution comes easily through judicious application of common sense. For now, let’s move on to the second step in our process: writing tests for the story we’ve just picked.

Step 2: Write tests for a story

With a story card in hand (or onscreen if you’ve opted for managing your stories online), our next step is to write tests for the story.

The first thing to do is, of course, get together with the customer. In practice, this means having a team member sit down with the customer (they’re the one who should own the tests, remember?) and start sketching out a list of tests for the story in question.

As usual, there are personal preferences for how to go about doing this, but current preference  is to quickly scram out a list of rough scenarios or aspects of the story we want to test in order to say that the feature has been implemented correctly. There’s time to elaborate on those rough scenarios later on when we’re implementing the story or implementing the acceptance tests. At this time, however, we’re only talking about coming up with a bulleted list of things we need to test—things that have to work in order for us to claim the story is done.

On timing

Especially in projects that have been going on for a while already, the customer and the development team probably have some kind of an idea of what’s going to get scheduled into the next iteration in the upcoming planning meeting. In such projects, the customer and the team have probably spent some time during the previous iteration sketching acceptance tests for the features most likely to get picked in the next iteration’s planning session. This means that we might be writing acceptance tests for stories that we’re not going to implement until maybe a couple of weeks from now. We also might think of missing tests during implementation, for example, so this test-writing might happen pretty much at any point in time between writing the user story and the moment when the customer accepts the story as completed.

Once we have such a rough list, we start elaborating the tests, adding more detail and discussing about how this and that should work, whether there are any specifics about the user interface the customer would like to dictate, and so forth. Depending on the type of feature, the tests might be a set of interaction sequences or flows, or they might be a set of inputs and expected outputs. Often, especially with flow-style tests, the tests specify some kind of a starting state, a context the test assumes is part of the system.

Other than the level of detail and the sequence in which we work to add that detail, there’s a question of when—or whether—to start writing the tests into an executable format. Witness step 3 in our process: automating the tests.

Step 3: Automate the tests

The next step once we’ve got acceptance tests written down on the back of a story card, on a whiteboard, in some electronic format, or on pink napkins, is to turn those tests into something we can execute automatically and get back a simple pass-or-fail result. Whereas we’ve called the previous step writing tests, we might call this step implementing or automating those tests.

In an attempt to avoid potential confusion about how the executable acceptance tests differ from the acceptance tests

We might turn Acceptance tests into an executable format by using a variety of approaches and tools. The most popular category of tools  these days seems to be what we calltable-based tools. Their premise is that the tabular format of tables, rows, and columns makes it easy for us to specify our tests in a way that’s both human and machine readable. Figure 2 presents an example of how we might draft an executable test for the first test  “Valid account number”.

Figure 2. Example of an executable test, sketched on a piece of paper

In figure 2, we’ve outlined the steps we’re going to execute as part of our executable test in order to verify that the case of an incoming support call with a valid account number is handled as expected, displaying the customer’s information onscreen. Our test is already expressed in a format that’s easy to turn into a tabular table format using our tool of choice—for example, something that eats HTML tables and translates their content into a sequence of method invocations to Java code according to some documented rules.

The inevitable fact is that most of the time, there is not such a tool available that would understand our domain language tests in our table format and be able to wire those tests into calls to the system under test. In practice, we’ll have to do that wiring ourselves anyway—most likely the developers or testers will do so using a programming language.

To summarize this duality of turning acceptance tests into executable tests, we’re dealing with expressing the tests in a format that’s both human and machine readable and with writing the plumbing code to connect those tests to the system under test.

On style

The example in figure 2 is a flow-style test, based on a sequence of actions and parameters for those actions. This is not the only style at our disposal, however. A declarative approach to expressing the desired functionality or business rule can often yield more compact and more expressive tests than what’s possible with flow-style tests.

Yet our goal should—once again—be to keep our tests as simple and to the point as possible, ideally speaking in terms of what we’re doing instead of how we’re doing it.

With regard to writing things down , there are variations on how different teams do this. Some start writing the tests right away into electronic format using a word processor; some even go so far as to write them directly in an executable syntax. Some teams run their tests as early as during the initial authoring session. Some people,  prefer to work on the tests alongside the customer using a physical medium, leaving the running of the executable tests for a later time. For example, agilists like to sketch the executable tests on a whiteboard or a piece of paper first, and pick up the computerized tools only when they got something  relatively sure won’t need to be changed right away.

The benefit is that we’re less likely to fall prey to the technology—Agilsts noticed that tools often steal too much focus from the topic, which we don’t want. Using software also has this strange effect of the artifacts being worked on somehow seeming more formal, more final, and thus needing more polishing up. All that costs time and money, keeping us from the important work.

In projects where the customer’s availability is the bottleneck, especially in the beginning of an iteration (and this is the case more often than not), it makes a lot of sense to have a team member do the possibly laborious or uninteresting translation step on their own rather than keep the customer from working on elaborating tests for other stories. The downside to having the team member formulate the executable syntax alone is that the customer might feel less ownership in the acceptance tests in general—after all, it’s not the exact same piece they were working on. Furthermore, depending on the chosen test-automation tool and its syntax, the customer might even have difficulty reading the acceptance tests once they’ve been shoved into the executable format dictated by the tool.

let’s consider a case where our test-automation tool is a framework for which we express our tests in a simple but powerful scripting language such as Ruby. Figure 3 highlights the issue with the customer likely not being as capable of feeling ownership of the implemented acceptance test compared to the sketch, which they have participated in writing. Although the executable snippet of Ruby code certainly reads nicely to a programmer, it’s not so trivial for a non-technical person to relate to.

Figure 3. Contrast between a sketch an actual, implemented executable acceptance test

Another aspect to take into consideration is whether we should make all tests executable to start with or whether we should automate one test at a time as we progress with the implementation. Some teams—and this is largely dependent on the level of certainty regarding the requirements—do fine by automating all known tests for a given story up front before moving on to implementing the story.

Some teams prefer moving in baby steps like they do in regular test-driven development, implementing one test, implementing the respective slice of the story, implementing another test, and so forth. The downside to automating all tests up front is, of course, that we’re risking more unfinished work—inventory, if you will—than we would be if we’d implemented one slice at a time. Agilists  preference is strongly on the side of implementing acceptance tests one at a time rather than try getting them all done in one big burst. It should be mentioned, though, that elaborating acceptance tests toward their executable form during planning sessions could help a team understand the complexity of the story better and, thus, aid in making better estimates.

Many of the decisions regarding physical versus electronic medium, translating to executable syntax together or not, and so forth also depend to a large degree on the people. Some customers have no trouble working on the tests directly in the executable format (especially if the tool supports developing a domain-specific language). Some customers don’t have trouble identifying with tests that have been translated from their writing. As in so many aspects of software development, it depends.

Regardless of our choice of how many tests to automate at a time, after finishing this step of the cycle we have at least one acceptance test turned into an executable format; and before we proceed to implementing the functionality in question, we will have also written the necessary plumbing code for letting the test-automation tool know what those funny words mean in terms of technology. That is, we will have identified what the system should do when we say “select a transaction” or “place a call”—in terms of the programming API or other interface exposed by the system under test.

To put it another way, once we’ve gotten this far, we have an acceptance test that we can execute and that tells us that the specified functionality is missing. The next step is naturally to make that test pass—that is, implement the functionality to satisfy the failing test.

Step 4: Implement the functionality

Next on our plate is to come up with the functionality that makes our newly minted acceptance test(s) pass. Acceptance test-driven development doesn’t say how we should implement the functionality; but, needless to say, it is generally considered best practice among practitioners of acceptance TDD to do the implementation using test-driven development.

In general, a given story represents a piece of customer-valued functionality that is split—by the developers—into a set of tasks required for creating that functionality. It is these tasks that the developer then proceeds to tackle using whatever tools necessary, including TDD. When a given task is completed, the developer moves on to the next task, and so forth, until the story is completed—which is indicated by the acceptance tests executing successfully.

In practice, this process means plenty of small iterations within iterations. Figure 4 visualizes this transition to and from test-driven development inside the acceptance TDD process.

Figure 4. The relationship between test-driven development and acceptance test-driven development

As we can see, the fourth step of the acceptance test-driven development cycle, implementing the necessary functionality to fix a failing acceptance test, can be expanded into a sequence of smaller TDD cycles of test-code-refactor, building up the missing functionality in a piecemeal fashion until the acceptance test passes.

While the developer is working on a story, frequently consulting with the customer on how this and that ought to work, there will undoubtedly be occasions when the developer comes up with a scenario—a test—that the system should probably handle in addition to the customer/developer writing those things down. Being rational creatures, we add those acceptance tests to our list, perhaps after asking the customer what they think of the test. After all, they might not assign as much value to the given aspect or functionality of the story as we the developers might.

At some point, we’ve iterated through all the tasks and all the tests we’ve identified for the story, and the acceptance tests are happily passing. At this point, depending on whether we opted for automating all tests up front  or automating them just in time, we either go back to Step 3 to automate another test or to Step 1 to pick a brand-new story to work on.

. Getting acceptance tests passing is intensive work.

 Acceptance TDD inside an iteration

A healthy iteration consists mostly of hard work. Spend too much time in meetings or planning ahead, and you’re soon behind the iteration schedule and need to de-scope . Given a clear goal for the iteration, good user stories, and access to someone to answer our questions, most of the iteration should be spent in small cycles of a few hours to a couple of days writing acceptance tests, collaborating with the customer where necessary, making the tests executable, and implementing the missing functionality with test-driven development.

As such, the four-step acceptance test-driven development cycle of picking a story, writing tests for the story, implementing the tests, and implementing the story is only a fraction of the larger continuum of a whole iteration made of multiple—even up to dozens—of user stories, depending on the size of your team and the size of your stories. In order to gain understanding of how the small four-step cycle for a single user story fits into the iteration, we’re going to touch the zoom dial and see what an iteration might look like on a time line with the acceptance TDD–related activities scattered over the duration of a single iteration.

Figure 5 is an attempt to describe what such a time line might look like for a single iteration with nine user stories to implement. Each of the bars represents a single user story moving through the steps of writing acceptance tests, implementing acceptance tests, and implementing the story itself. In practice, there could (and probably would) be more iterations within each story, because we generally don’t write and implement all acceptance tests in one go but rather proceed through tests one by one.

Figure 5. Putting acceptance test-driven development on time line

Notice how the stories get completed almost from the beginning of the iteration? That’s the secret ingredient that acceptance TDD packs to provide indication of real progress. Our two imaginary developers (or pairs of developers and/or testers, if we’re pair programming) start working on the next-highest priority story as soon as they’re done with their current story. The developers don’t begin working on a new story before the current story is done. Thus, there are always two user stories getting worked on, and functionality gets completed throughout the iteration.

So, if the iteration doesn’t include writing the user stories, where are they coming from? As you may know if you’re familiar with agile methods, there is usually some kind of a planning meeting in the beginning of the iteration where the customer decides which stories get implemented in that iteration and which stories are left in the stack for the future. Because we’re scheduling the stories in that meeting, clearly we’ll have to have those stories written before the meeting, no?

That’s where continuous planning comes into the picture.

Continuous planning

Although an iteration should ideally be an autonomous, closed system that includes everything necessary to meet the iteration’s goal, it is often necessary—and useful—to prepare for the next iteration during the previous one by allocating some amount of time for pre-iteration planning activities.  Suggestions regarding the time we should allocate for this continuous planning range from 10–15% of the team’s total time available during the iteration. As usual, it’s good to start with something that has worked for others and, once we’ve got some experience doing things that way, begin zeroing in on a number that seems to work best in our particular context.

In practice, these pre-iteration planning activities might involve going through the backlog of user stories, identifying stories that are most likely to get scheduled for the next iteration, identifying stories that have been rendered obsolete, and so forth. This ongoing pre-iteration planning is also the context in which we carry out the writing of user stories and, to some extent, the writing of the first acceptance tests. The rationale here is to be prepared for the next iteration’s beginning when the backlog of stories is put on the table. At that point, the better we know our backlog, the more smoothly the planning session goes, and the faster we get back to work, crunching out valuable functionality for our customer.

By writing, estimating, splitting if necessary, and prioritizing user stories before the planning meeting, we ensure quick and productive planning meetings and are able to get back to delivering valuable features sooner.

It would be nice if we had all acceptance tests implemented (and failing) before we start implementing the production code. That is often not a realistic scenario, however, because tests require effort as well—they don’t just appear from thin air—and investing our time in implementing the complete set of acceptance tests up front doesn’t make any more sense than big up-front design does in the larger scale. It is much more efficient to implement acceptance tests as we go, user story by user story.

Teams that have dedicated testing personnel can have the testing engineers work together with the customer to make acceptance tests executable while developers start implementing the functionality for the stories.  A hazard is  that most teams, however, are much more homogeneous in this regard and participate in writing and implementing acceptance tests together, with nobody designated as “the acceptance test guy.”

The process is largely dependent on the availability of the customer and the test and software engineers. If your customer is only onsite for a few days in the beginning of each iteration, you probably need to do some trade-offs in order to make the most out of those few days and defer work that can be deferred until after the customer is no longer available. Similarly, somebody has to write code, and it’s likely not the customer who’ll do that; software and test engineers need to be involved at some point.

We start from those stories we’ll be working on first, of course, and implement the user story in parallel with automating the acceptance tests that we’ll use to verify our work. And, if at all possible, we avoid having the same person implement the tests and the production code in order to minimize our risk of human nature playing its tricks on us.

Again, we want to keep an eye on putting too much up-front effort in automating our acceptance tests—we might end up with a huge bunch of tests but no working software. It’s much better to proceed in small steps, delivering one story at a time. No matter how valuable our acceptance tests are to us, their value to the customer is negligible without the associated functionality.

The mid-iteration sanity check

Agilists like to have an informal sanity check in the middle of an iteration. At that point, we should have approximately half of the stories scheduled for the iteration running and passing. This might not be the case for the first iteration, due to having to build up more infrastructure than in later iterations; but, especially as we get better at estimating our stories, it should always be in the remote vicinity of having 50% of the stories passing their tests.

Of course, we’ll be tracking story completion throughout the iteration. Sometimes we realize early on that our estimated burn rate was clearly off, and we must adjust the backlog immediately and accordingly. By the middle of an iteration, however, we should generally be pretty close to having half the stories for the iteration completed. If not, the chances are that there’s more work to do than the team’s capacity can sustain, or the stories are too big compared to the iteration length.

A story’s burn-down rate is constantly more accurate a source of prediction than an inherently optimistic software developer. If it looks like we’re not going to live up to our planned iteration content, we decrease our load.

Decreasing the load

When it looks like we’re running out of time, we decrease the load. We don’t work harder (or smarter). We’re way past that illusion. We don’t want to sacrifice quality, because producing good quality guarantees the sustainability of our productivity, whereas bad quality only creates more rework and grinds our progress to a halt. We also don’t want to have our developers burn out from working overtime, especially when we know that working overtime doesn’t make any difference in the long run. Instead, we adjust the one thing we can: the iteration’s scope—to reality. In general, there are three ways to do that: swap, drop, and split. Tom DeMarco and Timothy Lister have done a great favor to our industry with their best-selling books Slack (DeMarco; Broadway, 2001) and Peopleware (DeMarco, Lister; Dorset House, 1999), which explain how overtime reduces productivity.

Swapping stories is simple. We trade one story for another, smaller one, thereby decreasing our workload. Again, we must consult the customer in order to assure that we still have the best possible content for the current iteration, given our best knowledge of how much work we can complete.

Dropping user stories is almost as straightforward as swapping them. “This low-priority story right here, we won’t do in this iteration. We’ll put it back into the product backlog.” But dropping the lowest-priority story might not always be the best option, considering the overall value delivered by the iteration—that particular story might be of low priority in itself, but it might also be part of a bigger whole that our customer cares about. We don’t want to optimize locally. Instead, we want to make sure that what we deliver in the end of the iteration is a cohesive whole that makes sense and can stand on its own.

The third way to decrease our load, splitting, is a bit trickier compared to dropping and swapping

Splitting stories

How do we split a story we already tried hard to keep as small as possible during the initial planning game? In general, we can split stories by function or by detail (or both). Consider a story such as “As a regular user of the online banking application, I want to optionally select the recipient information for a bank transfer from a list of most frequently and recently used accounts based on my history so that I don’t have to type in the details for the recipients every time.”

Splitting this story by function could mean dividing the story into “…from a list of recently used accounts” and “…from a list of most frequently used accounts.” Plus, depending on what the customer means by “most frequently and recently used,” we might end up adding another story along the lines of “…from a weighted list of most frequently and recently used accounts” where the weighted list uses an algorithm specified by the customer. Having these multiple smaller stories, we could then start by implementing a subset of the original, large story’s functionality and then add to it by implementing the other slices, building on what we have implemented for the earlier stories.

Splitting it by detail could result in separate stories for remembering only the account numbers, then also the recipient names, then the VAT numbers, and so forth. The usefulness of this approach is greatly dependent on the distribution of the overall effort between the details—if most of the work is in building the common infrastructure rather than in adding support for one more detail, then splitting by function might be a better option. On the other hand, if a significant part of the effort is in, for example, manually adding stuff to various places in the code base to support one new persistent field, splitting by detail might make sense.

Regardless of the chosen strategy, the most important thing to keep in mind is that, after the splitting, the resulting user stories should still represent something that makes sense—something valuable—to the customer.

PEARL XIV : An Inquiry on Release/Hardening sprint in Scrum

PEARL XIV : An Inquiry on Release/Hardening sprint in Scrum

There is a deep divide between people who recognize that spending some time on hardening is needed for many environments, and people who are adamant that allocating some time for hardening is a sign that you are doing some things – or everything – wrong

A Hardening/release sprint, often part of a project using agile management methodologies, incorporates activities, sometimes not related to creating product features, that the development team can’t realistically complete within development sprints. To accommodate prerelease activities and help ensure that the release goes well, scrum teams often schedule a release sprint as the final sprint prior to releasing product to customers.

There are two components of the Production Release practice: 1) Release Preparation, and 2) Deployment. Release preparation establishes a release baseline and produces all the necessary supporting material necessary to deploy (and back out, if necessary) the release.

Deployment involves the act of delivering the release into the production environment, verifying that the integration of the release package into the existing environment was successful, and notifying all relevant stakeholders that the features of the release are available for use.

The Hardening release sprint should contain anything you need to do to move the working product to production. Sprint backlog items in a release sprint may include

  • Creating user documentation for the most recent version of the product
  • Performance testing, load testing, security testing, and any other checks to ensure the working software — or other product — will perform acceptably in production
  • Integrating the product with enterprise-wide systems, where testing may take days or weeks
  • Completing organizational or regulatory procedures that are mandatory prior to release
  • Preparing release notes — final notes about changes to the product

If your product is software, backlog items for the Hardening release sprint may also include

  • Preparing the deployment package, enabling all the code for the product features to move to production at one time
  • Deploying your code to the production environment

In a hardening sprint, the team stops focusing on delivering new features or architecture, and instead spends their time on stabilizing the system and getting it ready to be released.

For some people, hardening sprints are for completing testing and fixing work that couldn’t be done – or didn’t get done – earlier. This might include UAT or other final acceptance testing if this is built into a contract or governance model.

Mike Cohn recognizes that teams may need a “ hardening/release sprint” at the end of each release cycle, because the team’s definition of “done” may not be enough – that a “potentially shippable product”and a system that is actually “shippable” or ready for production aren’t the same thing. He suggests that after every 3-5 feature iterations, the team may want to schedule a release sprint to do work like expensive manual system and integration testing and extra reviews, whatever is needed to make sure that what they think is done, is actually done.

Anand Viswanath, in “The end of regression, stabilisation, hardening or release sprints”, describes a common approach where teams schedule 1 or 2 stabilization sprints every 4-6 iterations to do regression testing and system testing in a staging environment, and then fix whatever bugs are found. As he points out, it’s hard to predict how much testing might be required and long it will take to fix whatever problems are found, so the idea is to time box this work and then triage the results.

Because this can be an expensive and risky and stressful way to work, Vishwanath recommends following Continuous Delivery to build an automated test pipeline through to staging in order to catch as many problems as early as possible. This is a good idea, but most large projects, especially projects starting from a legacy code base, will still probably need some kind of hardening or integration testing phase at regular points regardless of what kind of continuous testing they are doing.

Some testing, like interoperability testing with other systems and operational testing, can’t be done effectively until later, when there is enough of a working system to do end-to-end testing, and some of this testing can only be done in staging , or in production. For some systems, load testing and stress testing and soak testing also needs to be left to later, because these teams don’t have access to a  right high end test system to run high load scenarios before they get to production.

Is Hardening a sign that you aren’t doing things right?

Not everyone thinks that scheduling a hardening sprint for testing and fixing like this is a good idea:

“[a hardening sprint] might take the cake for stupid things invented that has lead to institutionalized delusion and ‘Agile’ dysfunction.” Janelle Klein, Who Came up with the “Hardening Sprint”?

For many people, a hardening sprint or release sprint is a bad “process smell”: a sign that the team isn’t working properly or thinking clearly:

“The problem with “hardening sprints” is that you are lying. You make believe your imaginary burndown during the initial sprints shows that you are approaching Done. But it’s a lie–you aren’t getting any closer to being ready for Production until you begin your Test phase. You wrote a pile of code that you didn’t test adequately. You don’t know how good it is, you don’t know how much work you have left to do, and you don’t know how much longer it will take, until you are deep into your Test phase.” Richard Kasperowski, Hardening sprints? Sorry, you’re not Agile

Ron Jeffries says that a hardening sprint for testing and fixing is a clear anti-pattern. Many Agilists agree: if you need a separate sprint to fix bugs, then you’re doing something wrong. But that doesn’t mean that you won’t need extra time to fix things before the system goes live – knowing that it is wrong doesn’t make the bugs go away, you still have to fix them. As the same discussion thread points out, there is a risk that your “definition of done” could fall short of what is actually needed by the customer, so you should plan for 1 or more hardening sprints before release, to double-check and stabilize things, just in case.

In these cases, the need for hardening sprints is a sign of a team’s immaturity (from a post by Paul Beavers):

  1. A beginning agile team will prefer to schedule 6 hardening iterations after a 12 iteration development plan. This is “agile” to the hard core “waterfall guy”.
  2. As time goes by, the team will mature a bit and you will see the seasoned agile team will shrink the number of required hardening iterations at the end, just because they understand they need to “fix” the high severity bugs as they go and QA understands they need to test closer and better early up in the release cycle.
  3. Further down the road the team will notice that by adding a hardening iteration in the middle of the development cycle (and flushing out even lesser priority bugs earlier on in the process), it will help them to maintain cadence later on.
  4. The final step of maturity is there when the team starts understanding “hardening is not required any more”, because they made fixing bugs part of their daily routines.

Hardening is whatever you need to do to Make the System Ready for Production

Another way of looking at hardening, is that this is when you stop thinking about features and focus all of your time on the detailed steps of deploying, installing and configuring the system and making sure that everything is working from end-to-end. In a hardening sprint, your most important customers are operations and support, the people who are going to make sure that the system is running, rather than the end users.

For some teams, this kind of hardening can come as an ugly and expensive surprise, after they understand that what they need to do is to take a working functional prototype and make it ready for the real world:

“All those things that got skipped in the first phase – error handling, monitoring, administration – need to get put into the product.” Catherine Powell, The “Hardening Myth”

But a hardening sprint can also be when when you take care of what operations calls hardening: reviewing and preparing the production environment and securing the run-time, tightening up access to production data, double-checking system and application configs, making sure that auditing is enabled properly, wiring the system in to operations monitoring and metrics collection, checking system dependencies like platform software versions and patch levels (and making sure that all of the systems are consistent, that there aren’t any snowflakes), completing final security reviews and other review and release gates, and making sure that the people installing and running the software have the correct instructions.This is also when you need to prepare your roll-back plan or recovery plan if something bad happens with the release, and test your roll-back and recovery steps. Walk through and rehearse the release process and checklists, and make sure that everyone is prepared to roll out patches quickly after the release is done.

Jail Free Card

Agilists have also heard of hardening sprints being used as a sort of “get out of jail free card” within Scrum teams. The conversations are usually along the following lines:

We don’t have enough time to test that component properly, so we’ll do it in hardening

We don’t have time to fix all of the cosmetic bugs we produced in this sprint, so we’ll do it in hardening
Let’s defer customer support training entirely to hardening
The design is still too volatile to document. We’ll wait until it stabilizes and document it in hardening

What is happening here has to be noted ? Hardening can undermine the broad definition of done that is so incredibly important to Scrum’s quality and delivery dynamics. But if it is so bad, why do it at all? That is why there are some of the contexts where hardening might not only be a good idea, but almost required.

Hardening is something that you have to do

Some people see an obvious need for hardening sprints. For example, Dean Leffingwell includes hardening sprints in his “Scaled Agile Framework”, because there is some work that can only really be done in a final hardening phase:

  • Final exploratory and field testing
  • Checklist validation against release, QA and standards governance
  • Release signoffs if you need them
  • Ops documentation
  • Deployment package
  • Communicate release to everyone (hard to do in big companies)
  • Traceability etc for high assurance and regulatory compliance

Leffingwell makes it clear that hardening shouldn’t include system integration, fixing high priority bugs, automating test scripts, user documentation, regression testing and code cleanup. There is other work that should be done earlier – but in the first year or so, will probably need to be done in a late hardening phase:

  • Cross-component integration, integration with third-party/customer
  • Integrated system-level testing
  • Final QA sign-offs
  • User doc finalization
  • Localization

Dan Rawsthorne explains that teams need at least one release sprint at first to get ready for release to production, because until you’ve actually done it, you don’t really know what you need to do. Release sprints include tasks like:

  • Exploratory testing to double check that key features are working properly
  • Stress testing/load testing/performance testing – testing that is expensive to setup and do
  • Interoperability testing with other production systems
  • Fix whatever comes out of this testing
  • Review and finish off any documentation
  • Train support and sales and customers on new features
  • Help with press releases and other marketing material

The Software Project Manager’s Bridge to Agility anticipates that teams will need at least a short hardening iteration before the system is ready for release, even if they frontload as much testing as possible. A release iteration is not a test-fix phase – it’s when you prepare for the release: capturing screenshots for marketing materials, final tests, small tweaks, finish documentation for whoever needs it, training. The authors suggest however that if some developers have any time left over in the release iteration, they can do some refactoring and other cleanup – which Agilists think is bad advice, given that at this point you don’t want to be introducing any new variables or risks.

Disciplined Agile Delivery, a method that was developed by Scott Ambler at IBM to scale Agile practices to large organizations and large projects, includes a Transition Phase before each release to take care of:

  • Transition planning and coordination
  • End-of-lifecycle testing and fixing
  • Testing and rehearsing deployment
  • Data setup and migration
  • Pilots and beta testing (short UAT if necessary)
  • Reviewing and finalizing documentation
  • Preparing operations and support
  • Stakeholder training

This kind of transition can take almost no time, or it can take several weeks, depending on the situation.

Hardening – taking some time to make sure that the system is really ready to be released – can’t be avoided. The longer your release cycles, the further away development is from day-to-day production, the more hardening you need. Even if you’ve been doing disciplined testing and reviews in stream, you’re going to find some problems at the end. Even if you planned ahead for transition, you’re going to run into operational details that you didn’t know about or didn’t understand until the end.

When agilists first launched a platform from start up, they had to do hardening and stabilization work before going live to get the system ready, and some more work afterwards to deal with operational issues and requirements that they weren’t prepared for. They included time at the end of subsequent releases for extra testing, deployment and roll back planning, and release coordination.

But as they shortened their release cycle, releasing less but more often, and as they built more fail-safes into the system and as they learned more about what they needed to do in ops, and as they invested more in simplifying and automating deployment and everything else that they could, they found that they didn’t need time any outside of  regular iterations for hardening. They a’re still doing hardening – but now this is part of the day-to-day job of building and releasing software.

But there are a whole lot of contexts where hardening can be incredibly useful. Here are a few of these, just to set you up for healthier applications:
1. Distributed and At-Scale Agile: It is quite easy to say that a team should “integrate” their software fully within each sprint. But what if you have 20-50 teams working on the same project and the teams are geographically distributed?
Sure, you want to try to integrate as much as possible across the teams, but clearly you are not going to do it all. It just does not meet reasonable ROI on a sprint-by-sprint basis.
In these cases, having a hardening sprint that is focused toward full Integration, regression, and system testing might be a prudent trade-off.

.2. Customer Receptivity: Something that gets lost in the “potentially shippable product increment” goal that is such a strong part of Scrum is – what if your customer simply cannot tolerate or accept a release every 1-2-3 weeks? What do you do then? Many domains and customers simply cannot.
The game is then to accumulate partial release contents over the course of several-to-many sprints. But then there is the need to re-qualify all of that code when you are ready
to do a real (not potential) release.
3. Test Automation Coverage: Having high degrees of test automation, and automation in general, is one of the ways that agile teams can truly go fast. It provides a wonderful development safety net and gets feedback to the team quickly. But what if you do not have it? What if you are currently stuck with thousands of manual, albeit valuable, test cases to run? Clearly if you run all of them within each sprint you would derail the focus. But running risk-based regression and functional testing is risky since you are deferring levels of coverage. A hardening sprint is a solid way to handle this technical test debt  before releasing.
4. Skewed Sprint Consolidation: In many agile contexts, some work is done a “sprint ahead” of other activities. For example, UX design is often completed by a design Scrum
team and then “handed off” to a front-end implementation team for coding in a follow-on sprint. This sort of staging often happens with research (story spikes), design, architecture, and other up-front activities in projects. The hardening sprint is a place to check that all of these skewed activities have come together into a cohesive package from a customer perspective. It certainly should not be the first or only time, but it can serve as a design convergence point.
5. Defect Rework: At the end of the day, bugs are discovered pre-release whether you are implementing Waterfall or Agile methods. If you are doing Agile well, then you are finding fewer and the rework cycle time is much shorter. But, inevitably, some cleanup of bugs pre-release is fairly normal. Reserving time in your hardening sprint gives you some time for final defect cleanup before making the release.
6. Deployment Readiness and Training: There is a “whole lot” of work required to get a release out the door in many organizational contexts and business domains.  Agilists think one of the things that “potentially shippable product increment” does is to trivialize this level of pre-release preparation. While thier definition of hardening sprints tends to be more testingand defect-repair focused, it is an interval that allows time to prepare software for, err, release! Dan Rawsthorne has written an article on the Agile Atlas site entitled Out the
Door: the Release Sprint, which is perhaps another name for a hardening sprint. Dan focuses the sprint on this area though –finishing all of the details surrounding actually
releasing a supportable product.

Regulations, Governance, and the Art of Trivialized Agile Testing: Finally, another focus for a hardening sprint is making all of the steps and creating all of the obliged artifacts that meet your internal and external regulations for the product. Consider it a “governance acceptance test” checklist where various requirements are exercised and confirmed. Quite often this includes a full regression test that provides traceability between the project requirements (User Stories) and the tests that covered that functionality.
In many domains, financial and healthcare for instance, proof of completeness and following your predefined processes are required steps pre-release

These sorts of sprints are considered a Scrum-butt or Scrum anti-pattern by many of the leading Scrum authorities (CST’sand CSC’s)

PEARL XXIII : Guidelines for Successful and Effective Retrospectives

PEARL XXIII : Retrospectives are widely regarded as the most indispensable of people-focused agile techniques. Inspection and adaptation lie at the very heart of agility, and retrospectives focus on inspecting and adapting the most valuable asset in a software organization, the team itself. Without pursuing improvement as retrospectives require, true agility is simply not achievable. This section deals with guidelines for Successful and effective Retrospectives.

Performance can be neither improved nor maintained without exercise. Simply conducting a meeting isn’t enough to be successful, however. Attention must be paid to ensuring teams plan improvements. If a plan to improve is not part of the outcome, it wasn’t actually a Sprint Retrospective.

When done well, retrospectives are often the most beneficial ceremony a team practices. When done poorly, retrospectives can be wasteful and grueling to attend.

Without deliberately maintaining and improving performance, systems trend toward entropy and degrade over time. This is as true of software development teams as it is of professional athletes and expensive sports cars. That’s why Scrum prescribes the Sprint Retrospective, a regularly occurring event focused on the health and performance of the Scrum Team itself. Sprint Retrospectives are meetings in which Scrum Teams reflect on themselves and their work, producing an actionable plan for improving. Sprint Retrospectives are the final event in each Sprint, marking the end of each Sprint cycle. The Sprint Retrospective is an opportunity for the Scrum Team to inspect itself and create a plan for improvements to be enacted during the next Sprint.
The purpose of the Sprint Retrospective is to:

  • Inspect how the last Sprint went with regards to people, relationships, process, and tools;
  • Identify and order the major items that went well and potential improvements; and,
  • Create a plan for implementing improvements to the way the Scrum Team does its work.Sprint Retrospectives are used by teams to deliberately improve. Effective Sprint Retrospectives are an important ingredient in helping good teams become great and great teams sustain themselves.

Why Retrospectives Matter

Retrospectives are widely regarded as the most indispensable of people-focused agile techniques. Inspection and adaptation lie at the very heart of agility, and retrospectives focus on inspecting and adapting the most valuable asset in a software organization, the team itself. Without pursuing improvement as retrospectives require, true agility is simply not achievable.

Performance can be neither improved nor maintained without exercise. Simply conducting a meeting isn’t enough to be successful, however. Attention must be paid to ensuring teams plan improvements. If a plan to improve is not part of the outcome, it wasn’t actually a Sprint Retrospective. When done well, retrospectives are often the most beneficial ceremony a team practices. When done poorly, retrospectives can be wasteful and grueling to attend.

Anatomy of a Healthy Sprint Retrospective

Scrum says little about the internal structure of Sprint Retrospectives. Rather than prescribing how the Sprint Retrospective is conducted, Scrum specifies the output of the Sprint Retrospective: improvements the Scrum Team will enact for the next Sprint.

This flexibility has birthed a wide array of tools and techniques specifically designed to conduct retrospectives. Several popular practices are described later in this article, but regardless of the specific technique used, good Sprint Retrospectives have these characteristics:

  • The entire team is engaged
  • Discussion focuses on the team rather than individuals
  • The team’s Definition of Done is visited and hopefully expanded
  • A list of actionable commitments is created
  • The results of the previous Sprint Retrospective are visited
  • The discussion is relevant for all attendees

The entire Scrum Team attends each Sprint Retrospective. Usually, this means the Product Owner and Development Team attend as participants while the Scrum Master facilitates the meeting. In some cases, Scrum Teams invite other participants to the meeting. This can be especially helpful when working closely with customers or other stakeholders. Regardless of who attends, the environment for Sprint Retrospectives must be safe for all participants. This means attendees must be honest and transparent while treating others with respect. Passions can ignite in retrospectives as issues of performance and improvement are discussed; skilled facilitators ensure discussions stay positive and professional, focusing on improvement of the team as a whole. This is not an opportunity for personal criticism or attack. Increasing the Definition of Done Development Teams in Scrum use a Definition of Done to note what must be true about their work before it is considered complete. For example, a Development Team may decide that each feature it implements must have at least one passing automated acceptance test. Or the team’s Definition of Done may state that all code must be peer reviewed.

A Development Team’s Definition of Done is meant to expand over time. A newly formed team will invariably have a less stringent and smaller Definition of Done than a more mature team with a shared history of improving. Expanding a team’s Definition of Done lies at the very core of Kaizen, a Japanese term meaning a mindful and constant focus on improvement. While a team may initially require only that code build before being checked in, over time they should evolve more exacting standards like the need for unit tests to accompany new code. With each Sprint, Development Teams hopefully learn something that informs the expansion of the Definition of Done. The Sprint Retrospective is the perfect forum for discussing what was observed and learned during the Sprint and what changes might be made to the Definition of Done as a result. Because not every Product Owner has interest or involvement in internal Development Team practices, some Scrum Teams divide the Sprint Retrospective into two different phases:

  1. Focus on the entire Scrum Team
  2. Focus on the Development Team

Making Actionable Commitments

Although discussion may diverge and converge during the meeting, no Sprint Retrospective is successful if it doesn’t result in commitments by the team. It is not enough to simply reflect on what happened during the Sprint. The Scrum Team makes actionable commitments for what it will:

  1. Keep doing
  2. Start doing
  3. Stop doing

The word “actionable” is significant. Actionable commitments have clear steps to completion and acceptance criteria, just like a good requirement. An actionable commitment is clearly articulated and understood by the team. When teams first start performing retrospectives, they often find it easier to identify problems than plan what to do about them. Accordingly, the commitments published by the team may look like these:

  • Work in smaller batches
  • Make requirements easier to read
  • Write more unit tests
  • Be more accurate when estimating

These are not commitments; they are either goals or perhaps thinly veiled complaints. These are certainly issues that teams may wish to discuss during the Sprint Retrospective, but a list of actionable commitments looks more like this:

  • Check in code at least twice per day: before lunch and before going home
  • Express new Product Backlog items as User Stories and include acceptance criteria
  • Create a failing automated test that proves a defect exists before fixing it
  • Use Planning Poker during Product Backlog grooming sessions

Commitments made in the previous Sprint Retrospective are visited in each new Sprint Retrospective. This is necessary for retrospectives to retain their meaning and value. Few things are as frustrating as being on a team that continually commits to improving itself without making tangible progress toward doing so. For the Sprint Retrospective to be valuable team members must be more than present, they must be invested. Collaborating to create actionable commitments engages attendees and invests them in the success of the team.

Keeping it Relevant

Sprint Retrospectives are fundamentally a technique used to reveal the practices and behaviors of the Scrum Team to itself. When a self-organizing system becomes self-aware, it self-corrects and deliberately improves when given the tools to do so.

For retrospectives to be useful, they must be meaningful to the participants. If the focus isn’t on something valued by the participants, benefits will simply not be realized. The team must be allowed to consider and improve in areas it believes are important. Further, if a facilitator or dominant personality is driving the retrospective to a specific conclusion, the team avoids taking responsibility for itself and its work. Topics visited should be relevant for all levels of expertise. For example, there is little value in visiting the fine points of advanced Test-Driven Development (TDD) scenario if some team members aren’t even familiar with unit tests. The real value may be in deciding to increase the number of tests the team is writing, in getting some training, or in having a team member confident in TDD coach others.
Keep the focus on the Scrum Team, not the individual, and not the broader organization. Focusing holistically allows the team to genuinely see itself as a self-organizing unit, rather than as a loose confederation of individuals. Addressing issues of individual performance is not appropriate during a team retrospective. Not only is personal feedback most appropriately given in private, individual behaviors are not something the team can change together.
Having the team focus on one individual during a Sprint Retrospective is recipe for disaster and may result in irreparable harm to team member’s trust of each other. For retrospectives to be meaningful, they should focus on issues the team can control. Criticizing a company-wide vacation policy may be gratifying for the complainer looking for a sympathetic ear, but does little to help the team improve. Attention must be paid to those issues the team can affect itself, like the reaction it may choose to a particular policy.

Varying the Techniques

There are numerous techniques for conducting retrospectives. Trying different constructions of the Sprint Retrospective meeting keeps things fresh and interesting. As the primary facilitators for the Scrum Teams, Scrum Masters should at least be familiar with some of the more popular techniques.

There are books about retrospectives and blog articles aplenty to help people get the most from their practice. Some of the most popular are briefly described here.

In the most basic of Sprint Retrospective’s a facilitator simply asks basic questions of the team and facilitates discussion. The facilitator or Scrum Master may use various brainstorming techniques to get the team to answer:

  1. What went well in this Sprint?
  2. What happened in this Sprint that could use improvement?
  3. What will we commit to doing in the Sprint?

One simple technique to derive these answers has each team member write 2-3 answers to these questions on sticky notes during a 3-5 minute period of silence. Once created, the suggestions are grouped on a wall for all to see before being voted upon. A list of actionable commitments can thereby be derived from the collective wisdom of the team. Most other Sprint Retrospective techniques are variations on this theme and may focus on only one question or stage of this process. In any case, the outcomes are most important and any good technique supports this basic model.

Reviewing Previous Commitments

In addition to looking ahead to the next Sprint, each Sprint Retrospective should include a review of commitments made in the previous Sprint and a discussion about the team’s success in meeting those commitments. If this discussion isn’t part of each Sprint Retrospective, attendees soon learn their commitments don’t matter, and they’ll stop meeting them.

Further, the right place to review Sprint Retrospective commitments is throughout the Sprint, not just at the end. Once commitments for improvement are made, posting them publicly can help ensure they are considered on a daily basis. Some teams value posting commitments made during Sprint Retrospectives on the wall in a public area as a reminder to everyone what they should be focusing on improving each day.

There are many other techniques for conducting parts or the whole of the Sprint Retrospective. The names of many techniques are listed below and each is worthy of detailed discussion. All of the following are well documented online and in various publications.

Techniques for Sprint Retrospectives

A fishbowl conversation is a form of dialog that can be used when discussing topics within large groups. Fishbowl conversations are usually used in participatory events like Open Space Technology and Unconferences. The advantage of Fishbowl is that it allows the entire group to participate in a conversation. Several people can join the discussion.Four to five chairs are arranged in an inner circle. This is the fishbowl. The remaining chairs are arranged in concentric circles outside the fishbowl. A few participants are selected to fill the fishbowl, while the rest of the group sit on the chairs outside the fishbowl. In an open fishbowl, one chair is left empty. In a closed fishbowl, all chairs are filled. The moderator introduces the topic and the participants start discussing the topic. The audience outside the fishbowl listen in on the discussion.In an open fishbowl, any member of the audience can, at any time, occupy the empty chair and join the fishbowl. When this happens, an existing member of the fishbowl must voluntarily leave the fishbowl and free a chair. The discussion continues with participants frequently entering and leaving the fishbowl.

Depending on how large your audience is you can have many audience members spend some time in the fishbowl and take part in the discussion. When time runs out, the fishbowl is closed and the moderator summarizes the discussion.An immediate variation of this is to have only two chairs in the central group. When someone in the audience wants to join the two-way conversation, they come forward and tap the shoulder of the person they want to replace, at some point when they are not talking. The tapped speaker must then return to the outer circles, being replaced by the new speaker, who carries on the conversation in their place.In a closed fishbowl, the initial participants speak for some time. When time runs out, they leave the fishbowl and a new group from the audience enters the fishbowl. This continues until many audience members have spent some time in the fishbowl. Once the final group has concluded, the moderator closes the fishbowl and summarizes the discussion 

Mad Sad Glad

  1. Divide the board into three areas labelled:
    • Mad – frustrations, things that have annoyed the team and/or have wasted a lot of time
    • Sad – disappointments, things that have not worked out as well as was hoped
    • Glad – pleasures, things that have made the team happy
  2. Explain the meanings of the headings to the team and encourage them to place stickies with their ideas for each of them under each heading
  3. Wait until everyone has posted all of their ideas
  4. Have the team group similar ideas together
  5. Discuss each grouping as a team identifying any corrective actions


  1. Draw a large circle on a whiteboard and divide it into five equal segments
  2. Label each segment ‘Start’, ‘Stop’, ‘Keep Doing’, ‘More Of’, ‘Less Of’
  3. For each segment pose the following questions to the team:
    • What can we start doing that will speed the team’s progress?
    • What can we stop doing that hinders the team’s progress?
    • What can we keep doing to do that is currently helping the team’s progress?
    • What is currently aiding the team’s progress and we can do more of?
    • What is currently impeding the team’s progress and we can do less of?
  4. Encourage the team to place stickies with ideas in each segment until everyone has posted all of their ideas
  5. Erase the wheel and have the team group similar ideas together. Note that the same idea may have been expressed in opposite segments but these should still be grouped together
  6. Discuss each grouping as a team including any corrective actions

Problem Tree
A great technique to solve some of these is to use a problem solving tree. What you need is some post it notes, markers and a large wall or whiteboard.

  1. Start with an problem you need to solve, that you’ve identified in the retrospective.
  2. Write this on a sticky note, and stick it at the top of the tree.
  3. Now ask what participants what you can do to solve the problem.
  4. For each different idea put a sticky note below the first, at the same level.
  5. For each of these nodes do the same and build up a tree structure similar to an organisation chart.
  6. For each idea you put up, ask if it can be done in a single sprint, and if everyone understands what they need to do. If the answer is no, break it down smaller and make another level in the problem solving tree.
  7. Once you have some lower levels that are well understood and easy to implement in a single sprint, dot vote to see which to tackle in the next sprint. Try to only pick one and get it done, rather than lots that go nowhere.

Sailboat Retrospective


  1. Draw a boat on a white board. Include the following details:
    • Sails or engines  – these represent the things that are pushing the team forward towards their goals
    • Anchors – these represent the things that are impeding the team from reaching their goals
  2. Explain the metaphors to the team and encourage them to place stickies with their ideas for each of them on appropriate area of the drawing
  3. Wait until everyone has posted all of their ideas
  4. Have the team group similar ideas together
  5. Discuss each grouping as a team including any corrective actions going forward

Top 5

Expose the most pressing issues in an initially anonymous manner and determine the most effective actions to resolve them.
Length of time:
Approximately 45 minutes depending on the size of the team.
Short Description:
The facilitator asks participants to bring along their top five issues which are then grouped and in pairs the participants create actions to resolve them them before voting on the top actions which are taken away.
Whiteboard or flipchart paper & pens.

  1. Before the retrospective provide participants with a simple Word document template and ask them to identify their top 5 issues (one per template) and for each issue suggest as many solutions as possible. The template is to ensure participants can be as anonymous as possible.
  2. Collect all the print-outs, spread them on the table and ask the team to group relevant issues.
  3. Ask for a title for each group, create a column for each one on a whiteboard (or flip chart sheets stuck to the wall) and place the associated print outs on the floor below.
  4. Get participants to form pairs (preferably with someone they don’t normally work too closely with) and give them three minutes with each column to come up with as many actions as they can and to write them in the column. Pairs are able to refer to the print outs and previous pairs’ actions for inspiration.
  5. After three minutes pairs move on to another column until all are exhausted.
  6. Go through all the actions so all participants are aware of them all.
  7. Give each participant three votes and ask them to choose their favourite actions (can use votes however they wish e.g. 3 on one action).
  8. Identify the most popular actions and ask for volunteers to own them. Make it clear it will be their responsibility to ensure they get completed before the next retrospective (tip: don’t choose too many actions and definitely no more than one action per participant).
  9. As with all retrospective output Agilists find the best way to ensure they get actioned is to stick them up on the wall somewhere everyone can see.

Other Techniques

  • Journey Lines
  • 6 Thinking Hats
  • Appreciative Retrospective
  • Top 5
  • Plan of action
  • Race Car
  • The Abyss
  • The Perfection Game
  • The Improvement Game
  • Force Field Analysis
  • Four L’s
  • World Café
  • Emotional Seismograph

Two particularly rich resources for facilitators looking to expand their retrospective toolboxes are:

Sprint Retrospectives aren’t the Scrum Master’s playground. Newly minted Scrum Masters are sometimes tempted to vary the techniques wildly from Sprint to Sprint. While variety in retrospectives prevents teams falling into a rut, tempering this with some consistency will yield the best results. Teams focusing on actionable outcomes will see the most value from their retrospectives.

Why Retrospectives Dont Work

Worse than being ineffective or a waste of time, badly run Sprint Retrospectives can be destructive and harmful to the team. For this reason, having a skilled facilitator conduct the meeting is highly recommended, especially when teams are new to the practice.Facilitation is typically the job of the Scrum Master, but for Scrum Masters new to the role, this may not be an area of expertise. It requires more than a working knowledge of Scrum for Sprint Retrospectives to have positive outcomes; it requires facilitation skills and the ability to lead a group away from negative discussion toward positive outcomes.

Common Smells

A common example of a bad retrospective is one that deteriorates into a gripe session. It is much easier to remember that went poorly than to identify things that went well, and a trickle of “improvement suggestions” can easily turn into a torrent of complaints when the facilitator doesn’t redirect this conversation.
Other smells that a Sprint Retrospective isn’t working well include:

  • Considering the retrospective a “post-mortem” or “after-action” report rather than an opportunity to plan for improvement
  • Unengaged attendees
  • Critiquing a single person’s performance
  • No resulting actionable commitments
  • Having no “what we did well” answers; teams need to understand and appreciate their positive as well as negative behaviors and practices

In all of the above situations, it is often easy to trace the root cause of the negativity to a lack of trust and commitment on the part of one or more team members. While there is no silver bullet to address this, Scrum specifically charges the Scrum Master with working toward addressing situations like these.

Although Sprint Retrospectives are powerful and valuable events, they are a commonly discarded element of Scrum. Scrum Teams with recent and regular success tend to rationalize away the need to conduct Sprint Retrospectives. This is rather like a fit person deciding to stop exercising.

The meta-conversation may sound a bit like the following: Six Months after Introducing Scrum
Developer Dave: Quality is up, bugs are down. Morale is high, manual regression cost is low. Since we are doing so well, we don’t need the Sprint Retrospectives to help us improve anymore.
Boss Bob: That sounds reasonable. Cancelling that meeting will save us time that can be spent on adding more features.
Six Months Later
Boss Bob: Quality has dropped and bugs are increasing. Team members are dissatisfied and much of the regression work is being performed manually.
Developer Dave: It’s because of Scrum. We told you that it wasn’t a silver bullet and it obviously doesn’t work.
Boss Bob: True. I’ll find a methodology consultant to implement a new process.Obviously, it wasn’t Scrum that failed here. The organization’s decision to omit a key ingredient of Scrum’s success was the catalyst for failure. Unfortunately this scenario is all too common.

Scrum Teams reaching that most tenuous state of high performance are rare, beautiful, and fragile. Meaningful retrospectives are a significant ingredient in keeping those teams functioning at such high levels. Reflecting upon itself allows the team to self-adjust and achieve even higher levels of performance and product quality. This is the very essence of Kaizen, and core to any real program of improvement.

When retrospectives work, the results are palpable. There is an excitement in the team to try new things. When retrospectives work, these things will inevitable be true:

  • The team achieves measurably higher and higher levels of quality over time
  • Individuals understand their role within the context of the team
  • Actionable commitments are known by all team members

Finally, when Sprint Retrospectives work well, the team grows more focused, productive, and valuable to the organization. Excellent software development teams do not simply appear. They emerge over time and then only by deliberate attention to improvement. Sprint Retrospectives are a key ingredient in that emergence.

Common Pitfalls

  • A retrospective is intended to reveal facts or feelings which have measurable effects on the team’s performance, and to construct ideas for improvement based on these observations. It will not be useful if it devolves into a verbal joust, or a whining session.
  • On the other hand, an effective retrospective requires that each participant feel comfortable speaking up. The facilitator is responsible for creating the conditions of mutual trust; this may require taking into accounts such factors as hierarchical relationships, the presence of a manager for instance may inhibit discussion of performance issues.
  • Being an all-hands meeting, a retrospective comes at a significant cost in person-hours. Poor execution, either from the usual causes of bad meetings (lack of preparation, tardiness, inattention) or from causes specific to this format (lack of trust and safety, taboo topics), will result in the practice being discredited, even though a vast majority of the Agile community views it as valuable.
  • An effective retrospective will normally result in decisions, leading to action items; it’s a mistake to have too few (there is always room for improvement) or too many (it would be impractical to address “all” issues in the next iteration). One or two improvement ideas per iteration retrospective may well be enough.
  • Identical issues coming up at each retrospective, without measurable improvement over time, may signal that the retrospective has become an empty ritual.

Milestone Retrospective

Once a project has been underway for some time, or at the end of the project (in that case, especially when the team is likely to work together again), all of the team’s permanent members (not just the developers) invests from one to three days in a detailed analysis of the project’s significant events.

PEARL XXI : PRISMA: Product Risk Assessment for Agile projects

PEARL XXI : PRISMA: Product  Risk Assessment for  Agile projects

Risk assessment and management is the backbone of sequential development models but how does it fit in agile  environments? How can we be sure to identify new risks  when they emerge and to ensure our understanding of  all risks remains accurate? In agile much emphasis is on  communication. Perfect for development issues where  mistakes can be discussed and fixed. But product risk are  not by nature iterative: they is absolute and exists all the  time and making mistakes in dealing with them may not  acceptable. Hence the discussion and consensus approach  needs to be slightly formalized by the use of a systematic  method and process. That’s where PRISMA comes in

Testing activities can be seen as mitigating product risk. A product risk is defined as a risk which is directly related to a potentially failing product. Risk based testing is an approach for developing and prioritizing tests based upon the impact and likelihood of failure of the functionality to be tested. Likelihood is the chance that the software contains defects (caused by for example poor programming, high complexity, etc.). Impact is an indication of the consequences when the software fails.

PRISMA (Product RISk Management) is an approach for identifying the areas that are most important to test, i.e., identifying the areas that have the highest level of business and/or technical risk. The PRISMA method has been bottom-up developed by Improve Quality Services in practice over a large number of years. PRISMA has been proven to be successful in supporting (test) organizations as they apply risk-based testing.
Today, it is taught at several universities to IT students. The PRISMA approach especially supports the test professional in performing product risk identification and product risk analysis as well as in working in close co-operation with stakeholders.
Product Risk Matrix



The central theme in the PRISMA process is the creation of the so-called product risk matrix. For each product risk identified, the impact of possible defects and the likelihood of these defects occurring is determined. By assigning numeric values to both impact and likelihood, a product risk (test item) can be positioned in the product risk matrix.
The standard risk matrix is divided in four areas each representing a different level and type of risk. A different level and/or type of risk should also imply a different test approach, to be documented in a (master) test plan. The product risk matrix can thus used as a basis for all testing performed in a project.

A picture is often worth more than a thousand words. Presenting risk assessment results in a diagram is usually much more effective than in tabular form with many numbers. The table becomes indecipherable very quickly, and often stakeholders lose themselves in a number based discussion.

Presenting the results of a risk analysis in a matrix format, as in a PRISMA product risk where impact is on the horizontal axis, likelihood is on the vertical axis, and the four quadrants each represent a level and type of risk – generally provides a much better basis for discussing and validating the product risks.
Since risk mitigation is one of main objectives of Agile, an approach such as PRISMA can fit into an Agile development project perfectly. In practice PRISMA has proven to be a relatively light weight approach (unlike some), focused on producing tangible results, e.g., the product risk matrix and a differentiated risk-based test approach. Most often when organizations come from a more traditional environment using a structured testing approach such as TMap, many testing practices are removed from dayto-day practice.

Test management approach (TMap) is a software testing methodology. TMap is a method which combines insights on how to test and what to manage, as well as techniques for the individual test consultant.

One of the testing practices that is still necessary is a product risk assessment which determines where and how to focus the limited test resources to effectively meet the project deadlines.

Where some methods use very detailed approaches for product risk assessment, PRISMA is generally considered relatively light weight and result-driven. In fact, from Agilists experience, most projects that convert to Agile software development keep PRISMA as one of their core testing practices. Note that in Agile the team is explicitly responsible for the quality of the product.

The risk assessment process
How is PRISMA applied in Agile software development?

Risk based testing with “Risk Poker” in agile projects

One of the most frequently asked questions about testing, both in traditional and Agile projects, is: “How much testing should be done”? In some traditional projects managers may want the team to ‘test everything’. They want to be absolutely sure that the system is completely tested before it is released into the market, to prevent problems – or even claims – in production. However, testing the entire system in every possible way is impossible. No organization is willing to spend sufficient resources for ‘exhaustive testing’ and pressure on budget and release schedule will not allow for the required effort.
James Bach, leading proponent of Exploratory Testing, introduced the concept of ‘good enough testing’ in 1997. This concept is helpful in understanding the risk based testing approach. Agile projects are usually not striving to develop ‘the absolute perfect software’.
The concept of ‘a potentially useful version of working product’ in Scrum actually means that the software is working ‘good enough’ to take it into production.

‘Good enough’ in this context is defined as: providing sufficient benefits, having no critical problems and the benefits of releasing now outweigh both the consequences of non-critical problems and delaying the project for further testing.

Risk Poker is an approach for product risk based testing in agile projects. The process of Risk Poker is similar to the way that Planning Poker is done, e.g. in Scrum, except that it will result in risk identification and risk analysis rather than estimations and story points.
What are the reasons for applying risk based testing with Risk Poker in agile projects – and what are the benefits?
1. Most agile methods and frameworks – like Scrum – are time-boxed. Iterations have a fixed duration, so both development and testing activities are by definition limited to a pre-defined timeframe. Risk based testing provides an excellent answer to the problem ‘how much testing’ by ensuring that the most important testing has been done within the available time. Therefore risk based testing is a very suitable approach for time-boxed development methods.
2. Just like Planning Poker, Risk Poker is a team-based activity and decisions are made by achieving consensus.
3. In the Scrum process, Risk Poker can be easily combined with Planning Poker in the Planning meeting. They complement each other, because information about business value from the Product Owner (PO) will provide input for the impact component of product risk, and the question-and-answer game about product risks will be input for estimating the testing effort in Planning Poker.
4. User stories are very suitable entities to be used as ‘risk items’ in a product risk analysis. In agile projects, risk identification comes down to identifying user stories.
5. Agile is all about ‘working software’, Risk Poker is a light-weight approach to achieve the most appropriate balance between sufficient quality and acceptable risk – within the available constraints in time and resources.

Product risks are derived from documents (i.e., the list of backlog items assigned to the next sprint and user stories) and are typically identified in a brainstorm session(s). Of course the approach largely depends on the Agile approach that is being used and the cycle time. Based on agilists experience, longer sprints of four weeks or one month are most common. The sprint team is often also the PRISMA team performing the product risk assessment.

“External” stakeholders are contacted and asked for their input or actively participate in the process. It is usually carried out as a focused meeting, where the team runs through the PRISMA process as described below. At the end of the meeting the team agrees on the product risk matrix and thus the focus of testing.

Risk poker

Prior to the planning meeting, the team should determine which factors influence the quality of the delivered software. Typical factors for likelihood are complexity, new development (level of re-uses), interrelations (number of interfaces), size, technology and (in-)experience of team. The team itself decides which factors for likelihood are to be taken into account during the Risk Poker.
Concerning the impact, influencing factors can be business importance (i.e. selling item), financial damage, usage intensity, external visibility and legal sanctions. The Product Owner will decide (together with stakeholders) which of these factors for impact are to be taken into account.

Risk Planes

Risk Planes

Having the list of product risks, they are now scored (separately for likelihood and impact) using the essentials of the planning poker technique as often practiced in agile projects. Planning Poker is a consensus-based technique for estimating. It is a variation of the Wideband Delphi method.
The PRISMA risk poker is uses the list of product risks (user stories) to be tested and several copies of a deck of cards. The decks have numbered cards and often use the sequence: 0, ½, 1, 2, 3, 5, 8, 13, 20, 40, 100, and optionally a “?” (unsure) and a coffee cup (I need a break). A common variation is not using a deck with numbers but colored cards, e.g., dark green, light green, yellow, orange and red, relating back to the “1 to 5” value set. This is practiced since the meaning of the numbers from the deck often lead to much discussion, and are ambiguous in the PRISMA context, when using them to estimate likelihood and impact.
Each team member receives a deck of cards with varying values (or colors). After a short explanation of the product risk item (user story), the moderator (e.g., a SCRUM Master) calls for an estimate for either likelihood or impact. After a few seconds of contemplation, each team member selects a card, without showing it to the other team members, and at a set time, all show their selected cards. It is important that all cards are shown at once, to prevent ‘peer pressure’ towards a lower or higher number (or color). If the numbers (or colors) are essentially the same, the moderator writes down the median value. If they differ wildly, the lowest estimator and highest estimator briefly explain their choice essentially going back to the PRISMA factors for likelihood and impact. Often then agreement is achieved for a number (or color) based on that discussion. If no agreement
is reached, the moderator, business owner (for impact) or lead developer (for likelihood) act as a tie breaker and chooses a number (or color) from within the range. It is important to move quickly to the next product risk item. Optionally, an egg timer can be used to limit time spent in discussion of each item. One common variation is providing each team member with a limited number of each value or color, and having them ‘use up’ each value card in the process. This prevents the tendency of some people to stick to very
high or very low scores for all product risks.

Protection Poker

Without infinite resources, software development teams must prioritize security fortification efforts to prevent the most damaging attacks. The Protection Poker “game” is
a collaborative means for guiding this prioritization and has the potential to improve software security practices and team software security knowledge.

Playing cards in hand, the software development team members stare silently at their cards. Players glance at each other while pensively considering their options. Grant,
the development manager, announces, “Everybody ready?” and each member lays down a card. At once, the silence erupts into a team-wide conversation of opinions, perspective, and debate.
No, this isn’t your secret lunchtime poker game in the broom closet. Nor is this a naïve “team-building” activity from human resources. The team is playing Protection Poker,1 a new software security “game.”
Protection Poker is an informal game for security risk estimation that leads to proactive security fortification during development and prioritizes security- related validation and verification (V&V). Protection Poker provides structure for collaborative misuse case development and threat modeling that plays off the participants’ diversity of knowledge and perspective. The entire extended development team gets involved—software developers, testers, product managers or business owners, project managers, usability
engineers, security engineers, software security experts, and others. Protection Poker is based on a collaborative effort estimation practice, Planning Poker, which many agile software development teams use. (The “rules” of Planning Poker don’t at all resemble
actual poker’s rules, except that each participant hides his or her cards from the other participants until a designated time. Collocated teams often use special cards to do their estimation that contain only selected values.  The Red Hat IT team utilized the Scrum agile software development methodology and “played” Protection Poker during its biweekly iteration planning meetings over a four-month period.
Protection Poker
Protection Poker is a simple but effective software security game. Its tangible output is a list of each requirement’s relative security risk. The team can use this relative risk to determine the type and intensity of design and V&V effort the development team must include in the iteration for each requirement. The team can then use this list to help prioritize security engineering resources toward software areas with the highest risk of attack based on factors such as how easy the new functionality is to attack and the value
of the data accessed through the functionality. Consequently, the team properly estimates the necessary effort to implement the requirement securely, so it can proactively plan which resources are needed for secure implementation. This prioritization and increased
knowledge should lead a team toward developing more secure software. Protection Poker works best for teams that use an iterative development process with relatively short iterations, as agile software development teams often do

Protection Poker and Planning Poker are Wideband Delphi techniques. (Planning Poker’s creator likely chose its name for the catchy alliteration, whereas the term Wideband Delphi might have seemed less accessible to agile teams.) Wideband Delphi is based on the Delphi practice, developed at the RAND Corporation in the late 1940s for the purpose of making forecasts. With the Delphi practice, participants make estimates individually and anonymously in a preliminary round. They collect, tabulate, and return the first-round results to each participant for a second round, during which they must again make a new forecast regarding the same issue. This time, each participant knows what the other participants forecasted in the first round, but doesn’t know the other participants’ rationale behind those forecasts. The second round typically results in a narrowing of the group’s range in forecasts, pointing to some reasonable middle ground regarding the issue of concern. The original Delphi technique avoided group discussion to enable candid and anonymous input.
Barry Boehm created the Wideband Delphi technique as a variant of the Delphi technique where group discussion occurs between rounds in which participants explain why they’ve chosen their values. Wideband Delphi is useful for coming to some conclusion regarding an issue when the only information available is based more on experience than empirical data.

Inspired by Planning Poker Protection Poker uses the relative measures of ease points and value points in its security risk computation (for example, one requirement is five times easier to attack than another). Team members vote on their estimate of relative measures for ease of attack and asset value.

The team is constrained to nine possible values—for instance, 1, 2, 3, 5, 8, 13, 20, 40, and 100 (which Planning Poker uses)—for ease points and value points.
The game uses these particular values because humans are more accurate at estimating small things; hence more possible small values exist than large ones.5 Additionally, team members can do their estimations more quickly with a limited set of possible values. For
example, why argue over whether a requirement is 40 or 46 times easier to attack than another? At that point, we can only really know that the requirement is “a lot easier” to attack.

Using Protection Poker should reduce vulnerabilities in the product through an overall increase of software security knowledge in the team. We observed four major benefits to using the Protection Poker practice:

Security risk estimate and ranking : Albeit based on relative estimates, Protection Poker quantifies software security risk, which a team can then use to rank requirements. This ranking can help developers plan explicit actions to reduce security risk. The extended team obtains estimates via all members’ expert opinions. Incorporating these opinions leads to improved estimation accuracy,particularly over time.

Adaptation of requirement to reduce security risk. The initial requirement might not reflect the need for security functionality, such as role-based access or logging. Through the extended team’s think-like-an-attacker brainstorming, these needs could surface, and the team can update the requirement accordingly.

Proactive security fortification to reduce security risk. Teams who don’t consider security issues as they develop the software might realize too late that they didn’t allocate enough time in the development schedule to build a secure product, sometimes resorting to shipping a knowingly insecure one. Through Protection Poker, before requirement implementation begins, the extended development team has a chance to brainstorm and decide what explicit actions are necessary to reduce security risk, such as conducting a security inspection or intense input-injection testing for a Web form. The team can plan these explicit actions into the implementation schedule.
Software security knowledge sharing. Protection Poker inspires a structured discussion of security issues that incorporates the extended development team’s diverse perspectives. This discussion improves the team members’ knowledge and awareness of what is required to build security into a product.

PEARL V: A Purview on Techniques for Estimation in Agile s/w Methodology

PEARL V: A Purview on Techniques for Estimation in Agile s/w Methodology

Estimation is one of the most misused elements in all of software development. Estimates should reflect the relative difficulty and length of work to guide planning and prioritization — not commit the team to mandatory Saturdays. With estimates in hand, stakeholders can make smart tradeoffs and reasonable forecasts.

Plans are only as good as the estimates, that the plans are based on, and estimates always come second to actual. The real world has this horrible habit of destroying plans.The customer has the right to an overall plan, to see progress and to be informed of schedule changes, Whereas the developer has the right to make and update his own estimates and to accept responsibility instead of having responsibility assigned to him.

You can’t put 10 pounds of groceries into a 5 pound bag. Forecasting tomorrow’s weather is much more difficult than telling what the weather was like yesterday.
Don’t try to be too sophisticated; estimates will never be anything other than approximate, however hard you try. Most software is not a predictable or mass manufacturing problem. Software development is new product development.

It is rarely possible to create upfront unchanging and detailed specs.Near the beginning, it is not possible to estimate. As empirical data emerge, it becomes increasingly possible to plan and estimate.Adaptive steps driven by build-feedback cycles are required. Creative adaptation to unpredictable change is the norm. Change rates are high.

We can often spend a little time thinking about an estimate and come up with a number that is nearly as good as if we had spent a lot of time thinking about it. we often need to expend just a fraction of that effort to get adequate results.

As effort on estimation increases, the accuracy may decrease after a certain amount of effort on estimation.

Vary the effort you put into estimating according to purpose of the estimate. if  the estimate will be used to make a software build versus buy decision, it is likely enough to determine that the project will take six to twelve months. It may be unnecessary to refine that to the point where you can say it will take seven or eight months.

First, no matter how much effort is invested, the estimate is never at the top of the accuracy that is 90 % to 100 %  accurate.No matter how much effort you put into an estimate, an estimate is still an estimate. No amount of additional effort will make an estimate perfect. It is possible to put too much effort into estimating, with the result being a less accurate estimate.

Agile teams, acknowledge that we cannot eliminate uncertainty from estimates, but they embrace the idea that small efforts are rewarded with big gains. Even though they are less far up the accuracy/effort scale, agile teams can produce more reliable plans because they frequently deliver small increments of fully working, tested, integrated code.


Contrasting Traditional and Agile Estimation Techniques

An average software project begins when a team or person outlines a project and receives approval to go forward. The project may be started by a product manager with an idea for an existing product, or by a customer request, or by the signing of a contract.

In the early stages of a project, someone guesses how long it will take to deliver. This person may be a salesperson, project manager, or development manager. They may make a guess based on their experience, or they may have some quick chats with seasoned employees and solicit their opinions.

When the timeline guess is in place, the project begins. If the project is related to a product, there may be marketing requirements to reference. If the project is for a customer, there may be a statement of work to reference. In either case, it’s common for an analyst team to convert the information into functional specifications.

After the functional specifications are completed, a conversation begins with the development team, designs begin to evolve, and some teams may document a technical design and architectural plan. When this work is complete, the development team provides estimates based on the anticipated approach. The team also estimates their capacity by resource type. Then the estimates, capacity, and known dependencies are entered into a project plan. At this point, the team has a schedule that they feel confident in, and they share it with the stakeholders.

This exercise may take several weeks or months to complete. If a project is timeboxed, the team may find that there isn’t enough time to deliver all the features for which they created functional specifications, designs, and estimates. The team then has to scope back the features for the project to meet the timeline, realizing they’ve wasted valuable time in estimating features that won’t be pursued.

Agile estimation techniques address the shortcomings of this method. You don’t design and estimate all your features until there has been a level of prioritization and you’re sure the features are needed. You used a phased approach to estimation, recognizing that you can be more certain as the project progresses and you learn more about the features.

At a high level, the phased process looks like this:

  1. Estimate the features in a short, time-boxed exercise during which you estimate feature size, not duration.
  2. Use feature size to assign features to iterations and create a release plan.
  3. Break down the features you assigned to the first iteration. Breaking down means identifying the specific tasks needed to build the features and estimating the hours required.
  4. Re-estimate on a daily basis during an iteration, estimating the time remaining on open tasks.

Agile estimating is also different in that you involve the entire team in the estimation process.

Whole Team Estimation

Every year, Best Buy Corporation tries to predict how many gift cards will be sold at Christmas. The typical process is to solicit the opinion of upper management and internal estimation experts to forecast a number.

In 2005, the CEO of Best Buy decided to try an experiment. The CEO followed the normal process for obtaining the estimates but also sent an email to approximately 100 random employees throughout the company, asking them how many gift cards they believed would be sold. The only information provided to both groups was the sales number for the previous year.

After the Christmas season was completed, the predictions of both groups were reviewed. The expert panel was accurate within 95 percent of the actual number of cards sold. The random group of employees was accurate within 99.9 percent of the number of cards sold . How did a random group beat the internal estimation experts?

In his book The Wisdom of Crowds, author James Surowiecki makes a case that a diverse set of independently thinking individuals can provide better predictions than a group of experts. Surowiecki qualifies this assertion by stating that the diversity needs to be in the way a group views problems and the heuristics each individual uses to analyze a problem or question. For example, a person’s age can greatly influence their perspective on an issue.

 Best Buy Corporation realized improved estimation accuracy by querying a large, diverse group of employees. The diverse set of employees consistently delivered better estimates than the in-house estimation experts.

Referenced Screen

Surowiecki’s work draws many parallels to the issues with estimating software development. We often get together a group of specialists or experts to estimate the work that needs to be completed. These experts may be managers or leads who facilitate the work of their various teams. The fact that all the experts may be a part of management limits their diversity in opinion. And the fact that these experts may work together frequently may lead to standardized thinking, also known as groupthink.

In an Agile environment, you increase the accuracy of your feature estimates by estimating the features together as a team. Estimates aren’t limited to managers or leads but also include developers, testers, analysts, DBAs, and architects. The features are viewed from various perspectives, and you merge these perspectives to create a common, agreed-on estimate.

Entire-team estimation has additional benefits beyond diverse opinion. First, you get estimates from people who are closer to the work. Team members’ opinions may be diverse, but they provide better estimates because they know your existing code, architecture, and domains and what it takes to deliver in your environment.

A second benefit is team ownership of the estimate. If a manager provides the estimate, they hope the team supports the estimate and buys into it. If the team provides the estimate, they’re immediately closer to owning the estimate, and they feel more responsible for making the dates they provided.

Moving to team-based estimation isn’t easy. Managers may not welcome additional input, and team members may be reluctant to challenge the experts and instead echo whatever the experts say.

It will take time to overcome these hurdles, but you can do one thing to expedite the change: when you perform team-based estimation, have the meeting facilitated by an indirect manager such as a project manager or ScrumMaster. This person can treat all people as equals regardless of title and proactively query team members who are reluctant to contribute. You can also use the planning poker process discussed in the next section to prevent one person’s estimate from influencing another’s.

Estimates are not created by a single individual on the team. Agile teams do not rely on a single expert to estimate. Despite well-known evidence that estimates prepared by those who will do the work are better than estimates prepared by anyone else (Lederer and Prasad 1992), estimates are best derived collaboratively by the team, which includes those who will do the work. There are two reasons for this.

Size of story is given in “story points” (an abstract unit). The team defines how  a story point translates to effort (typically: 1 story point = 1 ideal day of work). ƒ The number of story points that a team can deliver in an iteration is called  “team velocity”.

Agile Estimation

There are three main concepts  team need to understand to do agile estimation,

  • Estimation of Size gives a high-level estimate for the work item, typically measured using a neutral unit such as story points
  • Velocity tells us how many points this project team can deliver within an iteration;
  • Estimation of Effort translates the size (measured in points) to a detailed estimate of effort typically using the units of Actual Days or Actual Hours. The estimation of effort indicates how long it will take the team member(s) to complete the assigned work item(s).

Estimation of Size

Story Points is a relative measure that can be used for agile estimation of size. The team decides how big a story point is, and based on that size, determines how many story points each work item is. To make estimation go fast, use only full story points, 1, 2, 3, 5, 8, and so on, rather than fractions of a point, such 0.25, or 1.65 story points. To get started, look at 10 or so representative work items, give the smallest the size of one story point, and then go through all other work items and give them a relative story point estimate based on that story point. Note that story points are used for high-level estimates, so do not spend too much time on any one item. This is especially true for work items of lower priority, to avoid wasting effort on things that are unlikely to be addressed within the current iteration.

A key benefit of story points is that they are neutral and relative. Let’s say that Ann is 3 times more productive than Jack. If Ann and Jack agree that work item A is worth 1 story point, and they both think work item B is roughly 5 times as big, they can rapidly agree that work item B is worth 5 points. Ann may however think work item B can be done in 12 hours, while Jack thinks it can be done in 36 hours. That is fine, they may disagree about the actual effort required to do it, but we do not care at this point in time, we only want the team to agree on the relative size. We will later use Velocity to determine how much ‘size’, or how many points, the team can take on within an iteration.

One project team may say that a work item of a certain size is worth 1 point. Another project team would estimate the same sized work item to be worth 5 points. That is fine, as long as you are consistent within the same project. Make sure that the entire team is involved in assessing size, or at least that the same people are involved in all your size estimates, to ensure consistency within your project. We will see how the concept of velocity will fix also this discrepancy in a point meaning different things to different project teams.

You can also use other measures of size, where the most common alternative is Ideal Days.


Velocity is a key metric used for iteration planning. It indicates how many points are delivered upon within an iteration for a certain team and project. As an example, a team planned to accomplish 20 points in the first iteration. At the end of the iteration, they noticed that they only delivered upon 14 points, their velocity was hence 14. For the next iteration, they may plan for fewer points, let’s say 18 points, since they think they can do a little better than in previous iteration. In this iteration, they delivered 17 points, giving them a velocity of 17.

Expect the velocity to change from iteration to iteration. Some iterations go smoother than others, and points are not always identical in terms of effort. Some team members are more effective than others, and some problems end up being harder than others. Also, changes to the team structure, learning new skills, changes to the tool environment, better teaming, or more overhead with meetings or tasks external to the project will all impact velocity. In general, velocity typically increases during the project as the team builds skills and becomes more cohesive.

Velocity compensates for differences between teams in terms of how big a point is. Let’s assume that project team Alpha and project team Beta are equally efficient in developing software, and they run the same project in parallel. Team Alpha, however, assesses all work items as being worth 3 times as many points as team Beta’s estimates. Team Alpha assesses work item A, B, C, and D to correspond to 30 points, and team Beta estimates the same work items to correspond to 10 points. Both teams deliver upon those 4 work items in the next iteration, giving team Alpha a velocity of 30, and team Beta a velocity of 10. It may sound as if team Alpha is more effective, but let’s look at what happens when they plan the next iteration. They both want to take on work item E-H, which team Alpha has estimated to be 30 points, and team Beta as normal has estimated to be 1/3 as many points, or 10 points. Since a team can typically take on as many points as indicated by their velocity, they can both take on all of E-H. The end result is that it does not matter how big a point is, as long as you are consistent within your team.

Velocity also averages out the efficiency of different team members. Let’s look at an example; Let’s assume that Ann always works 3 times as fast as Jack and Jane. Ann will perhaps deliver 9 points per iteration, and Jack and Jane 3 points each per iteration. The velocity of that 3-person team will be 15 points. As mentioned above, Ann and Jack may not agree on how much effort is associated with a work item, but they can agree on how many points it is worth. Since the team velocity is 15, the velocity will automatically translate the point estimate to how much work can be taken on. As you switch team members, or as team members become more or less efficient, your velocity will change, and you can hence take on more or less points. This does however not require you to change the estimate of the size. The size is still the same, and the velocity will help you to calculate how much size you can deliver upon with the team at hand for that iteration.

Estimation of Effort

Estimation of Effort translates the size (measured in points) to a detailed estimate of effort typically using the units of Actual Days or Actual Hours. As you plan an iteration, you will take on a work item, such as detail, design, implement and test a scenario, which may be sized to 5 points. Since this is still a reasonably big work item, break it down into a number of smaller work items, such as 4 separate work items for Detailing, Designing, Implementing and Testing Server portion, and Implementing and Testing Client portion of the scenario. Team members are asked to sign up for the tasks, and then detail the estimate of the actual effort, measured in hours or days, for their tasks. In this case, the following actual estimates were done (with person responsible within parenthesis):

  • Detailing scenario (Ann): 4 hours
  • Designing scenario (Ann and Jack):  6 hours
  • Implementing and Testing Server portion of scenario (Jack): 22 hours
  • Implementing and Testing Client portion of scenario (Ann): 12 hours
  • Total Effort Estimate for Scenario: 44 hours

If other people would be assigned to the tasks, the estimated actual hours could be quite different. There is hence no point doing detailed estimates until you know who will do the work, and what actual problems you will run into. Often, some level of analysis and design of the work item needs to take place before a reasonable estimate can be done. Remember that estimates are still estimates, and a person assigned to a task should feel free (and be encouraged) to re-estimate the effort required to complete the task, so we have a realistic view of progress within an iteration.

First, on an agile project we tend not to know specifically who will perform a given task. Yes, we may all suspect that the team’s database guru will be the one to do the complex stored procedure task that has been identified. However, there’s no guarantee that this will be the case. S/he may be busy when the time comes, and someone else will work on it. So because anyone may work on anything, it is important that everyone have input into the estimate. Second, even though we may expect the database guru to do the work, others may have something to say about her estimate. Suppose that the team’s database guru, Kristy, estimates a particular user story as three ideal days. Someone else on the project may not know enough to program the feature himself, but he may know enough to say, “Kristy, you’re nuts; the last time you worked on a feature like that, it took a lot longer. I think you’re forgetting how hard it was last time.” At that point Kristy may offer a good explanation of why it’s different this time. However, more often than not she will acknowledge that she was indeed underestimating the feature.

The Estimation Scale Studies have shown that we are best at estimating things that fall within one order of magnitude (Miranda 2001; Saaty 1996). Within your town, you should be able to estimate reasonably well the relative distances to things like the nearest grocery store, the nearest restaurant, and the nearest library. The library may be twice as far as the restaurant, for example.  Because we are best within a single order of magnitude, we would like to have most of our estimates in such a range. Two estimation scales are of good success

which are

◆ 1, 2, 3, 5, and 8

◆ 1, 2, 4, and 8

There’s a logic behind each of these sequences. The first is the Fibonacci sequence. Agilists found this to be a very useful estimation sequence because the gaps in the sequence become appropriately larger as the numbers increase. A one- point gap from 1 to 2 and from 2 to 3 seems appropriate, just as the gaps from 3 to 5 and from 5 to 8 do. The second sequence is spaced such that each number is twice the number that precedes it. These nonlinear sequences work well because they reflect the greater uncertainty associated with estimates for larger units of work. Either sequence works well, although  preference is for the first. Each of these numbers should be thought of as a bucket into which items of the appropriate size are poured.

Rather than thinking of work as water being poured into the buckets, think of the work as sand. If you are estimating using 1, 2, 3, 5, and 8, and have a story that you think is just the slightest bit bigger than the other five-point stories you’ve estimated, it would be OK to put it into the five-point bucket.

A story you think is a 7, however, clearly would not fit in the five-point bucket.You may want to consider including 0 as a valid number within your estimation range. Although it’s unlikely that a team will encounter many user stories or features that truly take no work, including 0 is often useful. There are two reasons for this. First, if we want to keep all features within a 10x range, assigning nonzero values to tiny features will limit the size of largest features. Second, if the work truly is closer to 0 than 1, the team may not want the completion of the feature to contribute to its velocity calculations. If the team earns one point in this iteration for something truly trivial, in the next iteration their velocity will either drop by one or they’ll have to earn that point by doing work that may not be as trivial. If the team does elect to include 0 in their estimation scale, everyone involved in the project (especially the product owner) needs to understand that 13 × 0 ≠ 0 .

 Agilists never had the slightest problem explaining this to product owners, who realize that a 0-point story is the equivalent of a free lunch. However, they also realize there’s a limit to the number of free lunches they can get in a single iteration. An alternative to using 0 is to group very small stories and estimate them as a single unit. Some teams prefer to work with larger numbers, such as 10, 20, 30, 50, and 100. This is fine, because these are also within a single order of magnitude. However, if you go with larger numbers, such as 10 to 100, Agilists still recommend that you pre-identify the numbers you will use within that range. Do not, for example, allow one story to be estimated at 66 story points or ideal days and another story to be estimated at 67. That is a false level of precision, and we cannot discern a 1.5% difference in size. It’s acceptable to have one-point differences be-tween values such as 1, 2, and 3. As percentages, those differences are much larger than between 66 and 67.

User Stories, Epics, and Themes

3-tier story sizing estimation is a great approach when managing work from the Portfolio to Program to Project level. Corporate initiatives (investment themes) are implemented by Features. These are coarse grain, high level items used for Product Road Mapping and the Story Point sizing is a very rough estimate of the total effort. As Features get broken down into Epics and then stories, each progressive refinement results in more granular Story Points estimates that can be used for Release and Sprint Planning.

Although in general, we want to estimate user stories whose sizes are within one order of magnitude, this cannot always be the case. If we are to estimate every-thing within one order of magnitude, it would mean writing all stories at a fairly fine-grained level.

For features that we’re not sure we want (a preliminary cost estimate is desired before too much investment is put into them) or for features that may not happen in the near future, it is often desirable to write one much larger user story.

A large user story is sometimes called an epic. Additionally, a set of related user stories may be combined (usually by a paper clip if working with note cards) and treated as a single entity for either estimating or release planning. Such a set of user stories is referred to as a theme.

An epic, by its very size alone, is often a theme on its own. By aggregating some stories into themes and writing some stories as epics, a team is able to reduce the effort they’ll spend on estimating. However, it’s important that they realize that estimates of themes and epics will be more uncertain than estimates of the more specific, smaller user stories.

User stories that will be worked on in the near future (the next few iterations) need to be small enough that they can be completed in a single iteration. These items should be estimated within one order of magnitude. Agilists use the sequence 1, 2, 3, 5, and 8 for this. User stories or other items that are likely to be more distant than a few iterations can be left as epics or themes. These items can be estimated in units beyond the 1 to 8 range. To accommodate estimating these larger items Agilists add 13, 20, 40, and 100 to preferred sequence of 1, 2, 3, 5, and 8.

Deriving an Estimate The three most common techniques for estimating are

◆ Expert opinion

◆ Analogy

◆ Disaggregation

Each of these techniques may be used on its own, but the techniques should be combined for best results.

Expert Opinion

If you want to know how long something is likely to take, ask an expert. At least, that’s one approach. In an expert opinion-based approach to estimating, an expert is asked how long something will take or how big it will be. The expert relies on her intuition or gut feel and provides an estimate. This approach is less useful on agile projects than on traditional projects. On an agile project, estimates are assigned to user stories or other user-valued functionality. Developing this functionality is likely to require a variety of skills normally performed by more than one person. This makes it difficult to find suitable experts who can assess the effort across all disciplines. On a traditional project for which estimates are associated with tasks, this is not as significant of a problem, because each task is likely performed by one person. A nice benefit of estimating by expert opinion is that it usually doesn’t take very long. Typically, a developer reads a user story, perhaps asks a clarifying question or two, and then provides an estimate based on her intuition. There is even evidence that says this type of estimating is more accurate than other, more analytical approaches (Johnson et al. 2000).


An alternative to expert opinion comes in the form of estimating by analogy, which is what we’re doing when we say, “This story is a little bigger than that story.” When estimating by analogy, the estimator compares the story being estimated with one or more other stories. If the story is twice the size, it is given an estimate twice as large. There is evidence that we are better at estimating relative size than we are at estimating absolute size (Lederer and Prasad 1998; Vicinanza et al. 1991). When estimating this way, you do not compare all stories against a single baseline or universal reference. Instead, you want to estimate each new story against an assortment of those that have already been estimated. This is referred to as triangulation. To triangulate, compare the story being estimated against a couple of other stories. To decide if a story should be estimated at five story points, see if it seems a little bigger than a story you estimated at three and a lit- tle smaller than a story you estimated at eight.


Disaggregation refers to splitting a story or feature into smaller, easier-to-estimate pieces. If most of the user stories to be included in a project are in the range of two to five days to develop, it will be very difficult to estimate a single story that may be 100 days. Not only are large things notoriously more difficult to estimate, but also in this case there will be very few similar stories to compare. Asking “Is this story fifty times as hard as that story” is a very different question from “Is this story about one-and-a-half times that one?” The solution to this, of course, is to break the large story or feature into multiple smaller items and estimate those. However, you need to be careful not to go too far with this approach. Not only does the likelihood of forgetting a task increase if we disaggregate too far, but summing estimates of lots of small tasks also leads to problems.

Planning Poker

The best way Agilists have found for agile teams to estimate is by playing planning poker (Grenning 2002). Planning poker combines expert opinion, analogy, and disaggregation into an enjoyable approach to estimating that results in quick but reliable estimates. Participants in planning poker include all of the developers on the team. Remember that developers refers to all programmers, testers, database engineers, analysts, user interaction designers, and so on. On an agile project, this will typically not exceed ten people. If it does, it is usually best to split into two teams. Each team can then estimate independently, which will keep the size down. The product owner participates in planning poker but does not estimate. At the start of planning poker, each estimator is given a deck of cards. Each card has written on it one of the valid estimates.

Each estimator may, for example, be given a deck of cards that reads 0, 1, 2, 3, 5, 8, 13, 20, 40, and 100. The cards should be prepared prior to the planning poker meeting, and the numbers should be large enough to see across a table. Cards can be saved and used for the next planning poker session.

For each user story or theme to be estimated, a moderator reads the description. The moderator is usually the product owner or an analyst. However, the moderator can be anyone, as there is no special privilege associated with the role. The product owner answers any questions that the estimators have. The goal in planning poker is not to derive an estimate that will withstand all future scrutiny.

Rather, the goal is to be somewhere well on the left of the effort line, where a valuable estimate can be arrived at cheaply. After all questions are answered, each estimator privately selects a card representing his or her estimate. Cards are not shown until each estimator has made a selection. At that time, all cards are simultaneously turned over and shown so that all participants can see each estimate. It is very likely at this point that the estimates will differ significantly. This is actually good news. If estimates differ, the high and low estimators explain their estimates. It’s important that this does not come across as attacking those estimators. Instead, you want to learn what they were thinking about.

As an example, the high estimator may say, “Well, to test this story, we need to create a mock database object. That might take us a day. Also, I’m not sure if our standard compression algorithm will work, and we may need to write one that is more memory efficient.” The low estimator may respond, “I was thinking we’d store that information in an XML file—that would be easier than a database for us. Also, I didn’t think about having more data—maybe that will be a problem.”

The group can discuss the story and their estimates for a few more minutes. The moderator can take any notes she thinks will be helpful when this story is being programmed and tested. After the discussion, each estimator re-estimates by selecting a card. Cards are once again kept private until everyone has estimated, at which point they are turned over at the same time. In many cases, the estimates will already converge by the second round. But if they have not, continue to repeat the process. The goal is for the estimators to converge on a single estimate that can be used for the story. It rarely takes more than three rounds, but continue the process as long as estimates are moving closer together. It isn’t necessary that everyone in the room turns over a card with exactly the same estimate written down.

Again, the point is not absolute precision but reasonableness.

Smaller Sessions 

It is possible to play planning poker with a subset of the team, rather than involving everyone. This isn’t ideal but may be a reasonable option, especially if there are many, many items to be estimated, as can happen at the start of a new project. The best way to do this is to split the larger team into two or three smaller teams, each of which must have at least three estimators. It is important that each of the teams estimates consistently.

What your team calls three story points or ideal days had better be consistent with what other team calls the same. To achieve this, start all teams together in a joint planning poker session for an hour or so. Have them estimate ten to twenty stories. Then make sure each team has a copy of these stories and their estimates and that they use them as base-lines for estimating the stories they are given to estimate.

When to Play Planning Poker

Teams will need to play planning poker at two different times. First, there will usually be an effort to estimate a large number of items before the project officially begins or during its first iterations.

Estimating an initial set of user stories may take a team two or three meetings of from one to three hours each. Naturally, this will depend on how many items there are to estimate, the size of the team, and the product owner’s ability to clarify the requirements succinctly. Second, teams will need to put forth some ongoing effort to estimate any new stories that are identified during an iteration. One way to do this is to plan to hold a very short estimation meeting near the end of each iteration. Normally, this is quite sufficient for estimating any work that came in during the iteration, and it allows new work to be considered in the prioritization of the coming iteration.

Alternatively, Kent Beck suggests hanging an envelope on the wall with all new stories placed in the envelope. As individuals have a few spare minutes, they will grab a story or two from the envelope and estimate them. Teams will establish a rule for themselves, typically that all stories must be estimated by the end of the day or by the end of the iteration.

Agilists like the idea of hanging an envelope on the wall to contain unestimated stories. However, they prefer that when someone has a few spare minutes to devote to estimating, he find at least one other person and that they estimate jointly.

Why Planning Poker Works

it’s worth spending a moment on some of the reasons why it works so well. First, planning poker brings together multiple expert opinions to do the estimating. Because these experts form a cross-functional team from all disciplines on a software project, they are better suited to the estimation task than anyone else.

After completing a thorough review of the literature on software estimation, Jørgensen (2004) concluded that “the people most competent in solving the task should estimate it.” Second, a lively dialogue ensues during planning poker, and estimators are called upon by their peers to justify their estimates. This has been found to improve the accuracy of the estimate, especially on items with large amounts of un-certainty (Hagafors and Brehmer 1983).

Being asked to justify estimates has also been shown to result in estimates that better compensate for missing information (Brenner et al. 1996). This is important on an agile project because the user stories being estimated are often intentionally vague. Third, studies have shown that averaging individual estimates leads to better results (Hoest and Wohlin 1998) as do group discussions of estimates (Jørgensen and Moløkken 2002). Group discussion is the basis of planning poker, and those discussions lead to an averaging of sorts of the individual estimates.

Finally, planning poker works because it’s fun.Expending more time and effort to arrive at an estimate does not necessarily increase the accuracy of the estimate. The amount of effort put into an estimate should be determined by the purpose of that estimate. Although it is well known that the best estimates are given by those who will do the work, on an agile team we do not know in advance who will do the work.

Therefore, estimating should be a collaborative activity for the team. Estimates should be on a predefined scale. Features that will be worked on in the near future and that need fairly reliable estimates should be made small enough that they can be estimated on a nonlinear scale from 1 to 10 such as 1, 2, 3, 5, and 8 or 1, 2, 4, and 8. Larger features that will most likely not be implemented in the next few iterations can be left larger and estimated in units such as 13, 20, 40, and 100. Some teams choose to include 0 in their estimation scale. To arrive at an estimate, we rely on expert opinion, analogy, and disaggregation.

A fun and effective way of combining these is planning poker. In planning poker, each estimator is given a deck of cards with a valid estimate shown on each card. A feature is discussed, and each estimator selects the card that represents his or her estimate. All cards are shown at the same time. The estimates are discussed and the process repeated until agreement on the estimate is reached.

Planning for Iterations

During project planning, iterations are identified, but the estimates have an acceptable uncertainty due to the lack of detail at the project inception. This task is repeated for each iteration within a release. It allows the team to increase the accuracy of the estimates for one iteration, as more detail is known along the project.

Ensure that the team commits to a reasonable amount of work for the iteration, based on team performance from previous iterations. Prioritize the work items list before you plan the next iteration. Consider what has changed since the last iteration plan (such as new change requests, shifting priorities of your stakeholders, or new risks that have been encountered).

When the team has decided to take on a work item, it will assign the work to one or several team members. Ideally, this is done by team members signing up to do the work, since this makes people motivated and committed to doing the job. However, based on your culture, you may instead assign the work to team members.

Wall Estimation

Planning poker is a fantastic tool for estimating user stories, but it would take an inordinate amount of time to estimate hundreds of stories, one at a time, using planning poker. If you have a raw backlog filled with hundreds of stories that have not been estimated or prioritized, you’re going to need a faster way to estimate.

Wall Estimation is designed to allow teams to eliminate discussions of 2 versus 3 and 5 versus 8 and instead group things in a purely relative manner along a continuum, at least initially. It also allows stakeholders to give a general prioritization to a large group of stories without getting hung up on whether one story is slightly more important than another.

Example Wall Estimation - Relative Sort

To do Wall Estimation, you must first print your user stories on cards. Then gather your team and stakeholders in a room with a big empty wall (about 14 feet long by 8-10 feet high); Understand two things about the wall:

  • Height determines priority. Stories at the top are higher; stories at the bottom are lower. A story’s priority can be based on ROI, business value, or something as simple as “it’s just important, and I don’t know why.”

  • Width is reserved for size. Stories on the left are smaller; stories on the right are bigger. (You can reverse this and move from right to left if you’re in, say, Japan and it’s more logical.) The important thing is to envision a line going horizontally and one going vertically. Team members and stakeholders should ask themselves, where, relative to the other stories, does this one fit?

The team will use the wall to size all of the stories. The stakeholders will use the wall to prioritize stories. As with planning poker, we’re using relative sizing, but instead of using two reference stories for comparison, the wall becomes the constant. Small story? Move to the left. Big story? Move to the right. Important story? Place it high. A story that we can live without for now? Place it low.

Although the stakeholders do not have to be there while the stories are being estimated, the team does need to be in the room while the stories are being prioritized. The ScrumMaster and product owner must attend both the estimation and prioritization activities.


Although  customers and stakeholders will want to know how big a story is to help them determine its priority, they’ll be much more focused on finding the stories that relate to them and making sure those stories get done. Expect your stakeholders to disagree about priority—your product owner will use this information to help decide the ultimate priority.

Ask the stakeholders to help determine the relative priority of all of these stories by moving these stories up or down inside the taped columns. Remind them that the higher up a story is on the wall, the higher its priority to the business. Set the following rules:

  • If you place a story at the top, be prepared to justify the placement.
  • You may ask each other why one story is more important than another. Feel free to ask each other, “Who moved this one down (or up)?” or to say aloud, “I think this one needs to move. Who wants to disagree?” This enables conversation between the interested parties, without facilitation.
  • If you move a story lower on the wall than someone else did, mark it with a colored dot to alert us.

The biggest benefit to prioritizing as a group is that all the stakeholders can better understand the priorities of various stories. If a discussion goes on too long without resolution, the product owner should collect the card, identify the two stakeholders who cannot agree, and make a note to meet with them privately later.

The exercise could take 2-6 hours, depending on the number of stories and the number of stakeholders. When you are finished, the wall will look something like the picture shown below.

Example Wall Estimation - Priority SortYour wall will break down roughly into four quadrants. The stories in the top left are high priority and small so they’ll end up in the top of the product backlog. The stories in the top right are high priority but are also large. These stories should be broken down soon so they can be brought into upcoming sprints.

Wall Estimation - Four QuadrantsThe lower left quadrant is made up of small stories that are lower in priority. They will likely fall to the bottom of the backlog. The lower right quadrant is filled with large stories that are also lower in priority. These stories are your epics or themes. They’ll eventually need to be broken down into smaller, more manageable stories but not until they rise in priority.

Spend some time looking at the wall as a whole with the group. If a story is in the wrong quadrant, move it. If a high-priority story must be broken down and time allows, do it while everyone is in the room.

At the end of wall estimation, you’ll have the start of a release plan. If you know the team’s historical velocity, you can even supply a rough range of which stories in the upper-left quadrant will be finished.

Estimation is hard because there is so much uncertainty at the beginning of a project. Product Owners and agile project managers try to maximize value early learning by having conversations with their product owners and stakeholders, producing working software, and integrating feedback about that software to get to a releasable state. But even agile projects must provide some estimate of when a set of features will be ready for release.

Quadrus Estimation Methodology

Quadrus is a recognized leader in IT professional services and solutions. Headquartered in Calgary, Alberta, Quadrus has delivered hundreds of successful projects across Western Canada since 1993. They are committed to providing the highest quality service to valued clients.

Quadrus has developed the Quadrus Estimation Methodology which is different for a few key reasons:
QEM recognizes that way in which people estimate – and when forming overall project estimates, QEM employs Monte-Carlo simulation to aggregate individual estimates in a statistically correct  way.
QEM takes into account those project tasks that are often left out of  project estimates. Activities such as requirements clarification, task  management and coordination, meetings, demos, testing and deployment are all included in the project estimate.
QEM recognizes that often not all project requirements are known  up front. QEM provides a mechanism for estimating the percentage of  known requirements (versus the percentage of unknown requirements), and the overall project estimate is appropriate scaled to  reflect the reality of unknown requirements.
QEM recognizes that most people naturally create single-point estimates for tasks or stories – and QEM is able to work with these (less-than-perfect) estimates. The input to QEM consists, quite simply, of the single-point estimate that the developer feels is the most intuitive estimate (the median) together with an uncertainty factor (Low, Medium or High) to indicate the range of the distribution curve to use. If a task seems to have many unknowns or perhaps uses new technologies – or if it may have an element of research or invention – then the uncertainty will be higher than a task that is known and recognized (lower uncertainty).
This uncertainty is very important; recognizing it exists can drastically change how much time we should assume a task can take. One story may have a lower numeric estimate than another but with a greater level of uncertainty could actually end up taking more time.
These inputs are not onerous to produce compared to the inputs demanded by other estimation methodologies that have been tried in the past (e.g., feature points, estimated lines of code, etc.). Together the single-point estimate/uncertainty pair provides enough information to calculate an expected average estimate for the individual story together with a range.

Quadrus has codified the Quadrus Estimation Methodology as a modern AJAX-powered, intuitive and easy to use web application named the Quadrus Estimator. The Quadrus Estimator has many features, such as support for the user-story importance to be ranked using a star-rating and enabling stories to be labeled. The star-rating and labels can be used to manage and filter the list of tasks and also to include or exclude stories from an estimate for “what-if” scenarios based on less important stories or certain features being postponed for future phases.
The simulation also accepts additional inputs to allow additional effort to be counted for which is often overlooked when creating estimates. This is in the form of “Story Glue” to represent non-development time that is still project related and must be accounted for (e.g., meetings, writing reports, demos, etc.).

The Quadrus Estimator calculates the total effort to complete the project in man-months. Along with the effort estimate, a combination of proven industry guidelines and research is applied to identify the ideal balance between resources (people) and duration (timescale). While either can be adjusted in the Estimator, the system shows the cost of favoring the other in terms of additional effort (overall cost), more resources (for faster delivery), or longer timescales (for team-size constraints).

PEARL VI : An epitome on Agile Modeling(AM)

PEARL VI : A epitome on Agile Modeling, (AM) which is a collection of values, principles, and practices for modeling software that can be applied on a software development project in an effective and light-weight manner

Agile Modeling (AM) is a practice-based methodology for effective modeling and documentation of software-based systems. Simply put, Agile Modeling (AM) is a collection of values, principles, and practices for modeling software that can be applied on a software development project in an effective and light-weight manner. As you see in Figure 1 AM is meant to be tailored into other, full-fledged methodologies such as XP or RUP, enabling you to develop a software process which truly meets your needs. In fact, this tailoring work has already been done for you in the form of the Disciplined Agile Delivery (DAD) process framework.

Figure 1. AM enhances other software processes.

Scope of AM

The values of AM, adopting and extending those of eXtreme Programming are communication, simplicity, feedback, courage, and humility. The keys to modeling success are to have effective communication between all project stakeholders, to strive to develop the simplest solution possible that meets all of your needs, to obtain feedback regarding your efforts often and early, to have the courage to make and stick to your decisions, and to have the humility to admit that you may not know everything, that others have value to add to your project efforts.

AM is based on a collection of principles, such as the importance of assuming simplicity when you are modeling and embracing change as you are working because requirements will change over time. You should recognize that incremental change of your system over time enables agility and that you should strive to obtain rapid feedback on your work to ensure that it accurately reflects the needs of your project stakeholders.

You should model with a purpose, if you don’t know why you are working on something or you don’t know what the audience of the model/document actually requires then you shouldn’t be working on it. Furthermore, you need multiple models in your intellectual toolkit to be effective. A critical concept is that models are not necessarily documents, a realization that enables you travel light by discarding most of your models once they have fulfilled their purpose.

Agile modelers believe that content is more important than representation, that there are many ways you can model the same concept yet still get it right. To be an effective modeler you need to recognize that open and honest communication is often the best policy to follow to ensure effective teamwork. Finally, a focus on quality work is important because nobody likes to produce sloppy work and that local adaptation of AM to meet the exact needs of your environment is important.
To model in an agile manner you will apply AM’s practices as appropriate. Fundamental practices include creating several models in parallel, applying the right artifact(s) for the situation, and iterating to another artifact to continue moving forward at a steady pace. Modeling in small increments, and not attempting to create the magical “all encompassing model” from your ivory tower, is also fundamental to your success as an agile modeler.

Because models are only abstract representations of software, abstractions that may not be accurate, you should strive to prove it with code to show that your ideas actually work in practice and not just in theory Active stakeholder participation is critical to the success of your modeling efforts because your project stakeholders know what they want and can provide you with the feedback that you require.

The principle of assume simplicity is a supported by the practices of creating simple content by focusing only on the aspects that you need to model and not attempting to creating a highly detailed model, depicting models simply via use of simple notations, and using the simplest tools to create your models. You travel light by single sourcing information, discarding temporary models and updating models only when it hurts. Communication is enabled by displaying models publicly, either on a wall or internal web site, through collective ownership of your project artifacts, through applying modeling standards, and by modeling with others. Your development efforts are greatly enhanced when you apply patterns gently. Because you often need to integrate with other systems, including legacy databases as well as web-based services, you will find that you need to formalize contract models with the owners of those systems.

With an Agile Model Driven Development (AMDD) (see Figure 2) approach you typically do just enough high-level modeling at the beginning of a project to understand the scope and potential architecture of the system, and then during development iterations you do modeling as part of your iteration planning activities and then take a just in time (JIT) model storming approach where you model for several minutes as a precursor to several hours of coding.

Figure 2. Agile Model Driven Development (AMDD).

Another way to look at Agile Modeling is as a collection of best practices, as you see in Figure 3.

Figure 3. The best practices of Agile Modeling.

Core Principles:

  • Model With A Purpose. Many developers worry about whether their artifacts — such as models, source code, or documents — are detailed enough or if they are too detailed, or similarly if they are sufficiently accurate. What they’re not doing is stepping back and asking why they’re creating the artifact in the first place and who they are creating it for. With respect to modeling, perhaps you need to understand an aspect of your software better, perhaps you need to communicate your approach to senior management to justify your project, or perhaps you need to create documentation that describes your system to the people who will be operating and/or maintaining/evolving it over time. If you cannot identify why and for whom you are creating a model then why are you bothering to work on it all? Your first step is to identify a valid purpose for creating a model and the audience for that model, then based on that purpose and audience develop it to the point where it is both sufficiently accurate and sufficiently detailed. Once a model has fulfilled its goals you’re finished with it for now and should move on to something else, such as writing some code to show that the model works. This principle also applies to a change to an existing model: if you are making a change, perhaps applying a known pattern, then you should have a valid reason to make that change (perhaps to support a new requirement or to refactor your work to something cleaner). An important implication of this principle is that you need to know your audience, even when that audience is yourself. For example, if you are creating a model for maintenance developers, what do they really need? Do they need a 500 page comprehensive document or would a 10 page overview of how everything works be sufficient? Don’t know? Go talk to them and find out.
  • Maximize Stakeholder ROI. Your project stakeholders are investing resources — time, money, facilities, and so on — to have software developed that meets their needs. Stakeholders deserve to invest their resources the best way possible and not to have resources frittered away by your team. Furthermore, they deserve to have the final say in how those resources are invested or not invested. If it was your resources, would you want it any other way? Note: In AM v1 this was originally called “Maximize Stakeholder Investment”. Over time we realized that this term wasn’t right because it sounded like we were saying you needed to maximize the amount of money spent, which wasn’t the message.
  • Travel Light. Every artifact that you create, and then decide to keep, will need to be maintained over time. If you decide to keep seven models, then whenever a change occurs (a new/updated requirement, a new approach is taken by your team, a new technology is adopted, …) you will need to consider the impact of that change on all seven models and then act accordingly. If you decide to keep only three models then you clearly have less work to perform to support the same change, making you more agile because you are traveling lighter. Similarly, the more complex/detailed your models are, the more likely it is that any given change will be harder to accomplish (the individual model is “heavier” and is therefore more of a burden to maintain). Every time you decide to keep a model you trade-off agility for the convenience of having that information available to your team in an abstract manner (hence potentially enhancing communication within your team as well as with project stakeholders). Never underestimate the seriousness of this trade-off. Someone trekking across the desert will benefit from a map, a hat, good boots, and a canteen of water they likely won’t make it if they burden themselves with hundreds of gallons of water, a pack full of every piece of survival gear imaginable, and a collection of books about the desert. Similarly, a development team that decides to develop and maintain a detailed requirements document, a detailed collection of analysis models, a detailed collection of architectural models, and a detailed collection of design models will quickly discover they are spending the majority of their time updating documents instead of writing source code.
  • Multiple Models. You potentially need to use multiple models to develop software because each model describes a single aspect of your software. “What models are potentially required to build modern-day business applications?” Considering the complexity of modern day software, you need to have a wide range of techniques in your intellectual modeling toolkit to be effective (see Modeling Artifacts for AM for a start at a list and Agile Models Distilled for detailed descriptions). An important point is that you don’t need to develop all of these models for any given system, but that depending on the exact nature of the software you are developing you will require at least a subset of the models. Different systems, different subsets. Just like every fixit job at home doesn’t require you to use every tool available to you in your toolbox, over time the variety of jobs you perform will require you to use each tool at some point. Just like you use some tools more than others, you will use some types of models more than others.
  • Rapid Feedback. The time between an action and the feedback on that action is critical. By working with other people on a model, particularly when you are working with a shared modeling technology (such as a whiteboard, CRC cards, or essential modeling materials such as sticky notes) you are obtaining near-instant feedback on your ideas. Working closely with your customer, to understand the requirements, to analyze those requirements, or to develop a user interface that meets their needs, provides opportunities for rapid feedback.
  • Assume Simplicity. As you develop you should assume that the simplest solution is the best solution. Don’t overbuild your software, or in the case of AM don’t depict additional features in your models that you don’t need today. Have the courage that you don’t need to over-model your system today, that you can model based on your existing requirements today and refactor your system in the future when your requirements evolve. Keep your models as simple as possible.
  • Embrace Change. Requirements evolve over time. People’s understanding of the requirements change over time. Project stakeholders can change as your project moves forward, new people are added and existing ones can leave. Project stakeholders can change their viewpoints as well, potentially changing the goals and success criteria for your effort. The implication is that your project’s environment changes as your efforts progress, and that as a result your approach to development must reflect this reality.

You need an agile approach to change management


  • Incremental Change. An important concept to understand with respect to modeling is that you don’t need to get it right the first time, in fact, it is very unlikely that you could do so even if you tried. Furthermore, you do not need to capture every single detail in your models, you just need to get it good enough at the time. Instead of futilely trying to develop an all encompassing model at the start, you instead can put a stake in the ground by developing a small model, or perhaps a high-level model, and evolve it over time (or simply discard it when you no longer need it) in an incremental manner.
  • Quality Work. Nobody likes sloppy work. The people doing the work don’t like it because it’s something they can’t be proud of, the people coming along later to refactor the work (for whatever reason) don’t like it because it’s harder to understand and to update, and the end users won’t like the work because it’s likely fragile and/or doesn’t meet their expectations.
  • Working Software Is Your Primary Goal. The goal of software development is to produce high-quality working software that meets the needs of your project stakeholders in an effective manner. The primary goal is not to produce extraneous documentation, extraneous management artifacts, or even models. Any activity that does not directly contribute to this goal should be questioned and avoided if it cannot be justified in this light.
  • Enabling The Next Effort Is Your Secondary Goal. Your project can still be considered a failure even when your team delivers a working system to your users – part of fulfilling the needs of your project stakeholders is to ensure that your system robust enough so that it can be extended over time. As Alistair Cockburn likes to say, when you are playing the software development game your secondary goal is to setup to play the next game. Your next effort may be the development of the next major release of your system or it may simply be the operations and support of the current version you are building. To enable it you will not only want to develop quality software but also create just enough documentation and supporting materials so that the people playing the next game can be effective. Factors that you need to consider include whether members of your existing team will be involved with the next effort, the nature of the next effort itself, and the importance of the next effort to your organization. In short, when you are working on your system you need to keep an eye on the future.

Supplementary Principles:

  • Content Is More Important Than Representation. Any given model could have several ways to represent it. For example, a UI specification could be created using Post-It notes on a large sheet of paper (an essential or low-fidelity prototype), as a sketch on paper or a whiteboard, as a “traditional” prototype built using a prototyping tool or programming language, or as a formal document including both a visual representation as well as a textual description of the UI. An interesting implication is that a model does not need to be a document. Even a complex set of diagrams created using a CASE tool may not become part of a document, instead they are used as inputs into other artifacts, very likely source code, but never formalized as official documentation. The point is that you take advantage of the benefits of modeling without incurring the costs of creating and maintaining documentation.
  • Open And Honest Communication. People need to be free, and to perceive that they are free, to offer suggestions. This includes ideas pertaining to one or more models, perhaps someone has a new way to approach a portion of the design or has a new insight regarding a requirement; the delivery of bad news such as being behind schedule; or simply the current status of their work. Open and honest communication enables people to make better decisions because the quality of the information that they are basing them on is more accurate.

Figure 4 depicts how the AMDD activities fit into the various iterations of the agile software development lifecycle. It’s simply another way to show that an agile project begins with some initial modeling and that modeling still occurs in each construction iteration.

Figure 4. AMDD Through the Agile Development Lifecycle.

As the name implies, AMDD is the agile version of Model Driven Development (MDD). MDD is an approach to software development where extensive models are created before source code is written. A primary example of MDD is the Object Management Group (OMG)’s Model Driven Architecture (MDA) standard. With MDD a serial approach to development is often taken, MDD is quite popular with traditionalists, although as the RUP/EUP shows it is possible to take an iterative approach with MDD. The difference with AMDD is that instead of creating extensive models before writing source code you instead create agile models which are just barely good enough that drive your overall development efforts. AMDD is a critical strategy for scaling agile software development beyond the small, co-located team approach that we saw during the first stage of agile adoption.

Figure 5 depicts a high-level lifecycle for AMDD for the release of a system. First, let’s start with how to read the diagram. Each box represents a development activity. The envisioning includes two main sub-activities, initial requirements envisioning and initial architecture envisioning. These are done during Inception, iteration being another term for cycle or sprint. “Iteration 0, or Inception”, is a common term for the first iteration before you start into development iterations, which are iterations one and beyond (for that release). The other activities – iteration modeling, model storming, reviews, and implementation – potentially occur during any iteration, including Inception. The time indicated in each box represents the length of an average session: perhaps you’ll model for a few minutes then code for several hours.

Figure 5. The AMDD lifecycle: Modeling activities throughout the lifecycle of a project.

Figure 6 depicts how the AMDD activities fit into the various iterations of the agile software development lifecycle. It’s simply another way to show that an agile project begins with some initial modeling and that modeling still occurs in each construction iteration.

Figure 6. AMDD Through the Agile Development Lifecycle.


The envisioning effort is typically performed during the first week of a project, the goal of which is to identify the scope of your system and a likely architecture for addressing it. To do this you will do both high-level requirements modeling and high-level architecture modeling. The goal isn’t to write detailed specifications, that proves incredibly risky in practice, but instead to explore the requirements and come to an overall strategy for your project. For short projects (perhaps several weeks in length) you may do this work in the first few hours and for long projects (perhaps on the order of twelve or more months) you may decide to invest two weeks in this effort. Agilists highly suggest not investing any more time than this as you run the danger of over modeling and of modeling something that contains too many problems (two weeks without the concrete feedback that implementation provides is a long time to go at risk).

Through initial, high-level modeling you can gain the knowledge that you need to guide the project but choose to wait to act on it.

 Initial Requirements Modeling

For the first release of a system you need to take several days to identify some high-level requirements as well as the scope of the release (what you think the system should do). The goal is to get a good gut feel what the project is all about. For your initial requirements model you need some form of usage model to explore how users will work with your system, an initial domain model which identifies fundamental business entity types and the relationships between then, and an initial user interface modelwhich explores UI and usabilityissues.

your goal is to build a shared understanding, it isn’t to write detailed documentation. A critical success factor is to use inclusive modeling techniques which enable active stakeholder participation.

Initial Architecture Modeling

The goal of the initial architecture modeling effort is to try to identify an architecture that has a good chance of working. This enables you to set a (hopefully) viable technical direction for your project and to provide sufficient information to organize your team around your architecture (something that is particularly important at scale with large or distributed teams).

On the architecture side of things we often create free-form diagrams which explore the technical infrastructure, initial domain models to explore the major business entities and their relationships, and optionally change cases to explore potential architecture-level requirements which your system may need to support one day. In later iterations both your initial requirements and your initial architect models will need to evolve as you learn more, but for now the goal is to get something that is just barely good enough so that your team can get going. In subsequent releases you may decide to shorten Inception to several days, several hours, or even remove it completely as your situation dictates. The secret is to keep things simple. You don’t need to model a lot of detail, you simply need to model enough. If you’re writing use cases this may mean that point-form notes are good enough. If you’re domain modeling a whiteboard sketch or collection of CRC cards is likely good enough. For your architecture a whiteboard sketch overviewing how the system will be built end-to-end is good enough.

Many traditional developers will struggle with an agile approach to initial modeling because for years they’ve been told they need to define comprehensive models early in a project. Agile software development isn’t serial, it’s iterative and incremental (evolutionary). With an evolutionary approach detailed modeling is done just in time (JIT) during development iterations in model storming sessions.

 Iteration Modeling: Thinking Through What You’ll Do This Iteration

At the beginning of each Construction iteration the team must plan the work that they will do that iteration. An often neglected aspect of Mike Cohn’s planning poker is the required modeling activities implied by the technique. Agile teams implement requirements in priority order, see Figure 7, pulling an iteration’s worth of work off the top of the stack. To do this successfully you must be able to accurately estimate the work required for each requirement, then based on your previous iteration’s velocity (a measure of how much work you accomplished) you pick that much work off the stack. For example, if last iteration you accomplished 15 points worth of work then the assumption is that all things being equal you’ll be able to accomplish that much work this iteration. This activity is often referred to as the “planning game” or simply iteration planning.

Figure 7. Agile requirements change management process.

To estimate each requirement accurately you must understand the work required to implement it, and this is where modeling comes in. You discuss how you’re going to implement each requirement, modeling where appropriate to explore or communicate ideas. This modeling in effect is the analysis and design of the requirements being implemented that iteration.

With initial iteration modeling you explore what you need to build so that you can estimate and plan the work for the iteration effectively.

 Model Storming: Just In Time (JIT) Modeling

Agilists experience is that the vast majority of modeling sessions involve a few people, usually just two or three, who discuss an issue while sketching on paper or a whiteboard. These “model storming sessions” are typically impromptu events, one project team member will ask another to model with them, typically lasting for five to ten minutes (it’s rare to model storm for more than thirty minutes). The people get together, gather around a shared modeling tool (e.g. the whiteboard), explore the issue until they’re satisfied that they understand it, then they continue on (often coding). Model storming is just in time (JIT) modeling: you identify an issue which you need to resolve, you quickly grab a few team mates who can help you, the group explores the issue, and then everyone continues on as before. Extreme programmers (XPers) would call modeling storming sessions stand-up design sessions or customer Q&A sessions.

 Executable Specification via Test Driven Development (TDD)

During development it is quite common to model storm for several minutes and then code, following common Agile practices such as Test-First Design (TFD) and refactoring, for several hours and even several days at a time to implement what you’ve just modeled. For the sake of discussion test-driven design (TDD) is the combination of TFD and refactoring. This is where your team will spend the majority of its time. Agile teams do the majority of their detailed modeling in the form of executable specifications, often customer tests or development tests. Why does this work? Because your model storming efforts enable you to think through larger, cross-entity issues whereas with TDD you think through very focused issues typically pertinent to a single entity at a time. With refactoring you evolve your design via small steps to ensure that your work remains of high quality.

TDD promotes confirmatory testing of your application code and detailed specification of that code. Customer tests, also called agile acceptance tests, can be thought of as a form of detailed requirements and developer tests as detailed design. Having tests do “double duty” like this is a perfect example of single sourcing information, a practice which enables developers to travel light and reduce overall documentation. However, detailed specification is only part of the overall picture – high-level specification is also critical to your success, when it’s done effectively. This is why we need to go beyond TDD to consider AMDD.

You may even want to “visually program” using a sophisticated modeling tool such as Rational Software Architect (RSA). This approach requires a greater modeling skillset than is typically found in most developers, although when you do have teams made up with people of these skills you find that you can be incredibly productive with the right modeling tools.

Some, but not all, of the potential models that you may want to create on a software development project include:

  • Acceptance Test
  • Business Rule (Template)
  • Change Case (Template)
  • Class Responsibility Collaborator (CRC) model
  • Constraint
  • Contract model (Template)
  • Data Flow Diagram (DFD)
  • Domain Model
  • Essential/Abstract Use Case (Template)
  • Essential/Abstract User Interface Prototype
  • Feature
  • Free-Form Diagrams
  • Flow Chart
  • Glossary
  • Logical Data Model (LDM)
  • Mind Map
  • Network Diagram
  • Object Role Model (ORM) Diagram
  • Personas
  • Physical Data Model (PDM)
  • Robustness Diagram
  • Security Threat Model
  • System Use Case (Template)
  • Technical Requirement
  • UML Activity Diagram
  • UML Class Diagram
  • UML Communication/Collaboration Diagram
  • UML Component Diagram
  • UML Composite Structure Diagram
  • UML Deployment Diagram
  • UML Interaction Overview Diagram
  • UML Object Diagram
  • UML Package Diagram
  • UML Sequence Diagram
  • UML State Machine Diagram
  • UML Timing Diagram
  • UML Use Case Diagram
  • Usage Scenario
  • User Interface Flow Diagram (Storyboard)
  • User Interface Prototype
  • User Story
  • Value Stream Map

The best practices of AMDD are:

  1. Active Stakeholder Participation. Stakeholders should provide information in a timely manner, make decisions in a timely manner, and be as actively involved in the development process through the use of inclusive tools and techniques.
  2. Architecture Envisioning. At the beginning of an agile project you will need to do some initial, high-level architectural modeling to identify a viable technical strategy for your solution.
  3. Document Continuously. Write deliverable documentation throughout the lifecycle in parallel to the creation of the rest of the solution.
  4. Document Late. Write deliverable documentation as late as possible, avoiding speculative ideas that are likely to change in favor of stable information.
  5. Executable Specifications. Specify requirements in the form of executable “customer tests”, and your design as executable developer tests, instead of non-executable “static” documentation.
  6. Iteration Modeling. At the beginning of each iteration you will do a bit of modeling as part of your iteration planning activities.
  7. Just Barely Good Enough (JBGE) artifacts. A model or document needs to be sufficient for the situation at hand and no more.
  8. Look Ahead Modeling. Sometimes requirements that are nearing the top of your priority stack are fairly complex, motivating you to invest some effort to explore them before they’re popped off the top of the work item stack so as to reduce overall risk.
  9. Model Storming. Throughout an iteration you will model storm on a just-in-time (JIT) basis for a few minutes to explore the details behind a requirement or to think through a design issue.
  10. Multiple Models. Each type of model has it’s strengths and weaknesses. An effective developer will need a range of models in their intellectual toolkit enabling them to apply the right model in the most appropriate manner for the situation at hand.
  11. Prioritized Requirements. Agile teams implement requirements in priority order, as defined by their stakeholders, so as to provide the greatest return on investment (ROI) possible.
  12. Requirements Envisioning. At the beginning of an agile project you will need to invest some time to identify the scope of the project and to create the initial prioritized stack of requirements.
  13. Single Source Information. Strive to capture information in one place and one place only.
  14. Test-Driven Design (TDD). Write a single test, either at the requirements or design level, and then just enough code to fulfill that test. TDD is a JIT approach to detailed requirements specification and a confirmatory approach to testing.


PEARL IX : Refactoring performed to Sustain Application Development Success in Agile Environments

PEARL IX : Refactoring performed to Sustain Application Development Success in Agile Environments

 The term “refactoring” was originally coined by Martin Fowler and Kent Beck which refers to “a change made to the internal structure of software to make it easier to understand and cheaper to modify without altering its actual observable behavior i.e. it is a disciplined way to clean up code that minimizes the chances of introducing bugs and also enables the code to be evolved slowly over time and facilitates taking an iterative and incremental approach to programming and/or design”. Importantly, the underlying objective behind refactoring is to give thoughtful consideration and improve some of the essential non-functional attributes of the software. So, to achieve this, the technique has been broadly classified into following major categories:

1. Code Refactoring (clean-up) : It is intended to remove the unused code, methods, variables etc. which are misleading.
2. Code Standard Refactoring It is done to achieve quality code.

3. Database Refactoring: Just like code refactoring, it is intended to clean (clean-up) or remove the unnecessary and redundant data without changing the architecture.
4. Database schema and  Design Refactoring : This includes enhancing the database schema by leaving the actual fields required by the application.
5. User-Interface Refactoring :  It is intended to change the UI without affecting the underlying functionality.
6. Architecture Refactoring :  It is done to achieve modularization at the application level.

Refactoring is actually a simple technique where you make structural changes to the code in small, independent and safe steps, and test the code after each of these steps just to ensure that you have not changed the behavior – i.e. the code still works the same, but just looks different. Nevertheless, refactoring is intended to fill in some short-cuts, eliminate duplication and dead code, and help ensure the design and logic have been made very clear. Further, it is equally important to understand that, although refactoring is driven by certain good characteristics and shares some common attributes with debugging and/ or optimization, etc., it is actually different because

  •  Refactoring is not all about fixing any bugs.
  •  Again, optimization is not refactoring at all.
  •  Likewise, revisiting and/or tightening up error handling code is not refactoring.
  •  Adding any defensive code is also not considered to be refactoring.
  •  Importantly, tweaking the code to make it more testable is also not refactoring.

Re-factoring Activities – Conceptualized
The refactoring process generally consists of a number of distinct activities which are dealt with in chronological order:

  • Firstly, identify where the software should be refactored, i.e. figure out the code smell areas in the software which might increase the risk of failures or bugs.
  • Next, determine what refactoring should be applied to the identified places based on the list identified.
  • Guarantee that the applied refactoring preserves the behavior of the software. This is the crucial step in which, based on the type of software such as real-time, embedded and safety-critical, measures have to be taken to preserve their behavior prior to subjecting them to refactoring.
  • Apply the appropriate refactoring technique.
  • Assess the effect of the refactoring on the quality characteristics of the software, e.g. complexity, understandability and maintainability, and of the process, e.g. productivity, cost and effort.
  • Ensure the requisite consistency is maintained between the refactored program code and other software artifacts.

Refactoring Steps – Application/System Perspective
The points below clearly summarize the important steps to be adhered to when refactoring an application:
1. Firstly, formulate the unit test cases for the application/ system – the unit test cases should be developed in such a way that they test the application behavior and ensure that this behavior remains intact even after every cycle of refactoring.
2. Identify the approach to the task for refactoring – this includes two essential steps:
– Finding the problem – this is about identifying wheth-er there is any code smell situation with the current piece of code and, if yes, then identifying what the problem is all about.
– Assess/Decompose the problem – after identifying the potential problem assess it against the risks involved.
3. Design a suitable solution – work out what the resultant state will be after subjecting the code to refactoring.
Accordingly, formulate a solution that will be helpful intransitioning the code from the current state to the resultant state.
4. Alter the code – now proceed with refactoring the code without changing the external behavior of the code.
5. Test the refactored code – to ensure that the results and/ or behavior are consistent. If the test fails, then rollback the changes made and repeat the refactoring in different way.
6. Continue the cycle with the aforementioned steps (1) to (5) until the problematic/current code moves to the resultant state.

So, having said about refactoring and its underlying intent, it can be taken up as a practice and can be implemented safely with ease because the majority of today’s modern IDEs (integrated development environments) are inbuilt and equipped with various refactoring tools and patterns which can be used readily to refactor any application/business-logic/middle-tier code seamlessly. However, the situation may not be the same when it comes to refactoring a database, because database refactoring is conceptually more difficult when compared to code refactoring since with code refactoring you only need to maintain the behavioral semantics, whereas with database refactoring you must also maintain information semantics.

Refactoring is the process of clarifying and simplifying the design of existing code, without changing its behavior. Agile teams are maintaining and extending their code a lot from iteration to iteration, and without continuous refactoring, this is hard to do. This is because un-refactored code tends to rot. Rot takes several forms: unhealthy dependencies between classes or packages, bad allocation of class responsibilities, way too many responsibilities per method or class, duplicate code, and many other varieties of confusion and clutter.

Every time we change code without refactoring it, rot worsens and spreads. Code rot frustrates us, costs us time, and unduly shortens the lifespan of useful systems. In an agile context, it can mean the difference between meeting or not meeting an iteration deadline.

Refactoring code ruthlessly prevents rot, keeping the code easy to maintain and extend. This extensibility is the reason to refactor and the measure of its success. But note that it is only “safe” to refactor the code this extensively if we have extensive unit test suites of the kind we get if we work Test-First. Without being able to run those tests after each little step in a refactoring, we run the risk of introducing bugs. If you are doing true Test-Driven Development (TDD), in which the design evolves continuously, then you have no choice about regular refactoring, since that’s how you evolve the design.

Code Hygiene

A popular metaphor for refactoring is cleaning the kitchen as you cook. In any kitchen in which several complex meals are prepared per day for more than a handful of people, you will typically find that cleaning and reorganizing occur continuously. Someone is responsible for keeping the dishes, the pots, the kitchen itself, the food, the refrigerator all clean and organized from moment to moment. Without this, continuous cooking would soon collapse. In your own household, you can see non-trivial effects from postponing even small amounts of dish refactoring: did you ever try to scrape the muck formed by dried Cocoa Crispies out of a bowl? A missed opportunity for 2 seconds worth of rinsing can become 10 minutes of aggressive scraping.

Specific “Refactorings”

Refactorings are the opposite of fiddling endlessly with code; they are precise and finite. Martin Fowler’s definitivebook on the subject describes 72 specific “refactorings” by name (e.g., “Extract Method,” which extracts a block of code from one method, and creates a new method for it). Each refactoring converts a section of code (a block, a method, a class) from one of 22 well-understood “smelly” states to a more optimal state. It takes awhile to learn to recognize refactoring opportunities, and to implement refactorings properly.

Refactoring to Patterns

Refactoring does not only occur at low code levels. In his recent book, Refactoring to Patterns, Joshua Kerievsky skillfully makes the case that refactoring is the technique we should use to introduce Gang of Four design patterns into our code. He argues that patterns are often over-used, and often introduced too early into systems. He follows Fowler’s original format of showing and naming specific “refactorings,” recipes for getting your code from point A to point B. Kerievsky’s refactorings are generally higher level than Fowler’s, and often use Fowler’s refactorings as building blocks. Kerievsky also introduces the concept of refactoring “toward” a pattern, describing how many design patterns have several different implementations, or depths of implementation. Sometimes you need more of a pattern than you do at other times, and this book shows you exactly how to get part of the way there, or all of the way there.

The Flow of Refactoring

In a Test-First context, refactoring has the same flow as any other code change. You have your automated tests. You begin the refactoring by making the smallest discrete change you can that will compile, run, and function. Wherever possible, you make such changes by adding to the existing code, in parallel with it. You run the tests. You then make the next small discrete change, and run the tests again. When the refactoring is in place and the tests all run clean, you go back and remove the old smelly parallel code. Once the tests run clean after that, you are done.

Refactoring Automation in IDEs

Refactoring is much, much easier to do automatically than it is to do by hand. Fortunately, more and more Integrated Development Environments (IDEs) are building in automated refactoring support. For example, one popular IDE for Java is eclipse, which includes more auto-refactorings all the time. Another favorite is IntelliJ IDEA, which has historically included even more refactorings. In the .NET world, there are at least two refactoring tool plugins for Visual Studio 2003, and we are told that future versions of Visual Studio will have built-in refactoring support.

To refactor code in eclipse or IDEA, you select the code you want to refactor, pull down the specific refactoring you need from a menu, and the IDE does the rest of the hard work. You are prompted appropriately by dialog boxes for new names for things that need naming, and for similar input. You can then immediately rerun your tests to make sure that the change didn’t break anything. If anything was broken, you can easily undo the refactoring and investigate.


Add Parameter

A method needs more information from its caller.

Add a parameter for an object that can pass on this information.

Customer                               Customer    
getContact()                                              getContact(data)

inverse of Remove Parameter

Naming: In IDEs this refactoring is usually done as part of “Change Method Signature”

Refactoring a Database – a Major and Typical Variant of Refactoring
“A database refactoring is a process or act of making simple changes to your database schema that improves its design while retaining both its behavioral and informational semantics.
It includes refactoring either structural aspects of the database such as table and view definitions or functional aspects such as stored procedures and triggers etc. Hence, it can be often thought of as the way to normalize your database schema.”
For a better understanding and appreciation of the concept,
let us consider the example of a typical database refactoring technique named Split Column, in which you replace a single table column with two or more other columns. For example, you are working on the PERSON table in your database and figure out that the DATE column is being used for two distinct purposes. a) to store the birth date when the person is a customer and b) to store the hire date when the person is an employee. Now, there is a problem if we have a requirement with the application to retrieve a person who is both customer and employee. So, before we proceed to implement and/or simulate such new requirement, we need to fix the database schema by replacing the DATE column with equivalent BirthDate and HireDate columns. Importantly, to maintain the behavioral semantics of the database schema we need to update all the supporting source code that accessed the DATE column earlier to now work with the newly introduced two columns. Likewise, to maintain the informational semantics we need to write a typical migration script that loops through the table, determines the appropriate type, and then copies the existing date data into the appropriate column.

Classification of Database Refactoring
The database refactoring process is classified into following
major categories:
1. Data quality – the database refactoring process which largely focuses on improving the quality of the data and information that resides within the database. Examples include introducing column constraints and replacing the type code with some boolean values, etc.
2. Structural – as the name implies this database refactoring process is intended to change the database schema.
Examples include renaming a column or splitting a column etc.
3. Referential Integrity – this is a kind of structural refactoring which is intended to refactor the database to ensure referential integrity. Examples include introducing cascading delete.
4. Architectural – this is a kind of structural refactoring which is intended to refactor one type of database item to another type.
5. Performance – this is a kind of structural refactoring which is aimed at improving the performance of the database. Examples include introducing alternate index to fasten the search during data selection.
6. Method – a refactoring technique which is intended to change a method (typically a stored procedure, stored function or trigger, etc.) to improve its quality. Examples include renaming a stored procedure to make it easier to refer and understand.
7. Non-Refactoring Transformations – this type of refactoring technique is intended to change the database schema that, in turn, changes its semantics. Examples include
adding new column to an existing table.
Why isn’t Database Refactoring Easy?
Generally, database refactoring is presumed to be a difficult and/or complicated task when compared to code refactoring. not just because there is the need to give thoughtful consideration to the behavioral and information semantics, but due to a distinct attribute referred to as coupling. The term coupling is understood to be the measure of the degree of the dependencies between two entities/items. So, the more coupling there is between entities/items, the greater the likelihood that a change in one will require a change in another. Hence, it is understood that coupling is the root cause of all the issues when it comes to database refactoring, i.e. the more things that your database is coupled to, the harder it is to refactor. Unfortunately, the majority of relational databases are coupled to a wide variety of things as mentioned below:

■ Application source code
■ Source code that facilitates data loading
■ Code that facilitates data extraction
■ Underlying Persistent layers/frameworks that govern the overall application process flow
■ The respective database schema
■ Data migration scripts, etc.

Refactoring Steps – Database Perspective
Generally, the need to refactor the database schema will be identified by a application developer who is actually trying to implement a new requirement or fix a defect. Then the application developer describes the required change to the concerned DBA of the project and then refactoring begins. Now, as part of this exercise, the DBA will typically work through all or a few of the following steps in chronological order:
1. Most importantly, verify whether database refactoring is required or not – this is the first thing that the DBA does, and it is where they will determine whether database refactoring is needed and/or if it is the right one to perform. Now the next important thing is to assess the overall impact of the refactoring.

2. If it is inevitable, choose the most appropriate database refactoring – this important step is about having several choices for implementing new logic and structures within a database and choosing the right one.

3. Deprecate the original schema – this is not a straightforward step, because you cannot simply make a change retaining the behavior. to the database schema instantly. Instead, adopt an approach that will work with both the old and the new schema in parallel for a while to provide the required time for the other team to both refactor and redeploy their
4. Modify the schema – this step is intended to make the requisite changes to the schema and ensure that the necessary logs are also updated accordingly, e.g. database change log which is typically the source code for implementing all database schema changes and update log which contains the source code for future changes to the database schema.
5. Migrate the data – this is the crucial step which involves migrating and/or copying the data from old versions of the schema to the new.
6. Modify all related external programs – this step is intended to ensure that all the programs which access the portion of database schema which is for the subject of refactoring must be updated to work with the new version of the database schema.
7. Conduct regression test – once the changes to the application code and database schema have been put in place, then it is good to run the regression test suite just to ensure that everything is right and working correctly.
8. Keep the team informed about the changes made and version control the work – this is an important step because the database is a shared resource and it is minimally shared by the application development team. So, it is the prime responsibility of the DBA to keep the team informed about the changes made to the database. Nevertheless, since database refactoring definitely includes some DDLs, change scripts, data migration scripts, data models related scripts, test data and its generation code, etc., all these scripts have to be put under configuration management by checking them into a version control system for better versioning, control, and consistency.

Once the database schema has been refactored successfully in the application development sandbox (a technical environment where your software, including both your application code and database schema, are developed and unit tested), the team can go ahead with refactoring the requisite Integration, Test/QA, and Production sandboxes as well, to ensure that the changes introduced are available and uniform across all environments.

Refactor Unit Tests

Unit test the current and rewritten code

Unit tests are tests to test small sections of the code. Ideally each test is independent, and stubs and drivers are used to get control over the environment. Since refactoring deals with small sections of code, unit tests provide the correct scope.

Refactor code that has no existing unit tests

When you work with very old code, in general you do not have unit tests. So can you just start refactoring? No, first add unit tests to the existing code. After refactoring, these unit tests should still hold. In this way you improve the maintainability of the code as well as the quality of the code. This is a complex task. First you need to find out what the functionality of thecode is. Then you need to think of test cases that properly cover the functionality. To discover the functionality, you provide several inputs to the code and observe the outputs. Functional equivalence is proven when the code is input/output conformant to the original code.

Refactor to increase the quality of the existing unit tests You also see code which contains badly designed unit tests. For example, the unit test verifies multiple scenarios at once. Usually this is caused by not properly decoupling the code from its dependencies . This is undesirable behaviour because the test must not depend on the state of the environment. A solution is to refactor the code to support substitutable dependencies. This allows the test to use a test stub or mock object. The unit test is split into three unit tests which test the three scenarios separately. The rewritten code has a configurable time provider. The test now uses its own time provider and has complete control over the environment.

Every change in the code needs to be tested. Therefore testing  is required when refactoring. You test the changes at different  levels. Since a small section of code is changed, unit testing  seems the most fitting level. But do not forget the business  value! Regression testing is of vital importance for the business.

Test-driven development (TDD)

Test-driven development (TDD) is an advanced technique of using automated unit tests to drive the design of software and force decoupling of dependencies. The result of using this practice is a comprehensive suite of unit tests that can be run at any time to provide feedback that the software is still working. This technique is heavily emphasized by those using Agile development methodologies

The motto of test-driven development is “Red, Green, Refactor.”

  • Red: Create a test and make it fail.
  • Green: Make the test pass by any means necessary.
  • Refactor: Change the code to remove duplication in your project and to improve the design while ensuring that all tests still pass.

The Red/Green/Refactor cycle is repeated very quickly for each new unit of code.

Key Benefits of Re-factoring
From a system/application standpoint, listed below are summaries of the key benefits that can be achieved seamlessly when implementing the refactoring process in a disciplined fashion:

  • Firstly, it improves the overall software extendability.
  • Reduces and optimizes the code maintenance cost.
  • Facilitates highly standardized and organized code.
  • Ensures that the system architecture is improved by retaining the behavior.
  • Guarantees three essential attributes: readability, understandability, and modularity of the code.
  • Ensures constant improvement in the overall quality of the system.

Justifying the refactoring task might be very difficult, but not impossible. Here are the tips for justifying the need for refactoring.
1. Future business changes will require less time. Refactoring will not give an immediate return but, in the long run, adding features will be less expensive as the code will become easier to maintain. Before refactoring, the code is fit for machine consumption but after refactoring it is fit for human as well as machine consumption.
2. Bugs will be fixed during refactoring. Hidden bugs or logics embedded in complicated unnecessary loops will be exposed, which might result in fixing some longstanding
non-reproducible issues.
3. The current application will have a longer life. Prevention is better than cure. Refactoring can be considered to be a prevention exercise which will help to optimize the structure of the application for future enhancements.
4. There might be performance gains. You cannot promise any apparent or measurable performance gain. But if you are planning to do refactoring to achieve some performance gain, then you should have measurable counters showing the performance of the current app before you start refactoring. And after each change the performance counters should be recalculated to check the optimization.Refactoring may result in a reduction in the lines of code, making it less expensive to maintain in the long run. During refactoring of your algorithm, you should follow the DRY (Don’t Repeat Yourself) principle. Any application
that has survived for 6 months to 1 year will have ample places to remove duplication of code.

Developers do not use the full potential of the refactoring tools available on the market.
This might be due to a lack of knowledge or pressure of timelines. During refactoring, these tools are extremely helpful and valuable as they reduce the chances of intro- ducing an error when making big changes

  • Resharper VIsual Studio Add on for .NET
  • XCode for Objective C #
  • iNTELLIJ idea For Java

Refactoring using the right tools and good software development practices will be a boon for any application’s long life and sustenance. Refactoring is an opportunity to solidify the foundation of an existing application that might have become weaker after adding a lot of changes and enhancements. If you are making changes to the same piece of code for the third time, it means there is some technical debt that you have created and there is a need to refactor this code.

PEARL XV : Beyond Scrum : A Scalable Agile Framework with Continuous Integration

PEARL XV : Beyond Scrum :  A Scalable Agile Framework with Continuous Integration using Test Automation and Build for Large Distributed Agile Projects.

scalable agile process

Scrum is the most popular Agile technique, but it doesn’t scale well. And while Scrum improves the effectiveness of individual teams, productivity gains fall off sharply on large projects with many teams.

Yet Agile methods have been used in very large commercial and open source projects to increase productivity, quality and release frequency. Here are a couple of examples:

  • Facebook incorporates code from 600 developers while delivering two releases per day.
  • The Google Android project utilizes thousands of contributors spread across the world.
  • Flickr reached a level of 10 deployments per day.

Can we learn from these companies and projects? What types of techniques and tools did they use to achieve those results?

This section will cover the problems with Scrum, changes in approach to help scalability and methods for supporting distributed teams and continuous delivery.

Problems With Scrum
Scrum techniques have been very successful in improving the effectiveness of individual development teams, employing concepts like self-directed co-located teams, time-boxed sprints, and regular customer feedback from working software. Yet many organizations have run into obstacles when trying to apply Scrum techniques to large projects. For example, no effective techniques have evolved to coordinate the work of multiple Scrum teams and manage dependencies among them. The “Scrum of Scrums” approach of holding meetings with representatives of every team becomes increasingly time-consuming and unwieldy as the number of teams multiply.

In part, this is because some of the assumptions underlying Scrum are too restrictive for large organizations, or clash with business requirements. Many groups refuse to be limited to co-located teams because they are already distributed, they want to take advantage of the global market for development talent, or simply because many of their employees work from home several days a week.

Many groups need to share key personnel such as architects, UI designers, and database specialists across many projects, and cannot assign them to a single team. Other companies need to fix bugs and release new functionality more frequently than the 2-8 week cycles typical of Scrum teams. This is particularly true of those providing web- and cloud-based applications, where customers expect a constant flow of enhancements.

This is not to say that Scrum practices have no place in a large development environment. But it is now clear that many organizations need to go “Beyond Scrum” and find new practices to manage distributed contributors and large, complex projects.

Changes in Approach That Help Scalability
The large commercial and open source projects that have successfully scaled Agile typically depart from conventional Scrum practices in several areas.

No Scrum meetings: Sprint planning meetings, retrospectives and Scrum-of-Scrum meetings are time-consuming, usually require that everyone be in one room, and usually don’t do a very good job of coordinating across teams. That’s why large-scale projects find ways to use online collaboration and planning tools to coordinate work within and across teams, with fewer meetings and conference calls.

“Pull,” “continuous flow,” and “publish what’s ready”: Although Scrum practices are far more agile than the waterfall methods they replaced, they still impose a degree of inflexibility. Once a sprint plan is complete, the features to be delivered are fixed, and so is the time frame to deliver them (usually 2-8 weeks).

Scalable projects typically use pull and continuous flow techniques (especially Kanban), so developers are always working on the highest priority tasks, and new functionality can be released as soon as it is ready.

Code review workflows (long used by open source projects) can be used to select contributions from hundreds of contributors and publish what’s ready. By helping organizations scale to more contributors and release more frequently, code review workflows can become a key building block of Scalable Agile.

Diverse contributors: Classic Scrum practices are designed for co-located teams of 8-10 members. But, in reality, large projects need to incorporate work from individual contributors, shared resources (e.g., architects and DBAs), outsourcing companies, and business partners as well as teams. Collaboration tools and code review workflows are central to meshing the work of these diverse contributors.

A Scalable Agile Process Framework
The question, then, is how can we apply the new approaches as a coherent whole?

Instead of each Scrum team having its own backlog, product management (or product owners) maintain a single project-wide backlog, with tasks sorted in priority order.

At any time, contributors can pull tasks from the top of the backlog into their own “Current Work” list. Ideally they will pull the highest-priority task, but if they do not have the necessary skills or resources they can move down the stack to find another high-priority assignment.

This process ensures that the highest-priority tasks are addressed first. If an urgent bug fix or a key customer feature request is placed at the top of the backlog it will receive immediate attention, instead of waiting for the next sprint.

Contributors can be individuals, teams, departments, or even entire organizations like an outsourcing firm or a business partner. There is no expectation or requirement that the tasks be done by Scrum teams of 8-10 members. This allows organizations to call on the talents of all kinds of individuals and companies, and in fact conforms to the reality of most large projects today.

Tasks are then managed using Kanban or lean principles. Typically this means that each person is working on one task at a time (i.e., the team has a work-in-process limit of one task for each person on the team).

Kanban principles ensure that once tasks are started they are completed as quickly as possible, which means that they can be released sooner, and also that other tasks which depend on the first task can be started sooner.

When tasks are completed, the contributor pulls in on-demand resources to build and test a release with the new code. This provides immediate feedback to the contributors, and allows them to catch and fix bugs right away. It also makes features available faster, because there is no wait for centralized build and test systems.

Finally, once new code submissions have been tested successfully, they can be pulled through a merge process into a staging area or into a final build. This means that a new version of the software can be assembled and released at any time, with whatever bug fixes and enhancements are available at that moment.

What Does This Accomplish?
How exactly does this Scalable Agile process framework address the shortcomings of Scrum and provide more scalable, responsive development efforts? Here are a few of the advantages:

  • There can be many types of contributors, including (but not limited to) conventional Scrum teams.
  • There is no need to spend time estimating tasks precisely, doing detailed sprint planning, or having long meetings to coordinate assignments across teams. As long as the backlog is maintained in priority order the highest-priority tasks will be addressed first.
  • Once tasks are started they are completed in the least possible time, meaning they can be released faster and dependent tasks can be started sooner.
  • Software quality is better, because test feedback is available as soon as a task is complete. Bugs can be fixed when it is clear what changes caused the problem, and when the code is fresh in the mind of the developer. Also, quality assurance does not become a bottleneck, a situation which often leads organizations to cut corners on testing (leading to yet more quality problems).
  • New versions of the application can be assembled and released at any time according to business demand. With sufficient automation this can be daily or even several times a day.

Scrum is the most popular Agile techniques, but it doesn’t scale well. And while Scrum improves the effectiveness of individual teams, productivity gains fall off sharply on large projects with many teams.

In the earlier section, methods on how to apply agile techniques to distributed teams and large projects was dealt. In the following section,  Tools and techniques for managing distributed teams will be addressed.

Some of the processes and tools needed to manage the Scalable Agile process framework are addressed in the following paragraphs.

The first of these “building blocks” is support for distributed teams. Large development organizations are almost always distributed because they have (1) business units and business partners in multiple locations, (2) “outsourcing” groups in different countries, (3) decided to take advantage of the global market for development talent, (4) remote employees who work from home.

So how can organizations support distributed teams well enough to reduce the need for face-to-face meetings?

Online Agile Planning
Online tools can replace paper-and-pencil planning exercises and physical whiteboards. This allows team members worldwide to create and maintain an overall project backlog, pull tasks to individual teams and contributors, move tasks through the steps in a Kanban process, and view tasks ready to be pulled into a release.

The  an online Agile planning tool is used for managing a central backlog and pulling tasks into “current” task buckets for individual teams, and an online card wall that can replace the physical variety.

online planning tools
Online planning tools can replace paper plans and physical white boards.

Online Collaboration

Development team members can collaborate most easily when they are in the same room, but online tools can provide a close approximation. Such tools include online standup reports, wikis, chat and IM products, and video and teleconferencing systems.

Another type of online tool, an activity stream gives developers real-time visibility into the activities of other team members—activities like code commits, new tickets, comments added to tickets, code reviews and posts on wikis.

activity stream

An activity stream shows commits, comments, and other events.

Global Code Management

Global collaboration can be undermined if developers need to share large repositories and large files over long distances and performance is slow.

Some of the technologies that Tool vendor Perforce uses, like proxies and replication, ensure that files are available immediately in remote locations . These solutions ensure that data is available where needed without artificial boundaries that impede sharing and collaboration.

Perforce technologies ensure that distributed team members don’t have to wait to get large repositories and files.

Decentralized Code Management
Developers often want highly decentralized code management so they can create their own local test branches and work independent of centralized corporate resources.

Development managers, however, want to maintain control over and visibility into activities at remote locations.

Git Fusion from Perforce answers both needs. Developers can quickly clone their own repositories and work in private Git repositories on their local systems, with easy code sharing between teams and products.

select directories

Release managers can make selected directories visible to Git users.

Release managers can model an entire product development effort with Perforce streams and branches, apply access controls, and control how much history and which files are cloned into new Git repositories . As changes are accepted, the enterprise release model guides changes to the right places: older releases, customizations, and parallel development efforts.

When developers commit code to the Perforce repository, the Perforce shared versioning service makes the changes visible to everyone and maintains a strong system of record about the source and nature of all changes.

The earlier paragraphs described the challenges of scaling large and distributed Agile teams, and investigated the tools and strategies that resolve them. Once the problem of scaling Agile development has been addressed, however, pressure is then applied next to the people and processes that are tasked with delivering or deploying the resulting product. Agile workflow is only successful once efficiency is attained in all stages of the workflow. This paragraph will cover the second building block of Scalable Agile: Continuous Delivery.


ScrumBan is a relatively easy first step for Scrum teams that want to move in the direction of Continuous Delivery.

Teams using ScrumBan work within a time-boxed sprint. But unlike conventional Scrum practices, a work-in-process limit is adopted, so team members are focused on finishing one task at a time. At a certain point in the sprint a “triage” process identifies which tasks can be completed within the time box, and drops the others from the sprint plan. At that point there is a “feature freeze,” and the remainder of the sprint is devoted to completing the tasks specified by the triage process.


ScrumBan represents a step toward Continuous Delivery because it emphasizes completing a small number of tasks as quickly as possible. Development teams avoid the pitfalls involved in pulling out all the stops to deliver the entire sprint plan, regardless of the cost in terms of quality and delays.

On-Demand Merge and Test by Contributor
The conventional software release process creates a huge bottleneck at the test phase. All teams send their contributions to a central QA team, which creates and tests a “release candidate.”

In traditional release processes, the QA lab becomes a serious bottleneck.

In theory this workflow makes very efficient use of the QA team and test systems, however:

  • It takes a long time to run all of the tests.
  • It is hard to debug and troubleshoot many code changes at once, especially if they may be interacting with each other.
  • Errors uncovered during the integration phase may require costly rework by several contributors.
  • The test lab becomes a huge bottleneck near the end of each sprint, causing stress and leading to sloppy testing practices.
  • Releases are delayed until the entire release candidate has been completely tested and debugged.

But what if each team can build and test based on just its own contributions?

There can be a different approach to testing. In this scenario each team and contributor has access to test resources. QA team members act as advisors and facilitators rather than being charged with managing all of the testing themselves. When a development team finishes a set of changes, the team then pulls a copy of the production version onto the test system. Changes can then be merged into this private production version, built and tested locally.
Each team pulls a product version, merges its changes, and performs its own tests.

team pulls

If testing uncovers problems, these can be solved right away by development, without worrying about interactions with changes from other teams. Multiple teams and contributors can now test and debug contributions independently, and submit them to the central staging area when ready
Teams submit tested contributions when they are ready; releases can be assembled at any time.

The advantages of this approach include:

  • QA is no longer a bottleneck—teams test independently, when they are ready.
  • It is easier to debug and troubleshoot problems, because each group is observing only its own changes to a previously tested production version.
  • Releases can be assembled at any time, constructed from whatever contributions have been tested and submitted.

These capabilities are not easy or cheap to implement. They require a considerable investment in automated build and test environments, which for distributed teams must be provided in the cloud. They also require that code merge and management be a fast and easy part of any developer’s daily work. A simple and easy-to-automate merge framework like Perforce Streams  provides merge notifications, merge pathway guidance, and intuitive tools.

In a complicated project consisting of several components, Perforce’s visibility over any part of the project also helps development teams share and reuse code. These teams can quickly adapt to a changing project structure, even if they are working in distributed repositories via Git Fusion. A merge in this environment would never require a complicated action that spans several independent repositories.

These capabilities are an indispensable aspect of Scalable Agile, because they allow very large numbers of teams to contribute to a project without overwhelming build and test resources.

Code Review Workflow
Another major challenge for Continuous Delivery is how to merge a growing number of contributions into production releases. How can you organize the flow of contributions from many sources? How can you decide when to assemble the next release? How do you avoid creating a bottleneck at the point where the contributions come together?

One very useful method is a code review workflow similar to those used in open source projects. In these projects hundreds of contributors might submit code and thousands might test it. Typically a core group of “maintainers” reviews submissions and selects the ones that will be included in the next release.

A code review workflow can be utilized in commercial environments as well. for example, the Assembla ticketing tool includes a merge request feature that allows contributors to submit code changes for review. Designated reviewers can review the submissions, hold online discussions about them, vote for or against accepting them, and make immediate decisions to accept or reject them

This code review workflow lets organizations manage the code review process and delegate the decision making for accepting contributions and assembling releases, which prevents these activities from becoming bottlenecks.

A code review workflow allows designated reviewers to vote on which contributions to include in the next release.

Streams for Managing Multiple Versions
Another common challenge among large projects is maintaining multiple releases and managing custom versions for individual customers.

Software vendors, for example, usually need to support several releases of an application at once. Bug fixes might need to be applied to many (but not all) of the supported releases. Enhancements to the current release might be retrofitted to the previous release and added to the upcoming release under development. Similarly, a service provider or enterprise IT department might be maintaining customized versions of an application for different customers or different business units within the enterprise.

perforce streams

It is much easier to navigate these complex scenarios with a tool like Perforce Streams. Perforce Streams not only helps development managers visualize the relationships between releases and versions, it guides release managers on where and when to apply bug fixes and feature enhancements when they are ready to be merged

Perforce Streams provide adaptable workflow for teams and promote efficiencies such as code re-use, automated merging, fast context switching, efficient workspace updates, and inherited workspace and branch views. An innovative addition to the Perforce branching and merging toolset, streams eliminate overhead, simplify common processes, and increase agility and scalability. In projects with a large volume of data, the time and performance savings are considerable.

Perforce Streams helps deploy bug fixes and enhancement across multiple releases and custom versions.

The typical perception of Agile development methodologies is that their benefits and promise are reserved for small, co-located teams. However, in the above sections we have seen how many, if not all, of the traditional Agile practices can be improved to the benefit of not only large teams, but large distributed teams as well. Ironically, this scalability has been achieved by employing the very processes and tools that the Agile Manifesto preaches against. However, while the tools enable scalability, they never require the sacrifice of developers’ freedoms or their ability to interact.

In these final paragraphs , tools will again be the focus for providing the solution for scaling one of the essential requirements of any Agile workflow: Continuous Integration with build and test automation. All the ideas addressed in the above paragraphs  will be reviewed along with detail on  how they fit together to make Agile scalable.

All of the examples provided can be implemented using software from Perforce, Assembla, Git and Jenkins.

Methods for Providing On-Demand Infrastructure

In a large project, the trickiest and costliest problems are found only when individuals put together all the pieces. It pays to find and fix these integration problems as early and often as possible.
Continuous Integration is a set of best practices in software development that supports project integration on a rapid, repeated basis . They are:

  •  Maintain a code repository
  •  Automate the build
  •  Make the build self-testing
  •  Everyone commits to the mainline every day
  •  Every commit (to the mainline) should be built
  •  Keep the build fast
  •  Test in a clone of the production environment
  •  Make it easy to get the latest deliverables
  •  Everyone can see the results of the latest build
  •  Automate deployment

The goal of is to perform a project-wide integration as often as possible. Striving to achieve this shapes your infrastructure, development practices, and attitudes.
Attitude is important. The entire team must commit to achieving successful integration at all stages of the project. For instance, a development task is not considered to be “done” until the feature appears in the integration build and is proven to work there. That shared commitment should make developers uneasy when they risk divergence by working in isolation for any lengthy period, e.g. when they are using an old build as a stable development base, or when they commit their changes their only infrequently. We cannot emphasize enough the importance of frequent integration. It really does reduce project risk.

Continuous Integration Tools in the Cloud
Organizations clearly need to invest in automated build and test processes if they want to scale up and deliver features faster and release more frequently. This investment can be expensive, but manual methods are obviously not scalable. Also, automated build and test processes tend to produce much higher software quality.

And if teams and contributors are highly distributed? Then the build and test tools must be accessible online, in the cloud.

Automated test tools like Jenkins can be integrated into the code review and merge workflows described earlier paragraphs. Whenever a contribution is accepted, a new version can be built and a series of automated tests can be run against it. Tools like Jenkins will then provide developers and the QA staff with detailed information on test results. Results from the test tool can even be used to vote to accept or reject contributions, as part of the code review workflow.

Automated test tools can provide detailed information on test results, and even vote to accept or reject contributions.

jenkins ci

Managing the impact of on-demand continuous integration is a logistical challenge for the version management service. Perforce addresses this challenge by providing flexible configurations of proxies and replicas to meet a variety of build demands.

supporting ci

In the earlier paragraphs,  some of the shortcomings of Scrum, such as lack of techniques to coordinate teams, assumptions about co-located teams and fixed release cycles that are unrealistic for many organizations, and a tendency to spend too much time in planning and coordination meetings was covered.

Continuous, iterative development is supported for different workflow methodology employed. With Perforce,  can:

  • Build and confirm your work from a private workspace before submitting your code
  • Execute automated builds and tests on specific branches upon check-in
  • Improve the quality of software and delivery time to market
  • Popular continuous integration tools, like Electric Commander from Electric Cloud, Parabuild from Viewtier, and Anthill Pro from UrbanCode, all support their own integrations with Perforce.

 A Scalable Agile process framework featuring the following was outlined:

  • A single prioritized backlog for all teams, so high-priority tasks always receive immediate attention.
  • Kanban processes with WIP limits, so teams don’t have to spend a lot of time in release planning, and to ensure that individual tasks are completed as quickly as possible.
  • On-demand resources, so each team can build and test its own contributions quickly and avoid making the QA lab a bottleneck.
  • A code review process that allows designated reviewers to accept or reject a large number of code submissions.
  • A “take what’s ready” approach to releases, so organizations could provide new functionality as frequently as required by customer needs and expectations.

The processes and tools that could facilitate Scalable Agile was discussed. These include online Agile planning tools, online collaboration, global code management, decentralized code management, ScrumBan processes, tools for on-demand merging and testing by contributors, code review workflows, stream-based tools for managing multiple releases and custom versions, and continuous integration tools provided in the cloud.

While the journey to Scalable Agile may be a long one, each of the steps down the path provides immediate benefits. The growing development groups consider:

  • Implementing ScrumBan, to start moving toward lean methods.
  • Deploying online planning and collaboration tools, to improve the effectiveness of distributed teams and contributors.
  • Deploying advanced code management platforms, to support distributed development and manage multiple releases and versions.
  • Begin investing in Continuous Integration and on-demand build and test systems.
  • Adjust the dial on continuous delivery gradually, to allow time for all your teams to adjust.