PEARL III: Principles of Lean Software Development for Agile Methodology

PEARL III: Principles of Lean Software Development for Agile Methodology

The market for software is fast paced with frequently changing customer needs. In order to stay competitive companies have to be able to react to the changing needs in a rapid manner. Failing to do so often results in a higher risk of market lock-out , reduced probability of market dominance , and it is less likely that the product conforms to the needs of the market. In consequence software companies need to take action in order to be responsive whenever there is a shift in the customers’ needs on the market. That is, they need to meet the current requirements of the market, the requirements being function or quality related. Two development paradigms emerged in the last decade to address this challenge, namely agile and lean software development.

The term Lean Software Development was first coined as the title for a conference organized by the ESPRIT initiative of the European Union, in Stuttgart Germany, October 1992. Independently, the following year, Robert Charette in 1993 suggested the concept of “Lean Software Development” as part of his work exploring better ways of managing risk in software projects. The term “Lean” dates to 1991, suggested by James Womack, Daniel Jones, and Daniel Roos, in their book The Machine That Changed the World: The Story of Lean Production as the English language term to describe the management approach used at Toyota. The idea that Lean might be applicable in software development was established very early, only 1 to 2 years after the term was first used in association with trends in manufacturing processes and industrial engineering.

In their 2nd book, published in 1995, Womack and Jones defined five core pillars of Lean Thinking. These were:

  • Value
  • Value Stream
  • Flow
  • Pull
  • Perfection

This became the default working definition for Lean over most of the next decade. The pursuit of perfection, it was suggested, was achieved by eliminating waste. While there were 5 pillars, it was the 5th one, pursuit of perfection through the systemic identification of wasteful activities and their elimination, that really resonated with a wide audience. Lean became almost exclusively associated with the practice of elimination of waste through the late 1990s and the early part of the 21st Century.

The Womack and Jones definition for Lean is not shared universally. The principles of management at Toyota are far more subtle. The single word “waste” in English is described more richly with three Japanese terms:

  • Muda – literally meaning “waste” but implying non-value-added activity
  • Mura – meaning “unevenness” and interpreted as “variability in flow”
  • Muri – meaning “overburdening” or “unreasonableness”

Perfection is pursued through the reduction of non-value-added activity but also through the smoothing of flow and the elimination of overburdening. In addition, the Toyota approach was based in a foundational respect for people and heavily influenced by the teachings of 20th century quality assurance and statistical process control experts such as W. Edwards Deming.

Unfortunately, there are almost as many definitions for Lean as there are authors on the subject.

Lean and Agile

Bob Charette was invited but unable to attend the 2001 meeting at Snowbird, Utah, where the Manifesto for Agile Software Development was authored. Despite missing this historic meeting, Lean Software Development was considered as one of several Agile approaches to software development. Jim Highsmith dedicated a chapter of his 2002 book to an interview with Bob about the topic. Later, Mary & Tom Poppendieck went on to author a series of 3 books. During the first few years of the 21st Century, Lean principles were used to explain why Agile methods were better. Lean explained that Agile methods contained little “waste” and hence produced a better economic outcome. Lean principles were used as a “permission giver” to adopt Agile methods.
Early lean software ideas were developed by Poppendieck and Middleton and Sutton. These books explored how lean thinking could be transferred from manufacturing to the more intangible world and different culture of software engineers. Specific techniques on how the concept of kanban could be applied to software were also developed . Note
that the use of these methods is partly a metaphor rather than a direct copying. For example, kanban in factories literally is a binary signal to replenish an inventory buffer, based on what the customer has taken away. In software it performs a similar function, but more broadly displays information on the status of the process and potential problems.
Moving upstream and applying lean thinking to influence project selection and definition also creates great benefits .
The proceedings of the first Lean & Kanban Software conference and the work of Shalloway et al. show adoption is spreading.

Defining Lean Software Development

Defining Lean Software Development is challenging because there is no specific Lean Software Development method or process. Lean is not an equivalent of Personal Software Process, V-Model, Spiral Model, EVO, Feature-Driven Development, Extreme Programming, Scrum, or Test-Driven Development. A software development lifecycle process or a project management process could be said to be “lean” if it was observed to be aligned with the values of the Lean Software Development movement and the principles of Lean Software Development. So those anticipating a simple recipe that can be followed and named Lean Software Development will be disappointed. Individuals must fashion or tailor their own software development process by understanding Lean principles and adopting the core values of Lean.

There are several schools of thought within Lean Software Development. The largest, and arguably leading, school is the Lean Systems Society, which includes Donald Reinertsen, Jim Sutton, Alan Shalloway, Bob Charette, Mary Poppendeick, and David J. Anderson. Mary and Tom Poppendieck’s work developed prior to the formation of the Society and its credo stands separately, as does the work of Craig Larman, Bas Vodde , and, most recently, Jim Coplien . This section seeks to be broadly representative of the Lean Systems Society viewpoint as expressed in its credo and to provide a synthesis and summary of their ideas.

Values

The Lean Systems Society published its credo at the 2012 Lean Software & Systems Conference . This was based on a set of values published a year earlier. Those values include:
  • Accept the human condition
  • Accept that complexity & uncertainty are natural to knowledge work
  • Work towards a better Economic Outcome
  • While enabling a better Sociological Outcome
  • Seek, embrace & question ideas from a wide range of disciplines
  • A values-based community enhances the speed & depth of positive change

Accept the Human Condition

Knowledge work such as software development is undertaken by human beings. We humans are inherently complex and, while logical thinkers, we are also led by our emotions and some inherent animalistic traits that can’t reasonably be overcome. Our psychology and neuro-psychology must be taken into account when designing systems or processes within which we work. Our social behavior must also be accommodated. Humans are inherently emotional, social, and tribal, and our behavior changes with fatigue and stress. Successful processes will be those that embrace and accommodate the human condition rather than those that try to deny it and assume logical, machine-like behavior.

Accept that Complexity & Uncertainty are Natural to Knowledge Work

The behavior of customers and markets are unpredictable. The flow of work through a process and a collection of workers is unpredictable. Defects and required rework are unpredictable. There is inherent chance or seemingly random behavior at many levels within software development. The purpose, goals, and scope of projects tend to change while they are being delivered. Some of this uncertainty and variability, though initially unknown, is knowable in the sense that it can be studied and quantified and its risks managed, but some variability is unknowable in advance and cannot be adequately anticipated. As a result, systems of Lean Software Development must be able to react to unfolding events, and the system must be able to adapt to changing circumstances. Hence any Lean Software Development process must exist within a framework that permits adaptation (of the process) to unfolding events.

Work towards a better Economic Outcome

Human activities such as Lean Software Development should be focused on producing a better economic outcome. Capitalism is acceptable when it contributes both to the value of the business and the benefit of the customer. Investors and owners of businesses deserve a return on investment. Employees and workers deserve a fair rate of pay for a fair effort in performing the work. Customers deserve a good product or service that delivers on its promised benefits in exchange for a fair price paid. Better economic outcomes will involve delivery of more value to the customer, at lower cost, while managing the capital deployed by the investors or owners in the most effective way possible.

Enable a better Sociological Outcome

Better economic outcomes should not be delivered at the expense of those performing the work. Creating a workplace that respects people by accepting the human condition and provides systems of work that respect the psychological and sociological nature of people is essential. Creating a great place to do great work is a core value of the Lean Software Development community.

Principles of Lean Software Development for Scaling Agile

The Lean Software & Systems community seems to agree on a few principles that underpin Lean Software Development processes. These are the principles of Lean software development for Agile Methodology.
  • Follow a Systems Thinking & Design Approach
  • Emergent Outcomes can be Influenced by Architecting the Context of a Complex Adaptive System
  • Respect People (as part of the system)
  • Use the Scientific Method (to drive improvements)
  • Encourage Leadership
  • Generate Visibility (into work, workflow, and system operation)
  • Reduce Flow Time
  • Reduce Waste to Improve Efficiency

Follow a Systems Thinking & Design Approach

This is often referred to in Lean literature as “optimize the whole,” which implies that it is the output from the entire system (or process) that we desire to optimize, and we shouldn’t mistakenly optimize parts in the hope that it will magically optimize the whole. Most practitioners believe the corollary to be true, that optimizing parts (local optimization) will lead to a suboptimal outcome.

A Lean Systems Thinking and Design Approach requires that we consider the demands on the system made by external stakeholders, such as customers, and the desired outcome required by those stakeholders. We must study the nature of demand and compare it with the capability of our system to deliver. Demand will include so-called “value demand,” for which customers are willing to pay, and “failure demand,” which is typically rework or additional demand caused by a failure in the supply of value demand. Failure demand often takes two forms: rework on previously delivered value demand and additional services or support due to a failure in supplying value demand. In software development, failure demand is typically requests for bug fixes and requests to a customer care or help desk function.

A systems design approach requires that we also follow the Plan-Do-Study-Act (PDSA) approach to process design and improvement. W. Edwards Deming used the words “study” and “capability” to imply that we study the natural philosophy of our system’s behavior. This system consists of our software development process and all the people operating it. It will have an observable behavior in terms of lead time, quality, quantity of features or functions delivered (referred to in Agile literature as “velocity”), and so forth.

Velocity : At the end of each iteration, the team adds up effort estimates associated with user stories that were completed during that iteration. This total is called velocity.
Knowing velocity, the team can compute (or revise) an estimate of how long the project will take to complete, based on the estimates associated with remaining user stories and assuming that velocity over the remaining iterations will remain approximately the same. This is generally an accurate prediction, even though rarely a precise one.

“Lead time” is a term borrowed from the manufacturing method known as Lean or Toyota Production System, where it is defined as the time elapsed between a customer placing an order and receiving the product ordered.
Translated to the software domain, lead time can be described more abstractly as the time elapsed between the identification of a requirement and its fulfillment. Defining a more concrete measurement depends on the situation being examined: for instance, when focusing on the software development process, the “lead time” elapsed between the formulation of a user story and that story being used “in production”, that is, by actual users under normal conditions.
Teams opting for the kanban approach favor this measure, over the better known velocity. Instead of aiming at increasing velocity, improvement initiatives intend to reduce lead time.

These metrics will exhibit variability and, by studying the mean and spread of variation, Individuals can develop an understanding of their capability. If this is mismatched with the demand and customer expectations, then the system will need to be redesigned to close the gap.

Deming also taught that capability is 95% influenced by system design, and only 5% is contributed by the performance of individuals. In other words, we can respect people by not blaming them for a gap in capability compared to demand and by redesigning the system to enable them to be successful.

To understand system design, we must have a scientific understanding of the dynamics of system capability and how it might be affected. Models are developed to predict the dynamics of the system. While there are many possible models, several popular ones are in common usage: the understanding of economic costs; so-called transaction and coordination costs that relate to production of customer-valued products or services; the Theory of Constraints – the understanding of bottlenecks; and The Theory of Profound Knowledge – the study and recognition of variability as either common to the system design or special and external to the system design.

Emergent Outcomes can be Influenced by ARCHITECTURE of the Context for a Complex Adaptive System

Complex systems have starting conditions and simple rules that, when run iteratively, produce an emergent outcome. Emergent outcomes are difficult or impossible to predict given the starting conditions. The computer science experiment “The Game of Life” is an example of a complex system. A complex adaptive system has within it some self-awareness and an internal method of reflection that enables it to consider how well its current set of rules is enabling it to achieve a desired outcome. The complex adaptive system may then choose to adapt itself – to change its simple rules – to close the gap between the current outcome and the desired outcome. The Game of Life adapted such that the rules could be re-written during play would be a complex adaptive system.

In software development processes, the “simple rules” of complex adaptive systems are the policies that make up the process definition. The core principle here is based in the belief that developing software products and services is not a deterministic activity, and hence a defined process that cannot adapt itself will not be an adequate response to unforeseeable events. Hence, the process designed as part of our system thinking and design approach must be adaptable. It adapts through the modification of the policies of which it is made.

The Kanban approach to Lean Software Development utilizes this concept by treating the policies of the kanban pull system as the “simple rules,” and the starting conditions are that work and workflow is visualized, that flow is managed using an understanding of system dynamics, and that the organization uses a scientific approach to understanding, proposing, and implementing process improvements.

The term “kanban” with the sense of a sign, poster or billboard, and derived from roots which literally translate as “visual board”.
Its meaning within the Agile context is borrowed from the Toyota Production System, where it designates a system to control the inventory levels of various parts. It is analogous to (and in fact inspired by) cards placed behind products on supermarket shelves to signal “out of stock” items and trigger a resupply “just in time”.
The Toyota system affords a precise accounting of inventory or “work in process”, and strives for a reduction of inventory levels, considered wasteful and harmful to performance.
The phrase “Kanban method” also refers to an approach to continuous improvement which relies on visualizing the current system of work scheduling, managing “flow” as the primary measure of performance, and whole-system optimization – as a process improvement approach, it does not prescribe any particular practices.

Respect People

The Lean community adopts Peter Drucker’s definition of knowledge work that states that workers are knowledge workers if they are more knowledgeable about the work they perform than their bosses. This creates the implication that workers are best placed to make decisions about how to perform work and how to modify processes to improve how work is performed. So the voice of the worker should be respected. Workers should be empowered to self-organize to complete work and achieve desired outcomes. They should also be empowered to suggest and implement process improvement opportunities or “kaizen events” as they are referred to in Lean literature. Making process policies explicit so that workers are aware of the rules that constrain them is another way of respecting them. Clearly defined rules encourage self-organization by removing fear and the need for courage. Respecting people by empowering them and giving them a set of explicitly declared policies holds true with the core value of respecting the human condition.
SAP has been using SCRUM and other Agile methodologies for several years at the team level. Herbert Illgner, COO Business Solutions and Technology at SAP, who has been involved with the effort, says that team empowerment and faster feedback cycles with customers are two significant benefits. Illgner added that SAP is expanding application of Agile methods to the entire product creation process using a Lean framework that includes empowered cross-functional teams, continuous improvement process and managers as support and teachers..

Use the Scientific Method

Seek to use models to understand the dynamics of how work is done and how the system of Lean Software Development is operating. Observe and study the system and its capability, and then develop and apply models for predicting its behavior. Collect quantitative data in the applicable studies, and use that data to understand how the system is performing and to predict how it might change when the process is changed.

The Lean Software & Systems community uses statistical methods such as statistical process control charts and spectral analysis histograms of raw data for lead time and velocity to understand system capability. They also use models such as: the Theory of Constraints to understand bottlenecks; The System of Profound Knowledge to understand variation that is internal to the system design versus that which is externally influenced; and an analysis of economic costs in the form of tasks performed to merely coordinate, set up, deliver, or clean up after customer-valued product or services are created. Some other models are coming into use, such as Real Option Theory, which seeks to apply financial option theory from financial risk management to real-world decision making.

The scientific method suggests: we study; we postulate an outcome based on a model; we perturb the system based on that prediction; and we observe again to see if the perturbation produced the results the model predicted. If it doesn’t, then we check our data and reconsider whether our model is accurate. Using models to drive process improvements moves it to a scientific activity and elevates it from a superstitious activity based on intuition.

Encourage Leadership

Leadership and management are not the same. Management is the activity of designing processes, creating, modifying, and deleting policy, making strategic and operational decisions, gathering resources, providing finance and facilities, and communicating information about context such as strategy, goals, and desired outcomes. Leadership is about vision, strategy, tactics, courage, innovation, judgment, advocacy, and many more attributes. Leadership can and should come from anyone within an organization. Small acts of leadership from workers will create a cascade of improvements that will deliver the changes needed to create a Lean Software Development process.

Generate Visibility

Knowledge work is invisible. If you can’t see something, it is (almost) impossible to manage it. It is necessary to generate visibility into the work being undertaken and the flow of that work through a network of individuals, skills, and departments until it is complete. It is necessary to create visibility into the process design by finding ways of visualizing the flow of the process and by making the policies of the process explicit for everyone to see and consider. When all of these  are visible, then the use of the scientific method is possible, and conversations about potential improvements can be collaborative and objective. Collaborative process improvement is almost impossible if work and workflow are invisible and if process policies are not explicit.

Reduce Flow Time

The software development profession and the academics who study software engineering have traditionally focused on measuring time spent working on an activity. The Lean Software Development community has discovered that it might be more useful to measure the actual elapsed calendar time something takes to be processed. This is typically referred to as Cycle Time and is usually qualified by the boundaries of the activities performed. For example, Cycle Time through Analysis to Ready for Deployment would measure the total elapsed time for a work item, such as a user story, to be analyzed, designed, developed, tested in several ways, and queued ready for deployment to a production environment. In consultation with the customer or product owner, the team divides up the work to be done into functional increments called “user stories”.

Lead time clock starts when the request is made and ends at delivery. Cycle time clock starts when work begins on the request and ends when the item is ready for delivery. Cycle time is a more mechanical measure of process capability. Lead time is what the customer sees.

Lead time depends on cycle time, but also depends on your willingness to keep a backlog, the customer’s patience, and the customer’s readiness for delivery.

Focusing on the time work takes to flow through the process is important in several ways. Longer cycle times have been shown to correlate with a non-linear growth in bug rates. Hence shorter cycle times lead to higher quality. This is counter-intuitive as it seems ridiculous that bugs could be inserted in code while it is queuing and no human is actually touching it. Traditionally, the software engineering profession and academics who study it have ignored this idle time. However, empirical evidence suggests that cycle time is important to initial quality.

Alan Shalloway has also talked about the concept of “induced work.” His observation is that a lag in performing a task can lead to that task taking a lot more effort than it may have done. For example, a bug found and fixed immediately may only take 20 minutes to fix, but if that bug is triaged, is queued and then waits for several days or weeks to be fixed, it may involve several or many hours to make the fix. Hence, the cycle time delay has “induced” additional work. As this work is avoidable, in Lean terms, it must be seen as “waste.”

The third reason for focusing on cycle time is a business related reason. Every feature, function, or user story has a value. That value may be uncertain but, nevertheless, there is a value. The value may vary over time. The concept of value varying over time can be expressed economically as a market payoff function. When the market payoff function for a work item is understood, even if the function exhibits a spread of values to model uncertainty, it is possible to evaluate a “cost of delay.” The cost of delay allows us to put a value on reducing cycle time.

With some work items, the market payoff function does not start until a known date in the future. For example, a feature designed to be used during the 4th of July holiday in the United States has no value prior to that date. Shortening cycle time and being capable of predicting cycle time with some certainty is still useful in such an example. Ideally, we want to start the work so that the feature is delivered “just in time” when it is needed and not significantly prior to the desired date, nor late, as late delivery incurs a cost of delay. Just-in-time delivery ensures that optimal use was made of available resources. Early delivery implies that we might have worked on something else and have, by implication, incurred an opportunity cost of delay.

As a result of these three reasons, Lean Software Development seeks to minimize flow time and to record data that enables predictions about flow time. The objective is to minimize failure demand from bugs, waste from over-burdening due to delay in fixing bugs, and to maximize value delivered by avoiding both cost of delay and opportunity cost of delay.

Reduce Waste to Improve Efficiency

A value stream mapping technique is used to identify waste. The second step is to point out sources of waste and to eliminate them. Waste-removal should take place iteratively until even essential-seeming processes and procedures are liquidated.

For every valued-added activity, there are setup, cleanup and delivery activities that are necessary but do not add value in their own right. For example, a project iteration that develops an increment of working software requires planning (a setup activity), an environment and perhaps a code branch in version control (collectively known as configuration management and also a setup activity), a release plan and performing the actual release (a delivery activity), a demonstration to the customer (a delivery activity), and perhaps an environment teardown or reconfiguration (a cleanup activity.) In economic terms, the setup, cleanup, and delivery activities are transaction costs on performing the value-added work. These costs (or overheads) are considered waste in Lean.

Any form of communication overhead can be considered waste. Meetings to determine project status and to schedule or assign work to team members would be considered a coordination cost in economic language. All coordination costs are waste in Lean thinking. Lean software development methods seek to eliminate or reduce coordination costs through the use of colocation of team members, short face-to-face meetings such as standups, and visual controls such as card walls.

The third common form of waste in Lean Software Development is failure demand. Failure demand is a burden on the system of software development. Failure demand is typically rework or new forms of work generated as a side-effect of poor quality. The most typical forms of failure demand in software development are bugs, production defects, and customer support activities driven out of a failure to use the software as intended. The percentage of work-in-progress that is failure demand is often referred to as Failure Load. The percentage of value-adding work against failure demand is a measure of the efficiency of the system.

The percentage of value-added work against the total work, including all the non-value adding transaction and coordination costs, determines the level of efficiency. A system with no transaction and coordination costs and no failure load would be considered 100% efficient.

Traditionally, Western management science has taught that efficiency can be improved by increasing the batch size of work. Typically, transaction and coordination costs are fixed or rise only slightly with an increase in batch size. As a result, large batches of work are more efficient. This concept is known as “economy of scale.” However, in knowledge work problems, coordination costs tend to rise non-linearly with batch size, while transaction costs can often exhibit a linear growth. As a result, the traditional 20th Century approach to efficiency is not appropriate for knowledge work problems like software development.

It is better to focus on reducing the overheads while keeping batch sizes small in order to improve efficiency. Hence, the Lean way to be efficient is to reduce waste. Lean software development methods focus on fast, cheap, and quick planning methods; low communication overhead; and effective low overhead coordination mechanisms, such as visual controls in kanban systems. They also encourage automated testing and automated deployment to reduce the transaction costs of delivery. Modern tools for minimizing the costs of environment setup and teardown, such as modern version control systems and use of virtualization, also help to improve efficiency of small batches of software development.

Lean Software Development Practices for Agile 

Lean software development are viewed as a set of thinking tools that could easily blend in with any agile approach.So as you can see, lean and agile are deeply intertwined in the software world.
In practice, Agile seems to be changing for the better by adopting Lean thinking in a large way. Rally Development says that its customers get to market 50% faster and are 25% more productive when they employ a hybrid of Lean and Agile development methods.

Lean Software Development does not prescribe practices. It is more important to demonstrate that actual process definitions are aligned with the principles and values. However, a number of practices are being commonly adopted. This section provides a brief overview of some of these.

Continuous learning

Software development is a continuous learning process with the additional challenge of development teams and end product sizes. The best approach for improving a software development environment is to amplify learning. The accumulation of defects should be prevented by running tests as soon as the code is written. Instead of adding more documentation or detailed planning, different ideas could be tried by writing code and building. The process of user requirements gathering could be simplified by presenting screens to the end-users and getting their input.

The learning process is sped up by usage of short iteration cycles – each one coupled with refactoring and integration testing. Refactoring consists of improving the internal structure of an existing program’s source code, while preserving its external behavior. The noun “refactoring” refers to one particular behaviour-preserving transformation, such as “Extract Method” or “Introduce Parameter”. Refactoring does “not” mean: rewriting code, fixing bugs or improve observable aspects of software such as its interface

Refactoring in the absence of safeguards against introducing defects (i.e. violating the “behaviour preserving” condition) is risky. Safeguards include aids to regression testing including automated unit tests or automated acceptance tests, and aids to formal reasoning such as type systems.

Increasing feedback via short feedback sessions with customers helps when determining the current phase of development and adjusting efforts for future improvements. During those short sessions both customer representatives and the development team learn more about the domain problem and figure out possible solutions for further development. Thus the customers better understand their needs, based on the existing result of development efforts, and the developers learn how to better satisfy those needs. Another idea in the communication and learning process with a customer is set-based development – this concentrates on communicating the constraints of the future solution and not the possible solutions, thus promoting the birth of the solution via dialogue with the customer.

Decide as late as possible

As software development is always associated with some uncertainty, better results should be achieved with an options-based approach, delaying decisions as much as possible until they can be made based on facts and not on uncertain assumptions and predictions. The more complex a system is, the more capacity for change should be built into it, thus enabling the delay of important and crucial commitments. The iterative approach promotes this principle – the ability to adapt to changes and correct mistakes, which might be very costly if discovered after the release of the system.

An agile software development approach can move the building of options earlier for customers, thus delaying certain crucial decisions until customers have realized their needs better. This also allows later adaptation to changes and the prevention of costly earlier technology-bounded decisions. This does not mean that no planning should be involved – on the contrary, planning activities should be concentrated on the different options and adapting to the current situation, as well as clarifying confusing situations by establishing patterns for rapid action. Evaluating different options is effective as soon as it is realized that they are not free, but provide the needed flexibility for late decision making.

Deliver as fast as possible

In the era of rapid technology evolution, it is not the biggest that survives, but the fastest. The sooner the end product is delivered without major defects, the sooner feedback can be received, and incorporated into the next iteration. The shorter the iterations, the better the learning and communication within the team. With speed, decisions can be delayed. Speed assures the fulfilling of the customer’s present needs and not what they required yesterday. This gives them the opportunity to delay making up their minds about what they really require until they gain better knowledge. Customers value rapid delivery of a quality product.

The just-in-time production ideology could be applied to software development, recognizing its specific requirements and environment. This is achieved by presenting the needed result and letting the team organize itself and divide the tasks for accomplishing the needed result for a specific iteration. At the beginning, the customer provides the needed input. This could be simply presented in small cards or stories – the developers estimate the time needed for the implementation of each card. Thus the work organization changes into self-pulling system – each morning during a stand-up meeting, each member of the team reviews what has been done yesterday, what is to be done today and tomorrow, and prompts for any inputs needed from colleagues or the customer. This requires transparency of the process, which is also beneficial for team communication.

Another key idea in Toyota’s Product Development System is set-based design. If a new brake system is needed for a car, for example, three teams may design solutions to the same problem. Each team learns about the problem space and designs a potential solution. As a solution is deemed unreasonable, it is cut. At the end of a period, the surviving designs are compared and one is chosen, perhaps with some modifications based on learning from the others – a great example of deferring commitment until the last possible moment. Software decisions could also benefit from this practice to minimize the risk brought on by big up-front design

Empower the team

There has been a traditional belief in most businesses about the decision-making in the organization – the managers tell the workers how to do their own job. In a “Work-Out technique”, the roles are turned – the managers are taught how to listen to the developers, so they can explain better what actions might be taken, as well as provide suggestions for improvements. The lean approach favors the aphorism “find good people and let them do their own job,” encouraging progress, catching errors, and removing impediments, but not micro-managing.

Another mistaken belief has been the consideration of people as resources. People might be resources from the point of view of a statistical data sheet, but in software development, as well as any organizational business, people do need something more than just the list of tasks and the assurance that they will not be disturbed during the completion of the tasks. People need motivation and a higher purpose to work for – purpose within the reachable reality, with the assurance that the team might choose its own commitments. The developers should be given access to the customer; the team leader should provide support and help in difficult situations, as well as ensure that skepticism does not ruin the team’s spirit.

Build integrity in

The customer needs to have an overall experience of the System – this is the so-called perceived integrity: how it is being advertised, delivered, deployed, accessed, how intuitive its use is, price and how well it solves problems.

Conceptual integrity means that the system’s separate components work well together as a whole with balance between flexibility, maintainability, efficiency, and responsiveness. This could be achieved by understanding the problem domain and solving it at the same time, not sequentially. The needed information is received in small batch pieces – not in one vast chunk with preferable face-to-face communication and not any written documentation. The information flow should be constant in both directions – from customer to developers and back, thus avoiding the large stressful amount of information after long development in isolation.

One of the healthy ways towards integral architecture is refactoring. As more features are added to the original code base, the harder it becomes to add further improvements. Refactoring is about keeping simplicity, clarity, minimum amount of features in the code. Repetitions in the code are signs for bad code designs and should be avoided. The complete and automated building process should be accompanied by a complete and automated suite of developer and customer tests, having the same versioning, synchronization and semantics as the current state of the System. At the end the integrity should be verified with thorough testing, thus ensuring the System does what the customer expects it to. Automated tests are also considered part of the production process, and therefore if they do not add value they should be considered waste. Automated testing should not be a goal, but rather a means to an end, specifically the reduction of defects.

See the whole

Software systems nowadays are not simply the sum of their parts, but also the product of their interactions. Defects in software tend to accumulate during the development process – by decomposing the big tasks into smaller tasks, and by standardizing different stages of development, the root causes of defects should be found and eliminated. The larger the system, the more organizations that are involved in its development and the more parts are developed by different teams, the greater the importance of having well defined relationships between different vendors, in order to produce a system with smoothly interacting components. During a longer period of development, a stronger subcontractor network is far more beneficial than short-term profit optimizing, which does not enable win-win relationships.

Lean thinking has to be understood well by all members of a project, before implementing in a concrete, real-life situation. “Think big, act small, fail fast; learn rapidly” – these slogans summarize the importance of understanding the field and the suitability of implementing lean principles along the whole software development process. Only when all of the lean principles are implemented together, combined with strong “common sense” with respect to the working environment, is there a basis for success in software development.

Model Storming: 
Agile Modeling’s practices of light weight, initial requirements envisioning followed by iteration modeling and just-in-time (JIT) model storming work because they reflect deferment of commitment regarding what needs to be built until it’s actually needed, and the practices help eliminate waste because you’re only modeling what needs to be built.
Agility by Self Organization :
It is possible to deliver high-quality systems quickly. By limiting the work of a team to its capacity, which is reflected by the team’s velocity (this is the number of “points” of functionality which a team delivers each iteration), you can establish a reliable and repeatable flow of work. An effective organization doesn’t demand teams do more than they are capable of, but instead asks them to self-organize and determine what they can accomplish. Constraining these teams to delivering potentially shippable solutions on a regular basis motivates them to stay focused on continuously adding value.

Cumulative Flow Diagrams

Cumulative Flow Diagrams have been a standard part of reporting in Team Foundation Server since 2005. Cumulative flow diagrams plot an area graph of cumulative work items in each state of a workflow. They are rich in information and can be used to derive the mean cycle time between steps in a process as well as the throughput rate (or “velocity”). Different software development lifecycle processes produce different visual signatures on cumulative flow diagrams. Practitioners can learn to recognize patterns of dysfunction in the process displayed in the area graph. A truly Lean process will show evenly distributed areas of color, smoothly rising at a steady pace. The picture will appear smooth without jagged steps or visible blocks of color.

In their most basic form, cumulative flow diagrams are used to visualize the quantity of work-in-progress at any given step in the work item lifecycle. This can be used to detect bottlenecks and observe the effects of “mura” (variability in flow).

Visual Controls

In addition to the use of cumulative flow diagrams, Lean Software Development teams use physical boards, or projections of electronic visualization systems, to visualize work and observe its flow. Such visualizations help team members observe work-in-progress accumulating and enable them to see bottlenecks and the effects of “mura.” Visual controls also enable team members to self-organize to pick work and collaborate together without planning or specific management direction or intervention. These visual controls are often referred to as “card walls” or sometimes (incorrectly) as “kanban boards.”

Virtual Kanban Systems

A kanban system is a practice adopted from Lean manufacturing. It uses a system of physical cards to limit the quantity of work-in-progress at any given stage in the workflow. Such work-in-progress limited systems create a “pull” where new work is started only when there are free kanban indicating that new work can be “pulled” into a particular state and work can progress on it.

In Lean Software Development, the kanban are virtual and often tracked by setting a maximum number for a given step in the workflow of a work item type. In some implementations, electronic systems keep track of the virtual kanban and provide a signal when new work can be started. The signal can be visual or in the form of an alert such as an email.

Virtual kanban systems are often combined with visual controls to provide a visual virtual kanban system representing the workflow of one or several work item types. Such systems are often referred to as “kanban boards” or “electronic kanban systems.” A visual virtual kanban system is available as a plug-in for Team Foundation Server, called Visual WIP[20]. This project was developed as open source by Hakan Forss in Sweden.

Small Batch Sizes / Single-piece Flow

Lean Software Development requires that work is either undertaken in small batches, often referred to as “iterations” or “increments,” or that work items flow independently, referred to as “single-piece flow.” Single-piece flow requires a sophisticated configuration management strategy to enable completed work to be delivered while incomplete work is not released accidentally. This is typically achieved using branching strategies in the version control system. A small batch of work would typically be considered a batch that can be undertaken by a small team of 8 people or less in under 2 weeks.

Small batches and single-piece flow require frequent interaction with business owners to replenish the backlog or queue or work. They also require a capability to release frequently. To enable frequent interaction with business people and frequent delivery, it is necessary to shrink the transaction and coordination costs of both activities. A common way to achieve this is the use of automation.

Automation

Lean Software Development expects a high level of automation to economically enable single-piece flow and to encourage high quality and the reduction of failure demand. The use of automated testing, automated deployment, and software factories to automate the deployment of design patterns and creation of repetitive low variability sections of source code will all be commonplace in Lean Software Development processes.

Kaizen Events

In Lean literature, the term kaizen means “continuous improvement” and a kaizen event is the act of making a change to a process or tool that hopefully results in an improvement.

The Lean concept of Kaizen also has a strong influence on the way Agile is being practiced, filling a gap relating to continuous improvement

Lean Software Development processes use several different activities to generate kaizen events. These are listed here. Each of these activities is designed to stimulate a conversation about problems that adversely affect capability and, consequently, ability to deliver against demand. The essence of kaizen in knowledge work is that we must provoke conversations about problems across groups of people from different teams and with different skills.

The evolution of Agile is primarily focused on evolving the product toward a better fit with requirements. In Agile, both the product and the requirements are refined as more is known through experience. Kaizen, a continuous improvement method used in Lean, focuses on the development process itself. When Kaizen is practiced in an Agile project, the participants not only suggest ways to improve the fit between the product and the requirements but also offer ways to improve the process being used, something usually not emphasized in Agile methods. Eckfeldt described the use of Kaizen snakes and project thermometers to capture process improvement feedback.

Daily standup meetings

Teams of software developers, often up to 50, typically meet in front of a visual control system such as a whiteboard displaying a visualization of their work-in-progress. They discuss the dynamics of flow and factors affecting the flow of work. Particular focus is made to externally blocked work and work delayed due to bugs. Problems with the process often become evident over a series of standup meetings. The result is that a smaller group may remain after the meeting to discuss the problem and propose a solution or process change. A kaizen event will follow. These spontaneous meetings are often referred to as spontaneous quality circles in older literature. Such spontaneous meetings are at the heart of a truly kaizen culture. Managers will encourage the emergence of kaizen events after daily standup meetings in order to drive adoption of Lean within their organization.

Retrospectives

Project teams may schedule regular meetings to reflect on recent performance. These are often done after specific project deliverables are complete or after time-boxed increments of development known as iterations or sprints in Agile software development.

Retrospectives typically use an anecdotal approach to reflection by asking questions like “what went well?”, “what would we do differently?”, and “what should we stop doing?”

Retrospectives typically produce a backlog of suggestions for kaizen events. The team may then prioritize some of these for implementation.

A retrospective is intended to reveal facts or feelings which have measurable effects on the team’s performance, and to construct ideas for improvement based on these observations. It will not be useful if it devolves into a verbal joust, or a whining session.

On the other hand, an effective retrospective requires that each participant feel comfortable speaking up. The facilitator is responsible for creating the conditions of mutual trust; this may require taking into accounts such factors as hierarchical relationships, the presence of a manager for instance may inhibit discussion of performance issues.
Being an all-hands meeting, a retrospective comes at a significant cost in person-hours. Poor execution, either from the usual causes of bad meetings (lack of preparation, tardiness, inattention) or from causes specific to this format (lack of trust and safety, taboo topics), will result in the practice being discredited, even though a vast majority of the Agile community views it as valuable.
An effective retrospective will normally result in decisions, leading to action items; it’s a mistake to have too few (there is always room for improvement) or too many (it would be impractical to address “all” issues in the next iteration). One or two improvement ideas per iteration retrospective may well be enough.
Identical issues coming up at each retrospective, without measurable improvement over time, may signal that the retrospective has become an empty ritual

Operations Reviews

An operations review is typically larger than a retrospective and includes representatives from a whole value stream. It is common for as many as 12 departments to present objective, quantitative data that show the demand they received and reflect their capability to deliver against the demand. Operations reviews are typically held monthly. The key differences between an operations review and a retrospective is that operations reviews span a wider set of functions, typically span a portfolio of projects and other initiatives, and use objective, quantitative data. Retrospectives, in comparison, tend to be scoped to a single project; involve just a few teams such as analysis, development, and test; and are generally anecdotal in nature.

An operations review will provoke discussions about the dynamics affecting performance between teams. Perhaps one team generates failure demand that is processed by another team? Perhaps that failure demand is disruptive and causes the second team to miss their commitments and fail to deliver against expectations? An operations review provides an opportunity to discuss such issues and propose changes. Operations reviews typically produce a small backlog of potential kaizen events that can be prioritized and scheduled for future implementation.

There is no such thing as a single Lean Software Development process. A process could be said to be Lean if it is clearly aligned with the values and principles of Lean Software Development. Lean Software Development does not prescribe any practices, but some activities have become common. Lean organizations seek to encourage kaizen through visualization of workflow and work-in-progress and through an understanding of the dynamics of flow and the factors (such as bottlenecks, non-instant availability, variability, and waste) that affect it. Process improvements are suggested and justified as ways to reduce sources of variability, eliminate waste, improve flow, or improve value delivery or risk management. As such, Lean Software Development processes will always be evolving and uniquely tailored to the organization within which they evolve. It will not be natural to simply copy a process definition from one organization to another and expect it to work in a different context. It will also be unlikely that returning to an organization after a few weeks or months to find the process in use to be the same as was observed earlier. It will always be evolving.

The organization using a Lean software development process could be said to be Lean if it exhibited only small amounts of waste in all three forms (“mura,” “muri,” and “muda”) and could be shown to be optimizing the delivery of value through effective management of risk. The pursuit of perfection in Lean is always a journey. There is no destination. True Lean organizations are always seeking further improvement.

Lean Software Development is still an emerging field, and we can expect it to continue to evolve over the next decade.

Lean software development at BBC WorldWide

The lean ideas behind the Toyota production system can be applied to software project
management. This paragraph explains the  investigation of the performance of a nine-person software development team employed by BBC Worldwide based in London. The data collected in 2009 involved direct observations of the development team, the kanban boards, the daily stand-up meetings, semistructured interviews with a wide variety of staff, and statistical analysis. The evidence shows that over the 12-month period, lead time to deliver software improved by 37%, consistency of delivery rose by 47%, and defects reported by customers fell 24%. The significance of this work is showing that the use of lean methods including visual management, team-based problem solving, smaller batch sizes, and statistical process control can improve software development. It also summarizes key differences between agile and lean approaches to software development.
The conclusion is that the performance of the software development team was improved by adopting a lean approach. The faster delivery with a focus on creating the highest value to the customer also reduced both technical and market risks.

Lean software development at IMVU INc

IMVU Inc. (www.imvu.com) is a virtual company where users meet as personalized avatars in 3D digital rooms. Founded in 2004, IMVU has 25 million  registered users, 100,000 registered developers and  reached $1 million in monthly revenue. Over 90 percent of IMVU‘s revenue is from the direct sale of  virtual credits (a form of currency) to users who purchase digital products from its 1.8 million item digital  catalog. IMVU has won the 2008 Virtual Worlds  Innovation Award and was also named a Rising Star  in the 2008 Silicon Valley Technology Fast 50 program. IMVU receives funding from top venture investors Menlo Ventures, Allegis Capital and Bridgescale Partners. Its offices are located in Palo Alto,  CA. (http://crunchbase.com/company/imvu)

Software Development at IMVU
IMVU‘s founders had previously founded  There.com—a virtual world startup that took three  years to build, burned through a ton of money, and  was an abysmal failure after launch. However, from  an engineering perspective, There.com was an  amazing success, as they built it ahead of schedule,  maintained tight quality standards, and solved multiple difficult technical problems. Still, it wasn‘t a  commercial success—and large amounts of time and  money were wasted. As a result, IMVU‘s founding  team decided to build the minimum viable product  and then test it with users—even if the product  seemed only half-built (an engineer‘s nightmare).
As a result, IMVU was one of the startups that pioneered the ―build-just-a-little-and-get-customer-feedback model. This model was only possible because  of the application of several lean principles at the  technical level in the development process.
Lean Principle #1: Specify Value in the Eyes of the  Customer
From the beginning, IMVU‘s founders decided they  wanted to build a culture of ―ship, ship, ship. From a  business perspective, this makes a lot of sense, but  from an engineering perspective, it‘s like pulling fingernails with a pair of rusty pliers:―Bugs were all over the place, extremely ugly looking, and only the most rudimentary features.
In essence, releasing a sub-par product allowed  IMVU to avoid over-production wastes by putting  man hours only into features their customers liked.
Lean Principle #2: Identify Value Stream and  Eliminate Waste
The IMVU team worked hard to cultivate the ―ship,  ship, ship mentality. For example, on their very first  day, most developers were expected to write some  code and push it into production. Even though it was  generally just a small bug fix or a miniscule feature,  this ―release-code-on-the-first-day-of-work idea  seemed revolutionary for most new hires.
Continuous deployment reduced the wastes of overproduction, waiting, and processing. In a traditional  development process, multiple engineers are busy  building multiple features based on the last bit of  stable code. When they try to deploy their feature  after two weeks of work, they find that someone else  deployed a different feature the previous day and the
two features don‘t play well together. Continuous  deployment allows engineers to upload their work  instantaneously – thus ensuring engineers are always  working from the same base code. This avoids  spending extra weeks making the feature code  compatible.
Lean Principle #3: Make Value Flow at the Pull of the Customer
IMVU projects have an eight week Return on Investment (ROI) target. Whenever someone suggests a  small project, they are asked to provide a general  roadmap showing that the project could repay the  time investment in eight weeks. Projects are continuously tested on small numbers of IMVU users –who often had no idea they were part of a bucket test.  If a project shows success, they keep working on it.
After a few weeks, if the numbers shows the project  had zero chance of positive ROI, it is shut down immediately. Over time, as IMVU matures, this project  ROI target is expanded:
Lean Principle #4: Involve and Empower
Employees IMVU implemented the 5 Why‘s process, also known as ―Root cause analysis, to involve and empower its employees during trouble-shooting  processes. 5 Why‘s process is the technique of asking  why five times to get to the root cause of a problem  when it occurs.
in blog posts by Ries, each IMVU engineer has his/her own sandbox which mimicked
production as close as possible. IMVU has a comprehensive set of unit, acceptance, functional, and performance tests, and practiced Test-Drive-Development across the whole team. Engineers build a series of test tags, and quickly run a subset of tests in their sandboxes. Revisions are required if a test fails. To keep developers on the same code before it passed the various tests, IMVU created the equivalent of a Kanban system plus an Andon cord (automated testing and immediate rollback system). Developers are assigned a single task, and not allowed to move onto the next task until their code not only successfully passes the automated testing, but also has successfully deployed. Only then, they can pull the next task from the queue. This means that developers have a little bit of idle time while the tests were running. It also means that code is fully completed before a developer moves on. As a result, engineering is optimized for productivity rather than activity:

Lean Principle #5: Continuously Improve in Pursuit of Perfection
The problem with all this emphasis on ―ship, ship, ship, was different bugs in the code kept taking the site down. Sometimes it was simply a scaling issue— new upgrades worked fine on an engineer‘s computer, but crashed when hundreds of thousands of users tried it. Other times, it was a new employee releasing some feature without understanding how the previous code base worked. From a business perspective, it didn‘t matter what the problem was; if the site was down, IMVU was losing money.
From a technical perspective, each new problem required a different solution. Solving scaling issues is very different than solving a single infinite loop problem. The only practical fix was either cease continuous deployment or institute automated tests that checked the code, plus allowed for immediate code rollbacks if any server started to crash.

Eventually, IMVU architected a series of automated tests that looked at every new code check-in, tested it, and then pushed it onto the live servers. If at any point the code crashed—either during testing or once it started running in the wild—the automated tests instituted a rollback to the last verified good version, and sent a nice little e-mail back to the engineer that said ―We‘re sorry, but it looks like your code ABC caused a problem at XYZ. Afraid we can‘t let your code go live until this is fixed. As a result, the automated testing caught an amazing number of errors, and IMVU management started pursuing massively high quality expectations.

IMVU Successfully implemented lean principles at the technical level in the software development  process. They encountered many common challenges that software companies face: choosing the  right product feature, long development cycle, endless testing and debugging. IMVU found solutions  by sticking with the basic lean principles. They were  able to identify and reduce common wastes in software development process, specifically, overproduction, waiting, process, and defects.

IMVU clearly demonstrated the importance of lean  implementation in the software development process.
The implementation of lean principles cannot turn  software development into a production line environment, with scientific methods for each step of the  way. However, it can help turn a chaotic, constantly  changing process, into a much more predictable, fast  moving, and streamlined process. Lean implementation coupled with brilliant designs and fully engaged  intellectual team can help deliver great software  products.
It would seem that the rapid release cycles called for  by lean principles can only be effective if there is a  comprehensive and rigorous testing environment. An  interesting question is whether IMVU‘s practices  (such as daily release online) would be applicable to  software companies that focus on packaged rather  than online products. In this case, the ―customer is a  combination of the other developers and the ultimate  consumer. IMVU‘s experience challenges the conventional wisdom in software development. Can it be  beneficial to all software companies striving to deliver the right product, at the right time, and at the  right price? Middleton and Sutton  believe that the benefits work across different types of software. Yet,  they also recognize that lean software is far too early  in its evolution.

Lean Beyond Agile

In recent years, Lean Software Development has really emerged as its own discipline related to, but not specifically a subset of the Agile movement. This evolution started with the synthesis of ideas from Lean Product Development and the work of Donald G. Reinertsen and ideas emerging from the non-Agile world of large scale system engineering and the writing of James Sutton and Peter Middleton. David J. Anderson also synthesized the work of Eli Goldratt and W. Edwards Deming and developed a focus on flow rather than waste reduction . At the behest of Reinertsen around 2005,  David J. Anderson introduced the use of kanban systems that limit work-in-progress and “pull” new work only when the system is ready to process it. Alan Shalloway added his thoughts on Lean software development in his 2009 book on the topic. Since 2007, the emergence of Lean as a new force in the progress of the software development profession has been focused on improving flow, managing risk, and improving (management) decision making. Kanban has become a major enabler for Lean initiatives in IT-related work. It appears that a focus on flow, rather than a focus on waste elimination, is proving a better catalyst for continuous improvement within knowledge work activities such as software development.

PEARL V: A Purview on Techniques for Estimation in Agile s/w Methodology

PEARL V: A Purview on Techniques for Estimation in Agile s/w Methodology

Estimation is one of the most misused elements in all of software development. Estimates should reflect the relative difficulty and length of work to guide planning and prioritization — not commit the team to mandatory Saturdays. With estimates in hand, stakeholders can make smart tradeoffs and reasonable forecasts.

Plans are only as good as the estimates, that the plans are based on, and estimates always come second to actual. The real world has this horrible habit of destroying plans.The customer has the right to an overall plan, to see progress and to be informed of schedule changes, Whereas the developer has the right to make and update his own estimates and to accept responsibility instead of having responsibility assigned to him.

You can’t put 10 pounds of groceries into a 5 pound bag. Forecasting tomorrow’s weather is much more difficult than telling what the weather was like yesterday.
Don’t try to be too sophisticated; estimates will never be anything other than approximate, however hard you try. Most software is not a predictable or mass manufacturing problem. Software development is new product development.

It is rarely possible to create upfront unchanging and detailed specs.Near the beginning, it is not possible to estimate. As empirical data emerge, it becomes increasingly possible to plan and estimate.Adaptive steps driven by build-feedback cycles are required. Creative adaptation to unpredictable change is the norm. Change rates are high.

We can often spend a little time thinking about an estimate and come up with a number that is nearly as good as if we had spent a lot of time thinking about it. we often need to expend just a fraction of that effort to get adequate results.

As effort on estimation increases, the accuracy may decrease after a certain amount of effort on estimation.

Vary the effort you put into estimating according to purpose of the estimate. if  the estimate will be used to make a software build versus buy decision, it is likely enough to determine that the project will take six to twelve months. It may be unnecessary to refine that to the point where you can say it will take seven or eight months.

First, no matter how much effort is invested, the estimate is never at the top of the accuracy that is 90 % to 100 %  accurate.No matter how much effort you put into an estimate, an estimate is still an estimate. No amount of additional effort will make an estimate perfect. It is possible to put too much effort into estimating, with the result being a less accurate estimate.

Agile teams, acknowledge that we cannot eliminate uncertainty from estimates, but they embrace the idea that small efforts are rewarded with big gains. Even though they are less far up the accuracy/effort scale, agile teams can produce more reliable plans because they frequently deliver small increments of fully working, tested, integrated code.

 

Contrasting Traditional and Agile Estimation Techniques

An average software project begins when a team or person outlines a project and receives approval to go forward. The project may be started by a product manager with an idea for an existing product, or by a customer request, or by the signing of a contract.

In the early stages of a project, someone guesses how long it will take to deliver. This person may be a salesperson, project manager, or development manager. They may make a guess based on their experience, or they may have some quick chats with seasoned employees and solicit their opinions.

When the timeline guess is in place, the project begins. If the project is related to a product, there may be marketing requirements to reference. If the project is for a customer, there may be a statement of work to reference. In either case, it’s common for an analyst team to convert the information into functional specifications.

After the functional specifications are completed, a conversation begins with the development team, designs begin to evolve, and some teams may document a technical design and architectural plan. When this work is complete, the development team provides estimates based on the anticipated approach. The team also estimates their capacity by resource type. Then the estimates, capacity, and known dependencies are entered into a project plan. At this point, the team has a schedule that they feel confident in, and they share it with the stakeholders.

This exercise may take several weeks or months to complete. If a project is timeboxed, the team may find that there isn’t enough time to deliver all the features for which they created functional specifications, designs, and estimates. The team then has to scope back the features for the project to meet the timeline, realizing they’ve wasted valuable time in estimating features that won’t be pursued.

Agile estimation techniques address the shortcomings of this method. You don’t design and estimate all your features until there has been a level of prioritization and you’re sure the features are needed. You used a phased approach to estimation, recognizing that you can be more certain as the project progresses and you learn more about the features.

At a high level, the phased process looks like this:

  1. Estimate the features in a short, time-boxed exercise during which you estimate feature size, not duration.
  2. Use feature size to assign features to iterations and create a release plan.
  3. Break down the features you assigned to the first iteration. Breaking down means identifying the specific tasks needed to build the features and estimating the hours required.
  4. Re-estimate on a daily basis during an iteration, estimating the time remaining on open tasks.

Agile estimating is also different in that you involve the entire team in the estimation process.

Whole Team Estimation

Every year, Best Buy Corporation tries to predict how many gift cards will be sold at Christmas. The typical process is to solicit the opinion of upper management and internal estimation experts to forecast a number.

In 2005, the CEO of Best Buy decided to try an experiment. The CEO followed the normal process for obtaining the estimates but also sent an email to approximately 100 random employees throughout the company, asking them how many gift cards they believed would be sold. The only information provided to both groups was the sales number for the previous year.

After the Christmas season was completed, the predictions of both groups were reviewed. The expert panel was accurate within 95 percent of the actual number of cards sold. The random group of employees was accurate within 99.9 percent of the number of cards sold . How did a random group beat the internal estimation experts?

In his book The Wisdom of Crowds, author James Surowiecki makes a case that a diverse set of independently thinking individuals can provide better predictions than a group of experts. Surowiecki qualifies this assertion by stating that the diversity needs to be in the way a group views problems and the heuristics each individual uses to analyze a problem or question. For example, a person’s age can greatly influence their perspective on an issue.

 Best Buy Corporation realized improved estimation accuracy by querying a large, diverse group of employees. The diverse set of employees consistently delivered better estimates than the in-house estimation experts.

Referenced Screen

Surowiecki’s work draws many parallels to the issues with estimating software development. We often get together a group of specialists or experts to estimate the work that needs to be completed. These experts may be managers or leads who facilitate the work of their various teams. The fact that all the experts may be a part of management limits their diversity in opinion. And the fact that these experts may work together frequently may lead to standardized thinking, also known as groupthink.

In an Agile environment, you increase the accuracy of your feature estimates by estimating the features together as a team. Estimates aren’t limited to managers or leads but also include developers, testers, analysts, DBAs, and architects. The features are viewed from various perspectives, and you merge these perspectives to create a common, agreed-on estimate.

Entire-team estimation has additional benefits beyond diverse opinion. First, you get estimates from people who are closer to the work. Team members’ opinions may be diverse, but they provide better estimates because they know your existing code, architecture, and domains and what it takes to deliver in your environment.

A second benefit is team ownership of the estimate. If a manager provides the estimate, they hope the team supports the estimate and buys into it. If the team provides the estimate, they’re immediately closer to owning the estimate, and they feel more responsible for making the dates they provided.

Moving to team-based estimation isn’t easy. Managers may not welcome additional input, and team members may be reluctant to challenge the experts and instead echo whatever the experts say.

It will take time to overcome these hurdles, but you can do one thing to expedite the change: when you perform team-based estimation, have the meeting facilitated by an indirect manager such as a project manager or ScrumMaster. This person can treat all people as equals regardless of title and proactively query team members who are reluctant to contribute. You can also use the planning poker process discussed in the next section to prevent one person’s estimate from influencing another’s.

Estimates are not created by a single individual on the team. Agile teams do not rely on a single expert to estimate. Despite well-known evidence that estimates prepared by those who will do the work are better than estimates prepared by anyone else (Lederer and Prasad 1992), estimates are best derived collaboratively by the team, which includes those who will do the work. There are two reasons for this.

Size of story is given in “story points” (an abstract unit). The team defines how  a story point translates to effort (typically: 1 story point = 1 ideal day of work). ƒ The number of story points that a team can deliver in an iteration is called  “team velocity”.

Agile Estimation

There are three main concepts  team need to understand to do agile estimation,

  • Estimation of Size gives a high-level estimate for the work item, typically measured using a neutral unit such as story points
  • Velocity tells us how many points this project team can deliver within an iteration;
  • Estimation of Effort translates the size (measured in points) to a detailed estimate of effort typically using the units of Actual Days or Actual Hours. The estimation of effort indicates how long it will take the team member(s) to complete the assigned work item(s).

Estimation of Size

Story Points is a relative measure that can be used for agile estimation of size. The team decides how big a story point is, and based on that size, determines how many story points each work item is. To make estimation go fast, use only full story points, 1, 2, 3, 5, 8, and so on, rather than fractions of a point, such 0.25, or 1.65 story points. To get started, look at 10 or so representative work items, give the smallest the size of one story point, and then go through all other work items and give them a relative story point estimate based on that story point. Note that story points are used for high-level estimates, so do not spend too much time on any one item. This is especially true for work items of lower priority, to avoid wasting effort on things that are unlikely to be addressed within the current iteration.

A key benefit of story points is that they are neutral and relative. Let’s say that Ann is 3 times more productive than Jack. If Ann and Jack agree that work item A is worth 1 story point, and they both think work item B is roughly 5 times as big, they can rapidly agree that work item B is worth 5 points. Ann may however think work item B can be done in 12 hours, while Jack thinks it can be done in 36 hours. That is fine, they may disagree about the actual effort required to do it, but we do not care at this point in time, we only want the team to agree on the relative size. We will later use Velocity to determine how much ‘size’, or how many points, the team can take on within an iteration.

One project team may say that a work item of a certain size is worth 1 point. Another project team would estimate the same sized work item to be worth 5 points. That is fine, as long as you are consistent within the same project. Make sure that the entire team is involved in assessing size, or at least that the same people are involved in all your size estimates, to ensure consistency within your project. We will see how the concept of velocity will fix also this discrepancy in a point meaning different things to different project teams.

You can also use other measures of size, where the most common alternative is Ideal Days.

Velocity

Velocity is a key metric used for iteration planning. It indicates how many points are delivered upon within an iteration for a certain team and project. As an example, a team planned to accomplish 20 points in the first iteration. At the end of the iteration, they noticed that they only delivered upon 14 points, their velocity was hence 14. For the next iteration, they may plan for fewer points, let’s say 18 points, since they think they can do a little better than in previous iteration. In this iteration, they delivered 17 points, giving them a velocity of 17.

Expect the velocity to change from iteration to iteration. Some iterations go smoother than others, and points are not always identical in terms of effort. Some team members are more effective than others, and some problems end up being harder than others. Also, changes to the team structure, learning new skills, changes to the tool environment, better teaming, or more overhead with meetings or tasks external to the project will all impact velocity. In general, velocity typically increases during the project as the team builds skills and becomes more cohesive.

Velocity compensates for differences between teams in terms of how big a point is. Let’s assume that project team Alpha and project team Beta are equally efficient in developing software, and they run the same project in parallel. Team Alpha, however, assesses all work items as being worth 3 times as many points as team Beta’s estimates. Team Alpha assesses work item A, B, C, and D to correspond to 30 points, and team Beta estimates the same work items to correspond to 10 points. Both teams deliver upon those 4 work items in the next iteration, giving team Alpha a velocity of 30, and team Beta a velocity of 10. It may sound as if team Alpha is more effective, but let’s look at what happens when they plan the next iteration. They both want to take on work item E-H, which team Alpha has estimated to be 30 points, and team Beta as normal has estimated to be 1/3 as many points, or 10 points. Since a team can typically take on as many points as indicated by their velocity, they can both take on all of E-H. The end result is that it does not matter how big a point is, as long as you are consistent within your team.

Velocity also averages out the efficiency of different team members. Let’s look at an example; Let’s assume that Ann always works 3 times as fast as Jack and Jane. Ann will perhaps deliver 9 points per iteration, and Jack and Jane 3 points each per iteration. The velocity of that 3-person team will be 15 points. As mentioned above, Ann and Jack may not agree on how much effort is associated with a work item, but they can agree on how many points it is worth. Since the team velocity is 15, the velocity will automatically translate the point estimate to how much work can be taken on. As you switch team members, or as team members become more or less efficient, your velocity will change, and you can hence take on more or less points. This does however not require you to change the estimate of the size. The size is still the same, and the velocity will help you to calculate how much size you can deliver upon with the team at hand for that iteration.

Estimation of Effort

Estimation of Effort translates the size (measured in points) to a detailed estimate of effort typically using the units of Actual Days or Actual Hours. As you plan an iteration, you will take on a work item, such as detail, design, implement and test a scenario, which may be sized to 5 points. Since this is still a reasonably big work item, break it down into a number of smaller work items, such as 4 separate work items for Detailing, Designing, Implementing and Testing Server portion, and Implementing and Testing Client portion of the scenario. Team members are asked to sign up for the tasks, and then detail the estimate of the actual effort, measured in hours or days, for their tasks. In this case, the following actual estimates were done (with person responsible within parenthesis):

  • Detailing scenario (Ann): 4 hours
  • Designing scenario (Ann and Jack):  6 hours
  • Implementing and Testing Server portion of scenario (Jack): 22 hours
  • Implementing and Testing Client portion of scenario (Ann): 12 hours
  • Total Effort Estimate for Scenario: 44 hours

If other people would be assigned to the tasks, the estimated actual hours could be quite different. There is hence no point doing detailed estimates until you know who will do the work, and what actual problems you will run into. Often, some level of analysis and design of the work item needs to take place before a reasonable estimate can be done. Remember that estimates are still estimates, and a person assigned to a task should feel free (and be encouraged) to re-estimate the effort required to complete the task, so we have a realistic view of progress within an iteration.

First, on an agile project we tend not to know specifically who will perform a given task. Yes, we may all suspect that the team’s database guru will be the one to do the complex stored procedure task that has been identified. However, there’s no guarantee that this will be the case. S/he may be busy when the time comes, and someone else will work on it. So because anyone may work on anything, it is important that everyone have input into the estimate. Second, even though we may expect the database guru to do the work, others may have something to say about her estimate. Suppose that the team’s database guru, Kristy, estimates a particular user story as three ideal days. Someone else on the project may not know enough to program the feature himself, but he may know enough to say, “Kristy, you’re nuts; the last time you worked on a feature like that, it took a lot longer. I think you’re forgetting how hard it was last time.” At that point Kristy may offer a good explanation of why it’s different this time. However, more often than not she will acknowledge that she was indeed underestimating the feature.

The Estimation Scale Studies have shown that we are best at estimating things that fall within one order of magnitude (Miranda 2001; Saaty 1996). Within your town, you should be able to estimate reasonably well the relative distances to things like the nearest grocery store, the nearest restaurant, and the nearest library. The library may be twice as far as the restaurant, for example.  Because we are best within a single order of magnitude, we would like to have most of our estimates in such a range. Two estimation scales are of good success

which are

◆ 1, 2, 3, 5, and 8

◆ 1, 2, 4, and 8

There’s a logic behind each of these sequences. The first is the Fibonacci sequence. Agilists found this to be a very useful estimation sequence because the gaps in the sequence become appropriately larger as the numbers increase. A one- point gap from 1 to 2 and from 2 to 3 seems appropriate, just as the gaps from 3 to 5 and from 5 to 8 do. The second sequence is spaced such that each number is twice the number that precedes it. These nonlinear sequences work well because they reflect the greater uncertainty associated with estimates for larger units of work. Either sequence works well, although  preference is for the first. Each of these numbers should be thought of as a bucket into which items of the appropriate size are poured.

Rather than thinking of work as water being poured into the buckets, think of the work as sand. If you are estimating using 1, 2, 3, 5, and 8, and have a story that you think is just the slightest bit bigger than the other five-point stories you’ve estimated, it would be OK to put it into the five-point bucket.

A story you think is a 7, however, clearly would not fit in the five-point bucket.You may want to consider including 0 as a valid number within your estimation range. Although it’s unlikely that a team will encounter many user stories or features that truly take no work, including 0 is often useful. There are two reasons for this. First, if we want to keep all features within a 10x range, assigning nonzero values to tiny features will limit the size of largest features. Second, if the work truly is closer to 0 than 1, the team may not want the completion of the feature to contribute to its velocity calculations. If the team earns one point in this iteration for something truly trivial, in the next iteration their velocity will either drop by one or they’ll have to earn that point by doing work that may not be as trivial. If the team does elect to include 0 in their estimation scale, everyone involved in the project (especially the product owner) needs to understand that 13 × 0 ≠ 0 .

 Agilists never had the slightest problem explaining this to product owners, who realize that a 0-point story is the equivalent of a free lunch. However, they also realize there’s a limit to the number of free lunches they can get in a single iteration. An alternative to using 0 is to group very small stories and estimate them as a single unit. Some teams prefer to work with larger numbers, such as 10, 20, 30, 50, and 100. This is fine, because these are also within a single order of magnitude. However, if you go with larger numbers, such as 10 to 100, Agilists still recommend that you pre-identify the numbers you will use within that range. Do not, for example, allow one story to be estimated at 66 story points or ideal days and another story to be estimated at 67. That is a false level of precision, and we cannot discern a 1.5% difference in size. It’s acceptable to have one-point differences be-tween values such as 1, 2, and 3. As percentages, those differences are much larger than between 66 and 67.

User Stories, Epics, and Themes

3-tier story sizing estimation is a great approach when managing work from the Portfolio to Program to Project level. Corporate initiatives (investment themes) are implemented by Features. These are coarse grain, high level items used for Product Road Mapping and the Story Point sizing is a very rough estimate of the total effort. As Features get broken down into Epics and then stories, each progressive refinement results in more granular Story Points estimates that can be used for Release and Sprint Planning.

Although in general, we want to estimate user stories whose sizes are within one order of magnitude, this cannot always be the case. If we are to estimate every-thing within one order of magnitude, it would mean writing all stories at a fairly fine-grained level.

For features that we’re not sure we want (a preliminary cost estimate is desired before too much investment is put into them) or for features that may not happen in the near future, it is often desirable to write one much larger user story.

A large user story is sometimes called an epic. Additionally, a set of related user stories may be combined (usually by a paper clip if working with note cards) and treated as a single entity for either estimating or release planning. Such a set of user stories is referred to as a theme.

An epic, by its very size alone, is often a theme on its own. By aggregating some stories into themes and writing some stories as epics, a team is able to reduce the effort they’ll spend on estimating. However, it’s important that they realize that estimates of themes and epics will be more uncertain than estimates of the more specific, smaller user stories.

User stories that will be worked on in the near future (the next few iterations) need to be small enough that they can be completed in a single iteration. These items should be estimated within one order of magnitude. Agilists use the sequence 1, 2, 3, 5, and 8 for this. User stories or other items that are likely to be more distant than a few iterations can be left as epics or themes. These items can be estimated in units beyond the 1 to 8 range. To accommodate estimating these larger items Agilists add 13, 20, 40, and 100 to preferred sequence of 1, 2, 3, 5, and 8.

Deriving an Estimate The three most common techniques for estimating are

◆ Expert opinion

◆ Analogy

◆ Disaggregation

Each of these techniques may be used on its own, but the techniques should be combined for best results.

Expert Opinion

If you want to know how long something is likely to take, ask an expert. At least, that’s one approach. In an expert opinion-based approach to estimating, an expert is asked how long something will take or how big it will be. The expert relies on her intuition or gut feel and provides an estimate. This approach is less useful on agile projects than on traditional projects. On an agile project, estimates are assigned to user stories or other user-valued functionality. Developing this functionality is likely to require a variety of skills normally performed by more than one person. This makes it difficult to find suitable experts who can assess the effort across all disciplines. On a traditional project for which estimates are associated with tasks, this is not as significant of a problem, because each task is likely performed by one person. A nice benefit of estimating by expert opinion is that it usually doesn’t take very long. Typically, a developer reads a user story, perhaps asks a clarifying question or two, and then provides an estimate based on her intuition. There is even evidence that says this type of estimating is more accurate than other, more analytical approaches (Johnson et al. 2000).

Analogy

An alternative to expert opinion comes in the form of estimating by analogy, which is what we’re doing when we say, “This story is a little bigger than that story.” When estimating by analogy, the estimator compares the story being estimated with one or more other stories. If the story is twice the size, it is given an estimate twice as large. There is evidence that we are better at estimating relative size than we are at estimating absolute size (Lederer and Prasad 1998; Vicinanza et al. 1991). When estimating this way, you do not compare all stories against a single baseline or universal reference. Instead, you want to estimate each new story against an assortment of those that have already been estimated. This is referred to as triangulation. To triangulate, compare the story being estimated against a couple of other stories. To decide if a story should be estimated at five story points, see if it seems a little bigger than a story you estimated at three and a lit- tle smaller than a story you estimated at eight.

Disaggregation

Disaggregation refers to splitting a story or feature into smaller, easier-to-estimate pieces. If most of the user stories to be included in a project are in the range of two to five days to develop, it will be very difficult to estimate a single story that may be 100 days. Not only are large things notoriously more difficult to estimate, but also in this case there will be very few similar stories to compare. Asking “Is this story fifty times as hard as that story” is a very different question from “Is this story about one-and-a-half times that one?” The solution to this, of course, is to break the large story or feature into multiple smaller items and estimate those. However, you need to be careful not to go too far with this approach. Not only does the likelihood of forgetting a task increase if we disaggregate too far, but summing estimates of lots of small tasks also leads to problems.

Planning Poker

The best way Agilists have found for agile teams to estimate is by playing planning poker (Grenning 2002). Planning poker combines expert opinion, analogy, and disaggregation into an enjoyable approach to estimating that results in quick but reliable estimates. Participants in planning poker include all of the developers on the team. Remember that developers refers to all programmers, testers, database engineers, analysts, user interaction designers, and so on. On an agile project, this will typically not exceed ten people. If it does, it is usually best to split into two teams. Each team can then estimate independently, which will keep the size down. The product owner participates in planning poker but does not estimate. At the start of planning poker, each estimator is given a deck of cards. Each card has written on it one of the valid estimates.

Each estimator may, for example, be given a deck of cards that reads 0, 1, 2, 3, 5, 8, 13, 20, 40, and 100. The cards should be prepared prior to the planning poker meeting, and the numbers should be large enough to see across a table. Cards can be saved and used for the next planning poker session.

For each user story or theme to be estimated, a moderator reads the description. The moderator is usually the product owner or an analyst. However, the moderator can be anyone, as there is no special privilege associated with the role. The product owner answers any questions that the estimators have. The goal in planning poker is not to derive an estimate that will withstand all future scrutiny.

Rather, the goal is to be somewhere well on the left of the effort line, where a valuable estimate can be arrived at cheaply. After all questions are answered, each estimator privately selects a card representing his or her estimate. Cards are not shown until each estimator has made a selection. At that time, all cards are simultaneously turned over and shown so that all participants can see each estimate. It is very likely at this point that the estimates will differ significantly. This is actually good news. If estimates differ, the high and low estimators explain their estimates. It’s important that this does not come across as attacking those estimators. Instead, you want to learn what they were thinking about.

As an example, the high estimator may say, “Well, to test this story, we need to create a mock database object. That might take us a day. Also, I’m not sure if our standard compression algorithm will work, and we may need to write one that is more memory efficient.” The low estimator may respond, “I was thinking we’d store that information in an XML file—that would be easier than a database for us. Also, I didn’t think about having more data—maybe that will be a problem.”

The group can discuss the story and their estimates for a few more minutes. The moderator can take any notes she thinks will be helpful when this story is being programmed and tested. After the discussion, each estimator re-estimates by selecting a card. Cards are once again kept private until everyone has estimated, at which point they are turned over at the same time. In many cases, the estimates will already converge by the second round. But if they have not, continue to repeat the process. The goal is for the estimators to converge on a single estimate that can be used for the story. It rarely takes more than three rounds, but continue the process as long as estimates are moving closer together. It isn’t necessary that everyone in the room turns over a card with exactly the same estimate written down.

Again, the point is not absolute precision but reasonableness.

Smaller Sessions 

It is possible to play planning poker with a subset of the team, rather than involving everyone. This isn’t ideal but may be a reasonable option, especially if there are many, many items to be estimated, as can happen at the start of a new project. The best way to do this is to split the larger team into two or three smaller teams, each of which must have at least three estimators. It is important that each of the teams estimates consistently.

What your team calls three story points or ideal days had better be consistent with what other team calls the same. To achieve this, start all teams together in a joint planning poker session for an hour or so. Have them estimate ten to twenty stories. Then make sure each team has a copy of these stories and their estimates and that they use them as base-lines for estimating the stories they are given to estimate.

When to Play Planning Poker

Teams will need to play planning poker at two different times. First, there will usually be an effort to estimate a large number of items before the project officially begins or during its first iterations.

Estimating an initial set of user stories may take a team two or three meetings of from one to three hours each. Naturally, this will depend on how many items there are to estimate, the size of the team, and the product owner’s ability to clarify the requirements succinctly. Second, teams will need to put forth some ongoing effort to estimate any new stories that are identified during an iteration. One way to do this is to plan to hold a very short estimation meeting near the end of each iteration. Normally, this is quite sufficient for estimating any work that came in during the iteration, and it allows new work to be considered in the prioritization of the coming iteration.

Alternatively, Kent Beck suggests hanging an envelope on the wall with all new stories placed in the envelope. As individuals have a few spare minutes, they will grab a story or two from the envelope and estimate them. Teams will establish a rule for themselves, typically that all stories must be estimated by the end of the day or by the end of the iteration.

Agilists like the idea of hanging an envelope on the wall to contain unestimated stories. However, they prefer that when someone has a few spare minutes to devote to estimating, he find at least one other person and that they estimate jointly.

Why Planning Poker Works

it’s worth spending a moment on some of the reasons why it works so well. First, planning poker brings together multiple expert opinions to do the estimating. Because these experts form a cross-functional team from all disciplines on a software project, they are better suited to the estimation task than anyone else.

After completing a thorough review of the literature on software estimation, Jørgensen (2004) concluded that “the people most competent in solving the task should estimate it.” Second, a lively dialogue ensues during planning poker, and estimators are called upon by their peers to justify their estimates. This has been found to improve the accuracy of the estimate, especially on items with large amounts of un-certainty (Hagafors and Brehmer 1983).

Being asked to justify estimates has also been shown to result in estimates that better compensate for missing information (Brenner et al. 1996). This is important on an agile project because the user stories being estimated are often intentionally vague. Third, studies have shown that averaging individual estimates leads to better results (Hoest and Wohlin 1998) as do group discussions of estimates (Jørgensen and Moløkken 2002). Group discussion is the basis of planning poker, and those discussions lead to an averaging of sorts of the individual estimates.

Finally, planning poker works because it’s fun.Expending more time and effort to arrive at an estimate does not necessarily increase the accuracy of the estimate. The amount of effort put into an estimate should be determined by the purpose of that estimate. Although it is well known that the best estimates are given by those who will do the work, on an agile team we do not know in advance who will do the work.

Therefore, estimating should be a collaborative activity for the team. Estimates should be on a predefined scale. Features that will be worked on in the near future and that need fairly reliable estimates should be made small enough that they can be estimated on a nonlinear scale from 1 to 10 such as 1, 2, 3, 5, and 8 or 1, 2, 4, and 8. Larger features that will most likely not be implemented in the next few iterations can be left larger and estimated in units such as 13, 20, 40, and 100. Some teams choose to include 0 in their estimation scale. To arrive at an estimate, we rely on expert opinion, analogy, and disaggregation.

A fun and effective way of combining these is planning poker. In planning poker, each estimator is given a deck of cards with a valid estimate shown on each card. A feature is discussed, and each estimator selects the card that represents his or her estimate. All cards are shown at the same time. The estimates are discussed and the process repeated until agreement on the estimate is reached.

Planning for Iterations

During project planning, iterations are identified, but the estimates have an acceptable uncertainty due to the lack of detail at the project inception. This task is repeated for each iteration within a release. It allows the team to increase the accuracy of the estimates for one iteration, as more detail is known along the project.

Ensure that the team commits to a reasonable amount of work for the iteration, based on team performance from previous iterations. Prioritize the work items list before you plan the next iteration. Consider what has changed since the last iteration plan (such as new change requests, shifting priorities of your stakeholders, or new risks that have been encountered).

When the team has decided to take on a work item, it will assign the work to one or several team members. Ideally, this is done by team members signing up to do the work, since this makes people motivated and committed to doing the job. However, based on your culture, you may instead assign the work to team members.

Wall Estimation

Planning poker is a fantastic tool for estimating user stories, but it would take an inordinate amount of time to estimate hundreds of stories, one at a time, using planning poker. If you have a raw backlog filled with hundreds of stories that have not been estimated or prioritized, you’re going to need a faster way to estimate.

Wall Estimation is designed to allow teams to eliminate discussions of 2 versus 3 and 5 versus 8 and instead group things in a purely relative manner along a continuum, at least initially. It also allows stakeholders to give a general prioritization to a large group of stories without getting hung up on whether one story is slightly more important than another.

Example Wall Estimation - Relative Sort

To do Wall Estimation, you must first print your user stories on cards. Then gather your team and stakeholders in a room with a big empty wall (about 14 feet long by 8-10 feet high); Understand two things about the wall:

  • Height determines priority. Stories at the top are higher; stories at the bottom are lower. A story’s priority can be based on ROI, business value, or something as simple as “it’s just important, and I don’t know why.”

  • Width is reserved for size. Stories on the left are smaller; stories on the right are bigger. (You can reverse this and move from right to left if you’re in, say, Japan and it’s more logical.) The important thing is to envision a line going horizontally and one going vertically. Team members and stakeholders should ask themselves, where, relative to the other stories, does this one fit?

The team will use the wall to size all of the stories. The stakeholders will use the wall to prioritize stories. As with planning poker, we’re using relative sizing, but instead of using two reference stories for comparison, the wall becomes the constant. Small story? Move to the left. Big story? Move to the right. Important story? Place it high. A story that we can live without for now? Place it low.

Although the stakeholders do not have to be there while the stories are being estimated, the team does need to be in the room while the stories are being prioritized. The ScrumMaster and product owner must attend both the estimation and prioritization activities.

Prioritization

Although  customers and stakeholders will want to know how big a story is to help them determine its priority, they’ll be much more focused on finding the stories that relate to them and making sure those stories get done. Expect your stakeholders to disagree about priority—your product owner will use this information to help decide the ultimate priority.

Ask the stakeholders to help determine the relative priority of all of these stories by moving these stories up or down inside the taped columns. Remind them that the higher up a story is on the wall, the higher its priority to the business. Set the following rules:

  • If you place a story at the top, be prepared to justify the placement.
  • You may ask each other why one story is more important than another. Feel free to ask each other, “Who moved this one down (or up)?” or to say aloud, “I think this one needs to move. Who wants to disagree?” This enables conversation between the interested parties, without facilitation.
  • If you move a story lower on the wall than someone else did, mark it with a colored dot to alert us.

The biggest benefit to prioritizing as a group is that all the stakeholders can better understand the priorities of various stories. If a discussion goes on too long without resolution, the product owner should collect the card, identify the two stakeholders who cannot agree, and make a note to meet with them privately later.

The exercise could take 2-6 hours, depending on the number of stories and the number of stakeholders. When you are finished, the wall will look something like the picture shown below.

Example Wall Estimation - Priority SortYour wall will break down roughly into four quadrants. The stories in the top left are high priority and small so they’ll end up in the top of the product backlog. The stories in the top right are high priority but are also large. These stories should be broken down soon so they can be brought into upcoming sprints.

Wall Estimation - Four QuadrantsThe lower left quadrant is made up of small stories that are lower in priority. They will likely fall to the bottom of the backlog. The lower right quadrant is filled with large stories that are also lower in priority. These stories are your epics or themes. They’ll eventually need to be broken down into smaller, more manageable stories but not until they rise in priority.

Spend some time looking at the wall as a whole with the group. If a story is in the wrong quadrant, move it. If a high-priority story must be broken down and time allows, do it while everyone is in the room.

At the end of wall estimation, you’ll have the start of a release plan. If you know the team’s historical velocity, you can even supply a rough range of which stories in the upper-left quadrant will be finished.

Estimation is hard because there is so much uncertainty at the beginning of a project. Product Owners and agile project managers try to maximize value early learning by having conversations with their product owners and stakeholders, producing working software, and integrating feedback about that software to get to a releasable state. But even agile projects must provide some estimate of when a set of features will be ready for release.

Quadrus Estimation Methodology

Quadrus is a recognized leader in IT professional services and solutions. Headquartered in Calgary, Alberta, Quadrus has delivered hundreds of successful projects across Western Canada since 1993. They are committed to providing the highest quality service to valued clients.

Quadrus has developed the Quadrus Estimation Methodology which is different for a few key reasons:
QEM recognizes that way in which people estimate – and when forming overall project estimates, QEM employs Monte-Carlo simulation to aggregate individual estimates in a statistically correct  way.
QEM takes into account those project tasks that are often left out of  project estimates. Activities such as requirements clarification, task  management and coordination, meetings, demos, testing and deployment are all included in the project estimate.
QEM recognizes that often not all project requirements are known  up front. QEM provides a mechanism for estimating the percentage of  known requirements (versus the percentage of unknown requirements), and the overall project estimate is appropriate scaled to  reflect the reality of unknown requirements.
QEM recognizes that most people naturally create single-point estimates for tasks or stories – and QEM is able to work with these (less-than-perfect) estimates. The input to QEM consists, quite simply, of the single-point estimate that the developer feels is the most intuitive estimate (the median) together with an uncertainty factor (Low, Medium or High) to indicate the range of the distribution curve to use. If a task seems to have many unknowns or perhaps uses new technologies – or if it may have an element of research or invention – then the uncertainty will be higher than a task that is known and recognized (lower uncertainty).
This uncertainty is very important; recognizing it exists can drastically change how much time we should assume a task can take. One story may have a lower numeric estimate than another but with a greater level of uncertainty could actually end up taking more time.
These inputs are not onerous to produce compared to the inputs demanded by other estimation methodologies that have been tried in the past (e.g., feature points, estimated lines of code, etc.). Together the single-point estimate/uncertainty pair provides enough information to calculate an expected average estimate for the individual story together with a range.

Quadrus has codified the Quadrus Estimation Methodology as a modern AJAX-powered, intuitive and easy to use web application named the Quadrus Estimator. The Quadrus Estimator has many features, such as support for the user-story importance to be ranked using a star-rating and enabling stories to be labeled. The star-rating and labels can be used to manage and filter the list of tasks and also to include or exclude stories from an estimate for “what-if” scenarios based on less important stories or certain features being postponed for future phases.
The simulation also accepts additional inputs to allow additional effort to be counted for which is often overlooked when creating estimates. This is in the form of “Story Glue” to represent non-development time that is still project related and must be accounted for (e.g., meetings, writing reports, demos, etc.).

The Quadrus Estimator calculates the total effort to complete the project in man-months. Along with the effort estimate, a combination of proven industry guidelines and research is applied to identify the ideal balance between resources (people) and duration (timescale). While either can be adjusted in the Estimator, the system shows the cost of favoring the other in terms of additional effort (overall cost), more resources (for faster delivery), or longer timescales (for team-size constraints).

PEARL VII : Agile metrics measurement for Software Quality

PEARL VII :

The metrics for measuring software quality in an agile environment are drastically different from that of those used in traditional IT landscapes.  How to effectively measure Agile S/w development  Quality in Agile Enterprise using Agile Metrics  is dealt here.

 

Agile software development grew in part from the intersection of  erstwhile alternative practices that predated it. And even more  derivatives had sprung up throughout the years. This had led to  the evolution of Agile in its current state. Wherein, all the defined  practices becomes a catalog from which development teams could choose from, and adapt what is appropriate for their case. This evolution then leads to the fact that development teams will  not easily have uniform processes. Forcing uniformity will most  likely negate the benefits of Agile. That said, most metrics are  intimately tied to actual practices, but measurements should be  able to cope with this perceived variability

The traditional metrics are also in conflict with agile’s and  lean’s principles. For example, a focus on adherence to  estimates is incompatible with agile’s principle of embracing  change. It will lead to chasing obstacles, instead of seizing  opportunities.

 In Agile Management for Software Engineering: Applying the Theory of Constraints for Business Results, David Anderson combines TOC with Agile software development, with the objective of creating a process that “scales in scope and discipline to be acceptable in the boardrooms of the Fortune 1000″ . Anderson compares traditional “waterfall”, FDD, XP and RAD approaches, and proposes a rigorous metrics approach. 

Anderson provides a convincing argument for the traditional metrics inability to measure agile software  development . By demonstrating how they violate Reinertsen’s criteria for a good metric. Traditional metrics do not meet the criterion of being relevant, because of the high cost focus. The cost should not be the  main concern.  Moreover, they  elude the requirements of being simple and easy to collect. For  example, the once popular traditional metric to count the lines  of code has no simple correlation with the actual effort. The  software complexity results in a nonlinear function between the effort and the lines of code. It also motivate to squeeze in

It is impossible to create a metric set that would suit all agile projects. Every project has different goals and needs, and, as  the incremental and emergent nature of agile methods
[Mnkandla and28 Dwolatzky, 2007] implies that metrics – as part of the framework of development technologies – should also be allowed to emerge. There are, however, methods for choosing suitable metrics.

When measuring the production side of the development it is important to select metrics that support and reflect the financial counterparts.  The most commonly recommended agile production metrics, which are described in the following sections.

Total project duration: Agile projects get done quicker than traditional projects. By starting development sooner and cutting out bloatware — unnecessary requirements — agile project teams can deliver products quicker. Measure total project duration to help demonstrate efficiency

Time to market: Time to market is the amount of time an agile project takes to provide value, either through internal use or by generating income, by releasing working products and features to users.

Time to market is especially important for companies with revenue-generating products, because it aids in budgeting throughout the year. It’s also very important if you have a self-funding project— a project being paid for by the income from the product.

Total project cost: Cost on agile projects is directly related to duration. Because agile projects are faster than traditional projects, they can also cost less. Organizations can use project cost metrics to plan budgets, determine return on investment, and know when to exercise capital redeployment.

Return on investment: Return on investment (ROI) is income generated by the product, less project costs: money in versus money out. On agile projects, ROI is fundamentally different than it is on traditional projects. Agile projects have the potential to generate income with the very first release and can increase revenue with each new release.

New requests within ROI budgets: Agile projects’ ability to quickly generate high ROI provides organizations with a unique way to fund additional product development. New product features may translate to higher product income. If a project is already generating income, it can make sense for an organization to roll that income back into new development and see higher revenue.

Capital redeployment: On an agile project, when the cost of future development is higher than the value of that future development, it’s time for the project to end. The organization may then use the remaining budget from the old project to start a new, more valuable project.

Team member turnover: Agile projects tend to have higher morale. One way of quantifying morale is by measuring turnover through a couple metrics:

  • Scrum team turnover: Low scrum team turnover can be one sign of a healthy team environment. High scrum team turnover can indicate problems with the project, the organization, the work, individual scrum team members, burnout, ineffective product owners forcing development team commitments, personality incompatibility, a scrum master who fails to remove impediments, or overall team dynamics.

  • Company turnover: High company turnover, even if it doesn’t include the scrum team, can affect morale and effectiveness. High company turnover can be a sign of problems within the organization. As a company adopts agile practices, it may see turnover decrease.

Requirements and design quantification

Tom Gilb strongly believes that quantification of requirements is an essential concept missing from the agile paradigm, or even from software engineering in general. He claims that this lack is a risk for project failure, as software engineers and project managers cannot properly manage project results, control risks and costs, or prioritize tasks [Gilb and Cockburn, 2008]. Especially important are quality characteristics, because “functions and use cases are far less interesting” [Gilb and Cockburn, 2008] and “that is where most people have problems with quantification” [Gilb and Brodie, 2007]. Gilb assures that only numerically expressed quality goals are clear enough and therefore quantification is a step needed on the way from high level requirements to design ideas.
Design ideas also should be quantified [Gilb and Brodie, 2007]. It is done by the estimation of their value (to the customer – business value, not technical) and cost (effort). Then it is possible to identify the best designs – the ones with the highest value-to-cost ratio – and reprioritize the following development steps. A similar idea is described by Kile and Inampudi [2007]. The value and cost for implementing a requirement is estimated (without considering different design ideas) and the requirements are prioritized according to the result. This was a solution to a problem with the requirements prioritization ability of the customer and the development team. With some requirements having high value and high cost, and others having moderate value but very low cost, it was not easy to compare their desirability without quantification.

Metrics can also aid the application of agile practices. For example, in refactoring they can give information on the appropriate time for and the significance of a refactoring step [Kunz et al., 2008].

Heuristics for wise agile measurement [Hartmann and Dymond, 2006]

A good metric or diagnostic:

  1. Affirms and reinforces Lean and Agile principles.
  2. Measures outcome, not output.
  3. Follows trends, not numbers.
  4. Belongs to a small set of metrics and diagnostics.
  5. Is easy to collect.
  6. Reveals, rather than conceals, its context and significant variables.
  7. Provides fuel for meaningful conversation.
  8. Provides feedback on a frequent and regular basis.
  9. May measure Value (Product) or Process.
  10. Encourages “good-enough” quality.

Leffingwell argues that measurement should happen during iteration retrospectives and release retrospectives.

He proposes two categories of metrics: quantitative and qualitative. Quantitative metrics for an iteration consist of process metrics,  such  as  the  number  of  user   stories   accepted / rejected / rescheduled / added, and product metrics related to quality (defect count) and testing (number of test cases, percentage of automated test cases, unit test coverage).

Quantitative metrics for a release measure the release progress with value delivery (number of features delivered to the customer and their total value expressed in feature points, feature debt – existing customer commitments), conformance to release date, and technical debt (number of refactoring targets and number of refactorings completed, also called architectural debt or design debt). The qualitative assessment for both iteration and release requires listing what went well and what did not, revealing what should be continued and what needs to be improved.

Kunz et al. [2008] propose to combine software measurement with refactoring in order to indicate when a refactoring step is necessary, how important it is, how it affects quality, and what side effects it has. Metrics would be used here as triggers for needed refactoring steps.

Categories of Metrics

Quality

  • Defect Count
  • Technical Debt
  • Faults-Slip-Through
  • Sprint Goal success rate

Predictability

  • Velocity
  • Running Automated Tests

Value

  • Customer Satisfaction Survey
  • Business Value Delivered

Lean

  • Lead Time
  • Work In Progress
  • Queues

Cost

  • Average Cost Per Functions

Commonly recommended agile production metrics.

Lean Metrics

The selection of production metrics must carefully consider what has been advised in the previous sections. Inventory based metrics possess all these characteristics and give the advantage of addressing the importance of flow,  The most significant inventory based metrics are summarized below

Lead time – Relates to the financial metric Throughput. The lead time should be as short and stable as possible. It reduces the risk that the requirements become outdated and provides predictability. The metric is supported by Poppendieck, who states that the most important to measure is the ―concept-to-cash -time together with financial metrics .

 Queues – In software development queue time is a large part of the lead time. In contrast to the lead time, queue metrics are leading indicators. Large queues indicate that the future lead time will be long, which enables preventive actions. By calculating the cost of delay of the items in the queues, precedence can be given to the most urgent ones.

 Work in Progress – Constraining the WIP in different phases is one of the best ways to prevent large queues. If used in combination with queue metrics, WIP constraints prevent dysfunctional behavior such as simply renaming the objects in queues to work in progress. The metric is also an indicator of how well the team collaborates . A low WIP shows that the team works together on the same tasks. In addition, the Kanban method, which is built around the idea of constraining the WIP promises that it will result in an overall better software development,

These metrics can be visualized in a cumulative flow diagram, By tracking the investment’s way along the value chain towards becoming throughput, the inventory based metrics correlates well with the financial metric Investment.

Cost Metrics

Anderson argues the only cost metric needed is Average Cost Per Function (ACPF) and should only be used to estimate future operation expenses

Business Value Metrics

Agile software development puts the focus on the delivery of business value. Methods such as Scrum prioritize the work by value, making it sensible to measure the business value. It has also been observed that the trend in the industry is to measure value .

Hartmann notes that agile methods encourage the development to be responsible for delivering value rapidly and that the core metric should oversee this accountability. The quick delivery of value means that the investment is converted into value producing software as soon as possible. Leading metrics of business value includes estimations and is not an exact science.

Mike Cohn offers a possible solution to measure the business value , which involves dividing the business case’s value between the tasks. The delivery of value can be displayed in a Business Value Burnup Chart, One way to verify the delivery of business value, is to ask the customer if the features are actually used . It has proved useful to survey the customer over the time of a release and is much in line with the agile principle of customer cooperation.

 Quality Metrics

Lean metrics can indicate the products’ quality and provide predictability. For example, large queues in the implementation phase indicate poor quality and a stable lead time contributes to predictability. However, it might be necessary to supplement and balance them with more specific metrics.

A quality metric recommended by the agile community is Technical Debt .

Technical debt is a metaphor referring to the consequences of taking shortcuts in the software development. For example, code written in haste that is in need of refactoring. The debt can be represented in financial figures, which makes the metric suitable to communicate to upper management .

Technical metric measures the team’s overall technical indebtedness; known problems and issues being delivered at the end of the sprint. This is usually counted using bugs but could also be deliverables such as training material, user documentation, delivery media, and other

The counting of defects can be used as a quality metric. The defect count may occur in various stages of the development.

Counting defects in individual iterations can have a fairly large variation and may paint a misleading picture . Another aspect of defects is where they have been introduced. The fault-slips-through metric measures the test efficiency by where the defects should have been found and where it actually was . It monitors how well the test process works and addresses the cost savings of finding defects early. In case studies on implementation of lean metrics, the faults-slip-through has been recommended as the quality metric of choice

The primary purpose of measuring Faults Slip Through is to make sure that  the test process finds the right faults in the right phase, i.e. commonly earlier. Fault Slip Through represents the number of faults not  detected in a certain activity. These faults have instead been  detected in a later activity

Sprint goal success rates: A successful sprint should have a working product feature that fulfills the sprint goals and meets the scrum team’s definition of done: developed, tested, integrated, and documented.

Throughout the project, the scrum team can track how frequently it succeeds in reaching the sprint goals and use success rates to see whether the team is maturing or needs to correct its course.

EVM Metrics

AgileEVMTerms

AgileEVMTerms

AgileEVMMetrics

AgileEVMMetrics

 Predictability Metrics

What many organizations hope to gain from the measurement is predictability . In several of the agile methods the velocity of delivered requirements is used to achieve predictability and estimate the delivery capacity. The average velocity can serve as a good predictability metric, but can easily be gamed if used for other purposes. For example, Velocity used to measure productivity can degrade the quality.

Velocity is a capacity planning tool sometimes used in Agile software development. Velocity tracking is the act of measuring said velocity. The velocity is calculated by counting the number of units of work completed in a certain interval, the length of which is determined at the start of the project

The main idea behind velocity is to help teams estimate how much work they can complete in a given time period based on how quickly similar work was previously completed.

The following terminology is used in velocity tracking.

Unit of work
The unit chosen by the team to measure velocity. This can either be a real unit like hours or days or an abstract unit like story points or ideal days. Each task in the software development process should then be valued in terms of the chosen unit.
Interval
The interval is the duration of each iteration in the software development process for which the velocity is measured. The length of an interval is determined by the team. Most often, the interval is a week, but it can be as long as a month.

To calculate velocity, a team first has to determine how many units of work each task is worth and the length of each interval. During development, the team has to keep track of completed tasks and, at the end of the interval, count the number of units of work completed during the interval. The team then writes down the calculated velocity in a chart or on a graph.

The first week provides little value, but is essential to provide a basis for comparison.Each week after that, the velocity tracking will provide better information as the team provides better estimates and becomes more used to the methodology.

Velocity chart

Velocity chart

ss_iteration_burnup_2

The Velocity Chart shows the amount of value delivered in each sprint, enabling you to predict the amount of work the team can get done in future sprints. It is useful during your sprint planning meetings, to help you decide how much work you can feasibly commit to.

You can estimate your team’s velocity based on the total Estimate (for all completed stories) for each recent sprint. This isn’t an exact science — looking at several sprints will help you to get a feel for the trend. For each sprint, the Velocity Chart shows the sum of the Estimates for complete and incomplete stories. Estimates can be based on story points, business value, hours, issue count, or any numeric field of your choice . Please note that the values for each issue are recorded at the time the sprint is started.

 Running Automated Tests measures the productivity by the size of the product . It counts test points defined as each step in every running automated test. The belief is that the number of tests written is better in proportion to the requirement’s size, than the traditional lines-of-code metric. The metric addresses the risk of neglected testing, which is usually associated with productivity metrics. It motivates to write tests and to design smaller, more adaptive tests. Moreover, it has proven to be a good indicator of the complexity and to some extent on the quality . For measuring release predictability, Dean Leffingwell proposes to measure the projected value of each feature relative to the actual . However, the goal should not be to achieve total adherence. Instead, the objective should be to stay within a range of compliance to plan, which allows for both predictability and capturing of opportunities.

Actual Stories Completed vs. Committed Stories

The measure is taken by comparing the number of stories committed to in sprint planning and the number of stories identified in the sprint review as completed.

Visualization

To get the full value of agile measurement, the metrics need to be acted upon. The visualization of the metrics helps to ensure that actions are taken and achieves transparency in the organization . The company’s strategies become communicated and the coordination increases.

Spider Chart

Spider Chart

In Kanban the visualization of the workflow is an important activity and facilitates self-organizational behavior . For example, when a bottleneck is shown the employees tend to work together to elevate the bottleneck. Both Kanban and Scrum use card walls to visualize the work flow where each card represents a task and its current location in the value chain. The inventory based metrics can then be collected using the card walls. A very effective way to visualize the inventory based metrics is cumulative flow diagrams .

The cumulative flow diagram is an area graph, which shows the workflow on a daily basis.

Cumulative Flow Diagram

Are you a manager or business stakeholder working on an Agile points are the unit of measure used by this team for estimation Scrum Project and facing the following issues?

You have a strong feeling that there are bottlenecks in the process but are facing a lot of difficulty in mapping it to the process.

You are not satisfied with the burn-up and burn-down charts produced by the team and are interested in getting more insight into the process.
There is a panacea to solve all of the above issues. The name of the magic potion is “cumulative flow diagram”. Cumulative flow diagrams (CFDs) applied on top of the basic principles of KAN (visual) + Ban (card) can give you an insight into the project and keep everyone updated. CFD can be an extremely powerful tool when applied to a Scrum model.

A Cumulative Flow Diagram (CFD) is an area chart that shows the various statuses of work items for a product, version, or sprint. The horizontal x-axis in a CFD indicates time, and the vertical y-axis indicates cards (issues). Each coloured area of the chart equates to a workflow status (i.e. a column on your board).

A CFD can be useful for identifying bottlenecks. If your chart contains an area that is widening vertically over time, the column that equates to the widening area will generally be a bottleneck.

Multi-color CFDS look complicated but pretty. The pain of understanding it is worth the gain you get from it. These diagrams can help you in making critical business decisions. They will help give you better visibility regarding the time to market dates for features. Applying it on top of a Scrum project will help you see an accurate picture of the progress on your project.
Giving CFD to this team will help them in following the “Inspect and Adapt” Scrum principle. They can further zoom into the work in progress to see various flow states. CFD will help you in analyzing the actual progress and bottlenecks in any project. CFD can be drawn using area charts in MS Excel.

Workflow for CFD

Workflow for CFD

Cumulative Flow Diagram

Cumulative Flow Diagram

There are many other usages of CFDs besides finding bottlenecks. CFDs are multi-utility graphs that continuously report the true status of an Agile project. CFDs can help in determining lead time, cycle time, size of backlog, WIP, and bottlenecks at any point in time
Lead time is the time from when the feature entered  backlog to its completion status. This is of utmost interest to business stakeholders. It can help business people decide about the time  to market for features. They can plan marketing campaigns based  on lead times.
Cycle time is the time taken by team from when they started work  on it to completion status. This helps the project leads make important decisions in selecting items for working.
WIP is the work currently lying in different stages of the software lifecycle. Cycle time is directly proportional to the work in progress items. Keeping a limit on WIP is a key to success in any Agile project. In Scrum we try to limit the WIP within a sprint, but limiting
it across the various flow states will help further in gaining better control and visibility in a sprint.
Controlling WIP is the mantra for victory in any project. With CFDs, WIP is no longer a black box and anyone can see the work distribution at any point in time. Thus CFDs provide better insight and the power for better governance in any Agile methodology.A single diagram can contain information about lead time, WIP, queues and bottlenecks.

In Scrum, the Burndown Chart is a standard artifact. It allows the teams to monitor its progress and trends. The Burndown Chart tracks completed stories and the estimated remaining work. There are also variations of the Burndown Chart . For example, the Burnup Chart contains information about scope changes. For even better predictability, story points may be used. The stories are assigned points by the estimated effort to implement them.

A Burndown Chart shows the actual and estimated amount of work to be done in a sprint. The horizontal x-axis in a Burndown Chart indicates time, and the vertical y-axis indicates cards (issues).

Iteration Burn Down chart

Iteration Burn Down chart

Use a Burndown Chart to track the total work remaining and to project the likelihood of achieving the sprint goal. By tracking the remaining work throughout the iteration, a team can manage its progress and respond accordingly.

To communicate the KPIs, many organizations use Balanced Scorecards or Dashboards Dashboards are used to effectively monitor, analyze and manage the organization’s performance . The level of detail of the dashboards varies, ranging from graphical high-level KPIs to low-level data for root cause analysis. In order to communicate and facilitate that metrics are acted upon the measurement practice should: Visualize the metrics to achieve transparency. Be careful to not create dysfunctional behavior with the visualization.

Continuous Improvement

Kaizen is the Japanese word for continuous improvement and is a part of the lean software development . It is also found in agile software development. For example, Scrum has retrospectives after each sprint where improvements are identified. The retrospectives have similarities to Deming’s Plan-Do- Check-Act (PDCA) .

The PDCA is a cycle of four phases, which should drive the continuous improvement. What is notable is that the PDCA prescribe measurement to verify that improvements are achieved. Petri Heiramo observes that the retrospectives lack measurements and argues that it can lead to undesirable results . Without any metrics, it will be difficult to determine whether any targets have been met. This in turn can be demoralizing for the commitment to the improvement efforts.

Heiramo, suggest that these three questions should be added to the retrospective: What benefit or outcome do we expect out of this improvement/change? How do we measure it? Who is responsible for measuring it? Diagnostics can be used to obtain these measurements, In order for the diagnostics to achieve process improvement, the measurement practice should Be an integrated part of a process improvement framework.

Agile Metrics at  MAMDAS – a software development unit in the Israeli Air Force 

MAMDAS – a software development unit in the Israeli Air Force develops large scale , enterprise critical applications for Israeli Airforce. The project is developed by a team of 60 skilled developers and testers, organized in a hierarchical structure of small groups.
The project develops large-scale, enterprise-critical software, intended to be used by a large and varied user population. During December 2004, the first XP team was established for MAMDAS to implement XP . It was encouraging to observe that after the first two weeks iteration “managers were very surprised to see something running” and everyone agreed that “the pressure to deliver every two weeks leads to amazing results”  Still, accurate metrics are required in order to take professional decisions, to analyze long-term effects, and to increase confidence of all management levels with respect to the process that XP inspires.

They described four metrics and the kinds of data that are gathered to calculate them. These four metrics present information about the amount and quality of work that is performed, about the pace of the work progresses, and about the status of the remaining
work versus remaining human resources.

Product size, initially just called ‘Product’, is the first metrics. It aims at presenting the amount of completed work. The data that was selected to reflect the amount of work is the number of test points. One test point is defined as one test step in an automatic acceptance testing scenario  or as one line of unit tests. The number of test points is calculated for all kinds of written test and is gathered per iteration per component. Additional information is gathered with respect to the number of test points for tests that pass, the number of points for tests that fail, and the number of points for tests that do not run at all.

Pulse is the second metrics, which aims to measure how continuous the integration is. The data is automatically gathered from the development environment by counting how many check-in operations occur per day. The data is gathered for code check-ins, automatic-test check-ins, and detailed specifications check-ins. When referring to code it means code plus its unit test.

Burn-down is the third metrics. It presents the project remaining work versus the remaining human resources. This metrics is supported by the main planning table that is updated for each task according to kinds of activities (code, tests, or detailed specifications), dates of opening and closing, estimate and real time of development and, the component that it belongs to. In addition, this metrics is supported with the human resources table that is updated when new information regarding teammates’ absence arrives. This table also contains the product’s component assigned to each of the teammates and with the percentages of her/his position in the project. By using the data of these tables, this metrics can present the remaining work in days versus the remaining human resources in days. This information can be presented per week or for any group of weeks till a complete release, both for the entire team or for any specific component.

The burn-down graph answers a very basic managerial question: are we going to meet the goals of this release, and if not, what can we do about it? Release goals were set before each release – each goal is a high-level feature. Goals are defined by the user, and are verified by matching a rough estimate of the effort required to complete each goal (given by the development team) to the total available resources.
Once goals are defined and estimated, both remaining work and remaining resources are based on this initial estimation, which is refined as the release progresses.

Faults is the fourth metrics, which counts faults per iteration. During the release on which  all faults that were discovered in a specific iteration were fixed at the beginning of the next iteration. The faults metrics is required to continuously metrics the product’s quality. Note that the product size metrics doesn’t do it, since although it metrics test points, it does not correlate between the number of failed or un-run test steps to the number of actual bugs.The feedback on the Size Metrics was that it motivates writing tests and that it can be referred also as complexity metrics.

The use of the presented metrics mechanism increases confidence of the team members as well of the unit’s management with respect to using agile methods.Further, these metrics enable an accurate and professional decision making process for both short- and long-term purposes.

Technical Debt in Petrobras , Brazil

As an oil and gas company, Petrobras (http://www.petrobras.com.br) develops software in areas which demands increasingly innovative solutions in short time intervals. The company started officially with Scrum in March of 2009, using its lightweight framework to create collaborative self-organizing teams that could effectively deliver products. After the first team had adopted Scrum with a relative success, the manager noticed that the framework could be used in other teams, and thus he invested in training and coaching so that the teams could also have the opportunity to try the methodology. At that time, only the software development department for E&P, whose software helps to Exploit and Produce oil and gas, had management endorsement in adopting scrum and agile practices that would let teams deliver better products faster. About one year and half later, all teams in the department were using Scrum as its software development process. The developers and the stakeholders in general noticed expressive gains with the adoption of Scrum.
The results had varied from the skill of the team leadership in agile methodologies, customer participation, level of collaboration between team members, technical expertise among other factors.
The architecture team of the software development department for E&P was composed of four employees, whose responsibility was to help teams and offer support for resolution of problems related to agile methods and architecture. At that time, it had to work with 25 teams which had autonomy regarding its technical decisions. In fact, autonomy was one of the main managerial concerns when adopting agile methods.
After Scrum adoption, there was active debate, training and architectural meetings about whether Agile engineering practices should also be adopted in parallel with managerial practices; in hindsight, it would have accelerated the benefits had they been adopted. But the constraints of time and budget, decisions made by non- technical staff, and the bureaucracy in areas such as infrastructure and database, led to the postponement of those efforts initially. Moreover, the infrastructure area had only build and continuous integration (CI) tools available. And, unfortunately, these tools were not taken seriously by the teams. The automated deployment was relatively new and was postponed because of fear of implementing it in immature phase. Other tools and monitoring mechanisms were not used by the teams even so the architecture team was aware of its possible benefits.
Despite all initiatives in training and supporting in agile practices such as configuration management, automated tests and code analysis, teams, represented by 25 focal points in architectural meetings, did not show much interest in adopting many agile practices – particularly technical practices. Delivering the product on the date agreed with the customer and maintaining the legacy code were the most urgent issues. Analyzing retrospectively it seems that the main cause for this situation was that debt was getting accrued unconsciously. Serving the client was a much more visible and imperative goal. This can be one explanation for the ineffectiveness of prior attempts in introducing technical practices. Be it by the means of specialized training or by the support of the architecture team.
With these not so effective attempts to promote continuous improvement with teams, the architecture team sought a way to motivate them to experiment agile practices without a top-down “forced adoption”. The technical debt metaphor,  was the basis for the approach.

Given the context aforementioned, especially regarding the role of the architecture team serving various teams in parallel and the fact that the teams have autonomy in its technical decisions, there was a need for the technical debt estimation and visualization in a “macro-level”, i.e., not only associated with source code aspects but
with technical practices involving the product in general. This would give the opportunity to see the actual state of the department and indicate the “roadmap” for future interaction with the development teams.
The actions involved in introducing this kind of visualization and the management activities based on that visualization. The architecture team modeled a board, where the lines corresponds to teams and the columns are the categories and subcategories of technical debt, based on the work of Chris Sterling .In each cell, formed by the pair team x technical debt category, the maturity of the team was evaluated according to predefined criteria. They used the colors red, yellow and green to show the compliance level of each criterion.

After the design of the technical debt board, each team was invited to a rapid meeting in front of the board, where all team members talked about the status of each criterion, translating it to the respective color. During these meeting the teams could also conclude that some categories were not relevant or applicable for their systems.
This meeting should happen every month, so that the progress of each category could
be updated. At the end of the meeting, the team members agreed which of the categories would be the aim for the next meeting or, to put it another way, where they would invest their efforts in reducing the technical debt.

To measure the technical debt at source code level, the architecture team has made use of the tool Sonar (http://www.sonarsource.org/) . Sonar has a plugin that allows estimating how much effort would be required to fix each debt of the project. Sonar considers as debts: cohesion and complexity metrics, duplications, lack of comments, coding rules violation, potential bugs and no unit tests or useless ones.  The important aspect is that an estimative is calculated, and Sonar shows the results financially and the effort in man days necessary to take the debt to zero (the daily rate of the developer in the context of the project must be informed).
It is important to mention that Sonar, in fact, use many other tools internally to analyze the source code – each one for different aspects of the analysis. It works as an aggregator to display results of other tools such as PMD, Findbugs, Cobertura and Checkstyle among others.

The teams could make the debt rise during a whole month without even knowing about it.
To address this situation, the architecture team created a virtual tiled board, where each tile had information about the build state of each team in the department. The major information was the actual state of the build and the project name. If everything was ok (compilation and automated tests), the tile is green , if the compilation was broken, the tile turns red  and if there were failed tests, the tile turns yellow . Besides the build information, there is other information: total number of tests, number of failed tests, test coverage, number of lines and technical debt (calculated in Sonar).

The virtual tiled board was placed in a big screen in a place where everybody in the room could see it from their workplaces. The main objective was that when the team members saw their failed build and that instant feedback would lead them to make corrective actions so the build could go green again.
As the mechanisms of feedback were implemented, the teams had instant information
about what should be done to lower the levels of technical debt. With this information, they could prioritize which categories they would try to improve in the next month. If the team had some difficulties addressing any of the categories, they could call upon the architecture team support.