PEARL XVIII : Elucidation on ATDD – Acceptance Test Driven Development


TDD helps software developers produce working, high-quality code that’s maintainable and, most of all, reliable. Customers are rarely, however, interested in buying code. Customers want software that helps them to be more productive, make more money, maintain or improve operational capability, take over a market, and so forth. This is what we need to deliver with our software—functionality to support business function or market needs. Acceptance test-driven development (acceptance TDD) is what helps developers build high-quality software that fulfills the business’s needs as reliably as TDD helps ensure the software’s technical quality.

Acceptance Test Driven Development (ATDD) is a practice in which the whole team collaboratively discusses acceptance criteria, with examples, and then distills them into a set of concrete acceptance tests before development begins. It’s the best way to ensure that we all share the same understanding of what we’re actually building. It’s also the best way to ensure we have a shared definition of Done.

Acceptance TDD helps coordinate software projects in a way that helps us deliver exactly what the customer wants when they want it, and that doesn’t let us implement the required functionality only half way.

An essential property of acceptance TDD is that it’s a team activity and a team process.

Acceptance tests are specifications for the desired behavior and functionality of a system. They tell us, for a given user story, how the system handles certain conditions and inputs and with what kinds of outcomes. There are a number of properties that an acceptance test should exhibit.

An important property of acceptance tests is that they use the language of the domain and the customer instead of geek-speak only the programmer understands. This is the fundamental requirement for having the customer involved in the creation of acceptance tests and helps enormously with the job of validating that the tests are correct and sufficient. Scattering too much technical lingo into our tests makes us more vulnerable to having a requirement bug sneak into a production release—because the customer’s eyes glaze over when reading geek-speak and the developers are drawn to the technology rather than the real issue of specifying the right thing.

By using a domain language in specifying our tests, we are also not unnecessarily tied to the implementation, which is useful since we need to be able to refactor our system effectively. By using domain language, the changes we need to make to our existing tests when refactoring are typically non-existent or at most trivial.
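As a minimal sketch of this separation (in Python, with a hypothetical `SupportDesk` system and `SupportCallDriver` vocabulary that are assumptions for illustration only), the test below speaks purely in domain terms. Only the driver knows how the system is called, so reworking the internals of `SupportDesk` should leave the test itself untouched.

```python
class SupportDesk:
    """Hypothetical stand-in for the system under test."""

    def __init__(self, customers):
        self._customers = dict(customers)  # account number -> customer name
        self.screen = None

    def receive_call(self, account_number):
        name = self._customers.get(account_number)
        self.screen = name if name is not None else "Unknown account"


class SupportCallDriver:
    """Domain-language vocabulary; the only code that touches the system's API."""

    def __init__(self, desk):
        self._desk = desk

    def customer_calls_with_account(self, account_number):
        self._desk.receive_call(account_number)

    def displayed_customer(self):
        return self._desk.screen


def test_valid_account_number_shows_customer():
    # The test reads in the customer's terms, not in terms of the implementation.
    driver = SupportCallDriver(SupportDesk({"1001": "Alice Example"}))
    driver.customer_calls_with_account("1001")
    assert driver.displayed_customer() == "Alice Example"


test_valid_account_number_shows_customer()
```

If `SupportDesk` were later refactored, say to fetch customers from a database, only `SupportCallDriver` would need to change.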

Concise, precise, and unambiguous

Largely for the same reasons we write our acceptance tests using the domain’s own language, we want to keep our tests simple and concise. We write each of our acceptance tests to verify a single aspect or scenario relevant to the user story at hand. We keep our tests uncluttered, easy to understand, and easy to translate to executable tests. The less ambiguity involved, the better we are at avoiding mistakes and the easier it is to work with our tests.

We might write our stories as simple reminders in the form of a bulleted list, or we might opt to spell them out as complete sentences describing the expected behavior. In either case, the goal is to provide just enough information for us to remember the important things we need to discuss and test for, rather than documenting those details beforehand. Card, conversation, confirmation—these are the three Cs that make up a user story. Those same three Cs could be applied to acceptance tests as well.

Yet another common property of acceptance tests is that they might not be implemented (translation: automated) using the same programming language as the system they are testing. Whether this is the case depends on the technologies involved and on the overall architecture of the system under test. For example, some programming languages are easier to inter-operate with than others. Similarly, it is easy to write acceptance tests for a web application through the HTTP protocol with practically any language we want, but it’s often impossible to run acceptance tests for embedded software written in any language other than that of the system itself.

The main reason for choosing a different programming language for implementing acceptance tests than the one we’re using for our production code (and, often, unit tests) is that the needs of acceptance tests are often radically different from the properties of the programming language we use for implementing our system. To give you an example, a particular real-time system might be feasible to implement only with native C code, whereas it would be rather verbose, slow, and error-prone to express tests for the same real-time system in C compared to, for example, a scripting language.

The ideal syntax for expressing our acceptance tests could be a declarative, tabular structure such as a spreadsheet, or it could be something closer to a sequence of higher-level actions written in plain English. If we want to have our customer collaborate with developers on our acceptance tests, a full-blown programming language such as Java, C/C++, or C# is likely not an option. “Best tool for the job” means more than technically best, because the programmer’s job description also includes collaborating with the customer.
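To make the tabular idea concrete, here is a small sketch assuming a hypothetical account-lookup rule and a pipe-delimited text table standing in for a spreadsheet. The column names, the `display_for` function, and the sample data are all invented for illustration; real table-based tools define their own formats.

```python
# Hypothetical table that a customer could edit without reading any code.
SPEC = """
| account number | expected display |
| 1001           | Alice Example    |
| 1002           | Bob Example      |
| 9999           | Unknown account  |
"""

CUSTOMERS = {"1001": "Alice Example", "1002": "Bob Example"}  # sample data


def display_for(account_number):
    """Stand-in for the system under test."""
    return CUSTOMERS.get(account_number, "Unknown account")


def parse_table(text):
    """Turn the pipe-delimited table into a list of dicts keyed by column header."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.strip().splitlines()
    ]
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def failing_rows(text):
    """Return every row whose actual display differs from the expected one."""
    return [
        row
        for row in parse_table(text)
        if display_for(row["account number"]) != row["expected display"]
    ]


print(failing_rows(SPEC) or "all rows pass")
```

Each row is one declarative example: adding a scenario means adding a line to the table, not writing more code.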

The acceptance TDD cycle

In its simplest form, the process of acceptance test-driven development can be expressed as the simple cycle illustrated by figure 1.

Figure 1. The acceptance TDD cycle

This cycle continues throughout the iteration as long as we have more stories to implement, starting over again from picking a user story; then writing tests for the chosen story, then turning those tests into automated, executable tests; and finally implementing the functionality to make our acceptance tests pass.

In practice, of course, things aren’t always that simple. We might not yet have user stories, the stories might be ambiguous or even contradictory, the stories might not have been prioritized, the stories might have dependencies that affect their scheduling, and so on.

Step 1: Pick a user story

The first step is to decide which story to work on next. Not always an easy job; but, fortunately, most of the time we’ll already have some relative priorities in place for all the stories in our iteration’s work backlog. Assuming that we have such priorities, the simplest way to go is to always pick the story that’s on top of the stack—that is, the story that’s considered the most important of those remaining. Again, sometimes, it’s not that simple.

Generally speaking, the stories are coming from the various planning meetings held throughout the project where the customer informally describes new features, providing examples to illustrate how the system should work in each situation. In those meetings, the developers and testers typically ask questions about the features, making them a medium for intense learning and discussion. Some of that information gets documented on a story card (whether virtual or physical), and some of it remains as tacit knowledge. In those same planning meetings, the customer prioritizes the stack of user stories by their business value (including business risk) and technical risk (as estimated by the team).

There are times when the highest-priority story requires skills that we don’t possess or that we feel we don’t have enough of. In those situations, we might want to skip to the next task to see whether it makes more sense for us to work on it. Teams that have adopted pair programming don’t suffer from this problem as often. When working in pairs, even the most cross-functional team can usually accommodate by adjusting their current pairs in a way that frees the necessary skills for picking the highest-priority task from the pile.
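The selection rule described above can be sketched roughly as follows. The `Story` fields, the convention that priority 1 is the most important, and the skill names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    title: str
    priority: int                            # 1 = most important (assumed convention)
    required_skills: frozenset = frozenset()


def pick_next_story(backlog, available_skills):
    """Return the highest-priority story whose required skills are all available."""
    for story in sorted(backlog, key=lambda s: s.priority):
        if story.required_skills <= available_skills:  # subset check
            return story
    return None  # nothing in the backlog can be started right now


backlog = [
    Story("Export report", 2, frozenset({"sql"})),
    Story("Card payments", 1, frozenset({"payments"})),
    Story("Search", 3),
]

# Nobody with payments experience is free, so the second story is picked.
print(pick_next_story(backlog, {"sql", "web"}).title)
```

Re-pairing so that someone with the missing skill becomes available corresponds to widening `available_skills`, at which point the top story is picked again.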

The least qualified person

The traditional way of dividing work on a team is for everyone to do what they do best. It’s intuitive. It’s safe. But it might not be the best way of completing the task. Arlo Belshee presented an experience report at the Agile 2005 conference, where he described how his company had started consciously tweaking the way they work and measuring what works and what doesn’t. Among their findings about stuff that worked was a practice of giving tasks to the least qualified person.

There can be more issues to deal with regarding picking user stories, but most of the time the solution comes easily through judicious application of common sense. For now, let’s move on to the second step in our process: writing tests for the story we’ve just picked.

Step 2: Write tests for a story

With a story card in hand (or onscreen if you’ve opted for managing your stories online), our next step is to write tests for the story.

The first thing to do is, of course, get together with the customer. In practice, this means having a team member sit down with the customer (they’re the one who should own the tests, remember?) and start sketching out a list of tests for the story in question.

As usual, there are personal preferences for how to go about doing this, but a common preference is to quickly jot down a list of rough scenarios or aspects of the story we want to test in order to say that the feature has been implemented correctly. There’s time to elaborate on those rough scenarios later on when we’re implementing the story or implementing the acceptance tests. At this time, however, we’re only talking about coming up with a bulleted list of things we need to test: things that have to work in order for us to claim the story is done.

On timing

Especially in projects that have been going on for a while already, the customer and the development team probably have some kind of an idea of what’s going to get scheduled into the next iteration in the upcoming planning meeting. In such projects, the customer and the team have probably spent some time during the previous iteration sketching acceptance tests for the features most likely to get picked in the next iteration’s planning session. This means that we might be writing acceptance tests for stories that we’re not going to implement until maybe a couple of weeks from now. We also might think of missing tests during implementation, for example, so this test-writing might happen pretty much at any point in time between writing the user story and the moment when the customer accepts the story as completed.

Once we have such a rough list, we start elaborating the tests, adding more detail and discussing how this and that should work, whether there are any specifics about the user interface the customer would like to dictate, and so forth. Depending on the type of feature, the tests might be a set of interaction sequences or flows, or they might be a set of inputs and expected outputs. Often, especially with flow-style tests, the tests specify some kind of a starting state, a context the test assumes is part of the system.

Other than the level of detail and the sequence in which we work to add that detail, there’s a question of when—or whether—to start writing the tests into an executable format. Witness step 3 in our process: automating the tests.

Step 3: Automate the tests

The next step once we’ve got acceptance tests written down on the back of a story card, on a whiteboard, in some electronic format, or on pink napkins, is to turn those tests into something we can execute automatically and get back a simple pass-or-fail result. Whereas we’ve called the previous step writing tests, we might call this step implementing or automating those tests.

To avoid potential confusion about how the executable acceptance tests differ from the acceptance tests written in the previous step, consider an example.

We might turn acceptance tests into an executable format by using a variety of approaches and tools. The most popular category of tools these days seems to be what we call table-based tools. Their premise is that the tabular format of tables, rows, and columns makes it easy for us to specify our tests in a way that’s both human and machine readable. Figure 2 presents an example of how we might draft an executable test for the first test, “Valid account number”.

Figure 2. Example of an executable test, sketched on a piece of paper

In figure 2, we’ve outlined the steps we’re going to execute as part of our executable test in order to verify that the case of an incoming support call with a valid account number is handled as expected, displaying the customer’s information onscreen. Our test is already expressed in a format that’s easy to turn into a tabular format using our tool of choice; for example, something that eats HTML tables and translates their content into a sequence of method invocations to Java code according to some documented rules.

The inevitable fact is that most of the time, no tool is available that understands our domain-language tests in our table format and can wire those tests into calls to the system under test. In practice, we’ll have to do that wiring ourselves anyway; most likely the developers or testers will do so using a programming language.

To summarize this duality of turning acceptance tests into executable tests, we’re dealing with expressing the tests in a format that’s both human and machine readable and with writing the plumbing code to connect those tests to the system under test.
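The two halves of that duality can be sketched in a few lines. The example below is a toy interpreter, not the API of any real table-based tool: the rows hold the human-readable test, and the hypothetical `SupportCallFixture` class is the plumbing that turns each domain phrase into a method call on a stand-in system.

```python
class SupportCallFixture:
    """Plumbing code: maps domain phrases to calls on the system under test."""

    def __init__(self, customers):
        self._customers = customers  # stand-in for the real system
        self._screen = None

    def receive_call_from(self, account_number):
        self._screen = self._customers.get(account_number, "Unknown account")

    def screen_shows(self, expected):
        return self._screen == expected


# The human-readable half: one action per row, followed by its argument.
VALID_ACCOUNT_NUMBER = [
    ("receive call from", "1001"),
    ("screen shows", "Alice Example"),
]


def run_table(rows, fixture):
    """Invoke one fixture method per row; a row fails if the method returns False."""
    for action, argument in rows:
        method = getattr(fixture, action.replace(" ", "_"))
        if method(argument) is False:
            return f"FAIL: {action} {argument}"
    return "PASS"


print(run_table(VALID_ACCOUNT_NUMBER, SupportCallFixture({"1001": "Alice Example"})))
```

The rows could just as well come from an HTML table or a spreadsheet; only the code that reads them into `(action, argument)` pairs would differ.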

On style

The example in figure 2 is a flow-style test, based on a sequence of actions and parameters for those actions. This is not the only style at our disposal, however. A declarative approach to expressing the desired functionality or business rule can often yield more compact and more expressive tests than what’s possible with flow-style tests.

Yet our goal should—once again—be to keep our tests as simple and to the point as possible, ideally speaking in terms of what we’re doing instead of how we’re doing it.

With regard to writing things down, there are variations on how different teams do this. Some start writing the tests right away into electronic format using a word processor; some even go so far as to write them directly in an executable syntax. Some teams run their tests as early as during the initial authoring session. Some people prefer to work on the tests alongside the customer using a physical medium, leaving the running of the executable tests for a later time. For example, agilists like to sketch the executable tests on a whiteboard or a piece of paper first, and pick up the computerized tools only when they have something they’re relatively sure won’t need to be changed right away.

The benefit is that we’re less likely to fall prey to the technology: agilists have noticed that tools often steal too much focus from the topic, which we don’t want. Using software also has this strange effect of the artifacts being worked on somehow seeming more formal, more final, and thus needing more polishing up. All that costs time and money, keeping us from the important work.

In projects where the customer’s availability is the bottleneck, especially in the beginning of an iteration (and this is the case more often than not), it makes a lot of sense to have a team member do the possibly laborious or uninteresting translation step on their own rather than keep the customer from working on elaborating tests for other stories. The downside to having the team member formulate the executable syntax alone is that the customer might feel less ownership in the acceptance tests in general—after all, it’s not the exact same piece they were working on. Furthermore, depending on the chosen test-automation tool and its syntax, the customer might even have difficulty reading the acceptance tests once they’ve been shoved into the executable format dictated by the tool.

Let’s consider a case where our test-automation tool is a framework for which we express our tests in a simple but powerful scripting language such as Ruby. Figure 3 highlights the issue with the customer likely not being as capable of feeling ownership of the implemented acceptance test compared to the sketch, which they have participated in writing. Although the executable snippet of Ruby code certainly reads nicely to a programmer, it’s not so trivial for a non-technical person to relate to.

Figure 3. Contrast between a sketch and an actual, implemented executable acceptance test

Another aspect to take into consideration is whether we should make all tests executable to start with or whether we should automate one test at a time as we progress with the implementation. Some teams—and this is largely dependent on the level of certainty regarding the requirements—do fine by automating all known tests for a given story up front before moving on to implementing the story.

Some teams prefer moving in baby steps like they do in regular test-driven development, implementing one test, implementing the respective slice of the story, implementing another test, and so forth. The downside to automating all tests up front is, of course, that we’re risking more unfinished work (inventory, if you will) than we would be if we’d implemented one slice at a time. Agilists’ preference is strongly on the side of implementing acceptance tests one at a time rather than trying to get them all done in one big burst. It should be mentioned, though, that elaborating acceptance tests toward their executable form during planning sessions could help a team understand the complexity of the story better and, thus, aid in making better estimates.

Many of the decisions regarding physical versus electronic medium, translating to executable syntax together or not, and so forth also depend to a large degree on the people. Some customers have no trouble working on the tests directly in the executable format (especially if the tool supports developing a domain-specific language). Some customers don’t have trouble identifying with tests that have been translated from their writing. As in so many aspects of software development, it depends.

Regardless of our choice of how many tests to automate at a time, after finishing this step of the cycle we have at least one acceptance test turned into an executable format; and before we proceed to implementing the functionality in question, we will have also written the necessary plumbing code for letting the test-automation tool know what those funny words mean in terms of technology. That is, we will have identified what the system should do when we say “select a transaction” or “place a call”—in terms of the programming API or other interface exposed by the system under test.

To put it another way, once we’ve gotten this far, we have an acceptance test that we can execute and that tells us that the specified functionality is missing. The next step is naturally to make that test pass—that is, implement the functionality to satisfy the failing test.

Step 4: Implement the functionality

Next on our plate is to come up with the functionality that makes our newly minted acceptance test(s) pass. Acceptance test-driven development doesn’t say how we should implement the functionality; but, needless to say, it is generally considered best practice among practitioners of acceptance TDD to do the implementation using test-driven development.

In general, a given story represents a piece of customer-valued functionality that is split—by the developers—into a set of tasks required for creating that functionality. It is these tasks that the developer then proceeds to tackle using whatever tools necessary, including TDD. When a given task is completed, the developer moves on to the next task, and so forth, until the story is completed—which is indicated by the acceptance tests executing successfully.

In practice, this process means plenty of small iterations within iterations. Figure 4 visualizes this transition to and from test-driven development inside the acceptance TDD process.

Figure 4. The relationship between test-driven development and acceptance test-driven development

As we can see, the fourth step of the acceptance test-driven development cycle, implementing the necessary functionality to fix a failing acceptance test, can be expanded into a sequence of smaller TDD cycles of test-code-refactor, building up the missing functionality in a piecemeal fashion until the acceptance test passes.
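The nesting of the two cycles can be illustrated with a small sketch. It assumes a hypothetical story ("a valid account number shows the customer") split into two tasks; the function names and the plain-assert test style are choices made for this example rather than anything prescribed by acceptance TDD.

```python
def normalize_account_number(raw):
    """Task 1: strip the spaces and dashes a support agent might type."""
    return raw.replace(" ", "").replace("-", "")


def find_customer(customers, account_number):
    """Task 2: look up a customer, returning None when there is no match."""
    return customers.get(account_number)


def handle_call(customers, raw_account_number):
    """Story-level behavior, assembled from the two smaller pieces."""
    customer = find_customer(customers, normalize_account_number(raw_account_number))
    return customer if customer is not None else "Unknown account"


# Inner loop: one small test-code-refactor cycle per task.
def test_normalize_removes_separators():
    assert normalize_account_number("10-01 ") == "1001"


def test_find_customer_returns_none_when_missing():
    assert find_customer({}, "1001") is None


# Outer loop: the acceptance test passes only once every task is done.
def test_valid_account_number_shows_customer():
    assert handle_call({"1001": "Alice Example"}, "10-01") == "Alice Example"


for test in (
    test_normalize_removes_separators,
    test_find_customer_returns_none_when_missing,
    test_valid_account_number_shows_customer,
):
    test()
```

In practice the acceptance test would be written first and would keep failing while the unit-level cycles fill in `normalize_account_number` and `find_customer` one at a time.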

While the developer is working on a story, frequently consulting with the customer on how this and that ought to work, there will undoubtedly be occasions when the developer comes up with a scenario (a test) that the system should probably handle in addition to those the customer and developer have already written down. Being rational creatures, we add those acceptance tests to our list, perhaps after asking the customer what they think of the test. After all, they might not assign as much value to the given aspect or functionality of the story as we the developers might.

At some point, we’ve iterated through all the tasks and all the tests we’ve identified for the story, and the acceptance tests are happily passing. At this point, depending on whether we opted for automating all tests up front or automating them just in time, we either go back to Step 3 to automate another test or to Step 1 to pick a brand-new story to work on.

Getting acceptance tests passing is intensive work.

Acceptance TDD inside an iteration

A healthy iteration consists mostly of hard work. Spend too much time in meetings or planning ahead, and you’re soon behind the iteration schedule and need to de-scope. Given a clear goal for the iteration, good user stories, and access to someone to answer our questions, most of the iteration should be spent in small cycles of a few hours to a couple of days writing acceptance tests, collaborating with the customer where necessary, making the tests executable, and implementing the missing functionality with test-driven development.

As such, the four-step acceptance test-driven development cycle of picking a story, writing tests for the story, implementing the tests, and implementing the story is only a fraction of the larger continuum of a whole iteration made of multiple—even up to dozens—of user stories, depending on the size of your team and the size of your stories. In order to gain understanding of how the small four-step cycle for a single user story fits into the iteration, we’re going to touch the zoom dial and see what an iteration might look like on a time line with the acceptance TDD–related activities scattered over the duration of a single iteration.

Figure 5 is an attempt to describe what such a time line might look like for a single iteration with nine user stories to implement. Each of the bars represents a single user story moving through the steps of writing acceptance tests, implementing acceptance tests, and implementing the story itself. In practice, there could (and probably would) be more iterations within each story, because we generally don’t write and implement all acceptance tests in one go but rather proceed through tests one by one.

Figure 5. Putting acceptance test-driven development on a time line

Notice how the stories get completed almost from the beginning of the iteration? That’s the secret ingredient that acceptance TDD packs to provide indication of real progress. Our two imaginary developers (or pairs of developers and/or testers, if we’re pair programming) start working on the next-highest priority story as soon as they’re done with their current story. The developers don’t begin working on a new story before the current story is done. Thus, there are always two user stories getting worked on, and functionality gets completed throughout the iteration.
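A small simulation can show why this working rule spreads completed stories across the iteration. The sketch below assumes two workers (developers or pairs) and nine stories with made-up estimates in days; none of these numbers come from the figure.

```python
import heapq


def completion_days(story_estimates, workers=2):
    """Simulate workers who each finish their current story before taking the next.

    story_estimates is ordered by priority (highest first); values are days.
    Returns (story index, day finished) pairs in order of completion.
    """
    free_at = [0] * workers  # day on which each worker becomes available
    heapq.heapify(free_at)
    finished = []
    for index, estimate in enumerate(story_estimates):
        start = heapq.heappop(free_at)  # earliest available worker takes the story
        end = start + estimate
        heapq.heappush(free_at, end)
        finished.append((index, end))
    return sorted(finished, key=lambda item: item[1])


# Nine stories with hypothetical estimates, highest priority first.
for story, day in completion_days([2, 3, 2, 1, 2, 3, 1, 2, 2]):
    print(f"story {story + 1} done on day {day}")
```

With these assumed estimates the first story is done on day 2 and the rest finish at intervals up to day 9, instead of everything landing at the end of the iteration.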

So, if the iteration doesn’t include writing the user stories, where are they coming from? As you may know if you’re familiar with agile methods, there is usually some kind of a planning meeting in the beginning of the iteration where the customer decides which stories get implemented in that iteration and which stories are left in the stack for the future. Because we’re scheduling the stories in that meeting, clearly we’ll have to have those stories written before the meeting, no?

That’s where continuous planning comes into the picture.

Continuous planning

Although an iteration should ideally be an autonomous, closed system that includes everything necessary to meet the iteration’s goal, it is often necessary (and useful) to prepare for the next iteration during the previous one by allocating some amount of time for pre-iteration planning activities. Suggestions regarding the time we should allocate for this continuous planning range from 10% to 15% of the team’s total time available during the iteration. As usual, it’s good to start with something that has worked for others and, once we’ve got some experience doing things that way, begin zeroing in on a number that seems to work best in our particular context.
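As a quick worked example of that range, the sketch below converts the percentages into hours. The team size, iteration length, and eight-hour working day are assumptions chosen only to make the arithmetic concrete.

```python
def planning_hours(team_size, iteration_days, hours_per_day=8, share_percent=(10, 15)):
    """Return the (low, high) number of team hours set aside for continuous planning."""
    total_hours = team_size * iteration_days * hours_per_day
    low, high = share_percent
    return total_hours * low / 100, total_hours * high / 100


# A hypothetical team of five on a ten-day iteration has 400 hours in total.
low, high = planning_hours(team_size=5, iteration_days=10)
print(f"{low:.0f} to {high:.0f} hours of pre-iteration planning")
```

For this assumed team that comes to between 40 and 60 hours per iteration, or roughly one to one and a half working days per person.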

In practice, these pre-iteration planning activities might involve going through the backlog of user stories, identifying stories that are most likely to get scheduled for the next iteration, identifying stories that have been rendered obsolete, and so forth. This ongoing pre-iteration planning is also the context in which we carry out the writing of user stories and, to some extent, the writing of the first acceptance tests. The rationale here is to be prepared for the next iteration’s beginning when the backlog of stories is put on the table. At that point, the better we know our backlog, the more smoothly the planning session goes, and the faster we get back to work, crunching out valuable functionality for our customer.

By writing, estimating, splitting if necessary, and prioritizing user stories before the planning meeting, we ensure quick and productive planning meetings and are able to get back to delivering valuable features sooner.

It would be nice if we had all acceptance tests implemented (and failing) before we start implementing the production code. That is often not a realistic scenario, however, because tests require effort as well—they don’t just appear from thin air—and investing our time in implementing the complete set of acceptance tests up front doesn’t make any more sense than big up-front design does in the larger scale. It is much more efficient to implement acceptance tests as we go, user story by user story.

Teams that have dedicated testing personnel can have the testing engineers work together with the customer to make acceptance tests executable while developers start implementing the functionality for the stories. Most teams, however, are probably much more homogeneous in this regard and participate in writing and implementing acceptance tests together, with nobody designated as “the acceptance test guy.”

The process is largely dependent on the availability of the customer and the test and software engineers. If your customer is only onsite for a few days in the beginning of each iteration, you probably need to do some trade-offs in order to make the most out of those few days and defer work that can be deferred until after the customer is no longer available. Similarly, somebody has to write code, and it’s likely not the customer who’ll do that; software and test engineers need to be involved at some point.

We start from those stories we’ll be working on first, of course, and implement the user story in parallel with automating the acceptance tests that we’ll use to verify our work. And, if at all possible, we avoid having the same person implement the tests and the production code in order to minimize our risk of human nature playing its tricks on us.

Again, we want to avoid putting too much up-front effort into automating our acceptance tests; we might end up with a huge bunch of tests but no working software. It’s much better to proceed in small steps, delivering one story at a time. No matter how valuable our acceptance tests are to us, their value to the customer is negligible without the associated functionality.

The mid-iteration sanity check

Agilists like to have an informal sanity check in the middle of an iteration. At that point, we should have approximately half of the stories scheduled for the iteration running and passing. This might not be the case for the first iteration, due to having to build up more infrastructure than in later iterations; but, especially as we get better at estimating our stories, it should always be in the remote vicinity of having 50% of the stories passing their tests.

Of course, we’ll be tracking story completion throughout the iteration. Sometimes we realize early on that our estimated burn rate was clearly off, and we must adjust the backlog immediately and accordingly. By the middle of an iteration, however, we should generally be pretty close to having half the stories for the iteration completed. If not, the chances are that there’s more work to do than the team’s capacity can sustain, or the stories are too big compared to the iteration length.

A story’s burn-down rate is consistently a more accurate source of prediction than an inherently optimistic software developer. If it looks like we’re not going to live up to our planned iteration content, we decrease our load.
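The sanity check can be reduced to a simple projection, sketched here under the assumption that stories are of roughly similar size and that the completion rate so far continues. The function names and sample numbers are hypothetical.

```python
def projected_stories(stories_passing, days_elapsed, iteration_days):
    """Project the number of completed stories at the current rate (rounded down)."""
    if days_elapsed <= 0:
        raise ValueError("days_elapsed must be positive")
    return stories_passing * iteration_days // days_elapsed


def stories_to_cut(stories_planned, stories_passing, days_elapsed, iteration_days):
    """How many planned stories are unlikely to fit if the rate does not change."""
    projected = projected_stories(stories_passing, days_elapsed, iteration_days)
    return max(0, stories_planned - projected)


# Midway through a ten-day iteration, 3 of 9 planned stories pass their tests.
print(projected_stories(3, 5, 10))   # projection for the whole iteration
print(stories_to_cut(9, 3, 5, 10))   # load to remove by swapping, dropping, or splitting
```

A result above zero is the signal to decrease the load rather than to hope that the second half of the iteration goes faster than the first.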

Decreasing the load

When it looks like we’re running out of time, we decrease the load. We don’t work harder (or smarter). We’re way past that illusion. We don’t want to sacrifice quality, because producing good quality guarantees the sustainability of our productivity, whereas bad quality only creates more rework and grinds our progress to a halt. We also don’t want to have our developers burn out from working overtime, especially when we know that working overtime doesn’t make any difference in the long run. (Tom DeMarco and Timothy Lister have done a great favor to our industry with their best-selling books Slack (DeMarco; Broadway, 2001) and Peopleware (DeMarco, Lister; Dorset House, 1999), which explain how overtime reduces productivity.) Instead, we adjust the one thing we can: we bring the iteration’s scope in line with reality. In general, there are three ways to do that: swap, drop, and split.

Swapping stories is simple. We trade one story for another, smaller one, thereby decreasing our workload. Again, we must consult the customer in order to assure that we still have the best possible content for the current iteration, given our best knowledge of how much work we can complete.

Dropping user stories is almost as straightforward as swapping them. “This low-priority story right here, we won’t do in this iteration. We’ll put it back into the product backlog.” But dropping the lowest-priority story might not always be the best option, considering the overall value delivered by the iteration—that particular story might be of low priority in itself, but it might also be part of a bigger whole that our customer cares about. We don’t want to optimize locally. Instead, we want to make sure that what we deliver in the end of the iteration is a cohesive whole that makes sense and can stand on its own.

The third way to decrease our load, splitting, is a bit trickier compared to dropping and swapping.

Splitting stories

How do we split a story we already tried hard to keep as small as possible during the initial planning game? In general, we can split stories by function or by detail (or both). Consider a story such as “As a regular user of the online banking application, I want to optionally select the recipient information for a bank transfer from a list of most frequently and recently used accounts based on my history so that I don’t have to type in the details for the recipients every time.”

Splitting this story by function could mean dividing the story into “…from a list of recently used accounts” and “…from a list of most frequently used accounts.” Plus, depending on what the customer means by “most frequently and recently used,” we might end up adding another story along the lines of “…from a weighted list of most frequently and recently used accounts” where the weighted list uses an algorithm specified by the customer. Having these multiple smaller stories, we could then start by implementing a subset of the original, large story’s functionality and then add to it by implementing the other slices, building on what we have implemented for the earlier stories.
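The three slices might look like this in code. This is only a sketch: the transfer history is a plain list of account identifiers (oldest first), and the position-based weighting is a placeholder assumption, since the text leaves the actual algorithm to the customer.

```python
from collections import Counter


def recently_used(history, limit=3):
    """Slice 1: distinct accounts, most recently used first."""
    seen = []
    for account in reversed(history):
        if account not in seen:
            seen.append(account)
    return seen[:limit]


def most_frequently_used(history, limit=3):
    """Slice 2: accounts ordered by how many transfers they received."""
    return [account for account, _ in Counter(history).most_common(limit)]


def weighted(history, limit=3):
    """Slice 3: placeholder weighting in which later transfers count for more."""
    scores = Counter()
    for position, account in enumerate(history, start=1):
        scores[account] += position
    return [account for account, _ in scores.most_common(limit)]


history = ["A", "A", "A", "B", "C", "C"]  # hypothetical transfers, oldest first
print(recently_used(history))
print(most_frequently_used(history))
print(weighted(history))
```

Each function can be delivered and accepted on its own, and the later slices reuse the same history data that the first one already needed.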

Splitting it by detail could result in separate stories for remembering only the account numbers, then also the recipient names, then the VAT numbers, and so forth. The usefulness of this approach is greatly dependent on the distribution of the overall effort between the details—if most of the work is in building the common infrastructure rather than in adding support for one more detail, then splitting by function might be a better option. On the other hand, if a significant part of the effort is in, for example, manually adding stuff to various places in the code base to support one new persistent field, splitting by detail might make sense.

Regardless of the chosen strategy, the most important thing to keep in mind is that, after the splitting, the resulting user stories should still represent something that makes sense—something valuable—to the customer.

PEARL IX : Refactoring performed to Sustain Application Development Success in Agile Environments

The term “refactoring” was popularized by Martin Fowler and Kent Beck, who describe it as “a change made to the internal structure of software to make it easier to understand and cheaper to modify without altering its actual observable behavior”. In other words, it is a disciplined way to clean up code that minimizes the chances of introducing bugs, lets the code evolve slowly over time, and supports an iterative and incremental approach to programming and design. The underlying objective of refactoring is to improve some of the essential non-functional attributes of the software. To that end, the technique can be broadly classified into the following major categories:

1. Code Refactoring (clean-up): It is intended to remove unused code, methods, variables, etc., which are misleading.
2. Code Standard Refactoring: It is done to achieve quality code.
3. Database Refactoring: Just like code refactoring, it is intended to clean up or remove unnecessary and redundant data without changing the architecture.
4. Database Schema and Design Refactoring: This includes enhancing the database schema by leaving only the fields actually required by the application.
5. User-Interface Refactoring: It is intended to change the UI without affecting the underlying functionality.
6. Architecture Refactoring: It is done to achieve modularization at the application level.

Refactoring is actually a simple technique where you make structural changes to the code in small, independent and safe steps, and test the code after each of these steps to ensure that you have not changed the behavior, i.e. the code still works the same but looks different. Refactoring is intended to fill in short-cuts, eliminate duplication and dead code, and help make the design and logic clear. It is equally important to understand that, although refactoring shares some attributes with debugging and optimization, it is a different activity because:

  •  Fixing bugs is not refactoring.
  •  Optimization is not refactoring.
  •  Revisiting or tightening up error-handling code is not refactoring.
  •  Adding defensive code is not refactoring.
  •  Tweaking the code to make it more testable is not refactoring.

Refactoring Activities – Conceptualized
The refactoring process generally consists of a number of distinct activities which are dealt with in chronological order:

  • Firstly, identify where the software should be refactored, i.e. figure out the code smell areas in the software which might increase the risk of failures or bugs.
  • Next, determine what refactoring should be applied to the identified places based on the list identified.
  • Guarantee that the applied refactoring preserves the behavior of the software. This is the crucial step in which, based on the type of software such as real-time, embedded and safety-critical, measures have to be taken to preserve their behavior prior to subjecting them to refactoring.
  • Apply the appropriate refactoring technique.
  • Assess the effect of the refactoring on the quality characteristics of the software, e.g. complexity, understandability and maintainability, and of the process, e.g. productivity, cost and effort.
  • Ensure the requisite consistency is maintained between the refactored program code and other software artifacts.

Refactoring Steps – Application/System Perspective
The points below clearly summarize the important steps to be adhered to when refactoring an application:
1. Firstly, formulate the unit test cases for the application/ system – the unit test cases should be developed in such a way that they test the application behavior and ensure that this behavior remains intact even after every cycle of refactoring.
2. Identify the approach to the task for refactoring – this includes two essential steps:
– Finding the problem – this is about identifying whether there is any code smell in the current piece of code and, if so, identifying what the problem is.
– Assess/Decompose the problem – after identifying the potential problem, assess it against the risks involved.
3. Design a suitable solution – work out what the resultant state will be after subjecting the code to refactoring. Accordingly, formulate a solution that will help in transitioning the code from the current state to the resultant state.
4. Alter the code – now proceed with refactoring the code without changing the external behavior of the code.
5. Test the refactored code – to ensure that the results and/or behavior are consistent. If a test fails, roll back the changes made and repeat the refactoring in a different way.
6. Continue the cycle with the aforementioned steps (1) to (5) until the problematic/current code moves to the resultant state.

With its intent understood, refactoring can be taken up as a regular practice and carried out safely, because most modern IDEs (integrated development environments) ship with built-in refactoring tools that can be used readily to refactor application, business-logic and middle-tier code. The situation is different for databases: database refactoring is conceptually more difficult than code refactoring, since with code refactoring you only need to maintain the behavioral semantics, whereas with database refactoring you must also maintain the informational semantics.

Refactoring is the process of clarifying and simplifying the design of existing code, without changing its behavior. Agile teams are maintaining and extending their code a lot from iteration to iteration, and without continuous refactoring, this is hard to do. This is because un-refactored code tends to rot. Rot takes several forms: unhealthy dependencies between classes or packages, bad allocation of class responsibilities, way too many responsibilities per method or class, duplicate code, and many other varieties of confusion and clutter.

Every time we change code without refactoring it, rot worsens and spreads. Code rot frustrates us, costs us time, and unduly shortens the lifespan of useful systems. In an agile context, it can mean the difference between meeting or not meeting an iteration deadline.

Refactoring code ruthlessly prevents rot, keeping the code easy to maintain and extend. This extensibility is the reason to refactor and the measure of its success. But note that it is only “safe” to refactor the code this extensively if we have extensive unit test suites of the kind we get if we work Test-First. Without being able to run those tests after each little step in a refactoring, we run the risk of introducing bugs. If you are doing true Test-Driven Development (TDD), in which the design evolves continuously, then you have no choice about regular refactoring, since that’s how you evolve the design.

Code Hygiene

A popular metaphor for refactoring is cleaning the kitchen as you cook. In any kitchen in which several complex meals are prepared per day for more than a handful of people, you will typically find that cleaning and reorganizing occur continuously. Someone is responsible for keeping the dishes, the pots, the kitchen itself, the food, the refrigerator all clean and organized from moment to moment. Without this, continuous cooking would soon collapse. In your own household, you can see non-trivial effects from postponing even small amounts of dish refactoring: did you ever try to scrape the muck formed by dried Cocoa Crispies out of a bowl? A missed opportunity for 2 seconds worth of rinsing can become 10 minutes of aggressive scraping.

Specific “Refactorings”

Refactorings are the opposite of fiddling endlessly with code; they are precise and finite. Martin Fowler’s definitive book on the subject describes 72 specific “refactorings” by name (e.g., “Extract Method,” which extracts a block of code from one method and creates a new method for it). Each refactoring converts a section of code (a block, a method, a class) from one of 22 well-understood “smelly” states to a better state. It takes a while to learn to recognize refactoring opportunities and to implement refactorings properly.
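
As an illustration of a named refactoring, here is a minimal sketch of Extract Method in Python. The function names and the report format are hypothetical and chosen only for this example; the point is that the "before" and "after" versions return exactly the same result.

```python
# Before: one function both sums the amounts and formats the report.
def owing_report_before(name, orders):
    outstanding = 0.0
    for amount in orders:
        outstanding += amount
    return f"name: {name}\namount: {outstanding:.2f}"


# After: the summing block has been extracted into its own named function.
def calculate_outstanding(orders):
    """Return the total of all order amounts."""
    return sum(orders, 0.0)


def owing_report_after(name, orders):
    outstanding = calculate_outstanding(orders)
    return f"name: {name}\namount: {outstanding:.2f}"
```

Because only the structure changed, any test written against the first version should pass unchanged against the second.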

Refactoring to Patterns

Refactoring does not only occur at low code levels. In his recent book, Refactoring to Patterns, Joshua Kerievsky skillfully makes the case that refactoring is the technique we should use to introduce Gang of Four design patterns into our code. He argues that patterns are often over-used, and often introduced too early into systems. He follows Fowler’s original format of showing and naming specific “refactorings,” recipes for getting your code from point A to point B. Kerievsky’s refactorings are generally higher level than Fowler’s, and often use Fowler’s refactorings as building blocks. Kerievsky also introduces the concept of refactoring “toward” a pattern, describing how many design patterns have several different implementations, or depths of implementation. Sometimes you need more of a pattern than you do at other times, and this book shows you exactly how to get part of the way there, or all of the way there.

The Flow of Refactoring

In a Test-First context, refactoring has the same flow as any other code change. You have your automated tests. You begin the refactoring by making the smallest discrete change you can that will compile, run, and function. Wherever possible, you make such changes by adding to the existing code, in parallel with it. You run the tests. You then make the next small discrete change, and run the tests again. When the refactoring is in place and the tests all run clean, you go back and remove the old smelly parallel code. Once the tests run clean after that, you are done.
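
The flow described above can be sketched roughly as follows. This is an assumed, simplified example: the pricing functions and the check_behaviour helper are invented for illustration. The new code is added next to the old code, the same checks run against both, and only then does the caller-facing name switch over.

```python
def total_price_old(items):
    # Original "smelly" implementation: manual index bookkeeping.
    total = 0
    i = 0
    while i < len(items):
        total = total + items[i][0] * items[i][1]
        i = i + 1
    return total


def total_price_new(items):
    # New implementation, added in parallel with the old one.
    return sum(price * quantity for price, quantity in items)


def check_behaviour(implementation):
    # The same automated checks are run after every small step.
    assert implementation([]) == 0
    assert implementation([(2, 3)]) == 6
    assert implementation([(2, 3), (5, 1)]) == 11


check_behaviour(total_price_old)
check_behaviour(total_price_new)

# Once both pass, callers are pointed at the new code. Removing
# total_price_old and rerunning the checks would be the final step.
total_price = total_price_new
check_behaviour(total_price)
```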

Refactoring Automation in IDEs

Refactoring is much, much easier to do automatically than it is to do by hand. Fortunately, more and more Integrated Development Environments (IDEs) are building in automated refactoring support. For example, one popular IDE for Java is Eclipse, which includes more auto-refactorings all the time. Another favorite is IntelliJ IDEA, which has historically included even more refactorings. In the .NET world, there are at least two refactoring tool plugins for Visual Studio 2003, and we are told that future versions of Visual Studio will have built-in refactoring support.

To refactor code in Eclipse or IDEA, you select the code you want to refactor, pull down the specific refactoring you need from a menu, and the IDE does the rest of the hard work. You are prompted appropriately by dialog boxes for new names for things that need naming, and for similar input. You can then immediately rerun your tests to make sure that the change didn’t break anything. If anything was broken, you can easily undo the refactoring and investigate.

Example

Add Parameter

A method needs more information from its caller.

Add a parameter for an object that can pass on this information.

Before: Customer.getContact()
After: Customer.getContact(data)

Inverse of Remove Parameter.

Naming: In IDEs this refactoring is usually done as part of “Change Method Signature”
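
A minimal sketch of Add Parameter in Python follows. The contact dictionary, the parameter name kind and its default value are assumptions made for this example; the source only says that getContact gains a parameter.

```python
class CustomerBefore:
    def __init__(self, contacts):
        self._contacts = dict(contacts)

    def get_contact(self):
        # The caller cannot say which contact it needs; email is hard-wired.
        return self._contacts["email"]


class Customer:
    def __init__(self, contacts):
        self._contacts = dict(contacts)

    def get_contact(self, kind="email"):
        # The added parameter lets the caller pass the missing information.
        # The default (an assumption of this sketch) keeps old call sites working.
        return self._contacts[kind]


contacts = {"email": "ann@example.com", "phone": "555-0100"}
customer = Customer(contacts)
```

Giving the new parameter a default is one way to take the change in small steps: existing callers keep working while new callers start passing the extra information.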

Refactoring a Database – a Major and Typical Variant of Refactoring
“A database refactoring is a process or act of making simple changes to your database schema that improves its design while retaining both its behavioral and informational semantics. It includes refactoring either structural aspects of the database, such as table and view definitions, or functional aspects, such as stored procedures and triggers. Hence, it can often be thought of as a way to normalize your database schema.”
For a better understanding of the concept, let us consider the example of a typical database refactoring technique named Split Column, in which you replace a single table column with two or more other columns. For example, you are working on the PERSON table in your database and find that the DATE column is being used for two distinct purposes: a) to store the birth date when the person is a customer and b) to store the hire date when the person is an employee. This becomes a problem if the application is required to retrieve a person who is both a customer and an employee. So, before we implement such a new requirement, we need to fix the database schema by replacing the DATE column with equivalent BirthDate and HireDate columns. To maintain the behavioral semantics of the database schema, we need to update all the supporting source code that previously accessed the DATE column so that it works with the two newly introduced columns. Likewise, to maintain the informational semantics, we need to write a migration script that loops through the table, determines the appropriate type, and then copies the existing date data into the appropriate column.
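
The migration part of Split Column might look roughly like the sketch below, written against an in-memory SQLite database. The PersonType column, its 'customer' and 'employee' values and the sample rows are assumptions made for illustration; the source does not say how a person's type is stored. The old DATE column is deliberately left in place so that existing code keeps working during the transition.

```python
import sqlite3


def split_date_column(conn):
    """Add BirthDate and HireDate and copy DATE into the matching column."""
    conn.execute("ALTER TABLE PERSON ADD COLUMN BirthDate TEXT")
    conn.execute("ALTER TABLE PERSON ADD COLUMN HireDate TEXT")
    # PersonType is a hypothetical discriminator column.
    conn.execute(
        'UPDATE PERSON SET BirthDate = "DATE" WHERE PersonType = ?', ("customer",)
    )
    conn.execute(
        'UPDATE PERSON SET HireDate = "DATE" WHERE PersonType = ?', ("employee",)
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE PERSON (Id INTEGER PRIMARY KEY, PersonType TEXT, "DATE" TEXT)'
)
conn.executemany(
    'INSERT INTO PERSON (PersonType, "DATE") VALUES (?, ?)',
    [("customer", "1980-05-17"), ("employee", "2015-09-01")],
)
split_date_column(conn)
rows = conn.execute(
    "SELECT PersonType, BirthDate, HireDate FROM PERSON ORDER BY Id"
).fetchall()
```

Here the set-based UPDATE statements stand in for the row-by-row loop described above; either approach copies each existing date into the column that matches the person's type.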

Classification of Database Refactoring
The database refactoring process is classified into the following major categories:
1. Data quality – a refactoring that focuses on improving the quality of the data and information that resides within the database. Examples include introducing column constraints and replacing a type code with boolean values.
2. Structural – as the name implies, this refactoring is intended to change the database schema. Examples include renaming a column or splitting a column.
3. Referential Integrity – this is a kind of structural refactoring which is intended to refactor the database to ensure referential integrity. Examples include introducing cascading delete.
4. Architectural – this is a kind of structural refactoring which is intended to refactor one type of database item to another type.
5. Performance – this is a kind of structural refactoring which is aimed at improving the performance of the database. Examples include introducing an alternate index to speed up searches during data selection.
6. Method – a refactoring which is intended to change a method (typically a stored procedure, stored function or trigger) to improve its quality. Examples include renaming a stored procedure to make it easier to refer to and understand.
7. Non-Refactoring Transformations – a change to the database schema that, unlike a refactoring, changes its semantics. Examples include adding a new column to an existing table.

Why isn’t Database Refactoring Easy?
Generally, database refactoring is presumed to be more difficult and complicated than code refactoring, not just because of the need to give thoughtful consideration to the behavioral and informational semantics, but because of a distinct attribute referred to as coupling. Coupling is a measure of the degree of dependency between two entities/items: the more coupling there is between them, the greater the likelihood that a change in one will require a change in the other. Coupling is therefore understood to be the root cause of the issues in database refactoring, i.e. the more things your database is coupled to, the harder it is to refactor. Unfortunately, the majority of relational databases are coupled to a wide variety of things, as listed below:

■ Application source code
■ Source code that facilitates data loading
■ Code that facilitates data extraction
■ Underlying persistence layers/frameworks that govern the overall application process flow
■ The respective database schema
■ Data migration scripts, etc.

Refactoring Steps – Database Perspective
Generally, the need to refactor the database schema is identified by an application developer who is trying to implement a new requirement or fix a defect. The application developer describes the required change to the project’s DBA, and the refactoring begins. As part of this exercise, the DBA will typically work through all or some of the following steps in chronological order:
1. Most importantly, verify whether database refactoring is required or not – this is the first thing that the DBA does, and it is where they determine whether a database refactoring is needed and whether the proposed one is the right one to perform. The next important thing is to assess the overall impact of the refactoring.
2. If it is inevitable, choose the most appropriate database refactoring – there are usually several choices for implementing new logic and structures within a database, and this step is about choosing the right one.
3. Deprecate the original schema – this is not a straightforward step, because you cannot simply make a change to the database schema instantly while retaining the behavior. Instead, adopt an approach that works with both the old and the new schema in parallel for a while, to give the other teams the time they need to refactor and redeploy their systems.
4. Modify the schema – this step is intended to make the requisite changes to the schema and ensure that the necessary logs are also updated accordingly, e.g. the database change log, which is typically the source code for implementing all database schema changes, and the update log, which contains the source code for future changes to the database schema.
5. Migrate the data – this is the crucial step, which involves migrating and/or copying the data from old versions of the schema to the new.
6. Modify all related external programs – this step is intended to ensure that all the programs which access the portion of the database schema that is the subject of the refactoring are updated to work with the new version of the schema.
7. Conduct regression tests – once the changes to the application code and database schema have been put in place, run the regression test suite to ensure that everything is working correctly.
8. Keep the team informed about the changes made and version control the work – this is an important step because the database is a shared resource; at a minimum, it is shared by the application development team. So, it is the prime responsibility of the DBA to keep the team informed about the changes made to the database. In addition, since database refactoring involves DDL, change scripts, data migration scripts, data model scripts, test data and its generation code, etc., all these scripts have to be put under configuration management by checking them into a version control system for better versioning, control, and consistency.
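
One possible shape for the database change log mentioned in step 4 is sketched below. It is only an assumption about how such a log could be kept: the CHANGE_LOG entries, the SCHEMA_VERSION table and the apply_pending_changes function are all hypothetical names. Each numbered change is applied once, in order, and recorded, so every sandbox can be brought to the same schema version.

```python
import sqlite3

# Hypothetical change log: ordered, numbered schema changes kept in version control.
CHANGE_LOG = [
    (1, "CREATE TABLE PERSON (Id INTEGER PRIMARY KEY, Name TEXT)"),
    (2, "ALTER TABLE PERSON ADD COLUMN BirthDate TEXT"),
    (3, "ALTER TABLE PERSON ADD COLUMN HireDate TEXT"),
]


def apply_pending_changes(conn, change_log):
    """Apply every change newer than the recorded version; return what ran."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS SCHEMA_VERSION (Version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT MAX(Version) FROM SCHEMA_VERSION").fetchone()
    current = row[0] or 0  # MAX() yields NULL when nothing has been applied yet
    applied = []
    for version, ddl in sorted(change_log):
        if version > current:
            conn.execute(ddl)
            conn.execute(
                "INSERT INTO SCHEMA_VERSION (Version) VALUES (?)", (version,)
            )
            applied.append(version)
    conn.commit()
    return applied
```

Running the same function against each environment applies only the changes that environment has not yet seen, which is what makes it safe to rerun.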

Once the database schema has been refactored successfully in the application development sandbox (a technical environment where your software, including both your application code and database schema, are developed and unit tested), the team can go ahead with refactoring the requisite Integration, Test/QA, and Production sandboxes as well, to ensure that the changes introduced are available and uniform across all environments.

Refactor Unit Tests

Unit test the current and rewritten code

Unit tests exercise small sections of the code. Ideally each test is independent, and stubs and drivers are used to get control over the environment. Since refactoring deals with small sections of code, unit tests provide the correct scope.

Refactor code that has no existing unit tests

When you work with very old code, in general you do not have unit tests. So can you just start refactoring? No, first add unit tests to the existing code. After refactoring, these unit tests should still hold. In this way you improve the maintainability of the code as well as the quality of the code. This is a complex task. First you need to find out what the functionality of the code is. Then you need to think of test cases that properly cover the functionality. To discover the functionality, you provide several inputs to the code and observe the outputs. Functional equivalence is demonstrated when the refactored code is input/output conformant to the original code.
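
The input/output approach described above can be sketched as follows. The shipping-cost function is an invented stand-in for legacy code, and the sample inputs are arbitrary; matching outputs on a sample are evidence of equivalence for those inputs, not a proof for all inputs.

```python
def legacy_shipping_cost(weight_kg):
    # Stand-in for old, untested code whose rules nobody documented.
    if weight_kg <= 0:
        return 0
    if weight_kg < 5:
        return 4
    return 4 + (weight_kg - 5) * 2


def record_behaviour(func, inputs):
    """Feed each input to func and record the observed output."""
    return {value: func(value) for value in inputs}


# Step 1: capture what the existing code does before touching it.
SAMPLE_INPUTS = [-1, 0, 1, 4, 5, 6, 10]
EXPECTED = record_behaviour(legacy_shipping_cost, SAMPLE_INPUTS)


# Step 2: refactor, giving the magic numbers names.
def refactored_shipping_cost(weight_kg):
    if weight_kg <= 0:
        return 0
    base_cost, per_extra_kg, threshold_kg = 4, 2, 5
    return base_cost + max(0, weight_kg - threshold_kg) * per_extra_kg


# Step 3: the recorded outputs must still hold after the refactoring.
assert record_behaviour(refactored_shipping_cost, SAMPLE_INPUTS) == EXPECTED
```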

Refactor to increase the quality of the existing unit tests

You also see code which contains badly designed unit tests. For example, a unit test may verify multiple scenarios at once. Usually this is caused by not properly decoupling the code from its dependencies, such as the system clock. This is undesirable because a test must not depend on the state of the environment. A solution is to refactor the code to support substitutable dependencies, which allows the test to use a test stub or mock object. In the time-dependent case, the unit test is split into three unit tests which test the three scenarios separately, and the rewritten code has a configurable time provider. The test now uses its own time provider and has complete control over the environment.
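
A configurable time provider could look something like the sketch below. The greeting function, its three time-of-day scenarios and the fixed_clock helper are assumptions made for illustration; the original example is not reproduced in the text.

```python
from datetime import datetime


def greeting(now_provider=datetime.now):
    # The time source is injected; production code uses the real clock by default.
    hour = now_provider().hour
    if hour < 12:
        return "Good morning"
    if hour < 18:
        return "Good afternoon"
    return "Good evening"


def fixed_clock(hour):
    """Test stub: a time provider that always reports the given hour."""
    return lambda: datetime(2024, 1, 1, hour, 0)


# One test per scenario, each in full control of the time.
def test_morning():
    assert greeting(fixed_clock(9)) == "Good morning"


def test_afternoon():
    assert greeting(fixed_clock(15)) == "Good afternoon"


def test_evening():
    assert greeting(fixed_clock(21)) == "Good evening"
```

With the stub in place, none of the three tests depends on when it happens to run.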

Every change in the code needs to be tested. Therefore testing is required when refactoring. You test the changes at different levels. Since a small section of code is changed, unit testing seems the most fitting level. But do not forget the business value! Regression testing is of vital importance for the business.

Test-driven development (TDD)

Test-driven development (TDD) is an advanced technique of using automated unit tests to drive the design of software and force decoupling of dependencies. The result of using this practice is a comprehensive suite of unit tests that can be run at any time to provide feedback that the software is still working. This technique is heavily emphasized by those using Agile development methodologies.

The motto of test-driven development is “Red, Green, Refactor.”

  • Red: Create a test and make it fail.
  • Green: Make the test pass by any means necessary.
  • Refactor: Change the code to remove duplication in your project and to improve the design while ensuring that all tests still pass.

The Red/Green/Refactor cycle is repeated very quickly for each new unit of code.
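
One pass through the cycle might look like the sketch below. The leap-year rule is just a convenient small example, and keeping the "green" version next to the refactored one is done here only so that both stages are visible; in practice the refactored code replaces it.

```python
# Red: this check is written first and fails while no implementation exists.
def check_leap_year(is_leap_year):
    assert is_leap_year(2024)
    assert not is_leap_year(2023)
    assert not is_leap_year(1900)
    assert is_leap_year(2000)


# Green: the most direct code that makes the check pass.
def is_leap_year_green(year):
    if year % 400 == 0:
        return True
    if year % 100 == 0:
        return False
    if year % 4 == 0:
        return True
    return False


# Refactor: same behavior, expressed as a single condition.
def is_leap_year(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


check_leap_year(is_leap_year_green)
check_leap_year(is_leap_year)
```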

Key Benefits of Refactoring
From a system/application standpoint, the key benefits of applying refactoring in a disciplined fashion are summarized below:

  • Improves the overall extensibility of the software.
  • Reduces the code maintenance cost.
  • Facilitates highly standardized and organized code.
  • Improves the system architecture while retaining the behavior.
  • Promotes three essential attributes: readability, understandability, and modularity of the code.
  • Ensures constant improvement in the overall quality of the system.

Justifying the refactoring task might be very difficult, but it is not impossible. Here are some tips for justifying the need for refactoring.
1. Future business changes will require less time. Refactoring will not give an immediate return but, in the long run, adding features will be less expensive as the code will become easier to maintain. Before refactoring, the code is fit for machine consumption but after refactoring it is fit for human as well as machine consumption.
2. Bugs will be fixed during refactoring. Hidden bugs or logic embedded in complicated, unnecessary loops will be exposed, which might result in fixing some longstanding non-reproducible issues.
3. The current application will have a longer life. Prevention is better than cure. Refactoring can be considered to be a prevention exercise which will help to optimize the structure of the application for future enhancements.
4. There might be performance gains. You cannot promise any apparent or measurable performance gain. But if you plan to refactor to achieve some performance gain, you should have measurable counters showing the performance of the current app before you start refactoring, and after each change the performance counters should be recalculated to check the optimization.
5. Refactoring may result in a reduction in the lines of code, making it less expensive to maintain in the long run. During refactoring of your algorithm, you should follow the DRY (Don’t Repeat Yourself) principle. Any application that has survived for 6 months to 1 year will have ample places to remove duplication of code.

Developers do not use the full potential of the refactoring tools available on the market. This might be due to a lack of knowledge or the pressure of timelines. During refactoring, these tools are extremely helpful and valuable as they reduce the chances of introducing an error when making big changes:

  • ReSharper, a Visual Studio add-on for .NET
  • Xcode for Objective-C
  • IntelliJ IDEA for Java

Refactoring using the right tools and good software development practices will be a boon for any application’s long life and sustenance. Refactoring is an opportunity to solidify the foundation of an existing application that might have become weaker after adding a lot of changes and enhancements. If you are making changes to the same piece of code for the third time, it means there is some technical debt that you have created and there is a need to refactor this code.