PEARL IX : Refactoring performed to Sustain Application Development Success in Agile Environments
The term “refactoring” was originally coined by Martin Fowler and Kent Beck which refers to “a change made to the internal structure of software to make it easier to understand and cheaper to modify without altering its actual observable behavior i.e. it is a disciplined way to clean up code that minimizes the chances of introducing bugs and also enables the code to be evolved slowly over time and facilitates taking an iterative and incremental approach to programming and/or design”. Importantly, the underlying objective behind refactoring is to give thoughtful consideration and improve some of the essential non-functional attributes of the software. So, to achieve this, the technique has been broadly classified into following major categories:
1. Code Refactoring (clean-up) : It is intended to remove the unused code, methods, variables etc. which are misleading.
2. Code Standard Refactoring It is done to achieve quality code.
3. Database Refactoring: Just like code refactoring, it is intended to clean (clean-up) or remove the unnecessary and redundant data without changing the architecture.
4. Database schema and Design Refactoring : This includes enhancing the database schema by leaving the actual fields required by the application.
5. User-Interface Refactoring : It is intended to change the UI without affecting the underlying functionality.
6. Architecture Refactoring : It is done to achieve modularization at the application level.
Refactoring is actually a simple technique where you make structural changes to the code in small, independent and safe steps, and test the code after each of these steps just to ensure that you have not changed the behavior – i.e. the code still works the same, but just looks different. Nevertheless, refactoring is intended to fill in some short-cuts, eliminate duplication and dead code, and help ensure the design and logic have been made very clear. Further, it is equally important to understand that, although refactoring is driven by certain good characteristics and shares some common attributes with debugging and/ or optimization, etc., it is actually different because
- Refactoring is not all about fixing any bugs.
- Again, optimization is not refactoring at all.
- Likewise, revisiting and/or tightening up error handling code is not refactoring.
- Adding any defensive code is also not considered to be refactoring.
- Importantly, tweaking the code to make it more testable is also not refactoring.
Re-factoring Activities – Conceptualized
The refactoring process generally consists of a number of distinct activities which are dealt with in chronological order:
- Firstly, identify where the software should be refactored, i.e. figure out the code smell areas in the software which might increase the risk of failures or bugs.
- Next, determine what refactoring should be applied to the identified places based on the list identified.
- Guarantee that the applied refactoring preserves the behavior of the software. This is the crucial step in which, based on the type of software such as real-time, embedded and safety-critical, measures have to be taken to preserve their behavior prior to subjecting them to refactoring.
- Apply the appropriate refactoring technique.
- Assess the effect of the refactoring on the quality characteristics of the software, e.g. complexity, understandability and maintainability, and of the process, e.g. productivity, cost and effort.
- Ensure the requisite consistency is maintained between the refactored program code and other software artifacts.
Refactoring Steps – Application/System Perspective
The points below clearly summarize the important steps to be adhered to when refactoring an application:
1. Firstly, formulate the unit test cases for the application/ system – the unit test cases should be developed in such a way that they test the application behavior and ensure that this behavior remains intact even after every cycle of refactoring.
2. Identify the approach to the task for refactoring – this includes two essential steps:
– Finding the problem – this is about identifying wheth-er there is any code smell situation with the current piece of code and, if yes, then identifying what the problem is all about.
– Assess/Decompose the problem – after identifying the potential problem assess it against the risks involved.
3. Design a suitable solution – work out what the resultant state will be after subjecting the code to refactoring.
Accordingly, formulate a solution that will be helpful intransitioning the code from the current state to the resultant state.
4. Alter the code – now proceed with refactoring the code without changing the external behavior of the code.
5. Test the refactored code – to ensure that the results and/ or behavior are consistent. If the test fails, then rollback the changes made and repeat the refactoring in different way.
6. Continue the cycle with the aforementioned steps (1) to (5) until the problematic/current code moves to the resultant state.
So, having said about refactoring and its underlying intent, it can be taken up as a practice and can be implemented safely with ease because the majority of today’s modern IDEs (integrated development environments) are inbuilt and equipped with various refactoring tools and patterns which can be used readily to refactor any application/business-logic/middle-tier code seamlessly. However, the situation may not be the same when it comes to refactoring a database, because database refactoring is conceptually more difficult when compared to code refactoring since with code refactoring you only need to maintain the behavioral semantics, whereas with database refactoring you must also maintain information semantics.
Refactoring is the process of clarifying and simplifying the design of existing code, without changing its behavior. Agile teams are maintaining and extending their code a lot from iteration to iteration, and without continuous refactoring, this is hard to do. This is because un-refactored code tends to rot. Rot takes several forms: unhealthy dependencies between classes or packages, bad allocation of class responsibilities, way too many responsibilities per method or class, duplicate code, and many other varieties of confusion and clutter.
Every time we change code without refactoring it, rot worsens and spreads. Code rot frustrates us, costs us time, and unduly shortens the lifespan of useful systems. In an agile context, it can mean the difference between meeting or not meeting an iteration deadline.
Refactoring code ruthlessly prevents rot, keeping the code easy to maintain and extend. This extensibility is the reason to refactor and the measure of its success. But note that it is only “safe” to refactor the code this extensively if we have extensive unit test suites of the kind we get if we work Test-First. Without being able to run those tests after each little step in a refactoring, we run the risk of introducing bugs. If you are doing true Test-Driven Development (TDD), in which the design evolves continuously, then you have no choice about regular refactoring, since that’s how you evolve the design.
A popular metaphor for refactoring is cleaning the kitchen as you cook. In any kitchen in which several complex meals are prepared per day for more than a handful of people, you will typically find that cleaning and reorganizing occur continuously. Someone is responsible for keeping the dishes, the pots, the kitchen itself, the food, the refrigerator all clean and organized from moment to moment. Without this, continuous cooking would soon collapse. In your own household, you can see non-trivial effects from postponing even small amounts of dish refactoring: did you ever try to scrape the muck formed by dried Cocoa Crispies out of a bowl? A missed opportunity for 2 seconds worth of rinsing can become 10 minutes of aggressive scraping.
Refactorings are the opposite of fiddling endlessly with code; they are precise and finite. Martin Fowler’s definitivebook on the subject describes 72 specific “refactorings” by name (e.g., “Extract Method,” which extracts a block of code from one method, and creates a new method for it). Each refactoring converts a section of code (a block, a method, a class) from one of 22 well-understood “smelly” states to a more optimal state. It takes awhile to learn to recognize refactoring opportunities, and to implement refactorings properly.
Refactoring to Patterns
Refactoring does not only occur at low code levels. In his recent book, Refactoring to Patterns, Joshua Kerievsky skillfully makes the case that refactoring is the technique we should use to introduce Gang of Four design patterns into our code. He argues that patterns are often over-used, and often introduced too early into systems. He follows Fowler’s original format of showing and naming specific “refactorings,” recipes for getting your code from point A to point B. Kerievsky’s refactorings are generally higher level than Fowler’s, and often use Fowler’s refactorings as building blocks. Kerievsky also introduces the concept of refactoring “toward” a pattern, describing how many design patterns have several different implementations, or depths of implementation. Sometimes you need more of a pattern than you do at other times, and this book shows you exactly how to get part of the way there, or all of the way there.
The Flow of Refactoring
In a Test-First context, refactoring has the same flow as any other code change. You have your automated tests. You begin the refactoring by making the smallest discrete change you can that will compile, run, and function. Wherever possible, you make such changes by adding to the existing code, in parallel with it. You run the tests. You then make the next small discrete change, and run the tests again. When the refactoring is in place and the tests all run clean, you go back and remove the old smelly parallel code. Once the tests run clean after that, you are done.
Refactoring Automation in IDEs
Refactoring is much, much easier to do automatically than it is to do by hand. Fortunately, more and more Integrated Development Environments (IDEs) are building in automated refactoring support. For example, one popular IDE for Java is eclipse, which includes more auto-refactorings all the time. Another favorite is IntelliJ IDEA, which has historically included even more refactorings. In the .NET world, there are at least two refactoring tool plugins for Visual Studio 2003, and we are told that future versions of Visual Studio will have built-in refactoring support.
To refactor code in eclipse or IDEA, you select the code you want to refactor, pull down the specific refactoring you need from a menu, and the IDE does the rest of the hard work. You are prompted appropriately by dialog boxes for new names for things that need naming, and for similar input. You can then immediately rerun your tests to make sure that the change didn’t break anything. If anything was broken, you can easily undo the refactoring and investigate.
A method needs more information from its caller.
Add a parameter for an object that can pass on this information.
inverse of Remove Parameter
Naming: In IDEs this refactoring is usually done as part of “Change Method Signature”
Refactoring a Database – a Major and Typical Variant of Refactoring
“A database refactoring is a process or act of making simple changes to your database schema that improves its design while retaining both its behavioral and informational semantics.
It includes refactoring either structural aspects of the database such as table and view definitions or functional aspects such as stored procedures and triggers etc. Hence, it can be often thought of as the way to normalize your database schema.”
For a better understanding and appreciation of the concept,
let us consider the example of a typical database refactoring technique named Split Column, in which you replace a single table column with two or more other columns. For example, you are working on the PERSON table in your database and figure out that the DATE column is being used for two distinct purposes. a) to store the birth date when the person is a customer and b) to store the hire date when the person is an employee. Now, there is a problem if we have a requirement with the application to retrieve a person who is both customer and employee. So, before we proceed to implement and/or simulate such new requirement, we need to fix the database schema by replacing the DATE column with equivalent BirthDate and HireDate columns. Importantly, to maintain the behavioral semantics of the database schema we need to update all the supporting source code that accessed the DATE column earlier to now work with the newly introduced two columns. Likewise, to maintain the informational semantics we need to write a typical migration script that loops through the table, determines the appropriate type, and then copies the existing date data into the appropriate column.
Classification of Database Refactoring
The database refactoring process is classified into following
1. Data quality – the database refactoring process which largely focuses on improving the quality of the data and information that resides within the database. Examples include introducing column constraints and replacing the type code with some boolean values, etc.
2. Structural – as the name implies this database refactoring process is intended to change the database schema.
Examples include renaming a column or splitting a column etc.
3. Referential Integrity – this is a kind of structural refactoring which is intended to refactor the database to ensure referential integrity. Examples include introducing cascading delete.
4. Architectural – this is a kind of structural refactoring which is intended to refactor one type of database item to another type.
5. Performance – this is a kind of structural refactoring which is aimed at improving the performance of the database. Examples include introducing alternate index to fasten the search during data selection.
6. Method – a refactoring technique which is intended to change a method (typically a stored procedure, stored function or trigger, etc.) to improve its quality. Examples include renaming a stored procedure to make it easier to refer and understand.
7. Non-Refactoring Transformations – this type of refactoring technique is intended to change the database schema that, in turn, changes its semantics. Examples include
adding new column to an existing table.
Why isn’t Database Refactoring Easy?
Generally, database refactoring is presumed to be a difficult and/or complicated task when compared to code refactoring. not just because there is the need to give thoughtful consideration to the behavioral and information semantics, but due to a distinct attribute referred to as coupling. The term coupling is understood to be the measure of the degree of the dependencies between two entities/items. So, the more coupling there is between entities/items, the greater the likelihood that a change in one will require a change in another. Hence, it is understood that coupling is the root cause of all the issues when it comes to database refactoring, i.e. the more things that your database is coupled to, the harder it is to refactor. Unfortunately, the majority of relational databases are coupled to a wide variety of things as mentioned below:
■ Application source code
■ Source code that facilitates data loading
■ Code that facilitates data extraction
■ Underlying Persistent layers/frameworks that govern the overall application process flow
■ The respective database schema
■ Data migration scripts, etc.
Refactoring Steps – Database Perspective
Generally, the need to refactor the database schema will be identified by a application developer who is actually trying to implement a new requirement or fix a defect. Then the application developer describes the required change to the concerned DBA of the project and then refactoring begins. Now, as part of this exercise, the DBA will typically work through all or a few of the following steps in chronological order:
1. Most importantly, verify whether database refactoring is required or not – this is the first thing that the DBA does, and it is where they will determine whether database refactoring is needed and/or if it is the right one to perform. Now the next important thing is to assess the overall impact of the refactoring.
2. If it is inevitable, choose the most appropriate database refactoring – this important step is about having several choices for implementing new logic and structures within a database and choosing the right one.
3. Deprecate the original schema – this is not a straightforward step, because you cannot simply make a change retaining the behavior. to the database schema instantly. Instead, adopt an approach that will work with both the old and the new schema in parallel for a while to provide the required time for the other team to both refactor and redeploy their
4. Modify the schema – this step is intended to make the requisite changes to the schema and ensure that the necessary logs are also updated accordingly, e.g. database change log which is typically the source code for implementing all database schema changes and update log which contains the source code for future changes to the database schema.
5. Migrate the data – this is the crucial step which involves migrating and/or copying the data from old versions of the schema to the new.
6. Modify all related external programs – this step is intended to ensure that all the programs which access the portion of database schema which is for the subject of refactoring must be updated to work with the new version of the database schema.
7. Conduct regression test – once the changes to the application code and database schema have been put in place, then it is good to run the regression test suite just to ensure that everything is right and working correctly.
8. Keep the team informed about the changes made and version control the work – this is an important step because the database is a shared resource and it is minimally shared by the application development team. So, it is the prime responsibility of the DBA to keep the team informed about the changes made to the database. Nevertheless, since database refactoring definitely includes some DDLs, change scripts, data migration scripts, data models related scripts, test data and its generation code, etc., all these scripts have to be put under configuration management by checking them into a version control system for better versioning, control, and consistency.
Once the database schema has been refactored successfully in the application development sandbox (a technical environment where your software, including both your application code and database schema, are developed and unit tested), the team can go ahead with refactoring the requisite Integration, Test/QA, and Production sandboxes as well, to ensure that the changes introduced are available and uniform across all environments.
Refactor Unit Tests
Unit test the current and rewritten code
Unit tests are tests to test small sections of the code. Ideally each test is independent, and stubs and drivers are used to get control over the environment. Since refactoring deals with small sections of code, unit tests provide the correct scope.
Refactor code that has no existing unit tests
When you work with very old code, in general you do not have unit tests. So can you just start refactoring? No, first add unit tests to the existing code. After refactoring, these unit tests should still hold. In this way you improve the maintainability of the code as well as the quality of the code. This is a complex task. First you need to find out what the functionality of thecode is. Then you need to think of test cases that properly cover the functionality. To discover the functionality, you provide several inputs to the code and observe the outputs. Functional equivalence is proven when the code is input/output conformant to the original code.
Refactor to increase the quality of the existing unit tests You also see code which contains badly designed unit tests. For example, the unit test verifies multiple scenarios at once. Usually this is caused by not properly decoupling the code from its dependencies . This is undesirable behaviour because the test must not depend on the state of the environment. A solution is to refactor the code to support substitutable dependencies. This allows the test to use a test stub or mock object. The unit test is split into three unit tests which test the three scenarios separately. The rewritten code has a configurable time provider. The test now uses its own time provider and has complete control over the environment.
Every change in the code needs to be tested. Therefore testing is required when refactoring. You test the changes at different levels. Since a small section of code is changed, unit testing seems the most fitting level. But do not forget the business value! Regression testing is of vital importance for the business.
Test-driven development (TDD)
Test-driven development (TDD) is an advanced technique of using automated unit tests to drive the design of software and force decoupling of dependencies. The result of using this practice is a comprehensive suite of unit tests that can be run at any time to provide feedback that the software is still working. This technique is heavily emphasized by those using Agile development methodologies
The motto of test-driven development is “Red, Green, Refactor.”
- Red: Create a test and make it fail.
- Green: Make the test pass by any means necessary.
- Refactor: Change the code to remove duplication in your project and to improve the design while ensuring that all tests still pass.
The Red/Green/Refactor cycle is repeated very quickly for each new unit of code.
Key Benefits of Re-factoring
From a system/application standpoint, listed below are summaries of the key benefits that can be achieved seamlessly when implementing the refactoring process in a disciplined fashion:
- Firstly, it improves the overall software extendability.
- Reduces and optimizes the code maintenance cost.
- Facilitates highly standardized and organized code.
- Ensures that the system architecture is improved by retaining the behavior.
- Guarantees three essential attributes: readability, understandability, and modularity of the code.
- Ensures constant improvement in the overall quality of the system.
Justifying the refactoring task might be very difficult, but not impossible. Here are the tips for justifying the need for refactoring.
1. Future business changes will require less time. Refactoring will not give an immediate return but, in the long run, adding features will be less expensive as the code will become easier to maintain. Before refactoring, the code is fit for machine consumption but after refactoring it is fit for human as well as machine consumption.
2. Bugs will be fixed during refactoring. Hidden bugs or logics embedded in complicated unnecessary loops will be exposed, which might result in fixing some longstanding
3. The current application will have a longer life. Prevention is better than cure. Refactoring can be considered to be a prevention exercise which will help to optimize the structure of the application for future enhancements.
4. There might be performance gains. You cannot promise any apparent or measurable performance gain. But if you are planning to do refactoring to achieve some performance gain, then you should have measurable counters showing the performance of the current app before you start refactoring. And after each change the performance counters should be recalculated to check the optimization.Refactoring may result in a reduction in the lines of code, making it less expensive to maintain in the long run. During refactoring of your algorithm, you should follow the DRY (Don’t Repeat Yourself) principle. Any application
that has survived for 6 months to 1 year will have ample places to remove duplication of code.
Developers do not use the full potential of the refactoring tools available on the market.
This might be due to a lack of knowledge or pressure of timelines. During refactoring, these tools are extremely helpful and valuable as they reduce the chances of intro- ducing an error when making big changes
- Resharper VIsual Studio Add on for .NET
- XCode for Objective C #
- iNTELLIJ idea For Java
Refactoring using the right tools and good software development practices will be a boon for any application’s long life and sustenance. Refactoring is an opportunity to solidify the foundation of an existing application that might have become weaker after adding a lot of changes and enhancements. If you are making changes to the same piece of code for the third time, it means there is some technical debt that you have created and there is a need to refactor this code.