6. Quality and Testing
6.1 Internal and external quality of a system
Defining quality in software development means considering the perspectives of the two main stakeholders involved in a software project: the customer and the programmer.
If we define software quality from the customer perspective, we realize that quality is what the real user sees and feels when interacting with the software. Mary and Tom Poppendieck call it perceived integrity; Martin Fowler and Kent Beck call it the external quality of a system.
A wonderful example regarding the external quality of a system is given in “Lean Software Development: An Agile Toolkit” [], written by Mary and Tom Poppendieck, about how everyone needs to contribute to quality and about how different people see quality differently:
A tram driver at Disney World noticed a little girl crying: the crowd around Mickey Mouse was too large, so the girl had not been able to talk to Mickey. The driver called ahead, and when the tram arrived at the hotel, there was Mickey Mouse, waiting to meet it. The girl was thrilled, and the driver had done his job of making sure she had a quality experience.
—Mary and Tom Poppendieck
Looking at a software product only from the programmer’s point of view means looking mainly at its maintainability and extensibility, not merely at whether the program respects design patterns or object-oriented principles. A software program is of high quality, or is considered to have a good design or architecture, if it can be modified, adapted and extended with ease.
Martin Fowler and Kent Beck describe internal quality:
… internal quality. This reflects the quality of the internals of the system: how well it is designed, how good the internal tests are, and so on. This is a very dangerous lever to play with. If you allow internal quality to drop you'll get a small immediate increase in speed, rapidly followed by a much bigger decrease in speed. As a result you must keep an eagle eye on this lever and make sure it is always up as far as it can go. Nothing kills speed more effectively than poor internal quality.
[Planning Extreme Programming]
Mary and Tom Poppendieck extend the concept of internal quality to conceptual integrity:
Conceptual integrity means that a system's central concepts work together as a smooth, cohesive whole. The components match and work well together; the architecture achieves an effective balance between flexibility, maintainability, efficiency, and responsiveness.
[Mary and Tom Poppendieck, 2003]
6.2 Automated tests
It is not testing alone that leads to high quality, but the constant focus on maintaining quality at its highest, all the time. Traditional processes tend to confuse quality with testing, and although a lot of time is allocated to testing, the testing phase is left as the last stage of the process.
Agile processes have developed techniques to maintain this focus on high quality all the time, the most important being continuous integration. At the base of these techniques stand automated tests: acceptance tests (defined with the customer) and unit tests that are run frequently, giving the programmers feedback about the quality status of the system. Keeping all tests passing at 100%, and having good automated test coverage of the system, focuses the team on quality and detects breakage early, so it can be fixed fast, before it eats through the system.
Automated tests, continuous integration, refactoring and customer involvement, combined with the focus on quality that needs to be delivered at the end of each iteration, ensure that the system’s integrity is built in and maintained all the time, from the beginning to the end.
6.3 Test Driven Development
Ron Jeffries expresses the Test Driven Development paradox very well:
Writing code and tests is faster than writing just the code, if the code has to work.
Now let’s see if this is true:
The mini TDD experiment
To demonstrate the concepts around test-first approaches and test driven development, we will conduct a small experiment. We assume that we are programmers and we need to code a function that divides two numbers. For this experiment we will compare the traditional approach and the TDD approach.
Approach #1. Code and fix
As programmers, for a simple division we will write the following “pseudo” code:
Function Divide(No1, No2)
Return No1/No2
For this very simple method, let’s assume we needed 5 seconds to write it. Now let’s test if it works. First we try 6 and 2, expecting 3. It works. Let’s try another combination: 1 and 2, expecting 0.5. It works. Now let’s try 8 and 0. An error just occurred. This means we need to modify the program to display a message to the user that the second number cannot be 0:
Function Divide(No1,No2)
If No2 = 0 then display message “Division by 0 cannot be performed”
Else Return No1/No2
Now let’s test our function again: 6 and 2 gives 3, good; 1 and 2 gives 0.5, as expected; 8 and 0 displays the message “Division by 0 cannot be performed”, as expected. Now our program works fine.
Assuming that manual testing is slow and each combination of numbers takes about 10 seconds to check, a testing session takes 30 seconds. The total development time was: 5 seconds to write the function, 30 seconds to test it and see it has problems with division by 0, about another 5 seconds to correct the function, and 30 seconds to test it again and make sure it works: in total 5+30+5+30 = 70 seconds, a minute and 10 seconds.
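For readers who want something more concrete than the pseudocode, a minimal sketch of the code-and-fix result in Python might look as follows (the function name mirrors the pseudocode; modeling “display message” as a simple print is our assumption):

def divide(no1, no2):
    # Guard added only after manual testing revealed the division-by-0 error.
    if no2 == 0:
        print("Division by 0 cannot be performed")
        return None
    return no1 / no2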
Approach #2: Test Driven development
In test driven development there is a series of steps to write a piece of code, starting with an automated test, written first, and ending with making that test succeed by writing the code that it tests. Let’s see how it goes:
Function TestNormalDivision()
Expect 3 as a result of Divide(6,2)
The code above compares the expected value with the value returned by our (yet unwritten) code, and if they do not match, the test fails.
One very important step now is to make sure our test really tests something and does not pass every time, no matter what the code under test does. For this we need to make sure that when it should fail, it fails. So we write the following function:
Function Divide(No1, No2)
Return 0
Now we run the test, and it fails, saying: expected 3 but the result was 0. So now we modify the function to pass the test:
Function Divide(No1, No2)
Return 3
Now we run the test again: 1 test succeeded. Excellent. Now let’s see if it works for 1 and 2, so we update the test:
Function TestNormalDivision()
Expect 3 as a result of Divide(6,2)
Expect 0.5 as a result of Divide(1,2)
We run the test. Failure. Oooh, we realize the mistake we made (the code always returns 3) and modify the Divide function:
Function Divide(No1, No2)
Return No1/No2
Running the test now passes all our expectations. But now we think: what would happen if we used 8 and 0? Let’s add a new test to the test suite (now we have two) and make sure that if there is a division by 0, the user is notified:
Function TestDivisionByZero()
Expect message “Division by 0 cannot be performed” displayed as a result of Divide(8,0)
We run the test. It fails. Now we modify our function to make it work:
Function Divide(No1,No2)
If No2 = 0 then display message “Division by 0 cannot be performed”
Else Return No1/No2
Running all our tests, we discover that they all succeed.
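As a hedged illustration, the final state of this TDD session could be written with Python’s unittest module roughly as follows (function and test names mirror the pseudocode; capturing the printed message in the test is our assumption about how “display message” would be verified):

import io
import contextlib
import unittest

def divide(no1, no2):
    if no2 == 0:
        print("Division by 0 cannot be performed")
        return None
    return no1 / no2

class TestDivide(unittest.TestCase):
    def test_normal_division(self):
        self.assertEqual(divide(6, 2), 3)
        self.assertEqual(divide(1, 2), 0.5)

    def test_division_by_zero(self):
        # Capture what the function prints so the test can assert on it.
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            divide(8, 0)
        self.assertIn("Division by 0 cannot be performed", buffer.getvalue())

if __name__ == "__main__":
    unittest.main()

Running this file executes both tests in well under a second, which is the point of the timing comparison that follows.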
How much time did we need to write this code? We needed 5 seconds to write the first test, 5 seconds to write the stub and make sure the test fails, 1 second to run the test (testing is now done by the computer, so we assume it is at least 10 times faster than manual testing), 5 seconds to modify the code to make the test pass, 1 second to run the test, another 5 seconds to extend the test to verify the 1,2 combination, 1 second to see the test fail, 5 seconds to modify the function and 1 second to see it pass, another 5 seconds to write the second test and 2 seconds to see the first test pass but the second fail, then 5 seconds to complete the code and another 2 to run both tests and make sure everything works. Wow, a long way: 5+5+1+5+1+5+1+5+1+5+2+5+2 = 43 seconds.
Using both approaches, we ended up with the same code. The amount of code written in the second approach is bigger than in the first, since we have both the code and the tests. The amount of time needed for the second approach was arguably smaller than for the first, which leads us to Ron Jeffries’s conclusion: to obtain code that works, writing tests and code is faster than writing the code alone. The main advantage is that we use computer power rather than human power to do the testing, so we are much faster. We can then run the automated tests over and over again, and it takes 2 seconds to see if they pass, while doing the same thing manually takes 30.
Let’s go further with our experiment, assuming that we now need to extend the program to also perform additions, subtractions and multiplications.
Approach #1. Since none of these operations is affected by 0 (but we test that anyway), the code written first will work, so it would take about 5 seconds to write each method, and testing each with 3 combinations of numbers would take about 30 seconds. The time needed would be 5+30+5+30+5+30 = 105 seconds, that is 1 minute and 45 seconds. Testing the whole program (the 3 new methods and the division method) would take 4*30 = 120 seconds, which is 2 minutes.
Approach #2. TDD
Each of the operations above will need only one test, checking 3 combinations. Let’s say it takes 10 seconds to write a test method like this:
Function TestMultiplication()
Expect 0 as a result of Multiply(6,0)
Expect 3 as a result of Multiply(3,1)
Expect -9 as a result of Multiply(3,-3)
Then we’d have to make sure it fails: 5 seconds, plus 1 second to run the test; then we’d write the code to make it work: 5 seconds, and 1 second to make sure it works. So each new function takes about 10+5+1+5+1 = 22 seconds, resulting in 3*22 = 66 seconds, or 1 minute and 6 seconds, to write the new functions. Testing all the code would mean running 5 test methods (2 for division and 1 for each of the other three operations), which run in 5 seconds.
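In the same Python sketch, the multiplication test and the code that makes it pass might look like this (multiply is our hypothetical name, mirroring the pseudocode; addition and subtraction would follow the same pattern):

import unittest

def multiply(no1, no2):
    return no1 * no2

class TestMultiplication(unittest.TestCase):
    def test_multiplication(self):
        # The three combinations from the pseudocode above.
        self.assertEqual(multiply(6, 0), 0)
        self.assertEqual(multiply(3, 1), 3)
        self.assertEqual(multiply(3, -3), -9)

if __name__ == "__main__":
    unittest.main()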
Tests and code, faster than just code
Comparing the times needed to test our incredibly simple system, 2 minutes vs. 5 seconds, shows us that not only is the code written faster (70+105 = 175 seconds vs. 43+66 = 109 seconds), but making sure it works requires far less time with the TDD approach. The second big advantage is that the testing can be done by a computer.
Using a continuous integration machine that downloads the program sources, runs the test suite and then sends us an email telling us what happened means 5 seconds for the machine and 0 seconds on my side to test the whole system. Using the first approach, it would take me 2 minutes to make sure the whole system works. I could delegate the responsibility of testing the whole system to a testing team, but then the feedback time, telling me whether the system works as a whole or not, increases rapidly to days and weeks, and by that time I would be doing something else.
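The integration machine itself needs nothing exotic; a minimal sketch of the test-running part in Python (the tests directory name is hypothetical, and the e-mail notification is only indicated, since it depends on the environment):

import unittest

# Discover every test module in the project and run the whole suite.
suite = unittest.defaultTestLoader.discover("tests")
result = unittest.TextTestRunner().run(suite)

# A real integration machine would now e-mail this outcome to the team.
print("BUILD OK" if result.wasSuccessful() else "BUILD BROKEN")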
Scalability
In the third phase of our little experiment, we analyze what would happen if our system had 400 functions instead of 4. Using the first approach, a full test would need about 12000 seconds (that is over 3 hours), while using TDD and the automated testing suite it would need about 500 seconds or, better said, less than 10 minutes. This simple example shows how TDD scales compared with traditional coding approaches. The testing team could work in parallel to some extent but, after all, I could also set up my integration machine to divide the tests and run them in parallel.
After this very small experiment, we have shown that test driven development, compared with just coding, is:
- faster to develop
- faster to test the whole system and give feedback
- scalable
Tests as documentation
Another advantage of the method described above is that the automated tests can act as very good documentation of the code written. In traditional approaches, documenting things that can easily be deduced from the automated tests, such as how a function works, would increase the development time even more. After all, just reading:
Function TestDivisionByZero()
Expect message “Division by 0 cannot be performed” displayed as a result of Divide(8,0)
tells me, or someone new to the project, that if you try to perform a division by 0, the system will display an error message on the screen.
Embrace change: how?
Having a system with 4, 400, 1000…100,000 methods doesn’t really comfort me when it comes to making a change in it. If I change one tiny piece of code somewhere, could I break something in another part of the system? And if I do, how can I know fast enough to be able to either correct it or reverse my changes?
To get feedback from the code, telling me if and where I have broken existing functionality, I would normally need to retest the whole system. For a 4-method system that takes 2 minutes, but for a more realistic system it might take hours, days or even weeks. So the courage to change decreases as the system gets bigger, thus shortening the life of the system. When a system is too rigid and can no longer adapt to the changes in the market, it is bound to die.
Having a full automated regression test suite that runs very fast and can be run very often means fast feedback. Fast feedback means changes are less risky and can be made more easily and quickly, thus extending the life of a system.
Design advantages
Another advantage of writing automated tests for the code is that the code written tends to be very loosely coupled, thus better designed. Test driven development also tends to eliminate “partially completed code”, encouraging less code to be written, as the programmer is more focused on what is really needed, thus decreasing the amount of code and its complexity.
At a macro level, the fact that changes, even architectural ones, are much easier to perform when using TDD combined with aggressive refactoring allows the programmers to continuously upgrade the design and update the architecture. Since changes are easy to make, the evolutionary design technique is encouraged, with a much smaller need to build a flexible architecture upfront, following the YAGNI principle from XP.
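As an illustration of this design pressure (our own example, not part of the experiment above): code written test-first tends to receive its collaborators from the outside, so a test can substitute a fake and the production code stays loosely coupled:

import unittest

class TaxCalculator:
    # The rate source is passed in rather than created inside:
    # the class does not depend on any concrete implementation.
    def __init__(self, rate_source):
        self.rate_source = rate_source

    def tax(self, amount):
        return amount * self.rate_source.rate()

class FakeRateSource:
    # A stand-in used only by the test.
    def rate(self):
        return 0.10

class TestTaxCalculator(unittest.TestCase):
    def test_tax_uses_the_injected_rate(self):
        calculator = TaxCalculator(FakeRateSource())
        self.assertAlmostEqual(calculator.tax(100), 10.0)

if __name__ == "__main__":
    unittest.main()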
How do we achieve external quality at 100% all the time?
After each iteration a potentially shippable system is delivered, so quality is at 100% every 2-4 weeks, maintaining the system in this way at the highest level throughout its lifecycle.
Frequent releases that allow the customer representatives to see the system all the time keep the project on the right path and at the right perceived quality. When the customer sees the software, he can very easily spot things that do not match what he thought they would be, and together these issues can be fixed in the next iteration. The focus on quality in agile development is mainly expressed through the iterative and incremental development model, which allows the stakeholders in a project to identify problems early and correct them as soon as they are found, thus maintaining the system at the highest quality.
As a small example, consider one frequent non-functional requirement: the system must be fast. In traditional testing techniques, the test and development teams had to figure out by themselves what fast means. This is a very difficult requirement to deal with because, stated this way, it is almost impossible to measure. Presenting the system to the customer after an iteration might prompt him to say: “I want that report to be faster. At this time it needs about 10 seconds. Can you make it faster? Let’s say 2-4 seconds.” Suddenly, there is a clear definition of where “fast” applies, allowing the team to focus on solving a concrete problem. This simple example shows how perceived quality is maintained by the continuous collaboration and feedback between the customer and the team.
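Once “fast” has been pinned down to 2-4 seconds for that report, the requirement can even be captured as an automated test; a minimal sketch, where generate_report is a hypothetical stand-in for the real report code:

import time
import unittest

def generate_report():
    # Hypothetical stand-in for the real report generation.
    time.sleep(0.1)
    return "report"

class TestReportSpeed(unittest.TestCase):
    def test_report_is_generated_within_four_seconds(self):
        start = time.perf_counter()
        generate_report()
        elapsed = time.perf_counter() - start
        self.assertLess(elapsed, 4.0)

if __name__ == "__main__":
    unittest.main()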
Who is involved in quality, how and when?
Everyone, all the time. In traditional processes, the responsibility for quality is mainly delegated to testing teams that must make sure the code is of high quality. Agile thinking makes quality a collective responsibility of the customer, the developers and the testers, all the time, from the first to the last minute of a project.
The customer is involved in quality by defining acceptance tests. The developers are involved by helping the customer write these tests and by writing unit tests for all the production code they produce. The testers are involved by helping the developers automate the acceptance (customer) tests and by extending the suite of automated tests.
Manual vs. automated testing
Manual testing is not forgotten, although agile methodologies put great emphasis on automated tests. Manual testing is still performed. Agile testing is about balancing the need to automate, when it is beneficial, with relying on manual testing, when it is more efficient than writing automated tests.
How do we achieve internal quality?
Simplicity, unit tests running at 100%, refactoring, YAGNI, collective ownership and continuous integration, combined with good coding techniques, are all practices that as a whole aim to maintain code quality at its highest all the time.
Regarding code quality and the simplicity principle, Ron Jeffries writes:
Everything we write must:
- Run all the tests
- Express every idea we need to express
- Say everything once and only once
- Have a minimum number of classes and methods, consistent with the above
[Ron Jeffries, 2001]