Becoming Agile: 2006

Monday, December 18, 2006

Ajax Scaffolding with Castle MonoRail and C#

Download Sources, and view demos

Goal

Let's say we need to quickly write an application that performs the basic CRUD operations for a Product. Ruby on Rails (www.rubyonrails.org) came up with the excellent idea of scaffolding, and the idea was ported to the Castle MonoRail project (www.castleproject.org). However, the default generators in both RoR and MonoRail do not generate Ajax-based code. For RoR there is a solution at www.ajaxscaffold.com, but so far there is nothing for MonoRail. So I decided to take matters into my own hands...

Where do we start from?

The excellent generator by Marc Andre Cournoyer can generate projects, controllers, migrations, models etc., and its templates can very easily be changed to suit one's needs. Starting from Marc Andre's generator, I configured the templates to generate Ajax-based code.

Let's see how it works, step by step starting from nothing.

Step 1: we make a new MonoRail project

Using the latest RC2 and VS.NET 2005, we generate the project with the wizard inside VS.NET.

resulting in:

Then we edit the web.config and we create a new database called Products.

Step 2: we generate the persistent activerecord class and the database table

Starting from a basic Castle MonoRail/ActiveRecord project, we begin by creating the Product class and then generating the database table corresponding to it. In the application's root folder, we type: generate model Product Name Description Price and it generates the code for us:

obtaining the following code (click view all files on the web project, and include the new file in project)

/// <summary>
/// An instance of this class represents a Product.
/// </summary>
[ActiveRecord]
public class Product : ActiveRecordValidationBase {
#region Fields
private int _id;
private string _name;
private string _description;
private string _price;
#endregion

#region Persistent properties
[PrimaryKey]
public int Id {
get { return _id; }
set { _id = value; }
}
[Property]
public string Name {
get { return _name; }
set { _name = value; }
}
[Property]
public string Description {
get { return _description; }
set { _description = value; }
}
[Property]
public string Price
{
get { return _price; }
set { _price = value; }
}
#endregion

internal static IList FindAll()
{
return FindAll(typeof(Product));
}

internal static Product Find(int id)
{
return FindByPrimaryKey(typeof(Product), id) as Product;
}

}

To generate the database schema we just use:

ActiveRecordStarter.CreateSchema();

and we have the table in our database.

Step 3: we generate the Ajax Scaffolding code

In our command window we type: generate scaffold Product and it generates all the code needed for adding, listing, editing and deleting products, using Ajax:

now we include the generated files in our solution from Visual Studio 2005 (right click, Include in project):

Step 4: we run the code. It works!

We start the application:

then we add:

after adding two products:

now viewing in place:

and editing:

and the same with deleting.

What about validations?

Let's say that the name and the price are required. We modify the Product class, to add the validation attributes:

/// <summary>
/// An instance of this class represents a Product.
/// </summary>
[ActiveRecord]
public class Product : ActiveRecordValidationBase {
#region Fields
private int _id;
private string _name;
private string _description;
private string _price;
#endregion

#region Persistent properties
[PrimaryKey]
public int Id {
get { return _id; }
set { _id = value; }
}
[Property,ValidateNotEmpty("Name is not optional.")]
public string Name {
get { return _name; }
set { _name = value; }
}
[Property]
public string Description {
get { return _description; }
set { _description = value; }
}
[Property, ValidateNotEmpty("Price is not optional.")]
public string Price
{
get { return _price; }
set { _price = value; }
}
#endregion

internal static IList FindAll()
{
return FindAll(typeof(Product));
}

internal static Product Find(int id)
{
return FindByPrimaryKey(typeof(Product), id) as Product;
}

}

and restart the application:

How hard is it to extend the product?

Let's say that we've just found out that besides the Name, Description and Price, we also need to add to each product a serial number. We modify the Product class, adding the new field and property:

private string _serialNumber;

...

[Property]
public string SerialNumber
{
get { return _serialNumber; }
set { _serialNumber = value; }
}

Now, we recreate the table:

ActiveRecordStarter.CreateSchema();

and start the application:

and

so it doesn't need any modification in the code or views.

What about relationships?

Let's consider that we want each product to be part of a Category. Following the steps above we create a class Category:

[ActiveRecord]
public class Category : ActiveRecordValidationBase {
#region Fields
private int _id;
private string _name;
private IList products = new ArrayList();
#endregion

#region Persistent properties
[PrimaryKey]
public int Id {
get { return _id; }
set { _id = value; }
}
[Property]
public string Name {
get { return _name; }
set { _name = value; }
}

[HasMany(typeof(Product))]
public IList Products
{
get { return products; }
set { products = value; }
}

#endregion

public override string ToString()
{
return this.Name;
}

}

and modify the product to be part of a Category by adding:

private Category _category=null;
...
[BelongsTo("CategoryID")]
public Category Category
{
get { return _category; }
set { _category = value; }
}

we regenerate the database, obtaining:

Using the generator we also generate the ajax scaffolding code to add/edit/delete and list categories. Now we add two categories:

and going to add new products we'll also have the category combobox, without us making any modification in the code or the views:

Conclusion

As can be seen above, with very little effort we can create an easy-to-extend Ajax application in 10 minutes (thanks to Castle Project and Marc Andre's generator). The code is quite clear and easy to debug and extend, and since it is Ajax-based it can easily be integrated into existing web applications.

Tuesday, October 24, 2006

Agile Methodologies Presentation: from problems to agile solutions: iterations, TDD and reflective improvement

Download as ppt here

Problems

-information reaches managers too late (better tracking is needed) - solving this increases profit by avoiding lost work and rework
-bad quality projects - solving this increases profit, because debugging is no longer paid from our own pocket
-late projects, lack of time - solving this increases customer satisfaction and helps secure future contracts
-poor communication with the customer, with management (already discussed), inside the team and with the code - solving this increases development speed and trust

Solutions
the 3 core practices as a solution

-Iterations – projects are divided into short, equal iterations; at the end of each, some new functionality is shown to the client
-TDD – Test Driven Development, a design practice aimed at making development faster and late changes much easier
-Self improvement meetings – reflection workshops listing activities divided into: to keep, to drop, to try

Information Too Late To Managers: So how can we manage?

Problem:

Managers need to get information quickly and reliably, but usually it does not reach them until very late, when it is hard to cope with

Causes:

-the way information travels to managers is often flawed. Asking an engineer/programmer whether he will be ready on the planned date is just like asking whether your new Armani suit looks good on you. In most cases they cannot tell you no, even if they are very well intentioned.
-a manager gets to see the actual software a few days before the delivery is scheduled, when in many cases it is too late
-most of the time managers only get reports about the work, which, as seen previously, can be made more positive than reality, and as the status document travels upwards, it tends to be 'improved' by those who get their hands on it

Solution (Information Too Late To Managers: So how can we manage?)

Two managers, Ken Schwaber and Jeff Sutherland, who in the early 90s were running large companies with large codebases and plenty of problems, invented a methodology that let them get the REAL information ON TIME. It is called SCRUM, and it has only a few practices:

- Self managing, cross functional teams of max 6-8 people (for more, a scrum of scrums is applied)
- Backlog (the list of things to be done)
- Sprints (One month iterations)
- Sprint planning meeting (at the beginning of the sprint)
- Product demo meeting (demo to the client at the end of the sprint)
- Daily meetings (each morning developers say what they do, and what the problems are if any)

A process outline

1. Build backlog (list of items/features etc needed in the application) and prioritize – the big plan with releases
2. Organize the first iteration/sprint, cutting from the prioritized backlog - a sprint=a project
3. Iteration/Sprint Planning meeting - detailing with the customer what needs to be done, no need for too many details before
4. Daily meetings throughout development - finding out what is going on
5. Product demo meeting - SEEING actually where the project really is
6. Go to step 1 or 2 for the second sprint

So the backlog is the plan to follow, and the demo is when the manager sees whether the team is on track or not. If it is not, the team can be fired, replaced etc., instead of the problems only becoming visible after 5 months of a 6-month project, when it is impossible to do anything about them. In the daily meeting the manager can find out what the problems in the team are, especially whether they are going to be late with the sprint delivery.

Process sample: building the backlog/list of requirements

A new client wants a new software product to manage and track his sales and customers. The system is to go live in 2 months. We meet and establish what is required, then plan to meet the deadline. After the first meetings we come up with the following list of features that he thinks he wants, which we then estimate:

Client management (3p)
Product management (3p)
Sales leads management (4p)
Sales reports (3p)
Client activity management (3p)
User management (2p)
Sales workflow (3p)

Process sample: planning the releases

Now together with the customer, we make the first plan, dividing the work in two iterations:

Iteration #1: 10 p
Client management (3p)
Product management (3p)
Sales leads management (4p)
Iteration #2: 11p
Sales reports (3p)
Client activity management (3p)
User management (2p)
Sales workflow (3p)

Once we have a plan to deliver the 21 points of features in 2 months, we start work on the first iteration, at the end of which we deliver features 1, 2 and 3.
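The point totals in this plan can be double-checked with a few lines of code. This is only an illustrative sketch in Python (the presentation does not rely on any tool for this); the `plan` dictionary and the `iteration_points` helper are hypothetical names introduced here.

```python
# Hypothetical representation of the two-iteration plan from the text:
# each iteration maps feature names to their estimated points.
plan = {
    "Iteration #1": {
        "Client management": 3,
        "Product management": 3,
        "Sales leads management": 4,
    },
    "Iteration #2": {
        "Sales reports": 3,
        "Client activity management": 3,
        "User management": 2,
        "Sales workflow": 3,
    },
}


def iteration_points(plan):
    """Return the total estimated points for each iteration."""
    return {name: sum(features.values()) for name, features in plan.items()}


totals = iteration_points(plan)
release_total = sum(totals.values())

for name, points in totals.items():
    print(f"{name}: {points}p")
print(f"Release total: {release_total}p")
```

Keeping the plan as data like this makes it cheap to re-add the numbers whenever a feature moves from one iteration to another.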

Process sample: delivering and customer wants more

We now show the customer the first 3 features implemented, but he suddenly realizes that he needs more than what was planned for the release. We start by adding what he wants to the list of features, which, after each new feature is estimated, becomes:

Client management (3p)
Product management (3p)
Sales leads management (4p)
Sales reports (3p)
Client activity management (3p)
User management (2p)
Sales workflow (3p)
Activity calendar (3p)
Forecast reports (3p)
Document templates and document merging (3p)

Process sample: changing/adapting the plan

We cannot deliver everything within the 2-month term, so we make one delivery after the two months and plan another delivery afterwards that will include further features to make the system more complete. The new plan looks like:

Release #1: 2 months
Iteration #1: 10p
Client management (3p)
Product management (3p)
Sales leads management (4p)
Iteration #2: 11p
Sales reports (3p)
Client activity management (3p)
User management (2p)
Forecast reports (3p)

Release #2 not defined fully yet
Iteration #3: 9p
Sales workflow (3p)
Activity calendar (3p)
Document templates and document merging (3p)

Process sample: new requirements/new plan

Release #1: 2 months
Iteration #1: 10 p
Client management (3p)
Product management (3p)
Sales leads management (4p)
Iteration #2: 11p
Sales reports (3p)
Client activity management (3p)
User management (2p)
Forecast reports (3p)

Release #2
Iteration #3 – 9p
Sales workflow (3p)
Activity calendar (3p)
Document templates and document merging (3p)
Iteration #4 – 10p
Contact communication management (3p)
Microsoft outlook integration (2p)
Sales processes (5p)

Iterations: benefits

-increase focus and the need to organize inside the team by giving a near-term deadline. Mark Twain: “Nothing focuses the mind like a noose”
-good instrument for planning and tracking progress: if the product is presented each time, it is easy to see where the project is
-minimize risks: Agile thinking minimizes risks because it focuses on the most important and valuable features for the customer, and develops them first
-good instrument for managing changes in software: at the end of an iteration, direction can be changed, and plans adapted to the new needs
-good instrument to build trust: by showing the customer the software after each iteration, he can see progress

Iterations: benefits (2)

-allow learning and adapting: because the customer sees the results after an iteration, he can better express his needs, and he better understands what’s possible and what’s cost effective
-are a good instrument for development: at the end of an iteration, a set of features must be shown as working. This shifts the developers' focus away from developing a system horizontally, layer by layer (data access, then business, then presentation), and assembling the whole system at the end. Instead the focus is on delivering working features, developing them vertically through all layers.
-build confidence and motivation in the team: one very important aspect in development, especially in the early stages of a project, is an early victory. By delivering the first iteration, the team starts to see something positive happening and starts to build confidence that it will win. With each new iteration a new battle is won, getting the team closer and closer to winning the war.
-bring honesty: since iterations are short, the customer sees the product developed quickly and delays surface very early, so they cannot grow into serious problems.
-good instrument to increase quality: at the end of each iteration, a potentially shippable product must be shown to the customer. This focuses the team on keeping quality high, never letting bugs and quality problems persist.

Bad quality increasing costs dramatically (debugging time)

What is quality?

In order to understand what quality is we must divide it into external quality of a system and internal quality, where external quality means what the customer thinks of the system, and internal quality is when the program is easy to change, extend and to maintain

External quality – whether the customer is happy by what he sees and feels
Problem: Poor external quality makes the customer send back the project for rework, and rework kills profits very fast.
Testing a system only at the end is not going to make the project deliverable faster; in most cases it only slows it down. This does not mean that an untested system should be delivered, but that testing should start in minute 1 of a project (with acceptance tests, TDD and manual testing as the tools)
Solution: frequent feedback from the customer.
Tools:
Iterations and product presentation meetings. When each iteration is finished, the product’s planned features are demo-ed to the customer
Acceptance tests gathered from the customer, at the beginning of the iteration, to confirm at the end that what was asked for was implemented

Internal quality

Internal quality – whether the project can be easily changed, maintained and extended.
Problem: “Nothing kills speed more effectively than poor internal quality” Martin Fowler - Planning Extreme Programming
Solution: fast feedback from the code
Tool: TDD – Test Driven Development – by writing all code test first, you end up with all the production code backed by a suite of regression tests. When you want to change something, running the automated regression suite tells you very quickly whether you’ve broken something that was already implemented. This is the feedback from the code itself

Test Driven Development

TDD explained:

1. Write a test (specification for what the code must do – TDD design method)
2. Make it fail
3. Write the code to make it work
4. Refactor (improve the code)
5. Go to step 1

Benefits of TDD:
-fast feedback from the code: as the project grows, people can move ahead faster, so projects cost less. When a system grows, the biggest problem is whether fixing a bug breaks something else
-debugging takes less time, reducing costs. With TDD the number of bugs decreases, and when bugs are fixed, they can be fixed faster (by making sure new ones are not introduced)
-tests serve as documentation of what the code actually does

TDD vs. code and fix

“To obtain good code, writing tests and code is faster than code alone” – Ron Jeffries, 2006

We tend to think that adding automated tests increases development time: test + code time > code time, i.e. 1h + 4h > 4h.
This thinking presumes that the code produced in the 4 hours of development is bug free, and will stay that way even if the system around it changes, grows etc. Usually this is not so: once debugging time x is added, is 1h + 4h > 4h + x?

Sample
1 programmer paid 10£/week needs to implement 1 feature estimated at 10 days
TDD: tests (15-25%) 2.5 + 10 = 12.5 days -> 25£
Code and fix: 10 + 5 days of debugging = 15 days -> 30£

Writing tests first focuses development, so much unneeded code is avoided and the additional 15-25% does not actually exist
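The arithmetic in this sample can be spelled out as follows. This is a sketch in Python, assuming a 5-day working week (the slide does not state it, but it is the value under which 12.5 days cost 25£) and taking the upper 25% bound for the time spent on tests; all names are illustrative.

```python
WEEKLY_RATE = 10.0       # pounds per week, from the sample
DAYS_PER_WEEK = 5        # assumption: 5 working days per week
FEATURE_DAYS = 10.0      # estimated coding time for the feature


def cost(days):
    """Cost in pounds of `days` of work at the weekly rate."""
    return days / DAYS_PER_WEEK * WEEKLY_RATE


# TDD: tests add 25% (upper bound of the 15-25% range) on top of coding.
tdd_days = FEATURE_DAYS * 1.25
# Code and fix: coding plus the 5 days of debugging assumed in the sample.
code_and_fix_days = FEATURE_DAYS + 5

print(tdd_days, cost(tdd_days))                    # 12.5 25.0
print(code_and_fix_days, cost(code_and_fix_days))  # 15.0 30.0
```

With the lower 15% bound the TDD figure would be 11.5 days, so the gap to code and fix only widens.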

Not enough time, projects late - actually too much to do

Cause: In many cases, the problem of not having enough time is actually that one person has too many things to do in a limited amount of time.
Solution: eliminate waste and cut overhead
By eliminating non-crucial work, work that does not add immediate value for the customer, the people involved in a project will have more time; they will be able to be creative, focus on quality and deliver products to the customer on time.

What is waste in software development?
Taiichi Ohno (the father of the Toyota Production System) said that anything that does not add value to a product, as perceived by the customer, is waste. Agile methodologies have emerged as a response to the chaos resulting from inappropriately used resources, which waste time and energy.

Waste activities (from Lean Software Development, by Mary and Tom Poppendieck)
partially done work
extra processes
extra features
switching tasks among workers (requiring additional learning curves)
waiting
motion and defects
Management activities
Not using the most productive tools

Partially done work
Code, documents and activities that are started but never carried through to the end only waste important resources without adding any value for the client

Extra processes
“Do you ever ask, Is all that paperwork really necessary? Paperwork consumes resources. Paperwork slows down response time. Paperwork hides quality problems. Paperwork gets lost. Paperwork degrades and becomes obsolete. Paperwork that no one cares to read adds no value.
Many software development processes require paperwork for customer sign-off, or to provide traceability, or to get approval for a change. Does your customer really find this makes the product more valuable to them? Just because paperwork is a required deliverable does not mean that it adds value. If you must produce paperwork that adds little customer value, there are three rules to remember: Keep it short. Keep it high level. Do it off line.” – Lean Software Development

Extra features
66% of the features of a system are never or rarely used. Many customers spend fortunes on features that are never used. Leaving them out is the most efficient way to cut costs.

Switching tasks
A developer moved from one project to another needs time to adjust to the new project: the learning curve. In many cases this time is longer than it takes to actually fix the bug or build the new feature, and often, because of his lack of knowledge of the project, he breaks something that already works. When moved back to the original project he will need time again to readjust to the system he was working on. This is the biggest problem we are facing. “The fastest way to complete two projects that use the same resources is to do them one at a time.” – Lean Software Development

Waiting
Using a traditional sequential development process means that teams wait on each other. The delays are propagated and amplified through the whole project. Working in parallel as much as possible is much more efficient. Sequential development combined with task switching amplifies the problems.

Value stream mapping chart (wait time vs work time)

Value stream maps often show that non-development activities are the biggest bottlenecks in a software development value stream.

In 1970 Winston Royce (inventor of the waterfall model) wrote: "[While] many additional development steps are required, none contribute as directly to the final product as analysis and coding, and all drive up the development costs."

Communication problems

Three categories:
-customer representatives
-management
-developers

Problems:
Information arrives too late and travels too much
Too little or too much information (quantity vs. quality)

The specification document needs too much time to be built, and is over-complete in most cases (if the doc is big, the customer will read it later, adding to the time lost)
It is hard to change and track changes in it
It is hard to actually see if the customer will accept it or not

Solution: replace documents with prioritized lists, details with acceptance tests
Send the list of items faster to the developers, and let them ask for more details – pull systems working in parallel

Reflective Improvement

CRUCIAL: Team looks back every iteration, and improves their process

Monday, June 19, 2006

To obtain good code, writing tests and code is faster than code alone

A few weeks ago on the TestDrivenDevelopment mailing list, Ron Jeffries, one of the XP gurus, stated that "in order to obtain good code, writing tests and code is faster than just code". To find out whether this is true, let's make a small experiment.

The mini TDD experiment

We assume that we are programmers and we need to code a function that divides two positive numbers. For this experiment we will compare the traditional and the TDD approaches.

Approach #1. Code and fix

As programmers, for a simple division we will write the following “pseudo” code:

Function Divide(No1, No2)

Return No1/No2

For this very simple method, let’s assume we needed 5 seconds to write it. Now let’s test if it works. First we try 6 and 2, expecting 3. It works. Let’s try another combination: 1 and 2, expecting 0.5. It works. Now let’s try 8 and 0. An error just occurred. This means we need to modify the program to display a message to the user that the second number cannot be 0:

Function Divide(No1,No2)

If No2 = 0 then display message “Division by 0 cannot be performed”

Else Return No1/No2

Now let’s test our function again. 6 and 2, result 3, good, 1 and 2, 0.5 as expected, 8 and 0 and a message “Division by 0 cannot be performed” occurs as expected. Now, our program works fine.

Assuming that manual testing is slow and each combination of numbers takes about 10 seconds, a testing session takes 30 seconds. The total time needed to develop the code was: 5 seconds to write the function, 30 seconds to test it and see that it has a problem with division by 0, then about another 5 seconds to correct the function and 30 seconds to test it again and make sure it works. Total: 5+30+5+30=70 seconds, a minute and 10 seconds.

Approach #2: Test Driven development

In test driven development, a piece of code is written in a series of steps, starting with an automated test written first and ending with making that test succeed by writing the code that it tests. Let’s see how it goes:

Function TestNormalDivision()

Expect 3 as a result of Divide(6,2)

The code above compares the expected value with the value returned by our (yet unwritten) code, and fails if they do not match.

One very important step now is to make sure our test really tests something and it does not work every time, no matter what the code under test does. So for this we need to make sure that when it needs to fail, it fails. So we write the following function:

Function Divide(No1, No2)

Return 0

Now we run the test, and it fails saying: expected 3 but the result was 0. So now we modify the function to pass the test.

Function Divide(No1, No2)

Return 3

Now we run the test again: 1 test succeeded. Excellent. Now let’s see if it works for 1 and 2, so we update the test:

Function TestNormalDivision()

Expect 3 as a result of Divide(6,2)

Expect 0.5 as a result of Divide(1,2)

We run the test. Failure. Oooh, we just realize the mistake we made (code always returns 3) and modify the Divide function:

Function Divide(No1, No2)

Return No1/No2

Running the test now, it meets all our expectations. But then we think: what would happen if we used 8 and 0? Let’s add a new test to the test suite (now we have two) and make sure that if there is a division by 0, the user is notified:

Function TestDivisionByZero()

Expect message “Division by 0 cannot be performed” displayed as a result of Divide(8,0)

We run the test. It fails. Now we modify our function to make it work:

Function Divide(No1,No2)

If No2 = 0 then display message “Division by 0 cannot be performed”

Else Return No1/No2

Running all our tests, we discover that they all succeed.
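For readers who prefer runnable code to pseudocode, the final state of the experiment could look like this in Python with the standard `unittest` module. This is only a sketch: the post itself is language-neutral, and the choice to "display" the message by printing it and returning `None` is an assumption made for illustration.

```python
import io
import unittest
from contextlib import redirect_stdout

MESSAGE = "Division by 0 cannot be performed"


def divide(no1, no2):
    """Divide no1 by no2; on division by zero, display a message instead."""
    if no2 == 0:
        print(MESSAGE)  # assumption: "display" means printing to stdout
        return None
    return no1 / no2


class DivideTests(unittest.TestCase):
    def test_normal_division(self):
        self.assertEqual(3, divide(6, 2))
        self.assertEqual(0.5, divide(1, 2))

    def test_division_by_zero(self):
        shown = io.StringIO()
        with redirect_stdout(shown):  # capture what the user would see
            result = divide(8, 0)
        self.assertIsNone(result)
        self.assertEqual(MESSAGE, shown.getvalue().strip())


# Run the two tests explicitly so the outcome can be inspected afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(DivideTests)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
print("tests run:", outcome.testsRun, "all passed:", outcome.wasSuccessful())
```

The two test methods mirror TestNormalDivision and TestDivisionByZero from the walkthrough, and rerunning them costs nothing beyond executing the file again.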

How much time did we need to write this code? We needed 5 seconds to write the first test, 5 seconds to make sure it fails, 1 second to run the test (testing is now done by the computer, so we assume it should be at least 10 times faster than manual testing), 5 seconds to modify the code to make the test work, 1 second to run the test, another 5 seconds to extend the test to verify the 1,2 combination, 1 second to see that the test fails, 5 seconds to modify the function and 1 second to see it working, another 5 seconds to write the second test and 2 seconds to see the first test work but the second fail, and 5 seconds to complete the code and another 2 to run the 2 tests and make sure it all works. Wow, a long way: 5+5+1+5+1+5+1+5+1+5+2+5+2 = 43 seconds.

Using both approaches, we ended up with the same code. The amount of code written for the second approach is bigger than for the first, since it includes both the code and the tests. The amount of time needed for the second approach was arguably smaller than for the first, which leads us to Ron Jeffries’s conclusion: to obtain good code, writing tests and code is faster than code alone. The main advantage is that we use computer power to do the testing rather than human power, so we are much faster. We can then run the automated tests over and over again, and it will take 2 seconds to see if they work; manually it will take 30 seconds to do the same thing.

Let’s go further with our experiment, assuming that we now need to extend the program so that it can also do additions, subtractions and multiplications.

Approach #1. Code&Fix

Since none of these operations is affected by 0 (though we test that anyway), the code written first will work, so it would take about 5 seconds to write each method, and testing each with 3 combinations of numbers would take about 30 seconds. The amount of time needed would be 5+30+5+30+5+30 = 105 seconds, or 1 minute and 45 seconds. Testing the whole program (the 3 new methods and the division method) would take us 4*30 = 120 seconds, which is 2 minutes.

Approach #2. TDD

Operations just as above will need only one test, checking 3 combinations. Let’s say it takes 10 seconds to write a test method like this:

Function TestMultiplication()

Expect 0 as a result of Multiply(6,0)

Expect 3 as a result of Multiply(3,1)

Expect -9 as a result of Multiply(3,-3)

Then we’d have to make sure it fails: 5 seconds, plus 1 second to run the test; then we’d write the code to make it work: 5 seconds, plus 1 second to make sure it works. So it takes about 10+5+1+5+1 = 22 seconds for each new function, resulting in 3*22 = 66 seconds, or 1 minute and 6 seconds, to write the new functions. Testing all the code would mean running 5 test methods (2 for division and 1 for each of the other three), which would run in 5 seconds.

Tests and code, faster than just code

Comparing the times needed to test our incredibly simple system, 2 minutes vs. 5 seconds, shows us that not only is the code written faster (70+105=175 seconds vs. 43+66=109 seconds), but making sure it works requires far less time with the TDD approach. A second big advantage is that it can be done by a computer.
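The step times quoted throughout the experiment can be added up mechanically. The sketch below (Python, with illustrative variable names) simply sums the figures as they are listed in each walkthrough, which gives 175 seconds for code and fix against 109 seconds for TDD.

```python
# Phase 1 (Divide): step times in seconds, as listed in the text.
code_fix_divide = [5, 30, 5, 30]                      # write, test, fix, retest
tdd_divide = [5, 5, 1, 5, 1, 5, 1, 5, 1, 5, 2, 5, 2]  # test/code/run cycles

# Phase 2 (three more operations): per-function times from the text.
code_fix_extra = 3 * (5 + 30)                         # write + manual test
tdd_extra = 3 * (10 + 5 + 1 + 5 + 1)                  # test, fail, run, code, run

code_fix_total = sum(code_fix_divide) + code_fix_extra
tdd_total = sum(tdd_divide) + tdd_extra

print("Code and fix:", sum(code_fix_divide), "+", code_fix_extra, "=", code_fix_total)
print("TDD:", sum(tdd_divide), "+", tdd_extra, "=", tdd_total)
```

These are the post's own rough estimates rather than measurements, so the totals only illustrate the shape of the argument.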

Using a continuous integration machine that downloads the program sources and runs the test suite, then sends us an email telling us what happened, means 5 seconds for the machine and 0 seconds on my side to test the whole system. Using the first approach, would take me 2 minutes to make sure the whole system works. I could delegate this responsibility of testing the whole system to the testing team, but the feedback times, telling me whether the system works as a whole or not, increase rapidly to days and weeks and by that time I should be doing something else.

Scalability

In the 3rd phase of our little experiment, we analyze what would happen if our system had 400 functions instead of 4. Using the first approach, a full test would need about 12000 seconds (that is, over 3 hours), while using TDD and an automated test suite it would need about 500 seconds or, better said, less than 10 minutes. This simple sample shows us how TDD scales compared with traditional coding approaches. The testing team could work in parallel to some extent but, after all, I could also set my integration machine to divide the tests and run them in parallel.
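The extrapolation to 400 functions can be reproduced with a small helper. This is a sketch in Python under the experiment's own assumptions: 3 manual checks of 10 seconds per function, and automated tests that keep the ratio seen so far (5 test methods for 4 functions) at about 1 second each. The linear scaling and the names used are assumptions for illustration.

```python
MANUAL_SECONDS_PER_FUNCTION = 3 * 10   # 3 combinations, 10 s each by hand
TESTS_PER_FUNCTION = 5 / 4             # 5 test methods covered 4 functions
AUTOMATED_SECONDS_PER_TEST = 1         # assumed run time of one automated test


def full_test_seconds(functions):
    """Return (manual, automated) time in seconds for one full test pass."""
    manual = functions * MANUAL_SECONDS_PER_FUNCTION
    automated = functions * TESTS_PER_FUNCTION * AUTOMATED_SECONDS_PER_TEST
    return manual, automated


for n in (4, 400):
    manual, automated = full_test_seconds(n)
    print(f"{n} functions: manual {manual} s, automated {automated:.0f} s")
```

For 4 functions this gives the 120 and 5 seconds from the earlier phases, and for 400 functions the 12000 and 500 seconds quoted above.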

After making this very small experiment, we showed how test driven development is, compared with just coding:

o faster to develop
o faster to test the whole system and give feedback
o scalable

Tests as documentation

Another advantage of the method described above is that, the automated tests can act as a very good documentation of the code written. In traditional approaches, just documenting things that can be very easily deduced from the automated tests, like how a function works, would increase even more the development time. After all, just reading:

Function TestDivisionByZero()

Expect message “Division by 0 cannot be performed” displayed as a result of Divide(8,0)

tells me or someone new in the project, that if you try to perform a division by 0, the system will display an error message on the screen.

Embrace change: how?

Having a system with 4, 400, 1000…100.000 methods, doesn’t really comfort me when it comes to making a change in it. If I change one tiny piece of code somewhere, could I break something in another part of the system? And if I do, how could I know fast enough, to be able to either correct it or reverse my changes?

To have the feedback from the code, telling me if and where I’ve broken some existing functionality, I would normally need to retest the whole system. For a 4-method system it would take 2 minutes, but for a more realistic system it might take hours, days or even weeks. So the courage to change decreases as the system gets bigger, thus shortening the life of the system. When a system is too rigid and can no longer adapt to changes in the market, it is bound to die.

Having a full automated regression test suite that runs very fast and can be run very often means fast feedback. Fast feedback means changes are less risky and can be made more easily and quickly, thus extending the life of the system.

Design advantages

Another advantage of writing automated tests for the code is that the code tends to be very loosely coupled, and thus better designed. Test driven development also tends to eliminate “partially completed code” and encourages writing less code: the programmer is more focused on what is really needed, which reduces both the amount of code and its complexity.

At a macro level, changes, even architectural ones, are much easier to make when TDD is combined with aggressive refactoring, which allows the programmers to continuously improve the design and update the architecture. Since changes are easy to make, evolutionary design is encouraged, and there is much less need to build a flexible architecture upfront, in line with the YAGNI principle from XP.

Saturday, May 20, 2006

Refactoring a legacy web application with Selenium

About two months ago, one of our clients decided to change the way a Java web application that we had built for them a few years ago worked. The application gathers and processes different kinds of measurements in the medical sector, building various charts and analysis reports from these measurements.

The requirements

When the application was first built, it was decided that each client (hospital etc.) would have their own instance of the application installed at their location. That meant, for instance, that when a report was run, it ran on all the data in the database.

Now the client, after some market research, decided that it was too hard to deploy and maintain a large number of installations in different locations, so they asked me if we could change the way the application worked: they would host the application themselves, with a single database, and multiple clients could log in and work. This meant that the data in the database had to be partitioned per client, so that each client could see only their own data and not the others'.

The solution was to have a master table, which I called entities, that all measurement tables would now reference through a foreign key. Although altering the database wasn't much of a problem, I had to keep the existing functionality as it was, and the domain brought some difficulties, especially because some complicated physics calculation methods were used throughout the software.

How can I change but keep existing functionality exactly the same?

Since the application's behavior needed to stay the same, I had the idea of recording some test scenarios through the web interface of the legacy version, using the great open source Selenium [1] and its Mozilla Firefox extension, Selenium IDE [2]; the same scenarios would then need to work with the new version as well.

But if I record a test that depends on the database and want it to work after I change the internals of the project, I need to have the same data in the database when the test is rerun. For this, I first created a servlet called CreateTestDatabaseServlet, whose mission was to reset all the data in a database to a known state. Based on that state I could record my tests, because I knew what should appear on the user interface. For instance, my servlet cleared all FilmType rows from the database and added Film Type A, Film Type B and Film Type C. Now, when going into the web application, I knew for sure that the combo box on a screen where I can choose a FilmType always has these 3 values.

The code of the servlet looks like this:

public void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {

    this.session = HibernateUtil.getSession();
    response.setContentType(CONTENT_TYPE);
    PrintWriter out = response.getWriter();
    out.println("<html>");
    out.println("<body><p>The servlet has received a " + request.getMethod() +
            ". This is the reply.</p>");
    // First wipe the existing data...
    out.println("<p>" + this.createTestDatabase() + "</p>");

    try {
        // ...then insert the known test data.
        out.println("<p>" + this.insertTestData() + "</p>");
    } catch (Exception ex) {
        ex.printStackTrace();
    } finally {
        // Always release the Hibernate session, even if the insert failed.
        try {
            this.session.close();
        } catch (HibernateException ex) {
            ex.printStackTrace();
        }
        this.session = null;
    }
    out.println("</body>");
    out.println("</html>");
    out.close();
}

where:

private String createTestDatabase() {
    new AECBO(this.session).deleteAll();
    new TubeRoomBO(session).deleteAll();
    new UserBO(session).deleteAll();
    new EntityBO(session).deleteAll();
    new SUMBO(session).deleteAll();

    ...

    return "";
}

and

private String insertTestData() {

    ...

    createFilmType("Film Type A");
    createFilmType("Film Type B");
    createFilmType("Film Type C");

    ...

    createProjection("Left Crano-Caudal");
    createProjection("Right Crano-Caudal");

    ...

    createUnit("FFD","centimeters (cm)",1);
    createUnit("TRC","centimeters (cm)",1);
    createUnit("FSD","meters (m)",1);
    createUnit("BT","centimeters (cm)",1);
    createUnit("CIO","micro-Grays (µGy)",1);

    ...
}
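The idea behind the servlet, deleting everything and then re-inserting the same fixed rows, can be shown in isolation with a small self-contained sketch. The KnownStateReset class below is purely illustrative: it uses an in-memory list as a stand-in for the FilmType table, whereas the real servlet goes through Hibernate and the business objects shown above.

```java
import java.util.ArrayList;
import java.util.List;

public class KnownStateReset {
    // In-memory stand-in for the FilmType table (assumption for this sketch).
    private final List<String> filmTypes = new ArrayList<String>();

    /** Deletes all rows, then inserts the fixed rows the recorded tests rely on. */
    public void reset() {
        filmTypes.clear();
        filmTypes.add("Film Type A");
        filmTypes.add("Film Type B");
        filmTypes.add("Film Type C");
    }

    /** Simulates data created while a test scenario runs. */
    public void add(String filmType) {
        filmTypes.add(filmType);
    }

    /** Returns a copy of the current rows. */
    public List<String> filmTypes() {
        return new ArrayList<String>(filmTypes);
    }

    public static void main(String[] args) {
        KnownStateReset db = new KnownStateReset();
        db.reset();
        db.add("Left over from a previous test run");
        db.reset(); // back to the known state, no matter what happened before
        System.out.println(db.filmTypes());
    }
}
```

Because the reset always deletes before inserting, it can be called any number of times and always ends in the same state, which is what makes the recorded tests repeatable.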

Now when a Selenium test is run, the first thing it does is reset the database to the known state, then log in (I already know I have a user created by the servlet, with a/a as credentials) and do what is necessary:

|open|/app/CreateTestDatabaseServlet|
|open|/app/login/login.faces?random=734574375|
|type|loginForm:userName|a|
|type|loginForm:password|a|
|clickAndWait|link=Login|
|clickAndWait|link=Image Quality|
|type|_id1:main:dateinput||

With the method to reset the database in place, I started recording tests with Selenium IDE [2] for all parts of the application, following different scenarios. I saved them and assembled them into a suite of Selenium tests, and once I had made sure all of them worked, I started dissecting the code.

Then I took one test and refactored the code and database until that Selenium test worked. Then I took the next test and worked until both it and the one before it passed. I looked for further refactorings to make, made them while ensuring nothing was broken, and moved on to the next test, until one month later all tests were passing.

Although it may not seem like much, the Selenium regression tests helped me a great deal, telling me in 7 minutes whether the existing functionality of the application (which took more than a year to develop because of some very complicated physics formulas in it) was working exactly as it did before, but with the extensions added. One of the big refactorings I did was to remove the built-in data access layer and business objects and replace everything with Hibernate [3].

So it proved that my plan of action
1. Reset the state of the database
2. Record the tests
3. Refactor and test until the recorded tests work again

worked very well and very fast.

Conclusion

Without Selenium, Selenium IDE and Hibernate, the entire operation could have taken a lot more than a month, because of the constant fear that the existing functionality would be broken and would need to be rebuilt, after more than a year had already been invested in the project.

[1] Selenium - www.openqa.org/selenium
[2] Selenium IDE - www.openqa.org
[3] Hibernate - www.hibernate.org

Saturday, April 08, 2006

Agile adaptive planning and fast delivery sample

After seeing that many people find it hard to understand how agile handles adaptive planning, which lets the customer not know from the beginning exactly what he wants, I decided to develop a mini sample to show these concepts in practice.

A small agile process practice sample

Let’s consider that we have a new client who wants a new software product to manage and track his sales and customers. He wants the system to go live in 2 months. We meet and establish what is required, then plan to meet the deadline.

After the first meetings we come up with the following list of features that he thinks he wants, which we then estimate:

Client management (3p)

Product management (3p)

Sales leads management (4p)

Sales reports (3p)

Client activity management (3p)

User management (2p)

Sales workflow (3p)

Now, together with the customer, we make the first plan, dividing the work into two iterations:

Iteration #1: 10p

Client management (3p)

Product management (3p)

Sales leads management (4p)

Iteration #2: 11p

Sales reports (3p)

Client activity management (3p)

User management (2p)

Sales workflow (3p)

Now that we have a plan to deliver the 21 points of features in 2 months, we start working on the first iteration, at the end of which we deliver features 1, 2 and 3.

We now show the customer the first 3 features implemented, but he suddenly realizes that he needs more than what was planned for the release. We start by adding what he wants to the list of features, which, after each of the new features is estimated, becomes:

Client management (3p)

Product management (3p)

Sales leads management (4p)

Sales reports (3p)

Client activity management (3p)

User management (2p)

Sales workflow (3p)

Activity calendar (3p)

Forecast reports (3p)

Document templates and document merging (3p)

Together with the client, we realize we cannot deliver everything within the 2-month term, so we decide to make one delivery after the two months and plan another delivery afterwards that will include the other features that make the system more complete. The new plan looks like:

Release #1: 2 months
Iteration #1: 10p

Client management (3p)

Product management (3p)

Sales leads management (4p)

Iteration #2: 11p

Sales reports (3p)

Client activity management (3p)

User management (2p)

Forecast reports (3p)

Release #2
Iteration #3: 9p

Sales workflow (3p)

Activity calendar (3p)

Document templates and document merging (3p)

As you can see, the customer considered it more important to have the forecast reports in the first delivery, so he moved the forecast reports into the second iteration and the sales workflow to the third iteration.

After the second iteration is over, we have seven features finished (features 1-6 plus the forecast reports) in two one-month iterations, delivering on time what the customer wanted and deploying it for use by his end users.

At the beginning of the 3rd iteration, the customer realizes that he wants a few more things, and he would like to have the next delivery, of the complete system, in another two months so that it fits his budget. He adds a few new features, and the list now becomes:

Client management (3p)

Product management (3p)

Sales leads management (4p)

Sales reports (3p)

Client activity management (3p)

User management (2p)

Sales workflow (3p)

Activity calendar (3p)

Forecast reports (3p)

Document templates and document merging (3p)

Contact communication management (3p)

Microsoft Outlook integration (2p)

The plan now becomes:

Release #1: 2 months
Iteration #1: 10p

Client management (3p)

Product management (3p)

Sales leads management (4p)

Iteration #2: 11p

Sales reports (3p)

Client activity management (3p)

User management (2p)

Forecast reports (3p)

Release #2
Iteration #3: 9p

Sales workflow (3p)

Activity calendar (3p)

Document templates and document merging (3p)

Iteration #4: 5p

Contact communication management (3p)

Microsoft Outlook integration (2p)

After we finish the third iteration and show the client the resulting product, which now has 10 of the 12 features implemented, he realizes that there is one more feature he’d like to have and that can be implemented in the last iteration: sales processes. The list of features becomes:

Client management (3p)

Product management (3p)

Sales leads management (4p)

Sales reports (3p)

Client activity management (3p)

User management (2p)

Sales workflow (3p)

Activity calendar (3p)

Forecast reports (3p)

Document templates and document merging (3p)

Contact communication management (3p)

Microsoft Outlook integration (2p)

Sales processes (5p)

And the updated plan:

Release #1: 2 months
Iteration #1: 10p

Client management (3p)

Product management (3p)

Sales leads management (4p)

Iteration #2: 11p

Sales reports (3p)

Client activity management (3p)

User management (2p)

Forecast reports (3p)

Release #2
Iteration #3: 9p

Sales workflow (3p)

Activity calendar (3p)

Document templates and document merging (3p)

Iteration #4: 10p

Contact communication management (3p)

Microsoft Outlook integration (2p)

Sales processes (5p)

At the beginning of the 4th iteration, the client says he’d like one more feature, which the team estimates at 3 points. Adding that feature would mean missing the second release target, so after weighing the options he decides to drop it.
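The decision can be pictured as a simple capacity check. The sketch below is only an illustration: the capacity of 11 points per iteration is an assumption (it is the largest amount planned for a single iteration in this sample, not a figure the plan states), and the class and method names are invented for the example.

```java
public class IterationCapacityCheck {
    /** True if a new feature still fits into an iteration without exceeding its capacity. */
    public static boolean fits(int plannedPoints, int newFeaturePoints, int capacityPoints) {
        return plannedPoints + newFeaturePoints <= capacityPoints;
    }

    public static void main(String[] args) {
        int capacity = 11; // assumption: the size of the largest iteration so far (#2)

        // Before sales processes were added, iteration #4 held 3 + 2 = 5 points.
        System.out.println("Sales processes (5p) fit: " + fits(5, 5, capacity));

        // With sales processes included, iteration #4 holds 3 + 2 + 5 = 10 points.
        System.out.println("Extra 3p feature fits: " + fits(10, 3, capacity));
    }
}
```

With these numbers the 5-point sales processes feature fits into the last iteration, while the extra 3-point feature would push it to 13 points, which matches the customer's choice to drop it.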

After the 4th iteration the final product is delivered. The sample above does not claim to show how every project should be developed, but it shows the agile process at work, iteration by iteration: planning, adapting, and incrementally delivering a product to the end customer.

The evolving process of gathering the client's requirements can be followed on a burn-down chart showing, iteration by iteration, the number of features remaining to be implemented.
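The numbers behind such a chart can be recomputed from the plans above. The following sketch measures the remaining work in points rather than in number of features (an assumption made to keep the arithmetic simple) and takes the scope at the start of each iteration, after re-planning, from the feature lists in this sample; the BurnDown class itself is hypothetical.

```java
import java.util.Arrays;

public class BurnDown {
    // Total scope in points at the start of iterations #1..#4, after each re-planning.
    public static final int[] SCOPE_AT_START = {21, 30, 35, 40};
    // Points delivered in iterations #1..#4.
    public static final int[] DELIVERED = {10, 11, 9, 10};

    /** Remaining points at the start of each iteration, plus the value after the last one. */
    public static int[] remainingPoints() {
        int[] remaining = new int[DELIVERED.length + 1];
        int done = 0;
        for (int i = 0; i < DELIVERED.length; i++) {
            remaining[i] = SCOPE_AT_START[i] - done;
            done += DELIVERED[i];
        }
        // After the final iteration the scope no longer changes.
        remaining[DELIVERED.length] = SCOPE_AT_START[DELIVERED.length - 1] - done;
        return remaining;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(remainingPoints()));
    }
}
```

The remaining work drops from 21 to 0 points over the four iterations even though the total scope grows from 21 to 40 points along the way.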

So, after this very small sample, do you think it could be useful for you?