How to Build a Software Test Strategy on Ongoing Projects
The author would like to thank Jeff MacBane and Nicolas Shaw for their peer review.
In the rare cases where people talk about test strategy, it is usually in a bubble - “this is the way to do things.” The context-driven testing community does a bit better, designing strategies for a given context. Even then, the conversations are rarely about change management - switching from “whatever we are doing now” to “something new.” Last month PractiTest invited me to give a talk on the topic, but a video can only skim over the concepts, so we decided to write the ideas up formally as a blog post. To do that, we’ll start by explaining a way to look at improvement, then provide you with the tools to do your own analysis, and finally get to the difficult process of making it happen.
The Challenge of Ongoing Improvement
If everything goes right on a consulting assignment, we start with one single goal. The company wants to get to market faster, improve quality pre-release, improve the quality of the feedback from testing, test in less time, reduce the time from code commit to production, and so on. A consultant can then conduct a “gap analysis”: look at the system as it is, look at what it is hoped to be, and then describe what the gap is. The improvement ideas are designed to close that gap.
Of course, most of the time what happens is that the customer explains they want all of the things, or at least several. If they wanted just one thing, they wouldn’t need consultants! This implies that we’ll have to make tradeoffs of value, to decide which good thing is better than something else, or what we can live without because something else has more value.
Once the gap analysis is done, we come up with improvement ideas. Now, remember, the company just spent a fair amount on consultants. The team is under pressure to deliver; that is why the consultants were brought in. Management doesn’t want to be told to purchase tools, spend money on training, create new positions, or “go slower to go fast.” Instead, it wants quick wins, which is what we will focus on today - things that can be done over a few lunch hours and that can impact outcomes in obvious ways. “30% higher coverage”, while impressive, is also a bit intangible, easy to misunderstand and misconstrue. The outcomes we’ll go for today will be more like “less contact from angry customers”, to the point that management actually notices incidents are occurring less frequently: routine incident meetings are scheduled less often, the on-call person realizes they took no calls this shift, and so on.
To do that, we’ll start with the humble test case.
Deconstructing the Humble Test Case
I will make an aside here: I have never really understood the appeal of the test case. For me, figuring out what to test is the fun part. As each build is different, putting resources into testing the things that are different for each build allows you to test closer to where the risks are today, while increasing coverage over time. It was my friend John McConda, the quality lead at Moser Consulting, who told me years ago that people don’t want to have to think about it. If what to test is written down in a document, stored in a system online, automated, or “artificial-intelligence-ed” away, they can relax and stop worrying. Then, if the tester or lead or the people who know how to test all leave, they can be easily replaced with someone new.
Except this doesn’t work.
In The Social Life of Information, Brown and Duguid describe photocopy repair technicians from an earlier generation, who had access to comprehensive repair documentation. Management was happy with this, as the documents represented value that could be borrowed against at a bank or valued during a sale.
Yet when Julian Orr studied what the technicians actually did, he found that most of that material was never read at all. Instead, the technicians worked in pairs and taught each other. This leads to the default presumption that test cases are mostly waste. Before we “go there”, though, I think it is fair to probe a bit more and ask what problems test cases solve, and for whom.
Doing that work, I find that different people want different things out of test cases. While there is, of course, variation, there are also some broad generalities. Project managers tend to want predictability. To them, testing is a straightforward business process that has to be run to get to production. To a project manager, everything is a checklist, and testing must be checked off. Testers want to know what to do, when they are done, and how to do complex things. Where test cases are “thrown out”, we often find a bit of a “hiccup” later as the team has forgotten how to do something.
Typically those things are around configuration or setting up particularly complex accounts, users, or transactions. Programmers can see testing as a concrete example of what the software should do, often in code, that can stand up over time. Product owners are similar, seeing the higher-level tests as “executable specifications” or “automated examples.” Line managers are similar to project managers, in that they want to view the delivery process as a machine where they can “turn the crank.” To do this, they tend to look at metrics (like test cases run or passing). The goal of the line managers, unlike the project managers, is to figure out “what is going on” - they want transparency.
You can do this analysis with your own test process: figure out what the driving element is, and then what people get out of it. The driving element is the thing people look to when they think of testing. It might not be a test case; it might include unit tests, acceptance tests, a user acceptance process, and a beta process. If the goal is going faster, we’ll have to take something away - even if we just remove multi-tasking, that removes some options from people. So if we take something away from someone, we need to replace it with something.
Now we discuss how to improve it.
Moving forward under pressure
If people find an asset valuable, we don’t recommend throwing it away until you can replace it with something. There are, however, patterns you can use to replace things that can be done in your discretionary time, and that can add value almost immediately. Here are a few of them.
A realized defect analysis: It takes about a lunch period to sit down with a bug tracker and review the last hundred bugs. After a visual review, fire up a spreadsheet and invent broad categories for the defects. These could be “browser compatibility”, where the code works in the current version of Chrome but not Edge; “screen resolution problem”, where the code fails only on tablets turned to landscape mode; “failure to handle changing bandwidth”; and so on. Do this in a new tab. Then go back and find the category for each bug. The total so far should be an hour or two.
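If your bug tracker can export to a CSV file, a short script can do the tallying for you. Here is a minimal sketch in Python; the file name and the “category” column are assumptions for illustration, so adjust them to match your own export and the categories you invented.

import csv
from collections import Counter

def tally_categories(path="last_100_bugs.csv"):
    """Count how often each defect category appears in the export."""
    counts = Counter()
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # "category" is the broad bucket you assigned during the review,
            # e.g. "browser compatibility" or "screen resolution problem".
            counts[row["category"].strip().lower()] += 1
    return counts

if __name__ == "__main__":
    for category, count in tally_categories().most_common():
        print(f"{category}: {count}")

Run it once, paste the output into the spreadsheet, and you have the same picture with a little less typing.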
Note: Some agile teams don’t have or need these records; they fix bugs as soon as they find them. Many teams don’t write up the bugs that are found in the initial-feature-test step, only in final regression testing. In those cases, you might have to take two weeks to temporarily document bugs and where they are found, as a data-gathering experiment.
When the exercise is complete, you’ll be able to see what categories of bugs pop up again and again that are found late in the cycle and think about methods to catch them earlier or, preferably, prevent them from happening. You’ll also see what categories of bugs fall through the cracks to be found by customers. This is the “gap.”
Match categories of defects to test approaches: Once you have the defects sorted into categories, make a new spreadsheet tab listing the approaches you use to reduce risk. These could be story tests, unit tests, API tests, end-to-end tests, and so on. Add two new columns - where the defects are found, and where they should have been found.
Aligning the bugs to where they should be found, and where they are found, can do three things. First, it can find the categories of bugs for which there is no risk management step. Second, it can find the risk management steps that never actually find any bugs, allowing you to question the value of that process and reconsider the approach. (You might keep it; you want to have the conversation.) Third, it can find redundant processes that seem to overlap. Again, the overlap might be warranted; people make mistakes. Having the data enables the conversation.
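If you would rather script this step than maintain the spreadsheet by hand, a sketch along these lines might help. The file name, column names, and approach names are assumptions; the idea is simply to count where bugs were caught versus where they should have been caught, and to flag steps that never catch anything.

import csv
from collections import defaultdict

# Placeholder list of risk-management steps; use your own.
APPROACHES = ["unit test", "API test", "story test", "end-to-end test"]

def analyze(path="categorized_bugs.csv"):
    caught = defaultdict(int)    # bugs caught by the step that should catch them
    escaped = defaultdict(int)   # (category, expected step, actual step) that slipped through
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            expected = row["should_have_been_found_in"]
            actual = row["found_in"]
            if expected == actual:
                caught[expected] += 1
            else:
                escaped[(row["category"], expected, actual)] += 1
    # Steps that never catch anything deserve a conversation about their value.
    idle_steps = [a for a in APPROACHES if caught[a] == 0]
    return caught, escaped, idle_steps

The output is just raw material for the three conversations above; the script doesn’t decide anything for you.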
Coverage visualizations: I’ve worked on many projects where the effort spanned multiple teams. Occasionally a key person leaves, or there is a reorganization, and suddenly a feature has no one responsible for it - no person to maintain the feature. If the feature is no longer under active development, this is usually viewed as an inconvenience - yet it is often possible, even likely, that no one is testing that feature at all. When this happens, “code rot” in the tooling is common, with programmers who are not responsible for the feature, and do not understand why its automated checks are running, simply turning them off. A coverage visualization, such as a mind map, is simply a list of major elements of the software that breaks down into smaller lists. If you make and publish such a list, you can have the nodes be links that appear as pages in a wiki such as Confluence.
Once we’ve built out the coverage map, we can make a copy for any test process and color sections green, red, and yellow to indicate quality, then use percentages, icons, emojis, or darkness of color to indicate the depth of coverage. Determine the depth by using a rule you create for your own team, depending on how you do the testing. We typically find a score from 1 to 10 helps, with 1 meaning the feature was clicked on once and higher scores indicating greater amounts of testing.
The description above doesn’t consider user journeys that cover more than one feature, which is another layer to think about.
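To make the idea concrete, here is one possible way to represent a coverage map in code and print it as an indented list you could paste into a wiki page. The feature names and depth scores are invented for illustration; the 1-to-10 depth rule is the one described above.

# Feature names (placeholders) map to sub-features; leaves carry a depth
# score from 1 (clicked on once) to 10 (tested deeply).
coverage_map = {
    "Checkout": {
        "Add to cart": 7,
        "Coupons": 3,
        "Payment": {
            "Credit card": 8,
            "Gift card": 1,
        },
    },
    "Search": {
        "Filters": 5,
        "Sorting": 2,
    },
}

def print_map(node, indent=0):
    """Print the map as an indented list, ready to paste into a wiki page."""
    for name, value in node.items():
        if isinstance(value, dict):
            print("  " * indent + f"- {name}")
            print_map(value, indent + 1)
        else:
            flag = "LOW" if value <= 3 else ""
            print("  " * indent + f"- {name} (depth {value}/10) {flag}".rstrip())

print_map(coverage_map)

A mind-mapping tool or a wiki page works just as well; the point is the breakdown and the depth score, not the technology.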
Recipes for complex tasks: Once we’ve built out the coverage map, we can use the end-nodes as links to a wiki, or some other system, to describe the feature. PractiTest, or software like it, might come into play here. A recipe describes the feature and provides the configuration or setup information, along with common ways to test the feature and things to think about. This creates a “card”, which we call a recipe. It might include the SQL statement to run to find categories of things to add to the cart that qualify for a given feature, or rules about how health insurance benefits are smart-coded into the product ID, and so on.
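What goes on a recipe card will vary by team, but a simple structure might look something like the sketch below. The field names and the example content are assumptions, not a prescribed format; the point is that a recipe captures the description, the setup, the common tests, and the things to think about in one place.

from dataclasses import dataclass, field

@dataclass
class Recipe:
    feature: str
    description: str
    setup: str                      # configuration, accounts, or data needed
    common_tests: list[str] = field(default_factory=list)
    things_to_think_about: list[str] = field(default_factory=list)

# Hypothetical example card for a cart-discount feature.
cart_discounts = Recipe(
    feature="Cart discounts",
    description="Applies category-level discounts at checkout.",
    setup="Run the saved query 'discount-eligible-products' to find qualifying items.",
    common_tests=[
        "Add one qualifying and one non-qualifying item; verify only one discount applies.",
        "Stack a coupon on top of a category discount.",
    ],
    things_to_think_about=["Tax rounding", "Currency conversion", "Expired promotions"],
)

Whether this lives in code, a wiki page, or a test management tool matters less than having it written down where the next tester can find it.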
An emergent risk list: Some risk management activities need to run - but only once. These are the sorts of things where we wonder, “The add-to-cart button is sort of slow; what happens if we click it twice?” This might not become something written down as a long-term, frequent test, but these ideas take time and are rarely documented. By producing some sort of list somewhere, we create transparency into what we are working on without the major time investment of more formal approaches.
More importantly, sometimes these lists get big. If the list is big enough, you can make a case for more people, more test time, placing the stories on the board as “test stories”, having the whole team invest in getting the list down, or, perhaps, skipping some test ideas. One way to do this is to assign priorities from 1 to 10, sort, and show where the cut line is for today, this sprint, and so on. When bugs emerge from below the cut line, well then, the “job of test” was done properly, wasn’t it? Maybe the team needs more time to invest in testing. This prevents low-maturity conversations about “Why didn’t QA find that bug?” and so on.
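As a rough illustration of the cut-line idea, a prioritized risk list might be sorted and split like this. The risks, the priorities, and the cut line itself are placeholders you would replace with your own.

# Each entry is (risk description, priority 1-10); all values are invented.
risks = [
    ("Double-click on the slow add-to-cart button", 8),
    ("Session expires mid-checkout", 6),
    ("Landscape mode on older tablets", 4),
    ("Unicode names in the shipping address", 3),
]

CUT_LINE = 5  # anything below this waits for the next sprint, or never runs

for description, priority in sorted(risks, key=lambda r: r[1], reverse=True):
    status = "this sprint" if priority >= CUT_LINE else "below the cut line"
    print(f"[{priority:2d}] {description} -> {status}")

The value is less in the script than in the published cut line: everyone can see what was consciously deferred.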
By maturity here, I mean something specific: knowing what you stand for, being able to justify it, and living with the consequences. The ideas above are designed to help you move toward a world that looks more like that, but the analysis, and the choices, have to be yours.
Putting it all together
Recipes provide people new to the application with the tools they need to test it; visualizations provide everyone with a sense of what to test. Emergent risk lists provide a place to put ideas worth investigating at least once. The coverage visualizations make it clear how much time the team is spending on which features, and which features are uncovered. Together, these equip management to make tough decisions about how much time to invest in testing and how best to deploy that time. Best of all, these are the sorts of ideas you can implement on ongoing projects, without asking for permission, without taking a great deal of time away from the day job, and with a visible payoff people can appreciate.
Back to our original thesis: the analysis of bugs and categories allows you to find the gaps in your testing. What it doesn’t do is deal with waste and speed. Those problems generally come from multitasking, miscommunication, poor worker selection and assignment, and a poor build-test-release process. If that’s something you’d like to talk more about, we’re happy to keep talking. For now, give some quick wins a try, and let us know how it goes!