The problem with managing to metrics...
Apr. 17th, 2008 09:53 am

...is that often the metrics aren't what you really wanted. In tech support, "shorter calls" were a big metric. Shorter calls meant more customers were served and problems were solved more efficiently, right? Right?
Some calls, yes. Perhaps even the 50% or 80% most common calls. But not the less common calls. Not the troublesome calls. Those took longer, and one half-hour call would screw up your stats for the day.
"Of course you should take the time to do the job right, just not more time than necessary," management would chirp. And yet the rewards went to those who did the job quickly - and not necessarily well. Since people repeat the behavior that gets rewards, this led to less service.
In testing, focusing on bug counts can be a similar problem¹. My first test lead didn't look at how many bugs each individual on his team reported. Instead, he read every bug his team reported. That made it very clear who was breaking bugs into tiny little pieces so as to inflate their bug count (often a habit acquired elsewhere), who was sloppy about figuring out repros, who noticed that two different symptoms had a common cause and reported them as such, who was mostly finding bugs in areas other than their own (which could be a good thing, if a conversation determined that their own area just didn't have many bugs to find), and so on.
Of course, a test lead was expected to read and know all the bugs for the areas his team covered, and my first test lead did so. Among other things, this made him aware of what was being reported by testers not assigned to the area. (Often these were bugs in areas that hadn't officially been released for testing yet. But sometimes they were a clue that the tests needed revising.)
Sometimes the focus on bug count goes beyond using it as a measure of tester productivity. The Defect Black Market relates just one example. This Dilbert shows a slightly less, er, subtle approach.
¹ The problems with using bug counts as a measure of tester productivity are discussed in depth in Testing Computer Software by Kaner et al.