Sometimes I’m amazed. I came to work this morning, and someone came up to me in a slight panic: "A system that is central to our customer operations is having an outage. What do we do?"
Apparently, all they had done to isolate the problem was check again and again whether the system was working. Basically, they just kept refreshing the browser.
What they hadn’t done was to try to troubleshoot it.
How do you troubleshoot?
You isolate all other components that can fail and follow the error as close to its source as possible.
I came in and isolated one piece of commodity technology between them and the service they were trying to use. And voilà, it worked.
The piece I took out was the handset. Once plugged back in, the handset worked again.
One of the most important aspects of solving any problem is to isolate it. If you don’t, every attempt at solving the problem is just guessing or brute-force hammering on it. (Brute-force attempts have a high risk of breaking stuff and less of a chance of solving anything. Sometimes, however, it just feels good.)
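For the programmers among you, the same idea can be sketched in a few lines of Python. This is only an illustration, not what I actually ran that morning: the component names, the `chain_works` check and the simulated fault are all made up. The point is the method: take one piece out at a time, test again, and note which removal makes the problem go away.

```python
from typing import Callable, List, Sequence


def find_faulty_components(
    components: Sequence[str],
    chain_works: Callable[[Sequence[str]], bool],
) -> List[str]:
    """Return every component whose removal alone makes the chain work."""
    if chain_works(components):
        return []  # the full chain is healthy, nothing to isolate

    suspects = []
    for candidate in components:
        # Bypass exactly one component and test the rest of the chain.
        remaining = [c for c in components if c != candidate]
        if chain_works(remaining):
            suspects.append(candidate)
    return suspects


# Hypothetical chain between the user and the service.
chain = ["handset", "desk switch", "network", "service"]


def simulated_check(active: Sequence[str]) -> bool:
    # Assumption for the demo: the chain fails whenever the handset is in it.
    return "handset" not in active


print(find_faulty_components(chain, simulated_check))
```

In real life `chain_works` is you, trying the service again after unplugging something. Refreshing the browser only ever tests the full chain, which is why it tells you nothing about where the fault is.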
#1 by Simon on March 23, 2010 - 09:24
Fun! It seems you need to reinforce the team with more people who think like engineers. Nudge nudge.
#2 by Anna Forss on March 23, 2010 - 19:56
I also think this is an important trait, but isolation is not always easy when you’re in a panic.
I’ve also found that when errors, defects and bugs are common, people tend to become worse at cornering errors. One reason, of course, is that you never know how many errors are involved in an issue. But lots of defects also build a culture where you take errors for granted.
If you instead build a culture with zero acceptance of defects, where the development teams do not release software with a lot of defects, users don’t necessarily think “software bug” when they face a problem. Why did he think that the problem was a bug? Is he used to finding defects in the product?