Wednesday Mar 15, 2006

17 Years

My time at Sun Microsystems is soon to come to an end, and so I've taken this opportunity to take a retrospective look at the last 17 years with Sun, 71% of its life, 42% of mine. If you're not mentioned it's an oversight only, and if you do recognise yourself, then smile.

Peak Experiences:

Hawaii: Winning the ultimate atta-boy in 1993. I did work very hard on the phones in the UK Support Centre, and won a Services award for my efforts. (Taking and closing 28 NeWSprint/Modem/Printer/Serial Port calls in one day was my personal best). For my efforts I was given the once-in-a-lifetime experience of going to a Services conference on the island of Maui, Hawaii with my wife and we stayed in the Four Seasons resort for three days. It was a glimpse of another world that I had never even dreamed of, and was truly a peak experience.

Cooking: For a number of years the White family organised the food provision for the Sun Summer BBQ at Camberley Cricket Club. The goal was to provide a BBQ for 700 people and feed them between 5:30 and 7:30pm – 350 people per hour, more than 5 people a minute for 2 hours served with hot food and salad, catering for vegetarians and carnivores. I recruited my colleagues in the Solution Centre, complied with all food hygiene, local council and employment laws, sourced food from local butchers, and together we offered an excellent service to our colleagues in Sun, for about half the cost of an external catering company – oh and it was fun. JN and his competitive BBQing, HH forgetting to take money with her to the shops, all memorable and such things added colour to the work we were doing.

Barn Parties: I had a notion that my colleagues were dangerous when drunk, leaving a trail of hotels that would not have us back – yet the Solution Centre did like to party, so I had the idea of a heady mixture of real ale, BBQ'd food, good music and an environment that they simply could not break – empty arable farm barns. All the barn needed was water and electric, and be away from other people (like in fields, where arable farm barns usually are). We had a number of Barn Parties, all well attended, most people got outrageously drunk, we camped on site, had a hearty full English fried breakfast in the morning and as the venue was typically “rural” there was very little to clear up afterwards. And farmers would have us back.

Hmmm, on reflection a lot of this wasn't actually work related, more about the Fun@Sun activities that used to go on in Sun, and since the dot bomb have all but ceased.

Travel: This career was lived on location in:

Bahrain, (excellent duty free at the airport)
Belgium, (good beer, lovely colleagues)
California, (spending the night sleeping under the stars in Death Valley, and waking up at 4am to go to Dante's View to watch the Sun rise was a peak experience)
China, (Beijing – an incredible place to visit, some truly talented people working in the Sun office there)
Colorado, (Visually stunning airport, many lovely colleagues)
Czech Republic, (Only place I've been threatened with physical violence when asking for a taxi receipt)
Finland, (Eat more fish, there's a limited supply, when it's gone it's gone, so help yourself to some more, here's a really big pile of it, go on, you know you want to)
France, (The Paris Peripherique on a Friday at 5pm on a motorbike is an experience to savour and survive)
Greece, (Lovely and crumbly)
Hawaii, (See above)
Iceland, (Beautiful, cold, dark, expensive, sulphurous, exciting off road adventures in a 53 seat coach, expensive, did I mention how expensive it was? Oh yes I did.)
India, (Excellent food and hospitality, lovely silks on MG Road, very clever colleagues, always got sick no matter how careful I was)
Ireland, (Fresh Guinness, a beautiful thing. Eating Chinese take-away on the beach at Malahide in good company. Priceless.)
Italy, (Northern industrial towns in winter can be somewhat grim I think.)
Japan, (Winners of the “feed a foreigner something strange and watch the reaction” competition with a fish that was not actually quite dead at the time of eating).
Kazakhstan, (beautiful to look at, personal safety not assured, came closest so far to dying - in a car crash – somehow everyone missed crashing, I'm sure that physics was looking the other way. And too many guns. And beautiful beautiful women).
Massachusetts, (Would you like beef with that beef sir?)
Netherlands, (Flat and efficient. All they need to sort out is the position of the traffic lights at their intersections and all would be perfect.)
Nevada, (Only popped in for a short while, walked in one casino and out the other side – could not see the appeal...)
New York State, (Bagels, turnpikes and obfuscated roadsigns, a heady mix of brusqueness and efficiency)
Norway, (Fish. See Finland)
Qatar, (The seafront has got to be one of the most beautiful of the Persian Gulf)
Singapore, (Everything works, great tailors, beautiful people, lovely weather, efficient rapid transport system)
Spain, (Tapas. What an excellent eating strategy.)
Switzerland, (Airport, Bankers, Airport – repeat)
UAE, (Spent a month there over Christmas being the Solution Centre in the '90s. Truly enjoyed the Suk and the sunsets).
UK, (All over, customers and offices)


Steve White – himself
Best manager in career – SU
Man with best hospitality in the world - RG
Best impression of a Dutchman - BS
Clever people in office – CG, TU, CK, MH
Men in canteen – MH, SS
Open All Hours – JF, PH
Man who most often saved me from redundancy – IW
Man who gave me the best breaks – JR, DP, IC
Man who showed me the real meaning of company car cleanliness when he suggested I clean my wheeltrims with a toothbrush like he did - WS
Lounge Lizard – LH
Man comatose in hotel room – DG
Driver of funniest road traffic accident – MT (Now before you reach for your “Dear BBC” notepaper there is no such thing as a funny road accident, except this one. I cannot tell the tale publicly, those who were in the UK Solution Centre at the time will remember it as just a classic of its genre. MT will probably sue me over even mentioning it. Or he might drive into a pond, you really cannot tell).
Driver of second funniest road traffic accident – DG (Again only funny because no-one was hurt, and getting an Astra 17ft up a lamppost on flat ground really is most impressive. As I drove into Watchmoor Park that fine dry morning and saw the litany of vehicular carnage only one thing crossed my mind... “That'll be one of ours”. And it was).
Reformed driver: TU
The Doctor – RH
Colleague who should have written a personal image improvement guide – Anita Selfe
Management who sound like Ricky Gervais in “The Office” – all of them.

A new colleague who had interviewed well but turned out not to be very good was telling NT and me about his hobbies during an idle “getting to know you” office chat. He said that he and his wife liked Jazz Magazines, and NT pricked up his ears (being a fan of jazz music) and said “I'm a great fan of jazz. What kind of jazz?”. “Pornographic” was the reply. Oh false floor open up and take me away. Until that moment I had no idea that Jazz Mag meant anything other than music. He did leave soon after.

I am going to have to name his name for this recollection. All was quiet late in the evening and a few of us were heads down when a colleague on the other side of the partition answered the phone. It would be like the 40th time that day that he'd answered the phone and was not quite as clear as maybe the first occasion. His real name was Jon Peacock, and from that day we knew him as if his first name started with a D and his second name with a K.

There were a host of other outtakes – I just can't recall them now.

Directed by: IC, JE, DG, DP, JR, SU
Produced by – Scott McNealy

This has been a Steve White production for Sun Microsystems.

The sequel will shortly begin with Kepner-Tregoe. swhite at kepner-tregoe dot com.

sdw0: transmission stopped.

Wednesday Feb 22, 2006

Using SGRT as a Customer Relationship offering.

A short while ago I was involved in the most interesting SGRT facilitation of my career so far – interesting because Sun had been invited to provide a Rational Troubleshooting facilitation even though our equipment was not involved in the root cause of the customer problems.  The account manager in Sun is a huge supporter of this process, and offered our services to the customer to help them manage a very gnarly problem. The customer (to whom I'd presented SGRT a couple of years earlier) was interested and I volunteered.

We did have Sun equipment in the customer site, and it was having problems, and it had been identified that the Sun equipment was a victim of a much more subtle problem to do with the links between two computer sites. The customer was hugely advanced in their understanding of the problem, and the answer to “Where on the object” was really clear. They had excellent "what" data, and acknowledged that there was no chance of getting the lifecycle information as the problem occurred 100 times during terabytes of data transfer over days of full production usage.

Having all the suppliers in one room, providing their view of the problem was enlightening to all – some suppliers had a view of the application, some of the underlying network infrastructure and others had the physical rack and cable view.

There was no resolution to the problem identified during the facilitation – many actions to take and more importantly many actions that no longer needed taking as those actions were not essential to the resolution of the problem. The possible causes appeared to be centered on one supplier's hardware not doing quite what it should, so the lens of attention was focused tightly on that equipment.

Oh, and the very nice man from IBM clearly recognised what troubleshooting method I was using and came up with some marvelously incisive questions to forward our understanding of the symptoms still further. It felt really good to have a peer supplier in the room recognise the troubleshooting process we were using and actively engage in it.

For me the key learning points were:

  • the reinforcement that getting the right people in the room is not enough – following a structured analytic technique saved us all time, made the problem very clear and took the audience with the technical staff so that everyone understood by the end of the day what the issues were. Even I understood them.
  • that the capability of SGRT can be used as a Customer Relationship offering to assist our customers with the management of problems (Incidents in ITIL language).

For the End-to-End implementation of KT-Resolve / SGRT / [whatever the process is known by in the client company] throughout the computer industry to fully succeed we need customers to call for the use of a rational approach. It should no longer be a matter of serial trial fixes tacking toward a lucky break - big companies concentrate on what their customers demand. Sun's customers should demand a rational approach to problem management (and some already do), and Sun now has the capability worldwide to handle problem / incident management in a rational manner.

Tuesday Feb 14, 2006

Rational troubleshooting takes too long.

It is a well established fact that rational troubleshooting takes too long. It's always been the case that when a Support Engineer in Sun is shown that there is a rational way to approach the data gathering and processing of incoming symptomatic information, there's a mantra on first impression: "this takes too long", "I'm paid to fix problems, not understand them", "There's not time to handle things this way".

If we treat this observation as a "performance problem", as in "Understanding the symptoms of a problem takes too long" we can begin to use KT-Resolve / SGRT thinking to pick this apart.

Is the Should good, is the Actual factual? How long should it take to initially handle a customer case during the first contact with the customer? Many companies have a tiered approach to customer handling, and plucked out of the air (I think on the tide of "Live Call Transfer" installations) was the figure of 15 minutes per customer case for the initial examination of the concern. If that's an average, all works well, and we also know that where there's an average there is a distribution curve. Many many incoming cases take way less than 15 minutes to process - "I need a patch", "When I do this, this happens - and this is what you need to do" and so on. Obviously on average there are cases that take about 15 minutes to understand, and when acknowledging the existence of a distribution curve there are inevitably cases that take longer to specify.

So if we're in agreement that some cases will fall to the right of the average on the distribution curve, how do you know which ones to concentrate on?

I believe we have found the answer - it's cases which are a "problem" (using the Kepner-Tregoe definition of a problem, not the ITIL definition, they call it an incident) to the techie who is handling the incoming case.

When handling incoming customer cases, despite the acknowledgement of a distribution curve you have only 15 minutes with the customer to get a clear understanding of the problem. The question is whether you attempt to fix the problem in 15 minutes (without understanding it fully, which often results in "shotgun" fixes) or you spend 15 minutes characterising the problem for someone else to solve.

What can reasonably be achieved in 15 minutes?

Situation Appraisal when done quickly is hugely revealing about the root concern - and action can then be considered to mitigate the effect or the cause. Situation Appraisal is applicable to every incoming case. If the result of the SA is that the customer is facing a problem, then it makes sense to State and Specify the problem.

It is not necessary to collect a full specification of a problem at this stage as the information is likely to be used to route the problem to someone who knows more about the content, and also to guide their thinking and approach. In Sun we have discovered that it's necessary to concentrate on a subset of the whole specification. In this way, for most of the time (again a distribution curve comes into play) the fundamental essence of the problem can be collected in under 15 minutes.

We run a "15 minute spec" exercise while delivering the SGRT class to end users when the audience includes frontline "first customer contact" staff. It's (of course) only an exercise, and we've found that we can extract the essence of the problem from the problem owner (who, for the sake of the exercise has all the answers - not so in real life) in around 8 to 10 minutes.

This first iteration of the specification is good for routing (we'd call it a top level spec, which has general answers) and guiding the thinking of the content expert, and is not detailed enough for continuing with if the problem is not solved quickly by the content expert. It takes a content expert to be able to frame the questions in the technology at hand to produce a "second level" specification on which the remaining processes in Problem Analysis can be operated.

When rational troubleshooting is used to the degree necessary, using targeted and clear questioning, it can reduce the time to close problems by a huge amount. When used well, rational troubleshooting takes no time at all to use, because it's saving time that might be otherwise spent on irrational troubleshooting.

Friday Feb 03, 2006

Further process loops

Kepner-Tregoe have a model of human behaviour called the "Performance System" model, and we use this concept extensively in the management of the installation of the SGR Troubleshooting process (Sun branding for the Kepner-Tregoe Resolve process, KT-Resolve) in Sun. The project office is not empowered to provide consequences, and we make it our job to get management to understand that; we do provide, and get involved in the infrastructure that drives the feedback loops.

If we maintain that feedback is provided to an individual to improve their performance for the next time they see the same situation, the feedback - to be effective - needs to be timely, accurate and targeted. We have a rule in the project office that feedback provided more than 7 elapsed days after the event is not worth either the coach completing or the engineer receiving. This drives tight loops. Most of the feedback loops are of less than 48 elapsed hours, and it's that long because we have a global organisation and any report that runs once in a 24 hour period arrives in someone's timezone while they are not at work.
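
As a rough illustration of the 7-day rule, here is a minimal Python sketch. It assumes the age of the feedback is measured from the timestamp of the event to the moment the feedback would be delivered; the function name and the example dates are made up for illustration and are not taken from any Sun tool.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "7 elapsed days" is measured from the event to the delivery time.
MAX_FEEDBACK_AGE = timedelta(days=7)


def feedback_is_worthwhile(event_time: datetime, now: datetime) -> bool:
    """Return True while feedback on the event is still inside the 7-day window."""
    return now - event_time <= MAX_FEEDBACK_AGE


# Hypothetical event; timestamps are kept in UTC because the organisation is global.
event = datetime(2006, 2, 1, 9, 0, tzinfo=timezone.utc)
print(feedback_is_worthwhile(event, event + timedelta(hours=48)))  # typical loop
print(feedback_is_worthwhile(event, event + timedelta(days=8)))    # too late
```

Keeping every timestamp in one timezone avoids having to reason about whose working day the report landed in when checking the window.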

The primary feedback loops we run are as described in a blog entry below. We have built a number of secondary feedback loops to begin to measure and reinforce good behaviour.

Daily Coaching Loop

Every day the coaches that are assigned a group of mentees (who are often the colleagues that they have trained) receive an invitation (and key) to assess the intent and quality of the work that is passing from their group to the next group. This could be thought of as a daily survey of the quality of work that is passing between engineers. We are assessing the quality of the documentation.

Over time we can see whether engineers are improving in the quality of their documentation or not, and can take action to provide additional coaching or support for engineers who are not reaching the required standard of internal documentation quality.

Reputation Feedback

Given that we now have "End-To-End" installation of SGRT almost everywhere in the Customer Facing organisations and in the backline support organisations, we can begin to get engineers to measure engineers by reputation. A loop recently installed (and being used as a pilot for the Betty Support Model) is asking for process usage by reputation.

Part of the main coaching loop has process coaches assessing the intent behind an escalation. The trigger for the reputation feedback loop is the closure of a case that had "Cause Unknown" set as its intent by the process coach. This tells us that the subject of the escalation was a "Problem" (using the classic definition provided by Problem Analysis thinking) to the Escalation Generator. Given that it was a "Problem", it should have been specified using PA thinking and the process of Problem Analysis continued by the Handling engineer. On escalation closure, both the Escalation Generator and the Escalation Handler are offered a survey of how the other engineer did.
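
The trigger described above could be sketched along these lines in Python. The record layout and field names (`coach_intent`, `closed` and so on) are assumptions made for the example, not the schema of the real case management system.

```python
from dataclasses import dataclass


@dataclass
class Escalation:
    # Hypothetical fields, chosen only for this illustration.
    case_id: str
    generator: str     # engineer who raised the escalation
    handler: str       # engineer who worked the escalation
    coach_intent: str  # intent recorded by the process coach
    closed: bool


def surveys_to_offer(esc: Escalation) -> list[tuple[str, str]]:
    """Return (respondent, subject) pairs for the reputation survey.

    The loop fires only when a closed escalation was marked "Cause Unknown"
    by the coach; each engineer is then asked how the other one did.
    """
    if not (esc.closed and esc.coach_intent == "Cause Unknown"):
        return []
    return [(esc.generator, esc.handler), (esc.handler, esc.generator)]


esc = Escalation("C-1", "generator_a", "handler_b", "Cause Unknown", closed=True)
for respondent, subject in surveys_to_offer(esc):
    print(f"Offer {respondent} a survey about {subject}")
```

Escalations with any other intent, or ones that are still open, produce no surveys at all in this sketch.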

The form for the Generator and Handler, and the key for completion for the Generator and Handler.

This has been an extremely useful loop in an unexpected way - apart from providing an opportunity to build up a picture of coachable opportunities, the comments field is exposing further opportunities to work even more effectively in Sun.

Process Escape Loops

From time to time things go wrong, and to handle those situations where effective call handling goes astray we have set up a process escape loop. This loop can operate in the forward direction and the backward direction, and always involves a process expert to assess the situation and provide coaching where necessary.

Why are we doing this?

Simply put, because it's more effective to do so. Sun is striving toward providing a better quality customer experience by standardising on the troubleshooting method we use throughout the support organisation. It's less expensive to have engineers all use the same troubleshooting process than it is to have them inventing new processes every time. It results in more consistent (think reduction in variation in terms of manufacturing or Sigma measurement) support by reducing the standard deviation on elapsed time metrics, and reduces average elapsed time metrics.

The opportunities that this offers Sun and its customers are many, and include the possibility of reaching out to our customers who use, or are interested in using KT-Resolve. Imagine a time when customers, empowered with the same troubleshooting method as Sun, perform a clear Situation Appraisal, identify the Object with the problem and the Defect that it is seeing and have spent a few minutes gathering accurate data surrounding a problem. When they pass that info to Sun, it can be immediately routed to the most likely person to solve the problem, and if that person can't solve the problem they can continue the same troubleshooting process. This has to be cheaper for our customers, and provide a better level of service to their business.

Thursday Dec 29, 2005

Rational Troubleshooting and Betty

The Adaptive Model of technical support about which much is written and details can be found elsewhere is a technique for structuring a support organisation in a way that removes the artificial barriers often imposed unthinkingly. For instance, when there is a "frontline" organisation which is local to the customer, and a "backline" organisation that is distributed, one frontline engineer has no access to excellent technical skills that might be in a peer frontline organisation. The tacit belief is that superior technical knowledge is "upwards", and to access that technical knowledge is an "escalation", the very choice of that word bringing to mind an upward movement.

The Adaptive Model is a leveller, and will allow anyone in the support organisation to be involved in any customer issue, irrespective of their geographical or hierarchical location. Suddenly a much greater number of alternatives are presented as potential candidates to answer the question "who is the best person for the job?" as it might be a "frontline" engineer in another country or region who has just the expertise that is needed to solve the issue.

This levelling brings with it two hard problems to solve. One is how to offer the issue to a wider set of people than before, and secondly how to communicate clearly the needs of the customer within a dynamically assembled team quickly and effectively. Clever routing tools answer the first issue. I believe that rational troubleshooting processes satisfy the needs of the second, and we have done research in Sun to identify the time efficiencies possible when people in a support organisation are good at, and genuinely use the same thinking methods.

This is not template completion. Template completion is easy and practically useless in terms of time saving.

The results from this research were derived from an experiment using people who use the "SGRT thinking way" - an internal-to-Sun label which expands to practical and consistent use of the right tool for the situation, used to the degree necessary to get the result needed. It's the mindset of using Situation Appraisal when first speaking to the customer, and if the SA concludes with an inkling that this is a problem, seamlessly and elegantly transitioning into Problem Analysis questioning, using the process so well that only a customer also trained in rational troubleshooting would recognise that they are being guided to answer specification issues, and with product content knowledge that allows a deeper exploration of the issue than "Machine Crashed". It's then offered to other technically capable engineers who also approach the collaboration with the same thinking processes, continue SA, continue PA and test their possible causes against the specification information already gathered, before concluding with Think Beyond The Fix.

Working in a support organisation is a bit like fishing in a river with a bucket. When you have space for more work you dip your bucket in the never-ending stream, and there are two options - you either fix the customer issue yourself, or you gather information to advocate the customer situation to the dynamic team you are going to build around you. Rivers can be flat, and broad, and have deltas, the concept of always going up is gone, to be replaced with flow.

In Sun, when we experimented with some technically experienced volunteers who were SGRT trained and supported in a coaching group, we found that the customers benefited in elapsed time savings hugely when rational troubleshooting was applied.

In the Adaptive model, good use of rational troubleshooting will be key to solving customer problems in a timely manner, and the results speak for themselves.

Monday Feb 28, 2005

Feedback loops

Individual Program Leaders for Sun Global Resolution Troubleshooting occasionally tie up with Program Leaders in other companies. Recently two colleagues of mine were invited to the offices of Cisco in San Francisco to talk about the challenges of instituting this in their company. Sadly I wasn't there (I had planned to attend and something else got in the way), and I heard from my colleagues that the Cisco Program Leaders were particularly interested in the process improvement feedback loop we're using in Sun. One day I hope to meet you; until then, this is a drawing of the basic operation of one of the feedback loops we use.

  1. In the call flow there are people who are generating work for other people. In this model I'm calling the people who are generating work “Escalation Generators” and the handlers of that work “Escalation Handlers”. Bear in mind that escalation handlers can also be escalation generators if they then pass work to others. Every day we get a dump out of the case management system of all the transfers of work between one group of people and another. A Process Coach (either a Program Leader or a Process Facilitator) is associated with a group of engineers. This can be the staff the Program Leader has themselves trained – it provides continuity following the training course.

  2. A batch program associates all the work from the generators with their respective process coach, and creates a personalised html form for the coach, making it easy for the coach to visit the work of their coaching group.

  3. The coach visits the html form, takes a look at the quality of the documentation and finds coachable moments, both “well done”, or “could do better”.

  4. There are two stages to the coaching – the first stage is getting the end users to recognise when they should use a part of KT's process, we call this the “triggers for use” and once we see individuals using a process to document the work we are looking for “Good use of Process”. Program Leaders should be able to recognise Good Use of Process when they see it – if not KT have a definition you can recycle.

  5. Feedback or consequences are provided to the individuals one to one, either by email, a phone call or in person.

    Typical emails that we send are:

Poor quality from the escalation generator

It is perfectly reasonable to ask the escalation generator for a problem statement and specification on this escalation if it would assist you in the resolution of the problem.

The escalating engineer has been trained, and should provide you with a specification.

Ensure you have cycled this escalation through "Received Incomplete" at some point in its life to alert Management to the lower than expected quality of the documentation.

A specification from the escalation generator when it was not necessary

Thank you for providing the problem statement and specification on the above escalation.

The provision of a clear description of the problem in a standard format is very helpful in the speedy understanding of a customer problem, and overall will result in a shorter time to resolution, more chance of a first time fix and more satisfied customers.

Please note that there are certain escalation types that do not mandate an SGR specification. For all cases where you know the cause of the problem;

  • Reproducible Test Case (do this and this happens)

  • Known Bug

  • Request for Backport

  • Technical Question

you do not need to also provide a specification. It's fine to do so, but it is not necessary.

For further details see .... (url for further details)

A good specification from the escalation generator

Thank you for providing the problem statement and specification on the above escalation.

The provision of a clear description of the problem in a standard format is very helpful in the speedy understanding of a customer problem, and overall will result in a shorter time to resolution, more chance of a first time fix and more satisfied customers.

A specification that had coachable moments.

Hand crafted by process experts every time.

  1. Once the reports are completed by the coaches they are archived (in a mail archive as it happens).

  2. We can then do data mining and compare the performance of cases where good quality documentation was provided with those where the documentation was not of such good quality.
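
The batch step at the start of the loop (taking the daily dump of work transfers, associating each one with the generator's process coach and producing a personalised html form) might look something like the Python sketch below. It is only an illustration under stated assumptions: the tuple layout of the dump, the `COACH_FOR` mapping and the form markup are all hypothetical, and the real batch program is not shown in this entry.

```python
from collections import defaultdict
from html import escape

# Hypothetical mapping of engineer -> process coach.
COACH_FOR = {"alice": "coach_1", "bob": "coach_1", "carol": "coach_2"}


def group_by_coach(transfers, coach_for):
    """Associate each transfer with the process coach of its generator.

    `transfers` is assumed to be an iterable of (case_id, generator, handler).
    """
    by_coach = defaultdict(list)
    for case_id, generator, handler in transfers:
        coach = coach_for.get(generator)
        if coach is not None:  # generators with no assigned coach are skipped
            by_coach[coach].append((case_id, generator, handler))
    return dict(by_coach)


def render_form(coach, items):
    """Build a minimal personalised HTML form listing the coach's cases."""
    rows = "\n".join(
        f"<li>{escape(case_id)}: {escape(gen)} to {escape(hand)} "
        f'<select name="{escape(case_id)}">'
        "<option>well done</option><option>could do better</option>"
        "</select></li>"
        for case_id, gen, hand in items
    )
    return (
        f"<form><h1>Coaching review for {escape(coach)}</h1>\n"
        f"<ul>\n{rows}\n</ul>\n</form>"
    )


# Hypothetical one-day dump from the case management system.
dump = [("C1", "alice", "carol"), ("C2", "carol", "bob"), ("C3", "dave", "alice")]
grouped = group_by_coach(dump, COACH_FOR)
for coach, items in grouped.items():
    print(render_form(coach, items))
```

Grouping by the generator's coach, rather than the handler's, matches the description above of coaches reviewing the work that passes from their own group to the next.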

Tuesday Feb 22, 2005

The hamburger of rational process

One of the many enjoyable things about the troubleshooting method job I have with Sun at the moment is talking to our customers about the use of a rational process in handling their issues.

The challenge is to get the decision makers in companies to understand that it's the installation of a capability, not just a training course. With a training course you go, learn the new stuff and use it straight away, and for technical training on a product you've been assigned to support you have “no choice”, the reinforcement of the training is built into the job.

With a thought process an attendee has two choices, either to stay the same or use the new process, and if rational process installation is considered to be training the attendees stay the same. A long time ago I drew this hamburger to illustrate to management in Sun that while the training may be considered the meat in this offering, without the lettuce, tomato, mushroom and bun it's all a pile of greasy sausage.



