Let the Service Desk Be Your “Stitch in Time” with Problem Management

At the service desk, each agent is expected to handle and resolve hundreds of incidents, or interruptions to end-user IT services, every month. As an industry best practice, agents are also expected to answer the phones quickly, be customer service oriented, and resolve a high percentage of incidents remotely whenever access and available procedural documentation permit. From a reactive standpoint, this incident management approach is designed to get end users back into operational mode as quickly as possible. But what if those remote agents could take advantage of the troubleshooting skills they hone through sheer volume and repetition and help prevent future incidents from ever happening? ITIL refers to this preventative measure as problem management.

To understand how problem management works, it’s important first to draw a distinction between an incident and a problem. Whereas an incident may be a “one-off” issue that has temporarily impacted one user, a problem is often defined as the underlying cause of recurring incidents or of incidents that impact multiple users. Using a medical analogy, incidents are the symptoms and problems are the illness itself. Problem management is the cure, and it is one of the most effective call avoidance techniques because it emphasizes resolving problems before they generate future incidents and the additional IT costs associated with them.

Since service desk agents are the first point of contact for all inbound incidents, they are also the first to detect problems such as client-server outages, glitches in newly rolled out applications and incorrectly configured VPN domains. They are deeply familiar with the technical environment they are trained to support and are constantly tuned in to the incident-related chatter among both the end user population and fellow agents. Because they regularly have their finger on the pulse of service desk support activity, they are well placed to spot potential problems. They can then relay that insight to the team lead for remediation or suggest a permanent fix. Here’s how it works:

Problem Management Tools and Processes

Most ITIL-based ticketing systems include comprehensive problem management capabilities that enable clients to track and manage identified problems as well as their overall impact on the organization. These systems support the process through the following activities:

  • Create new problem records for identified problems and document workaround steps when available. To promote “One IT” knowledge sharing with clients, enable them to create and manage problem records.
  • Assist in performing root cause analysis with clients in order to prevent future incidents.
  • Update the IVR system as well as the self-service portal using scrolling alerts that notify end users of known problems.
  • When categorizing and documenting ticket data, service desk agents associate incidents with known problems where applicable to ensure proper tracking and metrics for root cause analysis. Producing and maintaining the Known Error Database (KEDB) is one of the most important outputs of the problem management process. The incident management process uses the KEDB to resolve incidents more rapidly.
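
As a rough illustration of that last item, the Python sketch below shows one way incidents might be associated with known problems so that a documented workaround and basic impact metrics are available to the agent. It is only a sketch under stated assumptions: the class names, ticket IDs, sample data and the simple keyword-matching rule are all hypothetical and are not taken from any particular ticketing system.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProblemRecord:
    """One known problem, as it might be stored in a Known Error Database."""
    problem_id: str
    summary: str
    keywords: set                      # hypothetical matching rule: keyword overlap
    workaround: Optional[str] = None   # stays None until a workaround is documented
    incident_ids: list = field(default_factory=list)
    affected_users: set = field(default_factory=set)


class KnownErrorDatabase:
    """Minimal in-memory stand-in for a KEDB (illustrative only)."""

    def __init__(self) -> None:
        self._problems = {}

    def add_problem(self, record: ProblemRecord) -> None:
        self._problems[record.problem_id] = record

    def match(self, description: str) -> Optional[ProblemRecord]:
        """Return the first known problem whose keywords all appear in the text."""
        words = set(description.lower().split())
        for record in self._problems.values():
            if record.keywords <= words:
                return record
        return None

    def link_incident(self, incident_id: str, user: str,
                      description: str) -> Optional[ProblemRecord]:
        """Associate an incident with a known problem, if one matches."""
        record = self.match(description)
        if record is not None:
            record.incident_ids.append(incident_id)
            record.affected_users.add(user)
        return record


kedb = KnownErrorDatabase()
kedb.add_problem(ProblemRecord(
    problem_id="PRB-001",
    summary="VPN logins fail after domain change",
    keywords={"vpn", "login"},
    workaround="Connect using the backup VPN profile.",
))

# Two incidents match the known problem; the printer incident does not.
hit = kedb.link_incident("INC-1001", "alice", "VPN login fails from home office")
miss = kedb.link_incident("INC-1002", "bob", "Printer is out of toner")
kedb.link_incident("INC-1003", "carol", "Cannot complete vpn login since Monday")
```

In this sketch, `hit.workaround` is what the agent would relay to the caller, while the number of entries in `hit.incident_ids` and `hit.affected_users` is the kind of tracking data that root cause analysis could draw on.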

Once the root cause of an incident is identified, the problem management team may submit a request for change, recommend a permanent fix for the underlying cause or, if a permanent solution is not possible or is rejected by the Change Advisory Board, assist in developing a workaround to restore service and minimize the impact of associated incidents. Understandably, a formal approval process is necessary for any proposed fix, especially when it could adversely affect vulnerable systems and data security. For the same reason that infrastructure teams will not deploy untested server patches immediately to a production environment, the risks introduced by any proposed change need to be assessed and mitigated before the appropriate staff can authorize implementation.
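
The decision described above can be sketched in a few lines of Python. This is a simplified model rather than a real change management workflow: `RequestForChange`, `Resolution`, `choose_resolution` and the sample risk text are hypothetical names introduced only for illustration, and the rule that every identified risk must be marked as mitigated is an assumption drawn from the paragraph above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Resolution(Enum):
    PERMANENT_FIX = "permanent fix"
    WORKAROUND = "workaround"


@dataclass
class RequestForChange:
    description: str
    # Maps each identified risk to whether it has been mitigated.
    risks: dict = field(default_factory=dict)
    cab_approved: bool = False

    def ready_to_implement(self) -> bool:
        """True only if the CAB approved and every listed risk is mitigated."""
        return self.cab_approved and all(self.risks.values())


def choose_resolution(rfc: Optional[RequestForChange]) -> Resolution:
    """Pick the path once the root cause is known.

    Passing None models the case where no permanent solution is possible.
    """
    if rfc is not None and rfc.ready_to_implement():
        return Resolution.PERMANENT_FIX
    return Resolution.WORKAROUND


approved = RequestForChange(
    "Correct the VPN domain setting",
    risks={"brief VPN outage during the change": True},
    cab_approved=True,
)
rejected = RequestForChange("Correct the VPN domain setting", cab_approved=False)
unmitigated = RequestForChange(
    "Correct the VPN domain setting",
    risks={"brief VPN outage during the change": False},
    cab_approved=True,
)

results = [choose_resolution(r) for r in (approved, rejected, unmitigated, None)]
```

Only the first request reaches `Resolution.PERMANENT_FIX`; the rejected request, the one with an unmitigated risk, and the case with no possible fix all fall back to a workaround.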

Whatever the resolution to the root cause and no matter which IT groups are involved, an effective problem management strategy is one that encourages input from the service desk. Since agents serve as the first point of contact, empowering them to flag incidents as problem candidates and offer solutions can be the proverbial stitch in time. In other words, using problem management as part of an early warning system can help prevent both companywide and recurring incidents. Swiftly converting each problem from a known issue to a non-issue, before call volumes spike and end user productivity suffers, is an ideal way to curb IT operational costs. From a CFO’s or CIO’s perspective, staying within budget while maintaining uninterrupted technical services for employees is one less problem to solve.