Problem Management Policy
Policy Owner: Manager, IT Performance Achievement
Policy Sponsor: IT Service Management (ITSM) Steering Committee
Note: An owner must be a PCES-level manager.
This policy provides formally documented management expectations and intentions used to direct decision making and ensure consistent and appropriate development and implementation of Processes, Standards, Roles, Activities, etc.
The policy owner has the responsibility and authority to execute the defined and established Problem Management Process.
The purpose of this policy is to ensure that all problems, as identified through Problem Management, which affect the daily operations of the Postal Service Technology Environments, are managed through an established process. USPS will utilize the best practice framework for the implementation of Problem Management within Postal Service Technology.
A Problem is defined as the underlying cause of one or more Incidents. Problem Management is the process responsible for managing the lifecycle of all problems.
The goal of Problem Management is to minimize the adverse impact on the business of incidents and problems caused by underlying errors within the IT Infrastructure. Problem Management seeks to proactively prevent recurrence of incidents related to these errors.
Scope
- All IT-supported USPS locations
- All environments subject to the Problem Management policy determined by the ITSM Steering Committee
- USPS identified and owned Problems (e.g., Problems recorded and managed to closure by USPS IT personnel)
- Vendor/Partner owned Problems under USPS management (e.g., Problems recorded by a Vendor/Partner and assigned and managed to closure by USPS IT personnel)
- Vendor/Partner owned Problems under Vendor management (e.g., Problems recorded by a Vendor/Partner and assigned and managed to closure by the Vendor/Partner)
- USPS owned but Vendor/Partner supported Problems (e.g., USPS created Problems that are assigned to a Vendor/Partner)
All items not specifically listed within the Scope section are deemed “Out-of-Scope.”
The following policy is established for Problem Management:
- All USPS IT organizations must use the currently approved, documented Problem Management process and standardized methodology. Problems will be reported, recorded, managed, and appropriately communicated through the approved Problem Management tool.
- All IT Managers will be responsible for ensuring the Problem Management process is followed.
- ITSM managers or team leads will commit appropriate resources to conduct Problem Management activities, such as Root Cause Analysis (RCA), Change Management post-implementation reviews (PIR) (where applicable), validation or creation of workarounds, and proactive trend analysis reviews, as requested by the Problem Manager(s).
- Any Problem that requires a change request (CR) to aid in its resolution can only be closed after the successful implementation of the CR and validation that no further incidents are occurring as a result of the identified error.
- Upon resolution of the problem, a Knowledge Article must be submitted that identifies a known error and includes the resolution or workaround.
- This policy will complement and not supersede compliance policies such as CIRT, SOX, and TSLC.
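The closure conditions in the statements above can be read as a simple checklist. The following is a minimal sketch only, not part of the approved Problem Management tool: the `ProblemRecord` class, its field names, and the `closure_blockers` and `can_close` helpers are all hypothetical. It also assumes that the Knowledge Article requirement is checked at closure time, which the policy does not state explicitly.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProblemRecord:
    """Hypothetical problem record; all field names are illustrative."""
    problem_id: str
    change_request_required: bool = False
    change_request_implemented: bool = False
    # True once it has been validated that no further incidents
    # are occurring as a result of the identified error.
    recurrence_validated: bool = False
    knowledge_article_submitted: bool = False


def closure_blockers(record: ProblemRecord) -> List[str]:
    """Return the reasons, if any, why the record cannot be closed yet."""
    blockers = []
    if record.change_request_required:
        if not record.change_request_implemented:
            blockers.append("CR not yet successfully implemented")
        if not record.recurrence_validated:
            blockers.append("no validation that incidents have stopped")
    # Assumption: the Knowledge Article is treated as a closure precondition.
    if not record.knowledge_article_submitted:
        blockers.append("Knowledge Article not submitted")
    return blockers


def can_close(record: ProblemRecord) -> bool:
    """A record can be closed only when nothing blocks closure."""
    return not closure_blockers(record)
```

For example, a record whose CR has been implemented but not yet validated would still report two blockers (missing validation and missing Knowledge Article), so `can_close` would return `False` until both are addressed.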
Any requests for exceptions to this policy must be submitted in writing and will be reviewed on a case-by-case basis. Exceptions shall be permitted only after written approval from the IT Steering Committee.
Policy Compliance and Monitoring
A sampling of Problem Records will be reviewed on a periodic basis by the Problem Management Process Owner to assess policy compliance. This is to ensure that the procedures, guidelines, and standards set forth in the Problem Management Process are adhered to.
The Problem Management Policy will be reviewed on the following basis:
- Periodically, by the Problem Management Process Owner
- Upon an update to the Problem Management Process and/or tool
- Upon request of the IT Steering Committee
Change Request (CR): A formal request for change to any component of an IT infrastructure or to any aspect of an IT Service that is to be made to the USPS IT Production Environment.
Configuration Item (Cfg-Item): Cfg-Item is an abbreviation for "Configuration Item," which refers to any service asset or component that needs to be managed in order to deliver an IT service. Information about each configuration item is recorded in a configuration record within the configuration management system and is maintained throughout its life cycle by service asset and configuration management. Configuration items are under the control of change management. They typically include IT services, hardware, software, buildings, people, and formal documentation such as process documentation and service level agreements.
Incident: An unplanned interruption to an IT service or reduction in the quality of an IT service. Failure of a configuration item that has not yet affected service is also considered an incident.
IT Service: An IT service is made up of a combination of information technology, people, and processes. A customer-facing IT service directly supports the business processes of one or more customers and its service level targets should be defined in a service level agreement.
ITSM Steering Committee: An executive body responsible for the guidance and direction of USPS IT Service Management.
Knowledge Article: A document located in the Knowledge database containing a solution or workaround to a Known Error.
Known Error: A Known Error is a Problem that has a documented root cause and a workaround. Known Errors are created and managed throughout their lifecycle by Problem Management. Symptoms such as application or hardware errors are not considered Known Errors; the underlying cause of those errors is the Known Error. Known Errors may also be identified by Development or Suppliers.
Problem Coordinator: The Problem Coordinator initiates, approves, and oversees efforts related to problem investigations, ensuring that the activities related to problem investigations adhere to the Problem Management process.
Problem Management Process Owner: The Problem Management Process Owner has overall authority and accountability over the Problem Management process and all activities within it.
Problem Manager: The person or group that is assigned the responsibility of managing the lifecycle of a problem, as defined within the Problem Management process.
Problem: A Problem is an unknown cause of one or more Incidents and is often identified because of multiple similar Incidents. The cause is not usually known at the time a Problem Record is created. The Problem Management Process is responsible for further investigation.
Resolution: Action taken to repair the root cause of an Incident or Problem or to implement a workaround.
Root Cause Analysis: The identification of the fundamental cause of a problem and the proposal of a structural solution.
Root Cause: The fundamental cause of a problem, the removal of which will prevent the recurrence of incidents resulting from the problem.
Workaround: A temporary solution that bypasses or masks the incorrect functioning of a service. A workaround is implemented when it is the quickest way to allow affected users to return to their work, reducing or eliminating the Impact of an Incident or Problem for which a full resolution is not yet available (for example, restarting a failed Cfg-Item).
- Problem Management Process
- Problem Management Roles and Responsibilities
- Incident Management Process
- Change Management Process
- Change Management Policy
|Revision Description:||Initial Release.|
|Revision Description:||This document was made Section 508 compliant and was converted to HTML.|
|Revision Description:||Rewritten to remove tool-specific references and updated to include current policies.|