1. Service Management View
Reference Viewpoint: Service Management Viewpoint
Version: Working Draft 1, July 7, 2006
SOA is by definition a distributed paradigm; therefore, from an information technology (IT) perspective, a managed distributed system architecture is needed to fully realize the potential of SOA. Distributed capabilities facilitated by the SOA paradigm may be under different ownership domains. This suggests that there is no theoretical limit as to how widely dispersed—geographically or otherwise—participants in a SOA environment can be so long as there exists a means for service participants to communicate. The prospect of having to support a highly distributed system architecture poses significant challenges from a systems and network management point of view, and requires the introduction of a specialized form of management known as service management. The bottom line is that a SOA must be managed to be effective.
<optional> Systems management refers to enterprise-wide maintenance and administration of distributed computer systems. Network management refers to the maintenance and administration of large-scale networks such as computer networks and telecommunication networks. Systems and network management execute a set of functions required for controlling, planning, deploying, coordinating, and monitoring the distributed computer systems and the resources of a network. </optional>
Service management, which is the subject of this reference architecture view, refers to the management and administration of service-based resources through a set of activities and capabilities that continuously monitor, control, coordinate, and report on the qualities and usage of these resources. Examples of service qualities include health qualities, or common Quality of Service (QoS) attributes, such as availability, performance, and accessibility. Examples of service usage that may be monitored or controlled include frequency, duration, scope, functional extent, and access authorization. Ultimately, service management is about ensuring that service quality stays at levels that meet the needs of the service consumer.
1.2. Management Capabilities
Historically, systems management capabilities have been organized by the following functional groups known as “FCAPS” functions (based on the ITU-T Rec. X.700 | ISO/IEC 7498-4:1989(E) standard):
- Fault Management
- Configuration Management
- Accounting Management
- Performance Management
- Security Management
From a service management perspective, each of these functional groups can be leveraged and defined for purposes of this SOA reference architecture as follows (in concert with ITU-T Rec. X.700 | ISO/IEC 7498-4:1989(E)):
Fault Management – Encompasses fault detection, isolation and the correction of abnormal operation of the SOA environment. Faults cause SOA distributed systems to fail to meet their operational objectives and they may be persistent or transient. Faults manifest themselves as particular events (e.g., errors) in the operation of a distributed system. Error detection provides capabilities to recognize faults. Fault management includes functions to a) maintain and examine error logs, b) accept and act upon error detection notifications, c) trace and identify faults, d) carry out sequences of diagnostic tests, and e) correct faults. For purposes of this reference architecture, monitoring functions such as service status and alerting are included in this functional group.
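As an illustration only, the log-keeping and notification functions above can be sketched in a few lines of Python. The class name `FaultLog`, the `persistent_after` threshold, and the rule that a repeated identical error counts as a persistent fault are assumptions made for this sketch; they are not defined by X.700 or by this reference architecture.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class FaultLog:
    """Hypothetical in-memory error log for a set of managed services."""
    persistent_after: int = 3  # assumed threshold, not taken from any standard
    entries: list = field(default_factory=list)
    counts: Counter = field(default_factory=Counter)

    def notify(self, service: str, error: str) -> str:
        """Accept an error-detection notification and classify the fault."""
        key = (service, error)
        self.entries.append(key)  # maintain the error log
        self.counts[key] += 1
        # Assumed rule: the same error recurring on the same service
        # is treated as a persistent fault, otherwise as transient.
        if self.counts[key] >= self.persistent_after:
            return "persistent"
        return "transient"

    def errors_for(self, service: str) -> list:
        """Examine the log entries recorded for one service."""
        return [err for svc, err in self.entries if svc == service]
```

A real fault manager would also run diagnostics and trigger correction; the sketch covers only functions a) and b) and a crude form of c).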
Accounting Management – Enables charges to be established for the use of resources in the SOA environment, and for costs to be identified for the use of those resources. Accounting management includes functions to a) inform service consumers of costs incurred or resources consumed, b) enable accounting limits to be set and tariff schedules to be associated with the use of resources, and c) enable costs to be combined where multiple resources are invoked to achieve a given objective (resulting in a real-world effect). For purposes of this reference architecture, related accounting functions such as metering and billing fall into this category.
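The metering side of these functions can be sketched as follows. This is a minimal example under stated assumptions: the `Meter` class, a flat per-invocation tariff, and a per-consumer spending limit are all hypothetical and chosen only to show how tariffs, limits, and combined costs relate.

```python
from decimal import Decimal


class Meter:
    """Hypothetical usage meter with a tariff schedule and per-consumer limits."""

    def __init__(self, tariffs: dict, limits: dict):
        self.tariffs = tariffs  # service name -> price per invocation
        self.limits = limits    # consumer -> maximum total charge
        self.usage = {}         # (consumer, service) -> invocation count

    def cost(self, consumer: str) -> Decimal:
        """Combine costs across every service the consumer has invoked."""
        return sum(
            (self.tariffs[svc] * n
             for (c, svc), n in self.usage.items() if c == consumer),
            Decimal("0"),
        )

    def record(self, consumer: str, service: str) -> bool:
        """Meter one invocation; refuse it if the limit would be exceeded."""
        projected = self.cost(consumer) + self.tariffs[service]
        # Consumers without a configured limit are treated as unlimited.
        if projected > self.limits.get(consumer, Decimal("Infinity")):
            return False
        key = (consumer, service)
        self.usage[key] = self.usage.get(key, 0) + 1
        return True
```

`Decimal` is used rather than `float` so that summed charges do not pick up binary rounding error; billing itself would consume the totals reported by `cost`.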
Configuration Management – Identifies, exercises control over, collects data from and provides data to SOA distributed systems for the purpose of preparing for, initializing, starting, providing for the continuous operation of, and terminating services. Configuration management includes functions to a) set the parameters that control the routine operation of the SOA distributed system, b) associate names with managed resources and sets of managed resources, c) initialize and close down managed resources, d) collect information on demand about the current condition of the SOA distributed system, e) obtain announcements of significant changes in the condition of the SOA distributed system, and f) change the configuration of the SOA distributed system. For purposes of this reference architecture, the related configuration management functions of service versioning and service provisioning (i.e., supplying of services) are included in this functional category.
Performance Management – Enables the behavior of resources in the SOA environment and the effectiveness of service-oriented activities to be evaluated. Performance management includes functions to a) gather statistical information, b) maintain and examine logs of system state histories, c) determine system performance under natural and artificial conditions, and d) alter system modes of operation for the purpose of conducting performance management activities. Measurements gathered as part of performance management are used to compare against service level agreements (SLAs).
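A minimal sketch of comparing gathered measurements against an SLA is shown below. The function name `sla_report`, the choice of mean latency and success rate as the measured quantities, and the threshold parameters are assumptions for illustration; an actual SLA may specify different metrics, such as percentiles.

```python
import statistics


def sla_report(latencies_ms: list, successes: int, requests: int,
               max_mean_latency_ms: float, min_success_rate: float) -> dict:
    """Summarize gathered statistics and compare them with assumed SLA targets."""
    mean_latency = statistics.fmean(latencies_ms)
    success_rate = successes / requests
    return {
        "mean_latency_ms": mean_latency,
        "success_rate": success_rate,
        # Each flag is True when the measurement satisfies the SLA target.
        "latency_ok": mean_latency <= max_mean_latency_ms,
        "success_ok": success_rate >= min_success_rate,
    }
```

For example, `sla_report([100.0, 200.0, 300.0], 99, 100, 250.0, 0.995)` reports a mean latency within the target but a success rate below it.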
Security Management – Supports the application of security policies by means of functions which include a) the creation, deletion and control of security services and mechanisms, b) the distribution of security-related information, and c) the reporting of security-relevant events. A more detailed treatment of security is provided in the Security View of this SOA reference architecture.
1.3. Management Contracts and Policies
1.3.1. SLA management (example of a contract), etc.
Quality of Service capabilities will enable graceful degradation, fault tolerance, high reliability, and bounded deterministic behavior. Typical QoS attributes are:
- Availability: Probability that the service is available, taking into account factors such as mean time between failures (MTBF, for both hardware and software) and mean time to repair (MTTR)
- Accessibility: Probability of successful service instantiation when required
- Scalability: Probability of successfully serving requests independent of load
- Integrity: Measurement of interaction correctness with respect to the source, compared against a probabilistic requirement
- Performance: Measurement of round-trip service request throughput and latency, compared against the requirement
- Reliability: Probability of being able to maintain a service at the specified service quality
- Regulatory: Probability of conforming to rules, standards, and service level agreements
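The availability attribute can be made concrete with the commonly used steady-state estimate MTBF / (MTBF + MTTR). This draft does not prescribe that formula, so the sketch below is an assumption about how the two factors might be combined; the function name and the use of hours as the unit are likewise illustrative.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Estimate steady-state availability as MTBF / (MTBF + MTTR).

    Assumes failures and repairs alternate, so the service is up for
    MTBF hours on average and then down for MTTR hours on average.
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)
```

Under this estimate, a service that runs 999 hours between failures and takes 1 hour to repair is available 99.9% of the time.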
"Although provision of management capabilities enables a service to become manageable, the extent and degree of permissible management are defined in management policies that are associated with the services. Management policies are used to define the obligations for, and permissions to, managing the service." [WSA]
Will come back to this...
Relate to policies, i.e., "policies are also intended as a vehicle to express SLAs."
1.4. Manageability & Instrumentation
1.5. Management Infrastructure
A basic service management infrastructure should provide the following capabilities and characteristics:
- Integrate with existing security services
- Heartbeat and Ping
- Pause/Restore/Restart Service Access
- Logging, Auditing, Non-Repudiation
- Runtime Version Management
- Complement other infrastructure services (discovery, messaging, mediation)
- Message Routing and Redirection
- QoS, Management of Service Level Objects and Agreements
- Response Time
- Fault and Exception Management
A management system should be required to manage the services themselves, not the underlying infrastructure.
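Two of the listed elements, "Heartbeat and Ping" and "Pause/Restore/Restart Service Access", can be sketched together as follows. The `HeartbeatMonitor` class, its timeout rule, and the three status strings are hypothetical; timestamps are passed in explicitly so that the example does not depend on a real clock.

```python
class HeartbeatMonitor:
    """Hypothetical monitor: a service is 'up' if it reported within the timeout."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_seen = {}  # service -> timestamp of its last heartbeat
        self.paused = set()  # services whose access is administratively paused

    def heartbeat(self, service: str, now: float) -> None:
        """Record a heartbeat received from a service at time `now`."""
        self.last_seen[service] = now

    def pause(self, service: str) -> None:
        """Suspend access to a service without treating it as failed."""
        self.paused.add(service)

    def restore(self, service: str) -> None:
        """Resume access to a previously paused service."""
        self.paused.discard(service)

    def status(self, service: str, now: float) -> str:
        """Return 'paused', 'down', or 'up' for a service at time `now`."""
        if service in self.paused:
            return "paused"
        seen = self.last_seen.get(service)
        if seen is None or now - seen > self.timeout_s:
            return "down"
        return "up"
```

The monitor tracks only the service-level view (whether a service is reachable and whether access is permitted), which matches the intent that management target services rather than the hosts beneath them.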
[#] ITU-T Rec. X.700 | ISO/IEC 7498-4:1989(E), Information processing systems -- Open Systems Interconnection -- Basic Reference Model -- Part 4: Management Framework, International Telecommunication Union, International Organization for Standardization and International Electrotechnical Commission, Geneva, Switzerland, 1989.
[#] David Booth, et al., Web Services Architecture, W3C Working Group Note, World Wide Web Consortium (W3C) (Massachusetts Institute of Technology, European Research Consortium for Informatics and Mathematics, Keio University), February, 2004.
[#] D. E. Cox and H. Kreger, “Management of the service-oriented architecture life cycle,” IBM Systems Journal 44, No. 4, 709-726, 2005.
[#] Deepak Kakadia, et al., “Enterprise Management Systems Part I: Architectures and Standards,” Sun BluePrints™ Online, Sun Microsystems, Inc., Santa Clara, CA, April, 2002.
2.1. Service Management
How do we cope with complexity?
2.1.1. Service Life-cycle