In many jurisdictions, any negative customer impact is felt by the utility as
a penalty payment, so network performance (system outages, voltage fluctuations)
and customer management (such as missed appointments) must be optimized.
Traditionally, network infrastructure was designed for extremely high reliability
and fail-safe operation in accordance with sound engineering practice. Much
of this was achieved through over-design, and over-maintenance was seen as simply
more of a good thing.
Today, with the introduction of competitive markets, utilities recognize the
need to reduce operational costs while providing for a safe minimum amount of
maintenance. We can no longer simply throw money at a technical problem. This
article outlines a way forward tailored to our unique environment.
Historically, utilities were set up and operated as monopolies with heavy reliance
on regulation to avoid abuses. The generation assets and transmission and distribution
networks were designed to provide highly reliable service to customers. As every
engineer knows, it is possible to support highly reliable service delivery through
the use of installed redundancy; if one asset breaks down, the standby asset
takes over and service disruption is either avoided or reduced, depending on
the speed of switching.
Of course, assets do break down from time to time and they require maintenance
to restore service or backup capability. Maintainers also know that one way
to enhance reliability is to prevent failures from occurring. So, conventional
practice suggests that doing more preventive maintenance might reduce failures
and achieve more uptime or availability. In geographically far-flung networks,
it was convenient to have plenty of people available to respond to trouble calls
and restore service. It was expected that if those people were not responding
to trouble calls, they were doing more preventive maintenance.
All of this was paid for by the ratepayers. Ratepayers were usually protected
by regulators from the full brunt of the costs through rate caps and restrictions
on rate increases. Capital expansion and replacement programs were funded by
the issuance of debt instruments that would one day be paid off by those same
ratepayers. Utilities did what they needed to do knowing it would be paid for
eventually. As businesses, they provided highly reliable service, at any cost.
Today our industry is deregulating. The whole idea behind deregulation is to
allow the markets and costs of service delivery to be driven by their own natural
forces, thus benefiting the ratepayers and taxpayers. One of the biggest impacts
of deregulation is that utilities can no longer pass the costs of operation
on to the customer base. Disaggregation, the splitting of integrated utilities
into smaller arms-length companies, has focused attention on the financial performance
of wires companies as never before.
Because the wires part of the utilities industry is a natural monopoly, operating
rules impose nominal performance metrics, as well as penalties if the metrics
are not achieved. In this new environment, the ability to recover costs is constrained,
and the need for increased system reliability is greater. We are now being asked
to manage our assets as a business that must serve competing market demands
for service and delivery, but now under market-driven cost ceilings. That
requires a change.
We must accept the premise that we have often over-designed, over-maintained,
and over-staffed to meet demands for service continuity. There must be opportunities
to cut costs while maintaining high service levels and customer satisfaction.
And there are. But those methods are not well aligned with the traditional
methods we have used for years. The change is in our thinking and where that
Better Asset Management
To contain or reduce costs we need to find more effective and efficient ways
of managing the asset base.
While maintaining our assets we disturb them. We take them out of service for
a brief time, take them apart, clean, lubricate, reassemble, test, and reinstall
them. Much of this work is done by skilled maintainers in conditions that are
far from ideal: exposed to weather, in awkward locations atop poles, standing
in buckets several meters above street level, with thick gloves on, and so on.
Despite the care they take, they can’t help but make mistakes. The result is
that those assets often don’t work properly after they’ve been maintained. We’ve
all experienced the car that doesn’t run as well after a trip to the garage;
this is the same sort of phenomenon. Statistically, these problems are
the most common type of failure we experience, and in those cases, we would
have been better off to leave things alone.
Using designed-in redundancy is expensive and doesn’t always produce the desired
results. This redundant capability means that we have assets that are bought,
installed, and maintained, but seldom called upon to operate. Sometimes we induce
failures as discussed above. Sometimes when we need them we find they are not
available for use; they are already in a failed state, such as switches that
haven’t moved for a long time that are now stuck closed or stuck open. We then
have expensive failed system components that haven’t added the value we needed:
they didn’t work.
We can observe what has been done in other industries that have lived with
similar market forces for much longer. Air transport, in particular, has learned
to maintain its widely dispersed assets for high levels of reliability and service
while doing so cost-effectively in a highly competitive yet tightly regulated
business environment. Evidence of this is found in Stanley Nowlan and Howard
Heap’s report “Reliability Centered Maintenance” from 1978 and John Moubray’s
“RCM 2” in 2002.
The airline industry of the 1950s and ’60s also experienced high costs due
to maintenance, and it experienced many asset failures that led to disastrous
consequences. The industry had roughly 60 crashed aircraft for every 1 million
takeoffs, and more than 40 of those were due to equipment failure. Airlines
tried what was then obvious: increasing the maintenance so equipment failures
would be avoided. The situation actually got worse!
The industry realized eventually that traditional thinking wouldn’t work as
the larger wide-bodied all-jet fleets were introduced. Airlines studied how
assets failed and recognized that different failures required different approaches.
They recognized that many failures were maintenance-induced, that many were
entirely random, that many could be detected before they got to the totally
failed state, and that many were hidden under normal circumstances. All of these
can be managed.
The airlines developed a method to determine the amount of maintenance that
decreased the failure consequences to safety, the environment, and operations.
As a result of widespread application of this method, they experienced a 30-fold
reduction in crashes (today the rate is about two per million takeoffs) and
a 133-fold improvement in those that are equipment failure-related (to less
than 0.3 per million). They actually do less of the fixed interval type of overhaul
work that they used to: In those early days 80 percent of their maintenance
was fixed interval work; today it is less than 20 percent. The work they do
is now less expensive.
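The improvement factors quoted above follow directly from the before-and-after rates. A minimal sketch of that arithmetic, using only the figures given in the text (the variable and function names are illustrative, and because the text says "more than 40" and "less than 0.3", the equipment-related factor is best read as approximate):

```python
# Rates per million takeoffs, as quoted in the article text.
crashes_before = 60.0      # 1950s-60s, all causes
crashes_after = 2.0        # today, all causes
equipment_before = 40.0    # 1950s-60s, equipment failure ("more than 40")
equipment_after = 0.3      # today, equipment failure ("less than 0.3")


def fold_improvement(before, after):
    """Return how many times smaller the 'after' rate is than the 'before' rate."""
    return before / after


crash_fold = fold_improvement(crashes_before, crashes_after)
equipment_fold = fold_improvement(equipment_before, equipment_after)

print(f"All crashes: {crash_fold:.0f}-fold reduction")
print(f"Equipment-related crashes: {equipment_fold:.0f}-fold reduction")
```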
Our utilities need to embrace this thinking for one simple reason: It works.
The method is called reliability centered maintenance (RCM). While there are
a variety of RCM methods available today, only a handful are thorough enough
to work well for our industry. In this article, we examine the benefits of RCM
methods applicable to utilities.
Figure 1: Finding the RCM Maintenance Target Zone
In a recent application of RCM in a large municipal electrical distribution
utility, we demonstrated a 34 percent cost reduction for the asset classes to
which it was applied. The total costs to achieve that were small: Internal rate
of return was calculated in excess of 180 percent. It revealed many instances
of hidden failures that were inadequately dealt with in the past. It used risk-based
decision techniques to make the best use of installed redundancy in order to
The RCM application also favored condition monitoring over fixed interval maintenance.
It also revealed many instances of maintenance practices that induced failures;
many of those practices were dropped. It also found several instances where
over-design actually led to failures; design specifications and installed
equipment are being modified to help eliminate those failures. A valuable ancillary
benefit was that knowledge about the assets was captured in a structured and
easily accessible database. When the experienced people who participate retire,
their knowledge won’t leave with them.
The utility above and others that are leading the way in adopting RCM are proving
that this proactive method of determining the right maintenance can produce
substantial cost savings without compromising safety or service. The technology
now exists to capture the full value potential that is there. The money is on
the table and astute utilities are starting to grab it. Their timing is right.
The savings will come from doing the right maintenance at the right time:
no more, no less. It liberates manpower and uses fewer parts and replacements.
By identifying unnecessary designed-in features that lead to failures, it enhances
system performance and strives to eliminate risks. By using risk-based decision
techniques, it helps ensure that money is being spent where it will do the most
good. It provides the reliability and redundant asset availability that deliver
customer satisfaction, and it does so at the lowest cost.
Optimizing the Workforce
For reasons mentioned above, utilities have traditionally relied on a large
labor force. The average “tool time” per tradesperson was low, typically only
two to three working hours per day. In a cost-constrained business environment,
there is a new pressure on utility owners to improve productivity. Opportunities
for workforce reduction are therefore welcome, as long as necessary work can
be accomplished to meet reliability and service targets.
By redesigning maintenance work as a result of using an RCM approach to work
identification, it is likely that the total amount of maintenance work will
decline. The short-term result is that we may find that we have more people
than we need and that can lead to painful choices. Fortunately, however, the
timing coincides with another major change that is taking place in our society:
the early retirement of some of the baby boomers (the oldest are now about
55). As these experienced and valuable people leave the workforce, we will find
fewer to replace them. The baby bust produced far fewer people, and they are
already in the workforce; the youngest are 23.
We already have difficulty finding skilled tradespeople to replace those who
leave. The echo boom is just beginning to enter the workforce, but they are
not embracing the trades; they are the first all-computer generation. We need
to get used to working with fewer skilled tradespeople in our workforce. We
can’t easily correct the supply problem, but RCM can help us prepare by reducing
the demand for those skilled trades.
Trade unions may not like the sound of this, but it is as real a problem for
them as it is for those of us managing our utilities. Their membership will
shrink and they won’t be able to meet demand. RCM can help them too. It enables
us to ensure that the few skilled tradespeople we will have are indeed doing
the work that adds the most value: the safe minimum amount of maintenance.
The next benefit to be attained is to more effectively manage the (now reduced)
workforce. The target here is to increase productivity. Design software based
on a geographic information system (GIS) can be used to carry out design of
new and replacement distribution infrastructure. The key benefits are that field
trips by designers can be reduced, the design can be automatically routed to
asset and work management systems as compatible units, and completed work can
be automatically settled to the fixed assets subsystems of the utility’s financial
system. All of these benefits translate into significant reductions in manual
processes, with consequent cost savings as well as improvements in data accuracy.
Resource management software can be deployed to optimize the process of allocating
human resources to the work that must be done. These tools help ensure that work
is performed in a sequence such that crew travel time between jobs is decreased,
overtime is reduced, material is staged to be at the work location at the correct
time, and work is “clustered” so that all jobs in a certain geographic area
are given to those field crews in or assigned to that area. This reduces the
need for multiple trips to the same remote substation, for example.
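The clustering and sequencing idea described above can be illustrated with a small sketch. This is not how any particular resource management product works; it assumes hypothetical job records with an `area` label and planar `x`/`y` coordinates, groups jobs by area, and then orders each group with a simple nearest-neighbor heuristic starting from an assumed depot location.

```python
from collections import defaultdict
from math import hypot


def cluster_jobs_by_area(jobs):
    """Group jobs by their (hypothetical) 'area' label."""
    clusters = defaultdict(list)
    for job in jobs:
        clusters[job["area"]].append(job)
    return dict(clusters)


def order_by_nearest_neighbor(jobs, start):
    """Order jobs by repeatedly visiting the closest remaining job.

    A simple heuristic for reducing travel; it does not guarantee
    the shortest possible route.
    """
    remaining = list(jobs)
    route = []
    here = start
    while remaining:
        nearest = min(
            remaining,
            key=lambda j: hypot(j["x"] - here[0], j["y"] - here[1]),
        )
        remaining.remove(nearest)
        route.append(nearest)
        here = (nearest["x"], nearest["y"])
    return route


# Illustrative data only: job IDs, areas, coordinates, and depots are made up.
jobs = [
    {"id": "J1", "area": "north", "x": 0.0, "y": 10.0},
    {"id": "J2", "area": "south", "x": 1.0, "y": -8.0},
    {"id": "J3", "area": "north", "x": 2.0, "y": 12.0},
    {"id": "J4", "area": "north", "x": 0.5, "y": 10.5},
]
depots = {"north": (0.0, 9.0), "south": (0.0, -9.0)}

plan = {
    area: [j["id"] for j in order_by_nearest_neighbor(area_jobs, depots[area])]
    for area, area_jobs in cluster_jobs_by_area(jobs).items()
}
print(plan)
```

In this sketch, the three northern jobs are handed to the northern crew in a single sequence, so one trip covers all of them.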
Depending on how frequently the software is updated with actual field data,
work assignments can be automatically adjusted on the fly so that unplanned
field conditions can be worked around. By understanding the available skills
in the resource pool (comprising both human and equipment resources), their
location and workload, resource management software can optimally assemble crews
to achieve the same goals: reduced downtime and travel time.
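As a rough illustration of crew assembly from a resource pool, the sketch below picks people and equipment that together cover the skills a job requires, preferring resources that are close to the job and lightly loaded. The record fields (`skills`, `x`, `y`, `workload_hours`) and the greedy selection rule are assumptions made for this example; a commercial scheduler would use a more sophisticated optimizer.

```python
from math import hypot


def assemble_crew(required_skills, resources, job_location):
    """Greedily select resources until every required skill is covered.

    Candidates are ranked by distance to the job, then by current workload.
    This is a heuristic, not a guaranteed-optimal assignment.
    """
    needed = set(required_skills)
    ranked = sorted(
        resources,
        key=lambda r: (
            hypot(r["x"] - job_location[0], r["y"] - job_location[1]),
            r["workload_hours"],
        ),
    )
    crew = []
    for resource in ranked:
        covered = needed & set(resource["skills"])
        if covered:
            crew.append(resource["name"])
            needed -= covered
        if not needed:
            break
    if needed:
        raise ValueError(f"no available resource covers: {sorted(needed)}")
    return crew


# Illustrative pool of human and equipment resources (all values made up).
pool = [
    {"name": "Lineworker A", "skills": ["line_work"],
     "x": 1.0, "y": 1.0, "workload_hours": 2},
    {"name": "Lineworker B", "skills": ["line_work", "switching"],
     "x": 5.0, "y": 5.0, "workload_hours": 1},
    {"name": "Bucket truck 7", "skills": ["aerial_lift"],
     "x": 2.0, "y": 0.0, "workload_hours": 0},
]

crew = assemble_crew(
    required_skills=["line_work", "aerial_lift", "switching"],
    resources=pool,
    job_location=(0.0, 0.0),
)
print(crew)
```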
Of course, in emergency situations, such as a storm response, all of the productivity-enhancing
and cost-containment rules that resource management software uses for normal
crew management may no longer apply. However, it is precisely in this scenario
that good resource management applications reveal their worth. Work assignment
rules can be changed rapidly to help meet the objective of minimal system
downtime. Tight data integration helps ensure that the status of work, planned and actual,
is available to planners and to customers through the appropriate business processes.
There has always been a need to get accurate and timely data from the field.
With the advent of relatively inexpensive and reliable PDAs, all manner of field
data can be collected and sent to upstream applications. Job status data allows
schedulers to determine with fine-grained accuracy if the crew can accept new
work, typically on a per-hour basis.
Physical location can be collected via GPS units in trucks, and the location
of poles can be plotted and used to update GIS-based maps. Actual asset condition,
the keystone for best-in-class asset management, can be captured accurately
and with few errors. Of course, it was always possible to collect this
data using manual processes. The problem was that data was often incomplete
and the accuracy poor. The quality of decisions made with poor data can never
be more than poor.
Current technology allows for bar code reading, “write once” data capture,
and above all, significant improvements to user productivity. Of course, geography,
available infrastructure, and cost are all determinants of how these field devices
are deployed and how successful they will be. Nevertheless, there have been
some significant successes achieved by utilities that have made the investment.
As with any investment, there needs to be a careful analysis done before committing
to large expenditures. For example, one utility allows workers to use their
PDAs for personal use, thereby giving the worker an incentive to protect the
Much has been written about the potential for introducing “contestability”
into certain business processes of utilities. Functions such as meter reading,
billing, selected maintenance tasks, and IT infrastructure management were early
candidates for being outsourced to third parties. If the outsourcing arrangement
is set up correctly (and this is not easy), the result is potentially lower
costs to the utility with equal or improved operating performance.