Cover Story

Strategic Oil Analysis: Systems, tools and tactics

Mike Johnson & Matt Spurlock | TLT Cover Story April 2009

A properly developed, scheduled and executed program can help you keep ‘normal’ machine wear to a minimum.


KEY CONCEPTS
• Lubrication practices provide insight into machine operation by focusing on lubricant health, sump/lubricant contamination conditions and changing machine health.
• With a properly designed test slate, oil analysis is capable of diagnosing and addressing changes in three core areas: machine health, lubricant health and contaminant level.
• The most difficult aspect of oil analysis development is understanding how to best use the data.

This is the first in a five-part series addressing oil analysis (hereafter referred to as OA). OA is the control tool that the reliability engineer uses to grade the effectiveness of machine lubrication practices and activities.

I am pleased to join forces with Matt Spurlock for this series. Matt is the machine lubricant subject matter expert at Allied Reliability, Inc., in Indianapolis, Ind., and possesses strong practical knowledge in both machine lubrication and oil analysis principles. He has proven to be quite effective at pulling meaningful diagnoses from data sets for all types of machines.

This article provides a brief history of oil analysis as a machine care function, gives an overview of the purpose of oil analysis and introduces the tactics associated with establishing an effective analysis program.

USED OIL ANALYSIS: A HISTORY
Systematic used oil analysis began just after World War II in the railroad industry in the western United States. During the early years, wear metals were identified by several wet chemistry methods. It wasn’t until the development of the spectrograph that used oil analysis truly began to show promise as a value-added predictive technology.

In the mid-1950s, the U.S. Navy began using oil analysis to monitor jet engines. This program became the basis for the military’s Joint Oil Analysis Program. Over the years, this program has partnered with several civilian companies to develop useful technologies and instruments for use in both laboratory bench-top testing and plant floor screening.

The first independent commercial oil analysis laboratory was opened in 1960. Since then, more than 250 used oil analysis laboratories have opened in North America alone. While many of these laboratories are considered private interest labs, there are a handful of commercial providers that process several thousand samples per day.

USED OIL ANALYSIS: THE OBJECTIVE
It has been stated many times by machine lubrication professionals and lubricant developers that machine surfaces experience normal wear. While it is true that surfaces do bump, rub and wear, this condition shouldn’t be viewed as normal. Machine wear may be common, but machine owners spend many thousands of dollars, roughly 5% of the annual cost of maintenance, trying to avoid wear.

Modern lubricants are a product of exceptional research and development and generally are robust and durable. If those product capabilities are applied properly, following accepted engineering principles for product selection, application and contamination control, then the prospect of normal wear should be very low.

Nonetheless, lubricated machine surfaces do wear, degrade and fail. Accordingly, a properly developed, scheduled and executed OA condition monitoring practice is useful to monitor the conditions that create downtime and production losses related to all lubricated mechanical components, including both oil and grease applications.

Lubrication practices provide insight into machine operation by focusing on lubricant health, sump/lubricant contamination conditions and changing machine health. The progressions to failure of many lubricated components follow degraded lubricant health and contaminated sumps. Additionally, as the contaminant loads increase, lubricant health declines, further supporting this three-pronged approach.

MACHINE CONDITION ANALYSIS
As shown in Figure 1, the predominant threat to long-term performance is surface wear. Wear is caused by a handful of repeating problems. One of the most common applications for oil analysis is machine condition assessment by wear debris measurement. This is commonly performed through spectrographic analysis, which reports metals in parts per million. Spectrographic analysis generally is performed in one of two ways, ICP (inductively coupled plasma) and RTD (rotating disc spectroscopy), also known collectively as atomic emission spectroscopy (AES).

Figure 1. The predominant reason for the loss of machine usefulness is degradation of surfaces through a handful of repeated root causes.

While both ICP and RTD are standard components of most oil sample test slates, both have weaknesses when it comes to identifying wear debris from components in moderate to advanced stages of failure, characterized by high concentrations of large particles.

The weakness is due to the overall detection limits of the instruments. The ICP is highly accurate for particles less than 5 microns in size. The level of accuracy is reduced significantly between 5 and 8 microns, and detection effectiveness is lost for particles above 8 microns. RTD has similar detection parameters for wear particles up to 10 microns. To combat the limiting effects of atomic emission spectroscopy, additional tests should be included in the default test slate, including:

Particle Quantifier (PQ). The PQ gives an index value that is not size-dependent. This trendable value can assist in identifying large ferrous wear particles. Used in conjunction with AES, the PQ helps confirm growing normal wear, the onset of aggressive wear or the prospect of imminent catastrophic failure. Because it requires very little sample prep, the PQ test is both inexpensive and highly repeatable.

Particle Counting (PC). Particle counting is mostly thought of as a tool for determining overall cleanliness and contamination in used oil. While this is true, particle count values can also be used to assist in the detection of wear debris in a sump. While the values derived from particle counting are not qualitative in nature, they can be used in conjunction with AES and ferrous density (PQ/DR) to help justify the use of analytical ferrography.

Direct Read Ferrography (DRF). DR Ferrography has been used in oil analysis for gearbox applications for many years. DR Ferrography gives two index values: DL (ferrous large) for ferrous particles greater than 5 microns, and DS (ferrous small) for ferrous particles less than 5 microns. Additionally, the values derived from DR Ferrography can help determine wear particle concentration and percentage of large particles. Significant sample preparation and the use of hazardous chemicals make this a somewhat more costly test that serves well as an exception-based method (one run only after a suspicious condition has been identified).
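The two DR readings are commonly combined into a wear particle concentration (WPC) and a percentage of large particles (PLP). The following is a minimal sketch, assuming the widely used definitions WPC = DL + DS and PLP = (DL - DS) / (DL + DS) x 100. The function names and the sample readings are illustrative and do not come from any particular instrument or laboratory.

```python
def wear_particle_concentration(dl: float, ds: float) -> float:
    """WPC: sum of the large (DL) and small (DS) ferrous readings."""
    return dl + ds


def percent_large_particles(dl: float, ds: float) -> float:
    """PLP: share of the total reading attributable to large particles, in percent."""
    total = dl + ds
    if total == 0:
        return 0.0  # no ferrous debris detected
    return (dl - ds) / total * 100.0


# Hypothetical readings from two consecutive samples of one gearbox
previous = {"dl": 12.0, "ds": 8.0}
current = {"dl": 30.0, "ds": 10.0}

for label, sample in (("previous", previous), ("current", current)):
    wpc = wear_particle_concentration(sample["dl"], sample["ds"])
    plp = percent_large_particles(sample["dl"], sample["ds"])
    print(f"{label}: WPC={wpc:.1f}, PLP={plp:.1f}%")
```

In this made-up example both values rise between samples, which is the kind of trend that might justify following up with analytical ferrography.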

Analytical Ferrography (A/F). Analytical ferrography, whether via a slide (ferrogram) or a micropatch, is a time-consuming test. Due to its cost, A/F is recommended as an exception test only. In this test, a sample is metered over a slide sitting on top of a high-powered magnet or drained through a paper filter (a patch). In both cases, the captured debris is microscopically examined by a trained analyst to determine the prevailing wear mode. While this is a powerful and potentially highly useful test method, the results of A/F are very subjective in nature, and their successful application depends a great deal on the analyst’s experience and training.

LUBRICANT HEALTH ANALYSIS
Monitoring lubricant health brings a high potential for ROI. The proper use of oil analysis can easily result in extended oil change intervals by a factor of three or more. In order to achieve these extended intervals, the reliability engineer must ensure that the appropriate application-specific values are monitored.

Viscosity. For most components, viscosity will be included in all test slates. Viscosity is the single most important property of a lubricant. The lubricant’s viscosity is what allows it to form the protective layer required for separation of moving surfaces. The lubricant’s viscosity generally increases (thickens) with age. An abrupt change in viscosity generally suggests lubricant mixing and spurs investigation. Viscosity alone cannot be used to determine the health of a lubricant, as different parameters can cause changes in viscosity outside of the normal aging process.

Fourier Transform Infrared Spectroscopy (FTIR) for Oxidation. FTIR provides an oxidation value for the lubricant. Oxidation is a very complicated process by which oxygen reacts with the lubricant. As the lubricant ages, the oxidation value generally increases. However, oxidation alone is not an indicator of oil age. In lubricating oil, common wear metals act as catalysts to accelerate oxidation. Process contaminants and water also catalytically accelerate the rate of oxidation. When the oil reaches an oxidation threshold (limit), it must be replaced.

Neutralization Number. A lubricant’s acid number (AN) or base number (BN) is measured to determine a change in concentration of acid in the oil. Industrial oils receive AN measurement, and engine oils receive BN measurement. As an industrial oil ages, the acid number, oxidation value and viscosity are all likely to increase. Tests designed to measure these parameters have some degree of overlap and interdependent confirmation. BN measures the reserve alkalinity of an oil. This indicates the oil’s ability to neutralize the acid that is created during the combustion process, particularly for diesel engines. Unlike AN, BN is expected to decrease over time and as the over-base additives are expended.

Additive Depletion. Additives are complex organic compounds. FTIR is designed to measure for the presence or absence of compounds such as zinc dialkyldithiophosphate, but is more commonly used to search for molecular contaminants, as is denoted in Figure 2. AES also monitors for the presence of additives but does so at an atomic level—identifying the zinc and phosphorus atoms that make up the zinc dialkyldithiophosphate. Common lubricant degradation mechanisms can cause the loss of molecular integrity without the loss of the atomic building blocks. If FTIR is included in the default test selection for other reasons, the additive concentration will be available for the asking.

Figure 2. FTIR Spectroscopy enables the analyst to look for species of molecules that don’t belong in the oil. Water, nitration, sulfation and oxidation compounds can be identified when present. (Courtesy of Condition Monitoring International)

Other non-routine tests are very useful in identifying lubricant health changes, including the Rotating Pressure Vessel Oxidation Test (RPVOT), the RULER (Cyclic Voltammetry) and the Membrane Patch Colorimetry/Quantitative Spectrophotometric Analysis (MPC/QSA™) tests. These will be addressed in another article in this series.

CONTAMINANT MONITORING
Identification of contamination can have the single largest impact on the life of both equipment and corresponding lubricant. The existence of water and air contaminants can disrupt the fluid film required for surface separation. The presence of catalytic wear metals increases oxidation and the rate of lubricant degradation and degrades machine surface profiles. The presence of atmospheric and process chemicals causes surface abrasion, deformation and corrosion, all of which lead to lost surface integrity and accelerated wear. In order to reduce the impact of contamination, the following tests are generally run on a routine basis:

Particle Count. The particle count test is used as a primary test for the identification of particle contamination. Atmospheric particles accumulate in the oil through normal thermal (heating and cooling) cycles, through routine top-up activities and from containers used to handle and store the lubricants. A correlation exists between the concentration of atmospheric particles (much of which will be harder than the machine steel surfaces) and component wear. The resulting wear creates more wear and causes lubricant health to decline, creating a self-perpetuating escalation.

Atomic Emission Spectroscopy. AES is used to help monitor contaminant metals. Contaminants measured via AES include dirt (Si, Al), coolant (B, Na, K) and wrong oil (additive metals).
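The element groupings above lend themselves to a simple lookup. The sketch below flags a likely contaminant source when one of its marker elements has risen against a new-oil baseline. The element groups come from the text; the function name, the 10 ppm rise limit and the sample values are assumptions made for illustration only. Comparing against a baseline matters because some of these elements can also be part of an additive package.

```python
# Element groups taken from the text above.
CONTAMINANT_ELEMENTS = {
    "dirt": ("Si", "Al"),
    "coolant": ("B", "Na", "K"),
}


def flag_contaminants(ppm, baseline, min_rise_ppm=10.0):
    """Return {source: [elements]} for elements that rose by min_rise_ppm or more.

    ppm and baseline map element symbols to parts per million.
    min_rise_ppm is a hypothetical limit, not a recommended value.
    """
    flagged = {}
    for source, elements in CONTAMINANT_ELEMENTS.items():
        risen = [
            element
            for element in elements
            if ppm.get(element, 0.0) - baseline.get(element, 0.0) >= min_rise_ppm
        ]
        if risen:
            flagged[source] = risen
    return flagged


# Hypothetical used-oil result and new-oil baseline
used_oil = {"Si": 35.0, "Al": 12.0, "Na": 4.0, "Fe": 20.0}
new_oil = {"Si": 5.0, "Al": 3.0, "Na": 3.0}

print(flag_contaminants(used_oil, new_oil))
```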

Water Percentage. Water percentage is generally run via some type of screening test such as hot plate or FTIR. If these methods indicate positive for water, then Karl Fischer titration is conducted to identify the level of water. If the machine has a critical function, then the screening test can be bypassed and the Karl Fischer test becomes part of the standard test slate.
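The screening logic described above can be sketched as a small decision function. The function name and the returned labels are hypothetical; the branching follows the paragraph: critical machines go straight to Karl Fischer, and all others are screened first.

```python
from typing import Optional


def water_test_plan(machine_is_critical: bool,
                    screen_positive: Optional[bool] = None) -> str:
    """Decide which water test to run next.

    screen_positive is None until a screening test (hot plate or FTIR)
    has been run, then True or False depending on its result.
    """
    if machine_is_critical:
        # Screening is bypassed; Karl Fischer is part of the standard slate.
        return "karl_fischer"
    if screen_positive is None:
        return "screen"
    return "karl_fischer" if screen_positive else "no_further_test"


print(water_test_plan(machine_is_critical=True))
print(water_test_plan(machine_is_critical=False))
print(water_test_plan(machine_is_critical=False, screen_positive=True))
print(water_test_plan(machine_is_critical=False, screen_positive=False))
```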

Soot. Measured in diesel engine samples using FTIR.

Fuel Dilution. Measured in diesel engine samples using FTIR as a screening test, with a follow-up confirmation test performed either by a Fuel Sniffer or, for more precision, gas chromatography.

While all these tests have been placed in a specific category, the values of each test ideally will be used as confirmation data for other parameters in determining the true condition of the oil or machine. With many other tests capable of providing detailed information on oil samples, the tests mentioned here are run most routinely.

OA PROGRAM DEVELOPMENT & TACTICS
The analysis program development requires planning to address five separate points of concern. They are:
• Machine selection
• Sample collection methods development
• Primary and secondary test slate selection
• Alarm settings development
• Data review, analysis and corrective actions.

Skipping any of these interrelated priorities will turn an otherwise A-quality program into a C-quality performer and perhaps degrade the program into little more than a useless expense. Diligence is needed in each of the following areas:

Machine Selection. The first step in the process is to determine which machines to place under analysis. In the Best Practices article published in the January 2008 issue of TLT, the concept of machine criticality analysis was reviewed as a key factor for determining which machines justified the development of reliability-centered lubrication practices. The same concept and motivations apply to the development of an oil analysis program.

In order to obtain the highest ROI from an oil analysis program, the reliability engineer must ensure that the program is designed to meet the reliability objectives of the company and/or facility.

The first step in establishing an oil analysis program is determining exactly what equipment should be monitored. When we consider what equipment to monitor, several parameters should be considered, including:

Equipment criticality. Assigning a criticality value to a machine helps determine what, if any, predictive technologies should be applied for condition monitoring purposes. Criticality assessments take into account considerations for employee safety, the environment, operations and maintenance priorities.

Failure modes. Understanding the possible failure modes of a component will help determine whether or not oil analysis is a suitable technology. Although a failure mode may exist where oil analysis can be a useful tool, it may not make economic sense to include oil analysis in the overall predictive tool kit for the component in question.

Cycle time from incipient failure to loss of machine function. Having data regarding the potential time to catastrophic failure helps determine the optimum oil sample interval. While most equipment can be served effectively with quarterly oil sample intervals, some instances may require shorter intervals. Additionally, a hybrid approach may be taken in which quarterly sampling is conducted until evidence of a problem surfaces, at which point the interval is shortened. Either way, the time to failure after initial detection is an important piece of information in fine-tuning the oil analysis program.
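The hybrid approach can be sketched as follows. The 90-day routine interval stands in for "quarterly"; the rule of fitting three samples into the expected failure window is a hypothetical choice for illustration, not a recommendation from the authors.

```python
def next_sample_interval_days(problem_detected: bool,
                              days_to_failure_after_detection: float,
                              routine_days: int = 90,
                              samples_before_failure: int = 3) -> int:
    """Return the number of days until the next oil sample.

    Routine (quarterly) sampling applies until a problem is detected.
    After that, the interval is shortened so that several samples fit
    inside the expected time from detection to loss of function.
    """
    if not problem_detected:
        return routine_days
    shortened = int(days_to_failure_after_detection // samples_before_failure)
    # Never sample less often than the routine interval, and at least daily.
    return max(1, min(routine_days, shortened))


print(next_sample_interval_days(False, 60))  # routine interval
print(next_sample_interval_days(True, 60))   # shortened interval
```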

SAMPLE COLLECTION METHODS
Sampling technique is an important part of any type of testing activity, whether machines are involved or not. Sampling machine conditions, regardless of the type of testing involved, demands that the sample process itself be dependable.

Dependable sampling technique requires that the sample process itself does not influence the quality of the information collected. The sample must be an accurate representation of the actual conditions and must be repeatable (consistent). Accomplishing these two considerations with oil sampling is not inherently difficult but can be made so if the machine is not prepared for accurate sample collection.

Sample collection from an oil drain line and/or from the top of the reservoir with a suction tube produces poor quality and unrepresentative data, even if these are common methods. To overcome the variability and lack of quality created with these methods, the sampled machines need to be retrofitted with fixed sample ports, and the methods for collection should be very clearly defined.

Sample collection from a circulated oil system is prone to similar data error. When samples are taken from a zone of laminar (non-turbulent) flow, the flow conditions can appreciably affect the distribution of material collected.

TEST SLATE AND ALARM SETTINGS
Once the appropriate equipment has been selected and a dependable routine for sample collection defined, a determination must be made as to what properties should be monitored. With a properly designed test slate, oil analysis is capable of diagnosing and addressing changes in three core areas: machine health, lubricant health and contaminant level.

The primary and secondary slates flow from an understanding of the prevailing failure modes for the selected machine types and component surface interactions. Once the component failure modes are identified, the tests that best reveal the presence of a failure mode are adopted. We’ll discuss test slate selection in future articles in this series.

Each test measures an increase or decrease of a property. Alarm settings are used to bring attention to a rise or fall of a specific criterion. An oil health criterion such as viscosity represents a potential failure mode if it is either too high or too low, so viscosity has alarm settings for both rising and falling numbers. Wear debris represents a potential failure mode only when it is rising. Consequently, wear debris alarm settings apply to numbers that are increasing.
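The difference between two-sided and one-sided alarms can be illustrated with one small function. This is a sketch only: the function name, the viscosity band (a nominal 68 cSt grade with an assumed tolerance of plus or minus 10%) and the 100 ppm iron limit are hypothetical values chosen for the example.

```python
from typing import Optional


def check_alarm(value: float,
                low: Optional[float] = None,
                high: Optional[float] = None) -> str:
    """Return "low", "high" or "normal" for a reading against optional limits."""
    if low is not None and value < low:
        return "low"
    if high is not None and value > high:
        return "high"
    return "normal"


# Viscosity: alarms on both falling and rising numbers
print(check_alarm(58.0, low=61.2, high=74.8))

# Wear debris (iron, ppm): alarm on rising numbers only
print(check_alarm(120.0, high=100.0))
print(check_alarm(5.0, high=100.0))
```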

Some alarms are based on an increase of a specific value, some are based on an increase of a percentage, some are based on a rate of change with operating hours, and some are based on changes relative to a standard deviation of a collection of previous samples.
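The four alarm bases named above can each be written as a one-line check. The sketch below assumes a rising-only parameter such as iron in ppm; all function names, limits and sample values are hypothetical, and the statistical check uses the sample standard deviation of earlier results.

```python
from statistics import mean, stdev


def absolute_rise(previous, current, limit):
    """Alarm when the value rose by more than a fixed amount."""
    return current - previous > limit


def percent_rise(previous, current, limit_pct):
    """Alarm when the value rose by more than a given percentage."""
    return previous > 0 and (current - previous) / previous * 100 > limit_pct


def rate_of_change(previous, current, hours_between, limit_per_100h):
    """Alarm when the rise per 100 operating hours exceeds a limit."""
    return hours_between > 0 and (current - previous) / hours_between * 100 > limit_per_100h


def statistical(history, current, n_sigma=2.0):
    """Alarm when the value exceeds the historical mean by n_sigma deviations."""
    return len(history) >= 2 and current > mean(history) + n_sigma * stdev(history)


# Hypothetical iron readings (ppm) from five earlier samples, then a new one
history = [20, 22, 21, 23, 24]
latest = 35

print(absolute_rise(history[-1], latest, limit=10))
print(percent_rise(history[-1], latest, limit_pct=25))
print(rate_of_change(history[-1], latest, hours_between=500, limit_per_100h=1.0))
print(statistical(history, latest))
```

Which basis fits best depends on the parameter and on how much sample history is available; the statistical form in particular needs a reasonable collection of previous samples before it is meaningful.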

DATA REVIEW, ANALYSIS & CORRECTIVE ACTIONS
Developing lasting, high-quality, reliability-enhancing machine lubrication practices requires more than a textbook understanding of film formation, lubricant failure and contamination control techniques. While these are useful, success follows a thorough understanding of how the organization functions as much as it follows understanding the physics of component interaction. Similarly, perhaps the most difficult aspect of oil analysis development is understanding how best to use the data from analysis.

The tests noted above are typically associated with external, commercial laboratories, but that doesn’t necessarily have to be the case. Many shop-level instruments provide the reliability engineer with nearly immediate information that can be integrated into the decision-making process. Onsite testing is useful for machines with very short time-cycles between incipient and manifest failure. Yet, conducting all the noted tests using on-site methods for the sake of speeding up the availability of test results is probably not a best use of available resources.

The strongest economic value associated with lubricant analysis comes from using the data to eliminate the conditions that, left alone, are highly likely to cause failure of the lubricant, the machine or both. Typical responses address the recurring long-term problems: clean the oil, lower the operating temperature of the sump, increase the viscosity, change lubricant types to improve wear-resistance capacity, clean the oil again. These steps are very common and provide superior economic rewards when accomplished with a degree of consistency.

SUMMARY
The purpose for investing human energy and precious company capital in the improvement of lubrication practices, including implementing effective oil analysis practices, is to protect machine capacity and plant productivity. The dollars expensed in oil analysis efforts are a rounding error relative to the cost impact of lost production from lubricated component failure.

Oil analysis is the feedback loop that tells the practitioner whether the lubrication activities are delivering the results as expected. Oil analysis should be designed to provide information about the state of the lubricant condition, the cleanliness of the sump and the condition of the machine. A multitude of tests can be used to deliver this type of information.

To ensure that you have developed an efficient program that delivers results for these three criteria, you must do the following:
• Assure the best fit for machine selection
• Make sure that the sample collection process provides samples that are both relevant and repeatable
• Select tests that reflect real failure conditions and root causes
• Set alarms that draw attention to a change in conditions well before the machine is damaged.

Finally, make sure you use the information to make decisions that improve long-term effectiveness of the machinery.

Mike Johnson, CLS, CMRP, MLT, is the principal consultant for Advanced Machine Reliability Resources, in Franklin, Tenn. You can reach him at mike.johnson@precisionlubrication.com.

Matt Spurlock, CMRP, MLA II, MLT I, LLA I, is the machine lubricant subject matter expert at Allied Reliability, Inc., in Indianapolis, Ind. You can reach him at spurlockm@alliedreliability.com.