Standardization and quality assurance of quantitative determinations
Lothar Siekmann, Gerhard Röhle
Specimens to be examined in medical laboratories are characterized by extraordinary variety due to their different origin and composition. Materials to be examined include:
- Body fluids such as blood, cerebrospinal fluid, ascites, or pleural fluid
- Excretions such as urine, saliva, sputum, or feces
- Tissue samples.
Usually the specimen represents a mixture of many different chemical substances and more or less differentiated organic structures.
The sum of all components and characteristics of a sample except for the analyte itself is referred to as the sample matrix. Each component of a specimen can become the subject (analyte) of a laboratory diagnostic evaluation.
The analyte may represent quantities (measurands) as different as:
- Physical properties
- Chemical elements, ions, inorganic molecules
- Low-molecular-mass organic structures
- Macromolecules with known or only approximately known structure
- Cells or cellular systems.
The enormous variety of components of biological specimens requires that the methods used for the detection or quantitative determination of individual analytes should meet high expectations. For characterizing a quantitative analytical method, a number of different quality characteristics are taken into consideration. Among these, specificity is of special significance in the present context.
Specificity characterizes the ability of a method to measure the analyte without erroneous interference by other components contained within the sample matrix.
Other quality characteristics include:
- Limit of quantitation
They are dealt with in further detail in the following text.
Standardization should always be based on the application of the concept of traceability. This concept is the subject of the ISO 17511 standard and is explained by the model in . The concept describes a hierarchical structure of measurement procedures and calibrators, from the patient sample as the lowest level of the hierarchy to the highest level (i.e., the definition of the measurand in SI units).
(a) Definition of the quantity: if a quantity can be described by a defined molecular structure, it can be specified as an amount of substance concentration (e.g., in mol/L).
(b) The primary reference measurement procedure is the determination of the purity (certification) of a primary reference material. Such certified reference materials, whose purity was determined by metrology institutes, are available for many measurands.
(d) Secondary reference methods are procedures that are required to possess a high degree of specificity and are therefore suitable to deliver results of highest trueness and precision in a complex biological matrix. Such methods can be expected to determine the true value within narrow limits of measurement uncertainty. A typical method principle for the development of a reference measurement procedure is isotope dilution mass spectrometry.
Secondary reference methods are intended for the certification of manufacturer’s calibrators, of control samples for internal and external quality assurance and, to some extent, for the certification of panels of patient samples as part of the development and testing of diagnostic test kits. Matrix reference materials, which can be obtained from metrology institutes (e.g., cholesterol in human serum), are also certified with such secondary reference methods.
(e) Manufacturer’s calibrators are used to calibrate in-house measurement procedures.
(f) These manufacturer’s procedures (in-house) are used to calibrate the product calibrators.
(g) The product calibrators are part of the test kits available to routine laboratories for diagnostic purposes.
(h) Diagnostic laboratories use routine methods to determine analytical results in the patient samples with the test kits provided by manufacturers (calibrators, reagents, equipment).
The traceability of a measuring result in a patient sample (i) up to the highest level – represented by the definition of the measurand (a) – is assured if all the individual steps in this hierarchical model are traceable.
Each measurement procedure (reference, manufacturer’s, or routine procedure) and each value assigned to a reference material or calibrator has a certain measurement uncertainty [u_c(y)]. The measurement uncertainty of the result of a patient sample is calculated from all the individual contributions [u_c(y)] of the hierarchical chain according to the rules for the calculation of overall measurement uncertainty.
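The way individual contributions add up along the chain can be sketched as follows. This is a minimal illustration, assuming that the contributions are independent standard uncertainties that may be combined in quadrature (root sum of squares); the step names and the numerical values are hypothetical and are not taken from any real calibration hierarchy.

```python
import math


def combined_standard_uncertainty(contributions):
    """Combine standard uncertainties in quadrature.

    Assumption: all contributions are uncorrelated and expressed
    in the same unit (here: relative standard uncertainty in %).
    """
    return math.sqrt(sum(u ** 2 for u in contributions))


# Hypothetical relative standard uncertainties (%) for each step
# of the traceability chain, from top to bottom.
chain = {
    "primary calibrator": 0.2,
    "secondary reference procedure": 0.5,
    "manufacturer's calibrator": 0.8,
    "product calibrator": 1.0,
    "routine procedure": 2.0,
}

u_c = combined_standard_uncertainty(chain.values())
print(f"combined relative standard uncertainty: {u_c:.2f} %")
```

In this sketch the largest single contribution dominates the combined value, which is why the uncertainty of the routine step usually determines the uncertainty of the patient result.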
In the hierarchical model of traceability described here, individual steps, except for the highest levels (a) through (d), may be skipped. For example, it is possible to skip the in-house manufacturer’s procedure and the manufacturer’s calibrator and certify the product calibrators (g) from a manufacturer directly with a secondary reference measurement procedure (d).
In theory it would be possible, for example, to measure patient samples (i) directly with a secondary reference measurement procedure (d). In practice, however, this would make little sense due to the enormous costs and time required.
The prerequisite for traceability to the highest level is the availability of primary and secondary reference measurement procedures and primary calibrators. Unless these are available, the traceability ends at the level of the manufacturer’s calibrator or manufacturer’s procedure. Standardization in terms of consistent measurement results across manufacturers can then not be achieved.
The implementation of the traceability concept has been monitored globally by a committee (Joint Committee for Traceability in Laboratory Medicine, JCTLM) since 2002. The leading members of the committee are the International Committee of Weights and Measures (CIPM) represented by the International Bureau of Weights and Measures (BIPM) in Paris, the International Federation of Clinical Chemistry and Laboratory Medicine (IFCC), and the International Laboratory Accreditation Cooperation (ILAC).
The JCTLM receives applications for the listing of reference materials, reference measurement procedures, and services by reference laboratories. Following a review by expert subcommittees with respect to compliance with the requirements of ISO standards 15193, 15194 and 15195, an up-to-date listing is published annually on the JCTLM’s website. Laboratories that apply for a listing of the services they offer (quantities) must be accredited to ISO 15195. They are also required to participate regularly in an external quality assurance program for reference laboratories.
The EU Directive on In Vitro Diagnostic Medical Devices (98/79/EC) stipulates: the traceability of values assigned to calibrators and/or control materials must be assured through available reference measurement procedures and/or available reference materials of a higher order.
This requirement applies equally to the following two groups involved with in vitro diagnostic medical devices:
- The manufacturers of in vitro diagnostic devices
- The organizers of external quality control programs.
Since 1987, the directive of the German Medical Association has required the use of reference method values as target values in external quality control for a multitude of quantities. Since then, the consistency of the results obtained with test procedures from different manufacturers has improved continuously.
This must be seen as a positive result of the application of the traceability concept for the standardization of clinical chemistry methods of analysis.
For internal quality assurance of quantitative determinations, control samples are added to the series of patient samples as “random samples”.
If the true target values are available for the control specimens, the routine result for an analyte contained within the control sample may be compared to its target value (control of trueness). In the case of repeated determinations of the analyte in samples of the same control specimen (e.g., in every analytical series) the variation of results can be calculated given an adequately large number of single results (control of precision). If the lack of trueness and the lack of precision of the control measurements are still within predetermined limits, it can be presumed that the results measured in the patient samples also fulfill the requirements set by the quality assurance standards.
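The two checks described above can be sketched in a few lines. This is only an illustration, assuming that precision is expressed as the coefficient of variation of repeated control results and trueness as the relative deviation of their mean from the target value; the control results, the target value, and the limits are hypothetical.

```python
import statistics


def control_statistics(results, target):
    """Return mean, CV (%) and relative bias (%) of control results."""
    mean = statistics.mean(results)
    sd = statistics.stdev(results)          # sample standard deviation
    cv_percent = 100 * sd / mean            # control of precision
    bias_percent = 100 * (mean - target) / target  # control of trueness
    return mean, cv_percent, bias_percent


def within_limits(cv_percent, bias_percent, max_cv, max_bias):
    """Check both values against predetermined (agreed) limits."""
    return cv_percent <= max_cv and abs(bias_percent) <= max_bias


# Hypothetical control results from five analytical series
results = [101.0, 103.0, 102.0, 104.0, 100.0]
target = 100.0

mean, cv, bias = control_statistics(results, target)
print(f"mean = {mean:.1f}, CV = {cv:.2f} %, bias = {bias:+.2f} %")
print("acceptable:", within_limits(cv, bias, max_cv=2.0, max_bias=3.0))
```

In practice a much larger number of single results would be needed before the calculated variation can be regarded as reliable.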
A biological specimen to be used as a control sample for the purposes of quality assurance should meet the following criteria:
- It must be homogeneous within a lot in order to exclude sample-to-sample differences
- A lot should be large enough so that one-time expenses (e.g., determination of target values, stability checks) are reduced to a minimum
- The material should be stable enough to allow for prolonged storage without showing any alterations
- The general characteristics of the control specimen and those of the patient samples should not differ from each other (commutability).
The combination of the latter two requirements represents an essential problem of a system using control samples. The virtually indispensable requirement for stability of control material usually means that it has to be more or less altered in comparison to the usually unstable native specimen. The problems arising from this dilemma tend to remain within tolerable limits in the case of some control materials (e.g., serum).
In the case of other control materials, large difficulties may arise given the current state of the art in the preparation of control samples. For instance, highly specialized analytical systems designed to differentiate between denatured and normal blood cells will hardly be capable of correctly counting fixed blood cells contained in a control blood sample.
For the purpose of external and internal quality assurance, the terms listed in are used in conjunction with target values and location parameters. It follows from the requirements of the In Vitro Diagnostic Medical Devices Directive and the Directive of the German Medical Association that, whenever possible, reference method values should be used as target values in internal and external quality control.
The error of measurement of an analytical result consists of systematic and random components of error. Numerical values can be assigned to both the error of measurement as well as the systematic and random components of error. Refer to .
The accuracy of a measurement value depends on the trueness and the precision of the method of measurement. No numerical values can be assigned to these terms.
Accuracy, trueness (accuracy of the mean) and precision have a similar relationship to each other as inaccuracy (error of measurement), inaccuracy of the mean (systematic error), and imprecision (random error); these terms have accordingly antithetic meaning.
Because the true value is not determinable, for the practical purposes of quality assurance it is substituted by a defined true value (target value), e.g., a reference method value or – if the latter is not available – a method-dependent assigned value. Under such circumstances, the term “conventionally true value” is used. The terms “conventional bias” and “conventional inaccuracy” are derived from it.
The relationships between measurement values, expectation values and target values (reference method or method dependent assigned values) are shown in
Statistical tools are used in the quality assurance of quantitative determinations. Refer to:
The goal of quality assurance of quantitative determinations as part of laboratory diagnostic evaluations is, on the one hand, to determine how widely the measurement values vary due to random errors (control of precision) and, on the other hand, to check the extent of systematic errors if the necessary prerequisites are in place (control of trueness).
Insofar as the terms trueness and accuracy are used here in the context of applied quality assurance, they always refer to the conventional trueness or accuracy.
Essential components of quality assurance are internal laboratory controls of precision and trueness as well as external quality assurance as accomplished by means of interlaboratory surveys.
In an interlaboratory survey, two samples with different concentrations of the analyte are usually sent out to the participants. The results of all participating laboratories can be summarized in so-called Youden diagrams.
Each point in a Youden diagram represents both results from a given laboratory: the value for sample A is read off the abscissa and that for sample B off the ordinate. A laboratory whose two measurement results both coincide with the target values will have its representing point situated right in the middle of the diagram. Frequently the results are distributed in the shape of an elliptical cloud surrounding the diagonal taking a course from the lower left to the upper right. This correlates to a predominantly systematic deviation of both results towards either higher or lower values.
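How a single laboratory's pair of results maps onto a point in such a diagram can be sketched as follows. The function names and the example values are hypothetical. The reading of same-direction deviations as predominantly systematic follows the text above; the reading of opposite-direction deviations as predominantly random is the usual complementary interpretation and is an assumption here.

```python
def youden_point(result_a, result_b, target_a, target_b):
    """Relative deviations (%) of both results from their target values.

    The pair (dev_a, dev_b) is the laboratory's position relative to
    the center of the diagram, where both targets are met.
    """
    dev_a = 100 * (result_a - target_a) / target_a
    dev_b = 100 * (result_b - target_b) / target_b
    return dev_a, dev_b


def deviation_pattern(dev_a, dev_b):
    """Rough classification by the signs of the two deviations."""
    if dev_a * dev_b > 0:
        # Point lies along the lower-left to upper-right diagonal
        return "same direction: predominantly systematic deviation"
    if dev_a * dev_b < 0:
        return "opposite directions: predominantly random deviation"
    return "at least one result coincides with its target"


# Hypothetical laboratory: both results about 10 % too high
dev_a, dev_b = youden_point(110.0, 220.0, target_a=100.0, target_b=200.0)
print(dev_a, dev_b, deviation_pattern(dev_a, dev_b))
```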
The Youden diagram shows the results of an interlaboratory survey for serum creatinine, whose correct determination still presents problems with the currently available routine methods. For comparison, all results obtained with the Jaffé method are highlighted as black dots in the left panel, and all results obtained with enzymatic methods are highlighted as black dots in the right panel. The directive of the German Medical Association requires the use of a reference method value as the target value.
Interpretation shows that the individual results (i) vary widely and are partly outside the control limits when the Jaffé method is used, and (ii) correlate much better with the reference method target values with only few results outside the control limits when the enzymatic methods are used.
Similarly, the website of the Reference Institute for Bioanalytics allows the participants’ results to be displayed both by method principle and grouped by the manufacturer’s test kit used.
In this way the reliability of commercial test methods can be shown (coincidence with target values, variation of results from different laboratories).
Except for random coincidences, the measurement value of an analytical determination always shows more or less extensive random and systematic deviations from the target value.
The extent of the random errors in an analytical system can be calculated as part of the control of precision; the extent of systematic errors of measurement with regard to the target value can only be ascertained using trueness control procedures. Unknown discrepancies between the target value and the true value are not considered here. The obvious goal of analysis is for these errors of measurement to remain within acceptable limits.
The definition of limits for an acceptable scatter and an acceptable systematic error of measurement with regard to analytical results is only possible as per agreement. Two steps are necessary for such an agreement:
- A basic standard must be selected which is reasonably related to the desirable analytical precision and trueness
- The basic standard must be converted by means of agreed upon factors or formulas into applicable criteria of acceptance.
The following two examples should help in understanding the principle of this concept:
1. Let the reference interval for serum chloride be 98–108 mmol/L (103 mmol/L ± 5%). A permissible relative error of measurement of 6%, for instance, would imply that a measurement result in the pathologically low range (e.g., 97 mmol/L), just like one in the pathologically high range (e.g., 109 mmol/L), would have to be considered adequately correct if the actual concentration were 103 mmol/L. The tolerable error of measurement for serum chloride would therefore have to be set lower.
2. Let the reference interval for serum urea be 10–50 mg/dL (30 mg/dL ± 67%). In this case, even measurement results with errors or deviations of, for instance, 20% could lead to erroneous interpretation only if the actual urea concentration is close to the limits of the reference interval. The tolerable error of measurement for urea can therefore be set more generously than for chloride.
The examples demonstrate that it can make sense for the purpose of quality assurance to use the reference interval of a quantity as the basic standard for the criteria of acceptance.
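The arithmetic behind the two examples can be checked with a short sketch. The function names are hypothetical; the intervals, concentrations, and relative errors are those used in the examples, and the third case (a urea concentration of 48 mg/dL) is an added illustration of a value close to the limit of the reference interval.

```python
def result_range(true_value, rel_error_percent):
    """Lowest and highest result possible for a given relative error."""
    delta = true_value * rel_error_percent / 100
    return true_value - delta, true_value + delta


def can_leave_interval(true_value, rel_error_percent, low, high):
    """True if the error alone can push a result outside [low, high]."""
    lowest, highest = result_range(true_value, rel_error_percent)
    return lowest < low or highest > high


# Chloride: 103 mmol/L with 6 % error covers about 96.8 to 109.2 mmol/L
print(can_leave_interval(103, 6, low=98, high=108))   # True

# Urea: 30 mg/dL with 20 % error covers 24 to 36 mg/dL
print(can_leave_interval(30, 20, low=10, high=50))    # False

# Urea close to the upper limit: 48 mg/dL with 20 % error
print(can_leave_interval(48, 20, low=10, high=50))    # True
```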
To convert this basic standard into applicable criteria of acceptance it has been postulated that:
- For the control of precision, the analytical scatter, expressed as the coefficient of variation, may amount to at most 1/12 of the width of the reference interval, expressed as a percentage of its mean.
- For the “control of trueness” (more precisely, control of accuracy), the relative error or deviation of a measurement value from the target value may amount to at most 1/4 of the width of the reference interval, expressed as a percentage of its mean.
For serum chloride whose width of reference interval amounts to approximately 10% of its mean it thus follows that:
- The CV of the precision control measurement values would have to be less than approximately 0.83% (1/12 of 10%).
- A measurement value for the “control of accuracy” would be allowed to deviate from the target value by maximally 2.5%.
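The conversion of a reference interval into criteria of acceptance can be written out as a small calculation. This is a sketch of the postulated 1/12 and 1/4 rules only; the function name is hypothetical, and the limits actually laid down in directives may differ because of the compromises mentioned in the text.

```python
def acceptance_limits(low, high):
    """Derive acceptance criteria from a reference interval.

    Returns the width of the interval as a percentage of its mean,
    the maximum CV (1/12 of that width) and the maximum relative
    deviation of a single value from the target (1/4 of that width).
    """
    mean = (low + high) / 2
    width_percent = 100 * (high - low) / mean
    max_cv = width_percent / 12
    max_deviation = width_percent / 4
    return width_percent, max_cv, max_deviation


# Serum chloride, 98-108 mmol/L: width about 9.7 % of the mean,
# i.e. roughly the 10 % used in the text
print(acceptance_limits(98, 108))

# Serum urea, 10-50 mg/dL: width about 133 % of the mean
print(acceptance_limits(10, 50))
```

With the rounded width of 10% used for chloride, the rules give 10%/12, or about 0.83%, for the CV and 10%/4 = 2.5% for the deviation of a single value.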
This concept formed the basis for the criteria of acceptance defined in Part B1 of the directive of the German Medical Association, notwithstanding some compromises that were agreed upon on account of the still suboptimal state of the art concerning analytical procedures.
Besides the reference interval other approaches for defining a basic standard for criteria of acceptance were also discussed:
- The demands placed on the reliability of analytical results by clinicians
- The (limited) performance capability of current analytical procedures
- The intraindividual and interindividual biological variation of a quantity.
As part of the intensive efforts to reach an internationally acceptable agreement on the important question of criteria of acceptance, the biological variation is currently almost the only area of agreement.
Despite this agreement concerning the basic standard, a final binding settlement should not be anticipated in the near future.
So far, this discussion has not considered essential aspects of the criteria of acceptance used in conjunction with the control of trueness, e.g.:
- The difference in quality of the target values. If the target value is a reference method value, adherence to a defined tolerable error of measurement may be much more problematic than if the target value represents a specific target value used in conjunction with a routine method.
- The dependence of errors of measurement on the concentration. For many quantities, relative errors of measurement become significantly larger as the concentration decreases. Thus, for instance, when glucose is determined in the hypoglycemic range using customary routine methods, it is impossible to adhere to a tolerable error of measurement that is defined as a fixed percentage valid across the entire range of measurement.
A laboratory director can generally be expected to make use of all the available means of quality assurance in order to ensure the reliability of the results obtained at his laboratory and to meet the relevant medical requirements. For most analyses there are no regulations prescribing the extent of any such measure.
The German Medical Association has published a directive for some areas of laboratory medicine that defines a minimum quality assurance program for a number of analyses. It comprises:
- A part A, which describes the “General Requirements for Quality Assurance of Medical Laboratory Examinations”. Many of the requirements stated there correspond to those of ISO standard 15189.
- A subpart B1, which deals with “quantitative analyses in medical laboratories”. This part contains specific regulations for internal and external quality assurance, including a table listing the control limits for numerous quantities. The requirements are summarized in .
Further subparts, B2 through Bx, comprising the requirements for qualitative analyses, pathogens, and spermatologic investigations are in development.
1. International Organization for Standardization. ISO 17511. In vitro diagnostic medical devices – Measurement of quantities in biological samples – Metrological traceability of values assigned to calibrators and control materials. ISO, 2003.
3. International Organization for Standardization. ISO 15193. In vitro diagnostic medical devices – Measurement of quantities in samples of biological origin – Requirements for content and presentation of reference measurement procedures. ISO, 2009.
4. International Organization for Standardization. ISO 15194. In vitro diagnostic medical devices – Measurement of quantities in samples of biological origin – Requirements for certified reference materials and the content of supporting documentation. ISO, 2009.
Types of errors
[Table: position in the row of the value, listed for a number (N) of values of N = 50, N = 100, N = 150, and N = 287]
Remark: The preference for listing the 16th (16%) and 84th (84%) percentiles is based on the analogy of the (84% – 16%) range to the x̄ ± 1s range, which for normally distributed values also contains 68% of the values.
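The analogy stated in the remark can be verified numerically for the standard normal distribution. The following sketch uses only the `NormalDist` class of the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Proportion of values between -1 s and +1 s (about 68 %)
within_1s = standard_normal.cdf(1.0) - standard_normal.cdf(-1.0)

# Positions of the 16th and 84th percentiles in units of s
# (close to -1 and +1)
p16 = standard_normal.inv_cdf(0.16)
p84 = standard_normal.inv_cdf(0.84)

print(f"within +/- 1 s: {100 * within_1s:.1f} %")
print(f"16th percentile: {p16:+.3f} s, 84th percentile: {p84:+.3f} s")
```

For a distribution that is not normal, the percentiles still enclose 68% of the values, whereas the x̄ ± 1s range need not, which is why the percentiles are the more robust description.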
Figure 50-2 Measurement values (MV) of two laboratories for a parameter in a given sample using two different routine methods. The relationships between location parameters, target values and measurement values are shown in .
TV, true value (not determinable); RMV, reference method value; AV, assigned value (method-dependent); EV, expectation value; MVX, individual measurement value with random errors
– A–D = Absolute bias (not determinable)
– A–E = Absolute inaccuracy (not determinable)
– B–D or C–D = Conventional bias with regard to the target value (RMV or AV)
– B–E or C–E = Conventional inaccuracy with regard to the target value (RMV or AV)
Each component of the error of measurement may be preceded by a negative or positive sign.
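The relationships in the legend above can be expressed as a simple decomposition of the error of a single measurement value. This is a sketch with hypothetical values; the sign convention (value minus target) is an assumption, and the target may be either a reference method value or a method-dependent assigned value.

```python
def error_components(measurement, expectation, target):
    """Split the conventional inaccuracy of one measurement value.

    systematic: conventional bias, expectation value minus target
    random:     deviation of the single value from the expectation value
    total:      conventional inaccuracy, measurement value minus target
    """
    systematic = expectation - target
    random_part = measurement - expectation
    total = measurement - target
    return systematic, random_part, total


# Hypothetical values: target 100, expectation value 104, single result 108
systematic, random_part, total = error_components(108.0, 104.0, 100.0)
print(systematic, random_part, total)
```

Each component may be positive or negative, so systematic and random parts can partly cancel in an individual result.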
Ideal form of the frequency distribution (density function) of a normally distributed sample of values with a listing of the relative proportion (%) of all values positioned in the segments –1 s to +1 s as well as below –1 s and above +1 s.
x = Rank of the values, e.g. of concentrations
x̄ = Mean
s = Standard deviation
N = Number of values
Figure 50-6 Example of an empirical frequency distribution curve; not normally distributed sample of values with a listing of the position of the mean (x̄) and the standard deviation (s) as well as the median (M) plus the 16th and 84th percentiles (%).
x = Rank of the values, e.g. of concentrations
N = Number of values
Figure 50-7 Youden diagram showing the serum creatinine results of an interlaboratory survey conducted by the Reference Institute for Bioanalytics. Each point in these diagrams represents both results from a given participant; the value for sample A is read off the abscissa and that for sample B off the ordinate. In the diagram on the left, all results (538 of 711) from participants who used the Jaffé method are highlighted as dark points. On the right side, all results from participants who used enzymatic methods (59 of 711) are shown as dark points.