More Than You Ever Wanted to Know About Calibrations, Part 6 – Linearity and Calibration Spacing
8 May 2023
In my previous blog posts, I’ve focused mainly on the theoretical side of things, covering a lot of the math behind how calibration curves work. Starting with this post we’ll be looking more at the practical considerations of how to make a calibration, beginning with how to choose your calibration points.
The first step in choosing your calibration points is determining your calibration range. If you’re testing known samples, it’s an easy choice; just span your calibration a bit beyond the expected range of your samples. If you’re testing unknown samples, you often want to use the full linear range of your instrument to get the most sensitivity while limiting the number of sample dilutions needed. You can map out the detector range by analyzing standards at a range of different concentrations. Most detectors will have a response curve similar to what’s shown in Figure 1. The low end of the curve is flat and dominated by noise, followed by a linear range once the standard signal is high enough to overcome the noise. Finally, a saturation point is reached, and the signal begins to top out. The middle, linear range is the sweet spot to target for good calibrations.
Figure 1 – Example detector response curve, with linear range outlined in red.
Once you’ve determined the linear range, the next step is deciding on the number and spacing of calibration points. One point will make a linear curve with an intercept of 0. This may be sufficient if you are testing samples that always have the same concentration. Two points will always make a perfect linear fit, with the potential for a nonzero intercept. Three points is the minimum for a linear fit to show any error, but most methods require a minimum of five points for linear fits and six points for quadratic fits. If you have a standard mix, you may have to use more points than the minimum if the compounds have different linear ranges, extending the range to catch outliers that may be more or less sensitive. Making calibration points can be time and labor intensive, so in general I recommend using six to seven points depending on the compounds and standards available. This allows you to potentially drop points for some compounds while still having enough points to meet the minimum requirements for most methods. See your method calibration guidelines to determine when or if dropping calibration points is allowed.
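To make the point about two versus three points concrete, here is a minimal sketch using NumPy. The concentrations and responses are made-up illustration values, and `max_abs_residual` is a hypothetical helper name, not something from any method.

```python
import numpy as np

def max_abs_residual(conc, resp):
    """Largest absolute residual from an ordinary (unweighted) straight-line fit."""
    slope, intercept = np.polyfit(conc, resp, 1)
    return float(np.max(np.abs(resp - (slope * conc + intercept))))

# Made-up standards: concentrations and slightly noisy responses
conc = np.array([1.0, 5.0, 10.0])
resp = np.array([1.1, 4.6, 10.3])

# Two points: the line passes through both, so the fit cannot show any error
two_point = max_abs_residual(conc[[0, 2]], resp[[0, 2]])

# Three points: the line can no longer hit every point, so residuals appear
three_point = max_abs_residual(conc, resp)

print(f"two-point max residual:   {two_point:.2e}")
print(f"three-point max residual: {three_point:.3f}")
```

The two-point residual is zero to within floating-point rounding no matter how noisy the responses are, which is why a two-point curve says nothing about fit quality.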
The next step is determining how to space your points within the linear range. This is an area that I feel is lacking in detailed guidance, and what guidance exists is generally framed as suggestions with little to no technical justification. SW846 method 8000D section 11.4.1.3 states that you should avoid using geometric spacing on the high end since that can mask the point of detector saturation and suggests using equal spacing for the top calibration points. A best practices document from LGC Limited states that having an outlier on the high end of the curve gives a point of leverage that strongly affects the slope of the curve, which is true of unweighted curves due to the outsize influence that higher calibration points have relative to the lower points. I mentioned in a previous post on TO15A that stacking your calibration on the low end can counter the influence of higher calibration points in unweighted curves, which I recall reading somewhere but can no longer find a good reference for.
Further confusing things, EPA methods occasionally give suggestions for calibration points, or example calibrations used in the method validation. No justification is given for the selection of these calibration points, and the calibrations often run counter to the guidance given in 8000D. Method 1633, for example, has calibration points of 0.2, 0.5, 1.25, 2.5, 5, 12.5, and 62.5 ng/mL for most compounds, while OTM45 has 0.25, 0.5, 1, 2.5, 5, 20, and 100 pg/µL. Both calibrations have the high point as a large outlier and cluster points towards the low end of the curve. Neither of these methods is part of the SW846 family or covered by 8000D, but they still run counter to advice put out by the agency.
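A quick way to see the clustering is to look at the ratio between adjacent points and count how many points sit below the middle of the range. This is only an illustrative sketch; the helper names are hypothetical, and the levels are the ones quoted above.

```python
# Calibration levels as quoted from the two methods
points = {
    "Method 1633 (ng/mL)": [0.2, 0.5, 1.25, 2.5, 5, 12.5, 62.5],
    "OTM45 (pg/uL)": [0.25, 0.5, 1, 2.5, 5, 20, 100],
}

def adjacent_ratios(levels):
    """Ratio of each level to the one below it (constant for pure geometric spacing)."""
    return [round(hi / lo, 2) for lo, hi in zip(levels, levels[1:])]

def count_below_midrange(levels):
    """Number of levels below the arithmetic midpoint of the calibration range."""
    midpoint = (levels[0] + levels[-1]) / 2
    return sum(1 for c in levels if c < midpoint)

for name, levels in points.items():
    print(name)
    print("  adjacent ratios:", adjacent_ratios(levels))
    print("  points below midrange:", count_below_midrange(levels), "of", len(levels))
```

In both cases six of the seven points fall in the lower half of the range, and the last step is the largest jump (5x), which is the "high point as a large outlier" pattern described above.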
Due to the lack of technical justification and consistent guidelines, I wanted to do some testing to see if I could find a way to determine the optimal spacing of calibration points. Unfortunately, making and analyzing a number of calibrations to test this would take a lot of time, so I decided to do some simple modeling. To do this I started with the example calibration in method 1633 and made a calibration with equal spacing that covered the same range, both shown in Figure 2.
Figure 2 – 1633 method calibration (left) and equal spaced calibration (right).
From there I set my theoretical responses equal to the concentration (i.e., the cal point of 10 ng/mL has a response of 10) for a perfect fit and used Excel’s random number generator to vary the responses by ±10%. I generated 5 calibration curves with each spacing, which are shown in Figure 3 below.
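The same kind of model can be sketched outside of Excel. The version below is an assumption-laden stand-in rather than the spreadsheet used for the figures: it uses NumPy's random generator with an arbitrary seed, assumes the ±10% variation is uniformly distributed, and takes the equal spaced levels from Table 1. `simulate_fits` is a hypothetical helper name, and the numbers it prints will not match the figures.

```python
import numpy as np

METHOD_1633 = np.array([0.2, 0.5, 1.25, 2.5, 5.0, 12.5, 62.5])
EQUAL_SPACED = np.array([0.2, 10.0, 20.0, 30.0, 40.0, 50.0, 62.5])

def simulate_fits(conc, n_curves=5, rel_noise=0.10, seed=0):
    """Simulate noisy calibrations and fit each with an unweighted line.

    Ideal response equals concentration; each response is then scaled by a
    random factor between 1 - rel_noise and 1 + rel_noise (uniform: an assumption).
    Returns the responses (n_curves x n_points) and fits (n_curves x [slope, intercept]).
    """
    rng = np.random.default_rng(seed)
    factors = rng.uniform(1 - rel_noise, 1 + rel_noise, size=(n_curves, conc.size))
    responses = conc * factors
    fits = np.array([np.polyfit(conc, r, 1) for r in responses])
    return responses, fits

for name, conc in [("1633 spacing", METHOD_1633), ("equal spacing", EQUAL_SPACED)]:
    _, fits = simulate_fits(conc)
    slope_sd, intercept_sd = fits.std(axis=0, ddof=1)  # sample standard deviations
    print(f"{name}: slope SD = {slope_sd:.4f}, intercept SD = {intercept_sd:.4f}")
```

With only five curves per spacing the standard deviations move around a lot from seed to seed, so it is worth rerunning with several seeds or a larger `n_curves` before reading much into any one result.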
Figure 3 – Example calibrations built with 1633 method spacing (top) and equal spacing (bottom).
What’s interesting is how the slope and intercept are affected by the different spacing, which can be seen visually and in the standard deviation calculations for the slope and intercept on the right. By stacking the points at the low end, the method calibration has more variation in slope, but converges at the low end. This is the leverage effect referenced in the LGC document. The equal spacing calibration on the other hand has less slope variation but more in the intercept, and does not converge at the low end. Table 1 shows the calculated % error for each point, as well as the total % error for each calibration, with the average total highlighted.
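Equal-weighted % error, method calibration

| Cal point (ng/mL) | Curve 1 | Curve 2 | Curve 3 | Curve 4 | Curve 5 |
|---|---|---|---|---|---|
| 0.2 | 91.7 | 249.5 | 82.9 | 25.8 | 173.1 |
| 0.5 | 33.8 | 88.6 | 19.3 | 4.6 | 64.8 |
| 1.25 | 6.0 | 27.6 | 2.0 | 14.5 | 25.5 |
| 2.5 | 6.4 | 8.8 | 3.9 | 1.6 | 6.6 |
| 5 | 7.2 | 8.6 | 5.7 | 2.7 | 5.2 |
| 12.5 | 9.1 | 10.9 | 4.5 | 1.7 | 9.0 |
| 62.5 | 0.3 | 0.5 | 0.1 | 0.1 | 0.4 |
| Total | 154.6 | 394.5 | 118.5 | 51.1 | 284.6 |
| Average of totals | 200.6 | | | | |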

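Equal-weighted % error, equal spacing

| Cal point (ng/mL) | Curve 1 | Curve 2 | Curve 3 | Curve 4 | Curve 5 |
|---|---|---|---|---|---|
| 0.2 | 20.4 | 839.7 | 595.9 | 22.3 | 364.0 |
| 10 | 2.4 | 5.1 | 0.4 | 8.7 | 3.3 |
| 20 | 4.9 | 0.6 | 4.7 | 9.8 | 4.7 |
| 30 | 2.0 | 1.1 | 6.3 | 0.1 | 0.9 |
| 40 | 6.2 | 8.9 | 5.6 | 3.0 | 6.2 |
| 50 | 7.8 | 5.8 | 3.4 | 1.0 | 6.0 |
| 62.5 | 2.4 | 7.5 | 1.8 | 1.1 | 6.0 |
| Total | 46.2 | 868.6 | 618.0 | 46.1 | 391.3 |
| Average of totals | 394.0 | | | | |

Table 1 – % error for 1633 method and equal spaced calibrations, using linear, equal-weighted fits.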
I had mentioned in my last post that the y-intercept drives low-end error more than the slope, and we can see that in the results here. The equal spaced calibration has more variance in the intercept, and 3 out of 5 calibrations have a large % error on the low calibration point. This is what mainly drives the higher average % error, with the equal spacing having almost twice the total average error of the method calibration. This confirms what I stated in the TO15A blog about the importance of stacking low points in unweighted curves to counter the influence of the higher calibration points.
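A small worked example shows why the intercept matters so much at the low end. It assumes % error is calculated by back-calculating each standard's concentration from the curve and comparing it to the nominal value, which may not be exactly how the tables above were produced. The offsets (0.1 on the intercept, 1% on the slope) are arbitrary illustration values.

```python
import numpy as np

conc = np.array([0.2, 0.5, 1.25, 2.5, 5.0, 12.5, 62.5])
resp = conc.copy()  # ideal responses, equal to concentration

def percent_error(conc, resp, slope, intercept):
    """% error of the back-calculated concentration relative to the nominal value."""
    calculated = (resp - intercept) / slope
    return np.abs(calculated - conc) / conc * 100

# Curve with a perfect slope but an intercept that is off by 0.1 response units
intercept_off = percent_error(conc, resp, slope=1.0, intercept=0.1)

# Curve with a perfect intercept but a slope that is off by 1%
slope_off = percent_error(conc, resp, slope=1.01, intercept=0.0)

for c, e_int, e_slope in zip(conc, intercept_off, slope_off):
    print(f"{c:>6} : intercept off -> {e_int:6.2f} %   slope off -> {e_slope:5.2f} %")
```

The intercept offset is a fixed amount, so it costs 50% at the 0.2 point and only 0.16% at the 62.5 point, while the slope offset costs the same roughly 1% everywhere.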
I’ve previously stressed the importance of weighted curves, so I hope at this point you might be asking how this data looks if the curves are weighted. I’m too picky and obsessive about calibrations to ever forget this, and Figure 4 shows the same calibration responses fitted with 1/x and 1/x² weighting.
Figure 4 – Weighted curves for 1633 method (top) and equal spaced (bottom) calibrations.
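For reference, the weighting can be written as the quantity the fit minimizes. The symbols here (slope m, intercept b, weight w_i, concentration x_i, response y_i) are notation picked for this illustration:

```latex
S(m, b) = \sum_{i=1}^{n} w_i \,\bigl( y_i - (m x_i + b) \bigr)^2,
\qquad
w_i =
\begin{cases}
1 & \text{equal weighting} \\
1 / x_i & \text{$1/x$ weighting} \\
1 / x_i^{2} & \text{$1/x^2$ weighting}
\end{cases}
```

With equal weighting a 10% miss at 62.5 contributes far more to S than a 10% miss at 0.2, while with 1/x² weighting the two contribute about the same.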
The difference is obvious just from looking at the curves themselves, and you can clearly see that they all converge on the low end of the calibration. This is due to the fact that the lower calibration points now have a more equal influence on the curve, so it’s no longer dominated by the higher calibration points even when the points aren’t clustered. I would expect that this would result in less error in the low end, and the data in Table 2 verifies this.
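1/x % error, method calibration

| Cal point (ng/mL) | Curve 1 | Curve 2 | Curve 3 | Curve 4 | Curve 5 |
|---|---|---|---|---|---|
| 0.2 | 6.9 | 27.7 | 20.0 | 0.4 | 14.3 |
| 0.5 | 0.8 | 2.4 | 5.4 | 5.6 | 2.8 |
| 1.25 | 6.2 | 4.4 | 7.3 | 10.7 | 2.3 |
| 2.5 | 0.9 | 5.3 | 8.1 | 0.1 | 3.7 |
| 5 | 5.2 | 13.4 | 4.3 | 3.3 | 9.2 |
| 12.5 | 8.9 | 10.4 | 4.6 | 1.6 | 9.2 |
| 62.5 | 1.4 | 3.3 | 1.0 | 0.4 | 2.6 |
| Total | 30.4 | 67.0 | 50.6 | 22.2 | 44.2 |
| Average of totals | 42.9 | | | | |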

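1/x % error, equal spacing

| Cal point (ng/mL) | Curve 1 | Curve 2 | Curve 3 | Curve 4 | Curve 5 |
|---|---|---|---|---|---|
| 0.2 | 0.1 | 10.9 | 13.4 | 1.8 | 2.6 |
| 10 | 2.2 | 5.8 | 8.6 | 8.4 | 1.6 |
| 20 | 5.0 | 3.3 | 7.0 | 9.9 | 3.5 |
| 30 | 2.0 | 1.1 | 6.6 | 0.1 | 1.0 |
| 40 | 6.2 | 7.1 | 6.7 | 3.0 | 5.8 |
| 50 | 7.7 | 3.5 | 2.0 | 1.1 | 5.2 |
| 62.5 | 2.5 | 9.8 | 4.0 | 1.1 | 7.4 |
| Total | 25.7 | 41.4 | 48.2 | 25.3 | 27.1 |
| Average of totals | 33.5 | | | | |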

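1/x² % error, method calibration

| Cal point (ng/mL) | Curve 1 | Curve 2 | Curve 3 | Curve 4 | Curve 5 |
|---|---|---|---|---|---|
| 0.2 | 0.7 | 1.5 | 4.6 | 0.7 | 0.2 |
| 0.5 | 0.5 | 2.8 | 9.1 | 5.4 | 0.2 |
| 1.25 | 5.5 | 1.5 | 5.9 | 10.6 | 4.1 |
| 2.5 | 2.2 | 0.3 | 5.0 | 0.3 | 0.6 |
| 5 | 6.7 | 5.9 | 8.9 | 3.7 | 5.6 |
| 12.5 | 7.0 | 2.4 | 0.1 | 2.0 | 5.0 |
| 62.5 | 3.3 | 10.7 | 6.3 | 0.1 | 7.6 |
| Total | 25.7 | 25.1 | 39.9 | 22.7 | 23.4 |
| Average of totals | 27.3 | | | | |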

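1/x² % error, equal spacing

| Cal point (ng/mL) | Curve 1 | Curve 2 | Curve 3 | Curve 4 | Curve 5 |
|---|---|---|---|---|---|
| 0.2 | 0.0 | 0.1 | 0.1 | 0.1 | 0.0 |
| 10 | 2.1 | 4.0 | 6.8 | 8.1 | 1.2 |
| 20 | 5.0 | 1.5 | 4.9 | 10.1 | 4.0 |
| 30 | 2.0 | 0.7 | 4.4 | 0.2 | 0.6 |
| 40 | 6.2 | 5.2 | 9.2 | 2.7 | 5.4 |
| 50 | 7.8 | 1.6 | 0.3 | 0.8 | 4.8 |
| 62.5 | 2.5 | 11.4 | 6.4 | 1.3 | 7.9 |
| Total | 25.6 | 24.5 | 32.2 | 23.3 | 23.8 |
| Average of totals | 25.9 | | | | |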

Table 2 – % error for 1633 method and equal spaced calibrations, using linear, weighted fits.
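If you want to try the weighted fits yourself, the sketch below shows one way to do it with NumPy. It is not the spreadsheet behind Table 2: the seed and the uniform ±10% noise are assumptions, `weighted_fit` and `total_percent_error` are hypothetical helper names, and % error is assumed to be back-calculated concentration versus nominal. The one detail worth knowing is that `np.polyfit` applies its `w` argument to the unsquared residual, so a 1/x weighting is passed as 1/sqrt(x) and a 1/x² weighting as 1/x.

```python
import numpy as np

def weighted_fit(conc, resp, power):
    """Fit resp = slope * conc + intercept using weights of 1 / conc**power.

    power = 0 gives an equal-weighted fit, 1 gives 1/x, 2 gives 1/x^2.
    """
    # polyfit minimizes sum((w * residual)**2), so pass the square root of the weight
    w = conc ** (-power / 2)
    slope, intercept = np.polyfit(conc, resp, 1, w=w)
    return slope, intercept

def total_percent_error(conc, resp, slope, intercept):
    """Sum over all points of the % error in back-calculated concentration."""
    calculated = (resp - intercept) / slope
    return float(np.sum(np.abs(calculated - conc) / conc * 100))

# One simulated equal spaced calibration with +/-10% uniform noise (assumed model)
conc = np.array([0.2, 10.0, 20.0, 30.0, 40.0, 50.0, 62.5])
rng = np.random.default_rng(1)
resp = conc * rng.uniform(0.9, 1.1, size=conc.size)

for label, power in [("equal", 0), ("1/x", 1), ("1/x^2", 2)]:
    slope, intercept = weighted_fit(conc, resp, power)
    total = total_percent_error(conc, resp, slope, intercept)
    print(f"{label:>6}: slope = {slope:.4f}, intercept = {intercept:+.4f}, total % error = {total:.1f}")
```

Swapping in the 1633 levels for `conc` and looping over several seeds gives a rough way to compare spacings under each weighting.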
Two things stand out from the weighted data when compared to the unweighted data. The first is that the total % error is almost an order of magnitude lower, with the error on the low calibration points being in some cases 3 orders of magnitude lower. The second thing is that there’s not much difference in error between the method calibration and the equal spaced calibration. Since the weighting gives a more equal influence to each calibration point and each point has the same relative error (±10%), any point on the curve should contribute the same amount of error to the calibration. This means that, for calibrations that have equal relative error across the calibration range, equal weighted calibrations should always be stacked with more calibration points on the low end, but weighted curves can use whatever points you feel like making. Use dice to choose your concentrations, calibrate with only prime number concentrations, or use the Fibonacci sequence; the world is your oyster.
The keen-eyed among you will have noticed the caveat to this though. The phrase “for calibrations that have equal relative error across the calibration range” is very important, and since this is getting a bit long, the next post will dig into how this might change how you space calibration points.