Starting with the basic techniques of graphing and use of the LINEST() function then plotting multiple series and estimating errors, we now will use several Excel statistical functions to calculate the errors of our Linear Least Squares fit using the complete error formula given above.
Our simple spreadsheet has gotten quite messy by now, as initial estimates clutter our screen for comparison purposes. Here we show how to use the "true" error equation for computing the uncertainty about the regression line.
To compute this we need to use the following Excel statistical functions:
Cell A15 holds the Slope, and Cell B15 holds the Intercept of the regression line y' = slope*X + intercept, where slope and intercept are computed using LINEST(Y-array, X-array) as described in the earlier tutorials.
Cell B20 = COUNT(A7:A10). N = COUNT(X-array); here X-array is $A$7:$A$10 if stated as an absolute reference.
Cell B22=TINV(0.05,B20-2). tn-2 = TINV(0.05, N-2) where N is replaced with the cell reference for N. The 0.05 comes from the 95% Confidence Limit specified, where 1-0.95 = 0.05.
Cell C15=STEYX(B7:B10,A7:A10). Sy|x = STEYX(Y-array,X-array), the standard deviation about regression.
Cell A20 =AVERAGE(A7:A10), the Mean of X. In general X-bar = AVERAGE(X-array). In the formula this is the x with a line over the top, pronounced "x bar."
Cell A22 =STDEV(A7:A10), the Standard Deviation of X. Sx = STDEV(X-array)
In Cell B11 we find an experimentally measured absorbance for an unknown. This "Y" value is converted to a concentration "X" value in Cell B18 with the formula =(B11-$B$15)/$A$15 using our computed values for slope and intercept. X is a required value in our 95%CL calculation.
In cell C20 we find a single calculation of the 95%CL about regression. The formula is =$B$22*$C$15*SQRT(1/$B$20 + (B18-$A$20)^2/(($B$20-1)*$A$22^2)) Check these cell references against the algebraic formula to see how it was set up.
In cell C22 we find =C20/A15. Remember that given the error in "Y", we get the error in "X" by dividing the Y error in C20 by the slope of the line in A15. The number computed is the error in the concentration found by using the absorbance of the unknown to compute concentration from the linear working curve.
Cell F6=C6+$B$22*$C$15*SQRT(1/$B$20 + (A6-$A$20)^2/(($B$20-1)*$A$22^2)). It is the first in a series that computes the upper 95% confidence limit for the entire regression curve. We have included an X=0 point here to bring our graph down near the origin at (0,0). As this formula is filled down, we create all the upper limit values for our plot. These are graphed as the outer curved line on the chart shown. The lower curved line is created in a similar fashion in the G column.
For comparison we have included the "fixed error estimates" from Tutorial LLSBasic03 in columns D and E. These form the inner straight lines on the plot. This error estimate is simply the "2-sigma" t value of 1.96 multiplied by the standard deviation about regression Sy|x. Two standard deviations around any experimental value generally include about 95% of the data observed and can serve as a "quick and dirty" estimate of error.
Compare Cell C18 to Cell C22. C18 is the error estimated from a fixed t-value of 1.96. C22 is the error computed from the full 95%CL calculation. At an absorbance of 0.544 near the center of the plot, the two values are almost the same. But if you look at values near the extremes of the plot you find that the 95%CL calculations give much larger errors than the fixed t estimates. Experience has shown that these calculated 95%CL errors are probably larger than they should be, especially when the data points plotted are few, but even so it points out the dangers of extrapolating past the ends of our experimental data.
If you think this presentation was confusing, wait until you see the treatment given to LLS errors in most Analytical Chemistry textbooks. Careful analysis will show that we are all saying the same thing, however.