What is "Adjusted" r-squared?

Linear regression is a common tool that the pharmacokineticist uses to calculate elimination rate constants. Standard linear regression provides estimates for the slope, intercept, and r², a statistic that helps define goodness of fit. Statistical texts define r² as the coefficient of determination and it is calculated using the following equation:

$r^2=1-\frac{SS_{residuals}}{SS_{total}}$

where SS = the sum of squares for either the residuals or the total (original data). As the residuals get smaller, r² gets larger to a maximum value of 1.
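The equation above can be sketched in a few lines of Python. This is a minimal illustration, not a method from any particular software package: the function name `r_squared` and the data values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residuals / SS_total."""
    mean_obs = sum(observed) / len(observed)
    # Sum of squared differences between observations and model predictions
    ss_residuals = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    # Sum of squared differences between observations and their mean
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_residuals / ss_total


# Hypothetical values, for illustration only
observed = [2.0, 4.0, 6.0, 8.0]
perfect = [2.0, 4.0, 6.0, 8.0]   # predictions that match every observation
rough = [2.5, 3.5, 6.5, 7.5]     # predictions that miss each point by 0.5

r2_perfect = r_squared(observed, perfect)
r2_rough = r_squared(observed, rough)
print(r2_perfect, r2_rough)
```

With zero residuals the statistic reaches its maximum of 1; in the second case SS_residuals is 1 and SS_total is 20, so r² is 0.95.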

Another way to think of r² is to consider that it expresses the fraction of the variability in Y that is explained by X, given the selected mathematical model. To put that into pharmacokinetic terms, r² defines the fraction of the variability in concentration (Y) that is explained by time (X) using a monoexponential decline equation $(C = C_0 \cdot e^{-kt})$. In practice the regression is linear in the log-transformed concentration, $\ln C = \ln C_0 - kt$, so the slope of the fitted line is $-k$.
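As a rough sketch of how an elimination rate constant might be obtained from a monoexponential decline, the example below fits a straight line to the natural log of concentration versus time and reads k off the slope. The helper `fit_line`, the sampling times, and the noise-free concentrations (generated from an assumed C₀ of 100 and k of 0.2) are all hypothetical and serve only as an illustration.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical, noise-free data generated from C = 100 * exp(-0.2 * t)
times = [2.0, 4.0, 6.0, 8.0]
conc = [100.0 * math.exp(-0.2 * t) for t in times]

# Regress ln(C) on t: ln C = ln C0 - k * t
log_conc = [math.log(c) for c in conc]
slope, intercept = fit_line(times, log_conc)

k = -slope                 # elimination rate constant
c0 = math.exp(intercept)   # back-transformed intercept
print(k, c0)
```

Because the data are noise-free, the fit recovers the assumed k and C₀ up to floating-point rounding; real concentration data would scatter around the line.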

When performing linear regression, the slope and intercept parameters are chosen to minimize SS_residuals, which for a given set of data points maximizes r² and defines the “best-fit” of the data. This accepted methodology ensures that the combination of slope and intercept parameters explains as much as possible of the variability in the concentrations observed for the selected data points. While this method is accepted for well-defined datasets, in pharmacokinetic analysis the terminal slope of a pharmacokinetic curve may include 3, 4, 5, 6, or even more data points, and the choice of how many data points to include is somewhat arbitrary. Adding data to the model often increases the r² value simply by virtue of adding data points: as SS_total increases, r² increases, even if SS_residuals does not decrease.

To address this concern, a statistic called “adjusted” r² was developed. It corrects r² for the degrees of freedom in the regression, in effect comparing SS_residuals divided by (n-2) with SS_total divided by (n-1) rather than comparing the raw sums of squares. As a result, an additional data point must improve the fit itself; it cannot raise the statistic merely by increasing SS_total. The “adjusted” r² is calculated using the following equation:

$Adjusted\;r^2 = 1-(1-r^2)*\frac{(n-1)}{(n-2)}$

where n = the number of data points used in the regression. At very large values of n, adjusted r² is essentially equal to r². However, at the small values of n that are used in pharmacokinetic analysis (e.g. <10), the adjusted r² can differ noticeably from r². For example, with 4 data points the unexplained fraction (1 − r²) is multiplied by 3/2 = 1.5, and with 5 data points by 4/3 ≈ 1.33, so an r² of 0.95 becomes an adjusted r² of 0.925 and about 0.933, respectively.
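To see how the formula behaves at different sample sizes, the short sketch below tabulates the factor (n-1)/(n-2) and the resulting adjusted r² for a fixed r² of 0.95. The function name `adjusted_r_squared` and the chosen r² value are illustrative assumptions, not taken from any particular software.

```python
def adjusted_r_squared(r2, n):
    """Adjusted r-squared for a straight-line fit (slope and intercept) to n points."""
    if n < 3:
        raise ValueError("at least 3 data points are needed")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - 2)


r2 = 0.95  # hypothetical unadjusted value, held fixed for comparison
for n in (3, 4, 5, 10, 100):
    factor = (n - 1) / (n - 2)
    print(f"n={n:3d}  factor={factor:.3f}  adjusted r2={adjusted_r_squared(r2, n):.4f}")
```

The factor is 2 at n = 3 and approaches 1 as n grows, which is why the adjusted and unadjusted statistics converge for large datasets.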

Thus the “adjustment” is related to selecting the right amount of data to include in the analysis. Because the correction depends on n, regressions that use different numbers of data points can be compared on a common footing: a high r² obtained from very few points is discounted more heavily than the same r² obtained from more points. Maximizing the adjusted r² when performing terminal slope regressions therefore selects the slope and intercept parameters together with the number of data points that supports them. Many consider adjusted r² to be the optimal criterion for selecting a terminal rate constant from pharmacokinetic data.
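One way such a selection could be sketched is to fit the last 3, 4, 5, ... points in turn and keep the regression with the largest adjusted r². The code below is a simplified illustration under stated assumptions, not the algorithm of any specific program: the function names, the tie-breaking behavior (the first maximum wins), and the data (three hypothetical absorption-phase points followed by noise-free points from C = 100·e^(-0.2t)) are all invented for the example.

```python
import math


def log_linear_fit(times, conc):
    """Least-squares fit of ln(C) versus t; returns (slope, intercept, r2)."""
    y = [math.log(c) for c in conc]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(y) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (yi - mean_y) for t, yi in zip(times, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    ss_res = sum((yi - (intercept + slope * t)) ** 2 for t, yi in zip(times, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def best_terminal_fit(times, conc, min_points=3):
    """Try the last min_points, min_points + 1, ... points; keep the highest adjusted r2."""
    best = None
    for n in range(min_points, len(times) + 1):
        slope, _, r2 = log_linear_fit(times[-n:], conc[-n:])
        adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
        if best is None or adj_r2 > best["adj_r2"]:
            best = {"n": n, "k": -slope, "adj_r2": adj_r2}
    return best


# Hypothetical profile: early points are in the absorption phase,
# the last four lie exactly on C = 100 * exp(-0.2 * t)
times = [0.5, 1.0, 2.0, 4.0, 6.0, 8.0, 12.0]
conc = [20.0, 60.0, 75.0] + [100.0 * math.exp(-0.2 * t) for t in (4.0, 6.0, 8.0, 12.0)]

best = best_terminal_fit(times, conc)
print(best)
```

In this constructed example the early points fall off the terminal line, so including them lowers the adjusted r² and the search settles on the terminal points only. With real, noisy data the choice is less clear-cut, and a practical implementation would also need a rule for near-ties.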
