What is “Adjusted” r-squared?

Linear regression is a common tool that the pharmacokineticist uses to calculate elimination rate constants. Standard linear regression provides estimates for the slope, intercept, and r2, a statistic that helps define goodness of fit. Statistical texts define r2 as the coefficient of determination and it is calculated using the following equation:

r^2=1-\frac{SS_{residuals}}{SS_{total}}

where SS = the sum of squares for either the residuals (observed minus predicted values) or the total (observed values minus their mean). As the residuals get smaller, r2 gets larger, up to a maximum value of 1.

Another way to think of r2 is that it expresses the fraction of the variability in Y that is explained by X, given the selected mathematical model. In pharmacokinetic terms, r2 is the fraction of the variability in concentration (Y) that is explained by time (X) using a monoexponential decline equation (C = C_0*e^{-kt}). In practice the regression is performed on log-transformed concentration versus time, which turns the monoexponential equation into a straight line with a slope of -k.
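As an illustration of the r2 equation above, here is a minimal Python sketch that fits a straight line to ln(C) versus time and computes r2 from the two sums of squares. The function name `fit_log_linear` and the data are hypothetical, chosen only for this example; the concentrations are generated without noise from C = 100*e^{-0.1t}, so the fit should recover k = 0.1 with r2 essentially equal to 1.

```python
import math

def fit_log_linear(times, concs):
    """Least-squares fit of ln(C) versus time. Returns (slope, intercept, r2)."""
    y = [math.log(c) for c in concs]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(y) / n

    # Ordinary least-squares slope and intercept
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (yi - y_mean) for t, yi in zip(times, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean

    # r2 = 1 - SS_residuals / SS_total
    ss_res = sum((yi - (intercept + slope * t)) ** 2 for t, yi in zip(times, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    return slope, intercept, r2

# Hypothetical, noise-free data from C = 100 * exp(-0.1 * t)
times = [4, 8, 12, 24]
concs = [100 * math.exp(-0.1 * t) for t in times]

slope, intercept, r2 = fit_log_linear(times, concs)
k = -slope  # elimination rate constant is the negative of the slope
print(f"k = {k:.4f}, r2 = {r2:.4f}")
```

With real, noisy concentrations the residuals are no longer zero and r2 falls below 1.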

When performing linear regression, the slope and intercept parameters are chosen to minimize SSresiduals, which for a given set of data points is the same as maximizing r2; this defines the “best fit” of the data. This accepted methodology ensures that the combination of slope and intercept parameters explains as much as possible of the variability in the concentrations observed for the selected data points. While this method works for well-defined datasets, in pharmacokinetic analysis the terminal slope of a pharmacokinetic curve may include 3, 4, 5, 6, or even more data points, and the choice of how many datapoints to include is somewhat arbitrary. The addition of data to the model often increases the r2 value by virtue of simply adding datapoints: as SStotal increases, r2 increases, even if SSresiduals does not decrease.

To address this concern, a statistic called “adjusted” r2 was developed. This statistic applies a penalty that depends on the number of data points in the analysis, so that r2 values from regressions with different numbers of points can be compared more fairly. An additional data point therefore has to do more than simply increase SStotal to be judged an improvement. The “adjusted” r2 is calculated using the following equation:

Adjusted\;r^2 = 1-(1-r^2)*\frac{(n-1)}{(n-2)}

where n = the number of datapoints used in the regression. At very large values of n, the factor (n-1)/(n-2) approaches 1 and adjusted r2 is nearly equal to r2. However, at the small values of n that are used in pharmacokinetic analysis (e.g. <10), the adjusted r2 can differ noticeably from r2. For example, the unexplained fraction (1 - r2) is multiplied by 1.5 with 4 data points and by about 1.33 with 5 data points, so the adjustment is largest when the fewest points are used.
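The equation above is easy to evaluate directly. The following Python sketch (the function name `adjusted_r2` and the r2 value of 0.98 are assumptions made for illustration) prints the factor (n-1)/(n-2) and the resulting adjusted r2 for several values of n, holding r2 fixed.

```python
def adjusted_r2(r2, n):
    """Adjusted r2 for a straight-line fit (slope and intercept) to n points."""
    if n < 3:
        raise ValueError("at least 3 data points are needed")
    return 1 - (1 - r2) * (n - 1) / (n - 2)

# Hypothetical r2 value, held constant to isolate the effect of n
r2 = 0.98
for n in (3, 4, 5, 10, 100):
    factor = (n - 1) / (n - 2)
    print(f"n = {n:3d}  factor = {factor:.3f}  adjusted r2 = {adjusted_r2(r2, n):.4f}")
```

For the same r2, the factor is 2 at n = 3 and 1.5 at n = 4, and it moves toward 1 as n grows.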

Thus the “adjustment” is related to selecting the right amount of data to include in the analysis. Because the factor (n-1)/(n-2) is largest for regressions with few data points, a regression built on more points does not need quite as high an r2 to reach the same adjusted r2. Maximizing the adjusted r2 when performing terminal slope regressions therefore selects the set of slope and intercept parameters that best balances goodness of fit against the number of data points used. Many consider the use of adjusted r2 as the optimal method for selecting a terminal rate constant for pharmacokinetic data.
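One way to picture the selection is as a loop that starts with the last three data points, adds earlier points one at a time, and keeps the regression with the highest adjusted r2. The Python sketch below is only an illustration under stated assumptions: the function names, the minimum of three points, the simple "strictly greater" comparison, and the biexponential test data are all hypothetical. It is not a description of how any particular software implements the selection.

```python
import math

def fit_log_linear(times, concs):
    """Least-squares fit of ln(C) versus time. Returns (slope, r2)."""
    y = [math.log(c) for c in concs]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (yi - y_mean) for t, yi in zip(times, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    ss_res = sum((yi - (intercept + slope * t)) ** 2 for t, yi in zip(times, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return slope, 1 - ss_res / ss_tot

def adjusted_r2(r2, n):
    """Adjusted r2 for a straight-line fit to n points."""
    return 1 - (1 - r2) * (n - 1) / (n - 2)

def best_terminal_fit(times, concs, min_points=3):
    """Try the last 3, 4, 5, ... points and keep the highest adjusted r2."""
    best = None
    for n in range(min_points, len(times) + 1):
        slope, r2 = fit_log_linear(times[-n:], concs[-n:])
        adj = adjusted_r2(r2, n)
        if best is None or adj > best["adj_r2"]:
            best = {"n": n, "k": -slope, "r2": r2, "adj_r2": adj}
    return best

# Hypothetical noise-free biexponential profile:
# fast phase (rate 1.0) plus terminal phase (rate 0.1)
times = [0.5, 1, 2, 4, 8, 12, 24]
concs = [80 * math.exp(-1.0 * t) + 20 * math.exp(-0.1 * t) for t in times]

best = best_terminal_fit(times, concs)
print(best)
```

In this noise-free example the earlier points still carry the fast phase, so including them bends the log-linear plot and lowers the adjusted r2. With noisy data the chosen number of points can differ.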

To learn about how we’ve improved Phoenix to make performing NCA and PK/PD modeling even easier, please watch this webinar I gave on the latest enhancements to Phoenix.

Nathan Teuscher

About the Author

Dr. Nathan Teuscher is the Vice President of Pharmacometric Solutions at Certara. He is an expert in clinical pharmacology, pharmacometrics, pharmacokinetics and pharmacodynamics and was trained by David Smith at the University of Michigan. Dr. Teuscher has held leadership positions in biotechnology, pharmaceutical and contract research companies. In 2008 he established the Learn PKPD.com website to share his knowledge with the community. Prior to coming to Certara, he was the Founder and President of PK/PD Associates.