[R] Correct interpretation of a regression coefficient
Andrew Robinson
apro using unimelb.edu.au
Mon Mar 9 22:54:43 CET 2026
Hi Peter,
hopefully this clarifies.
## Here's a light-touch example
set.seed(8675309)
example <- data.frame(x1 = rnorm(100),
                      x2 = rnorm(100))
example$x3 <- example$x1 - 1
example$x4 <- example$x2 - 1
example$y <- with(example,
                  2 * x1 + 4 * x2 + x1 * x2 + rnorm(100) * 2)
## In the following code, the statistical information about the
## interaction term is the same across the two scalings
summary(lm(y ~ x1 * x2, data = example))
summary(lm(y ~ x3 * x4, data = example))
## In the following code, the statistical information about the
## interaction term is not the same across the two scalings
summary(lm(y ~ x1 + x1:x2, data = example))
summary(lm(y ~ x3 + x3:x4, data = example))
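
To compare the interaction rows side by side rather than scanning four summaries, here is a minimal sketch. It regenerates the same data so that it runs on its own; the helper name interaction_row is hypothetical, introduced only for illustration.

```r
## Regenerate the data from the example above so this block is self-contained
set.seed(8675309)
example <- data.frame(x1 = rnorm(100),
                      x2 = rnorm(100))
example$x3 <- example$x1 - 1
example$x4 <- example$x2 - 1
example$y <- with(example,
                  2 * x1 + 4 * x2 + x1 * x2 + rnorm(100) * 2)

## Hypothetical helper: fit the model and return the coefficient-table row
## (estimate, standard error, t value, p-value) for the named term
interaction_row <- function(formula, term) {
  coef(summary(lm(formula, data = example)))[term, ]
}

## Both main effects present
full <- rbind(unshifted = interaction_row(y ~ x1 * x2, "x1:x2"),
              shifted   = interaction_row(y ~ x3 * x4, "x3:x4"))

## Main effect of x2 (or x4) omitted
reduced <- rbind(unshifted = interaction_row(y ~ x1 + x1:x2, "x1:x2"),
                 shifted   = interaction_row(y ~ x3 + x3:x4, "x3:x4"))

full
reduced

## Numerical check of whether the two rows agree in each case
all.equal(full["unshifted", ], full["shifted", ])
isTRUE(all.equal(reduced["unshifted", ], reduced["shifted", ]))
```

If the claim holds, the two rows of full agree to numerical tolerance and the two rows of reduced do not.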
NB: this obscure fact was published in Robinson, A.P., Pocewicz, A.L., Gessler, P.E., 2004. A cautionary note on scaling variables that appear only in products in ordinary least squares. Forest Biometry, Modelling and Information Sciences 1, 83–90. I first submitted it to Remote Sensing of Environment (a field in which failing to respect strong hierarchy is most pernicious). R1 said it was completely obvious that failing to respect strong hierarchy was a stupid idea: reject. R2 said they had never heard of this, so it could not possibly be true: reject.
I'm not sure if it's similar to language independence .... ? Interesting conjecture! Can you unpack that a little?
Cheers,
Andrew
--
Andrew Robinson
Director, CEBRA and Professor of Biosecurity,
School/s of BioSciences and Mathematics & Statistics
University of Melbourne, VIC 3010 Australia
Tel: (+61) 0403 138 955
Email: apro using unimelb.edu.au
Website: https://researchers.ms.unimelb.edu.au/~apro@unimelb/
I acknowledge the Traditional Owners of the land I inhabit, and pay my respects to their Elders.
On Mar 9, 2026 at 21:04 +1100, Peter Dalgaard <pdalgd using gmail.com>, wrote:
Example?
Is this similar to language independence getting lost under similar circumstances because e.g. Ja/Nej in Danish sorts opposite to Yes/No?
-pd
On 9 Mar 2026, at 10.34, Andrew Robinson <apro using unimelb.edu.au> wrote:
Curiously enough, scale independence is lost in models that lack Nelder’s strong heredity (e.g., an interaction is included without its main effects).
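
One way to see why, as a sketch under stated assumptions: shift each predictor by one unit, writing u_1 = x_1 - 1 and u_2 = x_2 - 1, and expand a linear predictor that contains x_1 and the product x_1 x_2 but no x_2 main effect. The coefficient symbols are illustrative notation, not taken from any reference.

```latex
% Assumed shift: u_1 = x_1 - 1, u_2 = x_2 - 1
\begin{align*}
x_1 x_2 &= (u_1 + 1)(u_2 + 1) = u_1 u_2 + u_1 + u_2 + 1, \\
\beta_0 + \beta_1 x_1 + \beta_{12} x_1 x_2
  &= (\beta_0 + \beta_1 + \beta_{12})
   + (\beta_1 + \beta_{12})\, u_1
   + \beta_{12}\, u_2
   + \beta_{12}\, u_1 u_2 .
\end{align*}
```

The right-hand side needs a u_2 term, which a model containing only u_1 and u_1 u_2 does not have, so the shifted and unshifted fits are different models rather than reparameterisations of one another. When both main effects are present, the extra terms are absorbed into the intercept and main-effect coefficients, and the interaction coefficient is unchanged.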
Cheers,
Andrew
On 9 Mar 2026 at 8:13 PM +1100, Peter Dalgaard <pdalgd using gmail.com>, wrote:
> Sometimes it is just a matter of units: If you change the predictor from millimeter to meter, then the regression coefficient automatically scales up by a factor of 1000. The fit should be the same mathematically, although sometimes very extreme scale differences confuse the numerical algorithms. E.g. the design matrix can be declared singular even though it isn't.
>
> (Scale differences have to be pretty extreme to affect OLS, though. More common is that nonlinear methods are impacted via convergence criteria or numerical derivatives.)
>
> -pd
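
A minimal sketch of the units point, assuming simulated data: the variable names length_mm and length_m, the sample size, and the true slope are all made up for illustration. It fits the same regression with the predictor recorded in millimetres and in metres, then compares slopes, t statistics, and fitted values.

```r
## Simulated data; every number here is an illustrative assumption
set.seed(1)
length_mm <- runif(50, min = 100, max = 2000)   # predictor in millimetres
response  <- 3 + 0.002 * length_mm + rnorm(50, sd = 0.5)
length_m  <- length_mm / 1000                   # same predictor in metres

fit_mm <- lm(response ~ length_mm)
fit_m  <- lm(response ~ length_m)

## The slopes differ by exactly the unit conversion factor
coef(fit_mm)["length_mm"] * 1000
coef(fit_m)["length_m"]

## The t statistic and the fitted values do not depend on the unit
coef(summary(fit_mm))["length_mm", "t value"]
coef(summary(fit_m))["length_m", "t value"]
all.equal(fitted(fit_mm), fitted(fit_m))
```

The per-millimetre slope looks tiny, yet its t statistic is the same as that of the per-metre slope, which is why a very small coefficient can still be clearly significant.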
>
>> On 8 Mar 2026, at 19.15, Brian Smith <briansmith199312 using gmail.com> wrote:
>>
>> Hi Michael,
>>
>> You made an interesting point: the scale of the underlying variable
>> may be vastly different from that of the other variables in the
>> equation.
>>
>> Could I use the logarithm of that variable instead of the raw value?
>> Another possibility is to standardise that variable. But IMO, for
>> out-of-sample prediction, the interpretation of standardisation is not
>> straightforward.
>>
>> On Sun, 8 Mar 2026 at 23:05, Michael Dewey <lists using dewey.myzen.co.uk> wrote:
>>> >
>>> > Dear Brian
>>> >
>>> > You have not given us much to go on here, but the problem is often
>>> > related to the scale of the variables. So if the coefficient is per year,
>>> > try re-expressing time in months or weeks or days.
>>> >
>>> > Michael
>>> >
>>> > On 08/03/2026 11:50, Brian Smith wrote:
>>>> >> Hi,
>>>> >>
>>>> >> My question is not directly related to R, but rather a basic question
>>>> >> about statistics. I am hoping to receive valuable insights from the
>>>> >> expert statisticians in this group.
>>>> >>
>>>> >> In some cases, when fitting a simple OLS regression, I obtain an
>>>> >> estimated beta coefficient that is very small—for example, 0.00034—yet
>>>> >> it still appears statistically significant based on the p-value.
>>>> >>
>>>> >> I am trying to understand how to interpret such a result in practical
>>>> >> terms. From a magnitude perspective, such a small coefficient would
>>>> >> not be expected to meaningfully affect the predicted response value,
>>>> >> but statistically it is still considered significant.
>>>> >>
>>>> >> I would greatly appreciate any insights or explanations regarding this
>>>> >> phenomenon.
>>>> >>
>>>> >> Thanks for your time.
>>>> >>
>>>> >> ______________________________________________
>>>> >> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> >> PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
>>>> >> and provide commented, minimal, self-contained, reproducible code.
>>> >
>>> > --
>>> > Michael Dewey
>>> >
>>
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd.mes using cbs.dk Priv: PDalgd using gmail.com
>