Logarithmic Transformations
The linear regression model admits considerable flexibility in how variables are defined. One of the most common transformations in economics and the social sciences is to take the natural logarithm of some or all of the variables in the model. This seemingly technical decision has very concrete consequences for how the coefficients are interpreted.
Why Take Logarithms?
Many variables of interest in economics have strongly skewed distributions: a small number of observations take very large values while most of the data are concentrated at low values. Personal income is the classic example. If we plot income, the distribution shows a long right tail; if we plot its logarithm, the distribution becomes much more symmetric and closer to normal, which in turn, as we will see, brings advantages when performing statistical tests.
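The effect of the logarithm on a right-skewed variable can be illustrated with simulated data. The sketch below assumes incomes are log-normally distributed (an assumption made only for this example, with hypothetical parameters) and compares the sample skewness of income with that of its logarithm.

```python
import math
import random

random.seed(0)  # fixed seed so the illustration is reproducible

# Hypothetical incomes: log-normal, i.e. ln(income) ~ Normal(10, 1).
incomes = [random.lognormvariate(10.0, 1.0) for _ in range(10_000)]
log_incomes = [math.log(v) for v in incomes]


def skewness(values):
    """Sample skewness: the average cubed z-score (near 0 for a symmetric sample)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sum(((v - mean) / sd) ** 3 for v in values) / n


skew_levels = skewness(incomes)
skew_logs = skewness(log_incomes)
print(f"skewness of income:     {skew_levels:.2f}")
print(f"skewness of ln(income): {skew_logs:.2f}")
```

Under this assumption the skewness of income is large and positive, while the skewness of its logarithm is close to zero. Real income data need not be exactly log-normal, so the logarithm will usually reduce skewness rather than remove it.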
But the motivation is not purely statistical. In many economic phenomena, what matters is not absolute changes but proportional changes. A worker does not view a $1,000 raise the same way if their salary is $10,000 as if it is $200,000. What matters is the percentage increase. When relationships between variables operate in proportional terms, the logarithm is the natural transformation to capture them. This is because, as we will see, the difference between two logarithms approximates the proportional change, that is, the percentage change divided by 100.
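The salary example can be checked numerically. The following sketch compares the exact proportional change with the difference of logarithms for the same $1,000 raise at the two salaries mentioned above; the function names are illustrative.

```python
import math


def proportional_change(old, new):
    """Exact proportional change: (new - old) / old."""
    return (new - old) / old


def log_difference(old, new):
    """Difference of natural logarithms: ln(new) - ln(old)."""
    return math.log(new) - math.log(old)


# The same $1,000 raise at the two salaries mentioned in the text.
for salary in (10_000, 200_000):
    exact = proportional_change(salary, salary + 1_000)
    approx = log_difference(salary, salary + 1_000)
    print(f"salary {salary:>7,}: exact {exact:.4%}, log difference {approx:.4%}")
```

The raise is a 10% change at the lower salary and a 0.5% change at the higher one. The log difference is close to the exact proportional change in both cases, and closer when the change is small.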
Additionally, taking logarithms can linearize relationships that are nonlinear in levels. If the theory suggests that \(Y\) responds proportionally to proportional changes in \(X\), the relationship \(Y = A \cdot X^{\beta_1}\) is not linear in the parameters, but taking logarithms transforms it into:
\[\ln(Y) = \ln(A) + \beta_1 \ln(X)\]
which is, with intercept \(\beta_0 = \ln(A)\). This allows \(\beta_1\) to be estimated by OLS.
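As a minimal sketch of this linearization, the code below generates data from an exact power law \(Y = A \cdot X^{\beta_1}\) with hypothetical values of \(A\) and \(\beta_1\), then recovers both by a simple OLS fit of \(\ln(Y)\) on \(\ln(X)\). The `ols` helper is written out by hand for illustration and is not taken from any library.

```python
import math

A_TRUE, BETA_TRUE = 2.5, 0.7  # hypothetical values chosen for illustration

x = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
y = [A_TRUE * xi ** BETA_TRUE for xi in x]  # exact power law, no noise


def ols(x_vals, y_vals):
    """Return (intercept, slope) of the simple OLS fit of y_vals on x_vals."""
    n = len(x_vals)
    mean_x, mean_y = sum(x_vals) / n, sum(y_vals) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x_vals, y_vals))
    s_xx = sum((a - mean_x) ** 2 for a in x_vals)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


# Regress ln(Y) on ln(X): the slope estimates beta_1, the intercept ln(A).
intercept, slope = ols([math.log(v) for v in x], [math.log(v) for v in y])
a_hat, beta_hat = math.exp(intercept), slope
print(f"A estimate: {a_hat:.4f}, beta_1 estimate: {beta_hat:.4f}")
```

Because the simulated data contain no noise, the fit recovers the parameters up to floating-point error. With real data the estimates would carry sampling error.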
Note
The natural logarithm is defined only for strictly positive values. Before transforming a variable, it is necessary to verify that all observations are positive. Variables that can take the value zero (such as hours worked, number of children, or exports) require special treatment.
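The positivity check described in this note can be sketched as a small helper. The function name `log_transform` and the sample values are hypothetical. The helper only detects the problem and does not implement any of the special treatments for zeros.

```python
import math


def log_transform(values):
    """Return natural logs of `values`, refusing any non-positive entry."""
    bad = [v for v in values if v <= 0]
    if bad:
        raise ValueError(f"{len(bad)} non-positive value(s); ln is undefined for them")
    return [math.log(v) for v in values]


wages = [12.5, 18.0, 40.0]  # hypothetical, all strictly positive
hours = [40, 0, 35]         # hypothetical, contains a zero

log_wages = log_transform(wages)

try:
    log_transform(hours)
    hours_ok = True
except ValueError as err:
    hours_ok = False
    print(f"hours: {err}")
```

Failing loudly is a deliberate choice here: silently dropping the offending observations would change the sample without the analyst noticing.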
The Log-Level Model: Logarithm in the Dependent Variable
The first case is when we transform only \(Y\):
\[\ln(Y) = \beta_0 + \beta_1 X + \varepsilon\]
This model is often called semi-logarithmic or log-level. \(X\) is measured in its original units; \(Y\) is in logarithms.
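Interpretation of the Coefficient \(\beta_1\)
To understand what \(\beta_1\) measures, consider what happens when \(X\) increases by one unit, holding the error term fixed. Before the increase:
\[\ln(Y) = \beta_0 + \beta_1 X + \varepsilon\]
After the increase:
\[\ln(Y') = \beta_0 + \beta_1 (X + 1) + \varepsilon\]
The difference between the two logarithms is:
\[\ln(Y') - \ln(Y) = \ln\!\left(\frac{Y'}{Y}\right) = \beta_1\]
That is, \(Y'/Y = e^{\beta_1}\). The exact percentage change in \(Y\) associated with a one-unit increase in \(X\) is:
\[\%\Delta Y = 100 \cdot \frac{Y' - Y}{Y} = 100\,(e^{\beta_1} - 1)\]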
For small values of \(\beta_1\), the approximation \(e^{\beta_1} - 1 \approx \beta_1\) is very accurate, so it is common to report:
A one-unit increase in \(X\) is associated with an approximate change of \(100 \cdot \beta_1\) percent in \(Y\).
For example, if a regression of \(\ln(\text{wage})\) on years of education yields \(\hat\beta_1 = 0.08\), the interpretation is: one additional year of education is associated with an approximately 8% higher wage.
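The wage example can be mimicked with simulated data. This is only a sketch: the "true" parameters, the noise level, and the distribution of years of education are all hypothetical, and the `ols` helper is written by hand for illustration.

```python
import math
import random

random.seed(1)  # fixed seed for reproducibility

BETA_0, BETA_1 = 2.0, 0.08  # hypothetical "true" parameters of the simulation

# Simulated sample: years of education and ln(wage) = b0 + b1 * education + noise.
education = [random.randint(8, 20) for _ in range(5_000)]
log_wage = [BETA_0 + BETA_1 * e + random.gauss(0.0, 0.3) for e in education]


def ols(x_vals, y_vals):
    """Return (intercept, slope) of the simple OLS fit of y_vals on x_vals."""
    n = len(x_vals)
    mean_x, mean_y = sum(x_vals) / n, sum(y_vals) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x_vals, y_vals))
    s_xx = sum((a - mean_x) ** 2 for a in x_vals)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


intercept, slope = ols(education, log_wage)
approx_pct = 100 * slope                 # approximate reading: 100 * beta_1
exact_pct = 100 * (math.exp(slope) - 1)  # exact reading: 100 * (e^beta_1 - 1)
print(f"estimated slope: {slope:.4f}")
print(f"approximate effect of one more year: {approx_pct:.2f}%")
print(f"exact effect of one more year:       {exact_pct:.2f}%")
```

The estimated slope should land close to the simulated value of 0.08, and the approximate and exact readings differ only slightly at this magnitude.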
Approximation vs. Exact Value
The approximate interpretation (\(100\beta_1\%\)) is reliable when \(|\beta_1| < 0.10\) approximately. For larger coefficients it is better to report the exact effect: \(100(e^{\beta_1}-1)\%\). For example, \(\hat\beta_1 = 0.40\) implies an exact change of \(100(e^{0.40}-1) \approx 49\%\), not 40%.
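The gap between the two readings can be tabulated directly. The coefficient values below are illustrative, and the helper names are hypothetical.

```python
import math


def approx_pct(beta):
    """Approximate percentage change in Y for a one-unit increase in X."""
    return 100 * beta


def exact_pct(beta):
    """Exact percentage change in Y for a one-unit increase in X."""
    return 100 * (math.exp(beta) - 1)


print(" beta   approx %   exact %")
for beta in (0.02, 0.08, 0.10, 0.40, -0.40):
    print(f"{beta:5.2f} {approx_pct(beta):10.2f} {exact_pct(beta):9.2f}")
```

The approximation error is asymmetric: for \(\hat\beta_1 = 0.40\) the exact effect is larger than 40%, while for \(\hat\beta_1 = -0.40\) the exact decline is smaller in magnitude than 40%.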
The Level-Log Model: Logarithm in the Explanatory Variable
The second case transforms only \(X\):
\[Y = \beta_0 + \beta_1 \ln(X) + \varepsilon\]
Here \(Y\) remains in its original units and the nonlinearity is captured through the logarithm of \(X\).
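Interpretation of the Coefficient \(\beta_1\)
Using the differential, a small change \(\Delta X\) in \(X\) produces a change in \(Y\) of:
\[\Delta Y \approx \beta_1 \frac{\Delta X}{X}\]
Writing \(\Delta X / X\) in terms of the percentage change in \(X\), \(\%\Delta X = 100 \cdot \Delta X / X\):
\[\Delta Y \approx \frac{\beta_1}{100} \cdot \%\Delta X\]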
This leads to the standard interpretation:
A 1% increase in \(X\) is associated with a change of \(\beta_1 / 100\) units in \(Y\).
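For large changes in \(X\), it is better to use the exact difference in logarithms. If \(X\) doubles, \(\ln(X)\) increases by \(\ln 2 \approx 0.69\), so:
\[\Delta Y = \beta_1 \ln 2 \approx 0.69\,\beta_1\]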
This model is useful when the relationship exhibits diminishing returns: each additional unit of \(X\) contributes less than the previous one. For example, in the relationship between city size (in inhabitants) and average wages: moving from 10,000 to 100,000 inhabitants may have a large effect, but moving from 5 million to 6 million, much less.
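The diminishing-returns pattern follows directly from the logarithm. The sketch below uses a hypothetical coefficient (not an estimate from any data) to compute the exact change in \(Y\) for a 1% increase in \(X\), for a doubling, and for the two city-size changes mentioned above.

```python
import math

BETA_1 = 500.0  # hypothetical coefficient: wage units per unit of ln(inhabitants)


def delta_y(beta_1, x_old, x_new):
    """Exact change in Y implied by Y = b0 + b1 * ln(X) when X moves from x_old to x_new."""
    return beta_1 * (math.log(x_new) - math.log(x_old))


one_percent = delta_y(BETA_1, 100.0, 101.0)          # close to BETA_1 / 100
doubling = delta_y(BETA_1, 100.0, 200.0)             # BETA_1 * ln(2)
small_to_mid = delta_y(BETA_1, 10_000, 100_000)      # BETA_1 * ln(10)
large_city = delta_y(BETA_1, 5_000_000, 6_000_000)   # BETA_1 * ln(1.2)

print(f"1% increase in X:   {one_percent:8.2f}")
print(f"doubling of X:      {doubling:8.2f}")
print(f"10,000 -> 100,000:  {small_to_mid:8.2f}")
print(f"5 million -> 6 mil: {large_city:8.2f}")
```

Only the ratio of the new to the old value of \(X\) matters, which is why adding 90,000 inhabitants to a small city has a far larger predicted effect than adding a million to a very large one.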
The Log-Log Model: Logarithm in Both Variables
The third case, known as the log-log or double-logarithmic model, transforms both \(Y\) and \(X\):
\[\ln(Y) = \beta_0 + \beta_1 \ln(X) + \varepsilon\]
Interpretation of the Coefficient \(\beta_1\): Elasticity
Using the differential:
\[\frac{\Delta Y}{Y} \approx \beta_1 \frac{\Delta X}{X}\]
In terms of percentage changes:
\[\%\Delta Y \approx \beta_1 \cdot \%\Delta X\]
The coefficient \(\beta_1\) is precisely the elasticity of \(Y\) with respect to \(X\): it measures the percentage change in \(Y\) associated with a 1% increase in \(X\).
A 1% increase in \(X\) is associated with a change of \(\beta_1\%\) in \(Y\).
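As with the log-level model, the elasticity reading is an approximation that is accurate for small changes. The sketch below computes the exact percentage change in \(Y\) implied by the log-log model and compares it with \(\beta_1 \cdot \%\Delta X\), using a hypothetical price elasticity of demand.

```python
def exact_pct_change_y(beta_1, pct_change_x):
    """Exact % change in Y when X changes by pct_change_x percent,
    for the model ln(Y) = b0 + b1 * ln(X)."""
    return 100 * ((1 + pct_change_x / 100) ** beta_1 - 1)


ELASTICITY = -1.5  # hypothetical price elasticity of demand

print("%dX   approx %dY   exact %dY")
for pct in (1, 10, 50):
    approx = ELASTICITY * pct
    exact = exact_pct_change_y(ELASTICITY, pct)
    print(f"{pct:3d} {approx:12.2f} {exact:11.2f}")
```

For a 1% change the two readings almost coincide. For large changes in \(X\) they diverge, and the exact expression is the safer one to report.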
Elasticity is a unit-free measure, which makes it easy to compare across very different contexts. That is why the log-log model is the standard in economics for estimating demand elasticities, production functions, and international trade flows, among other applications.
Some common reference values:
\(|\beta_1| < 1\): inelastic relationship — \(Y\) changes by a smaller proportion than \(X\)
\(|\beta_1| = 1\): unit elasticity
\(|\beta_1| > 1\): elastic relationship — \(Y\) changes by a larger proportion than \(X\)
Comparative Summary
| Model | Equation | Interpretation of \(\hat\beta_1\) |
|---|---|---|
| Level-level | \(Y = \beta_0 + \beta_1 X + \varepsilon\) | \(\Delta X = 1 \Rightarrow \Delta Y = \hat\beta_1\) (units) |
| Log-level | \(\ln(Y) = \beta_0 + \beta_1 X + \varepsilon\) | \(\Delta X = 1 \Rightarrow \%\Delta Y \approx 100\hat\beta_1\) |
| Level-log | \(Y = \beta_0 + \beta_1 \ln(X) + \varepsilon\) | \(\%\Delta X = 1 \Rightarrow \Delta Y \approx \hat\beta_1/100\) (units) |
| Log-log | \(\ln(Y) = \beta_0 + \beta_1 \ln(X) + \varepsilon\) | \(\%\Delta X = 1 \Rightarrow \%\Delta Y \approx \hat\beta_1\) (elasticity) |
The choice among these models should not be based solely on which produces the highest \(R^2\); in any case, \(R^2\) values are not directly comparable between models whose dependent variables differ (\(Y\) versus \(\ln(Y)\)). The correct approach is to be guided by economic theory or by a graphical inspection of the data. In many cases, the research question itself suggests which scale is most natural: if we are interested in the effect in levels, we use the level-level model; if we are interested in the percentage effect, taking the logarithm of \(Y\) is the appropriate choice.
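To see the four interpretations side by side, the sketch below fits all four specifications to the same simulated sample. The data are generated from a log-log model with a hypothetical elasticity of 0.5, so only the log-log fit matches the data-generating process; the other three are shown purely to illustrate how each slope is read. The `ols_slope` helper is written by hand for illustration.

```python
import math
import random

random.seed(2)  # fixed seed for reproducibility

# Hypothetical data generated from a log-log model with elasticity 0.5.
x = [random.uniform(1.0, 100.0) for _ in range(2_000)]
y = [math.exp(1.0 + 0.5 * math.log(xi) + random.gauss(0.0, 0.2)) for xi in x]
ln_x = [math.log(v) for v in x]
ln_y = [math.log(v) for v in y]


def ols_slope(x_vals, y_vals):
    """Slope of the simple OLS fit of y_vals on x_vals."""
    n = len(x_vals)
    mean_x, mean_y = sum(x_vals) / n, sum(y_vals) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x_vals, y_vals))
    s_xx = sum((a - mean_x) ** 2 for a in x_vals)
    return s_xy / s_xx


b_level_level = ols_slope(x, y)
b_log_level = ols_slope(x, ln_y)
b_level_log = ols_slope(ln_x, y)
b_log_log = ols_slope(ln_x, ln_y)

print(f"level-level: +1 unit of X -> {b_level_level:+.4f} units of Y")
print(f"log-level:   +1 unit of X -> about {100 * b_log_level:+.2f}% in Y")
print(f"level-log:   +1% in X     -> about {b_level_log / 100:+.4f} units of Y")
print(f"log-log:     +1% in X     -> about {b_log_log:+.3f}% in Y (elasticity)")
```

The four slopes have different units and magnitudes even though they describe the same data, which is why they cannot be compared without translating each through its own interpretation.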