Welcome to Econometrics with Simulations#

Spanish version

Econometrics, a discipline that combines economic theory, statistics, and mathematics, can be challenging. Abstract concepts and mathematical proofs, although fundamental, can get in the way of an intuitive grasp of econometric principles. This book is born from the conviction that a good way to understand econometrics is to experiment with it.

Through interactive simulations, this book lets you visualize and experiment with econometric concepts in real time. The approach is simple: modify parameters and assumptions, observe how the results change, and build a deeper, more practical understanding of econometric methods. Each chapter includes simulations that let you “play” with the concepts, making learning more dynamic and memorable. The book is designed as a resource for both students and practicing professionals: a complement to coursework for the former, and a reference tool for those designing analysis strategies and experiments, helping them make informed decisions about methodological questions such as sample size or the inclusion of variables.

📝 Note on this preliminary edition#

This document is a preliminary version of “Econometrics with Simulations”. The chapters and dashboards marked with hyperlinks are available as a preview and demonstrate the interactive approach this project aims for. The content is under active development, with regular updates. We invite readers to explore the available simulations and follow the project’s progress.

Last update: April 2025
Status: Preliminary version 0.1

Index of Interactive Dashboards#

  1. Statistical Properties in Simple Regression

  • Illustrates the statistical properties of OLS estimators, including unbiasedness and the determinants of estimator variance.

    1.2 Inference: T-Tests in Regression

    • Explores the sensitivity of t-tests to the assumption of normally distributed errors.

    1.3 Power and Optimal Sample Size for Regression Coefficient

    • Explores the concept of statistical power in hypothesis tests on a regression coefficient, and allows determining the minimum sample size needed to detect a desired effect in a research study.

  2. Multiple Regression Analysis

    2.1 Collinearity

    • Shows the impact of multicollinearity on coefficient estimates, standard errors, and overall model interpretation.

    2.2 Omitted Variable Bias

    • Shows the biases generated by omitting a relevant variable.

  3. Asymptotics and Large Sample Properties of OLS

  • Demonstrates how OLS estimators converge in probability to the true parameter values and how their distribution approaches normality as sample size increases.

  4. Heteroscedasticity

  • Illustrates how heteroscedasticity affects the efficiency of OLS estimates, heteroscedasticity tests, and corrections.

  5. Autocorrelation

  • Shows the impact of autocorrelation on OLS estimates and illustrates tests such as Durbin-Watson.

  6. Endogeneity and Instrumental Variables

  • Explains the bias introduced by endogenous variables and demonstrates how instrumental variables (IV) can be used to obtain consistent estimates.

  7. Limited Dependent Variable Models

  • Shows the difference between linear probability models and nonlinear models such as Probit and Logit, focusing on coefficient interpretation and prediction.
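To give a flavor of what the dashboards in this first block illustrate, here is a minimal simulation of the unbiasedness of the OLS slope estimator (dashboard 1). This is a sketch of ours in Python with NumPy, not the dashboard code; the parameter values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
beta0, beta1 = 1.0, 2.0   # hypothetical true parameters
n, reps = 50, 5000        # sample size and number of simulated samples

slopes = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)
    u = rng.normal(size=n)            # errors satisfying the classical assumptions
    y = beta0 + beta1 * x + u
    # OLS slope estimate: sample cov(x, y) / sample var(x)
    slopes[r] = np.cov(x, y, bias=True)[0, 1] / x.var()

print(slopes.mean())  # averages out close to the true beta1 = 2.0
```

Individual estimates scatter around the truth, but their average across repeated samples converges to it; the dashboards let you vary `n` and the error variance to see how the spread of `slopes` responds.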

Causal Inference Methods#

  8. Randomized Controlled Trials (RCTs)

  • Shows unbiased estimation of causal effects when treatment is randomized. Varies sample size and treatment proportion. Allows estimating the required sample size.

    8.1 Estimating Interaction/Moderation Effects

    • Shows sample size requirements for interaction effects.

  9. Identification under C.I.A. (Unconfoundedness): Regression versus Random Forests

  • Compares methods for estimating causal effects that are valid under the Conditional Independence Assumption (CIA), showing when a regression estimator, and when a machine-learning estimator such as random forests, can improve estimation precision.

  10. Matching and Propensity Score Matching (PSM)

  • Shows how matching on covariates or on estimated propensity scores can reduce selection bias, and illustrates the biases that can remain under PSM.

  11. Difference-in-Differences (DiD)

  • Explores how DiD identifies causal effects under the parallel trends assumption. Visualizes violations.

  12. Panel Data Models

  • Compares fixed and random effects estimators, showing how each handles individual-specific heterogeneity.

  13. Regression Discontinuity Design (RDD)

  • Shows how local comparison around the threshold can estimate causal effects. Varies bandwidth and tests manipulation.

  14. Synthetic Control Methods

  • Builds a synthetic control and compares post-treatment results. Shows how donor group selection affects results.
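As an illustration of the logic behind this block, a minimal DiD simulation (dashboard 11) in Python with NumPy. This is our own sketch under hypothetical parameter values, not the dashboard code: the two groups differ in levels but share a common trend, which is exactly the setting where DiD recovers the treatment effect:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000                      # units per group and period
trend, effect = 1.5, 3.0      # hypothetical common trend and true treatment effect
base_t, base_c = 5.0, 2.0     # group levels differ, which DiD tolerates

pre_t  = base_t + rng.normal(size=n)
post_t = base_t + trend + effect + rng.normal(size=n)
pre_c  = base_c + rng.normal(size=n)
post_c = base_c + trend + rng.normal(size=n)

# DiD: (change over time for treated) minus (change over time for controls)
did = (post_t.mean() - pre_t.mean()) - (post_c.mean() - pre_c.mean())
print(did)  # close to the true effect 3.0
```

Replacing the common `trend` with group-specific trends breaks the parallel trends assumption and biases `did`, which is the violation the dashboard visualizes.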

Causal Inference in Practice (Technology/Industry)#

  15. A/B Testing with Covariates

  • Shows how controlling for covariates can improve power and precision. Compares regression-adjusted differences vs. raw differences.

  16. Heterogeneous Treatment Effects

  • Presents methods for estimating conditional (heterogeneous) treatment effects.

  17. Selection on Observables with High-Dimensional Covariates

  • Shows methods that incorporate regularization adapted to causal inference.
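The covariate-adjustment idea behind the A/B testing dashboard can be sketched in a few lines. This is an illustrative simulation of ours (hypothetical parameter values, Python with NumPy, not the dashboard code) comparing the raw difference in means against the regression-adjusted estimate:

```python
import numpy as np

rng = np.random.default_rng(2)
n, tau, reps = 500, 0.5, 2000   # hypothetical experiment size and true effect

raw = np.empty(reps)
adj = np.empty(reps)
for r in range(reps):
    d = rng.integers(0, 2, size=n)       # randomized treatment indicator
    x = rng.normal(size=n)               # pre-treatment covariate that drives y
    y = tau * d + 2.0 * x + rng.normal(size=n)
    raw[r] = y[d == 1].mean() - y[d == 0].mean()
    # Regression adjustment: OLS of y on [1, d, x]; keep the coefficient on d
    X = np.column_stack([np.ones(n), d, x])
    adj[r] = np.linalg.lstsq(X, y, rcond=None)[0][1]

# Both estimators are centered on tau, but adjusting for x shrinks the spread
print(raw.std(), adj.std())
```

Because randomization makes `d` independent of `x`, both estimators are unbiased; the adjusted one simply absorbs the variation in `y` explained by `x`, which is the power gain the dashboard quantifies.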

Appendix#

A.1 Sampling Distributions

  • Demonstrates how the sampling distribution of the sample mean approaches a normal distribution as sample size increases, regardless of the population distribution.
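The appendix result can be reproduced in a few lines. A minimal sketch of ours (Python with NumPy, not the dashboard code), drawing sample means from a strongly skewed Exponential(1) population, whose mean and variance are both 1:

```python
import numpy as np

rng = np.random.default_rng(3)

# Population: Exponential(1), strongly skewed, with mean 1 and variance 1.
# The CLT predicts sample means with mean ~1 and standard deviation ~1/sqrt(n).
for n in (2, 30, 500):
    means = rng.exponential(scale=1.0, size=(10_000, n)).mean(axis=1)
    print(n, means.mean(), means.std() * n**0.5)
```

As `n` grows, the distribution of `means` tightens around 1 and its rescaled spread stabilizes near the population standard deviation, despite the skewness of the population.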

Authors#

Principal Author#

Contributors#

Supporting Institutions#

Faculty of Business Sciences, Universidad Austral Beta Sigma