Explaining Sales via Linear Regression

Using multivariable linear regression to understand car sales volume

Posted by Kurt Eulau on December 06, 2020 · 1 min read

Unfortunately I can't share the code for this project publicly, but you can click here to access my Github repo containing code for other projects.

Description

This report was a group assignment for the UC Berkeley MIDS w203 Statistics for Data Science course, which I completed during Fall 2021. The goal of the project was to use multivariable linear regression within a causal theory to explain how much Americans consider horsepower when purchasing a new vehicle. I contributed significantly to the introduction, EDA, model building and selection, and results sections of the report. All data wrangling, cleaning, and analysis were completed in R using RStudio. Our research drew on a variety of statistical concepts:

  • Multivariable linear regression
  • Sampling methodology
  • Variable transformation
  • Model selection
  • Robust standard errors
  • Omitted variable bias
  • Reverse causality
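
Since the project code itself can't be shared, here is a minimal, hypothetical sketch of how three of the concepts above fit together: variable transformation, multivariable linear regression, and robust standard errors. It is not the code or the model from the report. The data are synthetic, the variable names (hp, price) and the "true" coefficients are made up for illustration, and it is written in plain Python rather than the R we actually used. It fits a log-log model by ordinary least squares and computes HC1 heteroskedasticity-robust standard errors.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols_robust(X, y):
    """Return OLS coefficients and HC1 robust standard errors."""
    n, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]
    # "Meat" of the sandwich estimator: sum over rows of e_i^2 * x_i x_i'
    meat = [[sum(e * e * r[i] * r[j] for r, e in zip(X, resid))
             for j in range(k)] for i in range(k)]
    se = []
    for j in range(k):
        # Column j of (X'X)^-1; the matrix is symmetric, so this is also row j.
        c = solve(xtx, [1.0 if i == j else 0.0 for i in range(k)])
        var_j = sum(c[p] * meat[p][q] * c[q] for p in range(k) for q in range(k))
        se.append(math.sqrt(var_j * n / (n - k)))  # HC1 small-sample scaling
    return beta, se


# Synthetic data only (assumed values, NOT the project's data set).
random.seed(203)
rows, y = [], []
for _ in range(500):
    hp = random.uniform(100, 400)    # hypothetical horsepower
    price = random.uniform(15, 60)   # hypothetical price, in $1000s
    x = [1.0, math.log(hp), math.log(price)]  # intercept + log-transformed predictors
    noise = random.gauss(0, 0.02 * x[1])      # error spread grows with log(hp)
    rows.append(x)
    y.append(8.0 + 0.5 * x[1] - 1.0 * x[2] + noise)  # plays the role of log(sales)

beta, se = ols_robust(rows, y)
for name, b, s in zip(["intercept", "log(hp)", "log(price)"], beta, se):
    print(f"{name:>10}: estimate {b:.3f}, robust SE {s:.3f}")
```

In a log-log specification like this one, the coefficient on log(hp) reads as an elasticity: the approximate percent change in sales associated with a one percent change in horsepower, holding price fixed. The robust standard errors matter because the simulated error variance is not constant across observations, which is exactly the situation where classical standard errors can mislead.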

Report

The final report submitted for this project is provided below: