SCS 2014: Visualizing Regression

At their best, graphics are instruments for reasoning. – Edward Tufte

This is the home page for the SCS short course on Visualizing Regression, offered on Tuesday evenings, 6pm to 9pm, from November 4 to November 25, 2014. The longer title is: The Concepts Behind Regression: A Visual Approach to Learning Almost All About Regression.

The workshop consists of a series of lectures/graphical demonstrations that illustrate a variety of paradoxes, surprises and incongruities related to applying regression to research problems. All we need is regression on two predictors to encounter traps that even the 'experts' frequently fall into. A recent paper that discusses the geometry behind many of the visualizations in this workshop is

Don't be daunted by the fact that the paper goes into the math behind the pictures. Our workshop will focus on using the pictures to visualize statistical concepts. We don't need the math unless you're interested in it for its own sake.

I will post PDF files and R scripts, as well as screen-capture videos, on this page as the workshop progresses.

If anyone wishes to review the material covered in an introductory regression course, consider reading:

Some links to other books or materials discussed during the course:


Approaches to Regression

  • Using mathematical formulas
     y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 Z + \beta_4 XZ + \epsilon where \epsilon \sim N(0, \sigma^2)
  • Using Matrices and Linear Algebra
    Y = X \beta + \epsilon where Y is an n \times 1 vector ...
  • Data: scatterplots in 'data' space.
  • Computing: "How to": Commands to run regressions
  • Geometry: "Variable Space" in which each variable is represented as a vector in n-dimensional space
  • Beta Space: The space of coefficients and their estimates
  • Interpretation: What do a regression and its coefficients mean in a real application?
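
To see how the formula, matrix and computing views in the list above connect, here is a minimal sketch. It builds the design matrix X for the model y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 Z + \beta_4 XZ + \epsilon and solves the normal equations X'X \beta = X'Y for the least-squares estimate. Everything specific in it is an assumption made for illustration: the data are made up, the response is generated without noise so that the "true" coefficients are known, and the helper names (transpose, matmul, solve, least_squares) are hypothetical. The course scripts are in R; Python is used here only so the example runs with no extra packages.

```python
# Minimal sketch: least squares for Y = X beta + epsilon via the normal equations.
# All data and helper names are illustrative assumptions, not course material.

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def least_squares(design, y):
    """Return beta_hat solving (X'X) beta = X'Y."""
    xt = transpose(design)
    xtx = matmul(xt, design)
    xty = [sum(p * q for p, q in zip(row, y)) for row in xt]
    return solve(xtx, xty)

# Made-up predictor values (assumption).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
z = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]

# Assumed "true" coefficients beta_0 .. beta_4; no noise term is added.
true_beta = [1.0, 2.0, 0.3, -0.5, 0.1]

# One row per observation: columns 1, X, X^2, Z, XZ.
design = [[1.0, xi, xi ** 2, zi, xi * zi] for xi, zi in zip(x, z)]
y = [sum(b * v for b, v in zip(true_beta, row)) for row in design]

beta_hat = least_squares(design, y)
print([round(b, 6) for b in beta_hat])
```

Because the response is noise-free, the estimate should reproduce true_beta up to rounding error. With a noise term added to y, the estimates would scatter around those values instead.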

What insights can we get from visualizing regression?

Here's a sample of questions that may not have obvious answers when regression is approached through traditional formulas or matrix algebra.

Week 1


Week 2


The Dylan effect (or paradox)

  • Thanks to Bryn Greer-Wootten for hunting down the relevant lyrics from Bob Dylan's "Ballad of a Thin Man":
Because something is happening here
But you don’t know what it is
Do you, Mister Jones?

But another possible source for a name for the phenomenon comes from Buffalo Springfield's "For What It's Worth":

There's something happening here
What it is ain't exactly clear ...

Week 3


We were able to rescue the screen capture videos in two parts, before and after the projectors went off:

Week 4

