Anscombe’s Quartet Desmos Activity

Well, I last wrote a post on August 18 – almost 8 months ago. I’m not interested in explaining why or apologizing or anything – anyone who has been teaching this year already knows why.

But, I finally created something new that I’m proud of and would like to share!

I’m teaching a course called Probability and Statistics, which is our district’s intro version of this material. It covers:

  • data collection (sampling methods and study types)
  • one-variable data visualization (dotplots, histograms, boxplots, measures of center and spread)
  • two-variable data visualization (scatterplots, regression, residuals)
  • basic probability rules (what is probability, addition, subtraction)
  • counting principles (permutations, combinations, binomial probability, geometric probability)

We’re working on two-variable data visualization right now, and my first-semester students struggled with deciding whether the least-squares regression model was a “good fit” for the data. Basically, I wanted to focus more on residuals: what they are, what we want them to look like, and why it’s important to check them. Anscombe’s Quartet immediately came to mind as a good way to do that, but I didn’t want to just say “well, here are four datasets that all have the same regression equation, but look how different they look!” I wanted to do a slow reveal, where students really got to play with the data before seeing it, and learn for themselves why residuals are important.
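If you want to check the “same regression equation” claim outside of Desmos, here is a minimal Python sketch. It is not part of the Desmos activity. The numbers are the standard published Anscombe’s Quartet values, and `fit_line` and `quartet` are just illustrative names made up for this example.

```python
from math import sqrt

def fit_line(xs, ys):
    """Return (slope, intercept, r) for the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

# Datasets 1-3 share the same x-values; dataset 4 has its own.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]

quartet = {
    1: (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

for name, (xs, ys) in quartet.items():
    slope, intercept, r = fit_line(xs, ys)
    print(f"Dataset {name}: y = {intercept:.2f} + {slope:.2f}x, r = {r:.2f}")
```

All four datasets come out to roughly y = 3.00 + 0.50x with r ≈ 0.82, which is why the equation and r alone can’t tell you whether the line is a good fit.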

I also was kind of ready to challenge my Desmos computation layer skills, since I’ve been casually watching most of the #DesmosLive videos this year. Before this, I had used computation layer for a little bit of auto-checking answers and putting sentence starters in text boxes, and I’d made an interactive slider for my conferences reflection activity, but not much more. This took a lot of googling and patience to make it look how I wanted it!

Here’s an overview if you’re teaching this activity. I walked through it with my class kind of slide by slide, since I know my students are not practiced in reading and processing long text directions on their own, but they still did all of the noticing/wondering on their own and the class-level discussion was good. If you have students who are more comfortable reading through directions independently, this could easily be assigned as homework or an independent in-class activity. I made certain answers “share with class” so they would still see some classmate responses as they worked through it.

Students already knew how to: make a scatterplot in Desmos, describe the association visually (strong/weak, positive/negative, linear/exponential/quadratic), and find/interpret r (correlation coefficient). We’d also talked about “lines of best fit” and how to read the regression equation off of Desmos’ output.

The Anscombe’s Quartet Desmos Activity

(the info below is also in the Desmos “Teacher Moves” for the activity)

Slides 1-2 are a typical notice/wonder structure. Note that we had already created scatterplots before doing this and described the visual association, which gives them more things to notice or wonder here.

Slide 3 refreshes their memory on the correlation coefficient and asks them to predict which dataset the regression equation belongs to. Note that they HAVE to put responses for both items here, or the activity will withhold certain information later on.

Slide 4 has them test their prediction, and slide 5 asks if they were correct (obviously, spoiler alert for Anscombe’s Quartet here…they’re all gonna be correct). They must also submit a response on slide 5 before the activity gives them the info to move on.

Slide 6 asks them to test a different dataset, at which point they should probably get suspicious…and slide 7 asks them what they’re noticing or wondering at this point. If students are working independently, this is a good slide to pace the activity to: snapshot some responses while they work, then have a discussion before going on.

Slide 8 reveals some information, but only if students have submitted every response they were asked for up to this point!

On slide 9, they get to look at all 4 scatterplots and, with this information added, think about which ones would be well fit by the linear model. My students said 1, 3, and maybe 4 at this point, so the only one they really eliminated was 2, but they were ready to get more information because they sensed that only one was really a good fit.

Slide 10 introduces residuals (you could do this activity after already introducing them, but this also explains it from scratch). Once again, some information is withheld until the student gets the calculation right.

Finally, on slide 11, they get to see the residual plots and decide once and for all which datasets the equation is a good fit for.
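For anyone who wants to see why the residual plots settle the question, here is a small Python sketch that computes residuals (actual y minus predicted y) for just the first two datasets. It is not taken from the Desmos activity. The data are the standard published Anscombe values, and `fit_line` and `residuals_by_x` are hypothetical helper names used only for this illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residuals_by_x(xs, ys):
    """Return (x, residual) pairs sorted by x, where residual = actual - predicted."""
    slope, intercept = fit_line(xs, ys)
    return [(x, y - (intercept + slope * x)) for x, y in sorted(zip(xs, ys))]

# Datasets 1 and 2 of Anscombe's Quartet share the same x-values.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

for label, ys in (("Dataset 1", y1), ("Dataset 2", y2)):
    print(label)
    for xv, res in residuals_by_x(x, ys):
        print(f"  x = {xv:2d}  residual = {res:+.2f}")
```

Reading down the output, dataset 1’s residuals bounce above and below zero with no obvious pattern, while dataset 2’s start negative, rise to positive in the middle, and fall back to negative. That curved pattern is the sign that a line is the wrong model, even though the equation and r look the same for both.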

Slide 12 asks them to summarize what they learned. One of my students said “don’t judge a book by its cover,” which I loved.

And then slides 13-14 are a wild extension activity involving the “Datasaurus Dozen,” which is a similar collection of datasets where all the summary statistics match but the plots look really different. Students are challenged to make their own dataset that also fits in the collection. I had a lot of fun doing this myself, but my students were all too overwhelmed to attempt it, which is fine. It would be a great challenge for AP students or that one really motivated student in your intro class, and it might work well with students paired up.

Let me know if you use this activity and if you’d change anything from how I set it up!

Author: missmastalio

Math teacher at an alternative high school. Living the best life.
