A Data-First Approach to Learning Real-World Statistical Modeling

This is an interesting approach featuring a "design for an upper-level undergraduate statistics course structured around data rather than methods. The course is designed around curated datasets to reflect real-world data science practice and engages students in experiential and peer learning using [a] data science competition platform." There is a detailed discussion of the course (first launched at Harvard in 2014), of the data competition tool (Kaggle), and of student responses to the process. But note: "the course requires an instructor with a broad base of knowledge in statistical models, as well as experience applying these methods to real data."

The learning outcomes for the course are:

  1. Given a data set and accompanying problem, students will be able to identify a set of suitable (and rule out unsuitable) methods for the task at hand.
  2. Students will be able to apply a broad suite of tools to data science problems.
  3. Students will be able to collaborate with peers towards a common objective.
  4. Students will be able to write documentation that allows others to reproduce and expand their work.
  5. Students will be able to evaluate the strengths and limitations of new statistical methods and be comfortable experimenting with them.

Jacob Mortensen, Luke Bornn, Daria Ahrensmeier, Alexander Buhmann. 2022. A Data-First Approach to Learning Real-World Statistical Modeling. Canadian Journal for the Scholarship of Teaching and Learning, Open Knowledge Foundation.