General concepts

The following are some general concepts and motivations behind the choices made in Data 8. They give an idea of why particular decisions were made about which topics to cover, which technology to use, and so on.

Major goals

• Diversity
• Equity
• Pedagogical Clarity
• Scalability
• Depth
• No computational barrier to entry

Core concepts / inspirations

• Think critically about the world around you
• Don’t take your data for granted
• Use the combination of CS + Stats as a feature, not a bug
• Come away with practical skills
• Be able to tell whether your inference is sound
• Be able to run your own experiments and plan data collection
• Know the right statistical tools for the job
• Make do with limited data
• Quantify and understand uncertainty in data
• Turn your data analysis into a decision
• Think of ways that you could be wrong
• Consider edge cases and how they affect your model choice (e.g., would I rather overestimate or underestimate something?)
• Illustrate the above points with real-world data

• Shield students from topics that take away from these main ideas
• Don’t make students learn a lot of package-specific APIs; use the datascience module
• Don’t force students to set up their own environments; use a JupyterHub
• Don’t force students to clean their own data; provide pre-collected, cleaned data
• Provide videos and extra help if students want to go further
• Expect subsequent courses to cover these more rigorously
• Aim the course at anybody, not just statistics or CS majors
• More advanced treatment of these topics / formalization of CS/stats concepts will happen in later classes

Interactions of topics

• Intersectionality is a feature, not a bug
• Use CS concepts to help teach statistics, and vice versa
• Learn computing skills by doing interesting things on data
• Learn statistical concepts by observing something interesting in data or in the output of an algorithm
• Use interactivity to let people explore the above concepts at their own pace