Data Science Discovery sits at the intersection of statistics, computation, and real-world relevance. In this project-driven course, students perform hands-on analysis of real-world datasets to discover the impact of the data. Throughout each experience, students reflect on the social issues surrounding data analysis, such as privacy and design.

Prerequisites: None

Course Topics

See: Schedule

Course Section

This course consists of two sections: a lecture section and a lab/discussion section.

You are required to be registered for BOTH one lecture section and one lab/discussion section.

Course Materials

Course Assignments and Grades

Course grades are given in points, totaling 1,000 points throughout the semester. The breakdown of points is as follows:

Final Course Grade

Course points will be translated into a course grade at the end of the semester.


The most significant component of this course is the completion of the course projects (300 points). You will have 2.5 weeks to complete each project, including a lab/discussion dedicated to working with the course staff on your project. In each project, you will analyze a single real-world dataset to discover interesting and insightful features, then reflect in detail on your findings to understand the social issues that arise from such analysis.

A mid-project checkpoint will be due after the first week to ensure that you and your team are making progress (see the course schedule for due dates). All projects will be discussed in discussion sections; top projects will be showcased as part of the lecture.


Throughout this course, you will have many opportunities to be part of collaborative activities that include data collection, studio-style critiques, data cleaning efforts, and other experiences. Each of these experiences has a set number of points. A total of 40 points from these activities contributes to your course grade.

Late Submissions

No late submissions are accepted. However, you need to complete only 8 of the 10 labs to earn all 105 points for the lab section. Additional points earned are counted as extra credit. Other extra credit opportunities will be offered. Extra credit from all sources combined cannot add more than 107 points to your final grade.
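
The arithmetic behind this policy can be sketched in Python. This is only an illustration, not the official grade calculation: the function names are hypothetical, and it assumes that lab points earned beyond 105 are the "additional points" treated as extra credit and that the 107-point cap applies to the sum of all extra credit sources.

```python
# Illustrative sketch only; names and structure are assumptions, not course code.
LAB_POINTS_FULL = 105    # full credit for the lab section (from the policy above)
EXTRA_CREDIT_CAP = 107   # maximum total extra credit (from the policy above)


def split_lab_points(lab_points_earned):
    """Split earned lab points into regular lab credit and extra credit."""
    regular = min(lab_points_earned, LAB_POINTS_FULL)
    extra = max(lab_points_earned - LAB_POINTS_FULL, 0)
    return regular, extra


def capped_extra_credit(*sources):
    """Add up extra credit from every source, then apply the cap."""
    return min(sum(sources), EXTRA_CREDIT_CAP)
```

For example, under these assumptions a student who earns 120 lab points would receive 105 regular lab points plus 15 points of extra credit, and that extra credit combined with 100 points from other sources would count as 107 points rather than 115.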

Learning Collaboratively

Data Science is a collaborative science. Do not try to tackle this course alone.

We strongly encourage you to discuss all of your course activities (with the exception of exams) with your friends and classmates! You will learn more through talking through the problems, teaching others, and sharing ideas.

Read the “Academic Integrity” section to understand the difference between collaboration and giving an answer away.

Academic Integrity

Collaboration is about working together. Collaboration is not giving the direct answer to a friend or sharing the source code to an assignment. Collaboration requires you to make a serious attempt at every assignment and discuss your ideas and doubts with others so everyone gets more out of the discussion. Your answers must be in your own words and your code must be typed (not copied/pasted) by you.

Academic dishonesty is taken very seriously in STAT 107 and all cases will be brought to the University, your college, and your department. You should understand how academic integrity applies specifically to STAT 107: the sanction for cheating on an assignment includes a loss of all points for the assignment and a reduction of the final course grade by one whole letter grade (100 points). A second incident, or cheating on an exam, results in an automatic F in the course.

Academic integrity includes protecting your work. If your work ends up submitted by someone else, we consider this a violation of academic integrity just as though you had submitted someone else’s work.