Spring 2021 - STAT207
Data Science Exploration


Lab Assignments

  • lab_01: Data Science Setup (Due: Tuesday, August 31, 2021 at 11:59pm CST)
    In this first lab, you will set up your account and computer for Data Science Exploration and begin to work with Python notebooks.

  • lab_02: Dataframes (Due: Tuesday September 7, 2021 at 11:59pm CST)
    In this second lab, we will practice dataframe manipulation and cleaning techniques. We will also describe numerical variables.

  • lab_03: Probability and Sampling (Due: September 14, 2021 at 11:59pm CST)
    In this third lab, we will use the combinatorics equations that we learned in class to calculate probabilities of random events. We will also generate a sampling distribution.

  • lab_04: Random Variables (Due: September 21, 2021 at 11:59pm CST)
    In this fourth lab, we will use random variables to calculate the probabilities of events. We will also calculate summary statistics of these random variables.

  • lab_5: Confidence Intervals (Due: September 28, 2021 at 11:59pm CST)
    In this fifth lab, we will explore properties of the Central Limit Theorem and create confidence intervals for population means and population proportions.

  • lab_6: Hypothesis Testing (Due: October 5, 2021 at 11:59pm CST)
    In this sixth lab, we will conduct hypothesis tests on population means and population proportions.

  • lab_7: Inference for Associations (Due: October 12, 2021 at 11:59pm CST)
    In this seventh lab, we will conduct hypothesis tests on the difference between two population means.

  • lab_8: Simple Linear Regression and Inference (Due: October 19, 2021 at 11:59pm CST)
    In this eighth lab, we will conduct hypothesis tests on the difference between two population proportions, build and use simple linear regression models, perform inference on population slopes, and create visualizations with three variables.

  • lab_9: Multiple Linear Regression, ANOVA, and Logistic Regression (Due: October 26, 2021 at 11:59pm CST)
    In this ninth lab, we will explore concepts related to multiple linear regression, ANOVA, and logistic regression.

  • lab_10: Logistic Regression and Classifier Models (Due: November 2, 2021 at 11:59pm CST)
    In this tenth lab, we will fit logistic regression models and use them to build classifier models.

  • lab_11: Classifier Models (Due: November 9, 2021 at 11:59pm CST)
    In this eleventh lab, we will use training and test datasets, as well as the log-likelihood ratio test, to help us select explanatory variables for a logistic regression model that predicts observations in new datasets well.

  • lab_12: Variable Selection Methods (Due: November 16, 2021 at 11:59pm CST)
    In this twelfth lab, we will use training and test datasets, as well as the log-likelihood ratio test, to help us select explanatory variables for a logistic regression model that predicts observations in new datasets well.

  • project: Instructions and Materials for the Final Project (Due: December 7, 2021 at 11:59pm CST)
    The materials in this folder explain how to complete and submit the final project. Please read the instructions in the downloaded PDFs thoroughly before beginning the project. Don't wait until the last minute to start working on this project!