Lecture: Final Exam Notebook Release
Released on git by 9am
Lecture: Project and exam prep help
Open lab zoom link 5:00 - 7:00 pm
Lecture: Last class of the semester
Discuss review questions and projects. Help session Monday, May 11, 5-7 pm on the open lab Zoom link. Good luck on your finals!
Lecture: Review and project work
No lecture. Discuss your review questions and project work
Lecture: Train/Test Regularized Regression
How to avoid fooling ourselves -- Comparing train/test performance with naive in-sample performance
Lecture: Regularized Linear Regression
Cross-validation and information criteria for regularized linear regression with many variables
Lecture: Regularized Linear Regression
Explore machine learning methods to compare different regularization penalties for linear regression with many feature variables
Lecture: Regularized Logit Classifiers
Explore machine learning methods for train/test splitting and cross-validation of regularized logistic regression
Lecture: Regularized Logit Classifiers
Explore machine learning methods for high dimensional classification using penalized logistic regression in scikit-learn
Lecture: Logit Model Selection
Train/test splitting for AIC/BIC driven model selection and evaluation
Lecture: Logit Model Selection
Log-likelihood-ratio tests and train/test AIC/BIC model selection for multiple logistic regression
Lecture: Train/Test Predictive Analytics
By randomly splitting the data into training and testing sets, we separate model building from model evaluation to reduce bias
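A minimal sketch of the random split described above, using only the Python standard library. The function name `train_test_split_rows`, the 25% test fraction and the seed are illustrative assumptions, not taken from the course materials; in practice scikit-learn's `train_test_split` is the more common tool.

```python
import random

def train_test_split_rows(rows, test_fraction=0.25, seed=0):
    """Randomly partition rows into (train, test) lists (hypothetical helper)."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(rows)          # copy, so the caller's data is not reordered
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(20))             # stand-in for 20 observations
train, test = train_test_split_rows(rows)
print(len(train), len(test))
```

The model is fit on `train` only, and `test` is held back until evaluation, so the reported performance is not inflated by reusing the fitting data.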
Lecture: No Lecture Today. Exam 2 on git 9:00 am central time.
Exam 2 is distributed as a Jupyter notebook from the release repository. This is an open notes, open internet exam, but you must work on your own.
Lecture: Review
Discuss practice problems and other questions. The study guide notebook is available from the _classnotes repository.
Lecture: ROC curves for logit classifiers
We use ROC curves to summarize the sensitivity / specificity tradeoff and overall accuracy of a scoring system
Lecture: Classification via Logistic Regression
Sensitivity, specificity, logit classifier. The Exam 2 study guide notebook is available from the _classnotes repo.
Lecture: Logistic regression modeling
Building and interpreting logit models with multiple explanatory variables
Lecture: Modeling probabilities using logit models
Odds ratios, 2 x 2 tables, and logistic regression
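As a small worked illustration of an odds ratio from a 2 x 2 table, the sketch below uses made-up counts; the function name `odds_ratio` and the table layout (rows are groups, columns are outcome yes/no) are assumptions for this example only.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for the 2 x 2 table [[a, b], [c, d]].

    Row 1: group 1 with a "yes" and b "no" outcomes.
    Row 2: group 2 with c "yes" and d "no" outcomes.
    """
    odds_group1 = a / b
    odds_group2 = c / d
    return odds_group1 / odds_group2

# Hypothetical counts: 30 yes / 70 no in group 1, 15 yes / 85 no in group 2.
ratio = odds_ratio(30, 70, 15, 85)
print(round(ratio, 3))
```

In a logistic regression with a single binary predictor, the exponentiated slope coefficient equals this odds ratio.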
Lecture: F tests for comparing nested models
ANOVA and the constrained/unconstrained model framework
Lecture: One-way ANOVA models with categorical predictors
Summarizing by groups and testing for group differences
Lecture: Analysis of variance and F test for regression
Analysis of variance and F tests for building models
Lecture: (Online) Regression recap plus LaTeX and images in Jupyter notebooks
See Piazza or Compass for the Zoom URL. We'll review linear regression and demo LaTeX and image insertion for Jupyter notebooks
Lecture: (Online starting today) Regression modeling
See Piazza or Compass for the Zoom URL. We'll see how to access and use information about the model parameters, model fit and residuals
Lecture: Multiple regression inference
Coefficient standard errors, confidence intervals and prediction intervals
Lecture: Multiple linear regression modeling
General framework and examples
Lecture: z-tests, t-tests and degrees of freedom
We compare z-tests, which rely on the central limit theorem for large sample validity, and t-tests, which provide a small sample adjustment
Lecture: Formulating and testing hypotheses
We study a general strategy for hypothesis testing in several different scenarios.
Lecture: Confidence intervals and hypothesis tests for differences
We explore how to make inferences about subpopulation differences in contexts such as treatment/control studies, A/B testing and sample surveys
Lecture: Exam 1
In-class exam, 1090 Lincoln Hall. Bring a non-programmable calculator.
Lecture: Review
Let's review what we've done so far - bring questions to class! Exam study guide solutions will be posted in Compass after class today.
Lecture: Confidence intervals for general means
We explore large sample confidence intervals for the mean of a population and solve the mystery of n-1!
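Assuming the "mystery of n-1" refers to the divisor in the sample variance, the following sketch compares dividing by n with dividing by n-1 in a small simulation. The die-roll population, sample size, repetition count and seed are all illustrative choices, not taken from the lecture.

```python
import random
import statistics

rng = random.Random(1)
population = [1, 2, 3, 4, 5, 6]            # a fair die; true variance is 35/12
true_var = statistics.pvariance(population)

n, reps = 5, 5000
divide_by_n, divide_by_n_minus_1 = [], []
for _ in range(reps):
    sample = rng.choices(population, k=n)  # draw with replacement
    divide_by_n.append(statistics.pvariance(sample))         # divisor n
    divide_by_n_minus_1.append(statistics.variance(sample))  # divisor n-1

avg_n = statistics.mean(divide_by_n)
avg_n_minus_1 = statistics.mean(divide_by_n_minus_1)
print(true_var, avg_n, avg_n_minus_1)
```

On average the divisor-n version lands near (n-1)/n times the true variance, while the divisor-(n-1) version lands near the true variance itself.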
Lecture: Normal Approximation and Confidence Intervals
We use sample means, proportions and their standard errors to compute confidence intervals for population parameters.
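A hedged sketch of a large-sample (normal approximation) confidence interval for a proportion, using only the standard library. The helper name `proportion_ci` and the survey counts are hypothetical.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, level=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)    # standard error of p_hat
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level is 0.95
    return p_hat - z * se, p_hat + z * se

# Hypothetical survey: 520 of 1000 respondents answer "yes".
low, high = proportion_ci(520, 1000)
print(round(low, 3), round(high, 3))
```

The half-width `z * se` is the margin of error; it shrinks in proportion to one over the square root of the sample size.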
Lecture: Margin of Error for Sample-Based Estimates
Let's study the variation in sums, means and proportions and use their properties to determine margin of error for these estimates.
Lecture: Model based probabilities and quantiles
We develop the basics for computing the interval probabilities and percentiles needed for many confidence intervals and tests.
Lecture: Statistics, Parameters and Random Variables
Compare sample statistics and population parameters to understand how the statistics estimate features of the population.
Lecture: Statistics, Parameters and Estimation
In building population sampling models for data, the concepts of random variables and distributions are crucial.
Lecture: Monte Carlo Studies of Sampling Distributions
How much variation is there in a sample statistic when we draw a random sample from a population? We investigate using Monte Carlo simulations.
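One way to sketch such a Monte Carlo study is to draw many random samples from a known population and record the sample mean each time. The population (the integers 1 to 100), the sample size, the number of repetitions and the function name `simulate_sample_means` are assumptions made for illustration.

```python
import random
import statistics

def simulate_sample_means(population, sample_size, n_reps=2000, seed=0):
    """Return the sample mean from each of n_reps simple random samples."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))  # without replacement
        for _ in range(n_reps)
    ]

population = list(range(1, 101))        # population mean is 50.5
means = simulate_sample_means(population, sample_size=25)

center = statistics.mean(means)         # should sit close to the population mean
spread = statistics.stdev(means)        # Monte Carlo estimate of the standard error
print(round(center, 2), round(spread, 2))
```

The list `means` approximates the sampling distribution of the sample mean, and `spread` estimates how much that statistic varies from sample to sample.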
Lecture: Random sampling and probability
Use combinatorial methods to calculate probabilities of compound events.
Lecture: Random Sampling and Probability
We use Python to demonstrate random sampling and explore the corresponding probabilities of different outcomes.
Lecture: Python, Pandas and Data Frames - Quantitative variables
Let's explore summary statistics, distributions and visuals for quantitative data, and see how to define our own functions for analysis.
Lecture: Quantitative Data Exploration
What if the data have missing values? How can we summarize and visualize qualitative and quantitative information in the data?
Lecture: Working with Data Frames
Let's see how to extract information from data frames, add more data, sort the data, and merge information from two or more sources.
Lecture: Structure of Data Frames
Let's delve more deeply into the structure of data frames and how we can process the data, extract subsets, and set up for further analysis.
Lecture: Notebooks and Git Repositories
Examples of Python, Jupyter notebooks and git operations.
Lecture: Introduction to Data Science Exploration
We explore data from a Pew Research Center political opinion survey.
Welcome to Data Science Exploration!
Our first lecture is Wednesday, Jan. 22 at 1:00pm in 1090 Lincoln Hall. See you there!