Review

Lecture: Review

December 11, 2019
Review

Lecture: Review

December 9, 2019
Regularized Linear Regression

Lecture: Regularized Linear Regression

Visualizing high dimensional data and selecting the regularization tuning parameter

December 6, 2019
Regularized Linear Regression

Labs: Regularized Linear Regression

Experiment with Lasso regression with high dimensional features

December 5, 2019
Regularized Linear Regression for High-Dimensional Data

Lecture: Regularized Linear Regression for High-Dimensional Data

We explore a machine learning approach for improving the accuracy of multiple linear regression using penalized least squares, with application to gene expression analysis

December 4, 2019
Regularized Logistic Regression

Lecture: Regularized Logistic Regression

Regularization penalties and cross-validation accuracy of regularized logit classifiers

December 2, 2019
Regularized Logistic Regression

Labs: Regularized Logistic Regression

Use gene expression profiles and clinical data to build a regularized logit classifier for breast cancer recurrence

November 24, 2019
Regularized Logistic Regression for High-Dimensional Data

Lecture: Regularized Logistic Regression for High-Dimensional Data

Compare different regularization penalties for logistic regression with many feature variables

November 22, 2019
Train/Test and Cross Validation with Scikit-Learn

Lecture: Train/Test and Cross Validation with Scikit-Learn

Examples with regularized logistic regression

November 20, 2019
Model Selection in Logistic Regression

Lecture: Model Selection in Logistic Regression

We explore the tradeoff between model fit and model simplicity using criteria such as AIC and BIC

November 18, 2019
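
As a small illustration of the fit-versus-simplicity tradeoff named in the Model Selection lecture above, here is a minimal sketch of the standard AIC and BIC formulas. The log-likelihoods, parameter counts and sample size below are hypothetical numbers made up for this example; they are not course data.

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 * log-likelihood (smaller is better)."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian information criterion: k * ln(n) - 2 * log-likelihood (smaller is better)."""
    return k * math.log(n) - 2 * log_likelihood

# Hypothetical fitted models (assumed values, for illustration only).
n = 200                                   # number of observations
small = {"loglik": -120.0, "k": 3}        # simpler model, slightly worse fit
large = {"loglik": -118.5, "k": 6}        # bigger model, slightly better fit

aic_small = aic(small["loglik"], small["k"])
aic_large = aic(large["loglik"], large["k"])
bic_small = bic(small["loglik"], small["k"], n)
bic_large = bic(large["loglik"], large["k"], n)

print("AIC:", aic_small, aic_large)
print("BIC:", round(bic_small, 2), round(bic_large, 2))
```

With these made-up numbers both criteria favor the smaller model, and BIC penalizes the extra parameters more heavily because ln(200) is larger than 2.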
Train/Test methods for model assessment

Lecture: Train/Test methods for model assessment

We use tools provided in scikit-learn for modeling and machine learning in Python

November 15, 2019
Logit Classifier Training and Testing

Labs: Logit Classifier Training and Testing

Build a logit classifier using training data and evaluate on test data

November 14, 2019
Train/Test ROC Analysis

Lecture: Train/Test ROC Analysis

We split the data into training and testing data to reduce bias in ROC evaluation of a classifier.

November 13, 2019
Exam 2

Lecture: Exam 2

In-class exam, 218 Ceramics Building. Bring a non-programmable calculator.

November 11, 2019
Review

Lecture: Review

Work on practice problems in class

November 8, 2019
Review

Lecture: Review

Work on practice problems in class

November 6, 2019
Classification via Logistic Regression

Lecture: Classification via Logistic Regression

Sensitivity, Specificity, ROC curves

November 4, 2019
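
To make the sensitivity and specificity from the Classification via Logistic Regression lecture above concrete, here is a minimal sketch that computes both from true and predicted labels. The label lists are hypothetical and the function name is our own choice, not part of any library.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn)   # fraction of actual positives detected
    specificity = tn / (tn + fp)   # fraction of actual negatives detected
    return sensitivity, specificity

# Hypothetical labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```

Repeating this calculation over a range of classification thresholds and plotting sensitivity against 1 - specificity traces out an ROC curve.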
Logistic regression modeling

Lecture: Logistic regression modeling

Building and interpreting logit models with multiple explanatory variables

November 1, 2019
Odds Ratios and Logistic Regression

Labs: Odds Ratios and Logistic Regression

Study association between categorical variables and model categorical responses using logits

October 31, 2019
Modeling probabilities using logit models

Lecture: Modeling probabilities using logit models

Odds ratios, 2 x 2 tables, and logistic regression

October 30, 2019
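
As a quick sketch of the odds ratio for a 2 x 2 table from the logit models lecture above, the example below uses made-up counts (assumptions for illustration, not course data). For a single binary explanatory variable, the log of this odds ratio matches the slope that a logistic regression would estimate.

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio for the 2 x 2 table [[a, b], [c, d]].

    Rows: exposed / unexposed. Columns: outcome yes / outcome no.
    """
    return (a / b) / (c / d)

# Hypothetical counts, for illustration only.
exposed_yes, exposed_no = 30, 70
unexposed_yes, unexposed_no = 10, 90

or_hat = odds_ratio(exposed_yes, exposed_no, unexposed_yes, unexposed_no)
log_or = math.log(or_hat)   # the log odds ratio, on the logit scale

print(f"odds ratio: {or_hat:.3f}, log odds ratio: {log_or:.3f}")
```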
ANOVA, F tests and Model Selection

Lecture: ANOVA, F tests and Model Selection

More examples with results and interpretation

October 28, 2019
Comparing nested regression models

Lecture: Comparing nested regression models

Examples of ANOVA F tests in model building

October 25, 2019
Compare Regression Models

Labs: Compare Regression Models

Compare nested regression models for U.S. melanoma mortality rates

October 24, 2019
Structured regression and categorical predictor variables

Lecture: Structured regression and categorical predictor variables

Analysis of variance and F tests for building models

October 23, 2019
Regression model assessment and prediction

Lecture: Regression model assessment and prediction

We'll see how to access and use information about the model parameters, model fit and residuals

October 21, 2019
Regression modeling and inference

Lecture: Regression modeling and inference

Coefficient standard errors, confidence intervals and prediction intervals

October 18, 2019
Regression models and inference

Labs: Regression models and inference

Apply two-sample analysis and linear regression modeling to real and simulated data.

October 17, 2019
Introduction to Regression Modeling using StatsModels

Lecture: Introduction to Regression Modeling using StatsModels

Python examples with results and interpretation

October 16, 2019
z-tests, t-tests and degrees of freedom

Lecture: z-tests, t-tests and degrees of freedom

We compare z-tests, which rely on the central limit theorem for large sample validity, and t-tests, which provide a small sample adjustment

October 14, 2019
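
To see why the small sample adjustment in the z-tests and t-tests lecture above matters, here is a minimal Monte Carlo sketch. It assumes normal data with true mean 0 and samples of size 5, and checks how often a nominal 95% interval built with the z critical value (1.960) or the t critical value for 4 degrees of freedom (2.776) actually covers the true mean. The function name, seed and simulation settings are our own choices.

```python
import random
import statistics

Z_CRIT = 1.960   # 97.5th percentile of the standard normal distribution
T_CRIT = 2.776   # 97.5th percentile of the t distribution with 4 degrees of freedom

def coverage(crit, n=5, reps=20000, seed=2):
    """Fraction of simulated intervals mean +/- crit * SE that contain the true mean 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(0, 1) for _ in range(n)]
        mean = statistics.fmean(sample)
        se = statistics.stdev(sample) / n ** 0.5   # estimated standard error
        if abs(mean) <= crit * se:
            hits += 1
    return hits / reps

z_cov = coverage(Z_CRIT)
t_cov = coverage(T_CRIT)
print(f"z interval coverage: {z_cov:.3f}, t interval coverage: {t_cov:.3f}")
```

In this setup the z-based interval covers the true mean noticeably less often than 95% of the time, while the t-based interval stays close to the nominal level.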
Formulating and testing hypotheses

Lecture: Formulating and testing hypotheses

We study a general approach for testing hypotheses about parameters of interest in several representative examples.

October 11, 2019
Confidence Intervals and Hypothesis Tests

Labs: Confidence Intervals and Hypothesis Tests

Analyze lead exposure data while exploring connections between confidence intervals and hypothesis tests.

October 10, 2019
Confidence intervals and significance tests for differences

Lecture: Confidence intervals and significance tests for differences

Building on the results for single samples we explore how to compare samples from different subpopulations such as treatment/control, A/B testing and other grouping variables

October 9, 2019
Confidence intervals for general means

Lecture: Confidence intervals for general means

We explore large sample confidence intervals for the mean of a population and solve the mystery of n-1!

October 7, 2019
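
One way to look at the n-1 question raised in the confidence intervals lecture above is by simulation. The sketch below assumes standard normal data (true variance 1) and samples of size 5, and averages the sample variance computed with divisor n and with divisor n-1 over many repetitions. The helper name and settings are our own choices.

```python
import random

def variance(sample, ddof):
    """Sum of squared deviations from the sample mean, divided by (n - ddof)."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / (n - ddof)

rng = random.Random(1)
n, reps = 5, 20000
avg_div_n = 0.0
avg_div_n_minus_1 = 0.0
for _ in range(reps):
    sample = [rng.gauss(0, 1) for _ in range(n)]
    avg_div_n += variance(sample, ddof=0) / reps          # divide by n
    avg_div_n_minus_1 += variance(sample, ddof=1) / reps  # divide by n - 1

print(f"average with divisor n:   {avg_div_n:.3f}")
print(f"average with divisor n-1: {avg_div_n_minus_1:.3f}")
```

Dividing by n underestimates the true variance on average by a factor of (n-1)/n, here 4/5, while dividing by n-1 gives an average close to the true value of 1.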
Exam 1

Lecture: Exam 1

In-class exam, 218 Ceramics Building. Bring a non-programmable calculator.

October 4, 2019
Review

Lecture: Review

Let's review what we've done so far - bring questions to class!

October 2, 2019
Normal Approximation and Confidence Intervals

Lecture: Normal Approximation and Confidence Intervals

We explore how to use sample means, proportions and other statistics to compute confidence intervals for population parameters.

September 30, 2019
Margin of Error for Sample-Based Estimates

Lecture: Margin of Error for Sample-Based Estimates

Let's study the variation in sums, means and proportions and use their properties to determine the margin of error for these estimates.

September 27, 2019
Standard Errors for Means and Proportions

Labs: Standard Errors for Means and Proportions

Work with the uniform and binomial distributions, and normal approximations for the sample mean and sample proportion.

September 26, 2019
Computing and Visualizing Interval Probabilities and Quantiles

Lecture: Computing and Visualizing Interval Probabilities and Quantiles

We develop the basics for computing the interval probabilities and percentiles needed for many confidence intervals and tests.

September 25, 2019
Case Study in Data Science

Lecture: Case Study in Data Science

Albert Man guest lectures on a data science project he did as part of a job interview!

September 23, 2019
Statistics, Parameters and Random Variables

Lecture: Statistics, Parameters and Random Variables

The concept of a random variable helps us understand uncertainty, in particular the variation in sample statistics.

September 20, 2019
Normal and Bernoulli Distributions

Labs: Normal and Bernoulli Distributions

This lab covers the normal distribution, Bernoulli distribution, parameters and random samples.

September 19, 2019
Statistics, Parameters and Estimation

Lecture: Statistics, Parameters and Estimation

How shall we collect data to estimate key population parameters? How can we estimate those parameters and determine the margin of error?

September 18, 2019
Monte Carlo Studies of Sampling Distributions

Lecture: Monte Carlo Studies of Sampling Distributions

How much variation is there in a sample statistic when we draw a random sample from a population? We investigate using Monte Carlo simulations.

September 16, 2019
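
The question posed in the Monte Carlo lecture above can be explored with a short simulation. This sketch assumes a Uniform(0, 1) population and samples of size 25, uses only the Python standard library, and compares the simulated standard deviation of the sample mean with the theoretical standard error sigma / sqrt(n). The function name and settings are our own choices.

```python
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` random samples of size n from Uniform(0, 1) and return their means."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.random() for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

means = simulate_sample_means(n=25, reps=5000)
observed_se = statistics.stdev(means)            # spread of the simulated sample means
theoretical_se = (1 / 12) ** 0.5 / 25 ** 0.5     # sigma / sqrt(n) for Uniform(0, 1)

print(f"simulated SE: {observed_se:.4f}, theoretical SE: {theoretical_se:.4f}")
```

Changing n in the call shows how the variation in the sample mean shrinks as the sample size grows.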
For Loops and Functions for Simulation

Lecture: For Loops and Functions for Simulation

Python for loops and functions enable us to automate and simplify repetitive tasks, which is essential for Monte Carlo simulations.

September 13, 2019
Sampling, probability and looping

Labs: Sampling, probability and looping

This lab covers sampling, probability, for loops, and making your own function for simulations.

September 12, 2019
Random Sampling and Probability

Lecture: Random Sampling and Probability

We use Python to demonstrate random sampling and explore the corresponding probabilities of different outcomes.

September 11, 2019
Working with Data Frames

Lecture: Working with Data Frames

Let's see how to extract information from data frames, add more data, sort the data, and merge information from two or more sources.

September 9, 2019
Structure of Data Frames

Lecture: Structure of Data Frames

Let's delve more deeply into the structure of data frames and how we can process the data, extract subsets, and set up for further analysis.

September 6, 2019
Data Frames

Labs: Data Frames

In this lab you will learn more about data types in Python, read external data from csv files, and perform basic data extraction, analytics, and interpretation.

September 5, 2019
Python, Pandas and Data Frames - Quantitative variables

Lecture: Python, Pandas and Data Frames - Quantitative variables

Let's explore summary statistics, distributions and visuals for quantitative data

September 4, 2019
Labor Day

Lecture: Labor Day

September 2, 2019
Data and Python

Lecture: Data and Python

What if the data have missing values? How can we summarize qualitative and quantitative data, and study relations between different variables in the data? Let's explore how core modules like NumPy, Pandas, and Matplotlib help us manage and visualize data for these purposes.

August 30, 2019
Data Science Setup

Labs: Data Science Setup

Data scientists use powerful tools to help learn about data. In this first lab, you will set up your account and computer for Data Science Exploration and begin to work with Python notebooks.

August 29, 2019
Data and Python

Lecture: Data and Python

What do we mean by data? How can we organize data? How can we visualize and summarize data? Python is a powerful data science environment for organizing and understanding our data.

August 28, 2019
Introduction to Data Science Exploration

Lecture: Introduction to Data Science Exploration

Building on STAT 107, Data Science Discovery, let's explore data science and statistical analysis in real world settings!

August 26, 2019
Welcome to Data Science Exploration!

Our first lecture is Monday, Aug. 26 at 1:00pm in 218 Ceramics Building. See you there!

August 21, 2019