Overview

Data science requires tools that help us learn about data. In this lab, you will accomplish four major things:

  1. Learning about data types in Python
  2. Learning about the pandas library
  3. Using pandas to read CSV files
  4. Performing basic processing on pandas DataFrame objects
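As a small illustration of the first item above, the following sketch uses only built-in Python to show a few common data types. All variable names and values are hypothetical and are not taken from the lab notebook.

```python
# Hypothetical sample values; none of these come from the lab notebook.
course = "STAT 107"      # str: text
enrolled = 250           # int: whole number
average_score = 87.5     # float: number with a decimal point
is_open = True           # bool: True or False

# type() reports the data type of any value.
values = [course, enrolled, average_score, is_open]
type_names = [type(value).__name__ for value in values]
print(type_names)

# Text that looks like a number stays a str until it is converted explicitly.
raw = "42"
converted = int(raw)
print(type(raw).__name__, type(converted).__name__)
```

The same distinction matters once you work with pandas, since each column of a DataFrame read from a CSV file is assigned its own data type.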

Step 1: Retrieve the lab using git

Using your command line, navigate to your stat107 repository (cd Desktop, then cd stat107, then cd [NETID]) and fetch the notebook from our release repository by running the following two git commands:

git fetch release
git merge release/lab_pandas -m "Merging initial files"

ONLY IF the merge fails with an error about unrelated histories, run this command instead:

git merge release/lab_pandas --allow-unrelated-histories -m "Merging initial files"

Step 2: Open the notebook

Launch Jupyter from the command line with the command:

jupyter notebook

Inside the notebook webpage, whenever you are done, checkpoint your notebook (using File -> Save Checkpoint in the notebook) to save your work. Once your work is saved, you can submit it as described in Step 3.

Step 3: Submit your work

When you’re ready to save your work online or submit it, return to the command line and run:

git add -A
git commit -m "submission (or any message here)"
git push origin master