Data Frames

In this lab, you will learn more about data structures in Python, read external data from CSV files, and perform basic data extraction, analytics, and visualization.

Source Branch: lab_02
Due Date: Committed and pushed to git before February 5, 2020 at 11:59pm

Overview

Data Science requires tools to help us learn about data. In this lab, you will accomplish several major things:

  1. Learn more about data types in Python
  2. Learn more about the pandas library
  3. Use pandas to read CSV files
  4. Perform basic data processing, summary and visualization using pandas, matplotlib.pyplot and seaborn
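As a rough preview of items 3 and 4, the sketch below reads a small CSV into a pandas DataFrame, inspects the column data types, extracts some rows, and computes simple summaries. The column names and values are made up for illustration and are not the lab's data; the CSV text is held in a string so the example runs on its own, whereas in the notebook you would typically pass a file path to `pd.read_csv`.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk (assumption:
# these columns and values are invented for this illustration only).
csv_text = """name,year,score
Ana,2019,88.5
Ben,2020,92.0
Cy,2020,79.5
"""

# Read the CSV into a DataFrame (a table with labeled columns).
df = pd.read_csv(io.StringIO(csv_text))

# Each column has its own data type (dtype), inferred while reading.
print(df.dtypes)

# Basic extraction: keep only the rows where year equals 2020.
rows_2020 = df[df["year"] == 2020]
print(rows_2020)

# Basic summaries: the overall mean score and the mean score per year.
mean_score = df["score"].mean()
mean_by_year = df.groupby("year")["score"].mean()
print(mean_score)
print(mean_by_year)
```

Filtering with a boolean condition inside the square brackets, as in `df[df["year"] == 2020]`, is one common way to select rows; the notebook may introduce other approaches.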

Step 1: Retrieve the lab using git

Using your command line, navigate to your stat207 repository (cd Desktop -> cd stat207 -> cd NETID) and fetch the notebook from our release repository by running the following two git commands:

git fetch release
git merge release/lab_02 -m "Merging initial files"

ONLY IF you get an error related to unrelated histories, use:

git merge release/lab_02 --allow-unrelated-histories -m "Merging initial files" 

Step 2: Open the notebook

Open the notebook with the command:

jupyter notebook

Inside of the notebook webpage:

  • Navigate into the folder lab_02
  • Open up the lab_02.ipynb notebook
  • Follow the instructions inside of the notebook

Whenever you are done, save your work by creating a checkpoint (File -> Save Checkpoint in the notebook). Once your work is saved, you can:

  • Use File -> Close and Halt on the notebook
  • Use Quit (in the top-right) on the directory view to completely exit jupyter

Step 3: Submitting your work

When you’re ready to save your work online and/or submit your work, return to the command line and run:

git add -A
git commit -m "submission (or any message here)"
git push origin master

Submitting Your Work

When you have finished working, you should always submit your work, even if you're not quite finished. We will always grade the latest push you made before the due date and ignore everything else, so submitting multiple times is okay and encouraged!

Inside of Jupyter:

  • Click File -> Save Checkpoint to ensure your notebook is saved.
  • Click File -> Close and Halt to exit your notebook.
  • Click Quit (in the top-right) to close the directory view.

After exiting Jupyter, your command prompt will return to accept new commands. Using your command prompt, run:

git add -A
git commit -m "submission (or any message here)"
git push origin master

You can verify that your submission was made by visiting the GitHub web interface: