STAT207 - Data Science Exploration
Spring 2024 - Ellison

Data Science Tools Setup: Python and Git


Data Science Setup for Windows

Data Science requires a few tools to help us discover interesting features in our data. We will primarily use two tools and several libraries within each of these tools. The two tools are:

  • Python, a simple programming language (it lets the computer do the work for us)
  • git, a distributed version control system/repository tool (it is the technology behind “GitHub”)

Both tools are free (and open-source), and each takes just a few minutes to install!

Installing Python

You will need Python 3.6 (or later). We will first check whether you already have Python (for example, from previous data science work) and install it if you don’t.

Checking for existing Python

  1. Open your command prompt.
  2. Type python --version and press Enter.
  • If you see Python 3.7.1 (or similar, 3.6 or later), you are all set – no need to install Python. (Skip to the git section.)
  • If you see “'python' is not recognized as an internal or external command, operable program or batch file.”, Python is not installed; install it now:

Installing Python

  1. Visit https://conda.io/miniconda.html to get Miniconda, a lightweight installer that includes Python and the conda package manager.
  2. Download and run the latest Windows 64-bit installer for the latest version of Python (e.g., 3.7).
  3. After the install finishes, exit your command prompt, re-launch it, and verify the install by repeating the steps under "Checking for existing Python" above.
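
Once Python is installed, the version requirement above (3.6 or later) can also be checked from inside Python itself. The following is a minimal sketch, not part of the official course setup; the names MINIMUM and meets_minimum are made up for illustration.

```python
import sys

# Course minimum stated above: Python 3.6 or later.
MINIMUM = (3, 6)

def meets_minimum(version_info=sys.version_info, minimum=MINIMUM):
    """Return True if the (major, minor) part of version_info is at least minimum."""
    return tuple(version_info[:2]) >= minimum

if __name__ == "__main__":
    # sys.version starts with the version number, e.g. "3.7.1 (default, ...)".
    print("Running Python", sys.version.split()[0])
    print("Meets course minimum:", meets_minimum())
```

Save this as a file (for example check_python.py, a hypothetical name) and run it with python check_python.py from the command prompt.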

 

Installing Git

Any modern version of git works. We will first check if you have git and install it if you don’t already have it.

Checking for git

  1. Open up your command prompt
  2. Type git --version and press Enter.
  • If you see git version ... (or similar), you are all set – no need to install git! (You’re done!)
  • If you see “'git' is not recognized as an internal or external command, operable program or batch file.”, git is not installed; install it now:

Installing git

  1. Visit https://git-scm.com/downloads to get git, a distributed version control system/repository tool
  2. Download and run the latest Windows installer. (You should not need to change any of the preselected options during the installation process; just keep clicking Next.)
  3. After the install finishes, exit your command prompt, re-launch it, and verify it installed by following the steps above.
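
If you already have Python working, the same git check can be run from Python instead of typing the command by hand. This is only a sketch under that assumption; the function name tool_version is hypothetical, and it uses only the standard library.

```python
import shutil
import subprocess

def tool_version(command="git"):
    """Return the output of `<command> --version`, or None if the command is not on the PATH."""
    if shutil.which(command) is None:
        return None
    # stdout/stderr pipes plus universal_newlines keep this compatible with Python 3.6.
    result = subprocess.run(
        [command, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # Some tools print their version to stderr, so fall back to it if stdout is empty.
    return (result.stdout or result.stderr).strip()

if __name__ == "__main__":
    version = tool_version("git")
    print(version if version is not None else "git was not found; install it using the steps above")
```

A return value of None corresponds to the "not recognized" case above; any other value is the text git printed.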

Data Science Setup for Mac OS X

Data Science requires a few tools to help us discover interesting features in our data. We will primarily use two tools and several libraries within each of these tools. The two tools are:

  • Python, a simple programming language (it lets the computer do the work for us)
  • git, a distributed version control system/repository tool (it is the technology behind “GitHub”)

Both tools are free (and open-source), and each takes just a few minutes to install!

Installing Python

You will need Python 3.6 (or later). We will first check whether you already have Python (for example, from previous data science work) and install it if you don’t.

Checking for existing Python

  1. Open your command prompt (the Terminal app).
  2. Type python --version and press Enter.
  • If you see Python 3.7.1 (or similar, 3.6 or later), you are all set – no need to install Python!
  • If you see an error or Python 2.7, install Python 3 now:

Installing Python

  1. Visit https://conda.io/miniconda.html to get Miniconda, a lightweight installer that includes Python and the conda package manager.
  2. Download the latest Mac OS X 64-bit bash installer for the latest version of Python (e.g., 3.7).
  3. Open your command prompt (Terminal) and run the installer script using one of the approaches below:

One approach that might work
  • Run the following in your command line.
    cd Downloads
    bash Miniconda3-latest-MacOSX-x86_64.sh
  • You will need to press q to exit the license screen; all default options are fine.
  • Restart your terminal

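If that approach doesn't work, try this
  • Run the following in your command line. (The URL below is for the Apple Silicon, arm64, installer; on an Intel Mac, use the x86_64 file name from the first approach instead.)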
    mkdir -p ~/miniconda3

    curl https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh -o ~/miniconda3/miniconda.sh

    bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3

    rm -rf ~/miniconda3/miniconda.sh

  • After installing, initialize your newly installed Miniconda. The following commands initialize it for the bash and zsh shells.
    ~/miniconda3/bin/conda init bash
    ~/miniconda3/bin/conda init zsh
  • Restart your terminal
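
The two approaches above use different installer files (x86_64 and arm64). If some version of Python is already available on your Mac, the sketch below shows one way to see which file matches your processor. The names INSTALLERS and installer_for are made up for illustration, and the result is only a hint: a Python running under Rosetta translation may report x86_64 even on an Apple Silicon Mac.

```python
import platform

# Installer file names taken from the two approaches above, keyed by processor type.
INSTALLERS = {
    "arm64": "Miniconda3-latest-MacOSX-arm64.sh",    # Apple Silicon Macs
    "x86_64": "Miniconda3-latest-MacOSX-x86_64.sh",  # Intel Macs
}

def installer_for(machine=None):
    """Return the matching installer file name, or None for an unrecognized machine type."""
    if machine is None:
        # platform.machine() reports the processor type of the running Python.
        machine = platform.machine()
    return INSTALLERS.get(machine)

if __name__ == "__main__":
    print("Processor type:", platform.machine())
    print("Suggested installer:", installer_for())
```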

Installing Git

Any modern version of git works. We will first check if you have git and install it if you don’t already have it.

Checking for git

  1. Open your command prompt (the Terminal app).
  2. Type git --version and press Enter.
  • If you see git version ... (or similar), you are all set – no need to install git!
  • If you see an error, install git now:

Installing git

  1. Visit https://git-scm.com/downloads to get git, a distributed version control system/repository tool
  2. Download and install the latest Mac OS X installer.
  3. After the install finishes, verify it installed by following the steps above.
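
As a final checklist, both tools can be looked up on your PATH in one go. This is a small sketch assuming Python is already working; missing_tools is a hypothetical helper name, and it only checks that each command can be found, not which version it is.

```python
import shutil

def missing_tools(names=("python", "git")):
    """Return the names from `names` that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]

if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Still missing:", ", ".join(missing))
    else:
        print("python and git were both found.")
```

An empty result means both commands were found; anything listed still needs to be installed using the steps above.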