Introduction to Python
High-level: What is Python?
Python is a highly useful language when it comes to data science.
- Python is an interpreted, general-purpose programming language that is generally considered easy to read. Interpreted means that you do not need to compile the code to machine language as a separate step before running it.
- Python has an extensive set of libraries that can be added, giving it considerable utility.
- You may hear of versions of Python such as Python 2 and Python 3. As of January 1st, 2020, Python 2 is no longer supported. Certain parts of Python 2 syntax do not work in Python 3, so Python 3 is considered not backward compatible with Python 2.
- Python is object-oriented. This is a programming paradigm based on the concept of objects, which contain data (like a box) along with methods that operate on that data. You will work with objects such as variables or datasets, to which you can apply functions and which you can manipulate to reach the desired end product.
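As a minimal sketch of the "box" idea above, the built-in list is an object that holds data and carries methods that act on it. The variable name scores and its values are made up for illustration.

```python
# A list is an object: it holds data and comes with methods that act on that data.
scores = [3, 1, 2]      # hypothetical example data

scores.append(5)        # method that adds an item to the end of the list
scores.sort()           # method that sorts the list in place

print(scores)           # [1, 2, 3, 5]
print(type(scores))     # <class 'list'>
```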
Getting Used to Our Development Environment
Jupyter is a graphical interface for developing and writing Python code, and it can be used for many other languages such as R and Scala. A notebook is a collection of scripts/code, usually doing a single analysis. JupyterLab is a place for storing and managing many notebooks.
About Jupyter Lab
A snapshot is shown below. The left frame shows a file manager through which you can manage and access projects and relevant datasets. The middle frame contains the notebook, which is an editable document. Notebooks are a way of putting code and text together in a way that looks like a web page, with the text portions written in Markdown. The cool thing about notebooks is that they can be made very intuitive: you can describe, in plain English, what you are doing and refer to these notes later on. These files have the file extension ‘.ipynb’, which stands for IPython notebook.
A notebook consists of multiple cells. The box, shown below, is a cell. You can change the mode of a cell so that it is interpreted as either code or Markdown using the toggle at the top. Each block of code in the notebook can be run by pressing the forward arrow on the cell.
You can see there are a lot of different types of files that can be created, and we discuss these in other modules. For this exercise, we largely just focus on the
Running Python code
To run simple code that prints the word “Hi”, type print("Hi") in a cell and press the “Run” button at the top. The word “Hi” is displayed. Jupyter allows you to type small blocks of code and run them individually using the “Run” button at the top.
Accessing the underlying operating system:
The nice thing about Jupyter is that we have access to the underlying command line and filesystem. Beginning a line with “!” tells Jupyter to interpret that line as the command line would.
Specific commands can be used to access the underlying operating system. For example, pwd prints the current working directory. These are Bash commands, and they may come in handy when you are working in your notebook. I’ve linked the Bash module below for your reference.
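As a rough sketch of the same idea, the shell form in a notebook cell would be !pwd, which only works inside Jupyter. The plain-Python equivalent below uses the standard os module instead; the variable name current_dir is just an illustrative choice.

```python
import os

# In a notebook cell, the shell form would be:  !pwd
# The plain-Python equivalent below works in any Python environment.
current_dir = os.getcwd()
print(current_dir)
```

The printed path depends on where the notebook was started, so it will differ from machine to machine.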
Building a new notebook
Once we are comfortable, let’s go ahead and create a new notebook that we will use for these exercises.
Begin by clicking File > New Notebook > Python 3.
This starts a binder where we can begin editing and creating code. To create commentary or notes, you’ll need to switch from
We then create a header using standard Markdown syntax. The “#” specifies that this is a heading.
To see it in a more rendered view, click
Some resources for learning a few more details
There are tremendous tools for learning Jupyter via other platforms as well! I’ve linked a few down here:
Getting Started In Python Using Jupyter
Please start an interactive environment by clicking the link below. Once you do so, you can navigate to the home page to begin the next section!
Syntax: The Rules of the Language
When using Python, it is important to first understand the basic syntax or rules of the language. These are elements that you will use when writing any code in Python.
In many other languages, indentation is more of a visual choice that lets different segments be seen as separate. In Python, this is not the case: indentation is part of the syntax and must be correct for the code to run.
To explain: if you have a conditional statement, the statements that must be carried out when the condition is true must be indented one level (four spaces by convention). If this isn’t followed, an error will let you know that the code was expecting an indent. The same goes for indenting when it isn’t required: that also raises an error and prevents the code from running.
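The indentation rule can be sketched as follows. The variable names and values are made up for illustration, and compile() is used in the second half only so that the indentation error can be caught and displayed instead of stopping the cell.

```python
temperature = 30  # hypothetical value for illustration

if temperature > 25:
    # This indented line runs only when the condition is true.
    message = "warm"
else:
    message = "cool"

print(message)

# Code with a missing indent is rejected before it runs.
# compile() is used here only so the error can be caught and shown.
bad_code = "if True:\nprint('no indent')"
try:
    compile(bad_code, "<example>", "exec")
    error_name = None
except IndentationError as err:
    error_name = type(err).__name__

print(error_name)
```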
Commenting is key
Commenting is an element that you will find in many programming languages. In Python, anything on a line after ‘#’ is regarded as a comment, so a line that begins with ‘#’ is entirely a comment. This tells the interpreter that the text is not Python code. Comments are essentially a way for you to leave notes in plain English (or any language of your choice).
I like to use comments for a couple of things. Firstly, I use them as notes to help me and anyone else that might read my code understand what a line of code does. Secondly, I use comments to separate longer scripts into sections. They function as headings for each new section.
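Both uses of comments can be sketched in a short script. The section labels, variable names, and numbers here are invented for illustration only.

```python
# --- Section 1: set up the inputs ---
radius = 2.0          # hypothetical radius, in metres
pi_approx = 3.14159   # rough value of pi, good enough for this example

# --- Section 2: compute the result ---
# Area of a circle is pi times the radius squared.
area = pi_approx * radius ** 2

print(area)
```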
In Python, single, double and triple quotes are recognized. Strings can be enclosed in any of these to be interpreted in their literal sense. A string can span multiple lines when it is enclosed in triple quotes.
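A small sketch of the three quoting styles, with made-up example strings:

```python
single = 'Hi'
double = "Hi"
print(single == double)   # True: both styles produce the same string

# Using one quote style lets the other appear inside the string.
sentence = "It's easy to read"
print(sentence)

# Triple quotes let a string span several lines.
multi = """first line
second line"""
print(multi)
```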