Introduction to the Command Line
We will use the console part of JupyterLab for this section. You’ll want to start in a new window.
This module is meant to be a broad and very basic introduction to the command line, and you will learn enough to be functional. We have modules that provide more detail, but they require a Unix-based computer (like a Mac), whereas this module uses a web interface: JupyterLab. We do this because many users will be on Windows machines and may not have access to a Unix- or Linux-based computer. Begin by opening a new JupyterLab window and choosing the terminal option from the dropdown.
You should see something similar to what is shown below.
On the left above is a file structure with several starter files in it. On the right is the terminal that we will use as our learning environment.
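As a first thing to try, the two commands below are a minimal sketch of how the terminal relates to the file panel. It assumes the terminal opens in the same folder that the file panel is showing, which may not be the case in every JupyterLab setup.

```shell
# Print the directory the terminal is currently in.
pwd

# List the files and folders in that directory; these should match
# the starter files shown in the file panel if both point to the same place.
ls
```

If the listing does not match the file panel, the terminal has most likely started in a different directory.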
A Comparison: Windows over 20 years
Windows has seen many different graphical user interfaces (GUIs) over the years. The key innovation of Windows was to remove the need for shell-level computing. However, much of the scripting and the ability to manipulate, wrangle, edit, and work with text was lost along the way. Bioinformatics tends to need these tools, and they are generally found in the various flavors of ’nix. There are ways to get a ’nix environment within Windows, such as Cygwin or a virtual machine (VM), but we generally don’t discuss those in this material.
High-level: What is Unix/Linux/Bash/Command-line?
These terms are used loosely, often in ways that do not reflect their technical meaning. If I said that this course teaches Windows, for example, most people would have something in mind. Technically, however, Windows could be Windows 3.1, 95, 98, ME, XP, and so forth. The same is true here, only more extreme. For most beginners, learning the command line is generally the same as learning Bash (the most common shell environment), and the same as learning to use Linux/Unix computers. Down the road there are important distinctions, but to keep all of this accessible to new users, we will not go down these rabbit holes.
Unix, Linux, and Command-line For Bioinformaticians
Overall, this module was originally developed for the command line and is being ported over to Jupyter. We still have command-line instructions, and at some level it’s good to see those, since so many recipes presume the command line. That said, it should always be straightforward to adapt these to Jupyter.
UNIX is an operating system that was developed in the 60s, and the name broadly refers to a set of programs that all work the same way at the command line. They have the same feel and the same philosophy of design. Strictly speaking, UNIX was a specific operating system owned by AT&T; these days, however, the name refers to systems that all follow a common framework. There are many types of Unix, including macOS (formerly Mac OS X), Linux, and Solaris. Each is essentially a different code base, owned by a different company or group, that implements the common Unix framework. macOS is owned and developed by Apple. Solaris was developed by Sun and is now owned by Oracle. Linux is open source, built by a community led by Linus Torvalds, and was originally meant to work on x86 PCs. The x86 refers to a type of CPU architecture used across most personal computers (PCs and, for many years, Macs). If I log into a Unix machine from 1980, 1990, 2000, 2010, or 2017, it will often feel and work the same. By comparison, Windows is not Unix. Even at the Windows command line, everything is different and has changed over the years.
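To check which flavor of Unix a terminal is running on, a quick sketch is to ask for the kernel name with uname. This assumes a Unix-like terminal, such as the one in JupyterLab or Terminal on a Mac.

```shell
# Print the name of the operating system kernel.
# On a Linux machine this typically prints "Linux"; on a Mac it prints "Darwin",
# the name of the Unix core underneath macOS.
uname -s
```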
Command-line shells are started from a terminal program. Every Mac has Terminal preloaded. Start it up and you’ll see a prompt from the shell. The shell is actually a program that responds to you, and you can change its look and feel. Most people like the shell called bash. Starting with macOS Catalina, zsh is the recommended (and default) shell. However, 15 years ago, tcsh was more common. There are others, like c-shell.
Bash is pretty handy: the up arrow takes you to the previous command, and you can press Tab to autocomplete. An important thing to know is that when bash starts, .bash_profile is executed for login shells, while .bashrc is executed for interactive non-login shells. We can store a lot of settings in these files. Settings for the shell are also called environment variables. You can see some examples by typing ‘echo $HOME’, where echo simply prints the value of the variable. $PATH is really important: it lists the directories the shell searches for programs, so a program in one of those directories can be run by name. Otherwise, you would need to give the full path to its location.
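The commands below are a small sketch of how to inspect these settings from the terminal. The variable name MY_PROJECT and its value are made up for illustration; the other variables are standard, although their values will differ from machine to machine.

```shell
# Show which shell is set as your login shell (for example, /bin/bash).
echo "$SHELL"

# Print the HOME variable: the path to your home directory.
echo "$HOME"

# Print PATH: a colon-separated list of directories searched for programs.
echo "$PATH"

# Ask the shell where it finds the program 'ls'.
command -v ls

# Define an environment variable of your own and print it.
# MY_PROJECT is a hypothetical name used only for this example.
export MY_PROJECT="intro_module"
echo "$MY_PROJECT"
```

A variable set this way lasts only for the current terminal session. Putting the export line in .bashrc or .bash_profile is the usual way to have bash set it every time it starts.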
There are many different resources for learning the command line on Linux/Unix-based systems. Typically, a user may need to know 20 to 40 commands, with less being common. We have provided a cheat sheet below, and we link to some provided by others. All of the commands have lots of options, and you can learn about them by typing ‘man [command]’ (‘man grep’, for example). However, most people just google Linux command options. It is important to know that there are thousands of Linux commands, but most people only remember a small subset that is specific to their field. Some example resources are: