Linux Command Line Basics
The Linux command line is an extremely important tool to learn, and you will almost certainly see it in the future.
There are many different shells. Only sh is specified by POSIX. Other options, such as BASH and ZSH, have additional features that are non-standard. But that's not a big deal unless you're scripting.
Also, I highly recommend you use the FISH shell for command line (but maybe don't write scripts with it, it's kinda weird).
These notes are by no means comprehensive, just the important stuff.
Common Flags
The following flags are very common amongst most commands:
-v, --verbose
output more info about what the command did
-h, --help
show help for command
For flags of specific commands, it is important to learn how to Google, use the man pages, or use the -h (help) flag. No one actually remembers all the flags; they just figure it out quickly. For example,
$ mv --help
usage: mv [-f | -i | -n] [-hv] source target
mv [-f | -i | -n] [-v] source ... directory
Concepts
Piping
Piping (|) simply feeds the stdout of one command into the stdin of another. Multiple pipes can be strung together to form a "pipeline".
The ls command outputs the files and directories of the specified directory, and grep filters stdin with some regex. So,
ls . | grep "\.json\$"
will show only files that end with .json
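A pipeline can have more than two stages. The sketch below is only illustrative: the file names are made up, and printf stands in for ls so the result does not depend on what is in your current directory.

```shell
# Made-up file names; printf prints one per line, like ls does when piped.
# grep keeps the names ending in .json, then sort orders what is left.
sorted_json=$(printf '%s\n' data.json notes.txt config.json | grep '\.json$' | sort)
echo "$sorted_json"
```

The pattern is in single quotes here, so the $ does not need a backslash.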
Redirecting
The contents of a file can be redirected into the stdin of a command. The stdout (and stderr) of a command can be redirected to a file.
The ls command outputs files and directories in the specified directory. So,
ls -F . > files.txt
will write the output of the ls command into files.txt. That is, files.txt will contain all the files and directories of the current working directory. Note that if anything existed in files.txt before, it will be overwritten. If we use >>:
ls -F . >> files.txt
it does the same as >, except it appends the result of ls -F . to the file. If the file existed before, it won't be overwritten, but added to.
Now, grep filters files OR stdin with some regular expression. So we can redirect files.txt into the stdin of grep:
grep "/\$" < files.txt
will match any line that ends with a slash, i.e. the directories (since -F appends / to them). The $ sign represents the "end" of a line in a regex, and is escaped here because inside double quotes the shell otherwise uses $ to expand variables.
We can also redirect stderr. If we try:
ls doesNotExist > file.txt
an empty file file.txt will be created, and you will see ls: doesNotExist: No such file or directory on the command line. The message isn't in the file because it was written to stderr, not stdout. If we add 2>&1 to the end:
ls doesNotExist > file.txt 2>&1
this will redirect stderr to stdout, which is redirected to file.txt, so the contents of file.txt will be ls: doesNotExist: No such file or directory.
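The redirections above can be tried safely in a throwaway directory. This is a minimal sketch: the file names log.txt and both.txt are made up for illustration, and mktemp -d is assumed to be available (it is on most Linux and macOS systems).

```shell
start_dir=$PWD
tmp=$(mktemp -d)              # throwaway directory so no real files are touched
cd "$tmp" || exit 1

echo "first" > log.txt        # > creates log.txt (or overwrites it)
echo "second" >> log.txt      # >> appends to the end instead
contents=$(cat < log.txt)     # < feeds log.txt to cat on stdin

# stdout and stderr both land in both.txt; ls exits non-zero here,
# so || true keeps a script that uses set -e from stopping.
ls doesNotExist > both.txt 2>&1 || true
captured=$(cat both.txt)

echo "$contents"
echo "$captured"

cd "$start_dir" && rm -r "$tmp"   # clean up the throwaway directory
```

The exact wording of the captured error message differs between systems, but it always names the missing file.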
Globbing
File globbing is where the shell expands a pattern into a "list" of all the files that match it.
- * (aka wildcard) matches any number of characters. Two asterisks (**) match through nested directories (in BASH this requires the globstar option).
- ? matches exactly one character
- [] represents a character class (see regex), and matches exactly one of the characters specified in the brackets
  - ! inside a character class means anything but the characters inside the class
  - [:alnum:] (alphanumeric) - uppercase and lowercase letters, and numbers
  - [:alpha:] uppercase and lowercase letters
  - [:lower:] lowercase letters
  - [:upper:] uppercase letters
  - [:digit:] numbers
  - [:punct:] punctuation, any of ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
  - [:space:] whitespace characters
- {} makes a logical OR. That is, each comma-separated glob-pattern inside these braces will be matched
Suppose we have the filesystem:
folder1
|__ file1.txt
|__ file2.txt
|__ file3.txt
folder2
|__ nested_folder
|   |__ nested_file.cpp
|   |__ nested_file2.cpp
|   |__ file.txt
|__ main.cpp
|__ random.py
__main__.py
file1.txt
file2.txt
program1.py
program2.py
So, if we enter these into the shell:
$ echo *.txt # Files ending in .txt
file1.txt file2.txt
$ echo **/*.cpp # Files ending in .cpp including subdirectories
folder2/main.cpp folder2/nested_folder/nested_file.cpp folder2/nested_folder/nested_file2.cpp
$ echo folder1/file[12].txt # Files in folder1 that are file[1 or 2].txt
folder1/file1.txt folder1/file2.txt
$ echo folder1/file[!12].txt # Files in folder1 that are NOT file[1 or 2].txt
folder1/file3.txt
$ echo program?.py # Files that are program[anything].py
program1.py program2.py
$ echo folder?/*.{py,txt} # Files in folder[anything] that end with .[py or txt]
folder2/random.py folder1/file1.txt folder1/file2.txt folder1/file3.txt
$ echo "folder?/*.{py,txt}" # Quotes disable globbing
folder?/*.{py,txt}
The shell expands the glob BEFORE the command runs, so the command only ever sees the resulting file names. That means the following are equivalent:
$ echo *.txt
$ echo file1.txt file2.txt
$ echo "file1.txt" "file2.txt"
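One way to see that the expansion happens before the command runs is to count the arguments the shell produces. This is a rough sketch in a throwaway directory; it uses set --, which stores its arguments in $1, $2, and so on, and the file names are the ones from the example above.

```shell
start_dir=$PWD
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch file1.txt file2.txt

set -- *.txt        # the shell expands the glob into separate arguments
glob_count=$#       # 2: file1.txt and file2.txt were passed
first_arg=$1        # file1.txt

set -- "*.txt"      # quoted, so no expansion happens
quoted_count=$#     # 1
quoted_arg=$1       # the literal string *.txt

echo "$glob_count $first_arg $quoted_count $quoted_arg"
cd "$start_dir" && rm -r "$tmp"
```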
Variables
Shells are actually programming languages, so you can define your own variables. We won't focus on that though. Just know that you can define your own variables, and that some pre-defined variables already exist.
Variables are accessed with a $ sign. If you don't use a dollar sign, the shell will interpret the name as a plain string. You can also use them within double quotes and use {} to specify where the variable name ends.
The $USER variable is the current user:
$ echo "Hello, $USER" # Quotes disable globbing, but not variables
Hello, [USERNAME]
$ echo Hello, $USER
Hello, [USERNAME]
$ echo "Hello, $USERmonkey" # Since there is no variable named `USERmonkey`
Hello,
$ echo "Hello, ${USER}monkey" # Denote where the variable ends
Hello, [USERNAME]monkey
$ echo 'Hello, $USER' # Single quotes disable variables
Hello, $USER
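Defining your own variable looks like this. A minimal sketch; the names greeting, file, backup and literal are made up for illustration.

```shell
greeting="Hello"        # no spaces around = when assigning
file=report             # quotes are optional when the value has no spaces
backup="${file}_old"    # braces mark where the name ends: report_old
literal='$file'         # single quotes: stays as the text $file

echo "$greeting"
echo "$backup"
echo "$literal"
```

Without the braces, "$file_old" would look up a variable named file_old instead.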
Filesystem
The Unix filesystem (which is MUCH better than the Windows one) has many parts to it. We'll cover the important parts:
- / represents the "root" directory. Everything starts here.
- /usr (often expanded as "Unix system resources") holds user-usable programs and data
  - /usr/bin has executables (binaries), which are usually commands and other apps
- /bin same as /usr/bin, but usually more system-related
- /opt (optional), some installed programs will use this
- /home/[USERNAME] - the home directory of the user. This is where documents, downloads, etc. will go.
UNIX path components are separated by / (much better than \). A single dot (.) stands for the current directory, and two dots (..) mean go back 1 directory.
The path ./././my-folder/./my-folder2/.././../my-folder/../ is the same as going nowhere 👍
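The "going nowhere" claim can be checked with cd and pwd. This sketch assumes my-folder2 lives inside my-folder, and builds that layout in a throwaway directory first.

```shell
start_dir=$PWD
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir -p my-folder/my-folder2    # the path must exist for cd to follow it

before=$(pwd)
cd ./././my-folder/./my-folder2/.././../my-folder/../ || exit 1
after=$(pwd)

echo "$before"
echo "$after"                    # the same directory as before

cd "$start_dir" && rm -r "$tmp"
```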
PATH
When we type ls, how does BASH know what to do? Well, if we run which ls, it gives us /bin/ls. This is where the command is located! It was written in C, and then compiled to an executable, which is stored in /bin. So when you run ls, you are actually running /bin/ls in disguise. This is convenient, since we don't have to type /bin/ls . every time when we could just run ls .
But how does BASH know where to look for the command? Because its directory is in our PATH variable. If you run echo $PATH, you will see a bunch of colon-separated paths. You should see something like /usr/local/bin:/usr/bin:/bin:/usr/sbin somewhere. Each colon-separated path is a directory that BASH searches, in order, when you type a command. In this case, since /bin is in our PATH, BASH looks inside /bin, finds ls, and executes /bin/ls for us.
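The lookup described above can be imitated by hand. This is only a sketch of the idea for POSIX sh or BASH (ZSH does not split $PATH this way), not how the shell is actually implemented; found and resolved are made-up variable names.

```shell
found=""
old_ifs=$IFS
IFS=:                          # split $PATH on colons
for dir in $PATH; do
    if [ -x "$dir/ls" ]; then  # first directory holding an executable ls
        found="$dir/ls"
        break
    fi
done
IFS=$old_ifs

echo "manual search: $found"
resolved=$(command -v ls)      # ask the shell how it resolves ls
echo "shell says:    $resolved"
```

command -v is the POSIX-specified way to ask the shell what a name resolves to. On some systems ls lives in /usr/bin rather than /bin.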
Home
The home directory for a user is where their "stuff" is stored. It's typically /home/[USERNAME], but it can change depending on the distribution. However, it's always aliased with a tilde (~). So your documents would be at ~/documents, and downloads at ~/downloads.
Commands
- cat (concatenate) - output contents of a file to stdout
- cd (change directory)
- cp (copy) - copies file(s)
- find - recursively looks for files, including those in subdirectories
- ls (list) - lists files and directories only at the level specified, without going into subdirectories
- mkdir (make directory) - creates a directory, use the -p (parent) flag to create nested directories
- mv (move) - moves file(s)
- rm (remove) - deletes file(s), use the -r flag to remove directories. BE CAREFUL, permanent deletion
- touch - creates file(s)
cat
cat stands for concatenate. It outputs the contents of the files to stdout.
- cat [filename] will print the contents of [filename]
- Multiple files can be printed at once, cat [file1] [file2]
cd
cd stands for change directory. It changes your current working directory.
- cd ~/Documents will change my current working directory to my Documents folder.
cp
cp stands for copy. It copies files.
- cp [file] [destination] copies [file] to [destination]
- cp -r [directory] [destination] recursively copies all the contents of [directory] to [destination]
- cp -rv [directory] [destination] also logs the files that were copied and where they were copied to
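A quick way to try cp without risking real files is a throwaway directory. In this sketch the names src, a.txt, copy.txt and backup are made up for illustration.

```shell
start_dir=$PWD
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir src
echo "data" > src/a.txt

cp src/a.txt copy.txt      # copy a single file
cp -r src backup           # backup does not exist yet, so it becomes a copy of src

copied=$(cat copy.txt)
backup_files=$(ls backup)
echo "$copied"
echo "$backup_files"

cd "$start_dir" && rm -r "$tmp"
```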
find
ls on steroids. find will recursively find and print every file and folder in a directory.
- find [dir] will (eventually) output all files and folders in [dir] (including those in subdirectories) to stdout
- find [dir] -name "[glob_pattern]" will find all files and folders in [dir] (including subdirectories) whose name matches the glob pattern [glob_pattern]. Make sure the pattern is quoted, otherwise it will be expanded by the shell.
- find [dir] | grep [regex] will find all files and folders in [dir] (including subdirectories) and pipe them into grep to be filtered with the regular expression [regex]. This is useful if you want to match more sophisticated patterns not supported by globbing.
Let's say my PWD, project, looks like this:
folder1
|__ file1.txt
|__ file2.txt
folder2
|__ nested_folder
|   |__ nested_file.cpp
|__ main.cpp
__main__.py
file1.txt
file2.txt
then find . will output something like:
.
./folder1
./folder1/file1.txt
./folder1/file2.txt
./folder2
./folder2/nested_folder
./folder2/nested_folder/nested_file.cpp
./folder2/main.cpp
./__main__.py
./file1.txt
./file2.txt
As you can see, this can get out of hand quickly. The most common use for find is to find the location of a specific file. We can use grep, or the -name flag.
find . | grep txt
or find . -name "*txt*"
will output:
./folder1/file1.txt
./folder1/file2.txt
./file1.txt
./file2.txt
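Both approaches can be tried on a small throwaway tree. This is a minimal sketch with a made-up layout loosely based on the one above; sort is added only to make the order predictable, since find does not promise one.

```shell
start_dir=$PWD
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir -p folder1 folder2/nested_folder
touch file1.txt folder1/file2.txt folder2/nested_folder/nested_file.cpp

# Quoted glob: find does the matching, not the shell.
txt_matches=$(find . -name "*.txt" | sort)
# Regex through grep, for patterns a glob cannot express.
cpp_matches=$(find . | grep '\.cpp$')

echo "$txt_matches"
echo "$cpp_matches"

cd "$start_dir" && rm -r "$tmp"
```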
ls
ls stands for list. It lists the files and directories within the specified directory, only on the "top level".
- ls [directory] shows the files and directories in the specified directory
- ls [file] shows the filename
- ls -FG [file or directory] colourizes and adds an indicator to specify what you're looking at (note if you're using a shell like FISH, this is automatically added). -G is the colour flag on macOS/BSD; GNU ls uses --color instead.
  - / after a name to indicate a directory
  - * to indicate an executable
  - @ to indicate symbolic link
Using the same PWD structure as before:
folder1
|__ file1.txt
|__ file2.txt
folder2
|__ nested_folder
|   |__ nested_file.cpp
|__ main.cpp
__main__.py
some_executable
file1.txt
file2.txt
$ ls .
folder1 folder2
__main__.py some_executable
file1.txt file2.txt
$ ls -F .
folder1/ folder2/
__main__.py some_executable*
file1.txt file2.txt
$ ls -F folder2
nested_folder/ main.cpp
mkdir
mkdir stands for make directory.
- mkdir [dir1] [other dirs...] creates a directory (folder) called [dir1] and any other optionally specified directories
- mkdir -p [dir1]/[dir2] creates a directory dir1 AND creates dir2 inside of it. -p stands for parent, and allows you to create nested directories.
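A short sketch of -p in a throwaway directory; the path project/src/utils is made up for illustration.

```shell
start_dir=$PWD
tmp=$(mktemp -d)
cd "$tmp" || exit 1

mkdir -p project/src/utils   # creates project, project/src and project/src/utils
mkdir -p project/src/utils   # with -p, an existing directory is not an error
created=$(find project | sort)
echo "$created"

cd "$start_dir" && rm -r "$tmp"
```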
mv
mv stands for move. It moves files to a new destination.
- mv [fileOrDir1] [fileOrDir2] ... [dest] moves each of the files or directories to [dest]
- mv -v [fileOrDir1] [fileOrDir2] ... [dest] will also print what files were moved and their destination
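mv is also how files get renamed: when the destination is not an existing directory, the file simply takes the new name. A minimal sketch with made-up names:

```shell
start_dir=$PWD
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch draft.txt
mkdir archive

mv draft.txt final.txt     # destination is not a directory: this is a rename
mv final.txt archive       # destination is a directory: the file moves into it

archived=$(ls archive)
leftover=$(ls)
echo "$archived"
echo "$leftover"

cd "$start_dir" && rm -r "$tmp"
```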
rm
rm stands for remove. It deletes files and directories.
- rm -rf [fileOrDir] ... removes all specified files. The -r recursively removes the contents of all specified directories, and the -f (force) skips confirmation prompts and ignores files that don't exist.
rm is permanent and irreversible. Use caution. You may use the -i option to ask for confirmation before deleting each file, but this isn't very helpful since you end up spamming yes mindlessly.
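Since rm is irreversible, it is worth practising somewhere harmless. This sketch only ever deletes things inside a directory it has just created with mktemp -d; the file names are made up.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/build/cache"
touch "$tmp/notes.txt" "$tmp/build/cache/a.o"

rm "$tmp/notes.txt"          # plain rm deletes a file
rm -r "$tmp/build"           # -r is needed for a directory and its contents
remaining=$(ls -A "$tmp")    # empty: nothing is left
rmdir "$tmp"                 # rmdir only removes directories that are empty

echo "remaining: [$remaining]"
```

Reaching for rmdir when a directory should already be empty is a small safety net, because it refuses to delete anything that still has contents.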
touch
touch creates a file with the specified name. If the file already exists, it only updates its modification time.
- touch [file1] [file2...] creates the specified files
Other Commands
- curl, wget - make HTTP (and other) requests
- echo, printf - print to console. Echo adds newline, printf allows formatting and special characters
- grep (global regular expression print) - filters input and output with some regular expression
- info, man - gives information about a given command
- which, whereis - gives useful information about the location of a command
curl, wget
curl (cURL) and wget make requests to the world wide web.
- curl -L [url] will grab the contents of the webpage at [url], following any redirects (-L)
echo, printf
echo and printf both print arguments to stdout. However, echo automatically adds a newline to the end, whereas printf does not. Also, printf allows for formatting (hence the name). printf also interprets escape characters (such as "\n", the newline character), whereas this can only be achieved using the (non-standard) -e flag with echo.
- echo [string] outputs [string] to stdout
- printf [format] [args] outputs the string [format], with the values in [args] inserted in the correct locations
var1="hello"
var2=world
var3=420
echo "$var1 $var2 $var3" # hello world 420
printf "%s %s %i\n" $var1 $var2 $var3 # hello world 420
printf "%s %s %X" $var1 $var2 $var3 # hello world 1A4⏎ (no newline)
The format specifiers are the same as those of printf in C.
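A few more specifiers, as a small sketch. The variable names are made up, and the results are captured with $( ) so they can be reused.

```shell
name="world"
count=3
line=$(printf '%s has %d items' "$name" "$count")   # world has 3 items
padded=$(printf '%05d' 42)                          # 00042 (zero-padded to width 5)
hex=$(printf '%X' 420)                              # 1A4
# With more arguments than specifiers, the format is reused for each one.
list=$(printf '%s\n' a b c)

echo "$line"
echo "$padded"
echo "$hex"
echo "$list"
```

Quoting the arguments ("$name") keeps values that contain spaces together as one argument.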
grep
grep stands for global regular expression print. It filters some input and only outputs lines that match the given regular expression.
- grep [pattern] [files...] outputs the lines in [files] which match [pattern]
- grep [pattern] will take some input from stdin and output the lines that match [pattern] to stdout
- grep -E [pattern] [files...] gives extended regex functionality. Equivalent to the (non-standard) command egrep
Suppose we have a file emails.txt with:
info@pust.co
privacy@innersloth.com
fnord@mail.com
foo@bar.org
piss@gmail.com
Note that grep colours the matched part of each line
$ grep ".co" emails.txt
info@pust.co
privacy@innersloth.com
fnord@mail.com
piss@gmail.com
$ cat emails.txt | grep "^p"
privacy@innersloth.com
piss@gmail.com
$ grep ".[a-z]\{3\}" emails.txt | grep "mail\.com" # In basic regex, interval braces are written \{ \}
fnord@mail.com
piss@gmail.com
Not quoting [pattern] could result in unwanted globbing
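The difference -E makes is easiest to see with alternation. A minimal sketch, using three of the addresses above written to a throwaway file:

```shell
tmp=$(mktemp -d)
printf '%s\n' info@pust.co foo@bar.org fnord@mail.com > "$tmp/emails.txt"

# Basic regex: an escaped dot, then com at the end of the line.
com_only=$(grep '\.com$' "$tmp/emails.txt")
# -E allows ( ) for grouping and | for "or" without backslashes.
co_or_org=$(grep -E '\.(co|org)$' "$tmp/emails.txt")

echo "$com_only"
echo "$co_or_org"
rm -r "$tmp"
```

Single quotes keep the shell from touching $ or the parentheses before grep sees the pattern.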
info, man
man stands for manual. It displays the manual page for a given command if available. info displays even more information than man. Usually, man is sufficient.
- man [command] will display documentation for [command] if it exists on the system
which, whereis
which shows where a command is located. whereis searches for not only the binary file of the command, but the source and manual pages as well.
- which [command] shows the location of [command]
- whereis [command] shows the location of the binary, source, and manual pages of [command]
$ which ls
/bin/ls
$ whereis ls
ls: /bin/ls /usr/share/man/man1/ls.1