I have a love/hate relationship with Bash.
It’s a quirky little language with many subtleties that make it all too easy to make mistakes. It’s not uncommon for programmers writing Bash scripts to discover bugs in their scripts related to a variety of things, including getting some obscure syntax wrong, using quotes incorrectly, platform or shell compatibility issues, and mishandling errors.
On the other hand, Bash is ubiquitous, and it’s a great “glue language” for composing CLI commands together. The Unix philosophy is chock full of good ideas that keep Bash scripts simple, easy to understand, and easy to write (once you know your way around the quirks). Pipes are a simple but brilliant idea: you can use a single | character to pass the output of one command as input to the next, and loads of CLI utilities are designed to participate in these pipelines.
As great as these strengths of Bash are, the aforementioned quirks can be an obstacle, standing between you and a safe, working script. So, let’s talk about 10 things that are quirky about Bash and how to live with them.
The quirks
- Quirk 1: Shebang lines
- Quirk 2: Variable definition syntax
- Quirk 3: Single vs. double quotes
- Quirk 4: Unbound variables
- Quirk 5: stdout and stderr
- Quirk 6: > vs. >>
- Quirk 7: Comparing strings and numbers
- Quirk 8: Passing arguments through
- Quirk 9: Bailing out if there is an error
- Quirk 10: Handling errors in a pipeline
Quirk 1: Shebang lines
At the top of most shell scripts is a special line that tells your computer which program to use to run the script.
You’ll sometimes see the shebang line point to /bin/bash (don’t do this!):
This only works if you happen to have Bash installed at /bin/bash. Depending on the operating system and distribution of the person running your script, that might not necessarily be true!
It’s better to use env, a program that finds an executable on the user’s PATH and runs it. env is almost always located in the /usr/bin folder, so we can assume that /usr/bin/env is there for us to use.
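As a minimal sketch, a script that uses env in its shebang line might start like this. The message is just a placeholder body that I made up for illustration:

```bash
#!/usr/bin/env bash
# env looks up "bash" on the PATH, so this line works wherever bash is
# installed, unlike a hard-coded #!/bin/bash.

message="Hello from Bash!"   # placeholder body for illustration
echo "$message"
```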
You can also use env to run other interpreters, such as Ruby:
And Python:
After using chmod to mark a script like this as executable, you can run it directly in your shell:
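Here is a rough, self-contained sketch of that workflow. The file name hello.sh and its contents are hypothetical, and the script is written to a temporary directory so that nothing is left behind:

```bash
# Create a throwaway script in a temporary directory (hypothetical name: hello.sh).
workdir="$(mktemp -d)"
cat > "$workdir/hello.sh" <<'EOF'
#!/usr/bin/env bash
echo "Hello from a script!"
EOF

# Mark the script as executable, then run it directly by its path.
chmod +x "$workdir/hello.sh"
greeting="$("$workdir/hello.sh")"
echo "$greeting"

# Clean up the temporary directory.
rm -rf "$workdir"
```

In day-to-day use you would simply run chmod +x on your own script once and then invoke it as ./hello.sh (or whatever you named it).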
In the examples below, I will leave off the #!/usr/bin/env bash line for the sake of brevity, but you should always make this the very first line of every Bash script you write!
Quirk 2: Variable definition syntax
Unlike in most other programming languages, when you define a variable in Bash, you must not include spaces around the equals sign.
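A small sketch of the difference, using a made-up variable called name:

```bash
# Correct: no spaces on either side of the equals sign.
name="Dave"
echo "Hello, $name!"

# Incorrect (left commented out): with spaces, Bash would try to run a
# command called "name" with the arguments "=" and "Dave".
# name = "Dave"
```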
Quirk 3: Single vs. double quotes
A lot of bugs in Bash scripts are related to quoting, so it’s valuable to understand how quoting works in Bash.
Inside of double quotes, a dollar sign ($) can be used to insert a variable value into a string:
If your intention is to use a literal dollar sign, you must remember to escape it with a backslash:
Or you can use single quotes, which don’t expand variables:
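The three quoting behaviors described above can be compared side by side in a short sketch. The variable names and strings here are my own, chosen only for illustration:

```bash
name="Dave"  # hypothetical value for illustration

double_quoted="Hello, $name"      # $name is expanded inside double quotes
escaped="The price is \$5"        # the backslash keeps the dollar sign literal
single_quoted='Hello, $name'      # single quotes never expand variables

echo "$double_quoted"   # Hello, Dave
echo "$escaped"         # The price is $5
echo "$single_quoted"   # Hello, $name
```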
Quirk 4: Unbound variables
You can trust most language runtimes to explode if you mistype the name of a variable. That’s what you want to happen! For example, if you attempt to run this Ruby program:
The Ruby interpreter helpfully returns a non-zero exit code to indicate failure, and the error message makes it clear what you did wrong:
However, in Bash, the default behavior is to treat every undefined variable as if its value is an empty string. So if you run this Bash program:
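A minimal sketch of such a program, assuming a variable called name and a deliberate typo (nmae) when reading it back:

```bash
name="Dave"                # hypothetical value for illustration
greeting="Hello, $nmae!"   # typo: nmae was never defined, so it expands to ""
echo "$greeting"           # prints: Hello, !
```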
The Bash interpreter returns the exit code 0 (indicating success) and prints the following:
This behavior makes it way too easy for your Bash scripts to have subtle bugs where a variable that you thought had a value turns out to be an empty string, all because you mistyped the variable name.
On rare occasions, it is actually useful for Bash to treat undefined variables as empty strings, but I won’t get into that here.
The good news is that there is an option (set -u) that you can enable to make Bash behave more like other programming languages and throw an error in this kind of situation:
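Here is one way to see the effect, sketched under the assumption that the same typo (nmae instead of name) is present. I run the snippet in a child bash process so that the failure can be observed without ending the current shell:

```bash
# Run the buggy snippet in a child bash process and record its exit status.
status=0
bash -c '
  set -u
  name="Dave"            # hypothetical value for illustration
  echo "Hello, $nmae!"   # typo: nmae is unbound, so bash stops here
' || status=$?

echo "child exited with status $status"
```

In a real script you would just put set -u near the top; the child process is only there to make the non-zero exit status visible.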
Running the above Bash program results in an exit code of 1 (indicating failure) and prints the following:
Quirk 5: stdout and stderr
Unlike functions in other languages, Bash functions don’t really have concrete return values. Instead, Bash implements an idea from the Unix philosophy that the output of any program can be the input of another. The standard I/O is a data stream, which, for practical purposes, we can think of as lines of text.
So, in Bash, a function, command or process does not really return a value; it prints something to standard out (stdout). Other functions or commands can read that output and do other things with it, like print it out, capture it in a file or variable, pipe it into some other process, etc.
For example, the ls command lists the names of files in a particular directory. It prints each file name to stdout:
You can use | to pipe stdout into another process or command. The grep command can be used to filter out any lines of text that don’t contain a particular string. So, you could find only recipes with “broccoli” in the file name by piping the stdout of the ls command above into grep:
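To make this concrete, here is a sketch that sets up a scratch directory first. The recipe file names are invented for the example:

```bash
# Set up a scratch directory with hypothetical recipe files.
recipes="$(mktemp -d)"
touch "$recipes/broccoli-soup.txt" "$recipes/pancakes.txt" "$recipes/beef-and-broccoli.txt"

# ls prints one file name per line when its stdout is a pipe;
# grep keeps only the lines that contain "broccoli".
matches="$(ls "$recipes" | grep broccoli)"
echo "$matches"

# Clean up the scratch directory.
rm -rf "$recipes"
```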
There is another standard data stream called standard error (stderr) that, despite the name, is not only for error messages (although that is one common use). stderr is a stream where you can print user-facing messages without them accidentally being interpreted as output data that the next process in the pipeline will read.
The syntax for “redirecting” some output to stderr is >&2. > means “redirect stdout into” whatever is on the right, which could be a file, etc., and &2 is a reference to “file descriptor #2” which is stderr.
To better illustrate the difference between stdout and stderr, here is a code example with a function that prints a message to stderr for informational purposes (to tell the user that some work is happening), and then, after the work is done (simulated by a 2 second pause), the output data is printed to stdout.
Then, we call our function and capture its output into a variable. When we print the variable’s value at the end, we can see that the value is only the stdout from the function, and not the stderr (which was only printed for informational purposes).
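A sketch of what that could look like. The function name the_best_dinosaur and the final message are assumptions on my part, modeled on the script name used later in this section:

```bash
the_best_dinosaur() {
  echo "Thinking..." >&2   # informational message goes to stderr
  sleep 2                  # simulate some work
  echo "stegosaurus"       # the actual output data goes to stdout
}

# Command substitution captures only stdout; stderr still reaches the terminal.
dino="$(the_best_dinosaur)"
echo "The best dinosaur is: $dino"
```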
When you run this program, it prints:
In contrast, if we had printed everything to stdout, including the “Thinking…” message, the output would have looked like this, which isn’t quite what we want:
The same idea applies to working at the command line. Imagine if the function above were a standalone script:
We’ve kept the same implementation, so the Thinking... message is printed to stderr and the output stegosaurus is printed to stdout.
Because our script does a good job of separating standard output from “human-oriented” messages, we can capture the output in a variable without getting the messages all mixed up in the output:
Notice how, when we ran dino="$(./the_best_dinosaur.sh)" to capture the script’s output in the variable dino, the stdout (stegosaurus) was hidden from us because it was redirected into the variable. However, we still saw the message on stderr (Thinking...) in the terminal, because that part wasn’t redirected. That’s good! We want that output in the terminal, because the whole point of stderr is to print messages where the person running the script can see them. Meanwhile, the stdout can be redirected into important places like variables, files, and other processes.
Quirk 6: > vs. >>
It’s very easy to write to files in Bash. Any time something is being printed to stdout, you can just slap a > on the end, followed by a file name, and the output data will be written to the file.
When we use cat to print the contents of the file, we can see that the output of the date command was written to the file:
Let’s say that we wanted to write several entries into the same file. Can we use > for that? Let’s see:
Nope! The thing is, > will overwrite the current contents of the file, if the file already exists.
If you want to append lines instead, use >>:
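The difference between the two operators can be sketched like this. I use fixed echo strings rather than date so that the result is predictable, and the file is a temporary one:

```bash
log="$(mktemp)"   # scratch file for the demonstration

# > truncates the file each time, so only the last entry survives.
echo "first entry"  > "$log"
echo "second entry" > "$log"
overwrite_count="$(grep -c "entry" "$log")"

# >> appends, so every entry written from here on is kept.
echo "third entry"  >> "$log"
echo "fourth entry" >> "$log"
append_count="$(grep -c "entry" "$log")"

echo "after >: $overwrite_count line(s); after >>: $append_count line(s)"
rm -f "$log"
```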
Quirk 7: Comparing strings and numbers
In many programming languages, you can check to see if two “things” are equal by using == (equals) and != (not-equals) operators, and this works regardless of the types of the things you are comparing. Here is Ruby, for example:
And there are also additional operators like >, >=, < and <= for comparing whether numbers are greater than or less than each other. Here’s Ruby again:
Bash is different in a couple of ways:
- You have to use [[ to do these sorts of checks.
- There are two sets of comparison operators:
  - == and != for string comparison
  - -eq, -ne, -gt, -lt, -le, -ge for numerical comparison
To check whether two strings are equal or not equal:
To compare numbers:
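A short sketch covering both kinds of comparison. The variables fruit and count are made up for the example:

```bash
fruit="apple"   # hypothetical values for illustration
count=3

# String comparison uses == and !=
if [[ "$fruit" == "apple" ]]; then
  string_result="equal"
else
  string_result="not equal"
fi

# Numeric comparison uses -eq, -ne, -gt, -lt, -le, -ge
if [[ "$count" -gt 2 ]]; then
  number_result="greater"
else
  number_result="not greater"
fi

echo "$string_result, $number_result"
```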
Be careful not to mix up ==/!= and -eq/-ne! You will run into unexpected behavior if you compare strings using the numerical operators:
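One example of that surprise, sketched with two arbitrary strings. It assumes that no variables named apple or banana are set, which the unset line makes sure of:

```bash
unset apple banana   # make sure these names are not defined as variables

# With -eq, both sides are evaluated as arithmetic expressions. A bare word
# such as apple is treated as a variable name, and an unset variable counts
# as 0, so two different strings compare as "equal".
if [[ "apple" -eq "banana" ]]; then
  numeric_op="equal"
else
  numeric_op="not equal"
fi

# == compares the actual text, which is what we want for strings.
if [[ "apple" == "banana" ]]; then
  string_op="equal"
else
  string_op="not equal"
fi

echo "-eq says: $numeric_op; == says: $string_op"
```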
Quirk 8: Passing arguments through
A common task, when you’re writing Bash scripts, is to pass the arguments of one script through to another. Just as a silly example, let’s write a little “wrapper script” for the ls command:
This works well enough in that it calls ls without arguments:
But it doesn’t work if I try to use ls-wrapper the same way that I might use ls. For example, I can’t ask it to list the files in a different directory than the one I’m in:
We can make ls-wrapper work just like ls by passing the arguments of ls-wrapper through to ls. So how do we do that? Well, if you google this question, you’re going to find a lot of complicated answers, but the answer is more or less straightforward: just use "$@":
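As a sketch, here is the idea with a function standing in for the ls-wrapper script, so that the example is self-contained. The function name ls_wrapper and the scratch file are my own inventions:

```bash
# A function stands in for the ls-wrapper script here; the body would be the
# same in a standalone script.
ls_wrapper() {
  ls "$@"   # forward every argument to ls, each one kept as a separate word
}

# Try it on a scratch directory containing one hypothetical file.
scratch="$(mktemp -d)"
touch "$scratch/notes.txt"
listing="$(ls_wrapper "$scratch")"
echo "$listing"
rm -rf "$scratch"
```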
You will sometimes see people use $*. Don’t do that! You should almost always use "$@", unless you really know what you’re doing and you have a specific reason not to.
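To illustrate why, here is a small sketch that counts how many arguments arrive on the other side. The helper functions are hypothetical, and the result assumes the default word-splitting settings:

```bash
count_args() { echo "$#"; }                # helper: prints how many arguments it received

forward_quoted_at() { count_args "$@"; }   # preserves each argument exactly
forward_star()      { count_args $*; }     # re-splits the arguments on whitespace

# Two arguments go in; the first one contains a space.
quoted_at="$(forward_quoted_at "my file.txt" other.txt)"
star="$(forward_star "my file.txt" other.txt)"

echo "\"\$@\" passed $quoted_at arguments; \$* passed $star"
```

With "$@", the file name containing a space stays in one piece. With an unquoted $*, it gets split into two separate words.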
Now, here is our ls wrapper script in action:
Quirk 9: Bailing out if there is an error
In most programming languages, we expect a program to stop what it’s doing immediately if there is an error. Not in Bash!
Take this program, for example:
Here’s the output when you run this script:
Oops! We did get that far. But in most cases, if you’re running a script that you wrote, and there is an error halfway through, you don’t want it to just keep going, right? Each line of your script could be depending on the success of the lines before it.
To make Bash behave more intuitively, we can use the set -e option:
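Here is a rough sketch that runs the same script body twice in child bash processes, once as-is and once with set -e in front, so that the two behaviors can be compared. The echo messages are placeholders of my own, and stderr is silenced just to keep the comparison tidy:

```bash
# The script body under test; the echo text is illustrative.
body='
  echo "Listing files..."
  ls /some/nonexistent/directory
  echo "We got this far"
'

# Without set -e: the failing ls is ignored and the script runs to the end.
without_status=0
without_e="$(bash -c "$body" 2>/dev/null)" || without_status=$?

# With set -e: the script stops at the failing ls.
with_status=0
with_e="$(bash -c "set -e; $body" 2>/dev/null)" || with_status=$?

echo "without set -e: status $without_status"
echo "with set -e:    status $with_status"
```

In a real script, you would simply write set -e near the top instead of launching child processes.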
This makes the Bash interpreter exit immediately as soon as one of the commands (ls, in this case) returns a non-zero exit code to indicate failure:
This is a much better behavior 99% of the time, in my opinion.
Of course, a lot of the time, you expect certain commands to fail, and you want your script to proceed in a certain way. A common example is checking for the existence of a string in a file using grep:
So you might be worried that set -e doesn’t account for situations like these, where you might want to explicitly handle errors yourself:
Don’t worry! set -e takes constructs like if, while, && and || into account, so your script will Just Work™ the way that you would expect, without exiting prematurely:
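A sketch of this, again run in a child bash process so the exit status can be inspected. To keep it self-contained, grep reads a made-up string instead of a file:

```bash
status=0
output="$(bash -c '
  set -e
  # grep exits non-zero when nothing matches, but because it is the condition
  # of an if statement, set -e does not treat that as fatal.
  if grep -q "broccoli" <<< "apples and pears"; then
    echo "found broccoli"
  else
    echo "no broccoli here"
  fi
  echo "script finished"
')" || status=$?

echo "$output"
echo "exit status: $status"
```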
As an aside: It turns out that using set -e is controversial because there are various edge cases where it behaves unexpectedly. My take is that if you use set -e at the top of all of your scripts by default (especially if you also use -o pipefail, which I’ll talk about next), even though it may not be perfect and there are certain edge cases that you might run into from time to time, it’s still well worth it because it will help you avoid subtle bugs where some part of your script doesn’t work, but the script proceeds to run past that point anyway.
The decision about whether or not to use set -e and set -o pipefail is one about trade-offs, and in my opinion, the benefits of set -e outweigh the costs. If you’re new to writing Bash scripts, I recommend that you put set -eo pipefail at the top of all of your scripts as a general rule, and see how you like it in the long run. As you gain more experience with Bash and learn more about how set -eo pipefail works in practice, you can decide for yourself whether or not you prefer to use it. My prediction is that you’ll prefer to use these safeguards and deal with any edge cases that come up, instead of not using them and having to handle every single possible error yourself!
Quirk 10: Handling errors in a pipeline
One weakness of set -e is that it doesn’t account for the errors that can happen in the middle of a pipeline.
In the case of the ls /some/nonexistent/directory example above, the script exited immediately because the ls command in the middle of the script returned a non-zero exit code to indicate failure.
But what if we piped that ls command into another command? As a simple example, let’s pipe the ls command into cat -, which just passes through its input as output:
Even with set -e, when the ls command fails, the cat command still gets executed. But it’s even worse than that. Because the cat command executes successfully, the script doesn’t bail out like we want it to. If it weren’t for the error message that ls prints on stderr, we would have no idea that the ls command failed, and the script proceeds to execute to the end as if nothing went wrong!
The remedy for this is the -o pipefail option. You can use it on its own with set -o pipefail, but you’ll usually see it used in conjunction with set -e, as set -eo pipefail:
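The effect can be sketched by running the same pipeline in two child bash processes, one with only set -e and one with set -eo pipefail. The echo message is a placeholder of my own, and stderr is silenced to keep the comparison tidy:

```bash
# The script body under test; the echo text is illustrative.
pipeline='
  ls /some/nonexistent/directory | cat -
  echo "We got this far"
'

# set -e alone: the pipeline reports the status of cat, so nothing stops.
e_only_status=0
e_only="$(bash -c "set -e; $pipeline" 2>/dev/null)" || e_only_status=$?

# set -eo pipefail: the failing ls makes the whole pipeline fail.
pipefail_status=0
with_pipefail="$(bash -c "set -eo pipefail; $pipeline" 2>/dev/null)" || pipefail_status=$?

echo "set -e only:      status $e_only_status"
echo "set -eo pipefail: status $pipefail_status"
```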
Now, our script is smart enough to recognize that the ls command in the pipeline failed, so the script as a whole fails at that point:
That’s it!
I hope you found this helpful! Despite its quirks, Bash is an indispensable tool for the modern software developer. Once you know how to navigate the weird parts, you can reap the benefits of having this wonderful, bizarre language in your repertoire.