set -euo pipefail to make your bash scripts safer .. or not

“Add set -euo pipefail into your bash scripts to make them safer”.

-> This is a piece of advice that I’ve often come across in other blog posts and articles.

In this article, I would like to take a more nuanced look at this statement and explain why the advice is rather dangerous and should not be followed without deeper thought. These options can easily break scripts and lead to nightmarish debugging sessions, so it is worth looking at them one by one:

set -e

This is the most dangerous of the three. The -e option causes a bash script to stop immediately if any command returns a non-zero (error) status, with some caveats. The intent is to imitate the behaviour of higher-level languages like PHP, which also crash out in situations like that.

E.g.

<?php
foo();
echo "test";
?>

Executing this crashes out before the string “test” is ever printed.

PHP Fatal error:  Uncaught Error: Call to undefined function foo() in /root/test.php:2
Stack trace:
#0 {main}
  thrown in /root/test.php on line 2

Normally, a bash script just continues on:

#!/bin/bash
foo
echo "test"

Executing it will print an error, but the script goes on.

test.sh: line 2: foo: command not found
test

Adding a set -e under the shebang will imitate the behaviour seen with the PHP example:

test.sh: line 3: foo: command not found
root@debian-test:~#

While this mostly works, there are a lot of possible pitfalls. In practice, many bash scripts use external programs to perform tasks, and in some cases, said programs may very well return an error status (non-zero status), even if the task was completed as expected. On top of that, there are also cases where commands native to bash behave in ways that are not necessarily intuitive.

With that in mind, let’s dig into some examples:

conditionals and pipelines

set -e has a number of special rules under which, for practical reasons, a non-zero exit status does not abort the script. Examples of such rules: a command used as a condition (in if, while or until, or anywhere but last in a && or || list) is immune, and so is a command that is not at the end of a pipeline.

An easy example is the test command, a popular conditional used in many scripts. In this contrived example, I do not have a directory called exampledir, so test -d exits with 1. Yet “useless message” is still printed, because test sits on the left-hand side of && and is therefore immune to set -e. A script dir.sh to illustrate:

#!/bin/bash
set -e
test -d exampledir && echo "this won't print"
echo "useless message"

Giving it a try:

bash dir.sh
useless message
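To see these immune contexts side by side, here is a small sketch that runs a few one-liners in child bash processes, each with set -e switched on, and reports whether execution got past the failing command. The helper name survives is made up for this illustration.

```shell
#!/bin/bash
# survives: hypothetical helper. Runs a snippet in a fresh bash with set -e
# and reports whether execution reached the end of the snippet.
survives() {
  if bash -c "set -e; $1; exit 0" >/dev/null 2>&1; then
    echo "survived: $1"
  else
    echo "aborted:  $1"
  fi
}

survives 'false'                 # plain failing command
survives 'false && true'         # failing command on the left of &&
survives 'false || true'         # failing command on the left of ||
survives 'if false; then :; fi'  # failing command used as an if condition
survives '! true'                # status inverted with !
survives 'false | true'          # failure not at the end of a pipeline
survives 'true | false'          # failure at the end of a pipeline
```

Only the first and the last snippet should be reported as aborted; in all the others the non-zero status occurs in a position that set -e ignores.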

function

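Functions return the status of the last command they ran. If we wrap the same conditional in a function, the protection from the previous example appears, to the untrained eye, to have gone away. Inside the function the && list is still immune, but the function as a whole returns 1, and the bare call to foo is an ordinary command that set -e acts on.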

#!/bin/bash
set -e
function foo() { test -d exampledir && echo "this won't print"; }
foo
echo "useless message"

Giving it a try:

bash dir.sh
root@debian-test:~#
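One possible way around this, sketched here under the same assumption that exampledir does not exist, is to write the conditional as an if statement. An if whose condition is false and which has no else branch returns 0, so the function no longer hands a non-zero status back to the caller.

```shell
#!/bin/bash
set -e

# The if statement returns 0 when its condition is false and there is no
# else branch, so foo returns 0 as well.
foo() {
  if test -d exampledir; then
    echo "this won't print"
  fi
}

foo
echo "useless message"
```

With this version the script reaches the final echo whether or not the directory exists.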

let

let is commonly used to increment variables in bash.

cat let.sh
#!/bin/bash
set -e
i=0
echo "i = $i"
let i++
echo "i = $i"

Now running this will result in:

bash let.sh
i = 0

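That’s right, something as simple as let i++ can trip over set -e. let returns status 1 whenever the expression it evaluates comes out as 0, and i++ is a post-increment, so with i=0 the expression evaluates to 0 (the old value) and let returns 1. When i is greater than 0, let i++ returns 0. If you want to increment things using let while using set -e, either don’t start from 0, or use something like bc so that you don’t get tripped up: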

#!/bin/bash
set -e
i=0
echo "i = $i"
i=$(bc <<< "$i+1")
echo "i = $i"

This version runs to completion under set -e:

bash let.sh
i = 0
i = 1
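As an alternative to bc that avoids starting an external process, the increment can also be written with arithmetic expansion inside a plain assignment. This is only a sketch of that variant, not something the examples above depend on.

```shell
#!/bin/bash
set -e
i=0
echo "i = $i"
# A plain assignment returns 0 regardless of the computed value,
# unlike let i++ or ((i++)), whose status depends on the result.
i=$((i + 1))
echo "i = $i"
```

The design point is that the exit status comes from the assignment rather than from the value of the arithmetic expression, so starting from 0 is no longer a special case.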

And now let’s consider some external programs:

diff

diff exits with status 0 if the two files are identical, 1 if they differ, and 2 if diff itself ran into trouble (for example, a file could not be read).

cat > a
1
2
3
cat > b
1
3
2
diff a b; echo $?
2d1
< 2
3a3
> 2
1

Two files being different may well be expected behaviour, so if you wish to use diff together with set -e, don’t forget to handle the case where $? is 1 yourself.
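A minimal sketch of such manual handling follows. The temporary directory and file contents are made up for the illustration; the idea is to capture the status in an || branch, which keeps set -e from aborting, and then decide what each value means.

```shell
#!/bin/bash
set -e

# Hypothetical sample files, created in a temporary directory.
tmpdir=$(mktemp -d)
printf '1\n2\n3\n' > "$tmpdir/a"
printf '1\n3\n2\n' > "$tmpdir/b"

# The || branch keeps set -e from aborting and records diff's status.
status=0
diff "$tmpdir/a" "$tmpdir/b" > /dev/null || status=$?

case $status in
  0) echo "files are identical" ;;
  1) echo "files differ" ;;
  *) echo "diff failed with status $status" >&2 ;;
esac

rm -r "$tmpdir"
```

Unlike a blanket || true, this keeps the distinction between "files differ" and "diff could not do its job".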

grep

If grep fails to find a match in the input, it exits with status 1. Status 2 is reserved for actual errors.

echo "asdf" | grep -c "h"; echo $?
0
1

grep prints the number of matches as 0, but the exit status is 1, because no matches were found. Again, this is not necessarily something you want to abort a script over, so $? needs to be checked manually if you wish to use grep reliably with set -e.
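One way to package that check is a small wrapper function. The name count_matches is hypothetical; the sketch treats "no match" as success and passes on any higher status as a failure.

```shell
#!/bin/bash
set -e

# count_matches: hypothetical helper. Prints the number of matching lines
# read from stdin. "No match" (status 1) is treated as success; anything
# higher (for example an invalid pattern) is passed on as a failure.
count_matches() {
  local status=0
  grep -c -- "$1" || status=$?
  if [ "$status" -gt 1 ]; then
    return "$status"
  fi
}

echo "asdf" | count_matches "h"   # no match, but the script keeps going
echo "asdf" | count_matches "s"   # one matching line
echo "still running"
```

Whether status 1 should count as success depends on the script; the point is only that the decision is made explicitly in one place.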

more pitfalls

The above examples are just a small taste of the potential pitfalls. If we were to start considering subshells and sourced scripts, this write-up would end up being a wiki-sized behemoth. Not only that, but it’s also very common for scripts and programs to assign specific meanings to specific exit statuses outside of the normal conventions.

The key takeaway is that set -e is perfectly suitable for a short linear script that uses no advanced language features in bash. In the case of larger, more involved scripts, you’ll quickly find yourself littering your scripts with various workarounds so that set -e won’t clobber your script to death at an unexpected moment.

set -o pipefail

This next one, the second most dangerous on the list, does a very simple thing. Normally the exit status of a pipeline is that of its last command. With pipefail, if any command in the pipeline exits with a non-zero status, the status of the whole pipeline becomes that of the last (rightmost) command that failed. E.g. with a script pipe.sh to illustrate:

#!/bin/bash
set -o pipefail
grep -c "f" nosuchfile | sed 's/\t/,/g' | sed 's/a/b/g'

echo $?

Running this will print 2, because the file given to grep doesn’t exist:

bash pipe.sh
grep: nosuchfile: No such file or directory
2

Without pipefail, echo $? would print 0, as sed is more than happy to work through an empty input.
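The difference can be reduced to a few lines using false, true and subshells with fixed exit codes. This is only an illustrative sketch; the exit values 3 and 5 are arbitrary.

```shell
#!/bin/bash
# Without pipefail, only the last command's status counts.
false | true
echo "pipefail off: $?"

set -o pipefail
# With pipefail, a failure anywhere in the pipeline is reported.
false | true
echo "pipefail on: $?"

# Several failures: the status of the rightmost failing command is reported.
(exit 3) | (exit 5) | true
echo "rightmost failure: $?"
```

The three echo lines report 0, 1 and 5 respectively.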

set -o pipefail can cause unexpected exits from scripts when paired with set -e in “needle in a haystack” situations. E.g.

zcat huge_audit_log.gz | grep -q -m1 'suspicious delete statement'

If grep finds what it needs and quits, zcat will have no knowledge of that, since it cannot see what is on the other end of the pipe, and zcat may still be trying to push more things into the pipe.

When that happens, it’s possible for zcat to be killed by SIGPIPE. Bash reports that as a non-zero exit status (128 plus the signal number, typically 141), pipefail turns it into the status of the whole pipeline, and set -e then takes the script to an early exit.

The takeaway here is to make sure that every part of a pipeline consumes its entire input if you want to use set -e and set -o pipefail together.

For this particular example, of course, one can just use zgrep to sidestep the problem altogether.
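The effect is easy to reproduce without a huge log file. In the sketch below, yes stands in for zcat and head for grep -q -m1; both are substitutes chosen for the demonstration, not part of the original example. On most systems the first status is 141, but the exact value depends on how the environment handles SIGPIPE.

```shell
#!/bin/bash
set -o pipefail

# yes stands in for zcat on a huge file: it keeps writing after the
# reader has gone away. head plays the role of grep -q -m1.
yes | head -n 1 > /dev/null
statuses=("${PIPESTATUS[@]}")   # copy right away, the next command resets it

echo "yes exited with ${statuses[0]}, head exited with ${statuses[1]}"
```

With set -e added on top, the script would stop right after the pipeline, even though head found exactly what it was asked for.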

set -u

The least dangerous of the three, this setting will cause a script to exit if an attempt is made to access an undefined variable. Much like the other two entries, this one can also cause seemingly unexpected ends to scripts, but the situations where that happens are significantly rarer.

The main source of issues is the expansion of positional parameters that may not be set. The most common case is parsing command line options in a script, e.g.

while [ "$1" ]; do
  case $1 in
  ... etc ...
  esac
done

As soon as there are no arguments left, $1 refers to an unset parameter, and set -u aborts the script, even though an empty $1 is not really a problem here.

This can be fixed by changing the test to [ "$#" -gt 0 ], by expanding with a default as in "${1:-}", or by using getopts instead. getopts is not affected by this problem, and also has a number of other advantages.
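A sketch of the first two fixes follows. The function name parse_args, the -v flag and the file name are all made up for the illustration.

```shell
#!/bin/bash
set -u

# parse_args: hypothetical example function; the -v flag is made up.
parse_args() {
  verbose=0
  target=""
  # $# is always set, so this test is safe even with no arguments.
  while [ "$#" -gt 0 ]; do
    case $1 in
      -v) verbose=1 ;;
      *)  target=$1 ;;
    esac
    shift
  done
}

parse_args                  # no arguments, no "unbound variable" error
echo "verbose=$verbose target=$target"
parse_args -v notes.txt
echo "verbose=$verbose target=$target"

# ${1:-none} falls back to a default instead of failing when $1 is unset.
first=${1:-none}
echo "first script argument: $first"
```

Testing $# keeps the loop condition independent of the contents of the arguments, which also means an empty-string argument no longer ends the loop early.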

Conclusion

The advice to always include a set -euo pipefail can be pretty dangerous depending on the complexity of a script, and may end up causing as many problems as it solves. It is always best to evaluate each of these options in their own right, and determine whether they are suitable for inclusion in a script.

-> -e can be brittle and has convoluted rules in some situations, so it’s almost always better to handle errors explicitly, by checking the value of $? and acting on it. Resist the tempting call of || true, because many programs (grep being one of them) have specific exit status values for different problems, and || true hides all of them.

-> -o pipefail: again, explicit checks are an option. The PIPESTATUS array (for example "${PIPESTATUS[0]}", or "${PIPESTATUS[@]}" for all stages) can be used to inspect any part of a pipeline, in a similar fashion to $? for single commands. Note that a bare ${PIPESTATUS} only gives the first element.

-> -u is the least harmful. I recommend it if the script only uses simple variables, with no elaborate expansions or arrays.

That way, you can have some control over how and when your scripts exit upon encountering errors.
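As a closing illustration, here is a sketch of a pipeline checked by hand, with neither set -e nor pipefail in effect. The input text and the pattern are invented for the example.

```shell
#!/bin/bash
# No set -e or pipefail here: each stage is checked by hand.
# The input text and pattern are made up for illustration.
printf 'alpha\nbeta\n' | grep -c 'gamma' | sed 's/^/matches: /'
stages=("${PIPESTATUS[@]}")   # copy immediately, any later command overwrites it

if [ "${stages[1]}" -gt 1 ]; then
  echo "grep failed with status ${stages[1]}" >&2
elif [ "${stages[1]}" -eq 1 ]; then
  echo "no match, which is fine here"
fi
```

The script decides for itself that status 1 from grep is acceptable and that anything higher is an error, instead of leaving that decision to a global option.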