bash getopts: manage command line options in your scripts
In multi-stage shell scripts, it is often useful to break the script into distinct sections, where each section is a function that handles one specific task. Production tends to be a messy place, and scripts can (and sometimes do) break midway.
A system operator might need to only re-run a part of the script, or run it again from a specific point, without starting over from the beginning. This is the situation where structured CLI options become very useful.
Wouldn’t it be nice to just be able to add a flag (e.g. -i = insert, -d = download etc.) to your script for each function you want to run? That’s where getopts comes to the rescue.
But before diving in…
The “traditional” way
A lot of script authors use $# with a while loop, calling shift to remove each argument from the list once it has been read.
Here’s how that would look for a hypothetical example script:
while [ $# -gt 0 ]; do
    OPT=$1
    shift
    case ${OPT} in
        -d)
            provider="$1"
            download_src_files "$provider" ;;
        -i)
            target="$1"
            insert_from_csv "$target" ;;
        *)  if [ "${OPT:0:1}" = "-" ]; then
                echo "Unknown option: $OPT"
            fi
            ;;
    esac
done
Of course this works, but it’s clunky and hard to read, especially as the number of options grows.
There are a few evident issues here:
-> There is no handling for the case where the system operator forgets to specify any options at all. In fact, checking $? would report that the script completed successfully!
-> This construct doesn’t check whether an argument was provided with each option. That has to be done manually, either here or within the called function(s), which adds to the clunky feel.
-> Handling the shifting of option arguments manually is difficult to follow mentally, especially if returning to refactor the script at a later date.
-> There’s no innate way to tell whether something is an option or an option argument, so if we want to check whether the system operator has tried to use an invalid option, we must build the if statement to check for this ourselves.
Let’s break down what actually happens here:
- The while loop keeps going as long as there are CLI arguments remaining ($# is greater than zero).
- On each pass of the loop, we save the first argument as an option ($OPT) and remove it from the list of inputs using shift (this is destructive, i.e. the argument is permanently removed from the list of CLI arguments).
- The case statement then makes a decision: if $OPT matches one of the known cases (like -d or -i), it calls the relevant function.
- If it doesn’t match, we check whether it looks like an option (it starts with -) rather than an option argument, and if so print an error stating that the option wasn’t recognized.
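To make the walk-through above concrete, here is a minimal runnable sketch of the same loop. The wrapper name parse_manual and the echo-only bodies of download_src_files and insert_from_csv are assumptions made for illustration; they are not part of the original script.

```shell
#!/usr/bin/env bash

# Hypothetical stand-ins for the real work: they only print what they received.
download_src_files() { echo "download: '$1'"; }
insert_from_csv()    { echo "insert: '$1'"; }

# The manual loop, wrapped in a function (hypothetical name) so that it can be
# called several times with different argument lists.
parse_manual() {
    local OPT provider target
    while [ $# -gt 0 ]; do
        OPT=$1
        shift
        case ${OPT} in
            -d) provider="$1"
                download_src_files "$provider" ;;
            -i) target="$1"
                insert_from_csv "$target" ;;
            *)  if [ "${OPT:0:1}" = "-" ]; then
                    echo "Unknown option: $OPT"
                fi ;;
        esac
    done
}

parse_manual -d acme                  # download: 'acme'
parse_manual -d                       # download: ''  (missing argument goes unnoticed)
parse_manual; echo "exit status: $?"  # exit status: 0
parse_manual -x                       # Unknown option: -x
```

The second and third calls show the first two issues from the list above: a missing option argument is silently accepted, and a call with no options at all still reports success.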
Using getopts
Now let’s reimagine the above situation using getopts:
while getopts "d:i:" OPT; do
    case $OPT in
        d) provider="$OPTARG"
           download_src_files "$provider" ;;
        i) target="$OPTARG"
           insert_from_csv "$target" ;;
    esac
done
The following facts are immediately evident:
-> This now looks much cleaner, and we can even drop the *) case entirely, as getopts itself already prints an error when the system operator calls the script with an invalid option.
-> getopts also checks whether an option that expects an argument was actually given one, and prints an error if not.
It’s easy to see the benefits: looks much cleaner, and a lot less mental overhead.
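One detail is worth a short illustration: getopts reports the problem, but it does not stop the script by itself. It sets the option variable to ? and the loop carries on with the next option. The sketch below assumes a hypothetical wrapper function parse_opts and echo-only stand-ins for the two worker functions, and adds a ?) case that bails out with a non-zero status.

```shell
#!/usr/bin/env bash

# Hypothetical stand-ins for the real work.
download_src_files() { echo "download: $1"; }
insert_from_csv()    { echo "insert: $1"; }

# parse_opts is a hypothetical wrapper. OPTIND is made local and reset to 1
# so that the function can be called more than once in the same shell.
parse_opts() {
    local OPT OPTARG OPTIND=1
    while getopts "d:i:" OPT; do
        case $OPT in
            d)  download_src_files "$OPTARG" ;;
            i)  insert_from_csv "$OPTARG" ;;
            \?) return 2 ;;   # getopts has already printed its own message
        esac
    done
}

parse_opts -d acme -i table   # download: acme / insert: table
parse_opts -x || echo "stopped with status $?"   # stopped with status 2
parse_opts -d || echo "stopped with status $?"   # stopped with status 2
```

In a top-level script the ?) branch would typically call exit rather than return.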
Let’s break down what actually happens here:
First of all, you’re probably wondering what this cryptic "d:i:" is all about.
This string is called the optstring, and it contains the option characters to be recognized by getopts. If a character is followed by a :, getopts expects that option to have an argument, which should be supplied following the option. I.e. "d:" means that when using the option -d, we must also supply an argument, e.g. -d something.
If we want to add a new option that does not require an argument, we simply add its character without the : after it. E.g. if we add a cleanup function to our hypothetical script:
while getopts "d:i:c" OPT; do
    case $OPT in
        d) provider="$OPTARG"
           download_src_files "$provider" ;;
        i) target="$OPTARG"
           insert_from_csv "$target" ;;
        c) run_cleanup ;;
    esac
done
The OPT following the optstring is referred to as the name, and it’s simply the name of the bash variable in which getopts stores the current option (without the leading -) on each pass of the loop. If a : follows an option in the optstring, that option’s argument is stored in $OPTARG.
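As a quick illustration of how the name variable and $OPTARG change on each pass, here is a small sketch. The function show_opts is a hypothetical helper that only prints what getopts hands back; it is not part of the example script.

```shell
#!/usr/bin/env bash

# show_opts (hypothetical) prints the option letter and, where the optstring
# asks for one, the option argument. OPTIND is local and reset to 1 so the
# function can be called repeatedly.
show_opts() {
    local OPT OPTARG OPTIND=1
    while getopts "d:i:c" OPT; do
        case $OPT in
            d|i) echo "option=$OPT argument=$OPTARG" ;;
            c)   echo "option=$OPT (takes no argument)" ;;
        esac
    done
}

show_opts -d acme -c -i table
# option=d argument=acme
# option=c (takes no argument)
# option=i argument=table
```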
There’s also a variable we’ve not yet touched on, which is $OPTIND. This variable holds the index of the next argument to be processed. We can use it to fix a problem we’ve almost forgotten about: the system operator calling the script without any options.
Much like before, calling the script with no options would pass, and $? would report 0. The difference is that we can now easily fix this issue with $OPTIND. Its value starts at 1 (the first argument; index 0 is the script itself) and grows as getopts consumes arguments, so it should be more than 1 if any options were present:
if [[ $OPTIND -eq 1 ]]; then
    # Report on stderr and stop with a non-zero status.
    echo "$0: This script should be called with at least one option" >&2
    exit 1
fi
This simple check is all we need. If $OPTIND is still 1 after the while loop has finished, getopts did not consume any options at all. Note that an invalid option still advances $OPTIND, so the check only catches the case where no options were given.
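To see how $OPTIND moves, here is a minimal sketch. The function require_options is a hypothetical wrapper around the check; the option handling itself is left out, and return is used instead of exit because the check lives inside a function here.

```shell
#!/usr/bin/env bash

# require_options (hypothetical) parses the options, then reports where
# OPTIND ended up, or fails if nothing was consumed.
require_options() {
    local OPT OPTARG OPTIND=1
    while getopts "d:i:c" OPT; do
        :   # option handling left out on purpose
    done
    if [[ $OPTIND -eq 1 ]]; then
        echo "at least one option is required" >&2
        return 1
    fi
    echo "OPTIND=$OPTIND"
}

require_options -d acme -c               # OPTIND=4
require_options -c                       # OPTIND=2
require_options || echo "status: $?"     # status: 1
```

After the loop, $OPTIND points one past the last argument that getopts consumed, which is why three consumed words leave it at 4.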
Multiple option arguments
If you need multiple arguments for one of your options, it is possible, though the approach is far from intuitive, given that getopts expects one option argument per option.
We can circumvent this by redefining $OPTARG as an array.
Once we have $OPTARG as an array, we just need to walk through the remaining input and keep adding new elements until we hit something that starts with - (since that would be the next option).
We can then call a helper function from each case that needs extra arguments.
Here’s how that looks:
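function getopts_extra_args() {
    declare -i i=1
    # Keep only the regular argument, in case OPTARG still holds extra
    # elements from an earlier option.
    OPTARG=("$OPTARG")
    # Collect words until the arguments run out or the next option starts.
    while [[ $OPTIND -le $# && ${!OPTIND:0:1} != "-" ]]; do
        OPTARG[i]=${!OPTIND}
        (( i++, OPTIND++ ))
    done
}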
Note that the first element of OPTARG (index 0) is equal to what $OPTARG would normally be, and the additional values are stored at indices starting from 1. Therefore, when we unravel this array, we need to start from index 1 if we want only the additional arguments.
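The indexing is easier to see on a plain array. In the sketch below, args is a hypothetical stand-in for OPTARG after the extra arguments have been collected; it shows that expanding the array without an index yields element 0, and that a slice starting at index 1 yields only the additional values.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for OPTARG once it has been turned into an array.
args=(example setting1 setting2)

echo "$args"        # example  (no index means element 0)
echo "${args[0]}"   # example

# Slice from index 1 onwards to get only the additional values.
for extra in "${args[@]:1}"; do
    echo "extra: $extra"
done
# extra: setting1
# extra: setting2
```

The slice is an alternative to a counting loop that starts at 1; both visit the same elements.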
With that in mind, we can call getopts_extra_args in the affected case:
while getopts "d:i:c" OPT; do
    case $OPT in
        d) getopts_extra_args "$@"
           extra_settings=("${OPTARG[@]}")
           provider="$OPTARG"
           download_src_files "$provider" ;;
        i) target="$OPTARG"
           insert_from_csv "$target" ;;
        c) run_cleanup ;;
    esac
done
This now gives access to multiple arguments when the option -d is used:
function download_src_files() {
    extra_settings_len=${#extra_settings[@]}
    # Index 0 is the regular argument, so start from 1.
    for (( x=1; x<extra_settings_len; x++ )); do
        echo "${extra_settings[$x]}"
    done
}
Let’s try it out, assuming we have an example function like this:
function download_src_files() {
    echo "${FUNCNAME[0]}"
    echo "$1"
    extra_settings_len=${#extra_settings[@]}
    for (( x=1; x<extra_settings_len; x++ )); do
        echo "${extra_settings[$x]}"
    done
}
Running bash script.sh -d example setting1 setting2 should output:
download_src_files
example
setting1
setting2
If you need multiple arguments for the -i flag as well, simply do the same thing, but with a new variable other than extra_settings. I.e.:
while getopts "d:i:c" OPT; do
    case $OPT in
        d) getopts_extra_args "$@"
           extra_settings=("${OPTARG[@]}")
           provider="$OPTARG"
           download_src_files "$provider" ;;
        i) getopts_extra_args "$@"
           extra_settings2=("${OPTARG[@]}")
           target="$OPTARG"
           insert_from_csv "$target" ;;
        c) run_cleanup ;;
    esac
done
Adding an example function:
function insert_from_csv() {
    echo "${FUNCNAME[0]}"
    echo "$1"
    extra_settings2_len=${#extra_settings2[@]}
    for (( x=1; x<extra_settings2_len; x++ )); do
        echo "${extra_settings2[$x]}"
    done
}
Running bash script.sh -d example setting1 setting2 -i example2 setting3 setting4 should output:
download_src_files
example
setting1
setting2
insert_from_csv
example2
setting3
setting4
This can be done for as many options as needed.
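For convenience, the pieces can be assembled into one file that runs as-is. This is a sketch under a few assumptions rather than a reference implementation: the wrapper function main and the echo-only run_cleanup are hypothetical, the worker functions only print their input, and the helper clears leftover array elements before collecting, so that a later option with fewer extra arguments does not pick up values from an earlier one.

```shell
#!/usr/bin/env bash

# Collect the words that follow an option's regular argument, stopping at
# the next word that starts with "-". They end up in OPTARG[1], OPTARG[2], ...
getopts_extra_args() {
    declare -i i=1
    OPTARG=("$OPTARG")   # assumption: drop leftovers from an earlier option
    while [[ $OPTIND -le $# && ${!OPTIND:0:1} != "-" ]]; do
        OPTARG[i]=${!OPTIND}
        (( i++, OPTIND++ ))
    done
}

# Hypothetical worker functions: they only print what they were given.
download_src_files() {
    local setting
    echo "${FUNCNAME[0]}"
    echo "$1"
    for setting in "${extra_settings[@]:1}"; do echo "$setting"; done
}

insert_from_csv() {
    local setting
    echo "${FUNCNAME[0]}"
    echo "$1"
    for setting in "${extra_settings2[@]:1}"; do echo "$setting"; done
}

run_cleanup() { echo "${FUNCNAME[0]}"; }

# main is a hypothetical wrapper so the parser can be called repeatedly.
main() {
    local OPT OPTARG OPTIND=1
    local provider target
    local -a extra_settings extra_settings2
    while getopts "d:i:c" OPT; do
        case $OPT in
            d) getopts_extra_args "$@"
               extra_settings=("${OPTARG[@]}")
               provider="$OPTARG"
               download_src_files "$provider" ;;
            i) getopts_extra_args "$@"
               extra_settings2=("${OPTARG[@]}")
               target="$OPTARG"
               insert_from_csv "$target" ;;
            c) run_cleanup ;;
        esac
    done
}

main -d example setting1 setting2 -i example2 setting3 setting4
```

In a real script the last line would be main "$@", so that the options come from the command line.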
I hope this post has convinced you of the utility of getopts, and that it will be a valuable addition to your future scripts.