Removing leading/trailing spaces in the shell

Posted on 2020-01-16 (Updated on 2021-12-13)

Whitespace causes lots of interesting issues on the command line - whether it is present in file names, arguments, or any other data flowing between commands. Quoting can help, but there are some very particular edge cases I've encountered in my scripts.

Leading and trailing whitespace

If you are piping output directly into another command, occasionally you can get whitespace characters before or after your data. You have to use quotes to make sure internal whitespace is preserved, but quoting also preserves any whitespace characters at the beginning and/or end of your data.

I use three methods of removing leading or trailing whitespace: echo, xargs, and sed.

The echo method involves echoing a value without quotes, and re-assigning it to a variable:

TEST='   lousy  spaces!     '
TEST="$( echo $TEST )"

echo "$TEST"
# Prints:
#lousy spaces!

Benefits:

Does not call an external command - echo is a bash builtin
Short and sweet

Drawbacks:

Collapses internal spaces as well
Doesn't lend itself to chaining/pipes
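
One more caveat with the echo method: an unquoted expansion is also subject to filename globbing, so a value containing * or ? can be replaced by matching file names. Here is a minimal sketch of a guard, assuming bash and the default IFS. The name trim_echo is a hypothetical helper of my own choosing, not part of the original snippet:

```shell
#!/usr/bin/env bash
# trim_echo is a hypothetical helper name used for illustration only.
# set -f disables globbing; it runs inside the command substitution's
# subshell, so the setting does not leak into the calling shell.
trim_echo() {
    printf '%s\n' "$(set -f; echo $1)"
}

trim_echo '   lousy  spaces!     '
trim_echo '   keep * literal   '
```

Internal runs of spaces are still collapsed, because the word splitting that does the trimming is left untouched.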

The xargs method is my personal favorite, although it's a bizarre one.

echo '   lousy  spaces!     ' | xargs
# Prints:
#lousy spaces!

The xargs command, as a side-effect, strips out leading and trailing whitespace.

If you think about what it's doing, it makes sense: it splits its input into unescaped "word" strings and passes them as arguments to a command (echo, by default), which sends them to stdout. It doesn't care about the whitespace between words, so that gets discarded along the way.
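
A small sketch of that behavior, assuming a POSIX-style xargs (GNU findutils or a BSD one). It also shows that quotes in the input are interpreted by xargs rather than passed through, which is worth knowing before piping arbitrary data into it:

```shell
#!/bin/sh
# With no command argument, xargs hands the collected words to echo.
printf '%s\n' '   lousy  spaces!     ' | xargs
# Prints:
#lousy spaces!

# Naming echo explicitly gives the same result.
printf '%s\n' '   lousy  spaces!     ' | xargs echo

# Quotes are consumed by xargs: the quoted part stays one word, so its
# internal spaces survive while the quote characters themselves vanish.
printf '%s\n' '  "two  spaces"   outside  ' | xargs
# Prints:
#two  spaces outside
```

An unmatched quote in the input (an apostrophe, for example) makes xargs stop with an error, so this method suits data you know to be free of quotes and backslashes.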

Benefits:

Can be part of a piped chain of commands
xargs is part of findutils, and installed by default on most Debian-based systems
Using a command that begins with 'x' automatically gives you a +1 to your charisma stats

Drawbacks:

Collapses internal spaces as well
Runs an external program

The sed method is not something I've used personally. It's something sed was created for - stream editing - but seems like a costly solution to the problem.

It relies on regular expressions, which are often slower than simpler forms of text manipulation. This method is probably plenty performant, but I tend to leave regex as a last resort.

echo '   lousy  spaces!     ' | sed 's/^[[:space:]]*//' \
 | sed 's/[[:space:]]*$//'
# Prints:
#lousy  spaces!

I've included the long [[:space:]] format because it's compatible with non-GNU sed, but if you're rockin' the GNU toolchain, you can use \s as a terse and sensible replacement.

Benefits:

Only strips spaces off of the beginning and end of a string, leaving multiple spaces inside intact
Can be quickly adjusted to strip off other leading or trailing strings

Drawbacks:

Not zero, not one, but two command invocations
Regex, and all its baggage
sed may not be installed by default on the system you're working on
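
sed accepts several -e expressions in one call, so both substitutions can share a single process. A minimal sketch, assuming any POSIX sed. The name trim_sed is a hypothetical wrapper for illustration, and the pattern uses * rather than \+ because \+ is a GNU extension in basic regular expressions:

```shell
#!/bin/sh
# trim_sed is a hypothetical helper name used for illustration only.
# First expression strips leading whitespace, second strips trailing.
trim_sed() {
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}

printf '%s\n' '   lousy  spaces!     ' | trim_sed
# Prints:
#lousy  spaces!
```

Because it reads stdin and writes stdout, the function drops into a pipe the same way the bare sed call does.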

The real solution would be to quit doing so much data processing in the shell, and move to an actual scripting/programming language. However, there's a delicate balance between what belongs in the shell, and what needs its own fully-fledged script; and these techniques have let me accomplish a lot in scripts that didn't need a full programming language backing it.

Update

hackerdefo has submitted several other excellent methods for your consideration. I'm particularly fond of the awk implementation myself.

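Note: The tr method removes all spaces and tabs, not just leading and trailing ones. The awk method also collapses every internal run of whitespace into a single space, and the perl method does the same for the first internal run.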

echo ' Fragmented Development ' | tr -d '[:blank:]'
echo ' Fragmented Development ' | awk '{$1=$1};1'
echo ' Fragmented Development ' | ruby -pe 'gsub(/^\s+/, ""); gsub(/\s+$/, $/)'
echo ' Fragmented Development ' | perl -plne 's/^\s*//;s/\s*$//;s/\s+/ /;'
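
To see how the awk and tr variants treat internal whitespace, here is a short sketch that runs both on the test string from earlier in the post. It assumes a POSIX awk and tr, and the variable name input is my own choice for illustration:

```shell
#!/bin/sh
input='   lousy  spaces!     '

# awk: assigning to a field makes awk rebuild the record, joining the
# fields with single spaces; the trailing 1 is an always-true pattern
# whose default action prints the rebuilt line.
printf '%s\n' "$input" | awk '{$1=$1};1'
# Prints:
#lousy spaces!

# tr: deletes every space and tab, including the ones between words.
printf '%s\n' "$input" | tr -d '[:blank:]'
# Prints:
#lousyspaces!
```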

Tags: linux terminal

Comments

You can have multiple pattern commands with sed, so you only need to call it once. If sed isn't installed, the "actual" programming language might not be either.
Nat!
