GNU Sed

1. Introduction

The command name sed is derived from stream editor. Here, stream refers to data being passed via shell pipes. Thus, the command's primary functionality is to act as a text editor for stdin data with stdout as the output target. Over the years, functionality was added to edit file input and save the changes back to the same file.

The substitute command has the syntax s/REGEXP/REPLACEMENT/FLAGS where:

s stands for the substitute command
/ is an idiomatic delimiter character to separate the various portions of the command
REGEXP stands for regular expression
REPLACEMENT specifies the replacement string
FLAGS are options to change the default behavior of the command

1.1. Basic Usage

printf '1,2,3,4\na,b,c,d\n' | sed 's/,/-/g'
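
To illustrate what the g flag changes, here is a small sketch that runs the same substitution with and without it on the sample input above. The variable names first_only and all_matches are made up for this illustration.

```shell
# Without the g flag, only the first comma on each line is replaced
first_only=$(printf '1,2,3,4\na,b,c,d\n' | sed 's/,/-/')
printf '%s\n' "$first_only"
# 1-2,3,4
# a-b,c,d

# With the g flag, every comma on each line is replaced
all_matches=$(printf '1,2,3,4\na,b,c,d\n' | sed 's/,/-/g')
printf '%s\n' "$all_matches"
# 1-2-3-4
# a-b-c-d
```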

2. In-place file editing

2.1. With backup

When an extension is provided as an argument to the -i option, the original contents of the input file get preserved in a backup file whose name has that extension as a suffix. For example, if the input file is ip.txt and -i.orig is used, the backup file will be named ip.txt.orig

sed -i.bkp 's/blue/green/' examples/sed/colors.txt
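
The following is a minimal sketch of the backup behavior, assuming GNU sed and a throwaway directory created with mktemp. The file contents and the variable names are made up for illustration.

```shell
# Work in a temporary directory so that no real files are touched
tmpdir=$(mktemp -d)
printf 'deep blue\nlight blue\n' > "$tmpdir/colors.txt"

# Edit in place, keeping the original contents as colors.txt.bkp
sed -i.bkp 's/blue/green/' "$tmpdir/colors.txt"

edited=$(cat "$tmpdir/colors.txt")       # modified contents
backup=$(cat "$tmpdir/colors.txt.bkp")   # original contents
printf 'edited:\n%s\nbackup:\n%s\n' "$edited" "$backup"

rm -r "$tmpdir"
```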

2.2. Without backup

Sometimes backups are not desirable. Using the -i option on its own will prevent creating backups. Be careful though, as changes made cannot be undone. In such cases, test the command with sample input before using the -i option on the actual file. You could also use the option with a backup, compare the differences with a diff program and then delete the backup.

sed -i 's/an/AN/g' examples/sed/fruits.txt
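
One possible way to follow the cautious workflow described above is sketched here, assuming GNU sed and a temporary file. The file contents and variable names are illustrative only.

```shell
tmpdir=$(mktemp -d)
printf 'banana\npapaya\nmango\n' > "$tmpdir/fruits.txt"

# Step 1: edit in place, but keep a backup for review
sed -i.bkp 's/an/AN/g' "$tmpdir/fruits.txt"

# Step 2: compare the backup with the edited file
# diff exits with status 1 when the files differ, hence the '|| true'
changes=$(diff "$tmpdir/fruits.txt.bkp" "$tmpdir/fruits.txt" || true)
printf '%s\n' "$changes"

# Step 3: once satisfied with the changes, delete the backup
rm "$tmpdir/fruits.txt.bkp"

result=$(cat "$tmpdir/fruits.txt")
rm -r "$tmpdir"
```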

2.3. Multiple files

Multiple input files are treated individually and the changes are written back to the respective files.

sed -i.bkp 's/bad/good/' examples/sed/f1.txt examples/sed/f2.txt

Shell globbing (wildcards) can also be used to specify the input files:

sed -i.bkp 's/bad/good/' examples/sed/f?.*

2.4. Prefix backup name

A * character in the argument to the -i option is special. It will get replaced with the input filename. This is helpful if you need to use a prefix instead of a suffix for the backup filename, or any other combination that may be needed.

sed -i'bkp.*' 's/green/yellow/' examples/sed/colors.txt
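
Here is a small sketch of the prefix form, assuming GNU sed. To keep things simple, it runs from inside a temporary directory with the input given as a bare filename. The file contents and variable names are made up.

```shell
tmpdir=$(mktemp -d)
printf 'green leaf\n' > "$tmpdir/colors.txt"

# Run from inside the directory, with the input given as a bare filename
(cd "$tmpdir" && sed -i'bkp.*' 's/green/yellow/' colors.txt)

edited=$(cat "$tmpdir/colors.txt")       # modified contents
backup=$(cat "$tmpdir/bkp.colors.txt")   # original contents, prefixed name
printf 'colors.txt: %s\nbkp.colors.txt: %s\n' "$edited" "$backup"

rm -r "$tmpdir"
```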

2.5. Place backups in different directory

The * trick can also be used to place the backups in another directory instead of the parent directory of input files. The backup directory should already exist for this to work.

sed -i'backups/*' 's/good/nice/' examples/sed/f?.*
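
A minimal sketch of backups going to a separate directory, assuming GNU sed and bare filenames in the current directory. The directory layout, file contents and variable names are hypothetical.

```shell
tmpdir=$(mktemp -d)
mkdir "$tmpdir/backups"   # the backup directory must already exist
printf 'good day\n' > "$tmpdir/f1.txt"
printf 'good night\n' > "$tmpdir/f2.txt"

# The glob f?.* matches f1.txt and f2.txt here
(cd "$tmpdir" && sed -i'backups/*' 's/good/nice/' f?.*)

edited=$(cat "$tmpdir/f1.txt" "$tmpdir/f2.txt")
saved=$(cat "$tmpdir/backups/f1.txt" "$tmpdir/backups/f2.txt")
printf 'edited:\n%s\nbackups:\n%s\n' "$edited" "$saved"

rm -r "$tmpdir"
```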

2.6. Cheatsheet and summary

Note             Description
-i               after processing, write back changes to the source file(s); changes made cannot be undone, so use this option with caution
-i.bkp           in addition to in-place editing, preserve original contents to a file whose name is derived from the input filename with .bkp as a suffix
-i'bkp.*'        * here gets replaced with the input filename, thus providing a way to add a prefix instead of a suffix
-i'backups/*'    this will place the backup copy in a different existing directory instead of the source directory

3. Selective editing

By default, sed acts on the entire input content. Often, you only want to act upon specific portions of the input. To that end, sed has features to filter lines, similar to tools like grep, head and tail. sed can replicate most of grep's filtering features without too much fuss. It also has features like line number based filtering, selecting lines between two patterns, relative addressing, etc., which aren't possible with grep. If you are familiar with functional programming, you would have come across the map, filter, reduce paradigm. A typical task with sed involves filtering a subset of the input and then modifying (mapping) it. Sometimes, the subset is the entire input, as seen in the examples of the previous chapters.

3.1. Conditional execution

As seen earlier, the syntax for substitute command is s/REGEXP/REPLACEMENT/FLAGS. The /REGEXP/FLAGS portion can be used as a conditional expression to allow commands to execute only for the lines matching the pattern.

printf '1,2,3,4\na,b,c,d\n' | sed '/2/ s/,/-/g'

Use /REGEXP/FLAGS! to act upon lines other than the matching ones.

printf '1,2,3,4\na,b,c,d\n' | sed '/2/! s/,/-/g'
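
To make the effect of the condition and its negation concrete, this sketch captures the output of both forms for the same sample input. The variable names are made up for illustration.

```shell
# Substitute only on lines containing 2
matching=$(printf '1,2,3,4\na,b,c,d\n' | sed '/2/ s/,/-/g')
printf '%s\n' "$matching"
# 1-2-3-4
# a,b,c,d

# Substitute only on lines NOT containing 2
others=$(printf '1,2,3,4\na,b,c,d\n' | sed '/2/! s/,/-/g')
printf '%s\n' "$others"
# 1,2,3,4
# a-b-c-d
```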

3.2. Delete command

To delete the filtered lines, use the d command. Recall that all input lines are printed by default.

# same as: grep -v 'at'
printf 'sea\neat\ndrop\n' | sed '/at/d'

# same as: grep 'at'
printf 'sea\neat\ndrop\n' | sed '/at/!d'
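
As a quick check of how the d command relates to grep, this sketch compares the two on the same sample input. The variable names are illustrative.

```shell
# Deleting matching lines corresponds to grep -v
sed_d=$(printf 'sea\neat\ndrop\n' | sed '/at/d')
grep_v=$(printf 'sea\neat\ndrop\n' | grep -v 'at')
printf '%s\n' "$sed_d"
# sea
# drop

# Deleting non-matching lines corresponds to plain grep
sed_not_d=$(printf 'sea\neat\ndrop\n' | sed '/at/!d')
grep_plain=$(printf 'sea\neat\ndrop\n' | grep 'at')
printf '%s\n' "$sed_not_d"
# eat
```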

3.3. Print command

To print the filtered lines, use the p command. But, recall that all input lines are printed by default. So, this command is typically used in combination with -n command line option, which would turn off the default printing.

# same as: grep 'twice' examples/sed/programming_quotes.txt
sed -n '/twice/p' examples/sed/programming_quotes.txt

# same as: grep 'e th' examples/sed/programming_quotes.txt
sed -n '/e th/p' examples/sed/programming_quotes.txt

The substitute command provides p as a flag. In such a case, the modified line would be printed only if the substitution succeeded.

# same as: grep '1' programming_quotes.txt | sed 's/1/one/g'
sed -n 's/1/one/gp' examples/sed/programming_quotes.txt

# filter + substitution + p combination
# similar to: grep 'not' programming_quotes.txt | sed 's/in/**/g'
# (identical output only if every line containing 'not' also contains 'in')
sed -n '/not/ s/in/**/gp' examples/sed/programming_quotes.txt

Using !p with -n option will be equivalent to using d command.

# same as: sed '/at/d'
printf 'sea\neat\ndrop\n' | sed -n '/at/!p'
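
The interaction between p and the default printing can be seen with a short inline input. This is only a sketch, and the variable names are made up.

```shell
# Without -n, a matching line appears twice:
# once from the p command and once from the default printing
doubled=$(printf 'sea\neat\ndrop\n' | sed '/at/p')
printf '%s\n' "$doubled"
# sea
# eat
# eat
# drop

# With -n, only the explicitly printed line is shown
only_match=$(printf 'sea\neat\ndrop\n' | sed -n '/at/p')
printf '%s\n' "$only_match"
# eat

# The p flag of the substitute command prints only the modified lines
changed=$(printf 'sea\neat\ndrop\n' | sed -n 's/ea/EA/p')
printf '%s\n' "$changed"
# sEA
# EAt
```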

3.4. Quit commands

Using the q command will exit sed immediately, without any further processing.

# quits after an input line containing 'if' is found
sed '/if/q' examples/sed/programming_quotes.txt 

Q command is similar to q but won't print the matching line.

sed '/if/Q' examples/sed/programming_quotes.txt 

To get all lines from the last occurrence of the search string to the end of the file, reverse the input with tac, quit at the first match and then reverse the result again.

tac examples/sed/programming_quotes.txt | sed '/not/q' | tac
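
The difference between q and Q, and the tac technique, can be sketched with short inline inputs. This assumes GNU sed (Q is a GNU extension) and the tac command from GNU coreutils. The sample lines and variable names are made up.

```shell
# q prints the matching line before quitting
up_to_match=$(printf 'sea\neat\ndrop\n' | sed '/at/q')
printf '%s\n' "$up_to_match"
# sea
# eat

# Q quits without printing the matching line
before_match=$(printf 'sea\neat\ndrop\n' | sed '/at/Q')
printf '%s\n' "$before_match"
# sea

# From the last line containing 'not' to the end of the input
from_last=$(printf 'not a\nb\nnot c\nd\n' | tac | sed '/not/q' | tac)
printf '%s\n' "$from_last"
# not c
# d
```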

You can optionally provide an exit status (from 0 to 255) along with the quit commands.

printf 'sea\neat\ndrop\n' | sed '/at/q2'
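
A short sketch of how the exit status could be inspected from a script. The variable names status and output are only for illustration.

```shell
status=0
# The exit status of the pipeline is that of sed, its last command
output=$(printf 'sea\neat\ndrop\n' | sed '/at/q2') || status=$?

printf '%s\n' "$output"
# sea
# eat
printf 'exit status: %s\n' "$status"
# exit status: 2
```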

3.5. Line addressing

Line numbers can also be used as a filtering criterion.

# here, 3 represents the address for the print command
# same as: head -n3 programming_quotes.txt | tail -n1 and sed '3!d'
sed -n '3p' examples/sed/programming_quotes.txt
echo '--------------------'
# print 2nd and 5th line
sed -n '2p; 5p' examples/sed/programming_quotes.txt
echo '--------------------'
# substitution only on 2nd line
printf 'gates\nnot\nused\n' | sed '2 s/t/*/g'

As a special case, $ indicates the last line of the input.

# same as: tail -n1 programming_quotes.txt
sed -n '$p' examples/sed/programming_quotes.txt
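
Since the sample file is not reproduced here, the same line addresses are sketched below on a small inline input. The input letters and variable names are made up.

```shell
# Print only the 3rd line
third=$(printf 'a\nb\nc\nd\ne\n' | sed -n '3p')
printf '%s\n' "$third"
# c

# Print the 2nd and 5th lines
picked=$(printf 'a\nb\nc\nd\ne\n' | sed -n '2p; 5p')
printf '%s\n' "$picked"
# b
# e

# $ addresses the last line of the input
last=$(printf 'a\nb\nc\nd\ne\n' | sed -n '$p')
printf '%s\n' "$last"
# e
```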

For large input files, use q command to avoid processing unnecessary input lines.

seq 3542 4623452 | sed -n '2452{p; q}'
echo '--------------------'
seq 3542 4623452 | sed -n '250p; 2452{p; q}'
echo '--------------------'
# here is a sample time comparison
time seq 3542 4623452 | sed -n '2452{p; q}' > examples/sed/f1
echo '--------------------'
time seq 3542 4623452 | sed -n '2452p' > examples/sed/f2

Mimicking head command using line addressing and the q command.

# same as: seq 23 45 | head -n5
seq 23 45 | sed '5q'
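
The two uses of q with a line address can be checked on a small range. This is a sketch with made-up variable names; the input is kept short on purpose.

```shell
# Print the first five lines, then quit (like head -n5)
first_five=$(seq 23 45 | sed '5q')
printf '%s\n' "$first_five"
# 23
# 24
# 25
# 26
# 27

# Print only the 3rd line and quit right away,
# so the remaining lines are never processed
third_only=$(seq 23 45 | sed -n '3{p; q}')
printf '%s\n' "$third_only"
# 25
```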

3.6. Print only line number

The = command will display the line numbers of matching lines.

# gives both line number and matching line
grep -n 'not' examples/sed/programming_quotes.txt

# gives only line number of matching line
sed -n '/not/=' examples/sed/programming_quotes.txt 

If needed, matching line can also be printed. But there will be a newline character between the matching line and line number.

sed -n '/off/{=; p}' examples/sed/programming_quotes.txt
echo '--------------------'
sed -n '/off/{p; =}' examples/sed/programming_quotes.txt
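
To show what the = command produces, here is a sketch on short inline inputs. The sample words and variable names are illustrative only.

```shell
# Line numbers of all lines containing 'at'
numbers=$(printf 'sea\neat\ndrop\neats\n' | sed -n '/at/=')
printf '%s\n' "$numbers"
# 2
# 4

# Line number first, then the matching line on the next line
number_then_line=$(printf 'sea\neat\ndrop\n' | sed -n '/at/{=; p}')
printf '%s\n' "$number_then_line"
# 2
# eat

# Matching line first, then its line number
line_then_number=$(printf 'sea\neat\ndrop\n' | sed -n '/at/{p; =}')
printf '%s\n' "$line_then_number"
# eat
# 2
```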