Wednesday, June 17, 2015

sed by Example


Syntax: sed [options] 'instruction' file

You can pass more than one instruction by repeating -e:
sed -e 'instruction1' -e 'instruction2' -e 'instruction3'
The most important sed option is -n. By default, sed prints every line after processing it, whether or not an instruction changed it. With -n, this automatic printing is suppressed and sed prints only what you explicitly ask for with p. The print examples below need -n; without it, every line would be printed and the selected lines would appear twice.

Important Note: When deleting, do not use the -n option, because the remaining lines must still be printed automatically.

Important Note 2: Whenever you use -n, you must add p to the instruction (as a command, or as a flag of s) to print to STDOUT.
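To see the difference -n makes, here is a minimal sketch on a made-up three-line file (the file and its contents are just for illustration):

```shell
# Hypothetical three-line sample file
sample=$(mktemp)
printf '%s\n' one two three > "$sample"

# Without -n: every line is auto-printed, and line 2 is printed again by p
without_n=$(sed '2p' "$sample")

# With -n: auto-print is off, so only the explicit p produces output
with_n=$(sed -n '2p' "$sample")

printf 'without -n:\n%s\n\nwith -n:\n%s\n' "$without_n" "$with_n"
rm -f "$sample"
```

Without -n, line 2 shows up twice; with -n, it is the only line printed.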


Print


To print the 25th line of a file:
$ sed -n '25p' /etc/passwd
To print line 24 to line 26:
$ sed -n '24,26p' /etc/passwd

To print all lines but the 3rd one:
$ sed -n '3!p' /etc/passwd
To print all lines except 3 to 7:
$ sed -n '3,7!p' /etc/passwd
To print the last line of a file:
$ sed -n '$p' /etc/passwd
To print all lines containing test:
$ sed -n '/test/p' /etc/passwd

To ignore case and print all lines containing test, Test, teSt, etc. (the I flag is a GNU sed extension):
$ sed -n '/test/Ip' /etc/passwd
You can also use Bash wildcards like * and ? in the file name:
$ sed -n '/behnam/Ip' testfile*.txt
If we have a list of names and want to print a range of lines from one pattern to another:
$ sed -n '/nagios/,/ntp/p' /etc/passwd
This prints from the first line containing nagios through the next line containing ntp. If nagios appears again later in the file, a new range starts there.



To print the first line containing nagios plus the two lines that follow it:
$ sed -n '/nagios/,+2p' /etc/passwd
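The two kinds of ranges can be compared on a small made-up list (the names below are only sample data, and ,+2 assumes GNU sed):

```shell
# Hypothetical list of account names, one per line
sample=$(mktemp)
printf '%s\n' root nagios daemon bin ntp mail > "$sample"

# From the first line matching nagios through the next line matching ntp
range_out=$(sed -n '/nagios/,/ntp/p' "$sample")

# The line matching nagios plus the two lines after it
plus_out=$(sed -n '/nagios/,+2p' "$sample")

printf 'pattern range:\n%s\n\nplus two lines:\n%s\n' "$range_out" "$plus_out"
rm -f "$sample"
```

The pattern range stops at ntp, while ,+2 stops after a fixed number of lines no matter what they contain.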
The following example does not work as intended, because in sed's default (basic) regular expressions + and ? are literal characters. In GNU sed they act as quantifiers only when escaped with \ (the * quantifier needs no escape):
$ sed -n '/^b.+/Ip' /etc/passwd
So we should run:
$ sed -n '/^b.\+/Ip' /etc/passwd
Now it detects and prints lines beginning with Behnam, behnam, bp, BP, etc. As you saw earlier, I is for ignoring case.

The following example also works: the -r option enables extended regular expressions, so + needs no \ before it. Note that -ner would not work, because -e treats whatever follows it as the script:
$ sed -nr '/^b.+/Ip' /etc/passwd
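Here is a quick sketch of the three variants side by side, assuming GNU sed and a few made-up names as input:

```shell
# Hypothetical input: four names, one per line
names=$(printf '%s\n' Behnam bp b alice)

# Basic regex, unescaped +: the + is a literal plus sign, so nothing matches
bre_literal=$(printf '%s\n' "$names" | sed -n '/^b.+/Ip')

# Basic regex, escaped \+: one or more characters after the leading b
bre_escaped=$(printf '%s\n' "$names" | sed -n '/^b.\+/Ip')

# Extended regex (-r): + is a quantifier without the backslash
ere=$(printf '%s\n' "$names" | sed -nr '/^b.+/Ip')

printf 'literal +: [%s]\nescaped: [%s]\nextended: [%s]\n' \
  "$bre_literal" "$bre_escaped" "$ere"
```

The line b alone is not printed by any variant, because at least one character must follow the leading b.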

Delete

To delete 1st line: (do not use -n option)
$ sed '1d' /etc/passwd > ~/passwd
To delete last line:
$ sed '$d' /etc/passwd > ~/passwd
To delete all lines except the last one:
$ sed '$!d' /etc/passwd > ~/passwd
To delete every 2nd line starting from line 3, i.e. lines 3, 5, 7, ... (the first~step address is a GNU sed extension):
$ sed '3~2d' /etc/passwd > ~/passwd
To delete all blank lines in the file:
$ sed '/^$/d' /etc/passwd > ~/passwd
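The delete addresses above can be tried safely on a throwaway file instead of /etc/passwd. This is only a sketch with made-up data, assuming GNU sed for the 3~2 address:

```shell
# Hypothetical seven-line sample file
sample=$(mktemp)
printf '%s\n' l1 l2 l3 l4 l5 l6 l7 > "$sample"

# Delete lines 3, 5, 7: start at line 3, then every 2nd line
step_out=$(sed '3~2d' "$sample")

# Delete everything except the last line
keep_last=$(sed '$!d' "$sample")

# Delete empty lines from a three-line input with one blank line
no_blank=$(printf 'a\n\nb\n' | sed '/^$/d')

printf '%s\n---\n%s\n---\n%s\n' "$step_out" "$keep_last" "$no_blank"
rm -f "$sample"
```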

Substitute

The most common syntax for substitution is:
sed 's/LHS/RHS/g' file.txt
Or you can use Ig instead of g to ignore case.

LHS can be a literal string or a regex; RHS can be a literal string and can contain back references like & and \1

To delete all blank lines and replace behnam with bp as well:

$ sed -e '/^$/d' -e 's/behnam/bp/g' /etc/passwd

Note: As you can see, the syntax is similar to the find and replace command in vi:

:s/behnam/bp/g
If you want to do the same but edit the file in place, keeping a backup copy of the original with a .bak extension:
$ sed -i.bak -e '/^$/d' -e 's/behnam/bp/g' /etc/passwd
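Since in-place editing overwrites the file, it is safer to try it first on a scratch copy. A minimal sketch, using a hypothetical users.txt in a temporary directory:

```shell
# Work in a throwaway directory with a made-up file
workdir=$(mktemp -d)
printf 'behnam\n\nbehnam again\n' > "$workdir/users.txt"

# Edit in place; the untouched original is saved as users.txt.bak
sed -i.bak -e '/^$/d' -e 's/behnam/bp/g' "$workdir/users.txt"

edited=$(cat "$workdir/users.txt")
backup=$(cat "$workdir/users.txt.bak")

printf 'edited:\n%s\n\nbackup:\n%s\n' "$edited" "$backup"
rm -rf "$workdir"
```

The edited file has the changes, and the .bak file still holds the original content, including the blank line.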

To change Behnam in the test01.txt file to bp in the test02.txt file:
$ sed s/Behnam/bp/ test01.txt > test02.txt
Another way to do that is:
$ cat test01.txt | sed s/Behnam/bp/ > test02.txt
Note: using quotes is highly recommended. If the command contains metacharacters, quotes are necessary, so it is better to type:
$ sed 's/Behnam/bp/' test01.txt > test02.txt
To change Behnam to Behdad:
$ echo Behnam | sed 's/nam/dad/'
As you know, sed is line oriented. So if you have a file like this:
one two three, one two four
four two three two one
one hundred and one
And run:
$ sed 's/one/FIVE/' testfile.txt
The output would be:
FIVE two three, one two four
four two three two FIVE
FIVE hundred and one
Note that this changed one to FIVE only once on each line and did not touch the second occurrence.
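How many matches get replaced on a line is controlled by the flag after the last delimiter. A small sketch on the first sample line above:

```shell
line='one two three, one two four'

# No flag: only the first match on the line is replaced
first_only=$(printf '%s\n' "$line" | sed 's/one/FIVE/')

# g flag: every match on the line is replaced
all_matches=$(printf '%s\n' "$line" | sed 's/one/FIVE/g')

# Numeric flag: only the 2nd match on the line is replaced
second_only=$(printf '%s\n' "$line" | sed 's/one/FIVE/2')

printf '%s\n%s\n%s\n' "$first_only" "$all_matches" "$second_only"
```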


To replace every Behnam with Ben only in lines that contain Pournader (the second form uses -n and the p flag to print only the changed lines):
$ sed '/Pournader/s/Behnam/Ben/g' testfile.txt
$ sed -n '/Pournader/s/Behnam/Ben/gp' testfile.txt
To replace every Pournader with 123 only in lines that start with Behnam (case insensitive):
$ sed '/^Behnam/Is/Pournader/123/g' testfile.txt

If you want to change a pathname that contains slashes, you can use a backslash to quote each slash (here /tmp/passwd is just an illustrative replacement):
$ sed 's/\/etc\/passwd/\/tmp\/passwd/' old_file > new_file
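All those backslashes are hard to read. The s command accepts other delimiter characters in place of the slash, so a character that does not occur in the paths can be used instead. A minimal sketch, with made-up paths and | as the delimiter:

```shell
# Escaping every slash in the pattern and the replacement
with_slashes=$(echo 'cp /etc/passwd /backup' | sed 's/\/etc\/passwd/\/tmp\/passwd/')

# Same substitution using | as the delimiter, no escaping needed
with_pipes=$(echo 'cp /etc/passwd /backup' | sed 's|/etc/passwd|/tmp/passwd|')

printf '%s\n%s\n' "$with_slashes" "$with_pipes"
```

Both commands produce the same line.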

Back References

We can use & in the RHS to stand for the matched string, i.e. the full text matched by the pattern.

To search for a pattern and add some characters, like parentheses, around it:
$ sed 's/[a-z1-9]*/(&)/' old_file > new_file
You can also double a pattern:
$ sed 's/[a-z1-9]*/& &/' old_file > new_file
$ echo "123 abc" | sed -n 's/[0-9]*/& &/p'
$ echo "123 abc" | sed -nr 's/[0-9]+/& &/p'
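One thing to watch with * is that it also matches the empty string. If the line does not start with a digit, [0-9]* matches nothing at the very beginning and the real number is never reached. A small sketch, assuming GNU sed for -r:

```shell
# [0-9]* matches the empty string before "abc", so "& &" becomes a single space
star=$(echo "abc 123" | sed 's/[0-9]*/& &/')

# [0-9]+ needs at least one digit, so it skips ahead and matches 123
plus=$(echo "abc 123" | sed -r 's/[0-9]+/& &/')

printf '[%s]\n[%s]\n' "$star" "$plus"
```

The brackets in the output make the leading space of the first result visible.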
To put "item:" at the beginning of each line: 
$ sed -n 's/.*/item: &/p' testfile.txt
To put "item:" at the beginning of each word, match each run of non-space characters instead of the whole line:
$ sed -n 's/[^ ][^ ]*/item: &/gp' testfile.txt
To search and print lines with a particular pattern:
$ sed -n 's/^Behnam/&/gp' testfile.txt
It is just another way to do:
$ sed -n '/^Behnam/p' testfile.txt
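To match a number with three to five digits (100 to 99999) and print the lines containing it:

$ sed -n 's/[1-9][0-9]\{2,4\}/&/gp' testfile.txt

Note: in basic regular expressions, { and } must be escaped with \ to act as an interval. With -r they are written without the backslash.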

\1 is the first remembered pattern and \2 is the second remembered pattern. We can continue up to \9

If you want to keep the 1st word of a line and delete the rest of the line, mark the important part with parentheses:
$ echo "behnam pournader" | sed -n 's/\([a-z]*\).*/\1/p'
[a-z]* matches 0 or more lower case letters (behnam), and .* matches zero or more characters after the first match ( pournader).
Note: Do not forget to use \( and \) to group the pattern when using \1

This returns abc, because again [a-z]* matches just abc and .* matches 123:
$ echo "abc123" | sed -n 's/\([a-z]*\).*/\1/p'
To keep the 1st word and delete the rest only when the word is followed by a space (so lines with a single word are not printed), we use:
$ sed -n 's/\([a-z]*\) .*/\1/p' testfile.txt
Note: Do not forget to put a space before the dot.

If you want to switch two words around:
$ echo "red dog" | sed -n 's/\([a-z]*\) \([a-z]*\)/\2 \1/p'
Note: The space between the two remembered patterns is there to make sure two words are found. If a line has just one word (or none), sed does not touch it in this case.

Again by using -r, backslash is not needed before ( and ):
$ echo "red dog" | sed -nr 's/([a-z]*) ([a-z]*)/\2 \1/p'
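The two spellings of the word swap can be checked next to each other, along with a one-word line that is left alone. This is just a sketch, assuming GNU sed for -r:

```shell
# Basic regex: groups are written \( and \)
swap_bre=$(echo "red dog" | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/')

# Extended regex (-r): groups are written ( and )
swap_ere=$(echo "red dog" | sed -r 's/([a-z]*) ([a-z]*)/\2 \1/')

# Only one word: the pattern needs a space, so the line is unchanged
one_word=$(echo "red" | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/')

printf '%s\n%s\n%s\n' "$swap_bre" "$swap_ere" "$one_word"
```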
If you want to eliminate duplicated words, you can try:
$ echo "behnam behnam" | sed -n 's/\([a-z]*\) \1/\1/p'
To just detect lines with duplicated words:
$ sed -n '/\([a-z][a-z]*\) \1/p' testfile.txt
To reverse the first three characters on a line:
$ echo "behnam" | sed -n 's/^\(.\)\(.\)\(.\)/\3\2\1/p'
Note: [A-Za-z]* won't match whole words like "won't", so it is better to use [^ ]*, which matches everything except a space. Be aware that it also matches the empty string, because * means 0 or more!

The following will put parentheses around just the 1st word in each line (no -n here, since there is no p flag and we want every line in the output):
$ sed 's/[^ ][^ ]*/(&)/' old_file > new_file
Note: [^ ] is used 2 times in order to avoid matching the null string.

As you saw before, if you want to make the change for every word, add a g after the last delimiter. Otherwise it replaces just the 1st match on each line.
$ sed 's/[^ ][^ ]*/(&)/g' old_file > new_file
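Here is the same pair of commands on a single made-up line, so the effect of g is easy to see:

```shell
# Without g: only the first run of non-space characters is wrapped
first_word=$(echo "won't stop now" | sed 's/[^ ][^ ]*/(&)/')

# With g: every run of non-space characters is wrapped
every_word=$(echo "won't stop now" | sed 's/[^ ][^ ]*/(&)/g')

printf '%s\n%s\n' "$first_word" "$every_word"
```

Because the pattern is "anything but a space", the apostrophe in won't stays inside the word.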
To keep the 1st word on the line but delete the 2nd one:
$ echo "red dog" | sed -n 's/\([a-zA-Z]*\) \([a-zA-Z]*\)/\1/p'


sed Script File

We can also put the instructions in a sed script file, using the following syntax:
sed -f SedScriptFile DataFile01.txt DataFile02.txt
So the contents of the sed script file can be:
/^$/d
s/behnam/bp/g
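Important note: in a sed script file we do not need shell quoting around the instructions, because the shell never sees them. Regex escapes such as \( and \+ are still written the same way.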
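Putting it together, here is a minimal sketch that writes the script file and runs it. The names cleanup.sed and data.txt are made up for this example:

```shell
# Work in a throwaway directory
workdir=$(mktemp -d)

# The script file: one instruction per line, no shell quotes around them
cat > "$workdir/cleanup.sed" <<'EOF'
# delete blank lines
/^$/d
# replace every behnam with bp
s/behnam/bp/g
EOF

# Hypothetical data file with one blank line
printf 'behnam\n\nuser behnam\n' > "$workdir/data.txt"

script_out=$(sed -f "$workdir/cleanup.sed" "$workdir/data.txt")
printf '%s\n' "$script_out"
rm -rf "$workdir"
```

Lines starting with # in the script file are comments, which is handy once a script grows beyond a couple of instructions.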
