3.45 msed - Replace String Matching Regular Expression

In the fields specified in the f= parameter, replace the substring that matches the regular expression specified in the c= parameter with the string specified in the v= parameter.

Format

msed c= f= v= [-A] [-g] [-W] [i=] [o=] [-nfn] [-nfno] [-x] [--help] [--version]

Parameters

f=

Specify the name(s) of the field(s) to process (multiple fields can be specified).

c=

Specify the regular expression that selects the substring to replace.

See "Using regular expressions" for the supported syntax.

v=

Specify the string that replaces the substring matching the regular expression specified in the c= parameter.

The following placeholders can be used in v= to refer to the match result:

$& : The matched string.

$` : The part of the target string that precedes the matched string.

$' : The part of the target string that follows the matched string.

$N : The substring matched by the N-th group (N>=1).

-A

Add the result as a new field instead of replacing the specified field.

-g

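Replace all matches of the regular expression. Without this option, only the first match in each value is replaced (compare Example 5 and Example 6).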

-W

Replace wide character matches of the regular expression.
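The v= placeholders and the effect of -g can be sketched in Python. This is only an emulation built on Python's re module, not msed's implementation. The helper names expand and msed_like are hypothetical, and the sketch assumes that msed's regular expression engine treats these simple patterns the same way Python does.

```python
import re

def expand(template: str, m: re.Match) -> str:
    """Expand $&, $`, $' and $N (N = 1..9) in template for the match m."""
    def repl(p: re.Match) -> str:
        key = p.group(1)
        if key == "&":                      # the matched string
            return m.group(0)
        if key == "`":                      # text preceding the match
            return m.string[:m.start()]
        if key == "'":                      # text following the match
            return m.string[m.end():]
        return m.group(int(key)) or ""      # N-th group
    return re.sub(r"\$([&`']|[1-9])", repl, template)

def msed_like(value: str, pattern: str, template: str, global_: bool = False) -> str:
    """Replace the first match, or every match when global_ is True (like -g)."""
    count = 0 if global_ else 1             # count=0 means "replace all" in re.sub
    return re.sub(pattern, lambda m: expand(template, m), value, count=count)

print(msed_like("abbc", "b+", "#$&#"))               # a#bb#c
print(msed_like("abbc", "b", "#$&#", global_=True))  # a#b##b#c
print(msed_like("abbc", "b", "#$`#"))                # a#a#bc
print(msed_like("abbc", "b", "#$'#"))                # a#bc#bc
```

The four calls use the value abbc from dat3.csv in the Examples section and reproduce the results shown there for Examples 5 to 8.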

Using regular expressions

The regular expressions that can be specified in the c= parameter are listed in Tables 3.12 to 3.15.

Table 3.12: Regular expressions that match a single character

Regular expression  Description                             Input string  Example of c=, v=  Result
.                   Any character                           abbbcc        c=. v=X -g         XXXXXX
[abc]               Any one of a, b, or c                   abbbcc        c=[ac] v=X -g      XbbbXX
[^abc]              Any character other than a, b, or c     abbbcc        c=[^ac] v=X -g     aXXXcc
[a-z]               Any character in the range a to z       abbbcc        c=[a-b] v=X -g     XXXXcc
[^a-z]              Any character outside the range a to z  abbbcc        c=[^a-b] v=X -g    abbbXX
\t                  Tab character
\w                  Word character ([0-9a-zA-Z_])           ab#cd&ef      c=\w v=X -g        XX#XX&XX
\W                  Non-word character                      ab#cd&ef      c=\W v=X -g        abXcdXef
\s                  Whitespace character ([ \t])            ab cd ef      c=\s v=X -g        abXcdXef
\S                  Non-whitespace character                ab cd ef      c=\S v=X -g        XX XX XX
\d                  Digit ([0-9])                           ab12c0        c=\d v=X -g        abXXcX
\D                  Non-digit character                     ab12c0        c=\D v=X -g        XX12X0
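The single-character classes can be tried out quickly with Python's re module. This is a sketch under the assumption that msed's engine agrees with Python on these classes; replace_all is a hypothetical helper that plays the role of v=X with -g.

```python
import re

def replace_all(pattern: str, text: str, repl: str = "X") -> str:
    # re.ASCII limits \w, \d and \s to ASCII, matching the table's definitions.
    return re.sub(pattern, repl, text, flags=re.ASCII)

print(replace_all(r"[ac]", "abbbcc"))    # XbbbXX
print(replace_all(r"[^ac]", "abbbcc"))   # aXXXcc
print(replace_all(r"[a-b]", "abbbcc"))   # XXXXcc
print(replace_all(r"\w", "ab#cd&ef"))    # XX#XX&XX
print(replace_all(r"\W", "ab#cd&ef"))    # abXcdXef
print(replace_all(r"\S", "ab cd ef"))    # XX XX XX
print(replace_all(r"\D", "ab12c0"))      # XX12X0
```

Note that the uppercase classes (\W, \S, \D) are the complements of the lowercase ones, so the pattern must be written in uppercase to obtain the complemented result.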

Table 3.13: Repetition of regular expressions

Regular expression  Description                                  Input string  Example of c=, v=   Result
a*                  Zero or more repetitions of a                abbbcc        c=ab* v=X           Xcc
a+                  One or more repetitions of a                 abbbcc        c=ab+ v=X           Xcc
a?                  Zero or one occurrence of a                  abbbcc        c=ab? v=X           Xbbcc
a{M,N}              At least M and at most N repetitions of a    abbbbbcc      c=ab{3,4} v=X       Xbcc
a{M}                Exactly M repetitions of a                   abbbbbcc      c=ab{3} v=X         Xbbcc
a|b                 a or b                                       abbbc         c=(ab)|(bc) v=X -g  XbX
?                   Shortest match for the preceding repetition  abbbc         c=ab*? v=X          Xbbbc
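The repetition operators can be checked with a short Python sketch. It assumes that msed's engine uses the same greedy and non-greedy semantics as Python's re module; replace_first and replace_all are hypothetical helpers standing in for msed without and with -g.

```python
import re

def replace_first(pattern: str, text: str, repl: str = "X") -> str:
    return re.sub(pattern, repl, text, count=1)   # first match only

def replace_all(pattern: str, text: str, repl: str = "X") -> str:
    return re.sub(pattern, repl, text)            # every match

print(replace_first(r"ab*", "abbbcc"))        # Xcc    (greedy: a plus all b's)
print(replace_first(r"ab?", "abbbcc"))        # Xbbcc  (a plus at most one b)
print(replace_first(r"ab{3,4}", "abbbbbcc"))  # Xbcc   (a plus 3 to 4 b's, takes 4)
print(replace_first(r"ab{3}", "abbbbbcc"))    # Xbbcc  (a plus exactly 3 b's)
print(replace_first(r"ab*?", "abbbc"))        # Xbbbc  (non-greedy: a alone)
print(replace_first(r"(ab)|(bc)", "abbbc"))   # Xbbc   (only the first alternative hit)
print(replace_all(r"(ab)|(bc)", "abbbc"))     # XbX    (both ab and bc replaced)
```

The last two lines show that the alternation result XbX is only obtained when every match is replaced.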

Table 3.14: Position of regular expression

Regular expression  Description                                  Input string   Example of c=, v=  Result
^                   Beginning of the string                      abac           c=^a v=X -g        Xbac
$                   End of the string                            acac           c=c$ v=X -g        acaX
\b                  Word boundary                                aac ba ac bac  c=\ba v=X -g       Xac ba Xc bac
\B                  Position that is not a word boundary         aac ba ac bac  c=\Ba v=X -g       aXc bX ac bXc
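The position operators can be illustrated with Python's re module, again assuming that msed's engine follows the usual Perl-style meaning of ^, $, \b and \B. replace_all is a hypothetical helper corresponding to v=X with -g.

```python
import re

def replace_all(pattern: str, text: str, repl: str = "X") -> str:
    return re.sub(pattern, repl, text)

text = "aac ba ac bac"

print(replace_all(r"^a", "abac"))   # Xbac  (only the a at the start of the string)
print(replace_all(r"c$", "acac"))   # acaX  (only the c at the end of the string)
print(replace_all(r"\ba", text))    # Xac ba Xc bac  (a at the start of a word)
print(replace_all(r"\Ba", text))    # aXc bX ac bXc  (a inside or at the end of a word)
```

Under these semantics \ba and \Ba are complementary: every a in the input is replaced by exactly one of the two patterns.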

Table 3.15: Others

Regular expression  Description                                        Input string  Example of c=, v=  Result
(expr)              Grouping
\1,..,\9            Back reference to a preceding group                abbcababc     c=(ab)(bc)\1 v=X   Xabc
(?=expr)            Lookahead: position followed by a match of expr
(?!expr)            Negative lookahead: position not followed by a match of expr
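Back references and lookahead are easier to follow with concrete inputs. The sketch below uses Python's re module and assumes msed's engine handles these constructs in the same Perl-compatible way. The lookahead inputs are illustrative and do not come from the table.

```python
import re

# \1 must repeat exactly what group 1 matched: (ab)(bc)\1 matches "abbcab".
print(re.sub(r"(ab)(bc)\1", "X", "abbcababc", count=1))   # Xabc

# Lookahead checks what follows without consuming it.
print(re.sub(r"a(?=b)", "X", "abac"))   # Xbac  (a followed by b)
print(re.sub(r"a(?!b)", "X", "abac"))   # abXc  (a not followed by b)
```

Because the lookahead itself is not part of the match, the b after the first a is left in place.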

Examples

Example 1: Basic Example

Replace the 4-character substring starting with 00 in the zipCode field with ####.

$ more dat1.csv
customer,zipCode
A,6230041
B,6240053
C,6330032
D,6230087
E,6530095
$ msed f=zipCode c=00.. v=#### i=dat1.csv o=rsl1.csv
#END# kgsed c=00.. f=zipCode i=dat1.csv o=rsl1.csv v=####
$ more rsl1.csv
customer,zipCode
A,623####
B,624####
C,633####
D,623####
E,653####

Example 2: Specify field name

Replace the 4-digit substring starting with 00 in the zipCode field with ####, and output the result under the field name zipCode4.

$ msed f=zipCode:zipCode4 c='00\d\d' v=#### i=dat1.csv o=rsl2.csv
#END# kgsed c=00\d\d f=zipCode:zipCode4 i=dat1.csv o=rsl2.csv v=####
$ more rsl2.csv
customer,zipCode4
A,623####
B,624####
C,633####
D,623####
E,653####
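The field handling in Examples 1 and 2 can be approximated in Python as follows. This is a rough emulation, not msed itself: the function sed_csv and its parameters are hypothetical, the replacement string is treated literally (no $ placeholders), and the behavior of the add flag is only an assumption about how -A combines with f=name:newname.

```python
import csv
import io
import re

def sed_csv(text, field, pattern, repl, new_name=None, add=False, global_=False):
    """Apply a regex replacement to one CSV field and return the new CSV text."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    idx = header.index(field)
    count = 0 if global_ else 1              # 0 = replace every match (like -g)

    out_header = list(header)
    if add:                                  # assumed -A behavior: keep the original field
        if not new_name:
            raise ValueError("add=True needs new_name")
        out_header.append(new_name)
    elif new_name:                           # f=field:new_name renames the output field
        out_header[idx] = new_name

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(out_header)
    for row in body:
        value = re.sub(pattern, lambda m: repl, row[idx], count=count)
        new_row = list(row)
        if add:
            new_row.append(value)
        else:
            new_row[idx] = value
        writer.writerow(new_row)
    return out.getvalue()

DAT1 = "customer,zipCode\nA,6230041\nB,6240053\nC,6330032\nD,6230087\nE,6530095\n"
print(sed_csv(DAT1, "zipCode", r"00\d\d", "####", new_name="zipCode4"))
```

With the dat1.csv contents above, the printed text has the header customer,zipCode4 and the same five rows as rsl2.csv.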

Example 3: Global replacement

Use the -g option to replace every 0 in the zipCode field with -.

$ msed f=zipCode c=0 v=- -g i=dat1.csv o=rsl3.csv
#END# kgsed -g c=0 f=zipCode i=dat1.csv o=rsl3.csv v=-
$ more rsl3.csv
customer,zipCode
A,623--41
B,624--53
C,633--32
D,623--87
E,653--95

Example 4: Replace substring

Delete fruit from the beginning of each value in the item field. Because the pattern is anchored with ^, only the leading fruit is removed; the fruit inside grapefruit in the last row is retained even though -g is specified.

$ more dat2.csv
item,price
fruit:apple,100
fruit:peach,250
fruit:pineapple,300
fruit:orange,450
fruit:grapefruit,500
$ msed f=item c='^fruit' v= -g i=dat2.csv o=rsl4.csv
#END# kgsed -g c=^fruit f=item i=dat2.csv o=rsl4.csv v=
$ more rsl4.csv
item,price
:apple,100
:peach,250
:pineapple,300
:orange,450
:grapefruit,500
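The effect of the ^ anchor in Example 4 can be seen by comparing the anchored and unanchored patterns on the last row. The sketch uses Python's re module and assumes msed treats ^ the same way; the unanchored result is shown only for contrast and is not taken from an msed run.

```python
import re

item = "fruit:grapefruit"

# Anchored: only the occurrence at the beginning of the value is removed.
print(re.sub(r"^fruit", "", item))   # :grapefruit

# Unanchored, replacing every match: the fruit inside grapefruit is removed too.
print(re.sub(r"fruit", "", item))    # :grape
```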

Example 5: Substitution using match results

Enclose the first run of one or more consecutive b characters in #, using $& in the v= parameter to refer to the matched string.

$ more dat3.csv
str1
abc
abbc
ac
$ msed f=str1 c='b+' v='#$&#' i=dat3.csv o=rsl5.csv
#END# kgsed c=b+ f=str1 i=dat3.csv o=rsl5.csv v=#$&#
$ more rsl5.csv
str1
a#b#c
a#bb#c
ac

Example 6: Combination with global matching

When -g is specified, the string defined in v= is evaluated separately for each match, so $& expands to each matched b in turn.

$ msed f=str1 c=b v='#$&#' -g i=dat3.csv o=rsl6.csv
#END# kgsed -g c=b f=str1 i=dat3.csv o=rsl6.csv v=#$&#
$ more rsl6.csv
str1
a#b#c
a#b##b#c
ac

Example 7: Prefix substitution

Replace the first b with the prefix (the part of the string that precedes the match) enclosed in #, using $`.

$ msed f=str1 c=b v='#$`#' i=dat3.csv o=rsl7.csv
#END# kgsed c=b f=str1 i=dat3.csv o=rsl7.csv v=#$`#
$ more rsl7.csv
str1
a#a#c
a#a#bc
ac

Example 8: Suffix substitution

Replace the first b with the suffix (the part of the string that follows the match) enclosed in #, using $'.

$ msed f=str1 c=b v="#$'#" i=dat3.csv o=rsl8.csv
#END# kgsed c=b f=str1 i=dat3.csv o=rsl8.csv v=#$'#
$ more rsl8.csv
str1
a#c#c
a#bc#bc
ac

Related Commands

mchgstr : Replaces strings using simple string matching instead of regular expressions.

mcal : Provides several functions that handle regular expressions.