3.49 mselstr - Select Records Matching Query String

For records where the values in fields f= match with the string specified at v= are selected.

Commonly used examples are shown in 3.20 - 3.22. In table 3.21, select records where val is "y" regardless of the value of key. In table 3.22, if any of the record contains the value "x" in val, records with the same key value will be selected. i.e. all records with value "a" in key column are selected. Since none of the records with key value "b" contains value "x", none records are selected.

Table 3.20: Input data

key

val

a

x

a

y

b

y

b

z

Table 3.21: f=val v=y

key

val

a

y

b

y

Table 3.22: k=key f=val v=x

key

val

a

x

a

y

Various selection criteria can be carried out with the parameters below. Use msel command to build complex conditions using regular expressions and operators which cannot be specified in this command.

Sample data with same key, containing two records and two columns is shown in (Table3.23).

mselstr k=key f=fld1,fld2 v=s1,s2

Matching criteria without -R,-F options are shown in 3.24.

Table 3.23: Input data

$\verb|key|$

$\verb|fld1|$

$\verb|fld2|$

k

$v_{a1}$

$v_{a2}$

k

$v_{b1}$

$v_{b2}$

Table 3.24: Using the input data shown in Table 3.23, the query results of the command mselstr k=key f=fld1,fld2 v=v1,v2 with and without -R and F options differs accordingly. If query matches all conditions, the output will print all rows (2 rows), when there is no matched records, no records will returned.

-F

-R

Matching conditions

   

(($v_{a1}$ == s1 or $v_{a1}$ == s2) or ($v_{a2}$ == s1 or $v_{a2}$ == s2)) or (($v_{b1}$ == s1 or $v_{b1}$ == s2) or ($v_{b2}$ == s1 or $v_{b2}$ == s2))

-F

 

(($v_{a1}$ == s1 or $v_{a1}$ == s2) and ($v_{a2}$ == s1 or $v_{a2}$ == s2)) or (($v_{b1}$ == s1 or $v_{b1}$ == s2) and ($v_{b2}$ == s1 or $v_{b2}$ == s2))

 

-R

(($v_{a1}$ == s1 or $v_{a1}$ == s2) or ($v_{a2}$ == s1 or $v_{a2}$ == s2)) and (($v_{b1}$ == s1 or $v_{b1}$ == s2) or ($v_{b2}$ == s1 or $v_{b2}$ == s2))

-F

-R

(($v_{a1}$ == s1 or $v_{a1}$ == s2) and ($v_{a2}$ == s1 or $v_{a2}$ == s2)) and (($v_{b1}$ == s1 or $v_{b1}$ == s2) and ($v_{b2}$ == s1 or $v_{b2}$ == s2))

Format

mselstr f= v= [k=] [u=] [-F] [-r] [-R] [-sub] [-head] [-tail] [-W] [i=] [o=] [bufcount=] [-nfn] [-nfno] [-x] [-q] [--help] [--version]

Parameters

f=

Target field name for query (allow multiple fields).

v=

Select rows(s) where the string specified at f= parameter matches any of the specified string(s) (allow multiple fields).

k=

Select records based on the defined key field (allow multiple fields).

o=

Print record(s) matching query to specified output file.

u=

Print unmatched record(s) to this output file.

-F

Match all character strings specified at the f= parameter.

-r

Reverse selection

 

Remove record matching records.

-R

Returns rows that match all character strings specified at the k= parameter.

-sub

Search for substring that matches the part of the string pattern.

 

Select records based on substring match specified in v= parameter,

 

against the string specified column(s) at the f= parameter.

-head

Match beginning of string

-tail

Match end of string

-W

Match a sequence of wide-character substring when -sub,-head,-tail option is specified.

Examples

Example 1: Basic example

Select records matching apple and orange in the Product field , print matching results to rsl1.csv file. Unmatched records such as pineapplejuice will be saved to other.csv file using the parameter u=oth1.csv.

$ more dat1.csv
Product,Amount
apple,100
milk,350
orange,100
pineapplejuice,500
wine,1000
$ mselstr f=Product v=apple,orange u=oth1.csv i=dat1.csv o=rsl1.csv
#END# kgselstr f=Product i=dat1.csv o=rsl1.csv u=oth1.csv v=apple,orange
$ more rsl1.csv
Product,Amount
apple,100
orange,100
$ more oth1.csv
Product,Amount
milk,350
pineapplejuice,500
wine,1000

Example 2: Remove records

Contrary to example 1, remove records matching keywords apple and orange using the -r option, the output is saved to rsl2.csv file.

$ mselstr f=Product  v=apple,orange -r i=dat1.csv o=rsl2.csv
#END# kgselstr -r f=Product i=dat1.csv o=rsl2.csv v=apple,orange
$ more rsl2.csv
Product,Amount
milk,350
pineapplejuice,500
wine,1000

Example 3: Select based on the key unit

Select all records of customer who have purchased oranges by specifying Customer at the k= parameter. Save unmatched records to oth2.csv.

$ more dat2.csv
Customer,Product,Amount
A,apple,100
A,milk,350
B,orange,100
B,orange,100
B,pineapple,500
B,wine,1000
C,apple,100
C,orange,100
$ mselstr k=Customer f=Product v=orange u=oth2.csv i=dat2.csv o=rsl3.csv
#END# kgselstr f=Product i=dat2.csv k=Customer o=rsl3.csv u=oth2.csv v=orange
$ more rsl3.csv
Customer%0,Product,Amount
B,orange,100
B,orange,100
B,pineapple,500
B,wine,1000
C,apple,100
C,orange,100
$ more oth2.csv
Customer%0,Product,Amount
A,apple,100
A,milk,350

Example 4: Partial match

Select records where the Product field contain the keyword apple, and save the output to a file named rsl4.csv. Records with partial match such as pine(apple)juice will also be saved in the output file rsl4.csv.

$ mselstr f=Product v=apple -sub i=dat1.csv o=rsl4.csv
#END# kgselstr -sub f=Product i=dat1.csv o=rsl4.csv v=apple
$ more rsl4.csv
Product,Amount
apple,100
pineapplejuice,500

Example 5: Wide character substring match

Select records where the Product field contains wide characters "柿", "桃", and "葡萄".

Matching maybe based on single byte character if the query string includes wide character, the query string maybe interpreted as multibyte character for matching. Therefore, it is necessary indicate wide character in the query string with -W option.

$ more dat3.csv
Product,Amount
fruit:柿,100
fruit:桃,250
fruit:葡萄,300
fruit:梨,450
fruit:苺,500
$ mselstr f=Product v=柿,桃,葡萄 -sub -W i=dat3.csv o=rsl5.csv
#END# kgselstr -W -sub f=Product i=dat3.csv o=rsl5.csv v=柿,桃,葡萄
$ more rsl5.csv
Product,Amount
fruit:柿,100
fruit:桃,250
fruit:葡萄,300

Example 6: Select product(s) with consecutive purchases in 2013.

Use the -F option to select transactions where the date of purchase and the previous date of purchase for the product both took place in 2013. Save the query results to an output file rsl6.csv. Save unmatched records to oth3.csv.

$ more dat4.csv
Customer,Product,Amount,Gender,Date,PreviousDate
A,apple,100,1,2013/01/04,2013/01/01
A,milk,350,1,2013/04/04,2011/05/06
B,orange,100,2,2012/11/11,2011/12/12
B,orange,100,2,2013/05/30,2012/11/11
B,pineapple,500,2,2013/04/15,2013/04/01
B,wine,1000,2,2012/12/24,2011/12/24
C,apple,100,2,2013/02/14,NULL
C,orange,100,2,2013/02/14,2013/01/31
D,orange,100,2,2011/10/28,NULL
$ mselstr f=Date,PreviousDate -F -sub v=2013 u=oth3.csv i=dat4.csv o=rsl6.csv
#END# kgselstr -F -sub f=Date,PreviousDate i=dat4.csv o=rsl6.csv u=oth3.csv v=2013
$ more rsl6.csv
Customer,Product,Amount,Gender,Date,PreviousDate
A,apple,100,1,2013/01/04,2013/01/01
B,pineapple,500,2,2013/04/15,2013/04/01
C,orange,100,2,2013/02/14,2013/01/31
$ more oth3.csv
Customer,Product,Amount,Gender,Date,PreviousDate
A,milk,350,1,2013/04/04,2011/05/06
B,orange,100,2,2012/11/11,2011/12/12
B,orange,100,2,2013/05/30,2012/11/11
B,wine,1000,2,2012/12/24,2011/12/24
C,apple,100,2,2013/02/14,NULL
D,orange,100,2,2011/10/28,NULL

Example 7: Extract all transactions of customers who have consecutive purchases in 2013

Use k= parameter to select all transactions of customers who have purchased a product with date of purchase and date of previous purchase both took place in 2013. Save unmatched records to a file oth4.csv.

$ mselstr k=Customer f=Date,PreviousDate -F -sub v=2013 u=oth4.csv i=dat4.csv o=rsl7.csv
#END# kgselstr -F -sub f=Date,PreviousDate i=dat4.csv k=Customer o=rsl7.csv u=oth4.csv v=2013
$ more rsl7.csv
Customer%0,Product,Amount,Gender,Date,PreviousDate
A,apple,100,1,2013/01/04,2013/01/01
A,milk,350,1,2013/04/04,2011/05/06
B,orange,100,2,2012/11/11,2011/12/12
B,orange,100,2,2013/05/30,2012/11/11
B,pineapple,500,2,2013/04/15,2013/04/01
B,wine,1000,2,2012/12/24,2011/12/24
C,apple,100,2,2013/02/14,NULL
C,orange,100,2,2013/02/14,2013/01/31
$ more oth4.csv
Customer%0,Product,Amount,Gender,Date,PreviousDate
D,orange,100,2,2011/10/28,NULL

Example 8: Select new customer(s) who purchased in 2013

Use the -R option to select all transactions of new customer(s) who made their first purchase in 2013, where date of previous purchase is NULL. Write the query results to an output file ¥verb|rsl8.csv|, and save unmatched records to oth5.csv.

$ mselstr k=Customer f=Date,PreviousDate -F -R -sub v=2013,NULL u=oth5.csv i=dat4.csv o=rsl8.csv
#END# kgselstr -F -R -sub f=Date,PreviousDate i=dat4.csv k=Customer o=rsl8.csv u=oth5.csv v=2013,NULL
$ more rsl8.csv
Customer%0,Product,Amount,Gender,Date,PreviousDate
C,apple,100,2,2013/02/14,NULL
C,orange,100,2,2013/02/14,2013/01/31
$ more oth5.csv
Customer%0,Product,Amount,Gender,Date,PreviousDate
A,apple,100,1,2013/01/04,2013/01/01
A,milk,350,1,2013/04/04,2011/05/06
B,orange,100,2,2012/11/11,2011/12/12
B,orange,100,2,2013/05/30,2012/11/11
B,pineapple,500,2,2013/04/15,2013/04/01
B,wine,1000,2,2012/12/24,2011/12/24
D,orange,100,2,2011/10/28,NULL

Related Commands

msel : Select records with more complex criteria.

mcommon : When selecting a large number of target strings use mcommon command.