2.6 Specify Fields

MCMD reads the field names in the first row of CSV data, the field can also be specified with a field number without the field names. There are four options for handling the row of field names includes -nfn,-nfno,-nfni and -x. Its usage will be illustrated with examples as follows. In addition, note that the field number starts at 0 from the left such as 0, 1, 2.

Example 1: Specify -nfn

When -nfn (no field name) is specified, the first row in the data will not be considered as field names. Thus, each field is specified as a number (note that the number starts from 0).

$ more dat2.csv
a,2
b,5
b,4
$ msum -nfn k=0 f=1 i=dat2.csv o=rsl1.csv
#END# kgsum -nfn f=1 i=dat2.csv k=0 o=rsl1.csv
$ more rsl1.csv
a,2
b,9

Example 2: Specify -nfno

When -nfno (no field name for output) is specified, the first row of input data is initialized as field names, but the field names is removed from the output data.

$ more dat1.csv
key,val
a,2
b,5
b,4
$ msum k=key f=val -nfno i=dat1.csv o=rsl2.csv
#END# kgsum -nfno f=val i=dat1.csv k=key o=rsl2.csv
$ more rsl2.csv
a,2
b,9

Example 3: Specify -nfni

The option -nfni (no field name for input) is only available in the mcut command. This option does the opposite of -nfno, where the first row of input data is treated as data items, but the field names will be shown in the output data.

$ mcut f=0:key,1:val -nfni i=dat2.csv o=rsl3.csv
#END# kgcut -nfni f=0:key,1:val i=dat2.csv o=rsl3.csv
$ more rsl3.csv
key,val
a,2
b,5
b,4

Example 4: Specify -x

For CSV data with a field names, use the -x option to specify the field number.

$ msum -x k=0 f=1 i=dat1.csv o=rsl4.csv
#END# kgsum -x f=1 i=dat1.csv k=0 o=rsl4.csv
$ more rsl4.csv
key,val
a,2
b,9

2.6.1 Valid field names

Field name can contain the characters stated as follows:

However, it is recommended to avoid using the following symbols. Using the symbols will not return an error, however, the symbol maybe be used as special characters in MCMD and the special character function may not be available if it is used in as a field name.

2.6.2 Valid item number

When specifying field number, the field numbers can be listed with a comma delimiter. Alternatively, it is also possible to specify the number at the end of the field name (add "L") or specify the range (-).

For example, the argument 0L specifies the last field, and 2L specifies the 2nd field from the end (note that field number starts from 0). Furthermore, when 0-5 is specified, six fields starting from 0 to 5 are selected, which is equivalent to 0,1,2,3,4,5

Example 1: Specify range

By specifying "0-4", fields 0,1,2,3,4 are specified.

$ more dat1.csv
brand,quantity01,quantity02,quantity03,quantity04,quantity05,quantity06,quantity07,quantity08,quantity09,quantity10
A,10,50,90,130,170,210,250,290,330,370
B,20,60,100,140,180,220,260,300,340,380
C,30,70,110,150,190,230,270,310,350,390
D,40,80,120,160,200,240,280,320,360,400
$ mcut -x f=0-4 i=dat1.csv o=rsl1.csv
#END# kgcut -x f=0-4 i=dat1.csv o=rsl1.csv
$ more rsl1.csv
brand,quantity01,quantity02,quantity03,quantity04
A,10,50,90,130
B,20,60,100,140
C,30,70,110,150
D,40,80,120,160

Example 2: Specify range in reverse order

By specifying “4-0”, fields 0,1,2,3,4 are specified.

$ mcut -x f=4-0 i=dat1.csv o=rsl2.csv
#END# kgcut -x f=4-0 i=dat1.csv o=rsl2.csv
$ more rsl2.csv
quantity04,quantity03,quantity02,quantity01,brand
130,90,50,10,A
140,100,60,20,B
150,110,70,30,C
160,120,80,40,D

Example 3: Specify Multiple ranges

By specifying “1-0,2-4”, fields “1,0,2,3,4” are specified.EOF scp=<<’EOF’ mcut -x f=1-0,2-4 i=dat1.csv o=rsl3.csv more rsl3.csv

$ mcut -x f=4-0 i=dat1.csv o=rsl2.csv
#END# kgcut -x f=4-0 i=dat1.csv o=rsl2.csv
$ more rsl2.csv
quantity04,quantity03,quantity02,quantity01,brand
130,90,50,10,A
140,100,60,20,B
150,110,70,30,C
160,120,80,40,D

Example 4: Specified field from the end

By specifying "2L", the second field from the end is specified (quantity 08).

$ mcut -x f=2L i=dat1.csv o=rsl4.csv
#END# kgcut -x f=2L i=dat1.csv o=rsl4.csv
$ more rsl4.csv
quantity08
290
300
310
320

Example 5: Specify the range of fields from the end

By specifying "5-3L", the 5th to the 3rd item from end is specified, i.e. "5,6,7".

$ mcut -x f=5-3L i=dat1.csv o=rsl5.csv
#END# kgcut -x f=5-3L i=dat1.csv o=rsl5.csv
$ more rsl5.csv
quantity05,quantity06,quantity07
170,210,250
180,220,260
190,230,270
200,240,280

2.6.3 1.6.3 Input and output fields

The f= parameter is used to specify the field(s) in many commands. The format of f= is defined as “input field:output field". If output field name is not specified, the same input field name will be used as output field name. In addition, it is also possible to specify combinations of field names such as f=0:quantity.

Example 1: Basic Example

By specifying "quantity:unit sales", the field name is converted from “quantity” to “unit sales” in the output.

$ more dat1.csv
brand,quantity
A,10
B,20
C,30
D,40
$ mcut f=brand,quantity:salesquantity i=dat1.csv o=rsl1.csv
#END# kgcut f=brand,quantity:salesquantity i=dat1.csv o=rsl1.csv
$ more rsl1.csv
brand,salesquantity
A,10
B,20
C,30
D,40

Example 2: Add field name

The maccum command accumulates the values in the "quantity" field, and add the field name "cumulative quantity" in the output results. If the parameter is specified as "f=quantity", the field name of the cumulative result will remain as "quantity", thus results in error because the same field name “quantity” exists in the output.

$ maccum f=quantity:accumulationquantity i=dat1.csv o=rsl2.csv
#END# kgaccum f=quantity:accumulationquantity i=dat1.csv o=rsl2.csv
$ more rsl2.csv
brand,quantity,accumulationquantity
A,10,10
B,20,30
C,30,60
D,40,100
$ maccum f=quantity i=dat1.csv o=rsl2.csv
#ERROR# same field name is specified: quantity (kgaccum)

Example 3: Mixing field name and field number

The field name and field number can be specified at the same time.

$ mcut f=0,1:salesquantity -x i=dat1.csv o=rsl3.csv
#END# kgcut -x f=0,1:salesquantity i=dat1.csv o=rsl3.csv
$ more rsl3.csv
brand,salesquantity
A,10
B,20
C,30
D,40

2.6.4 Wildcard

The wildcard characters "*" and "?" can be used to specify multiple field names. The asterisk sign "*" matches 0 or more characters, and the question mark "?" matches a single character. Note that the order of evaluation of wildcard characters follows the order of the fields in the input data. For example, if the order of the fields in input data is A5,A3,A4,A2,A1, the parameter f=A* is evaluated as f=A5,A3,A4,A2,A1.

Example 1: Basic Example

The expression "quantity*" matches field names starting with quantity ("quantity10", "quantity11", "quantity12" and "quantity123").

$ more dat1.csv
brand,quantity10,quantity11,quantity12,quantity123
A,10,15,9,1
B,20,16,8,2
C,30,17,7,3
D,40,18,6,4
$ mcut f= quantity* i=dat1.csv o=rsl1.csv
#ERROR# invalid argument: quantity* (kgcut)
$ more rsl1.csv
rsl1.csv: No such file or directory

Example 2: Wildcard character “?”

Select field names which begin with "quantity" followed by 1, and match any single character after 1. In this case, the wildcard does not match with field name “quantity123”.

$ mcut f= quantity 1? i=dat1.csv o=rsl2.csv
#ERROR# invalid argument: quantity (kgcut)
$ more rsl2.csv
rsl2.csv: No such file or directory

2.6.5 Replace the name of an output field

The special character "&" specified in the output field name can be replaced with the current field name. For example, the parameter f=abc:xx&xx returns xxabcxx as the output field name. The "&" character can be specified at any position as many times as required in the output field name. However, the ampersand is a special character in shell which is interpreted as "background execution". Thus, it is necessary to escape and enclose the field name in double quotes when including "&" in field name.

Example 1: Basic Example

In this example, "&" is replaced with “brand” in the input field name, which is equivalent to the expression "f=brand:brand code".

$ more dat1.csv
brand,quantity10,quantity11,quantity12,quantity123
A,10,15,9,1
B,20,16,8,2
C,30,17,7,3
D,40,18,6,4
$ mcut f="brand:& code" i=dat1.csv o=rsl1.csv
#END# kgcut f=brand:& code i=dat1.csv o=rsl1.csv
$ more rsl1.csv
brand code
A
B
C
D

Example 2: Combine with wildcard

Attach “&” after sales& to replace the character with input field name (e.g. "quantity10") in the output field name. For all input fields name beginning with “quantity”, attach “sales” as the prefix in the output field name.

$ mcut f="brand,quantity*:sales&" i=dat1.csv o=rsl2.csv
#END# kgcut f=brand,quantity*:sales& i=dat1.csv o=rsl2.csv
$ more rsl2.csv
brand,salesquantity10,salesquantity11,salesquantity12,salesquantity123
A,10,15,9,1
B,20,16,8,2
C,30,17,7,3
D,40,18,6,4