MCMD reads the field names in the first row of CSV data, the field can also be specified with a field number without the field names. There are four options for handling the row of field names includes -nfn,-nfno,-nfni and -x. Its usage will be illustrated with examples as follows. In addition, note that the field number starts at 0 from the left such as 0, 1, 2.
When -nfn (no field name) is specified, the first row in the data will not be considered as field names. Thus, each field is specified as a number (note that the number starts from 0).
$ more dat2.csv a,2 b,5 b,4 $ msum -nfn k=0 f=1 i=dat2.csv o=rsl1.csv #END# kgsum -nfn f=1 i=dat2.csv k=0 o=rsl1.csv $ more rsl1.csv a,2 b,9
When -nfno (no field name for output) is specified, the first row of input data is initialized as field names, but the field names is removed from the output data.
$ more dat1.csv key,val a,2 b,5 b,4 $ msum k=key f=val -nfno i=dat1.csv o=rsl2.csv #END# kgsum -nfno f=val i=dat1.csv k=key o=rsl2.csv $ more rsl2.csv a,2 b,9
The option -nfni (no field name for input) is only available in the mcut command. This option does the opposite of -nfno, where the first row of input data is treated as data items, but the field names will be shown in the output data.
$ mcut f=0:key,1:val -nfni i=dat2.csv o=rsl3.csv #END# kgcut -nfni f=0:key,1:val i=dat2.csv o=rsl3.csv $ more rsl3.csv key,val a,2 b,5 b,4
For CSV data with a field names, use the -x option to specify the field number.
$ msum -x k=0 f=1 i=dat1.csv o=rsl4.csv #END# kgsum -x f=1 i=dat1.csv k=0 o=rsl4.csv $ more rsl4.csv key,val a,2 b,9
Field name can contain the characters stated as follows:
Multibyte characters (such as UTF-8)
Alphabet (a-z, A-Z)
Number (0-9)
Symbol
However, it is recommended to avoid using the following symbols. Using the symbols will not return an error, however, the symbol maybe be used as special characters in MCMD and the special character function may not be available if it is used in as a field name.
, Comma
: Colon
% Percent
* Asterisk
? Question mark
& And
\ Backslash
] Square brackets, right
[ Square brackets, left
When specifying field number, the field numbers can be listed with a comma delimiter. Alternatively, it is also possible to specify the number at the end of the field name (add "L") or specify the range (-).
For example, the argument 0L specifies the last field, and 2L specifies the 2nd field from the end (note that field number starts from 0). Furthermore, when 0-5 is specified, six fields starting from 0 to 5 are selected, which is equivalent to 0,1,2,3,4,5
By specifying "0-4", fields 0,1,2,3,4 are specified.
$ more dat1.csv brand,quantity01,quantity02,quantity03,quantity04,quantity05,quantity06,quantity07,quantity08,quantity09,quantity10 A,10,50,90,130,170,210,250,290,330,370 B,20,60,100,140,180,220,260,300,340,380 C,30,70,110,150,190,230,270,310,350,390 D,40,80,120,160,200,240,280,320,360,400 $ mcut -x f=0-4 i=dat1.csv o=rsl1.csv #END# kgcut -x f=0-4 i=dat1.csv o=rsl1.csv $ more rsl1.csv brand,quantity01,quantity02,quantity03,quantity04 A,10,50,90,130 B,20,60,100,140 C,30,70,110,150 D,40,80,120,160
By specifying “4-0”, fields 0,1,2,3,4 are specified.
$ mcut -x f=4-0 i=dat1.csv o=rsl2.csv #END# kgcut -x f=4-0 i=dat1.csv o=rsl2.csv $ more rsl2.csv quantity04,quantity03,quantity02,quantity01,brand 130,90,50,10,A 140,100,60,20,B 150,110,70,30,C 160,120,80,40,D
By specifying “1-0,2-4”, fields “1,0,2,3,4” are specified.EOF scp=<<’EOF’ mcut -x f=1-0,2-4 i=dat1.csv o=rsl3.csv more rsl3.csv
$ mcut -x f=4-0 i=dat1.csv o=rsl2.csv #END# kgcut -x f=4-0 i=dat1.csv o=rsl2.csv $ more rsl2.csv quantity04,quantity03,quantity02,quantity01,brand 130,90,50,10,A 140,100,60,20,B 150,110,70,30,C 160,120,80,40,D
By specifying "2L", the second field from the end is specified (quantity 08).
$ mcut -x f=2L i=dat1.csv o=rsl4.csv #END# kgcut -x f=2L i=dat1.csv o=rsl4.csv $ more rsl4.csv quantity08 290 300 310 320
By specifying "5-3L", the 5th to the 3rd item from end is specified, i.e. "5,6,7".
$ mcut -x f=5-3L i=dat1.csv o=rsl5.csv #END# kgcut -x f=5-3L i=dat1.csv o=rsl5.csv $ more rsl5.csv quantity05,quantity06,quantity07 170,210,250 180,220,260 190,230,270 200,240,280
The f= parameter is used to specify the field(s) in many commands. The format of f= is defined as “input field:output field". If output field name is not specified, the same input field name will be used as output field name. In addition, it is also possible to specify combinations of field names such as f=0:quantity.
By specifying "quantity:unit sales", the field name is converted from “quantity” to “unit sales” in the output.
$ more dat1.csv brand,quantity A,10 B,20 C,30 D,40 $ mcut f=brand,quantity:salesquantity i=dat1.csv o=rsl1.csv #END# kgcut f=brand,quantity:salesquantity i=dat1.csv o=rsl1.csv $ more rsl1.csv brand,salesquantity A,10 B,20 C,30 D,40
The maccum command accumulates the values in the "quantity" field, and add the field name "cumulative quantity" in the output results. If the parameter is specified as "f=quantity", the field name of the cumulative result will remain as "quantity", thus results in error because the same field name “quantity” exists in the output.
$ maccum f=quantity:accumulationquantity i=dat1.csv o=rsl2.csv #END# kgaccum f=quantity:accumulationquantity i=dat1.csv o=rsl2.csv $ more rsl2.csv brand,quantity,accumulationquantity A,10,10 B,20,30 C,30,60 D,40,100 $ maccum f=quantity i=dat1.csv o=rsl2.csv #ERROR# same field name is specified: quantity (kgaccum)
The field name and field number can be specified at the same time.
$ mcut f=0,1:salesquantity -x i=dat1.csv o=rsl3.csv #END# kgcut -x f=0,1:salesquantity i=dat1.csv o=rsl3.csv $ more rsl3.csv brand,salesquantity A,10 B,20 C,30 D,40
The wildcard characters "*" and "?" can be used to specify multiple field names. The asterisk sign "*" matches 0 or more characters, and the question mark "?" matches a single character. Note that the order of evaluation of wildcard characters follows the order of the fields in the input data. For example, if the order of the fields in input data is A5,A3,A4,A2,A1, the parameter f=A* is evaluated as f=A5,A3,A4,A2,A1.
The expression "quantity*" matches field names starting with quantity ("quantity10", "quantity11", "quantity12" and "quantity123").
$ more dat1.csv brand,quantity10,quantity11,quantity12,quantity123 A,10,15,9,1 B,20,16,8,2 C,30,17,7,3 D,40,18,6,4 $ mcut f= quantity* i=dat1.csv o=rsl1.csv #ERROR# invalid argument: quantity* (kgcut) $ more rsl1.csv rsl1.csv: No such file or directory
Select field names which begin with "quantity" followed by 1, and match any single character after 1. In this case, the wildcard does not match with field name “quantity123”.
$ mcut f= quantity 1? i=dat1.csv o=rsl2.csv #ERROR# invalid argument: quantity (kgcut) $ more rsl2.csv rsl2.csv: No such file or directory
The special character "&" specified in the output field name can be replaced with the current field name. For example, the parameter f=abc:xx&xx returns xxabcxx as the output field name. The "&" character can be specified at any position as many times as required in the output field name. However, the ampersand is a special character in shell which is interpreted as "background execution". Thus, it is necessary to escape and enclose the field name in double quotes when including "&" in field name.
In this example, "&" is replaced with “brand” in the input field name, which is equivalent to the expression "f=brand:brand code".
$ more dat1.csv brand,quantity10,quantity11,quantity12,quantity123 A,10,15,9,1 B,20,16,8,2 C,30,17,7,3 D,40,18,6,4 $ mcut f="brand:& code" i=dat1.csv o=rsl1.csv #END# kgcut f=brand:& code i=dat1.csv o=rsl1.csv $ more rsl1.csv brand code A B C D
Attach “&” after sales& to replace the character with input field name (e.g. "quantity10") in the output field name. For all input fields name beginning with “quantity”, attach “sales” as the prefix in the output field name.
$ mcut f="brand,quantity*:sales&" i=dat1.csv o=rsl2.csv #END# kgcut f=brand,quantity*:sales& i=dat1.csv o=rsl2.csv $ more rsl2.csv brand,salesquantity10,salesquantity11,salesquantity12,salesquantity123 A,10,15,9,1 B,20,16,8,2 C,30,17,7,3 D,40,18,6,4