Converts CSV data into an ARFF file (the data format used by WEKA). The user must specify the ARFF attribute type of each field: d= for category fields, n= for numeric fields, s= for string fields, and D= for date fields. A date field includes time information when %t is appended to its field name.
mcsv2arff n=|d=|D=|s= [T=] i= [o=] [--help] [--version]
n= Numeric field name(s) (multiple items can be specified).
d= Category field name(s) (multiple items can be specified).
D= Date (time) field name(s) (multiple items can be specified). %t may be appended to a field name.
   When %t is not specified: yyyyMMdd
   When %t is specified: yyyyMMddHHmmss
s= Character string field name(s) (multiple items can be specified).
T= Title as a character string.
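To make the mapping from parameters to ARFF attributes concrete, the following is a minimal Python sketch, not the command's actual implementation. The function name csv_to_arff and its keyword arguments are hypothetical and only mirror the parameter names above. It assumes attributes are written in the order string, date, numeric, category (as in the example output in this section), that category values are listed in sorted order, and that no value needs quoting or escaping.

```python
import csv
import io


def csv_to_arff(csv_text, title, s=(), D=(), n=(), d=()):
    """Return ARFF text for csv_text (hypothetical helper, not the real tool).

    s, D, n and d are lists of field names, mirroring the s=, D=, n= and d=
    parameters. A name in D may end with %t to request date-time format.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    lines = ["@RELATION " + title]
    columns = []

    for name in s:
        lines.append("@ATTRIBUTE " + name + " string")
        columns.append(name)

    for spec in D:
        # A trailing %t selects the pattern that includes time.
        if spec.endswith("%t"):
            name, pattern = spec[:-2], "yyyyMMddHHmmss"
        else:
            name, pattern = spec, "yyyyMMdd"
        lines.append("@ATTRIBUTE " + name + " date " + pattern)
        columns.append(name)

    for name in n:
        lines.append("@ATTRIBUTE " + name + " numeric")
        columns.append(name)

    for name in d:
        # Category attributes enumerate their distinct values (sorted: assumption).
        values = sorted({row[name] for row in rows})
        lines.append("@ATTRIBUTE " + name + " {" + ",".join(values) + "}")
        columns.append(name)

    lines.append("@DATA")
    for row in rows:
        lines.append(",".join(row[c] for c in columns))
    return "\n".join(lines) + "\n"


sample = (
    "customer,product,date,quantity,amount\n"
    "No.1,A,20081201,1,10\n"
    "No.4,B,20081201,4,40\n"
)
print(csv_to_arff(sample, "Customer_Purchase_Data",
                  s=["customer"], D=["date"],
                  n=["quantity", "amount"], d=["product"]))
```

The sketch skips details a real converter has to handle, such as missing values and values that contain commas or spaces.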
Convert data to ARFF format, defining the "customer" field as string type, the "product" field as category type, the "date" field as date type (without time), and the "quantity" and "amount" fields as numeric attributes.
$ more dat1.csv
customer,product,date,quantity,amount
No.1,A,20081201,1,10
No.2,A,20081202,2,20
No.3,A,20081203,3,30
No.4,B,20081201,4,40
No.5,B,20081203,5,50
$ mcsv2arff s=customer d=product D=date n=quantity,amount T=Customer_Purchase_Data i=dat1.csv o=rsl1.csv
#END# kgcsv2arff D=date T=Customer_Purchase_Data d=product i=dat1.csv n=quantity,amount o=rsl1.csv s=customer
$ more rsl1.csv
@RELATION Customer_Purchase_Data
@ATTRIBUTE customer string
@ATTRIBUTE date date yyyyMMdd
@ATTRIBUTE quantity numeric
@ATTRIBUTE amount numeric
@ATTRIBUTE product {A,B}
@DATA
No.1,20081201,1,10,A
No.2,20081202,2,20,A
No.3,20081203,3,30,A
No.4,20081201,4,40,B
No.5,20081203,5,50,B
To include time information in a date field, append %t to the field name, as in D=date%t.
$ more dat2.csv
customer,product,date,quantity,amount
No.1,A,20081201102030,1,10
No.2,A,20081202123010,2,20
No.3,A,20081203153010,3,30
No.4,B,20081201174010,4,40
No.5,B,20081203133010,5,50
$ mcsv2arff s=customer d=product D=date%t n=quantity,amount T=Customer_Purchase_Data i=dat2.csv o=rsl2.csv
#END# kgcsv2arff D=date%t T=Customer_Purchase_Data d=product i=dat2.csv n=quantity,amount o=rsl2.csv s=customer
$ more rsl2.csv
@RELATION Customer_Purchase_Data
@ATTRIBUTE customer string
@ATTRIBUTE date date yyyyMMddHHmmss
@ATTRIBUTE quantity numeric
@ATTRIBUTE amount numeric
@ATTRIBUTE product {A,B}
@DATA
No.1,20081201102030,1,10,A
No.2,20081202123010,2,20,A
No.3,20081203153010,3,30,A
No.4,20081201174010,4,40,B
No.5,20081203133010,5,50,B
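Before conversion it can help to confirm that every date value matches the declared pattern. The following is a small Python sketch of such a check. The helper name parse_date_field is hypothetical, and it assumes that yyyyMMdd and yyyyMMddHHmmss correspond to the strptime formats %Y%m%d and %Y%m%d%H%M%S. Whether the command itself validates date values is not stated here.

```python
from datetime import datetime


def parse_date_field(value, with_time=False):
    """Parse a date value; with_time=True corresponds to a %t field.

    Hypothetical helper: raises ValueError when the value does not match
    the fixed-width pattern.
    """
    fmt, width = ("%Y%m%d%H%M%S", 14) if with_time else ("%Y%m%d", 8)
    # strptime tolerates some one-digit fields, so enforce the width first.
    if len(value) != width or not value.isdigit():
        raise ValueError("expected %d digits, got %r" % (width, value))
    return datetime.strptime(value, fmt)


print(parse_date_field("20081203"))
print(parse_date_field("20081201102030", with_time=True))
```

A value such as 20081201102030 passed without with_time=True is rejected, which mirrors declaring D=date for a column that actually carries time information.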
marff2csv : Reverse conversion.