3.15 mcsv2arff - Convert csv to arff Format

Converts csv formatted data into an arff file (the data format used by WEKA). The user must specify the attribute type of each field: d= declares category fields, n= numeric fields, s= string fields, and D= date fields. A date field also carries time information when %t is appended to its field name.

Format

mcsv2arff n=|d=|D=|s= [T=] i= [o=] [--help] [--version]

Parameters

n=

Numeric field name(s) (multiple items can be specified).

d=

Category field name(s) (multiple items can be specified).

D=

Date (time) field name(s) (multiple items can be specified). Append %t to a field name to include time information.

When %t is not specified: yyyyMMdd

When %t is specified: yyyyMMddHHmmss

s=

Character string field names (multiple items can be specified).

T=

Title string. In the examples below it is written as the @RELATION name of the output.
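The two date layouts listed for D= can be checked before conversion. The following is a minimal Python sketch, not part of mcsv2arff: the function name parse_date_field and the strptime patterns are assumptions chosen to match yyyyMMdd and yyyyMMddHHmmss.

```python
from datetime import datetime

# Assumed Python equivalents of the two layouts listed for D=.
DATE_PATTERN = "%Y%m%d"            # yyyyMMdd (no %t)
DATETIME_PATTERN = "%Y%m%d%H%M%S"  # yyyyMMddHHmmss (with %t)


def parse_date_field(value, with_time=False):
    """Parse one D= field value; raise ValueError if it does not match."""
    # Check length and digits first, because strptime tolerates some
    # shorter inputs (for example single-digit months).
    expected_len = 14 if with_time else 8
    if len(value) != expected_len or not value.isdigit():
        raise ValueError(f"expected {expected_len} digits, got {value!r}")
    pattern = DATETIME_PATTERN if with_time else DATE_PATTERN
    return datetime.strptime(value, pattern)


print(parse_date_field("20081201"))
print(parse_date_field("20081201102030", with_time=True))
```

A value that fits neither layout raises ValueError, which makes it easy to spot rows that would not match the declared date format.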

Examples

Example 1: Convert csv format data to arff format

Convert data to arff format, defining the "customer" field as string type, the "product" field as category type, the "date" field as date type (without time), and the "quantity" and "amount" fields as numeric type.

$ more dat1.csv
customer,product,date,quantity,amount
No.1,A,20081201,1,10
No.2,A,20081202,2,20
No.3,A,20081203,3,30
No.4,B,20081201,4,40
No.5,B,20081203,5,50
$ mcsv2arff s=customer d=product D=date n=quantity,amount T=Customer_Purchase_Data i=dat1.csv  o=rsl1.csv
#END# kgcsv2arff D=date T=Customer_Purchase_Data d=product i=dat1.csv n=quantity,amount o=rsl1.csv s=customer
$ more rsl1.csv
@RELATION	Customer_Purchase_Data

@ATTRIBUTE	customer	string
@ATTRIBUTE	date	date yyyyMMdd
@ATTRIBUTE	quantity	numeric
@ATTRIBUTE	amount	numeric
@ATTRIBUTE	product	{A,B}

@DATA
No.1,20081201,1,10,A
No.2,20081202,2,20,A
No.3,20081203,3,30,A
No.4,20081201,4,40,B
No.5,20081203,5,50,B
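
To make the mapping from parameters to arff lines concrete, the conversion above can be imitated in a few lines of Python. This is only an illustrative sketch, not the implementation of mcsv2arff. The function csv_to_arff is hypothetical, and two points are assumptions read off the example output: attributes are written in the order s, D, n, d, and category values are listed in sorted order.

```python
import csv
import io


def csv_to_arff(csv_text, title, s=(), d=(), D=(), n=()):
    """Build arff text that mirrors the layout shown in Example 1."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    lines = [f"@RELATION\t{title}", ""]
    for name in s:
        lines.append(f"@ATTRIBUTE\t{name}\tstring")
    date_names = []
    for spec in D:
        # "date%t" requests the layout with time, "date" the one without.
        name, _, flag = spec.partition("%")
        pattern = "yyyyMMddHHmmss" if flag == "t" else "yyyyMMdd"
        lines.append(f"@ATTRIBUTE\t{name}\tdate {pattern}")
        date_names.append(name)
    for name in n:
        lines.append(f"@ATTRIBUTE\t{name}\tnumeric")
    for name in d:
        # Assumption: distinct category values are listed in sorted order.
        values = sorted({row[name] for row in rows})
        lines.append(f"@ATTRIBUTE\t{name}\t" + "{" + ",".join(values) + "}")
    # Assumption: data columns follow the same order as the attributes.
    order = list(s) + date_names + list(n) + list(d)
    lines += ["", "@DATA"]
    for row in rows:
        lines.append(",".join(row[name] for name in order))
    return "\n".join(lines) + "\n"


DAT1 = """customer,product,date,quantity,amount
No.1,A,20081201,1,10
No.2,A,20081202,2,20
No.3,A,20081203,3,30
No.4,B,20081201,4,40
No.5,B,20081203,5,50
"""

print(csv_to_arff(DAT1, "Customer_Purchase_Data",
                  s=["customer"], d=["product"], D=["date"],
                  n=["quantity", "amount"]))
```

The sketch separates keyword, name, and type with tabs, as in the listing above. Whether the real command orders attributes by parameter type or by another rule cannot be determined from this example alone.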

Example 2: Convert csv format data to arff format (include time in the date attribute)

Include time information in the date attribute by appending %t to the field name, as in D=date%t.

$ more dat2.csv
customer,product,date,quantity,amount
No.1,A,20081201102030,1,10
No.2,A,20081202123010,2,20
No.3,A,20081203153010,3,30
No.4,B,20081201174010,4,40
No.5,B,20081203133010,5,50
$ mcsv2arff s=customer d=product D=date%t n=quantity,amount T=Customer_Purchase_Data i=dat2.csv  o=rsl2.csv
#END# kgcsv2arff D=date%t T=Customer_Purchase_Data d=product i=dat2.csv n=quantity,amount o=rsl2.csv s=customer
$ more rsl2.csv
@RELATION	Customer_Purchase_Data

@ATTRIBUTE	customer	string
@ATTRIBUTE	date	date yyyyMMddHHmmss
@ATTRIBUTE	quantity	numeric
@ATTRIBUTE	amount	numeric
@ATTRIBUTE	product	{A,B}

@DATA
No.1,20081201102030,1,10,A
No.2,20081202123010,2,20,A
No.3,20081203153010,3,30,A
No.4,20081201174010,4,40,B
No.5,20081203133010,5,50,B
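
One way to confirm that %t took effect is to inspect the @ATTRIBUTE lines of the result. The Python sketch below is a hypothetical helper, not a full arff parser: it assumes attribute names contain no whitespace or quoting, as in the examples here.

```python
def read_arff_attributes(arff_text):
    """Return {attribute name: declared type} from the @ATTRIBUTE lines."""
    attributes = {}
    for line in arff_text.splitlines():
        keyword = line.upper()
        if keyword.startswith("@DATA"):
            break  # the header ends where the data section starts
        if keyword.startswith("@ATTRIBUTE"):
            # Split into keyword, name, and the rest of the line;
            # the type itself may contain a space ("date yyyyMMdd").
            _, name, declared_type = line.split(None, 2)
            attributes[name] = declared_type
    return attributes


HEADER = (
    "@RELATION\tCustomer_Purchase_Data\n"
    "\n"
    "@ATTRIBUTE\tcustomer\tstring\n"
    "@ATTRIBUTE\tdate\tdate yyyyMMddHHmmss\n"
    "@ATTRIBUTE\tquantity\tnumeric\n"
    "@ATTRIBUTE\tamount\tnumeric\n"
    "@ATTRIBUTE\tproduct\t{A,B}\n"
    "\n"
    "@DATA\n"
)

for name, declared_type in read_arff_attributes(HEADER).items():
    print(name, "->", declared_type)
```

For a file on disk, the same function can be applied to the text read from rsl2.csv.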

Related Command

marff2csv : Reverse conversion.

Reference

http://weka.wikispaces.com/ARFF