3.23 mhashavg - Compute Average with Hash Function

Calculates the average of the fields specified in the f= parameter for each key specified in the k= parameter, using a hash function.

This command is faster than mavg because the key fields do not require prior sorting. However, variation in key lengths (strings of different lengths in the key field) slows down processing.
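The idea of averaging by key without sorting can be sketched as follows. This is a minimal illustration in Python, not the command's actual implementation: the function name hash_average is hypothetical, a dict stands in for the hash table, and empty strings are assumed to represent NULL values, which are skipped.

```python
import csv
import io
from collections import defaultdict

def hash_average(rows, key, fields):
    """Average each field per key in one pass, using dicts as the hash table."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for row in rows:
        key_sums = sums[row[key]]      # registers the key even if all values are NULL
        key_counts = counts[row[key]]
        for f in fields:
            if row[f] != "":           # assumption: an empty string means NULL and is skipped
                key_sums[f] += float(row[f])
                key_counts[f] += 1
    return {
        k: {f: (sums[k][f] / counts[k][f] if counts[k][f] else None) for f in fields}
        for k in sums
    }

# Same data as dat1.csv in Example 1; note that the rows are not sorted by Customer.
DATA = """Customer,Quantity,Amount
A,1,
B,,15
A,2,20
B,3,10
B,1,20
"""

rows = list(csv.DictReader(io.StringIO(DATA)))
result = hash_average(rows, "Customer", ["Quantity", "Amount"])
for customer in sorted(result):
    print(customer, result[customer])
```

Because each row is looked up by its key directly, rows for the same key may appear anywhere in the input. The averages agree with the output shown in Example 1.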

Format

mhashavg f= [hs=] [k=] [-n] [i=] [o=] [-nfn] [-nfno] [-x] [precision=] [--help] [--version]

Parameters

f=

Names of the fields to average (multiple fields can be specified).

To rename an output field, specify the new field name after a colon ":". Example: f=Quantity:AverageQuantity.
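The name:newname notation can be read as a list of (input field, output field) pairs. The sketch below is only an illustration of that reading: parse_field_spec is a hypothetical helper, not part of the command, and it assumes that fields are separated by commas (as in Example 1) and that a field without a colon keeps its original name.

```python
def parse_field_spec(spec):
    """Split a spec such as 'Quantity:AverageQuantity,Amount' into (input, output) pairs."""
    pairs = []
    for item in spec.split(","):
        source, _, renamed = item.partition(":")
        # Assumption: with no ":newname" part, the output keeps the input field name.
        pairs.append((source, renamed or source))
    return pairs

pairs = parse_field_spec("Quantity:AverageQuantity,Amount")
print(pairs)
```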

k=

Key field(s) used to group the data when calculating the average (multiple keys can be specified).

This command does not use key-break processing, so prior sorting is not required.

hs=

Hash size (default value: 199999).

Refer to mhashsum for related information.

-n

Output NULL for a field specified in f= if any of its values within the key group is NULL.
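The difference between the default behavior and -n can be sketched as below. The semantics are inferred from Example 1 and Example 2, not taken from the command's source: the function name average and the flag null_on_missing are hypothetical, None stands for NULL, and returning NULL for a group with no values at all is an assumption.

```python
def average(values, null_on_missing=False):
    """Average a list of numbers in which None stands for NULL."""
    present = [v for v in values if v is not None]
    if null_on_missing and len(present) < len(values):
        return None                    # like -n: any NULL makes the result NULL
    if not present:
        return None                    # assumption: no values at all yields NULL
    return sum(present) / len(present) # default: NULL values are ignored

quantity_b = [None, 3, 1]              # Quantity values for Customer B in dat1.csv
default_result = average(quantity_b)
strict_result = average(quantity_b, null_on_missing=True)
print(default_result, strict_result)
```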

Example

Example 1: Basic Example

Calculate the average Quantity and average Amount for each Customer.

$ more dat1.csv
Customer,Quantity,Amount
A,1,
B,,15
A,2,20
B,3,10
B,1,20
$ mhashavg k=Customer f=Quantity,Amount i=dat1.csv o=rsl1.csv
#END# kghashavg f=Quantity,Amount i=dat1.csv k=Customer o=rsl1.csv
$ more rsl1.csv
Customer,Quantity,Amount
A,1.5,20
B,2,15

Example 2: NULL value in output

With the -n option, the average of a field is output as NULL when any Quantity or Amount value in the key group is NULL. Without -n, NULL values are ignored, as in Example 1.

$ mhashavg k=Customer f=Quantity,Amount -n i=dat1.csv o=rsl2.csv
#END# kghashavg -n f=Quantity,Amount i=dat1.csv k=Customer o=rsl2.csv
$ more rsl2.csv
Customer,Quantity,Amount
A,1.5,
B,,15

Remarks

Refer to the benchmark in mhashsum for more details on processing speed.

Related commands

mavg : Compute average

mhashsum : Compute total with hash function