Computes the statistics specified at c= for the numeric fields specified at f=. The aggregate key is specified at k=. NULL values in the fields specified at f= are ignored. However, if all records have NULL values in a field, NULL is output for that field.
mstats c= f= [k=] [i=] [o=] [-nfn] [-nfno] [-x] [-q] [precision=] [--help] [--version]
k=  Key field(s) on which the aggregate statistics are computed (multiple fields can be specified).
f=  Field(s) for which statistics are computed (multiple fields can be specified).
c=  Statistic to compute (select one from the list below):
    sum|mean|count|ucount|devsq|var|uvar|sd|usd|USD|cv|min|qtile1|median|qtile3|max|range|qrange|mode|skew|uskew|kurt|ukurt
Value of c=  Description                             Remarks
count        Count (excluding NULL values)           Cannot be applied to character string fields.
ucount       Unique count                            Cannot be applied to character string fields.
sum          Total
mean         Arithmetic mean
devsq        Sum of squared deviations
var          Variance
uvar         Variance (unbiased estimate)
sd           Standard deviation
usd          Standard deviation (unbiased variance)  Commonly used standard deviation.
USD          Unbiased standard deviation             Equation omitted. Accurate unbiased estimate.
cv           Coefficient of variation
mode         Mode                                    If frequencies are tied, the smaller value is output. NULL is output if values are different.
min          Minimum value
max          Maximum value
range        Range
median       Median
qtile1       First quartile
qtile3       Third quartile
qrange       Interquartile range
skew         Skewness
uskew        Skewness (unbiased estimate)            Equation omitted.
kurt         Kurtosis
ukurt        Kurtosis (unbiased estimate)            Equation omitted.
: Number of non-NULL records
: Number of duplicate values removed
: Most frequent value
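The equations for these statistics are not reproduced in this text, so the following Python sketch only illustrates how the biased and unbiased variants usually differ. It assumes the textbook definitions: var and sd divide the sum of squared deviations by the number of non-NULL records, while uvar and usd divide by that number minus one. The function name describe and the sample values are hypothetical, and the results are not guaranteed to match mstats digit for digit.

```python
import math
import statistics


def describe(values):
    """Return a few statistics for a list of at least two numbers.

    Assumption: textbook definitions (population vs. sample estimates),
    which may differ in detail from what mstats implements.
    """
    n = len(values)
    mean = statistics.fmean(values)
    devsq = sum((x - mean) ** 2 for x in values)  # sum of squared deviations
    return {
        "mean": mean,
        "devsq": devsq,
        "var": devsq / n,                   # variance, divides by n
        "uvar": devsq / (n - 1),            # unbiased estimate, divides by n - 1
        "sd": math.sqrt(devsq / n),         # square root of var
        "usd": math.sqrt(devsq / (n - 1)),  # square root of uvar
        "range": max(values) - min(values),
    }


# Hypothetical sample values, not taken from dat1.csv.
stats = describe([2, 4, 4, 4, 5, 5, 7, 9])
print(stats)
```

With these sample values the mean is 5 and the sum of squared deviations is 32, so var is 4 and sd is 2, while uvar is 32/7 and therefore slightly larger.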
Calculate the sum of the "quantity" and "amount" fields for each "customer".
$ more dat1.csv
customer,quantity,amount
A,1,10
B,5,20
B,2,10
C,1,15
C,3,10
C,1,21
$ mstats k=customer f=quantity,amount c=sum i=dat1.csv o=rsl1.csv
#END# kgstats c=sum f=quantity,amount i=dat1.csv k=customer o=rsl1.csv
$ more rsl1.csv
customer%0,quantity,amount
A,1,10
B,7,30
C,5,46
Calculate the maximum value of the "quantity" and "amount" fields for each "customer".
$ mstats k=customer f=quantity,amount c=max i=dat1.csv o=rsl2.csv
#END# kgstats c=max f=quantity,amount i=dat1.csv k=customer o=rsl2.csv
$ more rsl2.csv
customer%0,quantity,amount
A,1,10
B,5,20
C,3,21
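As a cross-check of the two examples above, the same key-wise aggregation can be sketched in Python. This is only an illustration of the grouping logic, not how mstats is implemented: the helper name aggregate is hypothetical, the data is the content of dat1.csv embedded as a string, and the sketch assumes integer values with no NULLs.

```python
import csv
import io

# Contents of dat1.csv from the example above.
DAT1 = """customer,quantity,amount
A,1,10
B,5,20
B,2,10
C,1,15
C,3,10
C,1,21
"""


def aggregate(text, key, fields, func):
    """Group CSV rows by `key` and apply `func` to each field's values.

    Assumption: every value is a non-NULL integer.
    """
    groups = {}
    for row in csv.DictReader(io.StringIO(text)):
        per_field = groups.setdefault(row[key], {f: [] for f in fields})
        for f in fields:
            per_field[f].append(int(row[f]))
    # One output row per key, ordered by key value.
    return [[k] + [func(vals[f]) for f in fields]
            for k, vals in sorted(groups.items())]


sums = aggregate(DAT1, "customer", ["quantity", "amount"], sum)
maxes = aggregate(DAT1, "customer", ["quantity", "amount"], max)
print(sums)   # rows corresponding to rsl1.csv
print(maxes)  # rows corresponding to rsl2.csv
```

Passing the statistic as a function mirrors the way c= selects one statistic that is then applied to every field listed at f=.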
msim : Computes bivariate statistics.
mavg : Command dedicated to the arithmetic mean (c=mean).
msum : Command dedicated to c=sum.
mcount : Unlike c=count, this counts the number of rows for each aggregate key.
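The difference between counting non-NULL values and counting rows can be illustrated with a short Python sketch. It assumes that NULL is represented by an empty field, and the sample rows are hypothetical; it imitates the behaviour described here rather than calling either command.

```python
from collections import Counter

# Hypothetical (customer, quantity) rows; "" stands for a NULL quantity (assumption).
rows = [("A", "1"), ("B", "5"), ("B", ""), ("C", "3")]

# Like c=count: number of non-NULL quantity values per key.
non_null = Counter(key for key, value in rows if value != "")

# Like mcount: number of rows per key, whatever the field contains.
row_count = Counter(key for key, _ in rows)

print(non_null["B"], row_count["B"])
```

For customer B the two results differ, because one of its two rows has a NULL quantity.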