Calculates the statistics specified in the c= parameter for the fields specified in the f= parameter.
msummary c= f= [a=] [k=] [i=] [o=] [-nfn] [-nfno] [-x] [-q] [precision=] [--help] [--version]
k=   Key field(s) by which the statistics are computed (multiple fields can be specified).
f=   Field(s) for which the statistical summary is computed (multiple fields can be specified).
     When the -x or -nfn option is specified, specify fields by field number (numbered from 0).
c=   Statistics to compute (multiple statistics can be specified).
     Specify the list of statistics delimited by commas.
     Statistics list: sum/mean/count/ucount/devsq/var/uvar/sd/usd/cv/min/qtile1/median/qtile3/max/range/qrange/mode/skew/uskew/kurt/ukurt
a=   New column name. This column holds the name of the f= field that each result row was calculated from (default is fld).

The statistics that can be specified in the c= parameter are listed in Table 3.33.

Value of c=   Description                                              Remarks
count         Count (excluding NULL values)                            Cannot be applied to character string fields.
ucount        Unique count                                             Cannot be applied to character string fields.
sum           Total
mean          Arithmetic mean
devsq         Sum of squared deviations
var           Variance
uvar          Variance (unbiased estimate)
sd            Standard deviation
usd           Standard deviation (square root of unbiased variance)    Commonly used standard deviation.
cv            Coefficient of variation
mode          Mode                                                     If several values have the same frequency, the smallest value is output.
min           Minimum value
max           Maximum value
range         Range
median        Median
qtile1        First quartile
qtile3        Third quartile
qrange        Interquartile range
skew          Skewness
uskew         Skewness (unbiased estimate)                             Equation omitted.
kurt          Kurtosis
ukurt         Kurtosis (unbiased estimate)                             Equation omitted.
: Number of non-NULL records
: Number of duplicate values removed
: Most frequent value
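The distinction between the biased and unbiased statistics, and the tie-breaking rule for mode, can be illustrated with a short Python sketch. This is not how msummary is implemented; the function name `summarize` is hypothetical, `None` stands in for a NULL value, and the divisors (n for var, n - 1 for uvar) are the usual textbook definitions assumed here because the equations are not reproduced on this page. It assumes at least one non-NULL value.

```python
import math

def summarize(values):
    """Compute a few of the listed statistics for one field (illustrative only)."""
    xs = [v for v in values if v is not None]      # NULL values are skipped
    n = len(xs)
    mean = sum(xs) / n
    devsq = sum((x - mean) ** 2 for x in xs)       # sum of squared deviations
    var = devsq / n                                # assumed: divide by n
    uvar = devsq / (n - 1) if n > 1 else None      # assumed: divide by n - 1

    # Mode: on a frequency tie, take the smaller value, as the table states.
    freq = {}
    for x in xs:
        freq[x] = freq.get(x, 0) + 1
    top = max(freq.values())
    mode = min(x for x, c in freq.items() if c == top)

    return {
        "count": n,
        "ucount": len(set(xs)),
        "sum": sum(xs),
        "mean": mean,
        "devsq": devsq,
        "var": var,
        "uvar": uvar,
        "sd": math.sqrt(var),
        "usd": math.sqrt(uvar) if uvar is not None else None,
        "min": min(xs),
        "max": max(xs),
        "range": max(xs) - min(xs),
        "mode": mode,
    }

if __name__ == "__main__":
    for name, value in summarize([1, 3, 1, None]).items():
        print(name, value)
```

Quartiles, cv, skewness and kurtosis are left out of the sketch on purpose, since their exact definitions in msummary cannot be recovered from this page.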
Compute the median and mean of "quantity" and "amount" for each customer. Store the name of the source field in a new column named "type".

$ more dat1.csv
customer,quantity,amount
A,1,10
A,2,20
B,1,15
B,3,10
B,1,20
$ msummary k=customer f=quantity,amount c=median:medianval,mean:meanval a=type i=dat1.csv o=rsl1.csv
#END# kgsummary a=type c=median:medianval,mean:meanval f=quantity,amount i=dat1.csv k=customer o=rsl1.csv
$ more rsl1.csv
customer%0,type,medianval,meanval
A,quantity,1.5,1.5
A,amount,15,15
B,quantity,1,1.666666667
B,amount,15,15
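To make the shape of the result easier to follow, the same grouping can be sketched in plain Python with the standard library. This is only an illustration of the logic (one output row per key value and per f= field), not the msummary implementation; the function name `summarize_by_key` and the inline `DAT1` string are hypothetical stand-ins for the command and for dat1.csv.

```python
import csv
import io
import statistics
from collections import defaultdict

# Same records as dat1.csv in the example above.
DAT1 = """customer,quantity,amount
A,1,10
A,2,20
B,1,15
B,3,10
B,1,20
"""

def summarize_by_key(text, key, fields):
    """Return (key, field name, median, mean) rows, one per key and field."""
    groups = defaultdict(lambda: defaultdict(list))
    for row in csv.DictReader(io.StringIO(text)):
        for field in fields:
            groups[row[key]][field].append(float(row[field]))

    rows = []
    for k in sorted(groups):                 # keys in sorted order
        for field in fields:                 # fields in the order given
            vals = groups[k][field]
            rows.append((k, field, statistics.median(vals), statistics.mean(vals)))
    return rows

if __name__ == "__main__":
    for row in summarize_by_key(DAT1, "customer", ["quantity", "amount"]):
        print(row)
```

The second element of each row plays the role of the "type" column created by a=type, and the last two elements correspond to medianval and meanval.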
mstats : Computes a single type of statistic.