# Statistical Aggregation functions¶

## Avg¶

Avg() finds the average value of the aggregated data in the expression over a number of records as defined by a group by clause.

`Avg( [DISTINCT] expr )`

Return data type: numeric

Argument Description
expr The expression or field containing the data to be measured.
DISTINCT If the word distinct occurs before the expression, all duplicates will be disregarded.

Add the example script to your app and run it. Then add, at least, the fields listed in the results column to a sheet in your app to see the result.

Example Result
```Temp: crosstable (Month, Sales) load * inline [ Customer|Jan|Feb|Mar||Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec Astrida|46|60|70|13|78|20|45|65|78|12|78|22 Betacab|65|56|22|79|12|56|45|24|32|78|55|15 Canutility|77|68|34|91|24|68|57|36|44|90|67|27 Divadip|36|44|90|67|27|57|68|47|90|80|94 ] (delimiter is '|'); ```

```Avg1: LOAD Customer, Avg(Sales) as MyAverageSalesByCustomer Resident Temp Group By Customer; ```

Customer MyAverageSalesByCustomer
Astrida 48.916667
Betacab 44.916667
Canutility 56.916667
This can be checked in the sheet by creating a table including the measure:
Sum(Sales)/12
Given that the Temp table is loaded as in the previous example:
`LOAD Customer,Avg(DISTINCT Sales) as MyAvgSalesDistinct Resident Temp Group By Customer;`

Customer MyAverageSalesByCustomer
Astrida 43.1
Betacab 43.909091
Canutility 55.909091
Only the distinct values are counted. Divide the total by the number of non-duplicate values.

## Correl¶

Correl() returns the aggregated correlation coefficient for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

`Correl( value1, value2 )`

Return data type: numeric

Argument Description
value1, value2 The expressions or fields containing the two sample sets for which the correlation coefficient is to be measured.

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

Add the example script to your app and run it. Then add, at least, the fields listed in the results column to a sheet in your app to see the result.

Example Result
```Salary: Load *, 1 as Grp; LOAD * inline [ "Employee name"|Gender|Age|Salary Aiden Charles|Male|20|25000 Brenda Davies|Male|25|32000 Charlotte Edberg|Female|45|56000 Daroush Ferrara|Male|31|29000 Eunice Goldblum|Female|31|32000 Freddy Halvorsen|Male|25|26000 Gauri Indu|Female|36|46000 Harry Jones|Male|38|40000 Ian Underwood|Male|40|45000 Jackie Kingsley|Female|23|28000 ] (delimiter is '|'); Correl1: LOAD Grp, Correl(Age,Salary) as Correl_Salary Resident Salary Group By Grp;  ``` In a table with the dimension Correl_Salary, the result of the Correl() calculation in the data load script will be shown: 0.9270611

## Fractile¶

Fractile() finds the value that corresponds to the fractile (quantile) of the aggregated data in the expression over a number of records as defined by a group by clause.

`Fractile( expr, fraction )``

Return data type: numeric

Argument Description
expr The expression or field containing the data to be measured.
fraction A number between 0 and 1 corresponding to the fractile (quantile expressed as a fraction) to be calculated.

Add the example script to your app and run it. Then add, at least, the fields listed in the results column to a sheet in your app to see the result.

Example Result
```Table1: crosstable LOAD recno() as ID, * inline [ Observation|Comparison 35|2 40|27 12|38 15|31 21|1 14|19 46|1 10|34 28|3 48|1 16|2 30|3 32|2 48|1 31|2 22|1 12|3 39|29 19|37 25|2 ] (delimiter is '|'); Fractile1: LOAD Type, Fractile(Value,0.75) as MyFractile Resident Table1 Group By Type; ``` In a table with the dimensions Type and MyFractile, the results of the Fractile() calculations in the data load script are:
``` Type MyFractile Comparison 27.5 Observation 36```

## Kurtosis¶

Kurtosis() returns the kurtosis of the data in the expression over a number of records as defined by a group by clause.

`Kurtosis( [distinct ] expr )`

Return data type: numeric

Argument Description
expr The expression or field containing the data to be measured.
distinct If the word distinct occurs before the expression, all duplicates will be disregarded.

Add the example script to your app and run it. Then add, at least, the fields listed in the results column to a sheet in your app to see the result.

Example Result
```Table1: crosstable LOAD recno() as ID, * inline [ Observation|Comparison 35|2 40|27 12|38 15|31 21|1 14|19 46|1 10|34 28|3 48|1 16|2 30|3 32|2 48|1 31|2 22|1 12|3 39|29 19|37 25|2 ] (delimiter is '|'); Kurtosis1: LOAD Type, Kurtosis(Value) as MyKurtosis1, Kurtosis(DISTINCT Value) as MyKurtosis2 Resident Table1 Group By Type; ``` In a table with the dimensions Type, MyKurtosis1,and MyKurtosis2, the results of the Kurtosis() calculations in the data load script are:
``` Type MyKurtosis1 MyKurtosis2 Comparison -1.1612957 -1.4982366 Observation -1.1148768 -0.93540144```

## LINEST_B¶

LINEST_B() returns the aggregated b value (y-intercept) of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

`LINEST_B( y_value, x_value[, y0 [, x0 ]] )`

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do.

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

## LINEST_DF¶

LINEST_DF() returns the aggregated degrees of freedom of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

`LINEST_DF( y_value, x_value[, y0 [, x0 ]] )`

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do.

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

## LINEST_F¶

This script function returns the aggregated F statistic (r2/(1-r2)) of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

`LINEST_F( y_value, x_value[, y0 [, x0 ]] )`

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do.

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

## LINEST_M¶

LINEST_M() returns the aggregated m value (slope) of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

`LINEST_M( y_value, x_value[, y0 [, x0 ]] )`

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do.

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

## LINEST_R2¶

LINEST_R2() returns the aggregated r2 value (coefficient of determination) of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

`LINEST_R2( y_value, x_value[, y0 [, x0 ]] )`

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do.

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

## LINEST_SEB¶

LINEST_SEB() returns the aggregated standard error of the b value of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

`LINEST_SEB( y_value, x_value[, y0 [, x0 ]] )`

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do.

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

## LINEST_SEM¶

LINEST_SEM() returns the aggregated standard error of the m value of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

`LINEST_SEM( y_value, x_value[, y0 [, x0 ]] )`

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do.

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

## LINEST_SEY¶

LINEST_SEY() returns the aggregated standard error of the y estimate of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

`LINEST_SEY( y_value, x_value[, y0 [, x0 ]] )`

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do.

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

## LINEST_SSREG¶

LINEST_SSREG() returns the aggregated regression sum of squares of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

`LINEST_SSREG( y_value, x_value[, y0 [, x0 ]] )`

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do.

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

## LINEST_SSRESID¶

LINEST_SSRESID() returns the aggregated residual sum of squares of a linear regression defined by the equation y=mx+b for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

`LINEST_SSRESID( y_value, x_value[, y0 [, x0 ]] )`

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.
y(0), x(0)

An optional value y0 may be stated forcing the regression line to pass through the y-axis at a given point. By stating both y0 and x0 it is possible to force the regression line to pass through a single fixed coordinate.

Unless both y0 and x0 are stated, the function requires at least two valid data-pairs to calculate. If y0 and x0 are stated, a single data pair will do.

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

## Median¶

Median() returns the aggregated median of the values in the expression over a number of records as defined by a group by clause.

`Median( expr )`

Return data type: numeric

Argument Description
expr The expression or field containing the data to be measured.

Add the example script to your app and run it. Then build a straight table with Type and MyMedian as dimensions.

Example Result
```Table1: crosstable LOAD recno() as ID, * inline [ Observation|Comparison 35|2 40|27 12|38 15|31 21|1 14|19 46|1 10|34 28|3 48|1 16|2 30|3 32|2 48|1 31|2 22|1 12|3 39|29 19|37 25|2 ] (delimiter is '|'); Median1: LOAD Type, Median(Value) as MyMedian ```

```Resident Table1 Group By Type; ```

The results of the Median() calculation are:

• Type is MyMedian
• Comparison is 2.5
• Observation is 26.5

## Skew¶

Skew() returns the skewness of expression over a number of records as defined by a group by clause.

`Skew( [ distinct] expr )`

Return data type: numeric

Argument Description
expr The expression or field containing the data to be measured.
DISTINCT If the word distinct occurs before the expression, all duplicates will be disregarded.

Add the example script to your app and run it. Then build a straight table with Type and MySkew as dimensions.

Example Result
```Table1: crosstable LOAD recno() as ID, * inline [ Observation|Comparison 35|2 40|27 12|38 15|31 21|1 14|19 46|1 10|34 28|3 48|1 16|2 30|3 32|2 48|1 31|2 22|1 12|3 39|29 19|37 25|2 ] (delimiter is '|'); Skew1: LOAD Type, Skew(Value) as MySkew Resident Table1 Group By Type; ```

The results of the Skew() calculation are:

• Type is MySkew
• Comparison is 0.86414768
• Observation is 0.32625351

## Stdev¶

Stdev() returns the standard deviation of the values given by the expression over a number of records as defined by a group by clause.

`Stdev( [distinct] expr )`

Return data type: numeric

Argument Description
expr The expression or field containing the data to be measured.
distinct If the word distinct occurs before the expression, all duplicates will be disregarded.

Add the example script to your app and run it. Then build a straight table with Type and MyStdev as dimensions.

Example Result
```Table1: crosstable LOAD recno() as ID, * inline [ Observation|Comparison 35|2 40|27 12|38 15|31 21|1 14|19 46|1 10|34 28|3 48|1 16|2 30|3 32|2 48|1 31|2 22|1 12|3 39|29 19|37 25|2 ] (delimiter is '|'); Stdev1: LOAD Type, Stdev(Value) as MyStdev Resident Table1 Group By Type;```

The results of the Stdev() calculation are:

• Type is MyStdev
• Comparison is 14.61245
• Observation is 12.507997

## Sterr¶

Sterr() returns the aggregated standard error (stdev/sqrt(n)) for a series of values represented by the expression iterated over a number of records as defined by a group by clause.

`Sterr( [distinct] expr )` Return data type: numeric

Argument Description
expr The expression or field containing the data to be measured.
distinct If the word distinct occurs before the expression, all duplicates will be disregarded.

Text values, NULL values and missing values are disregarded.

Add the example script to your app and run it. Then add, at least, the fields listed in the results column to a sheet in your app to see the result.

Example Result
```Table1: crosstable LOAD recno() as ID, * inline [ Observation|Comparison 35|2 40|27 12|38 15|31 21|1 14|19 46|1 10|34 28|3 48|1 16|2 30|3 32|2 48|1 31|2 22|1 12|3 39|29 19|37 25|2 ] (delimiter is '|');   Sterr1: LOAD Type, Sterr(Value) as MySterr Resident Table1 Group By Type;```

In a table with the dimensions Type and MySterr, the results of the Sterr() calculation in the data load script are:

Type MySterr

Comparison 3.2674431

Observation 2.7968733

## STEYX¶

STEYX() returns the aggregated standard error of the predicted y-value for each x-value in the regression for a series of coordinates represented by paired numbers in x-expression and y-expression iterated over a number of records as defined by a group by clause.

`STEYX( y_value, x_value )``

Return data type: numeric

Argument Description
y_value The expression or field containing the range of y-values to be measured.
x_value The expression or field containing the range of x-values to be measured.

Text values, NULL values and missing values in any or both pieces of a data-pair result in the entire data-pair being disregarded.

Add the example script to your app and run it. Then add, at least, the fields listed in the results column to a sheet in your app to see the result.

Example Result
```Trend: Load *, 1 as Grp; LOAD * inline [ Month|KnownY|KnownX Jan|2|6 Feb|3|5 Mar|9|11 Apr|6|7 May|8|5 Jun|7|4 Jul|5|5 Aug|10|8 Sep|9|10 Oct|12|14 Nov|15|17 Dec|14|16 ] (delimiter is '|'); STEYX1: LOAD Grp, STEYX(KnownY, KnownX) as MySTEYX Resident Trend Group By Grp;```

In a table with the dimension MySTEYX, the result of the STEYX() calculation in the data load script is 2.0714764.