The following statistical functions are provided which operate on a range or a list of several cells. Note that lists can include arrays. There are examples of the statistical functions in the PipeDream Examples archive.
AVG(list)
Returns the arithmetic mean value of list.
Only numeric elements are evaluated so that blank cells do not upset averages.
BETA(a, b)
Returns the value of the Beta function for arguments a, b.
One definition of the Beta function is
beta(a, b) = gamma(a) × gamma(b) ÷ gamma(a + b)
This special function is actually evaluated by PipeDream using the formula
beta(a, b) = exp( gammaln(a) + gammaln(b) - gammaln(a + b) )
COMBIN(n, k)
This function yields the number of combinations of n unlike things taken k at a time.
The formula used is:
nCk = n! ÷ ((n - k)! × k!)
For very large n this is an approximation achieved using logarithms.
COUNT(list)
Counts the number of numeric elements in list.
GAMMALN(number)
This special function yields the natural logarithm of the gamma function of number.
GRAND(m, s)
Returns a random number with gaussian distribution, mean m and standard deviation s.
If m is not given, PipeDream assumes 0; if s is not given, PipeDream assumes 1.
GROWTH(LOGEST_data, x_data)
Takes the result of the LOGEST function (LOGEST_data) and applies it to x_data to return estimated y_data.
LINEST(known_y's, known_x's)
Calculates the formula of the straight line which best fits your data (known_y's and known_x's) and returns an array.
The result is an array of 2 values giving {c,m} for the straight line fit
y = mx + c
where the first array element, c, is the y-intercept value (i.e. where x=0) of the best-fitting straight line and the second array element, m, is the gradient of that line.
An example is given below. LINEST is used to calculate the gradient and intercept of the best-fitting straight line through the data points. INDEX is used to extract separate elements of the result array.
A | B | C | D | |
---|---|---|---|---|
1 | 2.00 | 4.00 | -2.00 | C1=linest(B1B5,A1A5) |
2 | 3.00 | 7.00 | 3.00 | C2=index(C1,2,1) |
3 | 5.00 | 13.00 | ||
4 | 7.00 | 19.00 | ||
5 | 10.00 | 28.00 |
To verify the results, a scatter chart of the data series can be drawn and a line of best fit may be specified to be drawn on the chart.
The result is an array of m+1 values such that
y = a0 + Σi=m(ai·xi)
where the xi are the independent x variables.
An example is given below. It is assumed that the value of a house is related to the garden size, number of bedrooms and distance from the city centre. Values for eight houses are known and these are used to calculate (in A10) an array of the coefficients. Expanding the array into the cells A12 to C12 shows the resulting formula to be
Value = 40688 + 62*Garden + 3021*Bedrooms - 956*Distance
This is then used to calculate a value for a house with a garden of 600 square metres, 4 bedrooms and 2 miles from the city centre. D14 and D15 show two ways of calculating the result.
A | B | C | D | E | |
---|---|---|---|---|---|
1 | Garden m2 | Bedrooms | Distance | Value | |
2 | 1600 | 4 | 4 | 150000 | |
3 | 450 | 3 | 67000 | ||
4 | 50 | 2 | 0.5 | 55000 | |
5 | 0 | 2 | 1 | 45000 | |
6 | 650 | 4 | 3 | 100000 | |
7 | 200 | 1 | 3.5 | 58000 | |
8 | 250 | 4 | 55000 | ||
9 | 620 | 3 | 2 | 75000 | |
10 | 40688 | A10=linest(D2D9,A2C9) | |||
11 | 2 | A11=set_value(A12C12,index(A10,2,1,3,1)) | |||
12 | 62 | 3021 | -956 | ||
13 | |||||
14 | 600 | 4 | 2 | 87949 | D14=trend(A10,A14C14) |
15 | 87949 | D15=sum(A12C12*A14C14)+A10 |
LISTCOUNT(array)
Returns a two-column array with the first column being a sorted list of unique entries from the first column of array, and the second column giving the number of occurrences of each entry.
If array has two columns the second column will be treated as a corresponding set of numbers to be added in to the total.
Example:
You have a list of customer names in column A
and in column C
you have a list of sales figures for individual sales to those customers, then
SET_VALUE(D1E20,LISTCOUNT({A1A100,C1C100}))
would write out 20 rows of customers names and total sales figures in D1E20
. {A1A100,C1C100}
makes a two column array of 100 rows.
You could use SORT()
and FLIP()
to sort the result of LISTCOUNT()
into order before listing it out.
LOGEST(known_y's, known_x's)
Calculates an exponential curve which fits your data and returns an array which describes the curve.
The equation for the curve is:
y = b · (mx)
or
y = b · Πi=m(mixi)
where the xi are the independent x variables.
This function is simply a logarithmic version of LINEST.
MAX(list)
Returns the maximum value in list.
MEDIAN(array)
Returns the median value in the (not necessarily sorted) range.
If there are an even number of values in the range, the mean of the two middle values is returned.
MIN(list)
Returns the minimum value in list.
PERMUT(n, k)
This function yields the number of permutations of n unlike things taken k at a time.
The formula used is:
nPk = n! ÷ (n - k)!
For very large n this is an approximation achieved using logarithms.
RAND
Returns a random number in the range 0 to 1 from a sequence which is the same each time PipeDream starts.
RAND(0) generates a random number in the range 0 to 1 using a random seed.
RAND(number)
Generates a random number in the range 0 to 1 using number as the initial seed.
RANK(array {, spearflag})
Takes an array and ranks the values it contains, producing as an output an array with two columns { rank, entries }, where
Rank is the rank of each value (from highest to lowest).
Entries contains the number of entries of that value in the input array.
If spearflag is present and is non-zero, equally ranked entries are adjusted so that the output is suitable for passing to the SPEARMAN() rank correlation function.
SPEARMAN(array2)
Returns the standard deviation of the values in list.
This is the square root of the variance. See VAR for the equation used.
STDP(list)
Returns the population standard deviation of the values in list.
This is the square root of the population variance. See VARP for the equation used.
TREND(LINEST_data, x_data)
Takes the result of the LINEST function (LINEST_data) and applies it to x_data to return estimated y_data.
VAR(list)
Returns the variance of the values in list.
The equation used is:
(n·(Σ(x2)) - (Σx)2) ÷ n·(n-1)
where x is each value and n is the number of values.
VARP(list)
Returns the population variance of the values in list.
The equation used is:
(n·(Σ(x2)) - (Σx)2) ÷ n2
where x is each value and n is the number of values.