https://bit.ly/2CYW8LN
The two statistics most commonly used to characterize observational data are the average and the standard deviation. Denote by $x_1, x_2, \ldots, x_n$ the n individual observations in a random sample from some process. Then the average and standard deviation are defined as follows: Average: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. Standard deviation: $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$.
The average gives one number around which the n observations tend to cluster. The standard deviation measures how the n observations vary or spread about this average. The square of the standard deviation is called the variance. If we consider a unit mass at each point $x_i$, then the variance corresponds, up to a constant factor, to a moment of inertia about an axis through the average $\bar{x}$. For a fixed value of $\bar{x}$, greater spreads from the average produce larger values of the standard deviation s. The average and the standard deviation can be used jointly to summarize where the observations are concentrated. Tchebysheff's theorem states: A fraction of at least $1 - (1/k^2)$ of the observations lie within k standard deviations of the average. The theorem thus guarantees a lower bound on the percentage of observations within k (also known as z in some textbooks) standard deviations of the average; the bound is informative only for k > 1. For example, at least 75 percent of the observations lie within k = 2 standard deviations of the average, and at least 88.9 percent lie within k = 3.
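As an illustration, the following minimal Python sketch computes the average and the standard deviation of a small sample and compares the observed fraction of points within k standard deviations against the Tchebysheff lower bound. The function names and the data values are hypothetical, chosen only for this example, and the sketch assumes the n - 1 divisor for the sample standard deviation.

```python
import math


def average(xs):
    """Arithmetic mean of the observations."""
    return sum(xs) / len(xs)


def sample_std(xs):
    """Sample standard deviation, assuming the n - 1 divisor."""
    n = len(xs)
    if n < 2:
        raise ValueError("at least two observations are required")
    xbar = average(xs)
    return math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))


def fraction_within(xs, k):
    """Fraction of observations within k standard deviations of the average."""
    xbar = average(xs)
    s = sample_std(xs)
    return sum(1 for x in xs if abs(x - xbar) <= k * s) / len(xs)


def tchebysheff_lower_bound(k):
    """Guaranteed minimum fraction within k standard deviations: 1 - 1/k^2."""
    return 1.0 - 1.0 / k ** 2


# Illustrative data only; not taken from the text.
data = [2, 4, 4, 4, 5, 5, 7, 9]

for k in (1.5, 2.0, 3.0):
    observed = fraction_within(data, k)
    bound = tchebysheff_lower_bound(k)
    print(f"k={k}: observed {observed:.3f} >= bound {bound:.3f}")
```

The observed fraction is usually well above the bound, since the theorem makes no assumption about the shape of the distribution and must therefore be conservative.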
Since the average and the standard deviation are computed from a sample, they are themselves subject to fluctuation. However, if μ is the long-term average of the process and σ is the long-term standard deviation, then
 
 
 
 
