Great Expectations and What the Dickens Is a Probability Distribution Anyway?

If you are feeling like Pip in Charles Dickens’ wonderful novel Great Expectations every time you think of statistics, you are not alone! Not sure who Pip is? Have a peek at the latest of many movies based on this book: Great Expectations trailer

Pip started life in a poor community, raised by a much older, cruel sister. He did, however, grow up to be a gentleman (and a scholar?) and came to realize that our great expectations in life won’t necessarily come true. We instead work hard all of our lives and ultimately have to accept what is. Getting too serious? Have a gander at Diggy Simmons’ music video “Great Expectations” to relax a bit: Diggy Simmons music video

Ok we’re back. So what is the link between Pip and statistics?

As researchers, we are often interested in “what to expect” in future experiments or trials. The methodology used to perform the research and analyze the results will help us obtain an estimate of the answer to our question. See my previous post if you are in the dark about this one (Allegory of the cave).

In statistics, the term “expectation” is given a precise definition in terms of probabilities (the chance that something will happen, or how likely it is that some event will occur). Thus, if we think of an experiment or trial as drawing a value of a variable x at random from some population of readings and recording it, then the value to expect for x is the mean µ of this population.

Here is the rub: the population mean is usually a quantity whose value we can NEVER determine exactly – it is the value to EXPECT. This is a VERY important concept in statistics.

*** Caution: stats talk below – skip if already feeling dizzy…

When we make predictions about future trials, we have to keep in mind that we are working with a sample of results, and any estimate based on a sample carries a measure of uncertainty. By organizing our data into frequency tables we can present its distribution graphically (e.g., as a frequency curve or histogram) and get our first appreciation of where the center is (mean, median, mode) and how much scatter there is (variance and standard deviation). Finally, if we convert our frequency distributions to probability distributions (divide each class frequency by the sum of the frequencies) we can obtain expected values from these distributions. Plural? There are different types? Yes, and we will chat about these in future posts…
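For the hands-on crowd, here is a minimal Python sketch of those steps: build a frequency table, divide each frequency by the total to get a probability distribution, then read off the expected value and the scatter. The readings are made up for illustration (imagine counting heads in three coin tosses, repeated 20 times), and the function names are my own, not from any library.

```python
from collections import Counter

# Hypothetical sample: number of heads in 3 coin tosses, recorded over 20 trials.
readings = [0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 1, 2, 1, 2, 3, 0, 1, 2, 2, 1]


def probability_distribution(values):
    """Build a frequency table, then divide each frequency by the total."""
    freq = Counter(values)          # frequency table: value -> count
    total = sum(freq.values())      # sum of all frequencies
    return {v: count / total for v, count in sorted(freq.items())}


def expected_value(dist):
    """Sum of each value weighted by its probability."""
    return sum(v * p for v, p in dist.items())


def variance(dist):
    """Probability-weighted squared distance from the expected value."""
    mu = expected_value(dist)
    return sum(p * (v - mu) ** 2 for v, p in dist.items())


dist = probability_distribution(readings)
mean = expected_value(dist)
var = variance(dist)

print("Probability distribution:", dist)
print("Expected value:", mean)
print("Variance:", var)
print("Standard deviation:", var ** 0.5)
```

For this sample the expected value works out to 31/20 = 1.55, which is just the ordinary sample mean. Remember that it is only an estimate of the population mean µ: a different sample of 20 trials would likely give a slightly different number.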

*** Safe re-entry here:

I am ok with having to work with estimates and never knowing the truth. You? As Socrates once said (a long, long time ago!): “The only true wisdom is in knowing you know nothing.”

So what next? Maybe watch the movie “Great Expectations” this weekend and tell everyone that you were studying for your stats class. Let’s talk about organizing data next.

Enjoy the movie.

Pascal Tyrrell