It’s Cold Out Today – Please Remember to Dress Your Naked P-Value…

Ok, so you agree to dress your little friend before sending him/her out into the cold world of publication. But what is a p-value anyway? I realize that I am jumping the gun (pun intended) a little as it forces us to talk about inferential statistics – a challenging topic. So today I will only give you a small taste of what is to come. First, to get you in a good mood I want you to watch the trailer for the first of three hilarious Naked Gun movies.

We have already talked about research questions and today I would like to introduce you to their children the research hypotheses. Essentially they are a version of their parents that summarize the main elements of a study – sample, predictor and outcome variables – in such a way that you are able to perform a test of statistical significance. These hypotheses are not required for descriptive studies like the ones we have been discussing in our blog so far. For instance if we were to ask how many people who read this blog enjoyed the Naked Gun series of movies we would end up with a proportion. We could then simply describe our findings as discussed in my Ogive post.

But what if you wanted to know if the proportion of gals differed from the proportion of guys who enjoyed the movies, as you suspect that the type of humor will please guys more than gals? As we are research scientists we would want to test this “hypothesis” in order to compare the findings among the groups: this is a test of statistical significance. The brilliant statistician Ronald Fisher championed this approach. Only a single hypothesis is required: the null hypothesis. It simply states that no association of interest exists. So in this case whether you are a gal or a guy is not associated with whether you like the Naked Gun movies or not in the population of blog readers.

Break! Listen to the music of P-value Diddy (he has so many names already I thought it ok to add one more) with Jimmy Page from the Godzilla soundtrack.

Welcome back. So the null hypothesis is always assumed to be true until shown to be false with a statistical test. When you analyze your data and perform the test you will determine the probability of seeing an effect as big or bigger than that in your study by chance alone if the null hypothesis were true. This probability is the p-value. You would reject the null hypothesis if the p-value is less than a predetermined level of significance – typically 5% or 1 in 20.

So what is a naked p-value? It is simply a p-value obtained from the statistical test you performed on the data from your study reported WITHOUT an effect size, its sign and precision. The effect size is simply an estimate of the size of the association that you are studying – 25% more guys liked the movies as compared to the gals. The sign and precision are simply the direction of the observed difference (are you comparing gals to guys or the other way around) and an estimate of how confident you are – generally reported as a confidence interval which we will talk about in a later post.
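To make the idea of a "dressed" p-value concrete, here is a minimal sketch in Python. It assumes made-up counts (60 of 100 guys and 35 of 100 gals liked the movies) and uses a two-proportion z-test with a normal approximation, which is only one of several reasonable tests for this kind of data. The function name `two_proportion_summary` is hypothetical.

```python
import math

def two_proportion_summary(liked_a, n_a, liked_b, n_b, z_crit=1.96):
    """Compare two proportions: effect size, approximate 95% CI, two-sided p-value."""
    p_a = liked_a / n_a
    p_b = liked_b / n_b
    diff = p_a - p_b  # effect size with its sign (group A minus group B)

    # Standard error under the null hypothesis (pooled proportion), used for the test.
    pooled = (liked_a + liked_b) / (n_a + n_b)
    se_null = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se_null
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))

    # Unpooled standard error, used for the confidence interval (the precision).
    se_diff = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    ci = (diff - z_crit * se_diff, diff + z_crit * se_diff)
    return diff, ci, p_value

# Made-up counts, for illustration only: guys are group A, gals are group B.
diff, ci, p_value = two_proportion_summary(60, 100, 35, 100)
print(f"difference = {diff:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), p = {p_value:.4f}")
```

The printed line carries all three pieces together: the size of the difference, its direction (guys minus gals), and an interval showing how precise the estimate is, with the p-value alongside rather than on its own.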

So what is the bottom line? In order to keep your p-value warm you need to report it with the measure of the size of the association (effect size) and how confident you are about your answer.

In a subsequent post we will talk about another similar approach, Neyman-Pearson hypothesis testing, which involves two competing hypotheses (the null and the alternative hypotheses). This approach is deductive as opposed to Fisher’s inductive statistical testing approach. Both approaches are valid. It is simply a matter of determining which is more appropriate in a given situation.

See you in the blogosphere,

Pascal Tyrrell

Are You My Type, Data?

So you have come up with a research question and now you must choose a method by which your responses will be obtained. For example, a question like ‘Are you a Trekky?’ leads to a simple yes/no answer. So, are you? No need to fess up. I understand. Don’t know what I am talking about? See the trailer for my favorite of the Star Trek movies: The Wrath of Khan Trailer

What if you were to ask, ‘How much of a Trekky are you?’. You are no longer able to use a simple two-category response but one that uses a continuous scale.

An important distinction to remember when dealing with responses in research is that in general some will be categorical, such as favorite TV series, race, or marital status, and others continuous variables like blood pressure, cholesterol levels, or how much you enjoy Star Trek shows on a scale of 1 to 10 recorded on a 100 mm line. For those of you who would score high here listen to Santana – You are my kind as a reward.

This brings us to the important concept of the level of measurement. If you are working with named categories – race for example – then you have a nominal variable. Categories that have an order to them – education level for example – are ordinal variables. What if the interval between your responses is fixed and known? Then you have an interval variable – temperature in Celsius or Fahrenheit is a good example. However, is zero degrees Celsius the same as zero degrees Fahrenheit? No. The latter is much colder! Now what if you are working in Kelvin which has a meaningful zero point? Then it is a ratio variable.
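Here is a small Python sketch of why the zero point matters. The two temperatures are arbitrary values chosen for illustration, and the helper names are hypothetical. Differences survive a change of scale, but ratios only make sense when zero really means "none".

```python
def celsius_to_kelvin(c):
    return c + 273.15

def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

cool_c, warm_c = 10.0, 20.0  # arbitrary example temperatures

# Differences are meaningful on an interval scale: a 10-degree gap in Celsius
# is the same size of gap in Kelvin.
gap_c = warm_c - cool_c
gap_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios depend on where zero sits, so they change with the unit...
ratio_c = warm_c / cool_c
ratio_f = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cool_c)
# ...unless the scale has a true zero, as Kelvin does.
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(f"gap: {gap_c:.1f} C, {gap_k:.1f} K")
print(f"ratio: {ratio_c:.2f} (C), {ratio_f:.2f} (F), {ratio_k:.3f} (K)")
```

Saying that 20 degrees Celsius is "twice as hot" as 10 degrees Celsius falls apart as soon as you switch to Fahrenheit. Only the Kelvin ratio describes something physical.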

Ok, so why the big deal? The important difference is between nominal/ordinal data and interval/ratio data. The latter two can be used in what is termed “parametric statistics”, which gives us measures of center (mean) and spread (standard deviation). We have already touched on this in previous posts. See here: Great Expectations. It makes no sense to talk about the average sex of a sample of students in your study. These data must be considered as frequencies in separate categories. We previously talked about this a little here: Ogive and this type of data leads to “non-parametric” analysis.
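As a rough illustration, the sketch below summarizes one interval/ratio variable and one nominal variable for a made-up sample of five students. The values and variable names are assumptions for the example only.

```python
import statistics
from collections import Counter

# Hypothetical sample of five students.
systolic_bp = [118, 124, 131, 109, 127]  # ratio data: mean and SD make sense
sex = ["F", "M", "F", "F", "M"]          # nominal data: only counts make sense

mean_bp = statistics.mean(systolic_bp)
sd_bp = statistics.stdev(systolic_bp)    # sample standard deviation

counts = Counter(sex)
proportions = {category: n / len(sex) for category, n in counts.items()}

print(f"blood pressure: mean = {mean_bp:.1f}, SD = {sd_bp:.1f}")
print(f"sex: counts = {dict(counts)}, proportions = {proportions}")
```

There is no line computing a "mean sex" because there is nothing sensible to compute. The categories are simply tallied.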

Enough already! I’ll let you get back to streaming Star Trek re-runs…

Next time let’s talk a little about parametric statistics and how they came to be. I’ll leave you with this quote as a teaser from one of the greatest statisticians to ever walk the earth – Ronald Fisher: “The analysis of variance is not a mathematical theorem, but rather a convenient method of arranging the arithmetic.”

Pascal Tyrrell

Pick me! Pick me! Pick me!

So now that you are a debutante research scientist you are eager to take your newly acquired skills for a test drive. But where? You could get a job as a research assistant, but jobs are scarce. Apply for a student research summer scholarship. Maybe, but how about that summer job as a lifeguard at Camp North Star (from the crazy movie Meatballs) that you have committed to?

How about volunteering? But why would you do that? Let me tell you why. First listen to CeeLo Green to get pumped (yes, it is about firefighters but just pretend he is singing about volunteer scientists…).

There are many, many, many reasons to be a volunteer for any organization in your community. I am certain you can think of a bunch within a short brainstorming session. What I want to share with you are some of my reasons. I have volunteered with special needs in my community for most of my adult life (quick calculation puts me at about 5,000 hrs to date). People’s first reaction is that of surprise. What? How do you have the time? Then they think how admirable… Sure, I’ll take that. But really, I do it because it gives me opportunity. Opportunity to learn and grow. Not only am I happy to do it and it makes others happy doing it, but I accomplish something that I may not have had the opportunity to do otherwise. Think about that. Make others happy AND gain some experience in the process.

Just like in the fantastic movie Shrek, where Donkey volunteers his services. He is always enthusiastic and willing to help. Listen to him ask to be picked here: Pick me! What does it get him in the end? A couple of great friends and a dragon wife. Perfect.

Create opportunities for yourself by volunteering. You will be glad you did.

Happy Valentine’s Day,

Pascal Tyrrell

Interview with the (research) Devil

Interviews: the most loved and hated type of activity for all, from the powerful, skeptical, God-like interviewers seeking information to the innocent, intimidated and encapsulated interviewees seeking a break. So many emotions happen when two people meet for the first time in the interview setting. I definitely know what it’s like to be put in the hot seat, and one word sums up how I felt coming into my own interview with the University of Toronto for this program – terrified. I was completely terrified. With new offices in the heart of Toronto, I felt like a small town girl moving to the big city alone. It was almost a coming of age experience – one small step into the building, yet one giant step for the adolescent-adulthood phase I am now transitioning into.

As I went up the elevator and pressed the fourth floor button, I almost could not contain myself. But the scariest part of the whole ordeal was probably the moment before I found the right office. Of course, I stumbled into the wrong office, and when I asked the woman working there for Dr. Tyrrell, the interviewer, the look on her face told me I was in the wrong place and my heart dropped. Of course, when I finally met with Dr. Tyrrell and discussed the program, all of this fear and anxiety disappeared at the drop of a hat, but the point is, interviews are a type of research, so research can be quite adventurous!

Faith Balshin

Ogive? What the what? Oh, “jive”… right!

Ahhh, the 80’s. Interesting years to be in high school. I think I never quite fully recovered. I don’t wear Corduroy pants anymore but the acid wash jean jacket… maybe. Not sure what I am talking about? Have a peek here:  80’s-fashion.

So in my last post we talked about the concept of expectation (see Great-expectations) and the importance of organizing our data. Ask me what I think is the most important step to understanding your data? Organizing and graphing it – always. It is such a simple thing to do and yet it gives you crazy perspective and insight for any analysis that may follow.

The concept of a frequency distribution in statistics is paramount. By organizing your data values into an appropriate number of classes we in fact make more explicit the information that is there in the data. The resulting frequency table can then provide us with some basic summary statistics such as class frequencies and proportions. By the way, classes have end marks. The upper and the lower. The average of these two marks is the mid-point and the interval is the difference between adjacent class mid-points. Lastly, the class mid-point plus or minus half the interval gives you the class boundaries… Boring? Maybe you need a break. Watch the trailer for the epic 1980 movie Airplane! to decompress a little: Airplane! movie trailer…
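If the end marks, mid-points and boundaries went by a little fast, here is a minimal Python sketch of a frequency table. It assumes a made-up set of whole-number readings grouped into the classes 10-19, 20-29 and 30-39. The data and the class choices are illustrative only.

```python
# Hypothetical integer readings, grouped into classes 10-19, 20-29, 30-39.
readings = [12, 15, 17, 21, 22, 22, 25, 28, 29, 31, 34, 38]
end_marks = [(10, 19), (20, 29), (30, 39)]  # (lower, upper) end marks per class

# Mid-point: the average of the two end marks.
midpoints = [(lower + upper) / 2 for lower, upper in end_marks]
# Interval: the difference between adjacent class mid-points.
interval = midpoints[1] - midpoints[0]

table = []
for (lower, upper), mid in zip(end_marks, midpoints):
    frequency = sum(lower <= r <= upper for r in readings)
    table.append({
        "class": f"{lower}-{upper}",
        "midpoint": mid,
        # Boundaries: mid-point plus or minus half the interval.
        "boundaries": (mid - interval / 2, mid + interval / 2),
        "frequency": frequency,
        "proportion": frequency / len(readings),
    })

for row in table:
    print(row)
```

Notice that the boundaries (9.5 to 19.5, and so on) leave no gaps between classes, which is exactly what you want when you move on to drawing a histogram.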

So what now? We need to present this data graphically. The first chart to think of is the bar chart. It is simply a plot of the frequency against class, where the class frequencies are represented by bars. Classes in this case are made up of SINGLE readings. How about an example using radiation counts?

If your classes are made up of a GROUP of readings then you would consider a histogram as in this example using velocity of light measurements.

Now if you were to join the mid-point of each class by a straight line you would obtain a frequency polygon. This would allow you to easily compare several distributions on a single graph.

Finally, if you were to plot the CUMULATIVE frequency against the upper class boundary you would produce a cumulative frequency polygon – AKA the “ogive” as it has the characteristic arch-like shape found in architecture.
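The points of an ogive are easy to compute by hand or in code. The sketch below assumes three made-up classes with their upper class boundaries and frequencies, and only builds the points to plot rather than drawing the chart.

```python
from itertools import accumulate

# Hypothetical class frequencies with their upper class boundaries.
upper_boundaries = [19.5, 29.5, 39.5]
frequencies = [3, 6, 3]

# Running total of the frequencies: the cumulative frequency.
cumulative = list(accumulate(frequencies))

# Each ogive point pairs an upper class boundary with the cumulative frequency.
ogive_points = list(zip(upper_boundaries, cumulative))

# Dividing by the total gives the cumulative proportion instead.
total = cumulative[-1]
cumulative_proportions = [c / total for c in cumulative]

print(ogive_points)
print(cumulative_proportions)
```

Joining these points with straight lines gives the cumulative frequency polygon. The last point always lands on the total number of readings (or on 1.0 if you plot proportions).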

If you ever find yourself using the term ogive in a public setting and getting blank stares from your friends then refer to the funny “jive” scene in the infamous movie Airplane! to defuse the situation: Airplane! – Jive Scene.

Hopefully, everyone will say: “Oh, jive. I get it!”…

Let’s talk a little about data types next time. Ok?

See you in the blogosphere…

Pascal Tyrrell

Great Expectations and What the Dickens is Probability Distribution Anyway?

If you are feeling like Pip in Charles Dickens’ wonderful novel Great Expectations every time you think of statistics, you are not alone! Not sure who Pip is? Have a peek at the latest of many movies based on this book: Great Expectations trailer

Pip started life in a poor community raised by a much older cruel sister. He did, however, grow up to be a gentleman (and a scholar?) and come to realize that our great expectations in life won’t necessarily come true. We instead work hard all of our lives and ultimately have to accept what is. Getting too serious? Have a gander at Diggy Simmons music video “Great Expectations” to relax a bit: Diggy Simmons music video

Ok we’re back. So what is the link between Pip and statistics?

As researchers we are often interested in “what to expect” in future experiments or trials. The methodology used to perform the research and analysis of results will help to obtain an estimate of the answer to your question – see my previous post if you are in the dark about this one (Allegory of the cave).

In statistics the term “expectation” is given a precise definition in terms of probabilities (the chance that something will happen – how likely is it that some event will happen). Thus, if we consider an experiment or trial as taking a variable x at random from some population of readings and recording its value then the value to expect for x is the mean µ of this population.

Here is the rub: the population mean is usually a quantity whose value we can NEVER determine exactly – it is the value to EXPECT. This is a VERY important concept in statistics.

*** Caution: stats talk below – skip if already feeling dizzy…

When we make predictions about future trials we have to keep in mind that we are working with a sample of results that will necessarily have a measure of uncertainty associated with them. By organizing our data into frequency tables we can then present its distribution graphically (e.g. frequency curve, histogram) and get our first appreciation of where the center is (mean, median, mode) and how much scatter there is (variance and standard deviation). Finally, if we convert our frequency distributions to probability distributions (divide each class frequency by the sum of frequencies) we can obtain expected values from these distributions. Plural? There are different types? Yes, and we will chat about these in future posts…
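For those still with me, here is a minimal Python sketch of that last step. It assumes a made-up frequency distribution of a count variable. Dividing each frequency by the total turns it into a probability distribution, and the expected value is then the probability-weighted sum of the values.

```python
# Hypothetical frequency distribution: value -> number of times it was observed.
frequencies = {0: 2, 1: 5, 2: 8, 3: 4, 4: 1}

# Divide each frequency by the sum of frequencies to get probabilities.
total = sum(frequencies.values())
probabilities = {x: f / total for x, f in frequencies.items()}

# Expected value: each value weighted by its probability.
expected_value = sum(x * p for x, p in probabilities.items())

# Variance: expected squared distance from the expected value.
variance = sum((x - expected_value) ** 2 * p for x, p in probabilities.items())

print(f"expected value = {expected_value:.2f}, variance = {variance:.4f}")
```

Because these probabilities come from a sample, the expected value computed here is itself only an estimate of the population mean µ.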

*** Safe re-entry here:

I am ok with having to work with estimates and never knowing the truth. You? As Socrates once said (a long, long time ago!): “The only true wisdom is in knowing you know nothing.”

So what next? Maybe watch the movie “Great Expectations” this week-end and tell everyone that you were studying for your stats class. Let’s talk about organizing data next.

Enjoy the movie.

Pascal Tyrrell

The Truth? You Can’t Handle the Truth!

In “A Few Good Men” Jack Nicholson growls “You can’t handle the truth” to Tom Cruise in his Academy Award-nominated performance. Watch a clip of his gritty performance. Our pursuit of the truth leads to an interesting path indeed.

The objective of this series of posts is to help you develop a scientific “sense”. Have a quick peek at my other posts if you haven’t already and come back. So wanting to know the truth is something we all strive for on a daily basis. Finding the truth is another matter altogether and this philosophical conundrum has challenged many great minds for centuries.

The Roman Emperor Marcus Aurelius once stated many, many years ago: “Everything we hear is an opinion, not a fact. Everything we see is a perspective, not the truth”. Have a quick peek at the trailer for “Gladiator” to put you in the mood.

Now Greek philosopher Plato, who predated Marcus by a few centuries, got the ball rolling when he presented his Allegory of the Cave, in which he symbolically described his belief that the world revealed by our senses is not the real world but only a poor copy of it, and that the real world can only be apprehended intellectually. Plato used an analogy where we are represented as a gathering of people who live chained to the wall of a cave all of our lives, facing a blank wall. We watch shadows projected on the wall by things passing in front of a fire behind us, and begin to designate names to these shadows. The shadows are as close as we get to viewing reality.

Getting too serious? Take a break and listen to Siouxsie And The Banshees – Shadowtime

So why all the philosophy? Because the concept of getting as close to the truth as possible is important. We accept that the truth will never be known and, therefore, we must also accept as an answer an estimate (let’s say the mean of a sample) or “best guess”. As scientists we will make sure to offer our reasoning and methodology as to how we obtained this estimate and more importantly we will offer a measure of how confident we are about this estimate – voila, biostatistics in a nutshell. Don’t believe me? Keep reading my posts and I will explain.
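A tiny Python sketch of that nutshell, assuming a made-up sample of six measurements: the sample mean is the "best guess", and the standard error of the mean is one common way (not the only one) to say how confident we are in it.

```python
import math
import statistics

sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6]  # made-up measurements

# The estimate: our best guess at the unknown population mean.
estimate = statistics.mean(sample)

# The standard error: how much that estimate would wobble from sample to sample.
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

print(f"estimate = {estimate:.2f} (standard error {standard_error:.2f})")
```

A smaller standard error means the shadow on the cave wall is a little sharper, but it is still a shadow.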

If we always knew the truth, would we need to measure anything? How boring would that be? As William Cowper aptly put it: “Variety’s the very spice of life, That gives it all its flavor”.

Here is what I suggest you do next in your endeavor to become a researcher: keep on asking crazy numbers of questions but now think of what factors will influence the estimate you will produce for your answer. Where does this “variety” that Cowper mentions come into play?

Next we will talk about the concept of “expectation” and how this is important in the world of scientific research.

How is that pocket protector working for you so far?

Pascal Tyrrell

Faith Balshin

To be, or not to be: what is in a research question?

So you now spend a minimum of an hour a week wearing your shirt with a pocket protector thinking, among other things, about what you can do to speed up your training to become a scientist. Don’t know what I am talking about? Go and see my previous post and come back (Pocket protector).

Ok. You are now asking questions furiously at all times of the day (and night?) trying to get a handle on how to structure a question in order to best help with finding an answer. Why? It’s all about clarity. Not sure what that is? Listen to Zedd for some instruction: Clarity – Zedd

A great French author Marcel Proust – yes another French author, my first name is Pascal after all – said: “The voyage of discovery lies not in seeking new horizons, but in seeing with new eyes.”

Maybe by asking the right questions we can inch ever so slowly towards the truth that lies right in front of our own eyes! So take a fresh look at what and how you do all things scientific.

Here is what I suggest for formulating your questions:

Use the PICO model (for a little more detail: PICO)

Patient, Population, Problem
Intervention
Comparison (optional. PIO when absent!)
Outcome

Essentially in a clinical setting – For a patient with (Problem), how does (Intervention) compare to (Comparison) with regard to (Outcome)?
• Is MR angiography more effective than a Doppler carotid ultrasound in diagnosing and describing carotid artery disease in obese middle-aged males and females?

or PIO – For a patient with (Problem), does (Intervention) affect (Outcome)?

• Is MR angiography effective in diagnosing and describing carotid artery disease in obese middle-aged males and females?

PICO can be applied to most research questions that you may have – yes even outside of Medical Imaging and in the real world (see Scientific thinking in business).

Just remember that you will most probably want to formulate and test a hypothesis based on your research question. For quantitative statistical analysis you will want your question to be answerable by yes/no or a number. For qualitative analysis your question will typically start with: What is/are…?

Keep practicing and we will chat about testing hypotheses next post. Stay tuned…

Pascal Tyrrell

Research Gone Wrong – Mishap Madness

With every step taken toward the den of knowledge, there has to be that one click that doesn’t really bring you to where you wanted to go. I’ve had my fair share of research-gone-wrongs, as I like to call them. For instance, I learned the hard way not to use Twitter as my most accurate research hub. Sure, a tweet here or there seems harmless, but birds sometimes do bite. As I innocently went on Twitter’s homepage, looking at various tweets from people I follow, a tweet, “OBAMA HAS BEEN KILLED,” caught my attention. Shocked, I immediately texted my dad, who reassured me that this catastrophe had not happened and told me to check whether I had put my contacts in that day: Obama was fine in the Oval Office, but Osama Bin Laden had been captured and killed by the United States, the headlining news of the hour. Moral of the story? Read the news, from the news, and do not trust any ‘.com’ website that gets into your reach. The difference a letter makes….

Faith Balshin