Notes on Week 1 of Coursera Statistics One MOOC

Data is the lowest level of abstraction, from which information and then knowledge are obtained.

Descriptive versus inferential statistics.  Descriptive statistics summarize the data you have; inferential statistics infer from samples to the general population.

There are descriptive, correlational, and experimental research designs.  Descriptive: organize kids’ grades into a spreadsheet and get means, etc.  Correlational: examine relationships among variables. Are Math grades correlated with History grades?  Experimental is the gold standard: randomly assign students to schedules and ask whether achievement is affected by schedule.
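A quick Python sketch of the first two kinds, on grades I made up for illustration (not course data). The helper name pearson_r is my own; it just spells out the usual Pearson formula.

```python
from statistics import mean, stdev

# Made-up grades for five students (illustrative only).
math_grades = [70, 80, 90, 60, 85]
history_grades = [65, 78, 88, 70, 80]


def pearson_r(xs, ys):
    """Sample covariance divided by the product of the sample standard deviations."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))


# Descriptive: summarize each variable on its own.
math_mean = mean(math_grades)
history_mean = mean(history_grades)

# Correlational: ask how the two variables move together.
r = pearson_r(math_grades, history_grades)

print(math_mean, history_mean, round(r, 2))
```

An experiment would need a third ingredient that code like this cannot supply: the researcher deciding who gets which schedule.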

It’s the International Year of Statistics: www.statistics2013.org

Initial sample of the Salk vaccine trials was 4,000 children from Virginia.  The independent variable was whether a child got the vaccine or a placebo. The dependent variable is polio incidence, measured either per child or in the community. Double blind meant the experimenters also didn’t know which one each child was getting (I remember this from an essay on edge.org as one of the most important concepts).  Polio rate in the treatment group was 28 per 100k, versus 71 per 100k in the control group.  By 1994 polio was declared eradicated in the Americas.
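To make the two rates concrete, here is a small Python sketch of the arithmetic. The variable names are mine, and "relative reduction" is just one common way to compare two rates, not necessarily how the lecture framed it.

```python
# Rates from the notes, both per 100,000 children.
treatment_rate = 28
control_rate = 71

# Relative risk: treatment rate as a fraction of the control rate.
relative_risk = treatment_rate / control_rate

# Relative reduction: how much lower the treatment rate is, as a fraction.
relative_reduction = 1 - relative_risk

# Absolute difference, still per 100,000.
absolute_difference = control_rate - treatment_rate

print(round(relative_risk, 3), round(relative_reduction, 3), absolute_difference)
```

So the vaccinated group's rate was roughly 61% lower, or 43 fewer cases per 100k.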

Strong causal claims require truly independent variables (in this case vaccine or placebo).  Need random and representative samples.  And “No confounds” – which is almost impossible.

Pre- and post-test design, using the n-back procedure to test memory (is this a generic concept in memory testing?)

“Practice Effect” – even control group improved on IQ test but probably just from having recently taken a similar test.

Confounds in the memory training tests are high: a control group doing nothing is very different from a group going to a training facility every day; the trainees could just be feeling better about themselves.

Randomness needs both selection (from population) and assignment (to groups)
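The two kinds of randomness are easy to mix up, so here is a minimal Python sketch that keeps them as separate steps. The population, sample size, and seed are all made up for illustration.

```python
import random

rng = random.Random(2013)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1,000 students.
population = [f"student_{i}" for i in range(1000)]

# Random selection: draw the sample from the population.
sample = rng.sample(population, 40)

# Random assignment: shuffle the sample, then split it into two groups.
shuffled = sample[:]
rng.shuffle(shuffled)
treatment = shuffled[:20]
control = shuffled[20:]

print(len(treatment), len(control))
```

Selection is about whether the sample represents the population; assignment is about whether the two groups are comparable to each other.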

Big Five/Five Factor personality traits: Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism (OCEAN). Used to demonstrate the usefulness of correlational research.

Difference between study and experiment.

The first theory of intelligence, the general theory of intelligence (g), was proposed in 1904.  The criticism is that correlations are higher between certain types of intelligence than others.  This led to the proposal of a hierarchical model: General (Verbal, Spatial); Verbal (History, Language); Spatial (Math, Science).

Quasi-independent variable example in sports concussion research: you can’t assign people to get concussions or not!  (There is also the confound of previous concussions.)  This means you’re doing correlational research (a “study”) rather than an experiment.

Princeton students are given “Digits Backwards” tests before they start sports.

Four kinds of variables, first identified by S. S. Stevens in “On the Theory of Scales of Measurement” (Science, 1946): Nominal, Ordinal, Interval, and Ratio. Only certain techniques can be applied to each type.

Nominal: used to assign students to groups

Ordinal: rank order cases

Interval: ordinal, but the distance between each value is equal (like longitude)

Ratio: interval, but with a “true zero”
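A rough Python sketch of why the scale type limits which techniques make sense. All the data here is invented, and pairing each scale with one summary statistic is my simplification rather than something from the lecture.

```python
from statistics import mean, median, mode

group = ["A", "B", "A", "C", "A"]   # nominal: labels only
finish_rank = [1, 2, 3, 4, 5]       # ordinal: order matters, gaps do not
temp_celsius = [10.0, 20.0, 30.0]   # interval: equal gaps, arbitrary zero
height_cm = [90.0, 180.0]           # ratio: equal gaps and a true zero

most_common_group = mode(group)     # counting is all a nominal scale supports
middle_rank = median(finish_rank)   # ordering makes a median meaningful
mean_temp = mean(temp_celsius)      # equal intervals make a mean meaningful
height_ratio = height_cm[1] / height_cm[0]  # "twice as tall" is a real claim

# Ratios on an interval scale depend on where zero happens to sit:
# the same two temperatures give different ratios in Celsius and Kelvin.
celsius_ratio = temp_celsius[1] / temp_celsius[0]
kelvin_ratio = (temp_celsius[1] + 273.15) / (temp_celsius[0] + 273.15)

print(most_common_group, middle_rank, mean_temp, height_ratio)
print(celsius_ratio, round(kelvin_ratio, 3))
```

The last two numbers disagree, which is the whole point of the “true zero”: 20 degrees Celsius is not “twice as hot” as 10.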

Histograms can reveal non-normal distributions: a skew, or a bi-modal distribution

z-scores: express how far below or above average a value is, in units of standard deviation
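A minimal Python sketch of the z-score calculation on made-up scores. I am assuming the population standard deviation (pstdev) here; the course may well use the sample version, which would change the numbers slightly.

```python
from statistics import mean, pstdev

scores = [60, 70, 80, 90, 100]  # made-up test scores

mu = mean(scores)
sigma = pstdev(scores)  # population standard deviation (an assumption)

# z = (value - mean) / standard deviation
z_scores = [(x - mu) / sigma for x in scores]

for x, z in zip(scores, z_scores):
    print(x, round(z, 2))
```

Negative z means below average, positive means above, and the score equal to the mean gets exactly zero.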

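I had always thought “positively skewed” meant more of the data on the positive side, not less. It actually refers to the long tail pointing toward the positive side, so the bulk of the values sits at the low end.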
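A quick Python check of the direction of skew, on numbers I made up. The skewness formula is the plain moment-based one (the average cubed z-score); other definitions exist and give slightly different values.

```python
from statistics import mean, median, pstdev

# Most values are small, with one large value stretching the right tail.
values = [1, 2, 2, 3, 3, 3, 4, 4, 20]

m = mean(values)
med = median(values)
s = pstdev(values)

# Moment-based skewness: the average of the cubed z-scores.
skewness = sum(((x - m) / s) ** 3 for x in values) / len(values)

print(round(m, 2), med, skewness > 0)
```

The long right tail drags the mean above the median and makes the skewness positive, even though most of the data sits on the low side.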

Discussion of concepts like mean, median, and mode is a little slow for what I’m expecting of this course, which is a strike against continuing given how many other things there are to do.  Still, I was rusty on what “bi-modal” meant on a histogram.

Finding it worthwhile to have the lecture substantially sped up.

Intro to R at the end.  Lists are just like Python lists.  Vectors and matrices were also introduced.

Hand-typed the R code to make the data frame, misspelled “frame”, and found it a cool bit of serendipity to press the up arrow and get the code back to edit.  These small revelations are fun, and it feels like only by hand-typing the code will I really get to practice it.

Downloading packages looks easy in R, but I did get hung up at 63% downloaded at first.

The quizzes were easy and the “Lab” was as well. It’s pretty much of the “the sky is green; what color is the sky?” variety, but it still made me walk through simple binding of vectors into matrices.  I think more of these MOOCs need more rote practice after the facile examples.

But homework is done (10/10), which is enough for me to continue into next week, at least until other more pressing MOOCs start.

 
