Do temperatures HAVE a mean?

This isn’t as ‘cheeky’ a question as it might seem.

All the statistical manipulation I’ve seen done on temperatures tends to presume they have a Standard Normal Distribution and that it is a valid statistical operation to compute a mean. While this might seem reasonable for a single thermometer in a single place (though even there it can ‘have issues’), as soon as you start doing an arithmetic mean over a geographic field of thermometers spread over 1200 km (as is done in codes like GIStemp) you are making the implicit assumption that the mean is defined.

For those of us who had some, but not extensive, statistics, that is a natural assumption as we spent months (years?) doing various kinds of problems all universally based on a Standard Normal Distribution. But there are other kinds of distribution…

Why this matters is that it is the Central Limit Theorem, and the approach to a Normal Distribution that it promises for averages, that is used to justify producing those numbers with astounding precision from lousy input data. From the wiki on the normal distribution:

The normal distribution is considered the most prominent probability distribution in statistics. There are several reasons for this: First, the normal distribution arises from the central limit theorem, which states that under mild conditions, the mean of a large number of random variables independently drawn from the same distribution is distributed approximately normally, irrespective of the form of the original distribution. This gives it exceptionally wide application in, for example, sampling. Secondly, the normal distribution is very tractable analytically, that is, a large number of results involving this distribution can be derived in explicit form.
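To make that quoted claim concrete, here is a minimal Python sketch of the central limit theorem at work. It is only an illustration: the exponential distribution, the sample sizes, and the helper names `skewness` and `sample_means` are my own assumptions and do not come from GIStemp or any temperature code.

```python
import random
import statistics

def skewness(xs):
    """Sample skewness: about 0 for a symmetric (e.g. normal) distribution."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 3 for x in xs])

def sample_means(rng, n, trials):
    """Means of `trials` independent samples, each of `n` exponential draws."""
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(trials)]

rng = random.Random(42)  # fixed seed so the run is repeatable

# Raw draws from a strongly right-skewed distribution (theoretical skewness 2).
raw = [rng.expovariate(1.0) for _ in range(20000)]
# Means of 100 independent draws each, repeated 2000 times.
means = sample_means(rng, n=100, trials=2000)

print("skewness of raw draws:   ", round(skewness(raw), 2))
print("skewness of sample means:", round(skewness(means), 2))
print("spread of sample means:  ", round(statistics.pstdev(means), 3))
```

The raw draws are heavily skewed, while the means come out close to symmetric with a spread near 1/sqrt(100) = 0.1. That only works here because the draws really are independent, really do come from the same distribution, and that distribution has a finite variance.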

Notice some of the ‘requirements’ of those statements. “Mild conditions”. “Large number”. “Random variables”. “Independent”. “From the same distribution”. “Approximately normal”. But per Hansen et al. various temperatures in a region (up to 1200 km) are NOT “independent” random variables but are in fact co-variant. We have about 1280 thermometers currently in use globally in the GHCN last I looked. That means that any 1200 km ‘field’ is far less populated. That is NOT a ‘large number’ statistically. Thermometers near a large ocean have very different behaviours from those inland a couple of hundred miles in mountains or deserts. Those in the snow have different behaviours from those in the dry valleys. They are not “the same distribution”. Are conditions “Mild” in Antarctica or the Sahara? I don’t know if they meet the requirements of statistically “mild” or not. Right off the bat we have some concerns showing up in the use of the Reference Station Method and the choice to use the mean of a set of temperatures for statistical manipulation of other temperatures and the infilling of missing data into various specific locations and the larger “grid / boxes”.

But wait, there’s more…

A bit further down in that wiki we find the first hints that all might not be well:

Quantities that grow exponentially, such as prices, incomes or populations, are often skewed to the right, and hence may be better described by other distributions, such as the log-normal distribution or the Pareto distribution. In addition, the probability of seeing a normally distributed value that is far (i.e. more than a few standard deviations) from the mean drops off extremely rapidly. As a result, statistical inference using a normal distribution is not robust to the presence of outliers (data that are unexpectedly far from the mean, due to exceptional circumstances, observational error, etc.). When outliers are expected, data may be better described using a heavy-tailed distribution such as the Student’s t-distribution.

Do temperature readings have outliers? Well, yes! A.Watts found that there were a suspicious number of high temperature values reported from arctic and similar cold locations and traced it back to the encoding used (where "M" is used for "minus", meaning a negative value). As one example, 30 C is quite an outlier from -30 C.
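A small Python sketch of what one dropped minus sign does to an average. The readings are made-up numbers for a hypothetical cold station, not real GHCN data; the point is only to compare how the mean and the median react to a single sign error.

```python
import statistics

# Hypothetical daily readings (deg C) for a cold station; illustrative values only.
true_readings = [-31.0, -29.5, -30.2, -28.8, -30.5, -29.9, -30.1]

# The same series with the sign lost on one value, as in the "M for minus" problem.
corrupted = list(true_readings)
corrupted[2] = 30.2

mean_shift = statistics.fmean(corrupted) - statistics.fmean(true_readings)
median_shift = statistics.median(corrupted) - statistics.median(true_readings)

print("shift in mean:  ", round(mean_shift, 2), "C")
print("shift in median:", round(median_shift, 2), "C")
```

One bad value out of seven moves the mean by several degrees while the median barely moves, which is the sense in which inference built on the mean is "not robust to the presence of outliers".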

So we have an immediate flag that perhaps the Standard Normal Distribution is "less than right"…

The next little problem I note in passing is that temperature data is not "randomly drawn" from a distribution. Temperature data is selectively drawn from population centers (that are themselves non-randomly distributed, being biased to valley floors, flat plains, and places where water and land meet such as bays, harbors, and major rivers. More recently, as cities form where 2 or more modes of transportation intersect, we have more cities around airports, sea ports, and where railroads intersect those.) At present, the majority of our land temperature data is drawn from airports. By definition, airports are selectively located where weather conditions are most conducive to regular and safe operations.

All those kinds of "issues" have already been explored (and the temperature data found skewed and wanting due to them – Urban Heat Island effect and Airport Heat Island effect), though not in the context of their statistical validity. Here we start to note that perhaps the location selection bias itself might invalidate the statistical assumptions that underlie the use of an extended averaging process to increase the precision of temperatures. (Presently reducing a 1 F minimum error band in the raw data to 1/100 C of precision in the mean. IMHO False Precision; for just these kinds of reasons.)

While I’ve explored in prior postings the point that such an averaging process can remove random error in measurements but can NOT remove systematic error (such as the systematic error introduced with the conversion to the MMTS thermometers: both changing location to be closer to buildings due to the need for power and communications cables, and the “adjustment” to “remove a cooling bias” that was really locking in a slow warming trend from aging of paint on the prior Stevenson Screens), I’ve not asked the more basic question of “is that use of the mean and the central limit theorem statistically valid?”
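The random versus systematic error point can be sketched in a few lines of Python. Everything here is an assumption for illustration: the "true" temperature, the size of the random reading error, and the 0.3 C offset standing in for something like a siting change are invented numbers, and `mean_error` is a hypothetical helper.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

TRUE_TEMP = 15.0  # hypothetical true value, deg C
NOISE_SD = 0.5    # assumed random reading error, deg C
BIAS = 0.3        # assumed systematic offset, deg C

def mean_error(n, bias):
    """Error of the mean of `n` noisy readings that share the same offset `bias`."""
    readings = [TRUE_TEMP + bias + rng.gauss(0.0, NOISE_SD) for _ in range(n)]
    return statistics.fmean(readings) - TRUE_TEMP

for n in (10, 1000, 100000):
    print(n, round(mean_error(n, 0.0), 4), round(mean_error(n, BIAS), 4))
```

With more readings the unbiased column shrinks toward zero, while the biased column settles on the offset itself. Averaging more data just measures the bias more precisely.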

This shows that there are already many reasons to doubt that the mean of temperature measurements is used in a valid way. But might there be more?

What if “mean” is undefined?

While looking into some things involving IR and “backradiation”, I was looking up pressure broadening and the nature of the distribution of energy between different species (such as water, ozone, nitrogen oxides, carbon dioxide, sulphur oxides) in the air. During that, I ran into the point that the broadening curve is NOT a standard normal distribution. Instead, it is a different kind of distribution. One for which the mean is not defined.

https://en.wikipedia.org/wiki/Cauchy_distribution

Cauchy / Lorentz Distribution


Its importance in physics is the result of its being the solution to the differential equation describing forced resonance. In mathematics, it is closely related to the Poisson kernel, which is the fundamental solution for the Laplace equation in the upper half-plane. In spectroscopy, it is the description of the shape of spectral lines which are subject to homogeneous broadening in which all atoms interact in the same way with the frequency range contained in the line shape. Many mechanisms cause homogeneous broadening, most notably collision broadening, and Chantler–Alda radiation.

OK…

So pressure broadening, which we know is driving the nature of the interaction between species and the distribution of the infrared (and all other photons too) is not a standard normal distribution. It is a Cauchy Distribution. As all of the AGW thesis rests on the notion of IR redistributing heat via just this physics of molecules, I think “this matters” to the basic assumptions of AGW.

The Cauchy distribution, named after Augustin Cauchy, is a continuous probability distribution. It is also known, especially among physicists, as the Lorentz distribution (after Hendrik Lorentz), Cauchy–Lorentz distribution, Lorentz(ian) function, or Breit–Wigner distribution. The simplest Cauchy distribution is called the standard Cauchy distribution. It has the distribution of a random variable that is the ratio of two independent standard normal random variables.
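That last sentence of the quote is easy to check numerically. The sketch below (my own illustration; the helper name `ratio_of_normals` and the sample size are arbitrary choices) divides one standard normal draw by another and compares the result with two known properties of the standard Cauchy distribution: quartiles at -1, 0 and +1, and a tail probability P(|X| > 10) = 1 - (2/pi) atan(10), roughly 6%.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

def ratio_of_normals(rng):
    """Ratio of two independent standard normal draws."""
    denom = rng.gauss(0.0, 1.0)
    while denom == 0.0:  # guard against a (practically impossible) exact zero
        denom = rng.gauss(0.0, 1.0)
    return rng.gauss(0.0, 1.0) / denom

samples = [ratio_of_normals(rng) for _ in range(100000)]

q1, q2, q3 = statistics.quantiles(samples, n=4)
tail = sum(1 for x in samples if abs(x) > 10) / len(samples)
cauchy_tail = 1 - (2 / math.pi) * math.atan(10)  # exact value for a standard Cauchy

print("quartiles:", round(q1, 2), round(q2, 2), round(q3, 2))
print("fraction beyond +/-10:", round(tail, 4), "vs Cauchy", round(cauchy_tail, 4))
```

For comparison, a normally distributed value more than 10 standard deviations from its mean is something you would essentially never see, so the heavy tails are the visible signature here.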

So here we see the kinds of things that cause this sort of distribution, and why they are so common in physics. The question to ask is simply: Are temperatures a function of two (or more) independent random variables?

I would assert “yes”. (Though the question of ‘are they standard normal’ I leave open…)

I was in Dallas once when it moved some 50 F in one day. A cold front from Canada swept over us. It is pretty well accepted that air motion is chaotic. While, in the short run, we can make modest weather predictions, the randomized chaotic nature means that the ability to forecast rapidly breaks down over a few days to weeks. So, as temperature data in GHCN is a ‘monthly mean’, there is one factor that already has a strongly random component in the ‘less than a month’ scale. Then there is cloud cover. Any given patch on the ground on any given day may have, or not have, a cloud overhead. IMHO that’s a second largely random term. The existence of just these two, IMHO, raises grave doubts that the temperature data are a standard normal distribution.

Now, layered on top of that, there are several non-random processes that change the data as well. There are seasonal cycles. There is the day / night cycle. There are ocean cycles and tidal cycles. Many of those push the weather to display what looks like “stochastic resonance”. And what was one of the properties typically leading to a Cauchy distribution? “Forced resonance”. IMHO there is a whole can of worms to open just in this one space. We know that there are resonance effects all over the weather / climate / ocean systems; and that those show up as temperature changes. To what extent do these matter? We don’t know. While I would assert they are THE dominant force changing temperatures, that’s just a naked assertion. Yet the ENSO / AMO / PDO and many other “Modes” and “Oscillations” of the air and water are main drivers of temperature changes over the scale of years to decades (and perhaps to centuries for longer cycles).

What does THAT mean? To me, it makes it highly likely that the temperature change distribution follows a Cauchy type distribution and highly unlikely that they follow a Standard Normal type distribution. That, then, brings up one small point about some of the other kinds of distributions. (A point that I vaguely remembered being mentioned in the first few weeks of my statistics classes just before we promptly “moved on” to the Standard Normal Distribution and never really looked back other than a brief dip into the Student-T distribution… but only remembered AFTER being reminded by this text…)

The Cauchy distribution is often used in statistics as the canonical example of a “pathological” distribution. Both its mean and its variance are undefined. (But see the section Explanation of undefined moments below.) The Cauchy distribution does not have finite moments of order greater than or equal to one; only fractional absolute moments exist. The Cauchy distribution has no moment generating function.
[…]
Because the integrand is bounded and is not Lebesgue integrable, it is not even Henstock–Kurzweil integrable. Various results in probability theory about expected values, such as the strong law of large numbers, will not work in such cases.
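What "the law of large numbers will not work" looks like in practice can be sketched as follows. This is an illustration on synthetic draws only, not temperature data; the helper names are hypothetical, and the Cauchy draws use the standard inverse-CDF form tan(pi * (u - 1/2)). For each sample size it reports the interquartile range of many sample means, once for normal draws and once for Cauchy draws.

```python
import math
import random
import statistics

def draw_cauchy(rng):
    """Standard Cauchy draw via the inverse CDF: tan(pi * (u - 1/2))."""
    return math.tan(math.pi * (rng.random() - 0.5))

def draw_normal(rng):
    """Standard normal draw."""
    return rng.gauss(0.0, 1.0)

def iqr_of_means(draw, rng, n, trials=500):
    """Interquartile range of `trials` sample means, each taken over `n` draws."""
    means = [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(trials)]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1

rng = random.Random(7)  # fixed seed so the run is repeatable
for n in (1, 100, 1000):
    print(n,
          "normal:", round(iqr_of_means(draw_normal, rng, n), 3),
          "cauchy:", round(iqr_of_means(draw_cauchy, rng, n), 3))
```

The spread of the normal means shrinks like 1/sqrt(n). The spread of the Cauchy means stays near 2 however many values are averaged, because the mean of n standard Cauchy draws is itself standard Cauchy. Averaging more readings buys no precision at all in that case.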

Oh Dear! “Pathological” and “undefined” are not things you want to hear about the statistics that underlie the entire edifice of a Global Average Temperature statistic.

For me, this raises (confirms?) some strong doubts about the validity of the statistical manipulation done to the temperature data. All over the place it gets adjusted, homogenized, in-filled (interpolated / extrapolated / whatever) and anomalized and has precision extended out to 1/100 C in ways that look to me like they depend on assumptions about the kind of distribution in the data (in particular, Standard Normal) that are unwarranted.

But that’s as far as I can take this muse. My statistics background ended with Student-T in an undergraduate course. This would take a much more skilled statistician who knows these other distributions and what can be done with them “cold”. That’s not me.

So if you happen to know a good skeptical statistician (or be one!), I think this would be a good “Dig Here!”. To simply ask “Are the necessary prerequisites for using means and averages present in the actual distribution?” and “Do we even know what the distribution might be?” To me, on the limited evaluation (example above) that I’ve done, temperature data looks like it is not a standard normal distribution type; and it looks like the “climate science” processes I’ve seen just assume that it is one. An assumption that is highly likely to be in error, IMHO.

About E.M.Smith

A technical managerial sort interested in things from Stonehenge to computer science. My present "hot buttons" are the mythology of Climate Change and ancient metrology; but things change...
This entry was posted in AGW Science and Background.

31 Responses to Do temperatures HAVE a mean?

  1. EM – thanks. I’m not a statistician of any sort, but I had noticed that they were assuming a standard distribution. I’d thought it was just my lack of knowledge.

    Temperature of a gas, at its base, is simply a measure of the kinetic energy. That is, the velocity changes as the square root of the temperature. As regards radiation, a gas will radiate as the fourth power of the temperature. For convection and conduction, things get more complex. I noted yesterday the ground temperature and the difficulty in deciding which number to take as the temperature.

    The mean (or average) temperature would, I think, depend on what you want to use it for. If you want to know average velocity of the air molecules, you get one number, but if you want to know how much IR it will radiate you get another. For average amount of energy available in the air, you get a third, but there’s also a fourth average energy that depends on the composition of the air (how much water vapour, for example) that tells you how much energy it can deliver or take up.

    Possibly a bit meandering there, but just taking the temperature over some time and then taking an average of it just doesn’t give enough information if you want to be precise about energy flows.
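The "which average?" point in the comment above can be put in numbers with a generalised (power) mean. This is a sketch under simple assumptions: two equal-sized patches at made-up temperatures of 250 K and 310 K, molecular speed taken as proportional to sqrt(T), and emission taken as proportional to T**4 as the comment states. The function name `power_mean` is my own.

```python
def power_mean(values, p):
    """Generalised mean: (mean of v**p) ** (1/p), for positive values and p != 0."""
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)

temps_k = [250.0, 310.0]  # hypothetical: one cold and one hot patch, in kelvin

t_arith = power_mean(temps_k, 1)    # plain arithmetic mean
t_speed = power_mean(temps_k, 0.5)  # single temperature giving the same mean sqrt(T)
t_rad = power_mean(temps_k, 4)      # single temperature giving the same mean T**4

print("arithmetic mean:       ", round(t_arith, 1), "K")
print("speed-equivalent mean: ", round(t_speed, 1), "K")
print("radiation-equivalent:  ", round(t_rad, 1), "K")
```

The three "averages" differ by several kelvin for the same two patches, and the ordering (speed-equivalent below arithmetic below radiation-equivalent) always holds when the temperatures are unequal.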

  2. Ben says:

    Posted a link to this at
    http://wmbriggs.com/blog/
    If he has the time I am sure he could answer many of your questions.
    Have a nice day

  3. John Robertson says:

    Pathological distribution, brilliant. Even if it turns out to be a minor problem (don’t know stats blind) the term is perfect for the team as self portrayed (CRU emails). Long been uncomfortable with the information value of the averaged daily temperature. Same with global average.
    Given Wegman’s conclusions and Steve McIntyre & Climate Audit crew’s evisceration of so much of the key team members’ statistical ability, I sense you are on target again.

  4. jim2 says:

    I think a better question is can you calculate climate sensitivity from paleo records. When there is a lot of ice, in a glacial, the climate sensitivity is very low. In an interglacial, it is higher. Obliquity will affect sensitivity. Calculating sensitivity from an agglomeration of these various scenarios does not seem like the right way to go about it. Sensitivity at any given time will depend on where we are in the larger cycles including the physical configurations that drive the Milankovitch cycle.

  5. Don Matias says:

    If the climate scientists claim that their temperature data obey the normal distribution let us run some tests on their data set(s):

    “Hypothesis tests give quantitative answers to common questions, such as how good the fit is between data and a particular distribution, whether these distributions have the same mean or median, and whether these datasets have the same variability. Mathematica provides high-level functions for these types of questions and will automatically select the tests[*] applicable for the data and distributions given. The high-level functions typically run more than one test and are able to produce full reports, but there are also specific named hypothesis tests such as the Kolmogorov-Smirnov goodness-of-fit test, or paired t-test. These give more direct control over settings and performance for specific tests.”

    *) “Goodness-of-Fit Tests
    DistributionFitTest — test for goodness-of-fit to a distribution of data
    LogRankTest — test whether hazard functions are equal
    AndersonDarlingTest ▪ CramerVonMisesTest ▪ JarqueBeraALMTest ▪ KolmogorovSmirnovTest ▪ KuiperTest ▪ MardiaCombinedTest ▪ MardiaKurtosisTest ▪ MardiaSkewnessTest ▪ PearsonChiSquareTest ▪ ShapiroWilkTest ▪ WatsonUSquareTest”

    Cf.: http://reference.wolfram.com/mathematica/guide/HypothesisTests.html

    If someone has MATHEMATICA or “R” – http://www.r-project.org/ – _and_ the data the above claim could easily be verified – or falsified.

    (MATHEMATICA by WOLFRAM RESEARCH is my favorite toy: http://www.wolfram.com/mathematica-home-edition/ )
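For anyone without Mathematica or R, one of the tests in that list (Jarque-Bera) is simple enough to sketch in plain Python. This is a hand-rolled illustration, not the Mathematica `JarqueBeraALMTest` function: it uses the textbook statistic JB = n/6 * (S^2 + (K - 3)^2 / 4) with S the sample skewness and K the sample kurtosis, and the large-sample chi-square approximation with 2 degrees of freedom, whose survival function is exp(-JB/2). The data are synthetic, since no temperature series is given in this thread.

```python
import math
import random
import statistics

def jarque_bera(xs):
    """Return (JB statistic, approximate p-value) for the hypothesis of normality."""
    n = len(xs)
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    z = [(x - m) / s for x in xs]
    skew = statistics.fmean([v ** 3 for v in z])
    kurt = statistics.fmean([v ** 4 for v in z])
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # chi-square survival function, 2 degrees of freedom
    return jb, p_value

rng = random.Random(3)  # fixed seed so the run is repeatable
normal_data = [rng.gauss(10.0, 2.0) for _ in range(5000)]
heavy_data = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(5000)]  # Cauchy draws

jb_normal, p_normal = jarque_bera(normal_data)
jb_heavy, p_heavy = jarque_bera(heavy_data)
print("normal sample: JB =", round(jb_normal, 2), " p =", round(p_normal, 4))
print("cauchy sample: JB =", round(jb_heavy, 1), " p =", p_heavy)
```

Run against a real station series in place of the synthetic lists, a very small p-value would count as evidence against normality. A large p-value would only mean this particular test found nothing.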

  6. E.M.Smith says:

    @Jim2:

    My “sense of it” from looking at a lot of paleo charts and history, is that the very very long term trend is relentlessly down. I think this is partly due to the earth generating less internal heat (we are using up our fissionables) and the way the continents have arranged over many millions of years. (Antarctic and an enclosed Arctic with the equator cut off at Panama and Africa).

    Inside that, it looks like there is a floor of stability at ‘almost frozen’ where the equator doesn’t quite freeze, and another stability point at ‘melted’ where thunderstorms / hurricanes prevent going much over 30 C in large areas. In between, we wobble between those two based on orbital changes. At present, with the present continent layout, we only get Just The Right Conditions for an interglacial for about 10,000 out of each 120,000 years (or so). So I think you need to find ‘sensitivity’ number for “ice world” and for “interglacial” as distinct things. (Hold orbital mechanics constant, at two range points; and for specific land positions relative to the oceans.)

    What’s very clear is that it stops to the upside at about where we are now. No matter what. (To the downside is less clear, and the world might have had an ‘iceball earth’ stage with NO open water at one point… so if you can’t live below ice…)

    My guess is that it is an S shaped hysteresis curve with stability (low climate sensitivity) at lots of ice and at no ice on the Arctic. (As soon as multiyear ice forms and stays at the Arctic, we are on our way into the next glacial… The folks demanding the return of multiyear ice and permanent ice pack have no idea how the system works…)

    @John Robertson:

    Yeah, I liked the term too ;-) Gave me a new appreciation of statisticians ;-)

    @Simon:

    Good points. It’s that kind of “what is happening under the math” which tends to be forgotten by a lot of folks, and it matters.

    @Ben:

    Thanks! Here’s hoping he has time to take a look.

  7. jim2 says:

    Hysteresis curve makes sense. I’m just tired of the warmists touting the sensitivity as determined from paleo records.

  8. Richard Hill says:

    EM, when this topic was brought up with a Climate Scientist it was explained that CliSci’s work with anomalies, NOT temperatures, so the concern was unjustified. Topic explored in depth on Lucia’s Blackboard blog.

  9. E.M.Smith says:

    @Richard Hill:

    The problem is that the ASSERTION of using anomalies is in fact not true.

    The GHCN is Monthly average temperature per location. Made by an average of temperature data, not anomalies.

    The GIStemp code keeps the GHCN (and some USHCN) temperature data as temperatures all the way to the end, when in the very last step they make “grid / box” anomalies.

    They don’t even make a real temperature anomaly (a thermometer in one time period vs itself in another time period) but a comparison of a ‘grid / box’ average temperature in the baseline period with that same ‘grid / box’ in the present. One small problem…

    Depending on what era of GIStemp you run, they have either 8,000 or 16,000 ‘grid / boxes’ but only about 1280 currently active thermometers in GHCN. So it is mandatory that most of the ‘grid / boxes’ being compared one to the other over time are devoid of any thermometer and filled in via their averaging math… that is done on temperatures from actual thermometers….

    So yes, I’m sure the good folks at Lucia’s have admired to no end how wonderful anomalies are in a theoretical way. They just ignore what is actually done in the codes…. (A large part of why I don’t hang out there, BTW. A whole lot of ‘angels and pins’ and not a lot of ‘hammers and nails’…)

    Yes, I know, very ‘geeky’ of me. To have actually looked line by line at what is really done. I’m that kind of person…

    But just start with the GHCN description of the “monthly average temperature” in C and work forward from there…
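The grid / box arithmetic in the comment above works out as follows. The sketch uses only the counts quoted there (about 1280 active thermometers, 8,000 or 16,000 boxes) and makes the most generous assumption possible, that no two thermometers share a box; `min_empty_fraction` is a hypothetical helper name.

```python
def min_empty_fraction(boxes, stations):
    """Smallest possible share of boxes with no station (at most one station per box)."""
    return 1.0 - min(stations, boxes) / boxes

THERMOMETERS = 1280  # approximate count of active GHCN stations quoted above

for boxes in (8000, 16000):
    share = min_empty_fraction(boxes, THERMOMETERS)
    print(boxes, "boxes: at least", round(100 * share), "% hold no thermometer")
```

Since real stations cluster, the true share of empty boxes can only be higher than these floors.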

  10. omanuel says:

    Do global temperatures HAVE a mean? Probably not a meaningful one.

    The temperature is different in different parts of the globe and the weighted mean would require information on the amount of the total globe at each temperature.

    Nobody has that information. Therefore the mean global temperature is an illusion.

    – Oliver

    PS – Postmodern science is Maya: http://omanuel.wordpress.com/about/#comment-1883

  11. Dr K.A. Rodgers says:

    But we have long moved beyond postmodern science in this area to postnormal.

  12. omanuel says:

    @Dr K.A. Rodgers

    Right! We moved further from reality, deeper into Maya (illusion).

    The return from Maya to reality will be painful, but it is inevitable. Modern politicians are now calling economic reality the fiscal cliff.

  13. Reblogged this on contrary2belief and commented:
    So glad I looked here today.
    And OMG … I was, just last night, trying to get my head wrapped around Breit–Wigner in another matter altogether.

  14. R. de Haan says:

    Our friend Burt Rutan has a nice presentation about global temps at wuwt tv. It really says it all.

  15. R. de Haan says:

    In the meantime: Californians threatened with double taxation thanks to carbon taxes http://dailycaller.com/2012/12/09/californians-could-face-double-taxation-with-state-federal-carbon-taxes/ Good luck with that.

  16. adolfogiurfa says:

    What SAVES temperature in the world is WATER….dip a finger and you’ll know it. It has the greatest heat capacity.

  17. ombzhch says:

    C. clear, and a bit of very useful lateral thinking … having found your blog, I am hooked by the ‘outlier’ insights also all the Math is spot on!
    Thanks,
    MFG, omb

  18. DirkH says:

    Can’t be normal distribution. Remember Willis Eschenbach’s Thunderstorm Governor hypothesis? Wherever there’s enough water an upper limit forms from which thunderstorms work to rapidly move heat upwards. Hysteresis must be the result. (jim2 is right) And a limit to the right side. Of course, only where thunderstorms actually form (varies with availability of water)

  19. Jason Calley says:

    @ DirkH “Can’t be normal distribution. Remember Willis Eschenbach’s Thunderstorm Governor hypothesis? Wherever there’s enough water an upper limit forms from which thunderstorms work to rapidly move heat upwards. ”

    Ooohhh. Yes, good point. In the same way, we know that the distribution of temperatures in far north or south is not normal. The phase change from ice to water flattens the top of the curve just as near the tropics the phase change (and associated side effects) from water to vapor flattens the curve there. Not normal.

  20. Espen says:

    DirkH: Exactly, the distribution is obviously skewed. For many skewed distributions, the median is a better estimate of the expected value than the mean. For satellite data where we have reasonably gridded data without too much fill-in, computing the median could even make some sense. But we don’t escape the more fundamental problem that temperature anomalies in the Arctic and the tropics are like pears and oranges – they can’t really be compared because they correspond to quite different enthalpies.

  21. omanuel says:

    Good news from an unexpected source: Nature (11 Dec 2012) acknowledges lack of public support for two government “settled science” dogmas:

    1. AGW Campaign Will Vanish on 1 Jan 2013:

    “On 1 January 2013, the world can go back to emitting greenhouse gases with abandon. The pollution-reduction commitments that nations made as part of the Kyoto Protocol will expire, leaving the planet without any international climate regulation and uncertain prospects for a future treaty. Nature explores the options for limiting — and living with — global warming.” http://tinyurl.com/amzrgyv

    2. Chinese Officials Fired for Promoting Genetically Modified Rice:

    “China has sacked three officials for breaching Chinese laws and ethical regulations during a trial in which children were fed genetically modified rice.” http://tinyurl.com/ahue2e6

    East and West, the public is fed up with tyrannical government science.

  22. Pingback: Tropopause Rules | Musings from the Chiefio

  23. Chuckles says:

    Some thoughts on similar matters

    Click to access s8863.pdf

  24. gareth says:

    Interesting post. I too was going to point you to William Briggs but I see that I’ve been beaten to it by Ben a couple of days ago. I think you might have an interesting discussion – I’ve noticed before that information arises at the meeting of different disciplines.
    Do temperatures have a mean? What’s the average temperature of a pot of tea at 90C and some milk at 5C? One hand clapping ??

  25. paulbaer says:

    At Dirk H’s suggestion on WUWT, I looked over this posting on the statistics of global mean temperature. As is typical on “skeptic” blogs, it shows that a little knowledge is a dangerous thing.

    There are several points here which are true, some trivially, some non-trivially:
    1) ANY reported temperature statistic is a “construction,” not a natural “fact.” This is true at the most basic level (how many decimal points did you choose to record, how accurate is your thermometer, what area are you defining your temperature to measure) to the conventional (average monthly temperature in Wichita in June) to the speculative (global mean surface temperature during the last ice age).
    2) There is no reason to think that every distribution of empirical measurements will be normal; the normal distribution is a good theoretical distribution for parameters whose variability is composed of numerous small and independent effects (classical “measurement error” in a laboratory experiment, or the variation in biological characteristics of an organism based on a wide range of genetic/environmental influences – hence the “bell curve” for IQ, etc.). For parameters generated by processes with other characteristics, there are other distributions.
    3) Calculating the average surface temperature of the earth requires powerful assumptions, in part (but not only) because of the spotty distribution of thermometers on the earth’s surface.

    But the point is, don’t you think that the people who study such things know this? Our host here was able to find a great deal of relevant information with a few hours and an Internet browser. Wouldn’t you think that people who have spent their entire careers considering such matters, and have been through incredibly rigorous training and competitive screening, would know more than a “technical managerial sort” who likes to blog on the subject?

    And, BTW, what ever made anyone think that since climatologists like to refer to global mean surface temperature, they were assuming that ANY distribution used in its calculation is normal? EVERY DISTRIBUTION HAS A MEAN.

    Paul Baer
    Assistant Professor (and teacher of undergraduate statistics)
    School of Public Policy
    Georgia Inst. of Technology

  26. E.M.Smith says:

    @Paulbaer:

    I note up front your tendency to snideitude and appeal to authority. You will find that neither of those is very useful and they are not going to serve you well here. However, your tendency to indulge in logical fallacies is noted…

    Yes, ANY collection of numbers has a mean. But not all means have a meaning. So the title of the article is ‘eye catching’, and it is short. (That’s what titles are, and what they are to do). Yes, I could have said “valid mean” (but that’s no better, really) or even “non-pathological mean” (but that’s too lumpy for a title) or even “defined mean” or a dozen other things that are not very good as a ‘tease’. (And also not informative of the actual problem. That the mean may well be pathological).

    To quote what you seem to have skipped over, about a non-standard normal curve:

    The Cauchy distribution is often used in statistics as the canonical example of a “pathological” distribution. Both its mean and its variance are undefined.

    Note that the mean is undefined. For most folks, that’s the functional equivalent of “does not exist” as far as your ability to use it in computations that are to have meaning.

    One presumes you chose to ignore that point “for effect”, given the snippy and appeal to authority nature of your comment. Please don’t. It just makes you look a bit slow…

    So, to your points:

    1) That temperatures are a construction doesn’t make the method right. That’s the point of asking the question.

    2) Just recapitulates the point I was making. That there is no evidence that temperatures follow a standard normal distribution, and some reason to expect they do not. Once one realizes that, you’re in the land of “undefined” as a potential outcome…

    3) I’ve looked at a lot of those “Powerful assumptions” and found them very wanting. Much of it looks like science of the form “Given these conclusions what assumptions can I draw”. Not a very desirable point you have there, that those assumptions are “powerful”. Especially when they are the wrong ones to draw.

    Then you ask “don’t you think that the people who study such things know this?”

    In some cases, no, they don’t. In other cases, they may well, and be ignoring it. Hard to tell. Given what was seen in the Climategate Emails, it’s pretty clear they were quite willing to ignore inconvenient truths, cherry pick for effect, lie, manipulate, and conduct “outcome directed” research. Heck, Hansen even likes to get himself arrested for his “cause”. So no, I have ZERO trust that they know, care, or if they did know, would not bury it.

    Then again, that just isn’t relevant, now is it? Science doesn’t depend on what you like, don’t like, want, or expect. It depends on what can be proven, with open disclosure of data, methods etc.

    You then launch an insult against me, the host. I strongly recommend you read the “about” box up top. Attacks “to the person” are not a good idea. The desired atmosphere is “pool side party”. One usually doesn’t insult the host at a party. At least not more than once…

    Per the logical fallacy of your “Appeal To Authority”: Aside from also being a waste of space and showing poor reasoning skills, it’s just a bad idea. The folks who hang out here are bright enough to see that as a dumb move on the face of it.

    But, to answer the point: No, I think most Climate Scientists have spent very little of their careers thinking about statistics. They learned enough to use them, then moved on. I think they had an undergrad class in statistics, learned how to do standard deviations and few other tricks, probably how to handle Student-T, and then went off to other classes. Given what a professional statistician did to illustrate their dumbness about stats (i.e. The Most Powerful Tree in the World and Mannian upside down series) I suspect they didn’t even remember their one stats class very well.

    Per their “rigorous training” and “competitive screening”: Don’t make me laugh. Really. I’ve taught in a college. I’ve been through many screening processes. Near as can be seen from the evidence, a lot of “Climate Science” has degenerated into Pal Review and a political screening. We have folks, with a straight face, saying Global Warming is going to make the world hotter and colder, wetter and drier, that it will threaten PASTA, and that both the Little Ice Age and MWP didn’t exist. If these folks have been “competitively screened” it is for the ability to suspend disbelief and tell whoppers with a straight face.

    Yes, getting a Ph.D. is a lot of work. In some fields, it even requires a lot of ability. Take Chem. E. or even Aeronautics (which my brother in law got) for example. But looks like nearly anyone can walk in and call themselves a Climate Scientist. I’ve also looked closely at the curriculum for several “green” majors at some schools (when the kids were shopping for a college). Whole lot of feel-good fuzzy, not a lot of thinking… Then there’s the problem of the politicization of “Climate Science”. It just reeks. Once a field is politicized, you can kiss off trust in the paper.

    Yeah, that is a long way to say “Paper doesn’t mean much”. (Hey, I have a load, so it can’t mean much ;-) I’ve lost track, but it’s somewhere around 7 or 8 letters I can string after my name.) It’s helpful, but not definitive. (Some of the best folks I’ve hired had little. Some of the least effective had Ph.D. after their names.) So I don’t de-facto denigrate it. It can indicate a lot about raw ability, or just determination. But it says nothing about ethics, about sound reasoning skills, about ‘common sense’, nor about the willingness to politicize. Nor does it say they won’t be lazy, and forget to use the one stats class they had before studying “Green Subjects” for the next 6 years.

    Now, per trying to denigrate me with quoting my self description at me “technical managerial sort”. And using “blog” as a semi-put-down. Noted. Says a lot about your character. BTW, I’m prone to modesty and NOT a lot of “ego”. But realize this: I qualify for Mensa (easily) and my personality profile has been used to select every Shuttle Astronaut. (NASA study to find out what indicates “has the right stuff” along with “plays well with others”. I’m one of 9 who form the composite model.) So you are wasting your time with the put downs. I’m certified sane, smart, and ‘works well with others’. By NASA no less, so it must be right ;-)

    Finally, per the ‘assumption that they thought it had a normal distribution’: Well, show me where they state that it has some other distribution. I’ve been through every scrap of the GIStemp code. Never says boo about non-standard distribution. Does a lot of undergrad level math that looks like it expects things to be standard normal. And, as we saw above, if the distribution is not STATED, then it could well be Cauchy, and thus pathological.
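
    A minimal sketch of why an unstated distribution matters, assuming NumPy is available (the seed, sample size and variable names are illustrative choices, not anything taken from GIStemp): the running mean of standard normal draws settles toward zero, while the running mean of standard Cauchy draws never settles, because the mean of n standard Cauchy draws is itself standard Cauchy no matter how large n gets.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
n = 100_000                     # illustrative sample size

normal_draws = rng.standard_normal(n)
cauchy_draws = rng.standard_cauchy(n)


def running_mean(x):
    """Mean of the first k values, for every k from 1 to len(x)."""
    return np.cumsum(x) / np.arange(1, len(x) + 1)


normal_rm = running_mean(normal_draws)
cauchy_rm = running_mean(cauchy_draws)

# Print the running means at a few checkpoints to compare how they behave.
for k in (100, 1_000, 10_000, 100_000):
    print(f"n={k:>6}  normal mean={normal_rm[k - 1]:+.4f}  "
          f"cauchy mean={cauchy_rm[k - 1]:+.4f}")

# The median stays well behaved even for the heavy-tailed Cauchy draws.
print("cauchy median:", np.median(cauchy_draws))
```

    The sample median of the Cauchy draws stays close to zero even though the sample mean wanders, which is why a median is often the safer summary when heavy tails are suspected.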

    Finally, I note that you offer NO evidence, data, facts, reference, link, anything. Just name calling, appeal to authority logical faults, and denigration / attack the messenger. You know, after a half dozen years of it, I’m still surprised that that’s the typical Modus Operandi of “Warmers”. I keep hoping y’all will learn that it doesn’t work well and makes you look kind of stupid.

    But by all means, keep working on your paper trail, and maybe someday you can get an upgrade from “Assistant” to real professor…

    (FWIW, I don’t like being mean, nor using Sniditude. Frankly, that’s why the “about” box says not to do it. Part of my basic style is to ‘be a mirror’. That means that when you are mean and ‘attack the person’, I defend in equal style and force. As that’s not very nice, I don’t really like it. So please, do try to address real, technical issues, and drop the “attack the messenger” and insults “to the person”. IF it continues, the first step is you go to the moderation queue. The step after that is you get heavily edited and chastised. The next step is a ‘Carping’ posting – make an example. Eventually you end up on the pitch list if you can’t learn to be nice and speak to real issues and not ‘attack the person’.)

    Now, you have anything to say that has relevance and content? Like maybe some evidence that temperatures are any particular distribution that does NOT have a pathological mean?

  27. paulbaer says:

    @E.M. Smith: You’re correct, and I do owe you an apology. My sniditude was definitely unwarranted. In my (not very convincing) defense, I came here following a link from WUWT, where sniditude (mostly directed towards the community I am part of) is the currency of the realm. And of course much of what you say about the lack of correspondence between credentials and correctness is plainly true.

    On content, a few points.

    First, early on, you ask whether the siting of thermometers in the Sahara or Antarctica reflects the “mild conditions” that are required for the central limit theorem to hold. The meaning of “mild conditions” in this context is not about the conditions in which the data was collected, but rather about the conditions (assumptions) about the data which have to be true. Notably, for the central limit theorem to hold, the underlying data does NOT have to be normally distributed.
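
    A small sketch of that last point, assuming NumPy (the exponential distribution, sample size of 50 and seed are arbitrary illustrative choices): the raw draws are strongly skewed, yet the means of many independent samples are close to symmetric and have the spread the theorem predicts. The classical form of the theorem does assume independent draws with finite variance, which is the kind of “mild condition” meant here.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the run is repeatable
sample_size = 50                # values averaged per sample (illustrative)
n_samples = 20_000              # number of independent sample means

# Exponential draws with mean 1 and standard deviation 1: clearly not normal.
draws = rng.exponential(scale=1.0, size=(n_samples, sample_size))
sample_means = draws.mean(axis=1)


def skewness(x):
    """Third standardized moment; zero for a symmetric distribution."""
    centered = x - x.mean()
    return (centered ** 3).mean() / x.std() ** 3


raw_skew = skewness(draws.ravel())
mean_skew = skewness(sample_means)
predicted_sd = 1.0 / np.sqrt(sample_size)  # sigma / sqrt(n) with sigma = 1

print(f"skewness of raw draws:    {raw_skew:.3f}")
print(f"skewness of sample means: {mean_skew:.3f}")
print(f"sd of sample means: {sample_means.std():.4f} (predicted {predicted_sd:.4f})")
```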

    Second, you ask whether the data has outliers, and therefore whether temperatures might not be well described by the normal distribution. “Outlier” has two meanings, however: one, data which is suspect because it would not be found in a normal distribution; and the other, data which is suspect but must be concluded to be correct and thus included. When you find outliers, generally you do exactly what Watts did, and look for (typically) data coding errors or other measurement mistakes. If you can’t eliminate any source of “measurement error” taken broadly, you need to appropriately take account of the outliers in your analysis, which may indeed mean making different assumptions about the underlying distribution. But again, whether the underlying distribution is standard normal or not does not in itself invalidate the use of the CLT.

    Third, there is a basic difference between the use of the mean in the measurement of a single data point (e.g., the temperature at a single point) and the mean of a spatial field. The calculation of global mean temperature does require assumptions that there are no systematic errors in the individual data points, and questions about urban heat islands are absolutely relevant to this question. Debates over the adequacy of the compensation for these effects will likely continue, but it’s not a new issue. In a problem like estimating global mean surface temperature (hereafter MST for short), as the author knows, the measurements from which MST is calculated are the estimated mean temperatures of a grid box (you say 1200 km but not whether that’s 1200 km on a side or 1200 km^2). Now, calculating the temperatures of those grid boxes involves extrapolations with errors which cannot be reduced simply to questions about thermometer presence or bias; as a practical matter, the real topography and vegetation of a grid cell would determine the relationship between temperatures at measurement points and in the extrapolated areas. But it is precisely at the grid-cell level that the validity of the assumptions would have to be examined. After that, taking the average of some number of grid cells is straightforward. It may not come from a normal distribution, and thus the error term may not be normally distributed, but to say that the mean doesn’t exist – or to place the burden of proof on those who say it does – seems like a stretch.
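
    As a sketch of what “taking the average of grid cells” can look like, assuming a regular latitude/longitude grid where cell area shrinks with the cosine of latitude (an assumption for illustration; the function name and the toy grid are hypothetical, and this is not the GIStemp procedure, which this thread does not spell out):

```python
import numpy as np


def area_weighted_mean(values, lat_centers_deg):
    """Average a (n_lat, n_lon) field, weighting each cell by cos(latitude).

    Cells holding NaN (no data) are left out of both the sum and the weights.
    """
    values = np.asarray(values, dtype=float)
    row_weights = np.cos(np.deg2rad(np.asarray(lat_centers_deg, dtype=float)))
    weights = row_weights[:, None] * np.ones_like(values)
    valid = ~np.isnan(values)
    return np.sum(values[valid] * weights[valid]) / np.sum(weights[valid])


# Toy field: one row of cells on the equator, one row at 60 degrees north.
lat_centers = [0.0, 60.0]
field = np.array([[0.0, 0.0, 0.0, 0.0],
                  [10.0, 10.0, 10.0, 10.0]])

weighted = area_weighted_mean(field, lat_centers)
unweighted = float(np.nanmean(field))
print(f"area-weighted mean: {weighted:.3f}, plain mean: {unweighted:.3f}")
```

    The two numbers differ because the cells at 60 degrees cover half the area of the equatorial ones, so even the “straightforward” step carries a weighting choice.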

    Ultimately, however, the question I’m concerned with is this: if you want to say that the MST is not well defined or not accurately measured, what are you suggesting we should be using instead as an indicator of global warming? The argument has the appearance of being yet one more reason to dismiss the broad evidence that such warming is occurring, is human caused, and poses major risks to the values we care about, because the measurement we have might somehow not be good enough.

  28. Paul Baer – One thing I have not seen, in any modelling effort, is to take a particular point or points in the past and the data associated with it (maybe 50, 100, 150 years ago, say) and to run the model with that start-point and thus predict what the temperatures and climate would be as of today. If there is such a study of the validity of the models, please point to it as an example of due diligence in the modelling process.

    If you go through EM’s work on various AGW topics, you will find that he has exposed glaring errors in both the gathering of the data and the mathematics of dealing with that data, and to me at least this makes the predictions based upon those calculations very questionable. Going through those posts, and understanding them, will take some time and may well change your mind.

    The role of Carbon Dioxide in the atmosphere is overplayed. If you look at the “greenhouse effect” which assumes that the gas is somehow kept where it is, it is obvious that there is a whole lot more effect from the massive transport of water vapour around the atmosphere. Each day, we have a variation in temperature at ground level, and this alone shows that the idea of “accumulation of temperature” is not valid – if it got hotter during the day, then at night, just before dawn, the temperature would be only slightly higher if the cloud cover were the same as the day before. You can measure this yourself, as I did a few days ago – blue sky temperature -35°C, clouds -2°C – which one will have a greater cooling effect? One climate model pointed to by EM a while back didn’t take into account the rotation of the Earth, thus the day/night cycle, and thus could accumulate temperature (energy) continuously – this does not happen and is obviously a bad model.

    About a year ago I accepted the Global Warming Because Of CO2 idea since it seemed to have a lot of authority behind it and I hadn’t looked at the data and assumptions – no need to since I can’t do much about it and I’m not a Climate Scientist. I even have Al Gore’s “Inconvenient Truth” on DVD. Since then I’ve read EM’s arguments and found them valid – here I can check the data, the code used to massage it, and the basic assumptions underlying the code. I thus no longer accept the IPCC “on authority”, and I don’t have to take EM’s work “on authority” since I have the capability of checking it myself – but he’s right.

    Is it reasonable to expect the Sun to be absolutely stable in its output when it is by nature a chaotic process? Variations of the Sun’s output by 1% will by definition change the global mean temperature by about 3 degrees Celsius (or Kelvin), unless there is a negative feedback from the water-cycle and biosphere, both of which appear to be doing this.
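
    For scale, a back-of-envelope sketch assuming the Earth radiates like a simple blackbody, so that temperature goes as the fourth root of absorbed sunlight, with an assumed mean surface temperature of 288 K (both are simplifying assumptions, not figures from the comment above). A straight 1% of 288 K gives about 2.9 K, which matches the “about 3 degrees” figure; the fourth-root scaling gives roughly 0.7 K before any feedbacks.

```python
# Assumed values for illustration only.
T_SURFACE_K = 288.0       # assumed global mean surface temperature, in kelvin
FRACTIONAL_CHANGE = 0.01  # a 1% change in solar output

# Linear scaling: temperature changes by the same fraction as the output.
linear_dT = T_SURFACE_K * FRACTIONAL_CHANGE

# Blackbody scaling: emitted power goes as T**4, so T goes as power**0.25.
fourth_root_dT = T_SURFACE_K * ((1 + FRACTIONAL_CHANGE) ** 0.25 - 1)

print(f"linear scaling:      {linear_dT:.2f} K")
print(f"fourth-root scaling: {fourth_root_dT:.2f} K")
```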

    As I see it, the scare over CO2 is bad science, based upon an invalid assumption and then compounded with invalid maths. We may not have a perfect explanation of why the climate changes, but water will have a massively greater effect so we really need to look at this more closely. Why clouds form and where they form is going to be important. Climate scientists have lost their authority by jumping on the AGW bandwagon, and need to recover it before it becomes obvious to the masses that they got it badly wrong. That gives them at most a decade to get the maths right.

  29. John Robertson says:

    Hi E.M. Richard Courtney has offered me an explanation of the Global Mean Temperature,
    @ WUWT Dr David Whitehouse Dec 18th, Richard Courtney 12:24pm.
    As I read his 2010 submission to the UK govt, he is indicating the mean is undefined, as each of the 3 key groups has a different mean and temperature trend and seems willing to change the mean without explanation and notice. As there can only be one average global temperature (if such is possible), 2 of those means are rubbish, and if I follow his logic it’s all GIGO.
    I think Richard covers most of the questions you raised but some has not sunk in yet, for me.
    What is it with climatology, that “I do not know” is so distasteful a sentence?
    @ Paul Baer 6:08, last paragraph, so if a line of resigning challenges the belief that we humans can estimate the mean global temperature in a meaningful way, we must shun the reasoning?
    As for direct linkages sorry I’m not competent on this machine.

  30. John Robertson says:

    Line of reasoning challenges. This apple is too smart for me.

  31. DrPat says:

    Essex, McKitrick and Andresen “J Non-Equilibrium Thermodynamics” (2006) have clearly demonstrated that there is no such thing as a meaningful global temperature in the context of global warming. Adding temperatures (intensive quantities) to get a mean is meaningless – unlike adding mass or volume (extensive quantities). That quite a few of my fellow physicists are involved in climate modelling and related activities that influence the allocation of billions of dollars in taxpayers’ funds, knowing from physics that mean global temperature is a meaningless concept, borders on the amoral and the obscene.
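
    To illustrate just the intensive versus extensive distinction (not the paper’s full argument), a sketch assuming two parcels of the same liquid with equal specific heat and no heat loss; the masses and temperatures are made-up numbers. Masses add, but the equilibrium temperature is the mass-weighted average, which differs from the plain arithmetic mean unless the parcels are equal in size.

```python
def mix(parcels):
    """Combine parcels given as (mass_kg, temp_c) pairs.

    Assumes equal specific heat and no heat loss, so the equilibrium
    temperature is the mass-weighted average of the parcel temperatures.
    Returns (total_mass_kg, equilibrium_temp_c).
    """
    total_mass = sum(mass for mass, _ in parcels)
    equilibrium = sum(mass * temp for mass, temp in parcels) / total_mass
    return total_mass, equilibrium


# Made-up example: a small cool parcel and a large warm one.
parcels = [(1.0, 10.0), (9.0, 30.0)]

total_mass, equilibrium_temp = mix(parcels)
arithmetic_mean = sum(temp for _, temp in parcels) / len(parcels)

print(f"total mass: {total_mass} kg")                   # masses simply add
print(f"equilibrium temperature: {equilibrium_temp} C")  # mass-weighted
print(f"arithmetic mean of temperatures: {arithmetic_mean} C")
```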

Comments are closed.