In an earlier posting I created an anomalies table (temperature – average(temperature) by month by station) and then produced some SQL reports from it. Those are rather long and a mess to read (though the details can be very interesting).
In this posting I’ve used Python3 to graph that same data, average anomaly by year, both in aggregate and by continent. Easier to read, but a bit less detail in some ways. (The eye can’t see 1/100 °C.)
I’d originally intended to make these line graphs, but the data are so scattered, especially in the early years, that they are better served by a scatter graph.
With that, here are the graphs.
Total of all data in GHCN v3.3
To me, it looks like nothing much happens in terms of a warming trend until suddenly in the late 1990s the cold excursions get trimmed. There’s a general flattening of range over the years as more thermometers come into the average (expected) and as the low excursions disappear (odd, that…)
We haven’t gotten particularly more hot excursions than in the past, but you can see the climb out of the Little Ice Age as the early very cold excursions end.
Region 1 – Africa
Nothing much happens until about the 1990s when BOOM! it shows warming – just after the electronic thermometers got rolled out.
Region 2 – Asia
Same story here, a big BOOM! in the 1990s, but perhaps also enhanced by some massive industrialization-driven UHI in China.
Region 3 – South America
This one actually looks like it has some trend over the years. Still doesn’t look significant to me, as it is mostly one data point of ‘rise’ at the far right that makes the tops higher. Most of the trend still seems to come from loss of cold excursions. And again we see wild swings in the early years with few thermometers, when one instrument can move the average a lot more. (Which raises the issue of changes to thermometer count suppressing excursions in the average…)
Region 4 – North America
With more thermometers than the rest of the world combined, the USA doesn’t seem to be any hotter than in the past. There is a loss of cold excursions in recent years, but “why” is not clear. Perhaps all that added asphalt at the airports with many of the thermometers along with all the snow removal and tons of kerosene being burned in the Jet Age?
Region 5 – Australia / Pacific Islands
Much like Africa and Asia, no hotter and not much at all going on, until the 1990s. Then a loss of cold excursions and general tilt up. As I recall it, we already had a LOT of CO2 inventory in the air by the 1990s, so IMHO this needs some other explanation.
Region 6 – Europe
Europe is just a mess. Highly volatile in the early years and the longest record of anyone (so the very earliest data is the first thermometer…) From about 1850 to about 1975 nothing much changes other than a bit of reduced volatility with higher thermometer counts. Then AGAIN in the 1990s with the rollout of the electronic thermometers and lots more air travel by jets, a big spike up. That doesn’t look AT ALL like the gradual accumulation of CO2 causing general warming with more in the early years and a log reduction in impact recently.
Region 7 – Antarctica
I had to make the lower bound on anomaly -4 on this graph as there’s an excursion below -3. Not much to say, really, other than it certainly isn’t warming. We get some reduction of range with added thermometers (as expected – you don’t expect a whole continent to all go to one extreme at the same time, where a single location can). The anomaly to the hot side was highest in the ’30s and now is just gone.
Here’s the Python that’s making these graphs. I’m just going to show the one for Africa. The rest vary only in the Region Number used (if any) and the titles. I commented out the setting of a constant legend of years so that the variation in years / region would show better.
It finally dawned on me I could get the years on the right axis by changing the order in which the SQL statement retrieved them ;-)
# -*- coding: utf-8 -*-
import numpy as np
import matplotlib.pylab as plt
import mysql.connector as MySQLdb

plt.title("GHCN v3.3 Africa Anomaly by Years")
plt.ylabel("Region 1 Anomaly C")
plt.xlabel("Year")
plt.ylim(-3, 2)
# plt.xlim(1850, 2020)    # constant year range, commented out

db = None
try:
    db = MySQLdb.connect(user="chiefio", password="LetMeIn!", database='temps')
    cursor = db.cursor()
    sql = "SELECT year, AVG(deg_c) FROM anom3 WHERE region=1 GROUP BY year;"
    print("stuffed SQL statement")
    cursor.execute(sql)
    print("Executed SQL")
    stn = cursor.fetchall()
    data = np.array(list(stn))
    print("Got data")
    xs, ys = data.T         # year column, anomaly column
    print("after the transpose")
    plt.scatter(xs, ys, s=1, color='red', alpha=1)
    plt.show()
except Exception as e:
    print("This is the exception branch:", e)
finally:
    print("All Done")
    if db:
        db.close()
Thanks EM. Good to watch how a real scientist analyses the real data and comes to conclusions. And all without any government funding. WOW !
@Bill In Oz:
You are most welcome! Science isn’t hard at all. I twigged to it at about 8 years old. Mr. Wizard or some such on TV who did simple things kids could follow, and Industry On Parade that showed where it led in implementation. Somewhere around 5th Grade we did some experiments in Science Class and I started doing my own at home.
By the time I had High School Chemistry, I was already pretty good at it. I’d been reading every Scientific American for about 4 years…maybe 6… Back when it was a real Science journal.
I’ve frequently mentioned that Mr. McGuire, my Chemistry teacher, had strict rules about recording data in your lab book. ANY erasure was an automatic F. IF you made an error or correction, you drew ONE thin line through it and wrote a note next to it why it was to be ignored. Then ADDED the corrected value also with the explanation. That was my more formal Science training.
Probably set my attitudes that carried forward as data archivist for various company backups over the years…
I dearly love the simple beauty of Real Science ™. Make a guess (hypothesis). Gather data. Test it. Wrong? Try again with a new guess. Right? Enhance the guess, gather more data, and test some more. Repeat until done.
One of the things that Pisses Me Off is the Climate Clowns asserting that if you are not an Ordained By Them “Climate Scientist” you can’t possibly do any science. That is just So Wrong. All of the history of Science is full of papers done by folks with no degree in the field, or no degree at all.
Science just requires strict care to follow the rules of the game. It does not require at all any particular degree nor publishing in any given place. It is a pure search for truth and anyone can play, anyone can find the truth. Academia is the field that cares about publications and credentials. NOT Science.
While it can help to have a lot of formal education in a given area to advance that field, most of the major leaps tend to come from folks from other fields as they are not already blinded to the possibilities. They have nothing to unlearn before they can see what is in front of them.
So I just “do what I do” and ignore the rest. It’s all out there for anyone who wants to to look it over, use it, enhance it, ignore it. Whatever.
EM, yes that’s exactly what I think science is about. Thanks for that reply !
BTW EM, I have been going to the Bureau of Misinformation’s blog site and posting links to your blog here so folks can double-check on what they are doing.
But the Bureau of Misinformation does not like people going on its blog and trying to correct them or provide better sources of information.
As of now, all my comments have been removed and comments are now not allowed.
Our great & powerfully deluded Bureau of Misinformation !
EM, it would have been nice to have the baseline value that the anomalies refer to annotated on each graph to get an idea of the Continental temperatures.
@AC, Indeed. Show the baseline absolute value for each graph. Any difference by continent would be useful.
There is no “baseline”. As noted in the description:
EACH instrument has ALL data for EACH month averaged.
That average by instrument by month is subtracted from EACH monthly datum EACH year to give a monthly anomaly for that instrument for that reading in that month.
In this way, every instrument is ONLY ever compared to itself. Within a given month.
So, for example, a thermometer in San Francisco ought to have more or less the same value every June as an average. Comparison of a given JUNE to the average of ALL Junes for SFO will show if that June is hotter or colder than the average June at SFO.
Every instrument in each month has its own ‘baseline’ based on ALL data for that instrument for that month.
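As a sketch in Python (pandas), with hypothetical column names (`station`, `year`, `month`, `deg_c` are my labels, not necessarily the actual schema of the anomaly table), the per-instrument, per-month baseline works like this:

```python
import pandas as pd

# Toy data: one station, two Junes and two Decembers across two years.
df = pd.DataFrame({
    "station": ["SFO", "SFO", "SFO", "SFO"],
    "year":    [1950, 1951, 1950, 1951],
    "month":   [6, 6, 12, 12],
    "deg_c":   [15.0, 17.0, 8.0, 10.0],
})

# Each station / calendar-month segment gets its own 'baseline': the mean
# of ALL years of that station's data for that calendar month.
baseline = df.groupby(["station", "month"])["deg_c"].transform("mean")
df["anom"] = df["deg_c"] - baseline

print(df[["year", "month", "anom"]])
```

Every reading is compared only to the average of its own station’s same calendar month, so the June 1951 anomaly here is +1.0 C against the 16.0 C average of all SFO Junes. No other station ever enters the comparison.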
For longer term records this ought to be Just Fine. Where it will “have issues” is in very short records. So, for example, if you had a bunch of records for a 10 year period of dropping, inside a 40 year period of generally rising, they would have a negative anomaly value for the last 5 years of their run and a positive value in the first 5 years. Average that in with the generally rising line and it will bump up the 5 years at the start and pull down the 5 years at the end, which is either going to put an up bump in what ought to be a down trending 5 year chunk, or dampen that rollover into the drop decade.
Overall, that record will push down some of that 10 year period as expected. As long as the record isn’t too badly dominated by such short chunks, it ought to be OK. (This is also why I do the “long records only” graph versions – a QA check of sorts on what is being shown).
Is this any better or worse than comparing an entirely fictional value in a Grid Box to a non-existent value in another Grid Box? I think it is better…
The basic problem is that the data are so crappy that whatever method you choose it will give some kind of damaged result. We ought to just chuck the whole idea that anything of merit can be said from the data about climate length intervals; but we can’t because this is the cudgel they chose to use.
Part of the reason for choosing a method without a “baseline” is simply to catch the magician. When you see a magic act, it is designed to deceive the senses when seen from the audience as presented. To figure out “how the deed is done”, step off the line of control. Look from the wings, or behind the stage. Go into the basement and watch the floor for drop doors. Don’t just sit in the audience, go on stage and open the back of the boxes. CHANGE things, anything. That is where you get insight.
Yes, the Magician will SCREAM at you to get off the stage, that you are not a magician, that you are “breaking” the magic. Etc. etc. Really they just don’t want their illusion to be discovered and explained….
Expect to see screaming that I’m doing the “analysis” (magic) wrong as I’m not using a “proper baseline” (opening the back of the box)…
So, for longer records, say 100 years with a 50 year cycle in it: the average of all JUNE records ought to be about right for what to expect in a JUNE, and the hotter half of the cycle will have positive values while the lower half will be negative. In short, it ought to work very well.
This gets into the issue of Nyquist in the time domain. Too short a record in a cyclical thing will give bogus trend lines based on how long a bit of string you use and when in the cycle you start. I highlight that issue, they try to hide it and use very short (1/2 the 60 year cycle) baseline values to excess.
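A minimal numeric sketch of that short-record problem, assuming (for illustration only) a pure 60-year cycle with no underlying trend at all:

```python
import numpy as np

years = np.arange(1900, 2020)
# A pure 60-year 'temperature' cycle: two full cycles, no real trend.
temps = np.sin(2 * np.pi * (years - 1900) / 60.0)

def trend_per_decade(y0, y1):
    """Least-squares slope over [y0, y1), in units per decade."""
    mask = (years >= y0) & (years < y1)
    return 10 * np.polyfit(years[mask], temps[mask], 1)[0]

full_trend  = trend_per_decade(1900, 2020)  # two whole cycles
short_trend = trend_per_decade(1945, 1975)  # rising half-cycle only

print(full_trend, short_trend)
```

Fit over whole cycles and the trend is roughly flat; fit the same series over just one rising half-cycle and you get a large, entirely bogus warming trend, an order of magnitude bigger. Which half of the cycle your short record or baseline period lands in determines the answer you get.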
So, by comparing the “only long records” results with the “all data” results you can illuminate where the short records might well be having an unexpected effect. IFF one of these is similar to the NASA / NCDC / IPCC / “warmista” POV and the other one is not, you now have a big clue where to have a “Dig Here!” into how those other folks use short records (as short records dominate their chosen “baseline” period…)
Why do I think this a valid approach?
What IS an anomaly? It is the difference between a single data item in a series and the average of that series. What ARE the REAL data series? Values for individual thermometers averaged in each month over years. More or less by definition the only REAL anomaly you can create is to compare EACH instrument only to itself, by comparing individual monthly averages to the average of those monthly values over all years available.
Otherwise you end up comparing San Francisco in January 2014 to Reno Nevada in August 1950 and that’s the source of the current nonsense. (Reference Station Method of fabricating non-data box values).
Essentially: The data are too broken to give a good value. You get to choose which broken method is least broken. They have the peer reviewed Reference Station Method of fabrication of fantasy temperatures and “grid box’ fictional anomalies. I have the definition of an anomaly and strict adherence to the rule to NEVER compare an instrument to a fiction and only compare it to itself for creating anomalies. Pick your horse, place your bet…
Still, I’d show it.
So you want to be deceived cdquarles ?
I don’t and I do not need or want EM deceiving me.
Thanks EM !
Pingback: Interesting Anomaly By Region By Calendar Month Graphs | Musings from the Chiefio
There are 7280 total instruments. 12 months. That’s 87,360 segments. EACH segment gets an average. That average gets subtracted from the temperatures in each segment.
So what would you like me to “show”? The 87,360 segments or the 87,360 averages?…
That was the point of the description: There is NO one baseline.
Each thermometer calendar month is compared ONLY with itself. San Francisco JUNE is ONLY with the average of all JUNE data in San Francisco. Paris DECember is compared ONLY with the average of all DECember data in Paris. Repeat 87,360 times….
I’d be glad to show it, just tell me where to put the file of 87,360 values…
There are roughly 5.5 million total data points spread over 12 months, so about 462,850 per month; the average segment is about 462,850 / 7,280 ≈ 63.6, or about 64 years long (64 data items per segment).
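The bookkeeping above is easy to check in a couple of lines of Python (the ~5.5 million total is the approximate figure from the post):

```python
# Bookkeeping for the GHCN v3.3 segment counts quoted above.
stations = 7280
months = 12
segments = stations * months          # one baseline per station per month
print(segments)                       # 87,360 segments

points_per_month = 462_850            # ~5.55 million total / 12 months
avg_segment_years = points_per_month / stations
print(round(avg_segment_years, 1))    # ~63.6 years of data per segment
```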
My idea was to show WHAT the anomaly was diverging from, ie the Average Temperature of that complete set.
Exactly. Show each station’s mean. Seeing mean +/- range tells me something. Then graph it. Knowing the mean, median and mode helps me. The anomaly graph, by itself, leaves out information important to me. Or, I guess, I could make my own database :).
@A C Osborn:
The average of temperatures is an interesting thing to look at to get an idea of the nature of the data, but it doesn’t say anything about actual conditions. For example, in California in the period from about 1950 to 1990 there are data from stations up in the Sierra Nevada (deep snow in winter, blasted cold) while after that it’s “4 on the beach” (SFO, Santa Maria, down in LA / San Diego somewhere, and one other). Just averaging the data puts a big dip in the middle (baseline) and a “go flat very stable warm” at the end. (My thesis, to the extent I have one, being that the Reference Station Method and Grid / Box Anomalies are not perfect at removing that bias.)
So you either average all that muck together and get a single (or few) meaningless values / trends; or you keep each segment separate and have 87k things to look at… Neither one of which gives what you want.
That’s 7280 numbers… then if you want it “by month” you get 87 K numbers… As a graph it will be unreadable. As a table too long to post (or read…)
I suspect both of you are still stuck in the mindset of One Global Mean Temperature or One Regional Average; and the whole point of my approach is to avoid those meaningless averages. But yes, I can create them. Except they don’t have any meaning and are of little use for comparing to anomalies based on each instrument in each month.
I could make an example of what you want. Pick some single station and show the 12 months and the 12 averages. But then to do all of them would be 7279 more graphs… At about 4 minutes each to modify the script, run, and save the graph, that’s 29,120 minutes, or about 485 hours; at 8-hour days that’s roughly three months of doing nothing else (not including sleep time, food, breaks, and, oh yeah, a life… so figure a good chunk of a year working on it day shift on a salary from someone…)
The size of the number of individual comparisons matters here. A Lot. It is very necessary to look at the size of the data, the number of segments compared, and what you are requesting.
There are 87,360 segments, each with its own average. 7280 thermometers x 12 months.
You either do that many processes / graphs, or you get a number unrelated to this process that doesn’t mean anything.
For example, an average of those averages for, say, South America would include averaging values from frozen mountains in the 1950-1990 period but NOT in the present (as they left the high cold stations out of present data). That will not be very meaningful. (The basic problem being that the segments that make up South America [like everywhere else] have different lengths and cover different parts of the history.) So mountains are in some years, out in others. I can make those numbers but they have no meaning nor use.
Thanks for that EM.
It is odd about the stations being dropped and changed. A few years back, when they were discussing Tony Heller’s use of raw actuals instead of anomalies, Zeke posted a US and a global graph of actuals, and they showed really large steps in the data compared to raw gridded anomalies.
During another discussion last year I asked a warmist to explain why the current datasets did not show the steps.
Well, he actually went and looked at the data for the two graphs and came back and said “the steps were not really in the temperatures, they were artifacts of changing the stations”.
I think I trust this anomaly more than the massaged, adjusted, trimmed and edited temperature record.
My Tangelos usually ripen in February and by mid-March they have very loose skins and are getting a bit old. This year the skins are still tight and the sugar a bit low even now…
I’ll trust plants way more than any thermometer record, especially those as politically driven and manipulated as the ones we have at present.
@Larry & E.M.: The thing I like about observing and plotting plant data is it is a far better indicator of climate than temperature alone. That graph of the forsythia says brrrrrrr!
BTW, I don’t know how widespread this saying is, but around my neck of the woods it’s, “Three snows after the forsythia blooms.” One or four snows after the bloom is pretty unusual. Two and three snows after the bloom holds up pretty well to observation.
Sort of a tangent which may be suitable for a thread of its own.
Huge crop losses due to midwest flooding.
Keep in mind that the descent into the Little Ice Age began in 1306 with massive, unrelenting rain in Europe which flooded fields, leaving livestock standing in knee-deep mud and crop roots rotting in the field.
The Baltic Sea froze over twice, in 1303 and 1306-07; years followed of unseasonable cold, storms and rains, and a rise in the level of the Caspian Sea.
Interesting list of historical climate events (the period from the late 1200s into the early 1300s is very interesting to read):
Click to access weather1.pdf
Here, we don’t have a saying like that. What I have noticed is that daffodils typically bloom in January. Crocuses later. Fruit trees start blooming in February most often. It rarely freezes after Easter. Easter is variable, though; but most common on or after April 1. That said, I don’t see nearly as many daffodils now as I did when I was young. That’s about 60 years now.
Okay, I will have to make my own database, then.
From Steven Goddard on twitter
19 hours ago
On this date in 1907, Portsmouth, Ohio was 96 degrees.
Great Lakes states March 23 afternoon temperatures have declined nearly two degrees since the 19th century.
Mary Catherine O’Neall
5 minutes ago
Replying to @SteveSGoddard
Not sure how it is that the mean temperature for a single day of the year for a particular area or country would refute all the arguments for global climate change. I take it this post is aimed at the scientifically and statistically naive.
17 minutes ago
How about every single day of every year for the past century at all 1,218 USHCN stations? Would that help you out of your climate superstition?
In the interest of fairness, a rebuttal to the Goddard posts.
It would be really interesting to see a plot of that change in mean latitude of stations!
To see if that was just US stations or worldwide stations, and how it was accomplished: by adding new northern stations (at lower altitudes), or by dropping high-altitude southern-latitude stations?
My gut tells me there is another abuse of statistics buried in there, and that the northern drift of latitude is actually covering for some other hanky-panky.
Go for it! It isn’t hard and I’ve posted all the code (and happy to help with debugging if needed).
Do think a moment about the size of the job when doing things like “showing all the averages”. It is a very large set of numbers, one for each thermometer for each calendar month = ~87,000…
An interesting idea. I think I can cook up an “average latitude” by year report…
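A sketch of what that report might look like, using toy stand-in tables; the table and column names here (`reports`, `inventory`, `latitude`) are my own for illustration, as the real data would come from the GHCN inventory and data tables:

```python
import pandas as pd

# Toy stand-ins for the GHCN tables: which stations reported in which
# years, and where each station sits.
reports = pd.DataFrame({
    "station": ["A", "B", "A", "C"],
    "year":    [1950, 1950, 1990, 1990],
})
inventory = pd.DataFrame({
    "station":  ["A", "B", "C"],
    "latitude": [45.0, 35.0, 5.0],   # station C is near the equator
})

# Mean latitude of the stations actually reporting in each year.
merged = reports.merge(inventory, on="station")
mean_lat = merged.groupby("year")["latitude"].mean()
print(mean_lat)
```

Here swapping high-latitude station B for near-equator station C moves the mean latitude of reporting stations from 40° in 1950 to 25° in 1990, which is exactly the kind of drift such a report would expose.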
FWIW, what I remember from doing v2 about a decade back was that it looked like each country or region was treated a bit differently. Some changes get closer to water. Some to lower altitudes. Some drift the location (like Africa toward the equator). I suspect it is done that way so that in the gross averages there’s no one thing that stands out.
For California I found they have removed all inland and mountain stations, leaving only 4 near the coast. Similarly, South America lost high altitude stations.
My guess for the added stations in v4 is that they will be a LOT of airports and cities. There just aren’t that many actual rural stations. Then they now “conveniently” no longer have an Airport Flag, nearness to water, or nearness to urban centers data. That right there is a giant “DIG HERE NOW!!!” flag. What is NOT visible is what is most worth digging…
I’m busy on my own set of stuff and I’m not that good with Google Maps (and it really wants a high end PC for fancy stuff with Google Earth) but I think it would be productive to pick a sample of stations and find them on the globe. Just the % Urban and % airports from their LAT / LONG would be interesting…
It is easy to capture lat long on google maps, if you give me a list I can verify a batch.
That said in the US WUWT stations report may already have a lot of that worked out.
I have not looked at it in detail since he put it out but it would be a starting place.
E M This statement of yours is important
” My guess for the added stations in v4 is that they will be a LOT of airports and cities. There just aren’t that many actual rural stations. Then they now “conveniently” no longer have an Airport Flag, nearness to water, or nearness to urban centers data. That right there is a giant “DIG HERE NOW!!!” flag. What is NOT visible is what is most worth digging…”
I wonder how much this has happened here in Oz.
It would explain a lot of things.
I found a text listing of all the stations with the lat / long info already tabulated
I pulled it from this page:
Text file of all stations
This gives the full lat / long associated with each station.
Hmmm, surprise: that text listing is not at all what I thought it was!
It lists 3287 stations in Colorado, and just about every one of them appears to be a back yard station in a residential location (like the Weather Underground stations people set up), not the official weather station out at DIA.
The lat / long of the DIA weather sensor appears to be 39.8665 -104.6508
Put that string into the google maps search box (or bing maps which has clearer imagery) and it will jump to the location of the sensor in the north west quadrant of the airport.
It looks to me like that lat/long is wrong, as I do not see a Stevenson Screen sensor pad at or near that location. There are a couple of pads off to the northwest that could have the temp sensors on them, though; really a sloppy location effort.
Assuming it is reasonably close to the actual location, note that with our prevailing westerly winds, that location is down wind of both the primary north south runways and almost in line (slightly north of due east) from all the airplane concourses where the planes sit and run up their engines and taxi in and out of the concourse areas.
The old weather station lat / long is
39.76850 -104.8681, at 10490 Smith Rd; this was just off the runways, about 1/4 mile, at the old Stapleton airport. That location also appears to be wrong (unless they moved it after the official sensor moved out to DIA). Back in the mid-1980s the sensor was due south of the buildings on that long path, well away from the building and parking lots (a fairly good location).
As LAT and LONG are in the GHCN tapes, and loaded into the DB, it’s easy to get them. There’s a few thousand records though… so some “think time” ought to go into just which ones you want to have pulled.
Maybe first would just be a LAT / LONG average over time for some selected countries. See if, for example, they move from the mountains of France to the southern shore ;-)
Here’s the French inventory:
Would be a bit much to post all the station IDs in use in all the years. Divide the first number by 12 to get the thermometer count (as it is counted once for each of the 12 months of data). So it starts with just one, rises to a peak of 100, then drops down to 576/12 = 48 now.
So here’s the ones now:
If something like that is of interest, I can make a report with Station, LAT, LONG, Elevation for a few years for some particular country(s)….
Oh, and remember that newer stations will not be Stevenson Screens but those round canister things… or looking like a jungle gym of pipes with wind speed gear and more at Airports…
I know, this is v4 stuff on a v3 thread, but it started here so…
Here’s French Altitude over time in v4:
Pingback: GHCN v3.3 vs v4 – Top Level Entry Point | Musings from the Chiefio