By Andy May
This is the last in my current series of posts on ocean temperatures. In the first post we compared land-based measurements with sea surface temperatures (SSTs) and discussed problems and inconsistencies with land-based measurements (see here). That first post was too long and complicated, but I had to lay a foundation before getting to the interesting things. Next, we covered the basic thermal structure of the ocean and how it circulates its enormous heat content around the world (see here). This was followed by a description of the sea surface, or skin layer. Evaporation takes place at the skin of the ocean, and it is here that many wavelengths of solar radiation, especially the longer infrared wavelengths emitted by CO2, are absorbed, reflected, or consumed by evaporation.
The next post looked at the mixed layer, a layer of uniform temperature just below the skin layer and above the thermocline, where the water temperature begins to drop rapidly. The article after that discussed the differences between various estimates of SST, the data used, and the problems with the measurements and corrections. The focus was on the two main datasets, HadSST and NOAA ERSST. The most recent post dealt with SST anomalies. The logic was: if all measurements are taken just below the sea surface, why are anomalies needed? Why can't we simply use the measured temperature, corrected to a useful depth like eight inches (20 cm)?
The theme of all the posts is to keep the analysis as close to the measurements as possible. Too many corrections and data manipulations confuse the interpretation, distance us from the measurements, and homogenize the data to such an extent that we get a false sense of confidence in the result. This illusion of accuracy from over-processing data is discussed here by William Briggs. His post deals with smoothing data, but the arguments apply equally to homogenizing temperatures, to computing anomalies from a mean, and to bias-correcting measurements with statistical techniques. All of these processes "make the data look better" and give us a false sense of confidence. We don't know how much of the resulting graphs and maps is due to corrections and data manipulation, and how much is due to the underlying measurements. We went through the entire process step by step, examining what the temperatures looked like at each stage. We are trying to pull back the wizard's curtain as much as possible, though probably not as well as Toto did.
The data used in all of these posts, with the exception of the earliest posts on land measurements in CONUS (the conterminous United States, or the "lower 48"), came from latitude-and-longitude grids. Gridding the measurements is required globally because measurements are heavily concentrated in the Northern Hemisphere and very sparse elsewhere, especially in the polar regions. As we have already shown, Northern Hemisphere temperatures are anomalous; the rest of the world varies much less in its surface temperature. A discussion of this, with some graphics, is here. For more on hemispheric variations in temperature trends, see this post by Renee Hannon and this post by the same author.
While gridding is probably unnecessary, and potentially misleading, in areas like the United States that are well covered with good weather stations, we must grid the available data to produce a global average SST. This does not mean that gridding replaces good measurements or improves the measurements, just that it is necessary with the data we have.
Each grid cell represents a different area of the ocean, and the difference is purely a function of latitude. A degree of latitude is about 111 km everywhere. A degree of longitude is also about 111 km at the equator, but it shrinks to zero at the poles. To calculate the area of each grid cell, we only need the latitude of the cell and the size of the cell in degrees of latitude and longitude. The equation is provided and derived here by Dr. Math (National Council of Teachers of Mathematics). I won't break the narrative with an equation, but the R code at the end of the post shows the equation and how it was used.
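The area formula itself is simple enough to sketch. My code is in R and appears at the end of the post; purely as an illustration, here is the same spherical grid-cell area formula in Python (the function name and the default 1° × 1° cell size are my choices, not part of any dataset):

```python
import numpy as np

R_EARTH_KM = 6371.0  # mean Earth radius in km

def cell_area_km2(lat_deg, dlat=1.0, dlon=1.0):
    """Area of a dlat x dlon degree grid cell centered at latitude lat_deg.

    On a sphere, A = R^2 * dlon_radians * (sin(lat_top) - sin(lat_bottom)).
    """
    lat_bottom = np.radians(lat_deg - dlat / 2.0)
    lat_top = np.radians(lat_deg + dlat / 2.0)
    dlon_rad = np.radians(dlon)
    return R_EARTH_KM**2 * dlon_rad * (np.sin(lat_top) - np.sin(lat_bottom))

print(round(cell_area_km2(0.0)))   # equatorial cell
print(round(cell_area_km2(89.5)))  # cell nearest the pole
```

A 1° × 1° cell covers on the order of 12,000 km² at the equator but only about 100 km² nearest the pole, which is why an unweighted grid average over-counts the polar regions.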
It might seem strange to make such an obvious correction this late in the series, but I honestly wanted to see whether it made much of a difference. It turns out that the calculated average temperature changes significantly, but the trends hardly change at all. Figure 1 below is the original plot from the mixed layer post comparing various estimates of SST and the global mean temperature of the mixed layer. Usually, especially at night, the mixed layer and the SST are very close to one another, so it is valid to plot them together.
Figure 1. The comparison of global mixed layer and SST temperature estimates from the mixed layer post. An explanation of the plot can be found in the post here.
The curves in Figure 1 are all corrected to a depth between 20 cm and one meter. These are simple grid means, i.e., averages of the grid values that are not corrected for the area each grid cell represents. Figure 2 is the same plot, but constructed with area-weighted grid-cell values. We also added a new curve, an NOAA ERSST curve computed after nulling the ERSST values that correspond to null values in the HadSST dataset. In this way we compare ERSST and HadSST over the same global ocean areas. The normal ERSST dataset (the lower green line in Figure 1 and the lower brown line in Figure 2) uses interpolation and extrapolation to fill grid cells with insufficient data. These cells are null in HadSST.
At first glance, the two graphs look very similar, but notice that the vertical scale has changed. Everything shifts up by two to four degrees because the polar regions have cells with smaller areas, and those cold cells no longer carry as much weight. The NOAA ICOADS SST line stays in the same location because it was already area-weighted. It is also the line closest to the measurements; the processes used to construct it are much simpler and less convoluted than those used for HadSST and ERSST. The difference between HadSST and ERSST is still there, but smaller. These two temperature records use similar data, but as described above, their grids are different and they cover different areas. Once the ERSST grid is "masked" to match HadSST coverage, its mean rises from 18.2 to 22 degrees.
Figure 2. This is the same graph as Figure 1, except that all of the grid values in the datasets are weighted by the grid-cell areas they represent. Notice that the vertical scale has changed.
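To see why area weighting shifts the mean upward, consider a toy example. The latitudes and temperatures below are invented; for an equal-angle grid, cell area, and hence the weight, is proportional to the cosine of latitude:

```python
import numpy as np

# Invented zonal-mean SSTs (deg C) at a few representative latitudes
lats = np.array([-60.0, -30.0, 0.0, 30.0, 60.0])
sst = np.array([4.0, 18.0, 28.0, 20.0, 6.0])

# Naive grid mean: every cell counts equally
simple_mean = sst.mean()

# Area weight for an equal-angle grid is proportional to cos(latitude)
w = np.cos(np.radians(lats))
weighted_mean = np.average(sst, weights=w)

print(round(simple_mean, 2), round(weighted_mean, 2))
```

Because the cold high-latitude cells are down-weighted, the weighted mean comes out warmer than the naive mean, the same direction of shift seen between Figures 1 and 2.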
The multi-year NOAA MIMOC and University of Hamburg grids plot above the NOAA ERSST dataset in both figures, but they are about 4.5°C warmer once the area-weighting algorithm is applied. NOAA MIMOC and the University of Hamburg build their grids from more than 12 years of data, so they fill many more grid cells than the one-year datasets. They also weight Argo float and buoy data heavily, just like NOAA ERSST.
As we saw from the unweighted data in the last post, the HadSST measured temperatures trend downward. When the ERSST grid is masked with the HadSST nulls, it trends downward as well. This can be seen in Figure 3.
Figure 3. Decreasing SSTs over the area covered by HadSST can be seen in both the HadSST and ERSST datasets.
The HadSST dataset only contains grid values where there are enough measurements to calculate one. It does not extrapolate data into neighboring grid cells as ERSST does. Thus, the HadSST data represent the part of the ocean with the best data, and the temperature in this area drops significantly when only the measurements are used. The masked ERSST suggests a decline of 2.5 degrees/century; the HadSST decline is 1.6 degrees/century.
The unmasked ERSST line (Figure 2) shows an increasing trend of 1.6 degrees/century. Thus, the interpolated and extrapolated areas show warming that does not occur in the cells with the best data. As we saw in the last post, and as shown again here in Figure 4, the HadSST and ERSST anomalies display none of the complexity that has taken up six posts. They show a common warming trend of around 1.7 degrees/century.
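The trends quoted in degrees/century are ordinary least-squares slopes. Here is a minimal Python sketch using synthetic data (the dates, noise level, and built-in 1.7 degrees/century slope are assumptions for illustration, not actual SST values):

```python
import numpy as np

# Synthetic monthly mean SSTs over 20 years with a built-in trend of
# 0.017 deg/year (1.7 degrees/century) plus random measurement noise.
rng = np.random.default_rng(42)
years = 2001.0 + np.arange(240) / 12.0
sst = 18.0 + 0.017 * (years - years[0]) + rng.normal(0.0, 0.1, years.size)

# Least-squares slope in deg/year, scaled to degrees/century
slope_per_year = np.polyfit(years, sst, 1)[0]
print(round(slope_per_year * 100.0, 2), "degrees/century")
```

The fitted slope lands near the built-in 1.7 degrees/century. With real gridded data, the only extra step is area-weighting the monthly global means before fitting.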
Figure 4. The HadSST and ERSST anomalies.
The best way to analyze data is to use the minimum statistical manipulation necessary to present it in a usable form. Every correction, every calculation, every smoothing operation, every gridding step must be fully justified. It is an unfortunate fact of modern scientific and technical life that our colleagues keep insisting, "this needs to be corrected," "that needs to be corrected," and so on. Little thought is given to how the corrections affect our perception of the resulting graphs and maps. With enough corrections you can turn a dung heap into a castle, but is it really a castle?
I once had a boss who was very smart. He eventually became the CEO and Chairman of the Board of the company we worked for. I was much younger then, one of the scientists in the peanut gallery who kept asking, "What about this? What about that? Did you correct for these things?" My boss would say, "Let's not over-science this. What does the data say, as it is?" After becoming CEO, he sold the company for five times the average exercise price of my accumulated stock options. I was still a scientist, albeit a wealthier one. He was right, and so is Dr. William Briggs. Study the raw data, stay close to it, and don't "over-science."
I reprocessed a lot of data to write this post. I think I got it right, but I do make mistakes. For those who want to check my work, here is my new R code for the area weighting.
None of this is in my new book, Politics and Climate Change: A History, but buy it anyway.