I have compared weather data sets for Bordeaux-Merignac airport in an effort to reproduce the results of Ashenfelter et al., 1995 using data I’ve procured myself. Here are some preliminary thoughts and results:

Goal

My goal for this exercise is to find weather data sources that could be used to monitor Bordeaux and other wine regions (e.g. Virginia) during the season and to speculate on the likelihood of high- or low-quality outcomes. There are data sources that could be used, but I’m not yet convinced that the temperature data at Bordeaux is representative. I think the station was moved sometime near 1987, but I need to prove it.

Weather Data Sets

1) Weather data from the original paper, as provided by Liquid Assets – Data from 1952 to 1988. I contacted Dr. Ashenfelter to inquire about the original source of the data for the paper. Apparently, the data used in the first publication were transcribed from “a French journal”. For the updated paper, the data came from “a Dutch web site”. I presume the Dutch web site is, in fact, KNMI.

2) Global Historical Climatology Network (GHCN) data, as provided by NCDC – Data from 1952 to 1999

3) Global Summary of the Day (GSOD) data, as provided by NCDC – Data from 1972 to 2008

The GHCN and GSOD data can be obtained as daily weather recordings and then averaged by month to reproduce the variables presented by Ashenfelter. Please note that the difference between the GHCN and GSOD data is not that the data comes from different weather stations. The data is reported from the same weather station, under potentially different reporting standards, and made available via different data sets. When using observational data, and especially meteorological observation data, there is one critical assumption that everyone must start with. Assume the data is wrong. Observing the weather is complicated stuff, and while a spreadsheet might contain data, you usually have to be very careful to ensure it’s correct.
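As a rough sketch of the daily-to-monthly step described above (not the exact processing behind the charts in this post), the function below groups daily records by month and refuses to report a monthly value when too few days were observed. The record layout, the function name `monthly_summary`, and the `min_days` threshold are assumptions made for illustration. Missing observations are assumed to have been converted to `None` beforehand, since each data set has its own missing-value codes and units.

```python
from collections import defaultdict


def monthly_summary(daily, min_days=25):
    """Aggregate daily records into monthly mean temperature and total precipitation.

    daily: iterable of (year, month, day, temp, precip) tuples, where temp
    or precip is None when that observation is missing.
    min_days: hypothetical threshold; months with fewer valid observations
    get None instead of a value, in the spirit of "assume the data is wrong".
    """
    buckets = defaultdict(lambda: {"temps": [], "precips": []})
    for year, month, _day, temp, precip in daily:
        bucket = buckets[(year, month)]
        if temp is not None:
            bucket["temps"].append(temp)
        if precip is not None:
            bucket["precips"].append(precip)

    summary = {}
    for key, bucket in sorted(buckets.items()):
        temps, precips = bucket["temps"], bucket["precips"]
        summary[key] = {
            "mean_temp": sum(temps) / len(temps) if len(temps) >= min_days else None,
            "total_precip": sum(precips) if len(precips) >= min_days else None,
            "n_temp": len(temps),      # number of valid temperature days
            "n_precip": len(precips),  # number of valid precipitation days
        }
    return summary
```

Keeping the observation counts next to each monthly value makes it easy to see which months rest on thin data before they feed into a seasonal variable.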

In the paper, Ashenfelter uses three meteorological variables. They are:

WRAIN – Winter rainfall.  Accumulated precipitation averaged from October to March prior to the growing year.

DEGREES – Average temperature from April to September of each year. This represents the temperature during the growing season.

HRAIN – Harvest Rain,  the accumulated precipitation in August and September of each year.
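To make the three definitions concrete, here is a minimal sketch that builds them from monthly values. It assumes WRAIN is a total over October to March, that DEGREES is an unweighted mean of the six monthly mean temperatures, and that the inputs are plain dictionaries keyed by `(year, month)`. The function name `vintage_variables` is hypothetical, and the paper's exact construction may differ in detail.

```python
def vintage_variables(precip, temp, year):
    """Compute WRAIN, DEGREES and HRAIN for one vintage year.

    precip: {(year, month): total monthly precipitation}
    temp:   {(year, month): mean monthly temperature}
    A missing month raises KeyError on purpose, so that gaps in the record
    are noticed rather than silently skipped.
    """
    # Winter: October-December of the previous year plus January-March.
    winter = [(year - 1, m) for m in (10, 11, 12)] + [(year, m) for m in (1, 2, 3)]
    # Growing season: April through September.
    growing = [(year, m) for m in range(4, 10)]
    # Harvest: August and September.
    harvest = [(year, 8), (year, 9)]

    return {
        "WRAIN": sum(precip[k] for k in winter),
        "DEGREES": sum(temp[k] for k in growing) / len(growing),
        "HRAIN": sum(precip[k] for k in harvest),
    }
```

Note that a vintage year needs precipitation from the last three months of the previous calendar year, so the first year of any data set cannot produce a WRAIN value.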

In the charts below, I have labeled the new data with the same base name (e.g. WRAIN) but appended the source (e.g. GHCN).

The Data – WRAIN

The chart below shows the three time series for WRAIN (winter rainfall).

Winter Rainfall in Bordeaux, France.

A few points:

1) The Ashenfelter and GHCN data are largely similar in variation but have different mean values.

2) The GSOD and GHCN data are quite similar in the years that they overlap, but interestingly the two have significantly different means during this time.

The Data – DEGREES

The chart below shows the three DEGREES time series.

Average Temperature During the Bordeaux Growing Season

Notes:

1) As stated in an earlier post,  I believe there is information not reflected in this graph.  Namely, the station was moved sometime near 1987.

The Data – HRAIN

The chart below depicts the three harvest rainfall (HRAIN) time series for Bordeaux, France.

Harvest Rainfall in Bordeaux, France

Notes:

1)  Where is HRAIN?   It’s covered by HRAIN GHCN.   They are identical for about 20 years.

2) Even the GSOD data is quite similar during the early period in which all three data sets overlap.

A Revisit of WRAIN

Let’s take a look at WRAIN again. There are 13 years where the Ashenfelter, GHCN, and GSOD series overlap. For each time series, I calculated the mean winter rainfall over those 13 years of overlap and subtracted it. So, I now have “zero centered” time series showing the year-to-year deviations. It is shown below.
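The centering step can be sketched as follows. This is an illustration under assumptions rather than the code used for the chart: each series is taken to be a dictionary mapping year to value, and the function name `center_on_overlap` is hypothetical. The baseline for each source is its own mean over the years that all sources share, and that baseline is subtracted from every year of that source.

```python
def center_on_overlap(series_by_source):
    """Zero-center each series on its mean over the years common to all series.

    series_by_source: {source_name: {year: value}}
    Returns (overlap_years, centered) where centered has the same shape
    as the input but holds deviations from the overlap-period mean.
    """
    # Years present in every source.
    overlap = sorted(set.intersection(*(set(s) for s in series_by_source.values())))
    if not overlap:
        raise ValueError("the series share no years")

    centered = {}
    for name, series in series_by_source.items():
        baseline = sum(series[y] for y in overlap) / len(overlap)
        centered[name] = {y: v - baseline for y, v in series.items()}
    return overlap, centered
```

Using the same set of years for every baseline matters: if each series were centered on its own full record, a wet or dry spell covered by only one source would shift that source relative to the others.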

In general, the three data sets show the same variability (even if their standard deviations are not identical). Less-than-normal rain is reflected in all three, as is greater-than-normal precipitation. Fortunately, the excellent-quality vintages occur in non-normal situations. From this analysis, I can gain some confidence in using GSOD precipitation data for more recent years (given that the GHCN data ends in 1999).
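One simple way to put a number on "reflected in all three" is to count the years in which every zero-centered series falls on the same side of normal. The sketch below assumes the anomalies are stored as dictionaries mapping year to deviation; the function name `sign_agreement` is hypothetical, and this is only one of several reasonable agreement measures.

```python
def sign_agreement(centered, years):
    """Fraction of years in which all series are wetter, or all drier, than normal.

    centered: {source_name: {year: deviation from that source's baseline}}
    years: the years to check (each must be present in every series)
    A year in which any series sits exactly at zero counts as disagreement.
    """
    agree = 0
    for year in years:
        values = [series[year] for series in centered.values()]
        if all(v > 0 for v in values) or all(v < 0 for v in values):
            agree += 1
    return agree / len(years)
```

A value near 1.0 would support using one source as a stand-in for another when only the direction of the departure from normal matters.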

I conclude the following from my analysis.

1) Assume the data is wrong. Blindly applying the regression coefficients from the Ashenfelter et al., 1995 paper could be a dangerous endeavor without understanding how data from different sources compare. I showed above, for instance, that the means of winter rain differ across the three data sources while the variability is at least similar. You might speculate, based on the equations, that the market underestimates the quality of a wine. But is it the quality that is underestimated, or is the weather data causing that underestimation?

2)  Assume the data is wrong. I am still unsettled about the temperature time series during the growing season.   I’ll be working up an analysis to see what can be said about it.
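To illustrate the first point, the sketch below shows how a constant offset in one input carries straight through a linear regression: the prediction moves by the coefficient times the offset, even when the year-to-year variability is identical. The intercept, coefficients, input values, and the 100-unit offset are all placeholders chosen for illustration. They are not the published coefficients and not measured differences between the data sets.

```python
def predict(intercept, coefs, inputs):
    """Linear model: intercept plus the sum of coefficient * input."""
    return intercept + sum(coefs[name] * inputs[name] for name in coefs)


def prediction_shift(coefs, mean_offsets):
    """Change in the prediction caused by constant offsets in the inputs."""
    return sum(coefs[name] * mean_offsets.get(name, 0.0) for name in coefs)


# Placeholder values for illustration only; NOT the published coefficients.
INTERCEPT = 0.0
COEFS = {"WRAIN": 0.001, "DEGREES": 0.5, "HRAIN": -0.004}

# One hypothetical year as reported by a first source.
source_a = {"WRAIN": 600.0, "DEGREES": 17.0, "HRAIN": 120.0}
# The same year from a second source whose winter rain runs 100 units higher on average.
source_b = dict(source_a, WRAIN=source_a["WRAIN"] + 100.0)

difference = predict(INTERCEPT, COEFS, source_b) - predict(INTERCEPT, COEFS, source_a)
print(difference)
print(prediction_shift(COEFS, {"WRAIN": 100.0}))
```

The two printed numbers agree up to floating-point rounding, which is the point: unless the new source is first re-centered onto the baseline of the data the coefficients were fitted to, a difference in means can look like a difference in predicted quality.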