issues with EOBS data #9
Comments
Here is the fraction of E-OBS missing values for the years 1980 to 2020:

    import xarray as xr
    from dask.distributed import Client

    client = Client(dashboard_address="localhost:8787")
    store = "https://ncsa.osn.xsede.org/Pangeo/pangeo-forge/pangeo-forge/EOBS-feedstock/eobs-tg-tn-tx-rr-hu-pp.zarr"
    ds = xr.open_dataset(store, engine="zarr", chunks={}).sel(time=slice("1980", "2020"))

    def sum_nan(da):
        # fraction of time steps that are NaN at each grid cell
        return da.isnull().sum(dim="time") / da.time.size

    %time tg_nan = sum_nan(ds.tg).compute()
    %time pr_nan = sum_nan(ds.pp).compute()

    tg_nan.plot()
    pr_nan.plot()

There are definitely some regions to take care of when computing monthly or seasonal means.
@larsbuntemeyer Just in case it is helpful: for the Copernicus Atlas [1], we generated a mask for E-OBS to exclude areas with few stations or where the station records were not continuous over time. It was created manually, based on simple visual inspection...
Thanks @JavierDiezSierra, that looks reasonable. I think, if we want to look at 1980-2020, we could be a little more relaxed. I would imagine a criterion where the time range of interest should not contain more than 10% (or any other threshold we can define) of missing values; otherwise, the grid cell is masked out.
@larsbuntemeyer Yes, I agree that following a threshold criterion is less problematic. I think eliminating grid cells with more than 10% missing values is appropriate :) I will try to include CORDEX-CMIP5 spatial bias maps with respect to E-OBS in the coming days.
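The threshold criterion discussed above could be sketched roughly as follows. This is only an illustration on synthetic data, not the workflow's actual code: the helper names `missing_fraction` and `mask_sparse_cells`, the `max_missing` parameter, and the toy array are all assumptions made up for the example.

```python
import numpy as np
import xarray as xr


def missing_fraction(da: xr.DataArray) -> xr.DataArray:
    """Fraction of time steps that are NaN at each grid cell."""
    return da.isnull().mean(dim="time")


def mask_sparse_cells(da: xr.DataArray, max_missing: float = 0.1) -> xr.DataArray:
    """Set a grid cell to NaN everywhere if more than `max_missing` of its time steps are NaN."""
    keep = missing_fraction(da) <= max_missing  # 2D boolean mask (lat, lon)
    return da.where(keep)  # broadcast along time


# Synthetic stand-in for an E-OBS variable: 20 time steps on a 2 x 2 grid.
data = np.ones((20, 2, 2))
data[:5, 0, 0] = np.nan  # 25% missing, above the threshold: cell is masked
data[:1, 0, 1] = np.nan  # 5% missing, below the threshold: cell is kept
toy = xr.DataArray(data, dims=("time", "lat", "lon"))

masked = mask_sparse_cells(toy, max_missing=0.1)
```

With the real dataset, the same function would be applied to the variable after selecting the period of interest, so the mask depends on the chosen time range.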
The mask from the Copernicus Atlas looks better (more consistent) to me, at least visually. I would only add all missing parts of the EU countries, e.g. southern Greece, Rhodes, Cyprus, and Sicily, with some clarifications in the paper. Northern Africa is not in focus and can be excluded. The eastern Mediterranean is also not in focus, although this region can be left in. By the way, what version of E-OBS is used here?
@gnikulin If I'm right, we are using version 23.1 at 0.1° (https://catalog.leap.columbia.edu/feedstock/eobs-dataset), but version 30.e is already available and has been downloaded to the JSC server (/mnt/CORDEX_CMIP6_tmp/aux_data/eobs). @larsbuntemeyer, why don't we use the latest version of E-OBS?

In Kotlarski et al. (2014), they regridded EUR-11 to the 0.22° E-OBS grid for the period 1989-2008, but we are regridding both the projections and E-OBS to the rotated reference grid for EUR-11. This is an example of the bias for CORDEX-CMIP5 models for the period 1989-2008 on the EUR-11 mesh. It reproduces Figure 2 from Kotlarski et al. The results are very similar, with slight differences.
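For reference, a seasonal mean bias of this kind could be computed roughly as sketched below, assuming model and observations have already been regridded to the same grid (the regridding step itself is not shown). The function name `seasonal_bias`, the dimension names, and the synthetic data are assumptions for illustration only, and the monthly values are averaged without weighting by month length.

```python
import numpy as np
import pandas as pd
import xarray as xr


def seasonal_bias(model: xr.DataArray, obs: xr.DataArray, period=("1989", "2008")) -> xr.DataArray:
    """Seasonal climatology of the model minus that of the observations over `period`."""
    model_clim = model.sel(time=slice(*period)).groupby("time.season").mean("time")
    obs_clim = obs.sel(time=slice(*period)).groupby("time.season").mean("time")
    return model_clim - obs_clim


# Synthetic monthly data on a tiny common grid: the "model" is 1.5 units too warm.
time = pd.date_range("1989-01-01", "2008-12-01", freq="MS")
shape = (time.size, 2, 2)
obs = xr.DataArray(np.full(shape, 10.0), coords={"time": time}, dims=("time", "rlat", "rlon"))
model = obs + 1.5

bias = seasonal_bias(model, obs)  # dims: (season, rlat, rlon)
```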
Yes, agreed, it's not the latest version, but it is easily accessible without having to download more data to the filesystem. I'll update the workflow to use the latest version.
As far as I can see, the latest E-OBS is no longer available on the 0.22° grid, so I think using the EUR-11 grid for the comparison is reasonable (the Lambert conformal models are also regridded to the EUR-11 rotated grid).
There are some known issues with E-OBS data that we have to deal with, e.g., some regions have only limited observations and might have to be skipped for monthly and seasonal means. We have some experience with it. Pinging @paindeer since you have worked a lot with E-OBS data to evaluate REMO ERA5 output. I could not find any details about that in https://doi.org/10.5194/gmd-7-1297-2014.
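One possible way to skip poorly observed months when building monthly means is sketched below. It is an assumption about how this could be handled, not an established procedure from the thread: the function name `monthly_mean_min_valid` and the 90% `min_valid_fraction` default are hypothetical, and the data are synthetic.

```python
import numpy as np
import pandas as pd
import xarray as xr


def monthly_mean_min_valid(da: xr.DataArray, min_valid_fraction: float = 0.9) -> xr.DataArray:
    """Monthly mean of daily data, set to NaN where too few days of the month are valid."""
    # Fraction of non-NaN days in each month (cast to float so the mean is well defined).
    valid_fraction = da.notnull().astype(float).resample(time="MS").mean()
    monthly = da.resample(time="MS").mean()  # NaNs are skipped by default
    return monthly.where(valid_fraction >= min_valid_fraction)


# Synthetic daily series for January and February 2000 at a single point.
time = pd.date_range("2000-01-01", "2000-02-29", freq="D")
values = np.full(time.size, 2.0)
values[:10] = np.nan  # 10 of 31 January days missing: month is dropped
values[31] = np.nan   # 1 of 29 February days missing: month is kept
daily = xr.DataArray(values, coords={"time": time}, dims="time")

monthly = monthly_mean_min_valid(daily)
```

Seasonal means could then be built from the filtered monthly values, so a season inherits the gaps of its months instead of being averaged over a handful of days.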