Read NetCDF files in R

I recently helped someone who needed to manipulate NetCDF files. I thought I would share the document I put together for him in case someone else has similar needs. Enjoy.

To read NetCDF files with R, the ncdf or ncdf4 package is required, so make sure one of them is installed. The remainder of this document assumes ncdf4 is installed. As an example, a wind speed NetCDF file will be read.

Let's load the ncdf4 package and open a NetCDF file.

# loading required library
library(ncdf4)
## Warning: package 'ncdf4' was built under R version 3.2.3
# list the files contained in the working folder
dir()
## [1] "Assessment of the performance of CORDEX-South Asia experiments for monsoonal precipitation over the Himalayan region during present climate.pdf"
## [2] "NetCDFReadingInR.pdf"                                                                                                                           
## [3] "NetCDFReadingInR.Rmd"                                                                                                                           
## [4] "NetCDFReadingInR_RSnippet.html"                                                                                                                 
## [5] "NetCDFReadingInR_RSnippet.Rmd"                                                                                                                  
## [6] "sfcWindmax_WAS-44i_MPI-M-MPI-ESM-LR_historical_r1i1p1_MPI-CSC-REMO2009_v1_day_19660101-19701231.nc"
# Open a connection to the netcdf file
nc.con<-nc_open("sfcWindmax_WAS-44i_MPI-M-MPI-ESM-LR_historical_r1i1p1_MPI-CSC-REMO2009_v1_day_19660101-19701231.nc")

# Let's display some information about the netcdf file we just opened.
print(nc.con) 
## File sfcWindmax_WAS-44i_MPI-M-MPI-ESM-LR_historical_r1i1p1_MPI-CSC-REMO2009_v1_day_19660101-19701231.nc (NC_FORMAT_NETCDF4_CLASSIC):
## 
##      2 variables (excluding dimension variables):
##         double time_bnds[nb2,time]   
##         float sfcWindmax[lon,lat,time]   
##             standard_name: wind_speed
##             long_name: Daily Maximum Near-Surface Wind Speed
##             units: m s-1
##             _FillValue: 1.00000002004088e+20
##             missing_value: 1.00000002004088e+20
##             original_units: m/s
##             cell_methods: time: maximum
## 
##      4 dimensions:
##         lon  Size:195
##             standard_name: longitude
##             long_name: longitude
##             units: degrees_east
##             axis: X
##         lat  Size:124
##             standard_name: latitude
##             long_name: latitude
##             units: degrees_north
##             axis: Y
##         time  Size:1826   *** is unlimited ***
##             standard_name: time
##             bounds: time_bnds
##             units: days since 1949-12-01 00:00:00
##             calendar: proleptic_gregorian
##         nb2  Size:2
## 
##     24 global attributes:
##         CDI: Climate Data Interface version 1.6.4 (http://code.zmaw.de/projects/cdi)
##         source: REMO
##         institution: Helmholtz-Zentrum Geesthacht, Climate Service Center, Max Planck Institute for Meteorology
##         Conventions: CF-1.4
##         contact: csc-anfragen@hzg.de
##         comment: CORDEX SouthAsia MPI-CSC-REMO2009 0.44 deg MPI-M-MPI-ESM-LR historical
##         creation_date: 14-04-23T09:32:29Z
##         experiment_id: historical
##         experiment: historical
##         driving_experiment: MPI-M-MPI-ESM-LR, historical, r1i1p1
##         driving_model_id: MPI-M-MPI-ESM-LR
##         driving_model_ensemble_member: r1i1p1
##         driving_experiment_name: historical
##         frequency: day
##         institute_id: MPI-CSC
##         model_id: MPI-CSC-REMO2009
##         project_id: CORDEX
##         rcm_version_id: v1
##         product: output
##         references: http://www.remo-rcm.de
##         nco_openmp_thread_number: 1
##         NCO: 4.0.3
##         CDO: Climate Data Operators version 1.6.4 (http://code.zmaw.de/projects/cdo)
##         CORDEX_domain: WAS-44i

The previous output tells us that there is a variable named “sfcWindmax” (the daily maximum near-surface wind speed, according to its long_name attribute), which has three dimensions: lon, lat and time, of sizes 195 x 124 x 1826 respectively. The time attributes also tell us that time is stored as the number of days since 1949-12-01. We can extract just the first time step and plot it.

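# Extract a subset of the netcdf: start at index 1 of every dimension,
# read all of lon and lat (count = -1 means "all values") and a single time step
nc.dat<-ncvar_get(nc.con,varid="sfcWindmax",start=c(1,1,1),count=c(-1,-1,1))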

# plot
image(nc.dat)

# We can extract the latitudes and longitudes and add them to the graph
lat<-ncvar_get(nc.con,varid="lat")
lon<-ncvar_get(nc.con,varid="lon")

# Plot with real coordinates
image(lon,lat,nc.dat)
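
Once the coordinate vectors are available, they can also be used to find the grid indices closest to a location of interest, which is handy when choosing the `start` argument of ncvar_get. Here is a minimal sketch in base R. The coordinate vectors and the target point are made-up stand-ins so the snippet runs on its own; with the real file you would use the `lon` and `lat` vectors read above.

```r
# Stand-in coordinate vectors (hypothetical regular 0.5 degree grid)
lon.demo <- seq(60, 100, by = 0.5)
lat.demo <- seq(0, 40, by = 0.5)

# Hypothetical location of interest
target.lon <- 85.3
target.lat <- 27.7

# Index of the grid value closest to the target along each axis
i.lon <- which.min(abs(lon.demo - target.lon))
i.lat <- which.min(abs(lat.demo - target.lat))

# Show the indices and the grid coordinates they correspond to
print(c(i.lon = i.lon, i.lat = i.lat))
print(c(lon = lon.demo[i.lon], lat = lat.demo[i.lat]))
```

This assumes lon and lat are plain 1-D vectors, as they are in this file. Files on rotated or curvilinear grids store 2-D coordinate arrays and need a different lookup.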

If your computer has enough RAM, you can read the whole dataset and manipulate it as well. Once the whole dataset is in memory, you can plot any slice you want.

# Extract the whole netcdf
nc.dat<-ncvar_get(nc.con,varid="sfcWindmax")

# plot time step 20
image(nc.dat[,,20])

# Plot time step 20 with real coordinates
image(lon,lat,nc.dat[,,20])
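
As a rough check on whether the whole variable fits in RAM, the array size can be estimated from the dimension sizes shown by print(nc.con). The sketch below assumes that R holds each value as an 8-byte double once it has been read; that is my assumption about the in-memory representation, not something the file tells us. The `dims` vector is simply the three sizes copied by hand from the printed output.

```r
# Dimension sizes copied from the print(nc.con) output
dims <- c(lon = 195, lat = 124, time = 1826)

# Total number of values in the sfcWindmax array
n.values <- prod(dims)

# Assumed storage cost: 8 bytes per value (R double)
size.bytes <- n.values * 8
size.mib <- size.bytes / 1024^2

cat(format(n.values, big.mark = ","), "values, roughly",
    round(size.mib), "MiB in memory\n")
```

If the estimate is uncomfortably close to the available memory, reading a subset with `start` and `count` as shown earlier is the safer route.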

You can also extract one point and plot a time series.

# Extract a single grid cell (lon index 50, lat index 50) over all time steps
nc.dat<-ncvar_get(nc.con,varid="sfcWindmax",start=c(50,50,1),count=c(1,1,-1))

# extract the time
ts.dat<-ncvar_get(nc.con,varid="time")

# Plot time series
plot(ts.dat,nc.dat,xlab="time in days since 1949-12-01",ylab="windspeed (m/s)",type="l",col="blue")

# Close the connection once we are done with the file
nc_close(nc.con)
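
Since the time values are "days since 1949-12-01" on a proleptic Gregorian calendar, they can be turned into real dates with base R, which makes the x axis easier to read. The sketch below uses three made-up day offsets as a stand-in for the time vector so that it runs on its own; the variable names `ts.days` and `ts.dates` are mine.

```r
# Stand-in for the values returned by ncvar_get(nc.con, varid="time");
# these three offsets are made up for illustration
ts.days <- c(0, 1, 31)

# Origin taken from the "units" attribute: days since 1949-12-01 00:00:00
ts.dates <- as.Date(ts.days, origin = "1949-12-01")
print(ts.dates)

# With the real data, the dates can go straight on the x axis, e.g.
# plot(as.Date(ts.dat, origin = "1949-12-01"), nc.dat, type = "l")
```

This only works as is for files that use a standard or proleptic Gregorian calendar. Model output on a 360-day or no-leap calendar needs dedicated handling.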

There you have it. Remember that not all NetCDF files are laid out as tidy, regular arrays, so carefully examining the metadata and attributes of the file after opening it is key.