#+OPTIONS: ':nil *:t -:t ::t <:t H:3 \n:nil ^:t arch:headline author:t c:nil
#+OPTIONS: creator:nil d:(not "LOGBOOK") date:t e:t email:nil f:t inline:t
#+OPTIONS: num:nil p:nil pri:nil prop:nil stat:t tags:t tasks:t tex:t timestamp:t
#+OPTIONS: title:t toc:nil todo:t |:t
#+TITLE: Exploring GOES 13 Satellite Data with Python and Emacs
#+DATE: <2017-01-31 Tue>
#+AUTHOR: Julien Chastang
#+EMAIL: julien dot c dot chastang at gmail
#+LANGUAGE: en
#+SELECT_TAGS: export
#+EXCLUDE_TAGS: noexport
#+CREATOR: Emacs 24.5.1 (Org mode 8.3.6)

#+SETUPFILE: ../../templates/level-1.org
#+INCLUDE: ../../templates/level-1.org

#+PROPERTY: header-args :tangle goes13.py :exports both

* <2017-02-01 Wed>

file:../../static/goes.png

I recently came across some GOES 13 satellite data (available here) I wanted to visualize. In order to do that, we can make use of the Emacs Docker container I wrote about previously that has a full-fledged Scientific Python environment. The data in question are in netCDF format. Everything that follows could be done with a Jupyter notebook, but it's often best to keep our workflows within Emacs and org-mode whenever possible.

#+BEGIN_SRC emacs-lisp :results silent :exports none  :tangle no
  (setq org-confirm-babel-evaluate nil)
  (setq org-export-babel-evaluate nil)
#+END_SRC

In order to crack open and visualize the data, we will need our usual suite of Scientific Python APIs.

#+BEGIN_SRC ipython :session :results silent
  import numpy as np
  import numpy.ma as ma
  import matplotlib.pyplot as plt
  import netCDF4
  from cartopy import config
  import cartopy.crs as ccrs

  # for inlining plots
  %matplotlib inline
#+END_SRC

Now, open the data via OpenDAP with the netcdf4-python API and print the header:

#+BEGIN_SRC ipython :session :results output
  opendapurl = "http://jetstream.unidata.ucar.edu/repository/opendap/5c36288d-208d-46f4-a8c7-228127a6e0f2/entry.das" 
  nc = netCDF4.Dataset(opendapurl)
  # if opening as a local file
  # nc = netCDF4.Dataset("goes13.2017.001.114518.BAND_04.nc")
  print(nc)
#+END_SRC

#+RESULTS:
#+begin_example
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF3_CLASSIC data model, file format DAP2):
    Conventions: CF-1.4
    Source: McIDAS Area File
    Satellite_Sensor: G-13 IMG    
    DODS.strlen: 80
    DODS.dimName: auditSize
    dimensions(sizes): auditCount(3), maxStrlen64(64), time(1), xc(1304), yc(676)
    variables(dimensions): int32 version(), int32 sensorID(), int32 imageDate(), int32 imageTime(), int32 startLine(), int32 startElem(), int32 time(time), int32 dataWidth(), int32 lineRes(), int32 elemRes(), int32 prefixSize(), int32 crDate(), int32 crTime(), int32 bands(), |S1 auditTrail(auditCount,maxStrlen64), float32 data(time,yc,xc), float32 lat(yc,xc), float32 lon(yc,xc)
    groups: 

#+end_example


The header information reveals we are dealing with a McIDAS Area File that has been converted to netCDF. It has no coordinate variables. The ~Satellite Sensor~ global attribute hints that we have GOES-13 satellite data. Many of the variables (e.g., ~elemRes~) refer to Area file metadata. The most interesting variables are probably ~lat~, ~lon~ and ~data~. Let's look at the ~lat~ variable in more detail:

#+BEGIN_SRC ipython :session :results output
  ncvars = nc.variables
  lat = ncvars['lat']
  # print the variable attributes
  print(lat)
#+END_SRC

#+RESULTS:
: <class 'netCDF4._netCDF4.Variable'>
: float32 lat(yc, xc)
:     long_name: lat
:     units: degrees_north
: unlimited dimensions: 
: current shape = (676, 1304)
: filling off
: 

~lat~ variable contains:

#+BEGIN_SRC ipython :session :results output
  print(lat[:])
#+END_SRC

#+RESULTS:
#+begin_example
[[  2.14328934e+09   2.14328934e+09   2.14328934e+09 ...,   2.14328934e+09
    2.14328934e+09   2.14328934e+09]
 [  2.14328934e+09   2.14328934e+09   2.14328934e+09 ...,   2.14328934e+09
    2.14328934e+09   2.14328934e+09]
 [  2.14328934e+09   2.14328934e+09   2.14328934e+09 ...,   2.14328934e+09
    2.14328934e+09   2.14328934e+09]
 ..., 
 [  2.14328934e+09   2.14328934e+09   2.14328934e+09 ...,   2.14328934e+09
    2.14328934e+09   2.14328934e+09]
 [  2.14328934e+09   2.14328934e+09   2.14328934e+09 ...,   2.14328934e+09
    2.14328934e+09   2.14328934e+09]
 [  2.14328934e+09   2.14328934e+09   2.14328934e+09 ...,   2.14328934e+09
    2.14328934e+09   2.14328934e+09]]
#+end_example


Even though the variable metadata indicates no fill value, we've probably stumbled upon one with the ~2.14328934e+09~ repeated values. Let's create a numpy mask that will mask out that part of the two-dimensional array:

#+BEGIN_SRC ipython :session :results silent
  datamask = ma.masked_values(lat[:], 2.14328934e+09)
#+END_SRC

Before we make use of our mask, let's get the ~lon~ and ~data~ variables:

#+BEGIN_SRC ipython :session :results output
  lon = ncvars['lon']
  data = ncvars['data']
  print(data)
#+END_SRC

#+RESULTS:
: <class 'netCDF4._netCDF4.Variable'>
: float32 data(time, yc, xc)
:     long_name: 0-255 Brightness Temperature
:     type: VISR
:     coordinates: lon lat
: unlimited dimensions: 
: current shape = (1, 676, 1304)
: filling off
: 

The ~data~ variable contains satellite brightness temperature in a range of values between ~0~ and ~255~. Note the shape of the ~data~ variable is the same as the ~lat~ and ~lon~ but with an additional time dimension of length one. Let's now employ our ~datamask~ described earlier. For the ~data~ variable, we will retrieve the first and only time step (~data[0]~) before masking.

#+BEGIN_SRC ipython :session  :results silent
  datam = ma.masked_array(data[0], datamask.mask)
  latm = ma.masked_array(lat, datamask.mask)
  lonm = ma.masked_array(lon, datamask.mask)
#+END_SRC

At this point, all we have to do is plot up the data with the help of the Cartopy API.
#+BEGIN_SRC ipython :session :file ../../static/goes.png :exports both
  plt.figure(figsize=(7, 7))
  ax = plt.axes(projection=ccrs.PlateCarree())
  plt.contourf(lonm, latm, datam, 60, cmap='bone', transform=ccrs.PlateCarree())
  ax.coastlines()
  plt.show()
#+END_SRC

#+RESULTS:
file:../../static/goes.png