Jupyter Book and GitHub repo.
MODIS#
Medium resolution spectrometer
Introduction#
This notebook concerns comparison of MODIS and VIIRS remote sensing observations of the sea surface with data from OOI, particularly the shallow profiler. OB.DAAC is the ocean biology resource. There are multiple data types but of particular interest are SST (Sea Surface Temperature), Ocean Color (OC) and Inherent Optical Properties (IOPs).
Objectives#
Check: Identify, order and examine SST and chlorophyll data products
Sort observations by date; display as time series (discard unusable)
Extract data for sites of interest and compare to profiler data
Look into data attributes such as effective optical depth
MSLA
Identify surface chlorophyll concentration data products#
To commence: the Goddard data portal. The starting point for getting and viewing data is hosted by NASA Goddard.
This is a multi-section document on the nature of MODIS-derived chlorophyll estimates. At the bottom of the page there are two links for data
This is a thumbnail-plus-parameter search page. I narrowed the time window, to March 2021, selected ‘mapped’ (SMI?) data and selected 4km, the finer resolution option. The “go” button takes me to a series of URLs.
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021060.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021061.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021062.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021063.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021064.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021065.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021066.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021067.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021068.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021069.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021070.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021071.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021072.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021073.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021074.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021075.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021076.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021077.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021078.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021079.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021080.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021081.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021082.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021083.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021084.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021085.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021086.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021087.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021088.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021089.L3m_DAY_CHL_chlor_a_4km.nc
https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/A2021090.L3m_DAY_CHL_chlor_a_4km.nc
Ocean Biology Distributed Active Archive Center (OB.DAAC)
Ocean color file format web page
Binned data may include subordinate files. Binning is just quantization
SMI stands for Standard Mapped Image products; produced from binned data
Let’s try out the Level 3 browser
Do everything twice: Once for Aqua, once for Terra
Use Extract (not Download) to select a bounding box
Bounding box goes from coast out to the Juan de Fuca west plate boundary
North 49.0, West -130.5, East -122.5, South 43.0
Starting “easy” with monthly products; 8-day products still have a lot of missing data
Order can be confirmed in browser (takes a few minutes) or by email (ditto)
Results persist for a week or so; don’t forget to do Terra and Aqua both
Have not yet examined the Ocean Data Product Users Guide
Here is the helpful How to download data page.
Another option: NASA Earth Observations (NEO) site
No bounding box
Monthly or 8-day
Download raw data
yields 45MB NetCDF files with 0.1 deg (11km) resolution
Retrieve order data#
Download the file
http_manifest.txt
and place the contents in a browser address barThis opens a “save
requested_files_1.tar
dialog box: Save and usetar -xvf
Since I ordered 1 year at monthly intervals with end time = 01-JAN-2019… it gave me 13 files
Other aspects of data access#
MODIS Web Mapping Service Installation website
Revisit 2023#
minimize includes: See
modis.py
Noted
group
use in NetCDF filesOrder: IOP, SST and OC for July 2021, bounding box lat 43 – 49, lon (-123.5) – (-130.5)
This came to 5GB; dropped the IOP for now
External directory is ~/data/ocean/modis/jul21/oc and /sst
from matplotlib import pyplot as plt
from modis import *
print('\nJupyter Notebook running Python {}'.format(sys.version_info[0]))
Jupyter Notebook running Python 3
# to do: explain what is going on here (?) archaic
# from IPython.core.interactiveshell import InteractiveShell
# InteractiveShell.ast_node_interactivity = 'all' # default is ‘last_expr’
# automatically reload external modules such as golive_library if they have changed
# %load_ext autoreload
# %autoreload 2
Groups in NetCDF data#
Groups have been added to NetCDF file structure; and they can be viewed as mutually independent (if related). To access them requires knowing their names; which is either a documentation issue or a bootstrap issue. Here we take some bootstrap code:
def expand_var_dict(var_dict, group):
for var_key, _ in group.variables.items():
if group.name not in var_dict.keys():
var_dict[group.name] = list()
var_dict[group.name].append(var_key)
for _, sub_group in group.groups.items():
expand_var_dict(var_dict, sub_group)
all_vars = dict()
with nc.Dataset("fnm", "r") as root_group:
expand_var_dict(all_vars, root_group)
print(all_vars)
This produces a dictionary listing groups and sub-groups:
{
'sensor_band_parameters': ['wavelength', 'vcal_gain', 'vcal_offset', 'F0', 'aw', 'bbw', 'k_oz', 'k_no2', 'Tau_r'],
'scan_line_attributes': ['year', 'day', 'msec', 'detnum', 'mside', 'slon', 'clon', 'elon', 'slat', 'clat', 'elat', 'csol_z'],
'geophysical_data': ['sst', 'qual_sst', 'flags_sst', 'bias_sst', 'stdv_sst', 'sstref', 'l2_flags'],
'navigation_data': ['longitude', 'latitude', 'cntl_pt_cols', 'cntl_pt_rows', 'tilt']}
The important consequences are that (a) there is access to sensor data (derived SST in this case) and (b) there will be some means of translating pixel coordinates of SST to latitude / longitude.
base_dir = "../../../data/ocean/modis/jul21/sst/"
f = os.listdir(base_dir)
print('There are ' + str(len(f)) + ' data files')
for fnm in f[3:7]:
print(fnm)
ds_gd=xr.open_dataset(base_dir + fnm, group='geophysical_data')
ds_gd.sst.plot(vmin=4, vmax=22)
plt.show()
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
Cell In[3], line 2
1 base_dir = "../../../data/ocean/modis/jul21/sst/"
----> 2 f = os.listdir(base_dir)
3 print('There are ' + str(len(f)) + ' data files')
4 for fnm in f[3:7]:
FileNotFoundError: [Errno 2] No such file or directory: '../../../data/ocean/modis/jul21/sst/'
base_dir = "../../../data/ocean/modis/jul21/sst/"
fnm = 'SNPP_VIIRS.20210722T212401.L2.SST.x.nc'
ds_gd=xr.open_dataset(base_dir + fnm, group='geophysical_data')
ds_nd=xr.open_dataset(base_dir + fnm, group='navigation_data')
ds_nd, ds_gd
(<xarray.Dataset>
Dimensions: (number_of_lines: 1469, pixel_control_points: 1241)
Dimensions without coordinates: number_of_lines, pixel_control_points
Data variables:
longitude (number_of_lines, pixel_control_points) float32 ...
latitude (number_of_lines, pixel_control_points) float32 ...
cntl_pt_cols (pixel_control_points) float64 ...
cntl_pt_rows (number_of_lines) float64 ...
tilt (number_of_lines) float32 ...
Attributes:
gringpointlongitude: [-119.198685 -131.75633 -136.13136 -121.31009 ]
gringpointlatitude: [41.561226 39.849297 49.130085 51.105026]
gringpointsequence: [1 2 3 4],
<xarray.Dataset>
Dimensions: (number_of_lines: 1469, pixels_per_line: 1241)
Dimensions without coordinates: number_of_lines, pixels_per_line
Data variables:
sst (number_of_lines, pixels_per_line) float32 ...
qual_sst (number_of_lines, pixels_per_line) float32 ...
flags_sst (number_of_lines, pixels_per_line) float32 ...
bias_sst (number_of_lines, pixels_per_line) float32 ...
stdv_sst (number_of_lines, pixels_per_line) float32 ...
sstref (number_of_lines, pixels_per_line) float32 ...
l2_flags (number_of_lines, pixels_per_line) int32 ...)
ds_gd.sst.plot(vmin=4.,vmax=22.,xincrease=False)
<matplotlib.collections.QuadMesh at 0x7f78e8cd1950>

ds_nd.latitude.plot(vmin=39.,vmax=52,xincrease=False)
<matplotlib.collections.QuadMesh at 0x7f78e8baa050>

ds_nd.longitude.plot(vmin=-137,vmax=-119,xincrease=False)
<matplotlib.collections.QuadMesh at 0x7f78e8c75310>

ds_sbp = xr.open_dataset(fnm, group='sensor_band_parameters')
ds_sla = xr.open_dataset(fnm, group='scan_line_attributes')
ds_gd = xr.open_dataset(fnm, group='geophysical_data')
ds_nd = xr.open_dataset(fnm, group='navigation_data')
Sort sea surface temperature (SST) datasets by time#
Produces a sorted list of triples: (datetime-string, corresponding datetime64, original filename)
base_dir = "../../../data/ocean/modis/jul21/sst/"
f = os.listdir(base_dir)
sst_files_sorted = []
for fnm in f:
dtstring = fnm.rstrip('.L2.SST.x.nc')
if dtstring[0:12] == 'TERRA_MODIS.': dtstring = dtstring.lstrip('TERRA_MODIS.')
elif dtstring[0:11] == 'AQUA_MODIS.': dtstring = dtstring.lstrip('AQUA_MODIS.')
elif dtstring[0:11] == 'SNPP_VIIRS.': dtstring = dtstring.lstrip('SNPP_VIIRS.')
else:
print( 'NON-MATCH filename found ' + fnm)
break
expanded_dtstring = dtstring[0:4] + '-' + dtstring[4:6] + '-' + dtstring[6:11] + ':' + dtstring[11:13] + ':' + dtstring[13:15]
dtvalue = dt64(expanded_dtstring) # Example: dt64('2017-08-21T07:30:12')
sst_files_sorted.append((dtstring, dtvalue, fnm))
sst_files_sorted.sort(key=lambda a: a[0]) # Check: print(sst_files_sorted[0], sst_files_sorted[17])
# This cell sets site location lat/lon/depth
eoLat, eoLon, eoDepth = 44. + 22./60. + 10./3600., -(124. + 57./60. + 15./3600.), 582.
osbLat, osbLon, osbDepth = 44. + 30./60. + 55./3600., -(125. + 23./60. + 23./3600.), 2906.
axbLat, axbLon, axbDepth = 45. + 49./60. + 5./3600., -(129. + 45./60. + 13./3600.), 2605.
site_data = {'eo': { 'name' : 'Endurance Offshore', 'decimal_lat' : eoLat, 'decimal_lon' : eoLon, 'depth' : eoDepth },
'osb': { 'name' : 'Oregon Slope Base', 'decimal_lat' : osbLat, 'decimal_lon' : osbLon, 'depth' : osbDepth },
'axb': { 'name' : 'Axial Base', 'decimal_lat' : axbLat, 'decimal_lon' : axbLon, 'depth' : axbDepth }}
print('uh oh... OSB should give 44.52897 -125.38966; check this: ' + str(site_data['osb']['decimal_lat']) + ' ' + str(site_data['osb']['decimal_lon']))
uh oh... OSB should give 44.52897 -125.38966; check this: 44.515277777777776 -125.38972222222223
Southern Hydrate Ridge: shrLat, shrLon, shrDep = 44. + 34./60. + 9./3600., -(125 + 8./60. + 53./3600.), 778.
Axial ASHES Vent Field: ashesLat, ashesLon, ashesDep = 45. + 56./60. + 1./3600., -(130. + 0./60. + 50./3600.), 1543.
Axial Caldera Center: axcLat, axcLon, axcDep = 45. + 57./60. + 17./3600., -(130. + 0./60. + 32./3600.), 1528.
Axial Caldera East: axeLat, axeLon, axeDep = 45. + 56./60. + 23./3600., -(129. + 58./60. + 27./3600.), 1516.
Axial Caldera International: axiLat, axiLon, axiDep = 45. + 53./60. + 35./3600., -(129. + 58./60. + 44./3600.), 1520.
# for i in ca: print(i[0], i[1], i[2])
# for i in range(3): for j in range(3): print(m.MODISA_L3m_SST_2014_sst[0][i][j].values)
#
# testLat = ca[1][1] # 44.51528
# testLon = ca[1][2] # -125.38972
# print(testLat, testLon)
sst = []
for i in range(len(ca)):
# print(m.MODISA_L3m_SST_2014_sst.sel(lat=ca[i][1], lon=ca[i][2], method = 'nearest'))
temp = modis.MODISA_L3m_SST_2014_sst.sel(lat=ca[i][1], lon=ca[i][2], method = 'nearest').values
sst.append(round(float(temp[0]), 2))
# Here are MODIS-derived day-time sea surface temperatures at the cabled array locations on 2016-01-01T00:15:10
print(sst)
Convert line/sample to lat/lon#
On this GSFC ocean color page we have details on data groups and format. Quoting:
The data objects in the group
geophysical_data
are derived from the sensor science data. This group also includes the data quality flags. All of the data objects have dimensions number_of_lines x pixels_per_line. Most parameters are stored as integers, which are scaled according to the attributes scale_factor and add_offset attached to the data object.
Parameters that are wavelength-specific (e.g., Rrs) have separate data objects for each band used to derive the parameter; a list of wavelengths for each sensor provided in the Mission and Sensors entry. The SST and SST4 are stored in separate archive products with their respective quality fields. The specific fields are listed in the Level-2 File Format for each sensor. A complete list of parameters that can be output by the software is given in the L2gen User Guide.
The Level-2 data flags and quality fields are stored in the l2_flags data object in each Level-2 file. This object has attributes flag_masks and flag_meanings that provide the names of the algorithms used in determining the setting of the corresponding bits in l2_flags (the least significant bit being the first bit). The algorithms associated with these names, and the use of the corresponding bits as masks or as flags, are described in volumes of the SeaWiFS TM Series. Note that for SST, only the LAND, SSTWARN and SSTFAIL flags are meaningful, as the quality indicators are stored in sst_qual and sst4_qual, respectively.
Navigation:
The data objects in the group navigation_data contain the geolocation information for the data product. This includes the latitudes and longitudes of the observed locations, the control point indices (needed if the latitudes and longitudes are subsampled and not full-resolution), and the G-ring coordinates for the swath.
nc
ds_gd.sst.plot()
ds.chlor_a.plot()
# bounding box latitude 43 - 47, longitude -123 - -131
datadir = os.getenv("HOME") + "/modis2019/"
datafile = datadir + "A2019283.L3b_DAY_CHL.x.nc"
ds = xr.open_dataset(datafile)
ds
# ds.MODISA_L3b_DAY_CHL.plot(figsize=(12,12))
ds_modis = xr.open_mfdataset()
ds_modis_0 = ds_modis[0]
print(ds_modis_0.MODISA_L3m_SST_2014_sst)
print(ds_modis_0.lat.values[0], modis.lon.values[0], modis.lat.values[-1], modis.lon.values[-1])
modisplot = ds_modis_0.MODISA_L3m_SST_2014_sst.plot(figsize=(12,12))
# bug: This does not work properly yet
# plt.savefig(data + '/images/modis_SST.png')
import matplotlib
img=matplotlib.image.imread(image_d + 'modis_sst.png')
imgplot = plt.imshow(img)
# plt.show()
# The goal is to overlay an image (created earlier; .png file) with a controllable opacity
#
# How I got to opacity control through a sequence of print statements:
# print(thisMap)
# print(thisMap.layers)
# print(thisMap.layers[1]) which is cheating by hardcoding the layer index as 1
# print(thisMap.layers[1].opacity)
# therefore setting thisMap.layers[1].opacity = 0.2 works just fine
def ChangeMapOpacity(opacity):thisMap.layers[1].opacity = opacity
from ipyleaflet import Map, ImageOverlay, WMSLayer
opacity = 0.7
thisMap = Map(center=(47, -129), zoom=5, layout=Layout(width='100%', height='600px'))
# The path to the image for overlay is relative; I could not get an absolute path to work...
sourceImage = '../data/images/modis_sst.png'
image_layer = ImageOverlay(url=sourceImage, bounds=((40.2, -138.58), (53.8, -122.45)), opacity=opacity)
thisMap.add_layer(image_layer)
interact(ChangeMapOpacity, opacity = widgets.FloatSlider(min=0., max=1., step=0.025, value=opacity,
continuous_update=False, display='Opacity'))
# It seems that 'thisMap' must be the last line of code in the cell, hence placing interact() above
thisMap
# The following works but does not track zoom with new higher resolution tiles
# wms = WMSLayer(url="https://demo.boundlessgeo.com/geoserver/ows?",layers="nasa:bluemarble")
# thisMap.add_layer(wms)
# And this works properly albeit with limited tiling resolution
from ipyleaflet import Map, WMSLayer
wms = WMSLayer(
url="https://demo.boundlessgeo.com/geoserver/ows?",
layers="nasa:bluemarble"
)
m = Map(layers=(wms, ), center=(42.5531, -48.6914), zoom=3)
m
def pp(ds):
ds['time'] = xr.Variable('time', [pd.Timestamp(ds.attrs['time_coverage_start']) + pd.Timedelta(15, unit='d')])
return ds
cb=xr.open_mfdataset('modis_chlora_2017_*.nc', preprocess = pp, concat_dim='time').chlor_a.to_dataset()
# cb.chlor_a shows units as mg / cubic meter, numerically equivalent to ug / liter
p,a=plt.subplots(figsize=(13,9))
for x in np.arange(-125.38966-0.1, -125.38966+0.1, 0.1):
for y in np.arange(44.52897-0.1, 44.52897+0.1, 0.1):
a.plot(cb.time.values, cb.chlor_a.sel(lon=x, lat=y, method='nearest').values, 'o-')
# cb.to_netcdf('modis_chlora_5_month_timeseries.nc')
cb
latselect = np.arange(44.52897-1., 44.52897+1.00001, 0.10)
lonselect = np.arange(-125.38966-1., -125.38966+1.00001, 0.10)
local = cb.sel(lat = latselect, lon = lonselect, method = 'nearest')
local
%%time
p,a=plt.subplots(figsize=(13,9))
for i in range(9,12):
for j in range(9,12):
a.plot(local.time.values, local.chlor_a.isel(lat=i,lon=j).values, 'o-')
a.plot(local.time.values, local.chlor_a.isel(lat=10,lon=10).values, 'o-', linewidth=6)
local.to_netcdf('local.nc')
local=xr.open_dataset('local.nc')
local
l1
# p,a=plt.subplots(figsize=(13,9))
# for t in l1.chlor_a: print(t)
# a.plot(local.time.values, t.values, 'o-')
for i in range(21):
for j in range(21):
print(l1.chlor_a[0:5, i, j].values)
local = cb.reindex_like()
local.chlor_a[0].plot(figsize=(14,9),cmap=plt.cm.rainbow,vmin=0.00, vmax=4.0)
plt.xlim(-128,-124); plt.ylim(43.5,45.5); plt.plot(-125.38966, 44.52897, '^')
local.chlor_a[1].plot(figsize=(14,9),cmap=plt.cm.rainbow,vmin=0.00, vmax=4.0)
plt.xlim(-128,-124); plt.ylim(43.5,45.5); plt.plot(-125.38966, 44.52897, '^')
local.chlor_a[2].plot(figsize=(14,9),cmap=plt.cm.rainbow,vmin=0.00, vmax=4.0)
plt.xlim(-128,-124); plt.ylim(43.5,45.5); plt.plot(-125.38966, 44.52897, '^')
local.chlor_a[3].plot(figsize=(14,9),cmap=plt.cm.rainbow,vmin=0.00, vmax=4.0)
plt.xlim(-128,-124); plt.ylim(43.5,45.5); plt.plot(-125.38966, 44.52897, '^')
local.chlor_a[4].plot(figsize=(14,9),cmap=plt.cm.rainbow,vmin=0.00, vmax=4.0)
plt.xlim(-128,-124); plt.ylim(43.5,45.5); plt.plot(-125.38966, 44.52897, '^')
ca=xr.open_rasterio('./chlora.tif').to_dataset(name='chlora')
ca
ca.chlora
cad=ca.where(ca.chlora<1.4)
cad.chlora.plot(figsize=(14,9),cmap=plt.cm.rainbow,vmin=0.00, vmax=1.0)
plt.xlim(-128,-124)
plt.ylim(43.5,45.5)
plt.plot(-125.38966, 44.52897, '^')
# a=p.axes()
# a.set(xlim=(-155,-135))
float(ca.sel(band=0,y=44.52897,x=-125.38966, method='nearest').values)
for x in np.arange(-125.5, -125.0, 0.1):
for y in np.arange(44.3, 44.8, 0.1):
print(float(ca.sel(band=0,y=y,x=x, method='nearest').values))