Question

Recommendations for workflows for opening Tanager HDF5 data?

  • April 14, 2026
  • 6 replies
  • 648 views

Emil Cherrington

Hi folks,

Do you have any recommendations regarding workflows [that work] for easily opening + reading the Tanager HDF5 data you have provided via: https://www.planet.com/data/stac/browser/tanager-core-imagery/catalog.json? I have managed to open some scenes in SNAP, but they don’t seem to be exporting properly to GeoTIFF. Any suggestions would be welcome, as I assume that others participating in the competition might run into similar issues. Thanks!

Emil

6 replies

Emil Cherrington

UPDATE: I was able to get the data processed with SNAP (version 13), which required reprojecting the “basic_sr_hdf5” product and then exporting it as a GeoTIFF. I will draft up some documentation on how that’s done and share it, as it might help folks get over the first barrier of just opening the data.


amy.rosenthal
  • Planeteer 🌎
  • April 15, 2026

Oh, that’s brilliant, @Emil Cherrington! I think our colleague @keely.roth will have some additional thoughts on this. But I’m just appreciative of your quick thinking and reporting back -- thanks for all you do!


Emil Cherrington

Thanks, Amy! I appreciate it. I look forward to hearing from Keely, but I have worked out the process of extracting the data using SNAP (which is open software) in a few steps. I now have the 4 Tanager scenes I intend to work with extracted as GeoTIFFs, and I’m uploading them to Earth Engine where I’ll be able to compare them with open data from NASA’s EMIT and PACE OCI (along the guidelines of the competition, which asks for EMIT and PRISMA). I do need to develop some tooling to work with the cumulative 866 bands of the Tanager data, which is a wee bit different from EMIT / EO-1 Hyperion / PACE OCI.

FWIW, I appreciate that there’s at least one scene over Belize. 😉


  • Planeteer 🌎
  • May 20, 2026

QGIS, GDAL, rasterio, etc. all support reading HDFEOS format data, so long as your GDAL build has the HDF5 driver enabled. You should be able to add ortho assets to QGIS directly; you’ll just need to select which dataset within the HDF5 file you want to view. A menu for that selection will pop up after you add the HDF5 file.

If you’re working in python, you can use h5py or other HDF5 libraries to read the data as a generic HDF5 file.  It won’t automatically recognize anything special about the HDFEOS conventions, but you can read in data using standard paths in h5py.  I’ll skip an example for brevity, though.
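
For anyone who does want a starting point, here is a minimal h5py sketch. The helper names `list_datasets` and `read_dataset` are made up for illustration, and the default dataset path is the one from the gdalinfo example in this thread for the basic SR product; treat it as an assumption and check it against your own file before relying on it.

```python
import h5py
import numpy as np

# Assumed dataset path (taken from the gdalinfo example for the basic SR
# product in this thread); verify it against your own file.
SR_PATH = "HDFEOS/SWATHS/HYP/Data_Fields/surface_reflectance"


def list_datasets(filename):
    """Return {dataset_path: shape} for every dataset in an HDF5 file."""
    found = {}

    def visit(name, obj):
        # visititems calls this for every group and dataset in the file
        if isinstance(obj, h5py.Dataset):
            found[name] = obj.shape

    with h5py.File(filename, "r") as f:
        f.visititems(visit)
    return found


def read_dataset(filename, path=SR_PATH):
    """Read one dataset fully into memory as a NumPy array."""
    with h5py.File(filename, "r") as f:
        return np.asarray(f[path][...])
```

Calling `list_datasets("20250223_165538_00_4001_basic_sr_hdf5.h5")` on a downloaded file should show which paths actually exist, which is worth doing before hard-coding any of them.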

If you’d like to read the data in using GDAL, you’ll similarly need to specify the dataset you want to work with rather than only the filename. E.g., you can list all of the subdatasets in the HDF5 file with gdalinfo:
$ gdalinfo 20250223_165538_00_4001_basic_sr_hdf5.h5

But to work with a specific dataset, you’d need to specify it:
$ gdalinfo 'HDF5:"20250223_165538_00_4001_basic_sr_hdf5.h5"://HDFEOS/SWATHS/HYP/Data_Fields/surface_reflectance'

Note that you can combine this with GDAL’s /vsigs/ virtual file system to avoid the need to download things. E.g.:
$ gdalinfo 'HDF5:"/vsigs/open-cogs/planet-stac/tanager1-release2-core-imagery/ortho_sr_hdf5/20250213_142737_31_4001_ortho_sr_hdf5.h5"://HDFEOS/GRIDS/HYP/Data_Fields/surface_reflectance'
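
If you end up building these subdataset names in a script, a tiny helper along these lines may save some quoting mistakes. This is only a sketch: `hdf5_subdataset_name` is a hypothetical name, and it just reproduces the `HDF5:"<file>"://<dataset>` pattern from the two gdalinfo examples above, with an optional `/vsigs/` prefix.

```python
def hdf5_subdataset_name(filename, dataset, gcs_object=False):
    """Build a GDAL-style HDF5 subdataset name.

    filename   -- local path, or "bucket/key" when gcs_object is True
    dataset    -- path of the dataset inside the HDF5 file
    gcs_object -- if True, prefix the file with GDAL's /vsigs/ handler
    """
    if gcs_object:
        filename = "/vsigs/" + filename.lstrip("/")
    # Strip any leading slash so the result always has exactly "://" + path
    return f'HDF5:"{filename}"://{dataset.lstrip("/")}'
```

The returned string can be passed to gdalinfo (inside single quotes in a shell) or to `rasterio.open`.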

If the GDAL build behind your rasterio install includes the HDF5 driver, you can also use rasterio to access Tanager data in the same way. However, not all rasterio installs include it. Wheels from PyPI (whether installed with pip, uv, etc.) often won’t. On conda, you’ll need to install libgdal-hdf5.

Here’s an example of plotting the spectrum at a point in the center of one of the open data scenes using rasterio without downloading any files:

import matplotlib.pyplot as plt
import numpy as np
import rasterio as rio

scene = '20250213_142737_31_4001'
gs_loc = f'open-cogs/planet-stac/tanager1-release2-core-imagery/ortho_sr_hdf5/{scene}_ortho_sr_hdf5.h5'
# Note: this is _also_ specific to the ortho SR asset.
path = f'HDF5:"/vsigs/{gs_loc}"://HDFEOS/GRIDS/HYP/Data_Fields/surface_reflectance'

with rio.open(path, 'r') as src:
    # Select a point in the middle of the file
    spectra, = src.sample([src.xy(src.height // 2, src.width // 2)])

    # Get the wavelength metadata + valid flag
    idx = range(1, src.count + 1)
    wavelengths = np.array([float(src.tags(i)['wavelengths']) for i in idx])
    valid_sr = np.array([bool(int(src.tags(i)['good_wavelength'])) for i in idx])

# Quick plot...
spectra = np.ma.masked_where(~valid_sr, spectra)
fig, ax = plt.subplots()
ax.plot(wavelengths, spectra, color='lightblue', lw=2)
ax.set(xlabel='Wavelength (nm)', ylabel='Surface Reflectance (unitless)', title=scene)
ax.margins(x=0, y=0.05)
plt.show()
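
With this many bands, it can also help to look bands up by wavelength rather than by index. A small sketch, assuming you already have per-band wavelength and validity arrays like `wavelengths` and `valid_sr` above; `nearest_valid_band` is an illustrative name, not part of rasterio:

```python
import numpy as np


def nearest_valid_band(wavelengths, valid, target_nm):
    """Return the 1-based index of the valid band closest to target_nm."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    valid = np.asarray(valid, dtype=bool)
    if not valid.any():
        raise ValueError("no valid bands to choose from")
    # Give invalid bands an infinite distance so they can never win
    distance = np.where(valid, np.abs(wavelengths - target_nm), np.inf)
    # rasterio/GDAL band indexes start at 1, NumPy indexes at 0
    return int(np.argmin(distance)) + 1
```

For example, `nearest_valid_band(wavelengths, valid_sr, 650)` would give a band index you could pass to `src.read(...)` for a quick red-band look.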


Emil Cherrington

Thanks, @joe.kington, including for the recommendation that the Tanager data can be read in QGIS. I would like to communicate an observation from my colleagues at NASA ARSET, based on their various user feedback surveys: GIS users generally want data in GeoTIFF format, and not in HDF or NetCDF. I realize that you want to keep the fidelity of the data, which is why you provide the data in those formats.

That said, not having the data automatically available in a GIS format (i.e., GeoTIFF) unfortunately makes users jump through additional hoops, and there are users who will just forgo jumping through the hoops and avoid using HDF / NetCDF data altogether. That is, it’s good to meet users where they are, figuratively speaking. With the PlanetScope data, Planet readily provides the data in GeoTIFF format, so one would think that it wouldn’t be such a heavy lift to provide the Tanager open STAC data in GeoTIFF format.

I’m just sharing a user perspective. And on behalf of the user community, I’d also like to say that we really appreciate that Planet has been releasing Tanager data through the open STAC.


  • Planeteer 🌎
  • May 29, 2026

@Emil Cherrington - The issue is that GIS software won’t be able to use any geolocation information if the data is in GeoTIFF form. GeoTIFF would actually be _less_ usable for desktop GIS users. This is not raster data in the conventional sense. Just for clarity, HDFEOS is a format designed by NASA. It’s not anything Tanager-specific, and was chosen partly for compatibility with NASA’s datasets.

Most hyperspectral data, and Tanager data specifically, use geolocation arrays as the primary model for providing spatial information. That means it’s not raster data in the traditional sense (i.e. the pixels are not regularly sampled in ground/map space). I’m not aware of any GIS software that will support geolocation arrays in external files. ArcGIS and QGIS certainly don’t. GDAL can via custom transformer options + a VRT, but it’s tricky to set up (and QGIS using that VRT wouldn’t display things correctly without the transformer options, which you can’t supply in QGIS). Newer QGIS versions at least support using basic HDFEOS products correctly, while nothing would support using the equivalent data in GeoTIFF form.
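
To make the geolocation-array idea a bit more concrete, here is a rough sketch using synthetic arrays. Everything in it is an assumption for illustration: `lon` and `lat` stand in for per-pixel longitude/latitude arrays (they are not actual Tanager dataset names), and `nearest_pixel` is a made-up helper. The point is that with geolocation arrays you find a pixel by searching the coordinate arrays, not by applying a single affine transform.

```python
import numpy as np


def nearest_pixel(lon, lat, target_lon, target_lat):
    """Return (row, col) of the pixel whose centre is closest to the target.

    Uses a flat-earth approximation, which is only reasonable for small
    scenes away from the poles and the antimeridian.
    """
    lon = np.asarray(lon, dtype=float)
    lat = np.asarray(lat, dtype=float)
    # Scale longitude differences so degrees east-west and north-south
    # are roughly comparable at this latitude
    dx = (lon - target_lon) * np.cos(np.radians(target_lat))
    dy = lat - target_lat
    flat_index = np.argmin(dx**2 + dy**2)
    row, col = np.unravel_index(flat_index, lon.shape)
    return int(row), int(col)


# Synthetic, slightly skewed 3x4 "swath": every pixel carries its own
# coordinates, and rows/columns are not aligned with north/east.
rows, cols = np.mgrid[0:3, 0:4]
lon = -88.5 + 0.01 * cols + 0.002 * rows
lat = 17.2 - 0.01 * rows + 0.001 * cols
```

A brute-force search like this is fine for a single point; for many points you would want a spatial index instead.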

In addition, each HDFEOS file contains up to 17 different datasets that would each be their own GeoTIFF, and you’d need to download many of those to use any one of them. GeoTIFF would be awkward for that reason, as well as not having any usable location information.

It’s true the ortho products could be represented as GeoTIFFs, but those cannot reasonably be used for analysis. They’ve been resampled using nearest neighbor, and the pixel extent no longer correctly corresponds to the area that the spectra came from. For hyperspectral data, it’s vital to use data in its basic form (i.e. not raster data at all) and not ortho products when doing analysis.

I’d urge caution around trying to do anything other than visualization with these datasets in GeoTIFF form.