I am new to Planet data and I am doing some exploratory work and research using Python. Usually I use libraries such as rasterio to manipulate raster data. However, it seems that with the way Planet data is delivered (I am using PSScene 1B data), it is hard to read in the raster with the metadata directly. I find myself trying to parse the relevant information from the XML metadata file manually.
Is there a recommended way to solve this issue, such that I don’t have to write my own parsing functions?
Any tips on working with Planet data in Python in general would be much appreciated.
Best answer by Miguel Castro Gomez
To better understand your situation, how are you getting the Planet data? Via Explorer or APIs (Data or Orders)?
Once you have your Planet data as a raster file on your local system, you can read it with rasterio.
Once the raster is open as a rasterio dataset, you can explore some basic metadata, such as the band count, dimensions, CRS and transform.
You can access other relevant metadata using rasterio's built-in functionality.
If you want to access additional metadata for a given product, you will have to use the metadata files available for download.
If you are using Planet’s APIs to search and download your products, you can get plenty of metadata about a product from the API response. Can you share more of your workflow so I can guide you on that?
Additionally, have a look at our GitHub repo, where you can find plenty of notebooks on how to read and process Planet data with Python.
Thanks for your reply. For the time being I am just downloading some sample images with the Planet Explorer in order to test the rest of the pipeline. Specifically I downloaded data of Planet bundle_type “basic_analytic_udm2” and item_id “20201124_031355_67_1064” and “20201126_031738_79_105d”.
I know that these are PLANETSCOPE BASIC ANALYTIC SCENE PRODUCTS, so they are not cartographically projected. However, in the metadata.xml the image corner coordinates are given in EPSG:4326, which would allow an approximate projection at least. When I open the tif files in QGIS it automatically recognizes the coordinate reference system and displays the images correctly. In rasterio no CRS is recognized at all.
The end goal of all of this is to use these images for stereo analysis and test whether or not we can reconstruct a local DSM using the Planet data. That is why I am using the Basic Analytic Scene Product, because we need the RPC values.
Currently I am trying to parse the relevant metadata from the xml file manually, in order to reconstruct the cartographic projection using GDAL. However, it’s a bit tedious. Seeing as QGIS is able to read the metadata automatically, I was hoping maybe you know of some way to do the same in Python.
QGIS may by default read additional metadata from the XML files, but if you want to accomplish that in Python you will have to code it. The advantage is that, as Planet data follows a standard structure, you could write a function and get that information for any given product.
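Such a function could be sketched with the standard library alone. The element names used here (`topLeft`, `topRight`, `bottomRight`, `bottomLeft`, each with `latitude` and `longitude` children) are an assumption about the metadata layout, so check them against your own metadata.xml. The sketch matches on local tag names so it does not depend on the exact namespace URIs.

```python
import xml.etree.ElementTree as ET

# Assumed corner element names; verify against your own metadata file.
CORNERS = ("topLeft", "topRight", "bottomRight", "bottomLeft")


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def read_corner_coordinates(xml_source):
    """Return {corner_name: (lon, lat)} from a metadata XML path or file object."""
    root = ET.parse(xml_source).getroot()
    corners = {}
    for elem in root.iter():
        name = local_name(elem.tag)
        if name in CORNERS:
            # Map each child's local name to its text, e.g. 'latitude' -> '1.5'
            values = {local_name(child.tag): child.text for child in elem}
            corners[name] = (float(values["longitude"]), float(values["latitude"]))
    return corners
```

The four corners could then be paired with the image's pixel corners as ground control points, for instance with `rasterio.transform.from_gcps`, to build an approximate affine transform. That is only a coarse fit and not a substitute for the RPCs.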
If you choose to get the data with the Data API, for example, you get information about a product before activating and downloading it, and it is easier to extract metadata from that response (provided as a dict).
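To illustrate the idea without calling the API, here is a small sketch that pulls a few fields out of an item that has already been loaded as a Python dict. It assumes the item is a GeoJSON-style feature with a `properties` mapping. The sample item and all of its values are invented placeholders, not real scene metadata, and the property names are assumptions to check against an actual response.

```python
DEFAULT_KEYS = ("acquired", "cloud_cover", "sun_azimuth", "sun_elevation", "view_angle")


def summarize_item(item, keys=DEFAULT_KEYS):
    """Pick selected metadata fields out of an item dict."""
    props = item.get("properties", {})
    summary = {"id": item.get("id")}
    # Missing properties come back as None instead of raising KeyError
    summary.update({key: props.get(key) for key in keys})
    return summary


# Hypothetical item with made-up values, only to show the shape of the data
example_item = {
    "id": "example_item_id",
    "type": "Feature",
    "properties": {
        "acquired": "2020-01-01T00:00:00Z",
        "cloud_cover": 0.0,
        "sun_azimuth": 120.0,
        "sun_elevation": 45.0,
        "view_angle": 1.0,
    },
}

print(summarize_item(example_item))
```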