
Hi all, S3 Inventory has been turned on for all the Sentinel buckets available via the AWS Public Datasets program. This includes sentinel-s1-l1c, sentinel-s2-l1c, and sentinel-s2-l2a. The S3 bucket that contains these inventory files is sentinel-inventory, in the eu-central-1 region.


These inventory files will provide, on a daily basis, the inventory of all files in the buckets along with their size and last updated time. More information on S3 Inventory files here.

I’m trying to use the S3 Inventory files to catalog the archive and have downloaded the manifest.json. How do I know which csv.gz is the latest file to iterate over? Is each csv.gz an entire listing of the archive or just a diff? How would I iterate over the entire archive?


If you look in the manifest file, it should have a list of all the csv.gz files for that day. Let me know if that’s not the case.


Yep, I see all the CSVs for that day. If I iterate over the CSVs listed under sentinel-s2-l1c/sentinel-s2-l1c-inventory/2018-09-11T08-00Z/manifest.json, would I get the entire archive? In particular, I’m looking for all the productInfo.json files.

Our catalog is missing the productPath on each tile record.
For example (u’products/2016/10/18/S2A_OPER_PRD_MSIL1C_PDMC_20161019T103543_R050_V20161018T095052_20161018T095052’)

I’m looking to regenerate all of them. The productInfo.json contains the information I need; I just wanted to confirm that one manifest covers the entire archive.


Yep, the manifest for the day lists all the csv.gz objects you need to read to get the full inventory.
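
For anyone following along, here is a rough sketch of walking every csv.gz listed in one manifest and picking out the productInfo.json keys. It assumes the manifest follows the usual S3 Inventory layout (a "files" list whose entries have a "key", plus a "fileSchema" string naming the CSV columns, since the CSVs themselves have no header row). The names iter_inventory_rows, product_info_keys and open_object are made up for illustration; open_object is whatever callable you supply to fetch an object from the sentinel-inventory bucket as a binary file-like object.

```python
import csv
import gzip
import io
import json


def iter_inventory_rows(manifest, open_object):
    """Yield one dict per object listed in the inventory.

    manifest    -- parsed manifest.json (a dict)
    open_object -- hypothetical callable: key -> binary file-like object
                   holding the gzip-compressed CSV stored under that key
    """
    # fileSchema looks like "Bucket, Key, Size, LastModifiedDate"
    columns = [name.strip() for name in manifest["fileSchema"].split(",")]
    for entry in manifest["files"]:
        with gzip.open(open_object(entry["key"]), mode="rt", newline="") as text:
            for row in csv.reader(text):
                yield dict(zip(columns, row))


def product_info_keys(rows):
    """Keep only the keys that end in productInfo.json."""
    for row in rows:
        if row["Key"].endswith("productInfo.json"):
            yield row["Key"]
```

With boto3 (an assumption, any S3 client would do), open_object could be something like `lambda key: io.BytesIO(s3.get_object(Bucket="sentinel-inventory", Key=key)["Body"].read())`, and the manifest can be loaded with `json.load` from the downloaded manifest.json.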


How do I know what the latest inventory file is? I tried to request:
sentinel-s2-l1c/sentinel-s2-l1c-inventory/2020-22-01T08-00Z/manifest.json but it says this key doesn’t exist.

I noticed the HH part of the date changes. How do you know what hour the manifest.json file was created?


Hi, your key format should look like sentinel-inventory/sentinel-s2-l1c/sentinel-s2-l1c-inventory/2020-01-22T04-00Z/manifest.json rather than what you posted (note the date is YYYY-MM-DD). However, your point about the HH changing in the key is correct. I would recommend listing all the keys under the prefix sentinel-inventory/sentinel-s2-l1c/sentinel-s2-l1c-inventory/, which will get you a list of date keys, and then sorting them client-side to find the latest (via JS, Python, some gnarly bash scripting, etc.).
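
A minimal Python sketch of the list-and-sort approach, in case it helps. It assumes a boto3-style S3 client is passed in (boto3 itself is an assumption here) and that the date folders are named like 2020-01-22T04-00Z; latest_manifest_key is a made-up helper name. Anything under the prefix that does not parse as a timestamp is skipped, since inventory prefixes typically also hold non-date folders.

```python
from datetime import datetime

BUCKET = "sentinel-inventory"
PREFIX = "sentinel-s2-l1c/sentinel-s2-l1c-inventory/"


def latest_manifest_key(s3_client, bucket=BUCKET, prefix=PREFIX):
    """Return the key of the most recent manifest.json under prefix."""
    paginator = s3_client.get_paginator("list_objects_v2")
    stamped = []
    # Delimiter="/" makes S3 return the "folders" as CommonPrefixes
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter="/"):
        for item in page.get("CommonPrefixes", []):
            folder = item["Prefix"][len(prefix):].rstrip("/")
            try:
                when = datetime.strptime(folder, "%Y-%m-%dT%H-%MZ")
            except ValueError:
                continue  # not a date folder, ignore it
            stamped.append((when, item["Prefix"]))
    if not stamped:
        raise LookupError("no dated inventory folders under " + prefix)
    # max() compares the parsed timestamps, so the changing HH part is handled
    return max(stamped)[1] + "manifest.json"
```

Usage would be along the lines of `latest_manifest_key(boto3.client("s3", region_name="eu-central-1"))`, with whatever credentials or access settings the bucket requires.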

