To all AWS Public Datasets users of S2 L1C data: this thread will be used to communicate changes to the S2 bucket. As we expect to gather feedback from power users here, we might adjust the approach somewhat while the discussion is ongoing. The summarized and harmonized description will be available on this FAQ link.
At the end of June we will set the S2 L1C bucket to “Requester pays”, as is already the case for S2 L2A and S1 GRD. This move is required to allow a further increase in the shared data: to continue Sentinel-2 L1C coverage, to accommodate the global rollout of Sentinel-2 L2A, and to host European and then global Sentinel-2 analysis-ready mosaics.
How this might affect your workflow:
- If you are using the data within the AWS EU (Frankfurt) region over the S3 protocol, there will be practically no change. A very small additional charge will be incurred on your account for “GET, SELECT and all other Requests”, currently 0.43 $ per 1 million requests.
- If you are using the data within any other AWS region over the S3 protocol, a small data transfer charge will be incurred on your account: “Data transfer OUT from Amazon EC2 To Another AWS region”, currently 0.02 $ per GB. You can avoid this by moving part (or all) of your processing to the AWS EU (Frankfurt) region.
- If you are using the data within AWS over the HTTP protocol, you will need to sign requests (more info here). You will find examples of how to do it here. We will also upgrade our sentinelhub-py Python library to work with S3 data soon (by mid-May at the latest) to make it easier for you.
- If you are using the data outside of AWS, you should consider setting up an AWS instance within the EU (Frankfurt) region. There is a free tier available and you might be eligible for AWS research credits. Alternatively, see this thread on how to access the data directly over the Internet. Note that there will be a small charge for this as well: “Data transfer OUT from Amazon EC2 To Internet”, currently 0.05-0.09 $ per GB.
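To get a rough feeling for the amounts involved, the prices quoted above can be turned into a small estimate. This is only a sketch: the function names are made up for illustration, it assumes the request charge applies in every scenario in addition to any transfer charge, and the prices are the ones current at the time of writing and may change.

```python
# Prices quoted in this post (USD); check current AWS pricing before relying on them.
REQUEST_PRICE_PER_MILLION = 0.43           # "GET, SELECT and all other Requests"
INTER_REGION_PRICE_PER_GB = 0.02           # transfer to another AWS region
INTERNET_PRICE_PER_GB_RANGE = (0.05, 0.09) # transfer to the Internet (low, high)


def request_cost(num_requests):
    """Cost of the requests alone (the Frankfurt, S3 protocol case)."""
    return num_requests / 1_000_000 * REQUEST_PRICE_PER_MILLION


def inter_region_cost(num_requests, gigabytes):
    """Requests plus transfer to another AWS region (assumes both apply)."""
    return request_cost(num_requests) + gigabytes * INTER_REGION_PRICE_PER_GB


def internet_cost_range(num_requests, gigabytes):
    """(low, high) estimate for requests plus transfer to the Internet."""
    low, high = INTERNET_PRICE_PER_GB_RANGE
    base = request_cost(num_requests)
    return (base + gigabytes * low, base + gigabytes * high)
```

For example, 2 million requests and 500 GB pulled into another AWS region would come to roughly 0.86 $ + 10 $ = 10.86 $ under these assumptions.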
Note that whichever category you fall into above, you will need to make changes to your code to continue to access Sentinel-2 data on AWS. In most cases, it’s as simple as adding a flag to your object request.
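As an illustration of the "flag" mentioned above, here is a minimal sketch using boto3, where the flag is the `RequestPayer` parameter of `get_object`. The helper names are hypothetical, and the bucket name and key in the usage note are assumptions, so substitute the object you actually need. Your AWS credentials must be configured, since the charges go to your account.

```python
def requester_pays_args(bucket, key):
    """Build get_object arguments that acknowledge the requester-pays charge."""
    return {"Bucket": bucket, "Key": key, "RequestPayer": "requester"}


def fetch_object(bucket, key, region="eu-central-1"):
    """Download one object from a requester-pays bucket and return its bytes.

    eu-central-1 is the AWS EU (Frankfurt) region.
    """
    import boto3  # imported here so the helper above works without boto3

    s3 = boto3.client("s3", region_name=region)
    response = s3.get_object(**requester_pays_args(bucket, key))
    return response["Body"].read()
```

A call could then look like `fetch_object("sentinel-s2-l1c", "<path to a tile>/tileInfo.json")`, where both the bucket name and the key are placeholders for illustration. Without the `RequestPayer` argument, S3 rejects the request with an access denied error once the bucket is set to requester pays.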
As several web applications out there make use of Sentinel metadata, we have decided to keep some of these data permanently available over HTTP:
- L1C and L2A tiles
  - readme (not yet there)
  - preview.jpg
  - tileInfo.json
  - productInfo.json
  - metadata.xml
- L1C and L2A products
  - readme (not yet there)
  - metadata.xml
  - productInfo.json
If you believe we should make some other metadata available in a similar manner, please describe the need in the thread below.
We understand that this change might be an inconvenience to some users. However, the goal of this experiment is to discover how best to stage data for analysis and processing in the cloud. When the Sentinel Public Dataset was established two years ago, getting hold of Sentinel imagery was quite a challenge. It was distributed in unwieldy chunks, Copernicus SciHub had trouble managing the demand, and there was no other place to get it. At that time, we decided to make the data available as easily and as openly as possible. Things are changing now, with the collaborative ground segment running, four DIAS-es coming in a few months, and data generally being more easily accessible. So we believe the time is right to go back to the original purpose: experimenting with how best to stage data for analysis in the cloud.
Best Regards,
Sinergise (Sentinel custodian at AWS Public Datasets)