Hi there!
I’m running a ProcessPoolExecutor (concurrent.futures.process) to process data about various AOIs in parallel. Each process (i.e. worker) creates a SentinelHubCatalog and calls its .search() method to query all tile URLs containing the given AOI. The worker looks something like this:
import sentinelhub

def worker(bbox, date_interval):
    catalog = sentinelhub.SentinelHubCatalog()
    # Generate bounding box.
    search_bbox = sentinelhub.BBox(bbox=bbox, crs=sentinelhub.CRS.WGS84)
    # Request list of tile S3 paths which contain the AOI and whose cloud coverage < 50%.
    search_iterator = catalog.search(
        sentinelhub.DataCollection.SENTINEL2_L2A,
        bbox=search_bbox,
        time=date_interval,
        query={
            'eo:cloud_cover': {
                'lt': 50
            }
        },
        fields={
            'include': [
                'properties.datetime',
                'assets.data.href'
            ]
        }
    )
    # Parse tile URLs generated by the Catalog API.
    urls = [tile['assets']['data']['href'][:-1] for tile in list(search_iterator)]
    # DO SOME PROCESSING...
Running the process pool with 10 or even 20 parallel processes works great. However, once I increase that number above 30, I get the following error: oauthlib.oauth2.rfc6749.errors.MissingTokenError: (missing_token) Missing access token parameter.
Upgrading the sentinelhub package to the latest version (sentinelhub-3.6.1), as suggested here, didn’t help.
I also tried using locks (i.e. semaphores) to limit how many workers query the Catalog at the same time. This reduced the number of errors but didn’t eliminate them, which makes me think the cause is query rate limiting. Am I correct? If so, this error message is very confusing.
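To make the throttling idea concrete, here is a minimal sketch of one way to cap concurrent Catalog queries with a semaphore shared across the pool. It is an illustration under assumptions, not the exact code from my pipeline: the names init_worker, throttled_worker, run_search, MAX_CONCURRENT_SEARCHES and N_WORKERS are hypothetical, and run_search is only a stand-in for the catalog.search() call in the worker above.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

MAX_CONCURRENT_SEARCHES = 5  # hypothetical cap on simultaneous Catalog queries
N_WORKERS = 8  # kept small for the demo; the errors above appeared beyond 30

_search_slots = None  # set once per worker process by init_worker


def init_worker(slots):
    """Store the shared semaphore in a module-level global of this process."""
    global _search_slots
    _search_slots = slots


def run_search(bbox, date_interval):
    """Hypothetical placeholder for the catalog.search() call in worker()."""
    return (bbox, date_interval)


def throttled_worker(bbox, date_interval):
    # At most MAX_CONCURRENT_SEARCHES processes can hold a slot at once;
    # the others block here until a slot is released.
    with _search_slots:
        return run_search(bbox, date_interval)


if __name__ == "__main__":
    slots = multiprocessing.Semaphore(MAX_CONCURRENT_SEARCHES)
    jobs = [((13.0, 45.0, 14.0, 46.0), ("2021-01-01", "2021-02-01"))] * 16
    # The semaphore is handed to each worker at start-up through initargs,
    # because it cannot be sent along with individual submitted tasks.
    with ProcessPoolExecutor(
        max_workers=N_WORKERS, initializer=init_worker, initargs=(slots,)
    ) as pool:
        futures = [pool.submit(throttled_worker, bbox, dates) for bbox, dates in jobs]
        results = [future.result() for future in futures]
    print(len(results))
```

This only bounds how many searches run at the same moment. It does not bound the number of requests per minute, which may be why throttling of this kind reduces the errors without removing them.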
Is calling the .search() method on SentinelHubCatalog subject to the Requests/PU quota I can see in the dashboard? Could you suggest any workaround so I can process data using a higher number of workers?
Thanks!