Hello,
I have a table with ~3000 polygons. For each polygon I need to retrieve several vegetation indices over a ~6-month window, where each polygon has its own start and end dates, and the combined area/bounding box of all the polygons is very large (more than 2500x2500). The table looks similar to this:
polygon           name    start_date    end_date
POLYGON ((....    plot1   2020-09-01    2021-02-18
POLYGON ((.....   plot2   2021-10-05    2022-03-13
...
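For context, here is a minimal sketch of how I hold this table in Python (the file name, loading step, and CRS are illustrative assumptions; the polygon column contains WKT strings):

```python
import geopandas as gpd
import pandas as pd
from shapely import wkt

# Illustrative loading step: parse the WKT polygon strings into geometries.
df = pd.read_csv("plots.csv")  # assumed columns: polygon, name, start_date, end_date
df["geometry"] = df["polygon"].apply(wkt.loads)
gdf = gpd.GeoDataFrame(df, geometry="geometry", crs="EPSG:4326")  # CRS is an assumption
```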
My original methodology was to iterate through the rows of my dataframe: each time take one polygon, get its specific dates, request the NDVI statistics, save the result, and repeat for the next row, etc. (roughly like the sketch below). But this is super heavy and takes between 4 and 12 hours.
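In outline, the loop looks something like this; `get_ndvi_stats` is a hypothetical placeholder for whatever call actually fetches the statistics, not a real API function:

```python
results = []

# One synchronous statistics request per polygon: ~3000 sequential round-trips.
for _, row in gdf.iterrows():
    stats = get_ndvi_stats(          # hypothetical helper standing in for the real API call
        geometry=row["geometry"],
        start=row["start_date"],
        end=row["end_date"],
    )
    results.append({"name": row["name"], "stats": stats})
```

Each iteration blocks on the previous response, so the total runtime scales with ~3000 sequential requests.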
Sending one combined request for multiple polygons does not seem applicable here, since each polygon has its own date range.
My questions are: why does this run so slowly? What is the best methodology to use, given that the dates change for each polygon? And is batch processing using AWS the only solution?