I’m working on getting an interesting derived dataset into Sentinel Hub. It’s distributed as 2600+ non-COG GeoTIFFs totaling 26 GB, but only 834 MB zipped up, so the internal compression is poor - if I reprocess them with DEFLATE the total comes down to about 11 GB. See https://zenodo.org/records/10907151 for the data.
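For reference, this is roughly how I’ve been recompressing them - just GDAL’s COG driver (3.1+) with DEFLATE; the directory names here are placeholders, not the real paths:

```python
from pathlib import Path
from osgeo import gdal

gdal.UseExceptions()

src_dir = Path("unzipped_tiffs")  # wherever the Zenodo files were extracted (placeholder)
out_dir = Path("cogs")
out_dir.mkdir(exist_ok=True)

for src in sorted(src_dir.glob("*.tif")):
    # The COG driver handles tiling + per-file overviews; DEFLATE brings the size down
    gdal.Translate(
        str(out_dir / src.name),
        str(src),
        format="COG",
        creationOptions=["COMPRESS=DEFLATE", "NUM_THREADS=ALL_CPUS"],
    )
```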
I’m working with a few datasets like this, some larger, and am mostly wondering how I should reason about putting everything into one COG vs. many small files / COGs that I’d register as ‘tiles’.
At one extreme I could take these 2600 GeoTIFFs, turn each into a COG, and register each one as a ‘tile’. But it seems like that probably won’t work all that well, since there’s no dataset-wide ‘overview’ (unless SH generates something like that?). At the other extreme I could put them all into a single COG - maybe it ends up around 15 GB - which should perform decently with overviews and is conceptually simpler; the sketch below is roughly what I had in mind.
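The single-file route I was picturing is just a VRT mosaic piped through the COG driver, something like this (file names are made up, and I’m assuming all the tiles share the same CRS and grid):

```python
from pathlib import Path
from osgeo import gdal

gdal.UseExceptions()

tifs = sorted(str(p) for p in Path("unzipped_tiffs").glob("*.tif"))

# Virtual mosaic of all the source tiles (no pixels copied yet)
gdal.BuildVRT("mosaic.vrt", tifs)

# One big COG with internal overviews; BIGTIFF since it will pass 4 GB
gdal.Translate(
    "mosaic_cog.tif",
    "mosaic.vrt",
    format="COG",
    creationOptions=[
        "COMPRESS=DEFLATE",
        "BIGTIFF=YES",
        "OVERVIEW_RESAMPLING=AVERAGE",
        "NUM_THREADS=ALL_CPUS",
    ],
)
```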
But then I’m curious whether there’s an upper limit - if I have 8-band global data at 3 m resolution, should I make a 300 GB COG, or is it better practice to break it up in some way (e.g. into regional COGs registered as tiles, as in the sketch below)?
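In case it clarifies what I mean by “breaking it up”: I could cut the mosaic VRT into a grid of regional COGs and register each as a tile - something like this, where the 4×4 grid and the output names are arbitrary examples:

```python
from osgeo import gdal

gdal.UseExceptions()

vrt = gdal.Open("mosaic.vrt")
xsize, ysize = vrt.RasterXSize, vrt.RasterYSize
n = 4  # 4x4 grid -> 16 regional COGs, purely illustrative

for i in range(n):
    for j in range(n):
        # Pixel window for this grid cell
        xoff, yoff = i * xsize // n, j * ysize // n
        w = (i + 1) * xsize // n - xoff
        h = (j + 1) * ysize // n - yoff
        gdal.Translate(
            f"region_{i}_{j}.tif",
            vrt,
            format="COG",
            srcWin=[xoff, yoff, w, h],
            creationOptions=["COMPRESS=DEFLATE", "NUM_THREADS=ALL_CPUS"],
        )
```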
I haven’t managed to get the BYOC-tool working yet, so perhaps there’s advice embedded in using that, but I couldn’t find anything online addressing this question. I may well be misunderstanding something - I’m just looking for advice on how to process my GeoTIFFs so they work well with BYOC.
thanks!