I am forever surprised by the somewhat poor integration that Polars has with S3. You would assume that, with the prevalence of datasets sitting in remote stores, this would be one of the first things to be incorporated and focused on.
jay
06/20/2024, 3:33 PM
They had really poor integrations until very recently actually… tbh I think it’s when we started putting out really strong S3 read benchmarks for Daft that Ritchie started focusing on it
Cory Grinstead
06/20/2024, 3:56 PM
IIRC it was on their roadmap for a while, but the original implementation was a bit of a mess and took a bit of finesse to get into a better state
🔥 2
jay
06/20/2024, 4:07 PM
That makes sense. Interestingly for us it was the other way around — we had really good S3 readers but had to put in additional work to get our local readers up to speed!