Sammy Sidhu
08/30/2024, 6:08 AM
Split block_on into block_on_io_pool and block_on_current_thread in this PR.
Should be ready for review!
https://github.com/Eventual-Inc/Daft/pull/2687
Colin Ho
08/30/2024, 5:59 PM
import daft
from daft.context import set_execution_config
set_execution_config(enable_native_executor=True, default_morsel_size=100)
df = daft.read_parquet(
    "s3://daft-public-datasets/imagenet/sample-100k-deltalake/0-df403275-efee-4409-9ed6-8c25c4200655-0.parquet"
)
df = df.with_column(
    "images",
    (
        "s3://daft-public-datasets/imagenet/sample-100k-deltalake/images/"
        + df["folder"]
        + "/"
        + df["filename"]
        + ".jpeg"
    )
    .url.download()
    .image.decode()
    .image.resize(128, 128),
)
NUM_ROWS_TO_DATALOAD = 1000
for i, row in enumerate(df):
    print(row)
    if i >= NUM_ROWS_TO_DATALOAD:
        break
Though to make it work, you still need to configure the default_morsel_size, otherwise it'll try to run all 100k URL downloads concurrently. I guess the next step would be to make the morsel sizing smarter.
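To illustrate the idea (just a conceptual sketch, not how the native executor actually schedules work; download, run, and the example URLs are made-up placeholders): pulling work in fixed-size morsels means only one batch of downloads is awaited at a time, so at most morsel_size requests are in flight instead of all 100k.

import asyncio

async def download(url: str) -> bytes:
    # Stand-in for a real HTTP/S3 fetch.
    await asyncio.sleep(0.01)
    return b""

async def run(urls: list[str], morsel_size: int = 100) -> list[bytes]:
    results: list[bytes] = []
    # Downloads within a morsel run concurrently, but only one morsel is
    # awaited at a time, so at most `morsel_size` requests are outstanding.
    for start in range(0, len(urls), morsel_size):
        morsel = urls[start : start + morsel_size]
        results.extend(await asyncio.gather(*(download(u) for u in morsel)))
    return results

asyncio.run(run([f"https://example.com/{i}.jpeg" for i in range(1_000)]))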
jay
08/30/2024, 7:38 PM
jay
08/30/2024, 7:38 PM