Is there a standard way to go from `DataType.image...
# daft-dev
r
Is there a standard way to go from `DataType.image()` to `PIL.Image`? Unfortunately I have some annoying downstream stuff that is picky about the obj type
j
I believe right now `image -> Python` casting gives a numpy array, and then if you do `PIL.Image.fromarray(arr)` it would give you the PIL image (I think)
Lmk if that is the case for you?
r
yup! just tried this, it works great ty!
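For reference, a minimal sketch of the conversion step described above. It assumes the cast hands back a `uint8` numpy array shaped height x width x channels; the `to_pil` helper and the all-zeros array are stand-ins for illustration, not part of Daft.

```python
import numpy as np
from PIL import Image


def to_pil(arr: np.ndarray) -> Image.Image:
    """Wrap a uint8 image array (H x W x C or H x W) in a PIL image."""
    # Image.fromarray infers the mode from dtype and shape, so this sketch
    # rejects anything other than uint8 instead of silently rescaling it.
    if arr.dtype != np.uint8:
        raise TypeError(f"expected a uint8 array, got {arr.dtype}")
    return Image.fromarray(arr)


# Stand-in for the array the image -> Python cast is said to produce.
arr = np.zeros((4, 6, 3), dtype=np.uint8)
img = to_pil(arr)
```

Note that PIL reports `size` as (width, height), the reverse of the array's leading two dimensions.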
j
Is there some downstream library you’re using that requires PIL? Curious what your workload is like
r
Yeah working with a CLIP embedding model, I'm using their python library that wraps some logic for convenience. I'm sure I could get around the PIL Image req with some finagling, but I'm lazy 😅
j
Hmm interesting. Any interest in an expression like `df["image"].embedding.clip()` or something like that?
We could probably make it work for both the image type and the string types
And run the actual CLIP model in Rust
r
woah, yeah that would be sick!
would be cool if it accepted a string for the HF model as well -> a stateful UDF
cuz effectively that's what I'm doing
or `df["thing_to_embed"].embedding.from_hf("<model_name>")`
not sure if that is syntactically consistent tho
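A rough sketch of the "stateful UDF" idea mentioned above, in plain Python with no Daft calls: the model is loaded once when the object is built, then reused for every batch. `HFEmbedder`, `fake_loader`, and the length-based "embedding" are all hypothetical placeholders; a real version would load an actual model by name and register the class with the dataframe engine.

```python
from typing import Callable, List, Sequence

Encoder = Callable[[Sequence[str]], List[List[float]]]

# Records each load so the "load once" behaviour is visible.
load_calls: List[str] = []


def fake_loader(model_name: str) -> Encoder:
    """Hypothetical stand-in for an expensive model load."""
    load_calls.append(model_name)

    def encode(batch: Sequence[str]) -> List[List[float]]:
        # Placeholder embedding: one float per item (its length).
        return [[float(len(item))] for item in batch]

    return encode


class HFEmbedder:
    """Hypothetical stateful embedder: load in __init__, reuse in __call__."""

    def __init__(self, model_name: str, loader: Callable[[str], Encoder]):
        self.model_name = model_name
        self.model = loader(model_name)  # expensive step, done once

    def __call__(self, batch: Sequence[str]) -> List[List[float]]:
        return self.model(batch)


embedder = HFEmbedder("<model_name>", fake_loader)
first = embedder(["a", "abc"])
second = embedder(["hello"])
```

Both calls share the same loaded model, which is the point of keeping the state on the object rather than reloading per batch.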