# daft-dev
r
Hey y'all, is there a specific way we should be casting an iterable (either np.array or list) to the embedding type? I'm trying to apply a column-level fn using `with_column` and am consistently running into:
PanicException: ('not implemented: List casting not implemented for dtype: Embedding[Float32; 1536]',)
df.with_column(
    "embedding",
    df["words"].apply(
        lambda word: get_embedding(word),
        return_dtype=daft.DataType.embedding(dtype=daft.DataType.float32(), size=1536),
    )
).show(2)
That's an example of usage; `get_embedding` rn outputs a list of float32s.
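For anyone trying to reproduce this, a stand-in for `get_embedding` is handy. The sketch below is hypothetical: the real function from the thread isn't shown, so this one only matches the described output shape (a Python list of 1536 floats) by deriving deterministic pseudo-random values from the word. Note that plain Python floats are 64-bit, so any narrowing to float32 would be left to the cast.

```python
import hashlib
import random
from typing import List

EMBEDDING_SIZE = 1536  # matches the size used in the thread


def get_embedding(word: str) -> List[float]:
    """Hypothetical stand-in: return a deterministic list of EMBEDDING_SIZE floats."""
    # Seed a private RNG from a hash of the word so the same word
    # always maps to the same vector, independent of global random state.
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(EMBEDDING_SIZE)]
```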
j
Hmm, interesting. I think this was an oversight; we should probably write the casting logic from `List -> Embedding`! I just took a look and it seems like our `Python -> Embedding` casting is working, so something like this seems to work:
df.with_column(
    "embedding",
    df["words"].apply(
        lambda word: get_embedding(word),
        return_dtype=daft.DataType.python()
    ).cast(
        daft.DataType.embedding(
            dtype=daft.DataType.float32(),
            size=1536,
        )
    )
).show(2)
🙌 1
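Since the original question mentions both np.array and list outputs, one option is to normalize whatever the function returns before it reaches the cast. A minimal sketch, assuming a hypothetical helper called `as_embedding_list` (it is not part of Daft): it turns any iterable of numbers into a plain list of Python floats and fails early if the length doesn't match the embedding size. Whether the `Python -> Embedding` cast actually needs this normalization isn't established in the thread.

```python
from typing import Iterable, List


def as_embedding_list(values: Iterable[float], size: int = 1536) -> List[float]:
    """Hypothetical helper: coerce an iterable of numbers to a fixed-length list of floats."""
    # float(v) accepts Python numbers as well as NumPy scalars,
    # so this works for lists, tuples, and 1-D NumPy arrays alike.
    out = [float(v) for v in values]
    if len(out) != size:
        raise ValueError(f"expected {size} values, got {len(out)}")
    return out
```

In the snippet from the thread this would wrap the call inside the lambda, e.g. `lambda word: as_embedding_list(get_embedding(word))`.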
r
Ah, amazing! ty for the swift response!
j
😛 nice, lmk also if you want to take a stab at it!
I can help you out as you go along too
r
I'll take a look!
j
The logic is here! https://github.com/Eventual-Inc/Daft/blob/2290d030945c4df2c441b67268b078b4a238e549/src/daft-core/src/array/ops/cast.rs#L1691 We just have to add a new branch to handle the `Embedding` case, which will likely just go through the `FixedSizeList` case and then get wrapped in an `Embedding` logical type 🙂
👀 1
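To make the shape of that idea concrete, here is a toy Python model of the two steps described above: variable-length list rows are first checked and flattened into a fixed-size layout, and the result is then wrapped in an embedding type. This is only an illustration; the actual change would be written in Rust in `cast.rs`, and the class and function names here are made up and don't mirror Daft's internals. How nulls and wrong-length rows should be handled is also an assumption (nulls are kept with placeholder slots, mismatched lengths raise).

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class FixedSizeList:
    """Toy fixed-size list array: flat values, a per-row width, and a validity mask."""
    values: List[float]
    size: int
    validity: List[bool]


@dataclass(frozen=True)
class Embedding:
    """Toy logical type: the same physical data, tagged as an embedding."""
    physical: FixedSizeList


def list_to_fixed_size_list(
    rows: Sequence[Optional[Sequence[float]]], size: int
) -> FixedSizeList:
    """Step 1: validate each row's length and flatten into a fixed-size layout."""
    values: List[float] = []
    validity: List[bool] = []
    for i, row in enumerate(rows):
        if row is None:
            # Assumption: null rows keep `size` placeholder slots and are marked invalid.
            values.extend([0.0] * size)
            validity.append(False)
            continue
        if len(row) != size:
            raise ValueError(f"row {i} has length {len(row)}, expected {size}")
        values.extend(float(v) for v in row)
        validity.append(True)
    return FixedSizeList(values=values, size=size, validity=validity)


def list_to_embedding(
    rows: Sequence[Optional[Sequence[float]]], size: int
) -> Embedding:
    """Step 2: reuse the fixed-size-list path, then wrap the result as an embedding."""
    return Embedding(physical=list_to_fixed_size_list(rows, size))
```

The point of the two-function split is that the embedding branch adds no casting logic of its own; it only delegates and wraps.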