Is Daft's .hash() guaranteed to return the same values across DataFrames and clusters, as long as the contents of the column being hashed are the same and no seed is specified?
Kevin Wang
09/05/2024, 6:52 AM
To my understanding, yes, it should be the same.
Kyle
09/05/2024, 7:26 AM
Thanks!
Kyle
09/05/2024, 7:26 AM
Do you know what the default seed is?
Kevin Wang
09/05/2024, 5:52 PM
The default seed is 0
Kevin Wang
09/05/2024, 5:53 PM
I'm not sure why our Expression.hash function is not present in our documentation, will look into that