# general
a
Hi, I am working on GitHub issue #2074 as a first-time contributor and a beginner to Rust. I am having difficulty getting the ops/utf8.rs method to compile. Can someone help me out with this?
j
Hi! What issues are you having? Would you like to post the errors you're having?
a
Hi, I am trying to implement the functionality like this in the path Daft\src\daft-core\src\array\ops\utf8.rs:
pub fn repeat(&self, n: usize) -> DaftResult<Utf8Array> {
    let self_arrow = self.as_arrow();

    // Handle empty data case.
    if self.is_empty() {
        return Ok(Utf8Array::empty(self.name(), &DataType::Utf8));
    }
    let arrow_result = self_arrow
        .iter()
        .flat_map(|element| element.map(|w| w.repeat(n)))
        .collect::<DaftResult<arrow2::array::Utf8Array<i64>>>();
    
    Ok(Utf8Array::from((self.name(), Box::new(arrow_result?))))
}
For which I get the following error output:
error[E0277]: a value of type `Result<arrow2::array::Utf8Array<i64>, DaftError>` cannot be built from an iterator over elements of type `std::string::String`
   --> src\daft-core\src\array\ops\utf8.rs:510:14
    |
510 |             .collect::<DaftResult<arrow2::array::Utf8Array<i64>>>();
    |              ^^^^^^^ value of type `Result<arrow2::array::Utf8Array<i64>, DaftError>` cannot be built from `std::iter::Iterator<Item=std::string::String>`
    |
    = help: the trait `FromIterator<std::string::String>` is not implemented for `Result<arrow2::array::Utf8Array<i64>, DaftError>`
    = help: the trait `FromIterator<Result<A, E>>` is implemented for `Result<V, E>`
c
Hey @Akshay Verma! Instead of `flat_map`, try using `map`, i.e.
let arrow_result = self_arrow
            .iter()
            .map(|element| element.map(|w| w.repeat(n)))
            .collect::<arrow2::array::Utf8Array<i64>>();
What's happening with your code is that `flat_map` is filtering out the null values, resulting in an iterator of `String`s. This won't work because arrow2 arrays are built from `Option<String>`, and also because we want to preserve cardinality in the result array.
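To make the difference concrete, here is a minimal sketch using only the standard library. The function names are made up for illustration, and a plain `Vec<Option<String>>` stands in for the arrow2 array, so this is not the Daft kernel itself:

```rust
/// Repeats each non-null string `n` times and keeps nulls in place,
/// so the output has the same length as the input.
fn repeat_keep_nulls(values: &[Option<&str>], n: usize) -> Vec<Option<String>> {
    values
        .iter()
        // `map` keeps one output element per input element, including `None`.
        .map(|v| v.map(|s| s.repeat(n)))
        .collect()
}

/// Same closure with `flat_map`: each `Option` is flattened, so `None`
/// entries vanish and the output can be shorter than the input.
fn repeat_drop_nulls(values: &[Option<&str>], n: usize) -> Vec<String> {
    values
        .iter()
        .flat_map(|v| v.map(|s| s.repeat(n)))
        .collect()
}
```

With the input `[Some("ab"), None, Some("c")]` and `n = 2`, the `map` version returns three elements with the null preserved in the middle, while the `flat_map` version returns only two plain strings.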
a
Thanks. That makes sense. I am trying out the build. I am currently getting errors in the OpenSSL build (I am working on a Windows system). Let me get back to you on it.
🙌 1
Hi, sorry for the late response. I updated the code and it solved the last error. I am currently trying to fix the daft-dsl build. It throws the following error:
error[E0053]: method `to_field` has an incompatible type for trait
  --> src/daft-dsl/src/functions/utf8/repeat.rs:21:32
   |
21 |     fn to_field(&self, inputs: &[Expr], schema: &Schema, _: &Expr) -> DaftResult<Field> {
   |                                ^^^^^^^
   |                                |
   |                                expected `Arc<Expr>`, found `Expr`
   |                                help: change the parameter type to match the trait: `&[Arc<Expr>]`
   |
note: type in trait
  --> src/daft-dsl/src/functions/mod.rs:55:17
   |
55 |         inputs: &[ExprRef],
   |                 ^^^^^^^^^^
   = note: expected signature `fn(&RepeatEvaluator, &[Arc<Expr>], &Schema, &FunctionExpr) -> Result<_, _>`
              found signature `fn(&RepeatEvaluator, &[Expr], &Schema, &Expr) -> Result<_, _>`
j
Hi @Akshay Verma! You might be hitting a merge conflict for a recent change in our `to_field` function signature. You can reference the other utf8 `to_field` implementations, but those inputs are now `ExprRef`!
If you follow the suggestions in the compiler error, and replace `&[Expr]` with `&[ExprRef]` that should do the trick 🙂
👍 1
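For anyone hitting the same E0053 error, a rough sketch of why the signature has to match the trait. Everything here is a simplified, hypothetical stand-in (the `Expr` struct, the trait, and the `String` return types are not Daft's real definitions); only the `ExprRef = Arc<Expr>` alias mirrors what the compiler output above shows:

```rust
use std::sync::Arc;

// Hypothetical, simplified stand-in for an expression node.
#[derive(Debug)]
struct Expr {
    name: String,
}

// Alias matching the compiler note above: `ExprRef` is `Arc<Expr>`.
type ExprRef = Arc<Expr>;

// Hypothetical trait: the parameter types declared here are the contract.
trait FunctionEvaluator {
    fn to_field(&self, inputs: &[ExprRef]) -> Result<String, String>;
}

struct RepeatEvaluator;

impl FunctionEvaluator for RepeatEvaluator {
    // Declaring `inputs: &[Expr]` here would fail with E0053, because the
    // impl must use exactly the parameter types from the trait.
    fn to_field(&self, inputs: &[ExprRef]) -> Result<String, String> {
        match inputs {
            // `data` is an `&Arc<Expr>`; field access auto-derefs through the Arc.
            [data, _n] => Ok(data.name.clone()),
            _ => Err(format!("expected 2 inputs, got {}", inputs.len())),
        }
    }
}
```

Note that the expected signature in the error also lists `&FunctionExpr` as the last parameter where the failing code has `&Expr`, so that parameter likely needs the same treatment.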
a
Let me try that
While updating the types of the inputs, I ran into a bit of a problem. I am implementing the DSL function at Daft/src/daft-dsl/src/functions/utf8/repeat.rs
In the `evaluate` function, the input is passed as `&Series`, but `utf8_repeat` expects `usize`. Which part should be modified to fix this?
error[E0308]: mismatched types
  --> src/daft-dsl/src/functions/utf8/repeat.rs:50:34
   |
50 |                 data.utf8_repeat(n)
   |                      ----------- ^ expected `usize`, found `&Series`
   |                      |
   |                      arguments to this method are incorrect
   |
note: method defined here
  --> /home/akshay/projects/oss/Daft/src/daft-core/src/series/ops/utf8.rs:91:12
   |
91 |     pub fn utf8_repeat(&self, n: usize) -> DaftResult<Series> {
   |            ^^^^^^^^^^^
j
It seems that you should change your `utf8_repeat` function to take a Series instead of a single number. You can reference the other utf8 kernels for an example. A good one to check out might be something like `left`:
Daft/src/daft-dsl/src/functions/utf8/left.rs
a
Thanks, I was looking at right.rs 🙂
j
Yup! The function signature of right (in `Daft/src/daft-core/src/series/ops/utf8.rs`):
pub fn utf8_right(&self, nchars: &Series) -> DaftResult<Series>
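As a rough illustration of what taking a column of counts instead of a single number means, here is a standard-library sketch. The function name and the slice-based types are assumptions standing in for `Series`; the real kernel would work on Daft arrays and may handle broadcasting and nulls differently:

```rust
/// Hypothetical per-row repeat: `counts[i]` says how many times to repeat
/// `data[i]`. A null on either side produces a null in the output.
fn repeat_per_row(
    data: &[Option<&str>],
    counts: &[Option<usize>],
) -> Result<Vec<Option<String>>, String> {
    // Assumption for this sketch: both columns must have the same length.
    if data.len() != counts.len() {
        return Err(format!(
            "length mismatch: {} values vs {} counts",
            data.len(),
            counts.len()
        ));
    }
    Ok(data
        .iter()
        .zip(counts.iter())
        .map(|(s, n)| match (s, n) {
            (Some(s), Some(n)) => Some(s.repeat(*n)),
            _ => None,
        })
        .collect())
}
```

The output has one entry per input row, which is the same cardinality-preserving behavior discussed earlier in the thread.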