# general
k
ScanWithTask is requesting { "memory": 37407051360, "CPU": 1 } by default, which causes my worker to OOM. How can I reduce this?
j
Is it OOMing or failing to schedule? I'm guessing it's failing to schedule, since you're able to see the error message. The reason it's requesting that much memory is that the file it's reading is pretty large. You likely need bigger machines to process this dataset!
How large is the dataset you’re reading and how many files are there? I’m guessing you have a pretty large one in there (something like a 10G+ file?)
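For context on where that number comes from: the { "memory": 37407051360, "CPU": 1 } request is a Ray resource request attached to the scan task, sized from the file being read. Ray treats `memory` as a logical resource used only for scheduling, so a request bigger than any node's free memory leaves the task pending rather than OOMing it. A minimal sketch (the `scan_file` function and the path are just illustrative, not Daft's actual internals):
```python
import ray

ray.init(ignore_reinit_error=True)

# A task that declares the same kind of resource request ({"memory": ..., "CPU": 1}).
# Ray uses `memory` purely for scheduling admission: if no node has ~37 GB of the
# memory resource free, the task stays pending instead of being placed and OOMing.
@ray.remote(memory=37_407_051_360, num_cpus=1)
def scan_file(path: str) -> int:
    # Hypothetical stand-in for reading one large file.
    with open(path, "rb") as f:
        return len(f.read())

# Usage (path is illustrative):
# num_bytes = ray.get(scan_file.remote("/data/part-00000.parquet"))
```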
k
I'm working on a 30TB dataset and each file is indeed >10G, but I'm not sure why some tasks succeed and some don't...
j
How large are your machines, and how many machines are in your cluster?
k
I have 70 machines with 200 GB of memory each
j
Gotcha, yeah, this is a really large workload. Perhaps we should set up some time to chat about it? There are a couple of things we aren't doing yet that we'd need to do to handle workloads like this well:
• Splitting these 10G files into smaller partitions (we currently have a limit of 10 files for scan splitting because of metadata fetching overheads)
• Handling the really large Ray graph that gets constructed when performing shuffles (joins, sorts, groupbys, repartitions). We are in the process of designing a new shuffle system for these larger shuffles.

LMK what timezone you're operating out of? I'd love to get some of the Daft team involved and learn more about your workload to see if we can help stabilize it.
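In the meantime, a rough sketch of the more-partitions idea, assuming Daft's read_parquet / into_partitions APIs (the path, partition count, and column names are illustrative, so treat this as a sketch rather than a guaranteed fix):
```python
import daft

# Read the dataset, then spread rows across more, smaller partitions before
# any shuffle (join/sort/groupby) so each shuffle task holds less data.
# Note: this doesn't change how each >10G file is scanned in the first place;
# it only reduces per-partition size for the downstream shuffle work.
df = daft.read_parquet("s3://my-bucket/my-30tb-dataset/**/*.parquet")  # illustrative path

# e.g. a handful of partitions per machine across the 70 workers
df = df.into_partitions(70 * 8)

# Downstream shuffle example (hypothetical column names):
# result = df.groupby("key").agg(df["value"].sum())
# result.collect()
```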
k
I'm based in GMT +8 😅
j
Hey, I’m from Singapore 😛 very familiar with that tz
If you’re comfortable with a 9AM morning call, we could definitely do that!
k
Great okay! I'm also okay with night calls
j
LMK if tomorrow @ 9AM works for you? That would be our Monday @ 6PM PST / your Tuesday @ 9AM
k
Okay sure!
j
(Sent you a calendar invite — chat soon!)
k
Cool! See you soon!