How can I avoid OOM errors in an AWS Glue job in PySpark?

Question

I am getting this error while running an AWS Glue job with 40 workers that processes 40 GB of data:

Caused by: org.apache.spark.memory.SparkOutOfMemoryError: error while calling spill() on org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter@5fa14240 : No space left on device

How can I optimize my job to avoid this error in PySpark?

Here is a screenshot of the Glue job metrics: glue_metrics

G.1X (Recommended for memory intensive jobs) @PrabhakarReddy — shubhamkakran, Aug 28, 2021 at 5:46
Have you tried G.2X workers? Each G.2X worker has 32 GB of memory, which is twice the memory of a G.1X worker. Also, enable job metrics (under monitoring options when you edit the job in the console) and look at the "Job Execution: Active Executors, Completed Stages & Maximum Needed Executors" metric after you run the job. Compare the maximum needed executors to the active executors. If max needed > active, your job could benefit from additional workers. — jscott, Aug 29, 2021 at 22:42
I ran it on Standard and G.2X workers as well, with the same results. @jscott, I can't tell what is wrong. Even the metrics show only 50% usage and 50% CPU load. — shubhamkakran, Sep 2, 2021 at 5:28
@jscott I am using native Spark to run the jobs on Glue, and it is not creating the metrics for "Job Execution: Active Executors, Completed Stages & Maximum Needed Executors" or "Data Shuffle Across Executors". I found no CloudWatch metrics. — shubhamkakran, Sep 8, 2021 at 3:03

semaphore · Accepted Answer · 2022-05-13 07:42:44Z

1

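Use the AWS Glue Spark shuffle manager with Amazon S3. It writes shuffle and spill files to S3 instead of the workers' local disks, which is where the "No space left on device" error comes from.

This requires Glue 2.0.

See the AWS Glue documentation on the Spark shuffle manager with Amazon S3 for details.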
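As a rough sketch of what enabling this might look like, the snippet below builds the job arguments for the S3 shuffle manager. The parameter names (`--write-shuffle-files-to-s3`, `--write-shuffle-spills-to-s3`, `spark.shuffle.glue.s3ShuffleBucket`) are as I recall them from the AWS Glue documentation, so check them against the docs for your Glue version. The helper name `s3_shuffle_arguments` and the bucket `s3://my-shuffle-bucket/prefix/` are made up for illustration.

```python
def s3_shuffle_arguments(shuffle_bucket_uri):
    """Build Glue job arguments that send shuffle and spill files to S3.

    Hypothetical helper; the parameter names are assumed from the
    AWS Glue documentation and should be verified for your Glue version.
    """
    if not shuffle_bucket_uri.startswith("s3://"):
        raise ValueError("shuffle_bucket_uri must start with s3://")
    return {
        # Write shuffle files to S3 instead of local disk.
        "--write-shuffle-files-to-s3": "true",
        # Write spill files to S3 as well.
        "--write-shuffle-spills-to-s3": "true",
        # Bucket (and optional prefix) that receives the shuffle data.
        "--conf": f"spark.shuffle.glue.s3ShuffleBucket={shuffle_bucket_uri}",
    }


# Placeholder bucket name, not a real resource.
args = s3_shuffle_arguments("s3://my-shuffle-bucket/prefix/")
for key, value in args.items():
    print(key, value)
```

These key/value pairs can be entered as job parameters in the Glue console, or passed as the `Arguments` of a boto3 `start_job_run` call. The job's IAM role also needs read and write access to the shuffle bucket.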


1

While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - From Review
– Emi OB
May 18, 2022 at 11:33
I have since worked with EMR and state machines and learned much more; I now have more control over the Spark configuration. It is better than Glue for longer jobs.
– shubhamkakran
Jul 30, 2022 at 6:43
