Memory debugging for Embeddings node
Provisioning correct machine types
Initially, we noticed that the requested machine type was not being allocated. This was due to the ephemeral storage requirements not being satisfied by any of the nodes.
To solve this, we have now added a volume to each node as a Persistent Volume Claim (PVC). The size of this volume is calculated from the ephemeral_storage_limit parameter.
The volume code:
podSpecPatch: |
  volumes:
    - name: scratch
      ephemeral:
        volumeClaimTemplate:
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: {% raw %}"{{inputs.parameters.ephemeral_storage_limit}}Gi"{% endraw %}
And to mount this volume we do:
volumeMounts:
  - name: scratch
    mountPath: /scratch
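This mount sits on the main container inside the same podSpecPatch; a minimal sketch of the placement (container name as used elsewhere in our template):
podSpecPatch: |
  containers:
    - name: main
      volumeMounts:
        - name: scratch
          mountPath: /scratch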
To make sure that the volumes are deleted after a pod succeeds, fails, or errors, we have set:
spec:
  podGC:
    strategy: OnPodCompletion
This ensures that the cleanup happens automatically.
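For context, podGC sits at the top level of the Workflow spec (a sketch; names are illustrative). Because the scratch volume is a generic ephemeral volume, its PVC is owned by the pod and is garbage-collected together with it:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: embeddings-
spec:
  podGC:
    strategy: OnPodCompletion  # delete pods (and their ephemeral PVCs) as soon as they finish
  entrypoint: main
  ...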
Bumping memory spec
Next, we noticed the Neo4J container going OOM. It turned out this was due to hardcoded memory values in the Neo4J container's environment, effectively capping its resources.
✅ This was solved by deriving the Neo4J memory environment variables from the memory specified in the pod's limits.

# Neo4J container
env:
  ...
  - name: NEO4J_dbms_memory_heap_initial__size
    value: "100G"
  - name: NEO4J_dbms_memory_heap_max__size
    value: "120G"
As the next step, we set the above Neo4J settings to the same value as the pod's memory request. This resulted in the main container being stuck at "Waiting for neo4j to be ready..." with the pod's status "OOMKilled": with the heap sized to the full pod memory, nothing was left for Neo4J's off-heap usage and the other processes in the container. To fix this, we applied 70% of the pod's memory request as the Neo4J heap size, which solved the problem:
"{{= sprig.int(inputs.parameters.memory_limit) * 0.7 }}G"
Moreover, we noticed a similar problem in the Spark configuration, where the driver memory was hardcoded to a specific value.
⛔️ This still requires a fix; ideally, the resources defined on the node would propagate correctly into the Spark configuration.
# spark.yaml
...
spark.driver.memory: 30g
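One possible direction (an assumption, not something we have implemented; see the follow-ups below) is to read the value from an environment variable set by the workflow, e.g. via OmegaConf's built-in oc.env resolver, assuming spark.yaml is loaded through Kedro's OmegaConfigLoader:
# spark.yaml (hypothetical: SPARK_DRIVER_MEMORY injected by the workflow, 30g as fallback)
spark.driver.memory: ${oc.env:SPARK_DRIVER_MEMORY, 30g}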
Setting memory spec for Kedro container
Next, we bumped the Spark driver memory in the main container to 50g. This resulted in an OOM kill, presumably because Spark had been configured to use all of the pod's memory, so when the Kedro/Python process started to use RAM, the container was killed.
spark.driver.memory: 50g
It's therefore important to leave a buffer between the memory configured for Spark and the total memory allocated to the container.
# Argo workflow template
...
name: neo4j
podSpecPatch: |
  containers:
    - name: main
      resources:
        requests:
          memory: 50Gi
        limits:
          memory: 50Gi
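As a rule of thumb (our assumption, not a measured bound), the JVM heap given to Spark should sit well below the container limit, since off-heap JVM memory and the Kedro/Python process count against the same cgroup limit. Against the 50Gi container above, a sketch with illustrative numbers:
# spark.yaml (sketch, illustrative numbers)
spark.driver.memory: 40g  # JVM heap for the Spark driver
# 50Gi container limit - 40g heap => ~10Gi headroom for off-heap JVM
# memory and the Kedro/Python process sharing the container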
Follow-ups
- Ensure Spark memory can be configured using an environment variable. See PR 1185.
- Deep dive into the Neo4J Spark connector to learn why so much memory is used.
From the run that was triggered this morning after our team discussion, it doesn't look out of the ordinary: the Spark process uses about 42GiB of memory. That is not unreasonable for a dataset that is being loaded as a single partition. The Neo4J docs make the right claims and warnings about wanting to write edges using multiple workers. An attempt at this can be made in order to lower the memory requirements, but I think it's less of a priority, seeing that the bottleneck is the graphsage algorithm, which requires the bulk of the resources.
Note that a simple alternative may actually be to not use the Spark connector at all, but to load the edges in batches, only reading partitions as needed. In pseudocode (batches is a hypothetical helper that chunks an iterator into lists of size n):
for batch in batches(edges.toLocalIterator(prefetchPartitions=False), n=5000):
    load_edges_in_neo4j(batch)
That would keep the memory Spark uses down to a single partition at a time, which you can make arbitrarily small in the previous kedro node.
- (After a few runs) investigate the memory profile and right-size the nodes.