Hello,
I have been trying to submit a model for the past hour, but I keep getting an error message saying that it failed to allocate memory. At first I thought something was wrong with my code, but we then resubmitted a zip that had previously been submitted successfully, and the same error was returned. Is it possible that there is an internal error?
Kind regards,
Elliott Dubuisson.
The last part of the error log is the following:
2022-11-23 15:30:37.711481: W tensorflow/core/framework/op_kernel.cc:1768] RESOURCE_EXHAUSTED: failed to allocate memory
Traceback (most recent call last):
  File "/multiverse/storage/lattari/Prj/postdoc/Courses/AN2DL_2022/Competition1_running_dir/worker_gpu0_dir/tmp/codalab/tmpMlmSP0/run/program/score.py", line 129, in <module>
    M = model(submission_dir)
  File "/multiverse/storage/lattari/Prj/postdoc/Courses/AN2DL_2022/Competition1_running_dir/worker_gpu0_dir/tmp/codalab/tmpMlmSP0/run/input/res/model.py", line 9, in __init__
    self.model = tf.keras.models.load_model(os.path.join(path, 'SubmissionModel/model.h5'))
  File "/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/usr/local/lib/python3.8/dist-packages/keras/backend.py", line 2142, in truncated_normal
    return tf.random.stateless_truncated_normal(
tensorflow.python.framework.errors_impl.ResourceExhaustedError: {{function_node __wrapped__AddV2_device_/job:localhost/replica:0/task:0/device:GPU:0}} failed to allocate memory [Op:AddV2]
Hello, submissions seem to be working properly for your colleagues. Please double-check on your side.
Posted by: an2dl.competitions @ Nov. 23, 2022, 5:03 p.m.
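One possible way to double-check on the submitter's side, as a sketch under assumptions: the traceback shows the allocation failing on GPU:0 while `load_model` initializes weights, and by default TensorFlow reserves nearly all GPU memory up front. The helper below (the name `enable_gpu_memory_growth` is hypothetical and not part of the competition code) asks TensorFlow to allocate GPU memory on demand instead. It would have to run before any other TensorFlow GPU operation, for example at the top of `model.py`. It is only a guess at a mitigation and would not help if another process already occupies the GPU on the scoring worker.

```python
def enable_gpu_memory_growth():
    """Ask TensorFlow to grow GPU memory on demand instead of reserving it all.

    Hypothetical helper for illustration. Returns the names of the GPUs for
    which memory growth was enabled (an empty list if TensorFlow is not
    installed or no GPU is visible).
    """
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment: nothing to configure.
        return []

    configured = []
    for gpu in tf.config.list_physical_devices("GPU"):
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
            configured.append(gpu.name)
        except (RuntimeError, ValueError):
            # The setting must be applied before the GPU is first used;
            # if the device is already initialized, skip it.
            pass
    return configured


if __name__ == "__main__":
    print("Memory growth enabled on:", enable_gpu_memory_growth())
```

Running the same zip locally with this call placed before `tf.keras.models.load_model(...)` can at least show whether the model loads within the memory of a single GPU.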