Today, for reasons I can't figure out, trying to train any NN in any notebook results in an unknown error followed by a kernel restart. I reinstalled CUDA, cuDNN, Anaconda and TensorFlow, changed batch sizes and the resize layer, and tried a simpler model, but nothing solves the issue. It also gives me an error because it can't find the path to the crash log... Everything worked almost perfectly yesterday, by the way. Unfortunately, the dedicated GPU and the Colab host take 10 to 30 minutes to train a single epoch, and I don't have time for that. Can someone help me? :)
Posted by: Emibono11 @ Nov. 21, 2021, 3:57 p.m.
PROBLEM SOLVED
I had downloaded a faulty version of cuDNN -.-
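
For anyone who lands here with the same symptom, a minimal sketch of one way to see which CUDA and cuDNN versions the installed TensorFlow build expects, so they can be compared with what is actually installed. This is not what was run in the post above. It assumes TensorFlow 2.3 or newer, where `tf.sysconfig.get_build_info()` is available, and `describe_gpu_build` is a hypothetical helper name used only for illustration.

```python
def describe_gpu_build(build_info):
    """Summarise the CUDA/cuDNN versions a TensorFlow build was compiled against.

    `build_info` is a dict-like object such as the one returned by
    tf.sysconfig.get_build_info(). CPU-only builds may lack these keys,
    so missing values are reported as "unknown".
    """
    cuda = build_info.get("cuda_version", "unknown")
    cudnn = build_info.get("cudnn_version", "unknown")
    return f"built for CUDA {cuda}, cuDNN {cudnn}"


def main():
    # Import lazily so the helper above can be used without TensorFlow installed.
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
        return

    print("TensorFlow", tf.__version__, describe_gpu_build(tf.sysconfig.get_build_info()))
    # An empty list here means TensorFlow could not see or load the GPU.
    print("Visible GPUs:", tf.config.list_physical_devices("GPU"))


if __name__ == "__main__":
    main()
```

If the reported cuDNN version does not match the one that was downloaded, that mismatch is a likely suspect. A corrupted download would not show up in this check, so a fresh copy may still be worth trying.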