After disabling Jumbo frames I have not seen any crashes. RDMA is enabled. SR-IOV had to be disabled a couple of days ago, because virtual machines sometimes lost network connectivity. I have not found the cause or a solution.
Apologies for the delay in my response. The family's a little sick, and I've been a little sick myself as a result. I'm mostly fine, but not quite 100% yet.
Have you tried disabling RDMA and enabling Jumbo on one of the machines? Because a few things have changed in a short period (network card and a few settings), I'm trying to pinpoint this to a specific element so we can hopefully find a better solution.
Hi. I hope you and your family are okay now. I have been to the hospital myself.
No, we can't disable RDMA, as doing so causes a catastrophic drop in our software's performance. I can try to schedule the test for the next maintenance window. However, I came into work today and found out that we have a big audit, so we have to cancel all scheduled work that involves stopping services.