SQL Cluster Disk Witness Errors
We recently upgraded our production SQL Server cluster. We started with the passive node, removing the old operating system (Windows Server 2012 R2) and installing Windows Server 2019. We then detached the databases from SQL Server, leaving all of the database files intact on the SAN LUNs. Next, we wiped the active node, installed Windows Server 2019 on it, and set up the cluster and SQL Server 2019.
We then configured SQL Server and attached all of the databases from the LUNs, which we had reattached during cluster creation.
Shortly afterward, I noticed in our monitoring that SQL Server kept restarting and failing over between nodes.
I couldn't find anything in the SQL logs beyond the fact that the server kept restarting, so I opened Failover Cluster Manager to see if there were any errors there.
There were a bunch of the same vague errors.
To find more information, I followed a blog post by Pinal Dave on how to generate the cluster log.
Once I started reading the log, I noticed the timestamps appeared to be in UTC, so I reran the command with the -UseLocalTime flag to write the log out in my time zone.
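A cluster log can run to many thousands of lines, so it can help to filter it down before reading. The following is a minimal Python sketch, not the method used in this post. It assumes the text file written by a command such as Get-ClusterLog uses the layout "<process id>.<thread id>::<timestamp> <LEVEL> <message>" with levels such as ERR and WARN. The function name find_problem_lines and its parameters are hypothetical, chosen only for illustration.

```python
import re

# Assumed line layout (an assumption, not confirmed by this post):
#   <hex pid>.<hex tid>::<YYYY/MM/DD-HH:MM:SS.mmm> <LEVEL> <message>
LINE_RE = re.compile(
    r"^[0-9a-fA-F]+\.[0-9a-fA-F]+::"
    r"(?P<stamp>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<message>.*)$"
)


def find_problem_lines(lines, levels=("ERR", "WARN"), keyword=None):
    """Yield (timestamp, level, message) for log lines at the given levels.

    lines   -- any iterable of strings, for example an open text file
    levels  -- severity labels to keep (assumed labels)
    keyword -- optional case-insensitive text the message must contain
    """
    wanted = set(levels)
    needle = keyword.lower() if keyword else None
    for line in lines:
        match = LINE_RE.match(line.rstrip("\r\n"))
        if match is None:
            continue  # skip headers and anything not in the assumed layout
        if match["level"] not in wanted:
            continue
        if needle and needle not in match["message"].lower():
            continue
        yield match["stamp"], match["level"], match["message"]
```

To use it, open the generated log file as text (the file's encoding may vary, so pass the right encoding to open) and call something like find_problem_lines(log_file, keyword="witness") to list only the errors and warnings that mention the witness.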
I found a post on the MSDN forum suggesting that I make sure the nodes were selected under 'Possible Owners' for the disk witness. I checked, and they were.
Another suggestion was to pull the disk witness out of the cluster, format it, and place it back in the cluster. We hadn't started with a new drive, because we upgraded the servers and reused the same SAN disks. Given that, and all of the errors surrounding the cluster disk witness, I gave this a try. It worked. No more errors, no more restarting.