Leo3 Unscheduled Maintenance

Due to persistent instabilities in the job scheduling system, the Leo3 Compute Cluster was shut down for maintenance from Thu, 9 July, 12:00 to Fri, 10 July, 16:00.

All running jobs were terminated, and queued jobs were deleted. Other systems were not affected.

Background information

Starting on 1 July, we experienced instabilities in the SGE job scheduling system, which led to repeated job failures and worker nodes becoming unavailable. Despite intensive efforts, it was not possible to isolate and fix the problem under normal operation, which made a restart of the whole cluster unavoidable.