[Sciserver-users] Failed compute node

Jonas Haase jhaase at mpe.mpg.de
Wed Jan 17 12:33:05 CET 2024


Dear Sciserver users,

Unfortunately, we had an issue with the compute node sciserver-comp5: Docker and the individual container processes had become unresponsive and refused to shut down cleanly.
As a last resort I rebooted the machine. It has come back up, but unfortunately the virtual disk that holds the container information became corrupted in the process.

I will try to repair the disk, but if that does not work I will have to replace it, which will mean the loss of the containers that were running on that machine.
Your data stored on the Storage and Temporary volumes remains unaffected.

I have turned the node off for the moment. You can still start new containers in the SciServerMPE-Large domain; these should then run on sciserver-comp7 instead.

My apologies for the inconvenience.
Jonas

—
Jonas Haase
Max Planck Institute for Extraterrestrial Physics (MPE)
Giessenbachstr. 1, 85748 Garching, Germany
X2 366
+49 89 30000 3706



