<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;">Dear all<div><br></div><div>I managed to restart sciserver-comp7 without issues this time, and the disk hosting the Docker containers remained intact.</div><div>I have reinstated all containers on the compute nodes that were accessed within the last year, and you should be able to restart them now (they might take a little while to start up again).</div><div><br></div><div>I am still looking into what caused the compute nodes to fail, but a reasonable guess is excessive resource use.</div><div>Please run all processing that uses a lot of CPU, memory or time as a compute job instead of in an interactive container, so you don’t disturb other users.</div><div>sciserver-comp7 had 31 containers running, so even though it looks like you are alone in your container, there are other people using the server at all times as well.</div><div><br></div><div>cheers</div><div>Jonas<br><div><br><blockquote type="cite"><div>On 1. 
Feb 2024, at 16:50, Jonas Haase <jhaase@mpe.mpg.de> wrote:</div><br class="Apple-interchange-newline"><div><meta charset="UTF-8"><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">Dear All</span><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><br></div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">Unfortunately we have another similar case with the other large compute node.</div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">The VM sciserver-comp7 is currently completely unreachable and is using all of its memory.</div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: 
none;"><br></div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">I want to avoid causing another disk failure by hard-resetting the machine, so for now I have isolated the compute node from the system.</div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">You will likely not be able to reach any existing container on sciserver-comp7 (you will get an unhelpful 504 Gateway Timeout error), but new large containers can be started and should automatically run on sciserver-comp5 instead. 
</div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><br></div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">I have also reduced the total memory a container is allowed to use to 100 GB, to reduce the risk of heavy processing in a few containers gobbling up all the memory of a compute node.</div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">If you are running tasks that need a lot of memory or many processors, please run them as a Compute Job instead.</div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">(see p.14 of the getting started document)</div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><a 
href="https://datashare.mpcdf.mpg.de/s/1e0CF3yRNcgDL4V">https://datashare.mpcdf.mpg.de/s/1e0CF3yRNcgDL4V</a></div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><br></div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">I will let you know as soon as I have made progress with the failed node.</div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><br></div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">cheers</div><div style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">Jonas<br id="lineBreakAtBeginningOfMessage"><div><br><blockquote type="cite"><div>On 22. 
Jan 2024, at 10:24, Jonas Haase <jhaase@mpe.mpg.de> wrote:</div><br class="Apple-interchange-newline"><div><div style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;">Dear All<div><br></div><div>Unfortunately there was no way to recover the failed drive, so I had to reinitialize it (with higher safety settings this time, knock on wood).</div><div>That means the containers previously running on sciserver-comp5 were lost - they should have disappeared from your lists in compute already.</div><div>I hope this has not caused any undue trouble. </div><div><br></div><div>The compute node is back online.</div><div><br></div><div>cheers</div><div>Jonas<br id="lineBreakAtBeginningOfMessage"><div><br><blockquote type="cite"><div>On 17. Jan 2024, at 12:33, Jonas Haase <jhaase@mpe.mpg.de> wrote:</div><br class="Apple-interchange-newline"><div><div>Dear Sciserver users<br><br>Unfortunately we had an issue with the compute node sciserver-comp5, where Docker and the individual container processes had become unresponsive and refused to shut down cleanly.<br>As a last resort I rebooted the machine. It has come back up, but unfortunately the virtual disk which holds the container information became corrupted in the process. <br><br>I will try to fix the disk, but if that does not work out I will have to replace it, which will lead to the loss of the containers that were running on that machine.<br>Your data stored on the Storage and Temporary volumes remains unaffected.<span class="Apple-converted-space"> </span><br><br>I have turned the node off for the moment; you can still start new containers in the SciServerMPE-Large domain, which should then run on sciserver-comp7 instead.<br><br>My apologies for the inconvenience<br>Jonas<br><br>—<br>Jonas Haase<br>Max Planck Institute for Extraterrestrial Physics (MPE)<br>Giessenbachstr. 
1, 85748 Garching, Germany<br>X2 366<br>+49 89 30000 3706<br><br><br>--<span class="Apple-converted-space"> </span><br>Sciserver-users mailing list<br>Sciserver-users@lists.mpe.mpg.de<br>https://lists.mpe.mpg.de/cgi-bin/mailman/listinfo/sciserver-users<br></div></div></blockquote></div><br><div><div dir="auto" style="caret-color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div dir="auto" style="caret-color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div dir="auto" style="caret-color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div style="letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div style="font-family: Helvetica; font-style: normal; font-variant-ligatures: normal; font-variant-position: normal; font-variant-caps: normal; font-variant-numeric: normal; font-variant-alternates: normal; font-variant-east-asian: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; overflow-wrap: break-word; 
-webkit-nbsp-mode: space; line-break: after-white-space;"><div style="font-family: Helvetica; font-style: normal; font-variant-ligatures: normal; font-variant-position: normal; font-variant-caps: normal; font-variant-numeric: normal; font-variant-alternates: normal; font-variant-east-asian: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><span class="Apple-style-span" style="border-collapse: separate; font-family: Helvetica; font-variant-ligatures: normal; font-variant-position: normal; font-variant-numeric: normal; font-variant-alternates: normal; font-variant-east-asian: normal; font-weight: normal; line-height: normal; border-spacing: 0px; -webkit-text-decorations-in-effect: none;"><div style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;">—<br>Jonas Haase<br>Max Planck Institute for Extraterrestrial Physics (MPE)<br>Giessenbachstr. 
1, 85748 Garching, Germany</div><div style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><span style="text-align: -webkit-auto; -webkit-text-decorations-in-effect: none;">X2 366</span></div><div style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;">+49 89 30000 3706</div></span></div></div></div></div></div></div></div><br></div></div></div></blockquote></div><br><div><div dir="auto" style="caret-color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div dir="auto" style="caret-color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div dir="auto" style="caret-color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div style="letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div style="font-family: Helvetica; font-style: normal; font-variant-ligatures: normal; font-variant-position: normal; font-variant-caps: normal; font-variant-numeric: normal; font-variant-alternates: normal; font-variant-east-asian: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 
0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div style="font-family: Helvetica; font-style: normal; font-variant-ligatures: normal; font-variant-position: normal; font-variant-caps: normal; font-variant-numeric: normal; font-variant-alternates: normal; font-variant-east-asian: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><span class="Apple-style-span" style="border-collapse: separate; font-family: Helvetica; font-style: normal; font-variant-ligatures: normal; font-variant-position: normal; font-variant-caps: normal; font-variant-numeric: normal; font-variant-alternates: normal; font-variant-east-asian: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; border-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-stroke-width: 0px;"><div style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;">—<br>Jonas Haase<br>Max Planck Institute for Extraterrestrial Physics (MPE)<br>Giessenbachstr. 
1, 85748 Garching, Germany</div><div style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><span style="text-align: -webkit-auto; -webkit-text-decorations-in-effect: none;">X2 366</span></div><div style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;">+49 89 30000 3706</div></span></div></div></div></div></div></div></div><br></div><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">--<span class="Apple-converted-space"> </span></span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; display: inline !important;">Sciserver-users mailing list</span><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><a href="mailto:Sciserver-users@lists.mpe.mpg.de" style="font-family: Helvetica; font-size: 12px; font-style: normal; 
font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">Sciserver-users@lists.mpe.mpg.de</a><br style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><a href="https://lists.mpe.mpg.de/cgi-bin/mailman/listinfo/sciserver-users" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">https://lists.mpe.mpg.de/cgi-bin/mailman/listinfo/sciserver-users</a></div></blockquote></div><br><div>
<meta charset="UTF-8"><div dir="auto" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div dir="auto" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div dir="auto" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div style="color: rgb(0, 0, 0); letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant-ligatures: normal; font-variant-position: normal; font-variant-caps: normal; font-variant-numeric: normal; font-variant-alternates: normal; font-variant-east-asian: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><div style="color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant-ligatures: normal; 
font-variant-position: normal; font-variant-caps: normal; font-variant-numeric: normal; font-variant-alternates: normal; font-variant-east-asian: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant-ligatures: normal; font-variant-position: normal; font-variant-caps: normal; font-variant-numeric: normal; font-variant-alternates: normal; font-variant-east-asian: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; border-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-stroke-width: 0px;"><div style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;">—<br>Jonas Haase<br>Max Planck Institute for Extraterrestrial Physics (MPE)<br>Giessenbachstr. 1, 85748 Garching, Germany</div><div style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;"><span style="text-align: -webkit-auto; -webkit-text-decorations-in-effect: none;">X2 366</span></div><div style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;">+49 89 30000 3706</div></span></div></div></div></div></div></div>
</div>
<br></div></body></html>