Hello,
I was reading the administrator's manual and I have a question about the section on setting up a judgehost: https://www.domjudge.org/docs/admin-manual-3.html#ss3.7
It recommends the 'swapaccount=1' kernel parameter, but that made me wonder if you actually want swap on a judgehost. I'm not fully familiar with how Linux processes and swap work in detail, but say a judgehost is judging a random submission while the RAM of the host is almost exhausted. If the memory is swapped to disk to clear space for the memory allocated by the submission, doesn't that potentially heavily impact the run time of the submission? And if that's true, wouldn't you see a TIMELIMIT that is not fully justified, because while you might not exceed CPU time, you will exceed wall time?
I hope you can shed some light on this.
Regards,
Wim
Hoi Wim,
On Fri, May 4, 2018 00:14, Wim de With wrote:
> I was reading the administrator's manual and I have a question about the section on setting up a judgehost: https://www.domjudge.org/docs/admin-manual-3.html#ss3.7
> It recommends the 'swapaccount=1' kernel parameter, but that made me wonder if you actually want swap on a judgehost. I'm not fully familiar with how Linux processes and swap work in detail, but say a judgehost is judging a random submission while the RAM of the host is almost exhausted. If the memory is swapped to disk to clear space for the memory allocated by the submission, doesn't that potentially heavily impact the run time of the submission? And if that's true, wouldn't you see a TIMELIMIT that is not fully justified, because while you might not exceed CPU time, you will exceed wall time?
Yes, agreed. We set swapaccount so the cgroup actually tracks the amount of swap used. Cgroup has two memory limits: the actual memory and mem+swap. We set both to the same value. Therefore we're sure that no swapping will occur:
    cgroup_add_value(int64, "memory.limit_in_bytes", memsize);
    cgroup_add_value(int64, "memory.memsw.limit_in_bytes", memsize);
So actually we need to have swapaccount in order to enforce that there's no swapping.
Cheers,
Thijs
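As a rough illustration of the two limits described in the message above, the sketch below writes the same value to both cgroup v1 control files and then computes how much extra swap the group is allowed. The helper names `set_equal_limits` and `swap_allowance`, and the cgroup directory passed to them, are hypothetical; DOMjudge itself sets these values from C through libcgroup, as in the quoted calls. The memory limit is written first because the kernel rejects a memory+swap limit that is lower than the memory limit. An allowance of zero only means the group gets no additional room by swapping; it does not by itself stop the kernel from swapping pages out under global memory pressure.

```python
from pathlib import Path


def set_equal_limits(cgroup_dir, memsize):
    """Write the same byte limit to both cgroup v1 memory control files.

    cgroup_dir is a hypothetical path to an existing memory cgroup directory.
    """
    cgroup = Path(cgroup_dir)
    # Memory limit first: memsw must never be lower than the memory limit.
    (cgroup / "memory.limit_in_bytes").write_text(f"{memsize}\n")
    (cgroup / "memory.memsw.limit_in_bytes").write_text(f"{memsize}\n")


def swap_allowance(cgroup_dir):
    """Return memsw limit minus memory limit, i.e. the extra swap allowed."""
    cgroup = Path(cgroup_dir)
    mem = int((cgroup / "memory.limit_in_bytes").read_text())
    memsw = int((cgroup / "memory.memsw.limit_in_bytes").read_text())
    return memsw - mem
```

The memsw file only exists when swap accounting is enabled, which is what the 'swapaccount=1' parameter is for.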
Hi Thijs,
On Fri, May 04, 2018 at 08:54:01AM +0200, Thijs Kinkhorst wrote:
> Yes, agreed. We set swapaccount so the cgroup actually tracks the amount of swap used. Cgroup has two memory limits: the actual memory and mem+swap. We set both to the same value. Therefore we're sure that no swapping will occur:
>     cgroup_add_value(int64, "memory.limit_in_bytes", memsize);
>     cgroup_add_value(int64, "memory.memsw.limit_in_bytes", memsize);
> So actually we need to have swapaccount in order to enforce that there's no swapping.
Ah right, that will prevent the submission itself from using swap, and that is the only thing that is relevant within DOMjudge scope.
Correct me if I'm wrong, but if a judgehost is so low on RAM that it needs to swap other memory to accommodate the submission, you have a problem anyway, even if this swapping takes time. Without swap it would just crash or (maybe?) the OOM killer would kill processes. Even though all outcomes are undesirable, crashes seem more obvious to me.
On 04/05/18 12:17, Wim de With wrote:
> Ah right, that will prevent the submission itself from using swap, and that is the only thing that is relevant within DOMjudge scope.
If I'm interpreting the description of the option correctly, see
https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html
https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
then 'swapaccount=1' does not actually disable swapping for cgroups with "memory.memsw.limit_in_bytes" set, but only makes sure that the total memory _including swap_ is accounted for.
> Correct me if I'm wrong, but if a judgehost is so low on RAM that it needs to swap other memory to accommodate the submission, you have a problem anyway, even if this swapping takes time. Without swap it would just crash or (maybe?) the OOM killer would kill processes. Even though all outcomes are undesirable, crashes seem more obvious to me.
If you want to make sure that the kernel doesn't swap (neither the submission nor any other program), then the best option is probably to disable swap altogether. Secondly, make sure that you do not run anything else on that machine and that it has sufficient memory to support the maximum memory limit set. That alone should be enough to keep the submission from being swapped out; optionally, lower the sysctl setting vm.swappiness to reduce the chances of anything getting swapped out.
Best,
Jaap
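To check whether a judgehost follows the advice above, one option is to inspect /proc/swaps and /proc/sys/vm/swappiness. The following is a minimal sketch, not part of DOMjudge; the function names are hypothetical, and `proc_root` is a parameter only so that the code can be pointed at a copy of the files.

```python
from pathlib import Path


def active_swap_devices(swaps_text):
    """Return the names of the swap areas listed in /proc/swaps-style text."""
    lines = swaps_text.strip().splitlines()
    # The first line is the column header; each later line is one swap area.
    return [line.split()[0] for line in lines[1:] if line.strip()]


def read_swap_state(proc_root="/proc"):
    """Return (list of active swap areas, vm.swappiness value)."""
    root = Path(proc_root)
    devices = active_swap_devices((root / "swaps").read_text())
    swappiness = int((root / "sys" / "vm" / "swappiness").read_text())
    return devices, swappiness
```

An empty device list means no swap area is active, which is the "disable swap altogether" case.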
On Sun, May 6, 2018 at 7:45 AM, Jaap Eldering <jaap@jaapeldering.nl> wrote:
On Fri, May 4, 2018 00:14, Wim de With wrote:
> If the memory is swapped to disk to clear space for the memory allocated by the submission, doesn't that potentially heavily impact the run time of the submission? And if that's true, wouldn't you see a TIMELIMIT that is not fully justified, because while you might not exceed CPU time, you will exceed wall time?
> Yes, agreed.
Sorry for sidetracking, but I am curious: why is this situation considered abnormal? AFAIU, this is precisely the reason why the time limit is calculated based on user CPU time, and not on wall-clock or system CPU time. A reasonable amount of swapping should not affect the judgement at all.
On the other hand, if memory is so low that the OS is continuously thrashing, then this judge instance is doomed anyway, and should stop and notify the admin.
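The distinction between user CPU time and wall-clock time drawn in this message can be seen in a small experiment. The sketch below is only an illustration and says nothing about how DOMjudge measures time; `measure` and `blocked_wait` are hypothetical names, and the sleep is a stand-in for a process stalled on I/O such as paging from swap.

```python
import resource
import time


def measure(func):
    """Run func and return (wall_seconds, user_cpu_seconds) it consumed."""
    usage_before = resource.getrusage(resource.RUSAGE_SELF)
    wall_before = time.monotonic()
    func()
    wall = time.monotonic() - wall_before
    usage_after = resource.getrusage(resource.RUSAGE_SELF)
    return wall, usage_after.ru_utime - usage_before.ru_utime


def blocked_wait():
    # Stand-in for a process that is waiting rather than computing.
    time.sleep(0.2)
```

Calling `measure(blocked_wait)` should report a wall time of about 0.2 seconds and a user CPU time close to zero, since a waiting process accumulates wall-clock time without accumulating user CPU time.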