Thank you for your reply.
I asked in another message about which Apache MPM module to use. Apache is currently set to "event"--is that OK, or should it be set to "prefork"? I can set it to prefork and raise the limits substantially from last year.
Is the domjudge-fpm.conf file supposed to be placed in /etc/php-fpm.d? It looks that way, but I don't see any reference to that in the documentation.
Any assistance you can provide will be greatly appreciated.
Marc
----- Original Message -----
From: "Keith Johnson" <kj@ubergeek42.com>
To: "marcf @dslextreme.com" <marcf@dslextreme.com>, "domjudge-devel" <domjudge-devel@domjudge.org>
Sent: Friday, November 9, 2018 3:19:59 PM
Subject: Re: Apache/DOMjudge Tuning
Probably I've broken the mailing list thread, but I wanted to reply to you before it's too late.
In the interest of avoiding the issue we had last year we want to ensure we have plenty of web service resources available for DOMjudge this weekend. I have installed DOMjudge 6.0.2 and it seems to be working.
When I look at the current processes running (with the default configuration) I see five httpd processes and five php-fpm processes. From what I can tell, it looks like I should use php-fpm tuning parameters to greatly increase the number of php-fpm processes running. (We have 16 GB of RAM, so I see no problem with doing so.)
Does the DOMjudge team have any recommendations on which parameters to change to support 100 teams? Am I looking in the right place to ensure that we don't run out of web service resources again?
100 teams should be pretty easy to support on hardware with those specifications.
For PHP-FPM, there are a couple of relevant parameters to tune based on your system. The first is pm = static. This disables scaling of fpm workers so there is always a fixed number of workers available. The number of fixed workers is controlled by pm.max_children. The domjudge-fpm.conf file that ships with DOMjudge seems to set these and gives some guidance on how to choose the value, but I think the comment might be a little out of date since the move to Symfony, which requires a bit more memory.
Running DOMjudge on master a few weeks back, I seem to recall 100-150 MB per fpm worker on average, so the value you choose for pm.max_children should be based on this. E.g. if you want to use 10 GB of your server memory, then at 150 MB per worker you'd want to set it to around 70 (which is probably plenty for 100 teams). You should also make sure that pm.max_requests is set to some reasonably large non-zero value. This just protects you from unknown memory leaks by restarting fpm worker processes after some number of requests (I use 5000). I've never used Apache in front of php-fpm, but you may want to check the MaxRequestWorkers/KeepAliveTimeout/KeepAlive settings. If you are on a local network/not using SSL, you can probably just disable keepalive to ensure you don't run into the errors encountered last year.
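As a rough sketch of that sizing arithmetic (the helper name and the 1 GB = 1024 MB conversion are assumptions for illustration, not anything taken from DOMjudge or PHP-FPM):

```python
# Sizing sketch for pm.max_children: memory budget divided by per-worker memory.
# The 10 GB budget and the 100-150 MB per-worker range come from the message;
# the function name and the 1024 MB-per-GB conversion are assumptions.

def max_children(budget_mb: int, per_worker_mb: int) -> int:
    """Return how many fixed workers fit in the given memory budget."""
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    return budget_mb // per_worker_mb

budget_mb = 10 * 1024

print(max_children(budget_mb, 150))  # pessimistic per-worker size: 68
print(max_children(budget_mb, 100))  # optimistic per-worker size: 102
```

Sizing against the larger per-worker figure is the safer choice, since a static pool keeps every worker alive for the whole contest.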
The best thing you can do, though, is some load testing to see what happens under load. You can probably accomplish this with just apachebench, but make sure you point it at a real page (i.e. something that isn't just a 302 redirect or a static file or something like that). A good page is probably the login page; do a curl -I on it first to make sure it returns a 200 OK with some actual content. For 100 teams you're probably only looking at like 10 requests/second tops (team pages refresh every 30 seconds, which is about 3 reqs/sec; add in some spectators and people who open multiple tabs and that gets you to 10ish). That said, your server should be able to do way more requests/sec than that.
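The request-rate estimate can be written out as a small calculation. This is only a sketch: the 200 extra open pages (spectators, extra tabs) is a made-up figure chosen to show how the total reaches roughly 10 requests/second, and it assumes those pages refresh on the same 30-second interval.

```python
# Back-of-the-envelope request rate from pages that reload periodically.
# 100 teams and the 30-second refresh come from the message; the 200 extra
# open pages is a hypothetical number used purely for illustration.

def refresh_rate(open_pages: int, refresh_seconds: float) -> float:
    """Requests per second generated by pages reloading on a fixed interval."""
    if refresh_seconds <= 0:
        raise ValueError("refresh_seconds must be positive")
    return open_pages / refresh_seconds

teams_only = refresh_rate(100, 30)
with_extras = refresh_rate(100 + 200, 30)

print(f"teams only:  {teams_only:.1f} req/s")   # 3.3
print(f"with extras: {with_extras:.1f} req/s")  # 10.0
```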
An apachebench command that would simulate using keepalive + 150 concurrent users would be something like this:

    ab -k -c 150 -n 15000 example.com/domjudge/login
In the results you'll want to pay attention to a few things: the 99th (and the other) percentiles at the bottom, and the max time for each. You'd want to see something like 95% of requests under a couple hundred ms, and 100% of requests under 1-2 seconds or so. I think there is also a count for errors (if it happens to blow up), and you want that to be zero. Play with the value of the -c argument to test more/fewer concurrent "users" so you can get a good idea of what your server can handle (I would keep increasing it until you discover at what point it starts to fail). You might also want to watch the memory usage of the php-fpm workers and tune the pm.max_children value as needed.
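One way to step through -c values is to generate the ab command lines up front. The sketch below only prints the commands and does not run anything; the helper name and the list of concurrency levels are hypothetical, and the only flags used are -k, -c and -n.

```python
# Print one ab command per concurrency level; nothing is executed here.
# The concurrency levels and the helper name are hypothetical.
import shlex

def ab_command(url: str, concurrency: int, requests: int = 15000,
               keepalive: bool = True) -> list[str]:
    """Build the argument list for a single ab run."""
    if concurrency > requests:
        raise ValueError("ab needs at least as many requests as concurrent clients")
    args = ["ab"]
    if keepalive:
        args.append("-k")
    args += ["-c", str(concurrency), "-n", str(requests), url]
    return args

for level in (50, 100, 150, 200, 300):
    print(shlex.join(ab_command("example.com/domjudge/login", level)))
```

Running the printed commands one at a time, from lowest to highest, makes it easier to see where response times or the error count start to degrade.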
As an aside, I don't run Apache (opting for nginx instead), but I try to configure my web server to set caching headers for css/png/jpg/etc., as this will reduce the load on the server. Make sure you're all set with customizations first, as it'll be difficult to change css/js after the fact. You can skip this step if you'd like; 100 teams is not that many, and a properly configured server should be able to easily serve that many requests.
The judge hosts check in periodically, so make sure you account for them in your load testing. More judge hosts = more requests making it to the domjudge server. Sometimes substantially so, e.g. in the case of a problem with ~100 test cases and a tiny runtime you’ll probably see multiple requests per second just to judge that. So if you have lots of judge hosts that can negatively impact the performance of your web server.
I hope that rambling is helpful in some way and provides you some useful tuning options.
-Keith