We run a dozen fairly low-traffic Concrete CMS sites (currently on 9.4.5), all with page caching enabled, alongside a bespoke, mission-critical ERP on an enterprise-level cluster of high-end servers. For this reason, the biggest performance issue with Concrete CMS, the elephant-in-the-room of the high number of SQL queries per page render, has not been a particular issue for us. Until now.
We appreciate and are grateful that Concrete CMS has always been an open source project, and we don't take Franz Maruna's sterling efforts for granted. Twelve years ago I took on board his comments in response to this 2013 article: Concrete5 "Enterprise Ready"? Not entirely…
However, now the high number of queries is becoming an issue, particularly in regard to rendering the 404 page: page_not_found.php.
Bot attacks are becoming ever more common, often concurrent across different sites on the same server. These are automated, high-frequency page requests, usually looking for known-vulnerable scripts found in systems such as WordPress. Examples include:
- /.env.prod
- /.env.backup
- /admin/.env.save
- /config/.env.save
- /lib/.env.bak
- /application/.env.bak
- /system/.env.save
- /core/.env.save
- /storage/.env.old
…and on and on ad infinitum!
We run very simple "page_not_found" pages with just one block area containing one image, some text and a link-button. No header, no footer and no menus. And yet rendering this simplest possible page typically involves the following number of SQL queries:
404 page db queries
- logged in with admin bar/tools: 804
- logged out without page caching: 242
- logged out with page caching: 167
So what are these 167 queries doing? A quick glance at some of them suggests that they are often checking permissions and, more surprisingly, looking in all the block areas defined for other page templates, which do not exist in the 404 page template. Furthermore, most of the queries include complex joins. The page usually takes about 200ms to render (according to phpdebugbar), but more important is the load on MySQL. An aggressive botnet attack on multiple sites can reach 100 requests per second, and if they are all for pages that don't exist then, even with page caching, we are expecting the server to cope with 16,700 SQL queries per second (100 × 167). That is too much to ask of even a high-end dedicated server and, when replicated, becomes even more of a headache.
If the bot attack hits genuine pages then things really get wild. These are the query figures for a typical home page, with the best scenario still involving a massive 225 queries, and that's before we start serving any content:
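To make the arithmetic explicit, here is a minimal sketch that multiplies the request rate by the per-request query counts listed above for the 404 page. The function and constant names are mine, purely for illustration.

```php
<?php
declare(strict_types=1);

// Query counts for the 404 page, taken from the figures listed above.
const QUERIES_PER_404 = [
    'logged out, page caching on'  => 167,
    'logged out, page caching off' => 242,
];

/**
 * Estimated SQL queries per second when every request renders the 404 page.
 */
function queriesPerSecond(int $requestsPerSecond, int $queriesPerRequest): int
{
    return $requestsPerSecond * $queriesPerRequest;
}

// 100 requests/second is the botnet rate quoted above.
foreach (QUERIES_PER_404 as $scenario => $queries) {
    printf("%s: %d queries/s\n", $scenario, queriesPerSecond(100, $queries));
}
```

With page caching on this gives the 16,700 queries per second mentioned above; with it off the same attack would mean 24,200.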
Home page db queries
- logged in with admin bar/tools: 2399
- logged out without page caching: 991
- logged out with page caching: 225
We don't want to run Varnish. Aside from the fact that it should not be necessary for simple public CMSs running under very modest loads, it has to be used for everything on the server, it requires an SSL-terminating reverse proxy, and it can introduce a host of secondary complexities, especially on a replicated cluster platform. We do run Varnish for our Magento sites, so we do understand the issues.
Before I post this on GitHub as a serious issue (i.e. botnet calls to the 404 page dragging servers to a standstill), am I missing anything obvious? Is there some config option that I am unaware of? Have I failed to read some instruction manual that explains how to minimise SQL queries?
As an aside, I have come up with a quick-and-dirty source hack to minimise the queries for a 404 call by adding a simple static "page not found" file to the root directory (/my404page.php) and then intercepting and diverting requests for unknown pages at line 74 of /concrete/src/Http/ResponseFactory.php as follows:
public function notFound($content, $code = Response::HTTP_NOT_FOUND, $headers = [])
{
    // Hack: skip rendering the CMS 404 page and redirect to the static file instead.
    return $this->redirect('my404page.php');
    [snip]
}
I haven't logged/counted the resultant queries but as the rendering is being bypassed, I would expect it to be in single figures.
This hack could, if it is considered to be worthy for devs who wish to create their own static 404 pages, be surrounded by a conditional that references a new config option of something worded like: "Do you wish to use a bespoke static 404 page located in the root directory and called 'my404page.php' [or whatever] …Y/N?"
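For anyone wanting to try this, the static file itself might look something like the sketch below. The markup and the function name are placeholders of mine, not what we actually run. Because the redirect answers with a 3xx status, the file sets the 404 status itself so that bots and crawlers still see "not found".

```php
<?php
declare(strict_types=1);

// Hypothetical contents for /my404page.php. It loads no CMS code,
// so serving it should not touch the database at all.

function static404Html(): string
{
    return <<<'HTML'
<!DOCTYPE html>
<html lang="en">
<head><meta charset="utf-8"><title>Page not found</title></head>
<body>
  <h1>Page not found</h1>
  <p><a href="/">Return to the home page</a></p>
</body>
</html>
HTML;
}

// Skip the output when the file is loaded from the command line (e.g. by a test).
if (PHP_SAPI !== 'cli') {
    // The redirect answers with a 3xx status, so the 404 is set here instead.
    http_response_code(404);
    header('Content-Type: text/html; charset=utf-8');
    echo static404Html();
}
```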
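As a rough sketch of how that conditional might decide what to do, assuming a made-up "static_404" config section (these keys are hypothetical and do not exist in Concrete CMS today), the logic could be as small as this:

```php
<?php
declare(strict_types=1);

/**
 * Returns the static 404 file to redirect to, or null to render the normal CMS page.
 * The 'static_404' keys are hypothetical, not existing Concrete CMS settings.
 */
function static404Target(array $config): ?string
{
    $enabled = $config['static_404']['enabled'] ?? false;
    $path = $config['static_404']['path'] ?? 'my404page.php';

    return $enabled ? $path : null;
}

// Example: option switched on with the default file name.
$target = static404Target(['static_404' => ['enabled' => true]]);
// Inside notFound() this would become:
//   if ($target !== null) { return $this->redirect($target); }
```

With the option off (or missing) the function returns null and notFound() would carry on rendering the normal page, so existing sites would see no change in behaviour.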