Node.JS - Duplicate SQL queries and efficiency improvements on COUNT queries.
Node.JS - Duplicate SQL queries and efficiency improvements on COUNT queries.
Hi,
I have noticed that when SQL command is executed by the NodeJS backend, that 3 queries are executed,
1st. a query to COUNT the total records. ( 1,448,364 in my example)
2nd. a query to COUNT the records 'filtered' (16,788 in my example)
3rd. the query to obtain the data.
I think this is used to get the counts for the info line:
Showing 1 to 23 of 16,788 entries (filtered from 1,448,364 total entries)
If there are zero filters being applied, the 1st and 2nd queries are identical (however BOTH queries are run). It would be great if there was someway to eliminate the need to run two identical queries back to back when no filtering is applied.
I mentioned this before with examples a while back (https://datatables.net/forums/discussion/72018) and at that time I thought it was just a plain duplicate query, now I realize the queries are only duplicated if there are no filtered being applied.
Also, if you are using a JOIN to get OPTIONS for one or more of the fields, these JOINS don't impact the COUNT, but do cause speed issues when work with large tables.
Here is an example of some SQL being run :
select count(`recruits`.`recruitID`) as `cnt` from `recruits`
left join `stages` on `stages`.`stageID` = `recruits`.`statusID`
left join `recruiters` as `rAsst` on `rAsst`.`agentref` = `recruits`.`RecruitmentAsst`
left join `recruiters` as `recruiter` on `recruiter`.`agentref` = `recruits`.`Recruiter`
where (`recruits`.`_search_current` > 0);
This can take 1.5sec to make the roundtrip to the SQL server and back, but the query below :
select count(`recruits`.`recruitID`) as `cnt` from `recruits`
where (`recruits`.`_search_current` > 0);
This gives the same Count, but takes 0.2secs .
I understand some JOINS might change the total record count, but I think there are some situations that you can trust it will not.. Or perhaps you might consider allowing a configuration switch to allow the designers to set which behavior to use.
Once again, thanks for your product and thanks for looking at my ramblings.
Answers
That's a good point - thanks for pointing that out. You are spot on, if there is no filter, then the second count query is redundant and can be optimised out. I'll look into that.
That's the trouble! The way to address this would be to understand the SQL better and decide if the joins are going to effect the count or not (i.e. based on the WHERE conditions that are applied). An option to enable and disable it, I think is a footgun that is going to cause someone a whole lot of grief when they forget about it!
It is a significant difference 15S to 0.2S! I've got it in my tracker to look into options for this as well.
Regards,
Allan
THanks., the filter query was taking 1.5sec (not 15s - that would be awful ), but as the query runs twice it takes 3secs total. I am looking at adding REDIS to provide better cache to see if that helps too. MY users complain about delays and I was able to find some indexing improvements but I am just trying to get it as quick as possible .
Does datatables actually need to know how many filtered records exists? could a config option disable that count if it is only needed for cosmetics ?
Oops - I missed the decimal place!
It needs to know how many there are after filtering (so the paging calculations can happen). The unfiltered case is just for information in the
info
option. Perhaps the unfiltered count should be made optional...Allan
thanks for the update. I will stay tuned to see if you make any changes.. It certainly isn't the most urgent thing I am sure, so I am happy to wait and see..
thanks again for all you guys do.