Issue:Modifiedby and createdby produce inefficient SQL
From FollowTheScore
Description: | modifiedby and createdby produce inefficient SQL |
Extension / Version: | DPL / 2.0 |
Type / Status: | Bug / open |
Problem
This ticket is about inefficient SQL queries in DPL that take very long to run. I provide some new queries that are up to 1000x faster.
The following DPL to list articles modified by a user takes over 10 seconds on our wiki:
modifiedby=Smith namespace=
Its query produces the following SQL (obtained using "debug=6"):
SELECT DISTINCT `vpw_page`.page_namespace as page_namespace ,`vpw_page`.page_title as page_title ,`vpw_page`.page_id as page_id ,`vpw_page`.page_title as sortkey FROM `vpw_page` WHERE 1=1 AND `vpw_page`.page_is_redirect=0 AND 'Smith' in ( select rev_user_text from `vpw_revision` where `vpw_revision`.rev_page=page_id) ORDER BY page_title ASC LIMIT 0, 1000
Here is a better query for modified articles that runs in 0.02 seconds:
SELECT DISTINCT `vpw_page`.page_namespace as page_namespace ,`vpw_page`.page_title as page_title ,`vpw_page`.page_id as page_id ,`vpw_page`.page_title as sortkey FROM `vpw_page` INNER JOIN `vpw_revision` ON (`vpw_page`.page_id = `vpw_revision`.rev_page) WHERE 1=1 AND `vpw_page`.page_is_redirect=0 AND `vpw_revision`.rev_user_text = 'Smith' ORDER BY page_title ASC LIMIT 0, 1000
Here is an even worse DPL query to get created articles by a user. It takes 2 minutes and 40 seconds to run on our wiki:
<dpl> createdby=Smith namespace= </dpl>
And here is better SQL that runs in 0.05 seconds:
SELECT DISTINCT `vpw_page`.page_namespace as page_namespace ,`vpw_page`.page_title as page_title ,`vpw_page`.page_id as page_id , `vpw_page`.page_title as sortkey FROM `vpw_page` INNER JOIN `vpw_revision` ON (`vpw_page`.page_id = `vpw_revision`.rev_page) WHERE 1=1 AND `vpw_page`.page_namespace IN ('0') AND `vpw_page`.page_is_redirect=0 AND `vpw_revision`.rev_user_text = 'Kolson' AND `vpw_revision`.rev_parent_id = 0 ORDER BY page_title ASC LIMIT 0, 1000