Difference between revisions of "Issue:SQL lockup"

From FollowTheScore
Jump to: navigation, search
 
Line 228: Line 228:
  
 
Note the reversed order in the second query. I set allowUnlimitedResults to true to avoid the limit (at least when it is not specified manually), and now it works. This of course isn't a good solution, but at least it's something. --&nbsp;[[User:Nx|<span style="color:teal">'''''Nx'''''</span>]]&nbsp;/&nbsp;[[User talk:Nx|''talk'']] 11:48, 29 March 2010 (UTC)
 
Note the reversed order in the second query. I set allowUnlimitedResults to true to avoid the limit (at least when it is not specified manually), and now it works. This of course isn't a good solution, but at least it's something. --&nbsp;[[User:Nx|<span style="color:teal">'''''Nx'''''</span>]]&nbsp;/&nbsp;[[User talk:Nx|''talk'']] 11:48, 29 March 2010 (UTC)
 +
 +
: Very interesting. Congratulations that you found out how something that was intended to LIMIT the use of CPO power in fact INCREASES the load in some situations ... 08:05, 30 March 2010 (UTC)

Latest revision as of 10:05, 30 March 2010

Description: SQL lockup
Extension / Version: DPL   /   1.9.0
Type / Status: Bug   /   open

Problem

The following dpl:


{{#dpl:
|createdby = LArron
|namespace = 
|namespace = Conservapedia
|namespace = Essay
|ordermethod=lastedit
|order=descending
|addpagecounter = true
|allowcachedresults = true
}}

creates this sql query:

SELECT /*  Nx */  DISTINCT `rw_page`.page_namespace as page_namespace,`rw_page`.page_title as page_title,`rw_page`.page_id as page_id, `rw_page`.page_counter as page_counter, rev_user, rev_user_text, rev_comment, rev_timestamp 
FROM `rw_revision` AS rev, `rw_page` 
WHERE 1=1  AND `rw_page`.page_namespace IN ('0','100','102') AND `rw_page`.page_is_redirect=0 AND `rw_page`.page_id=rev.rev_page 
AND rev.rev_timestamp=( SELECT MAX(rev_aux.rev_timestamp) FROM `rw_revision` AS rev_aux WHERE rev_aux.rev_page=rev.rev_page ) 
AND 'LArron' = (select rev_user_text from `rw_revision` where `rw_revision`.rev_page=page_id order by `rw_revision`.rev_timestamp ASC limit 1) 
ORDER BY rev_timestamp DESC LIMIT 0, 500;

which locks up our database at RationalWiki. The source of the problem is rev.rev_timestamp=( SELECT MAX(rev_aux.rev_timestamp) FROM `rw_revision` AS rev_aux WHERE rev_aux.rev_page=rev.rev_page ). Doing this subquery half a million times is apparently too taxing for MySQL. Now, this has worked before, and I have no idea why it suddenly stopped working. However, I managed to change the query so that it works:

SELECT /*  Nx */  DISTINCT `rw_page`.page_namespace as page_namespace,`rw_page`.page_title as page_title,`rw_page`.page_id as page_id, `rw_page`.page_counter as page_counter, rev_user, rev_user_text, rev_comment, rev_timestamp 
FROM `rw_revision` AS rev, `rw_page`, ( select max(rev_aux.rev_timestamp) as max_timestamp, rev_aux.rev_page as max_page from rw_revision as rev_aux group by rev_page ) as rev_aux
WHERE 1=1  AND `rw_page`.page_namespace IN ('0','100','102') AND `rw_page`.page_is_redirect=0 AND `rw_page`.page_id=rev.rev_page 
AND rev.rev_page = rev_aux.max_page AND rev.rev_timestamp = rev_aux.max_timestamp
AND 'LArron' = (select rev_user_text from `rw_revision` where `rw_revision`.rev_page=page_id order by `rw_revision`.rev_timestamp ASC limit 1) 
ORDER BY rev_timestamp DESC LIMIT 0, 500;

Instead of doing the subquery for every line, it does it only once and then joins it. Here's the patch to DPL 1.9.0 that does this:

--- DPLMain.php.ori	2009-11-28 02:48:11.000000000 -0500
+++ DPLMain.php	2009-11-28 02:51:05.000000000 -0500
@@ -1817,10 +1817,11 @@
                         $sSqlWhere .= " AND NOT (cl_head.cl_to IN (" . $dbr->makeList( $aCatNotHeadings ) . "))";

                     break;

                 case 'firstedit':

-                    $sSqlRevisionTable = $sRevisionTable . ' AS rev, ';

+                    $sSqlRevisionTable = $sRevisionTable . ' AS rev, ' . 
+                                         '( select max(rev_aux.rev_timestamp) as max_timestamp, min(rev_aux.rev_timestamp) as min_timestamp, rev_aux.rev_page as aux_page from rw_revision as rev_aux group by rev_page ) as rev_aux, ';

                     $sSqlRev_timestamp = ', rev_timestamp';

                     // deleted because of conflict with revsion-parameters

-                    $sSqlCond_page_rev = ' AND '.$sPageTable.'.page_id=rev.rev_page AND rev.rev_timestamp=( SELECT MIN(rev_aux.rev_timestamp) FROM ' . $sRevisionTable . ' AS rev_aux WHERE rev_aux.rev_page=rev.rev_page )';

+                    $sSqlCond_page_rev = ' AND '.$sPageTable.'.page_id=rev.rev_page AND rev.rev_page = rev_aux.aux_page AND rev.rev_timestamp = rev_aux.min_timestamp ';

                     break;

                 case 'pagetouched':

 						$sSqlPage_touched = ", $sPageTable.page_touched as page_touched";

@@ -1830,10 +1831,11 @@
 						$sSqlPage_touched = ", $sPageTable.page_touched as page_touched";

 					}

 					else {

-						$sSqlRevisionTable = $sRevisionTable . ' AS rev, ';

+						$sSqlRevisionTable = $sRevisionTable . ' AS rev, ' . 
+						                     '( select max(rev_aux.rev_timestamp) as max_timestamp, min(rev_aux.rev_timestamp) as min_timestamp, rev_aux.rev_page as aux_page from rw_revision as rev_aux group by rev_page ) as rev_aux, ';

 						$sSqlRev_timestamp = ', rev_timestamp';

 						// deleted because of conflict with revision-parameters

-						$sSqlCond_page_rev = ' AND '.$sPageTable.'.page_id=rev.rev_page AND rev.rev_timestamp=( SELECT MAX(rev_aux.rev_timestamp) FROM ' . $sRevisionTable . ' AS rev_aux WHERE rev_aux.rev_page=rev.rev_page )';

+						$sSqlCond_page_rev = ' AND '.$sPageTable.'.page_id=rev.rev_page AND rev.rev_page = rev_aux.aux_page AND rev.rev_timestamp = rev_aux.max_timestamp ';

 					}

                     break;

                 case 'sortkey':

@@ -2163,12 +2165,14 @@
         }

         

         if ($bAddAuthor && $sSqlRevisionTable =='') {

-            $sSqlRevisionTable = $sRevisionTable . ' AS rev, ';

-            $sSqlCond_page_rev .= ' AND '.$sPageTable.'.page_id=rev.rev_page AND rev.rev_timestamp=( SELECT MIN(rev_aux_min.rev_timestamp) FROM ' . $sRevisionTable . ' AS rev_aux_min WHERE rev_aux_min.rev_page=rev.rev_page )';

+            $sSqlRevisionTable = $sRevisionTable . ' AS rev, ' . 
+                                 '( select max(rev_aux.rev_timestamp) as max_timestamp, min(rev_aux.rev_timestamp) as min_timestamp, rev_aux.rev_page as aux_page from rw_revision as rev_aux group by rev_page ) as rev_aux, ';

+            $sSqlCond_page_rev .= ' AND '.$sPageTable.'.page_id=rev.rev_page AND rev.rev_page = rev_aux.aux_page AND rev.rev_timestamp = rev_aux.min_timestamp ';

         }

         if ($bAddLastEditor && $sSqlRevisionTable =='') {

-            $sSqlRevisionTable = $sRevisionTable . ' AS rev, ';

-            $sSqlCond_page_rev .= ' AND '.$sPageTable.'.page_id=rev.rev_page AND rev.rev_timestamp=( SELECT MAX(rev_aux_max.rev_timestamp) FROM ' . $sRevisionTable . ' AS rev_aux_max WHERE rev_aux_max.rev_page=rev.rev_page )';

+            $sSqlRevisionTable = $sRevisionTable . ' AS rev, ' .
+                                 '( select max(rev_aux.rev_timestamp) as max_timestamp, min(rev_aux.rev_timestamp) as min_timestamp, rev_aux.rev_page as aux_page from rw_revision as rev_aux group by rev_page ) as rev_aux, ';

+            $sSqlCond_page_rev .= ' AND '.$sPageTable.'.page_id=rev.rev_page AND rev.rev_page = rev_aux.aux_page AND rev.rev_timestamp = rev_aux.max_timestamp ';

         }

         

         if ($sLastRevisionBefore.$sAllRevisionsBefore.$sFirstRevisionSince.$sAllRevisionsSince != '') {

@@ -2178,17 +2182,20 @@
             global $wgRawHtml;

             $wgRawHtml = true;

     

-            $sSqlRevisionTable = $sRevisionTable . ' AS rev, ';

+            $sSqlRevisionTable = $sRevisionTable . ' AS rev, ' . 
+                                 '( select max(rev_aux.rev_timestamp) as max_timestamp, min(rev_aux.rev_timestamp) as min_timestamp, rev_aux.rev_page as aux_page from rw_revision as rev_aux group by rev_page ) as rev_aux, ' .
+                                 '( select max(rev_aux_bef.rev_timestamp) as max_timestamp, rev_aux_bef.rev_page as aux_page from rw_revision as rev_aux_bef where rev_aux_bef.rev_timestamp < '.$sLastRevisionBefore.' group by rev_page ) as rev_aux_bef, ' .
+                                 '( select min(rev_aux_snc.rev_timestamp) as min_timestamp, rev_aux_snc.rev_page as aux_page from rw_revision as rev_aux_snc where rev_aux_snc.rev_timestamp >= '.$sFirstRevisionSince.' group by rev_page ) as rev_aux_snc, ';

             $sSqlRev_timestamp = ', rev_timestamp';

             $sSqlRev_id = ', rev_id';

 			if ($sLastRevisionBefore!='') {

-                $sSqlCond_page_rev .= ' AND '.$sPageTable.'.page_id=rev.rev_page AND rev.rev_timestamp=( SELECT MAX(rev_aux_bef.rev_timestamp) FROM ' . $sRevisionTable . ' AS rev_aux_bef WHERE rev_aux_bef.rev_page=rev.rev_page AND rev_aux_bef.rev_timestamp < '.$sLastRevisionBefore.')';

+                $sSqlCond_page_rev .= ' AND '.$sPageTable.'.page_id=rev.rev_page AND rev.rev_page = rev_aux_bef.aux_page AND rev.rev_timestamp = rev_aux_bef.max_timestamp ';

             }

             if  ($sAllRevisionsBefore!='') {

                 $sSqlCond_page_rev .= ' AND '.$sPageTable.'.page_id=rev.rev_page AND rev.rev_timestamp < '.$sAllRevisionsBefore;

             }

             if ($sFirstRevisionSince!='') {

-                $sSqlCond_page_rev .= ' AND '.$sPageTable.'.page_id=rev.rev_page AND rev.rev_timestamp=( SELECT MIN(rev_aux_snc.rev_timestamp) FROM ' . $sRevisionTable . ' AS rev_aux_snc WHERE rev_aux_snc.rev_page=rev.rev_page AND rev_aux_snc.rev_timestamp >= '.$sFirstRevisionSince.')';

+                $sSqlCond_page_rev .= ' AND '.$sPageTable.'.page_id=rev.rev_page AND rev.rev_page = rev_aux_snc.aux_page AND rev.rev_timestamp = rev_aux_snc.min_timestamp ';

             }

             if ($sAllRevisionsSince!='') {

                 $sSqlCond_page_rev .= ' AND '.$sPageTable.'.page_id=rev.rev_page AND rev.rev_timestamp >= '.$sAllRevisionsSince;

The createdby condition could be changed in a similar way, but it doesn't lock up the database, even though it too runs a similar subquery. -- Nx / talk 08:09, 28 November 2009 (UTC)

Reply

Thank you very much for finding this out.

If you see other cases where the efficiency of SQL statements can be improved this will be highly welcome!

I will put it into the next release. Gero 13:21, 4 December 2009 (UTC)

Don't do that yet. It looks like this is terribly inefficient in some cases. -- Nx / talk 19:10, 17 December 2009 (UTC)
And it looks like our server can once again execute the above query, in fact my version is much much less efficient (around 3 seconds vs less than 1). I really have no idea what is going on. -- Nx / talk 19:13, 17 December 2009 (UTC)

AARGH!!

Found the true source of the problem: limit. Without the "limit 0, 500" at the end, the query works.

Here's the explain extended output of the query without limit:

+----+--------------------+-------------+------+------------------------+----------------+---------+-------------------------------+-------+----------------------------------------------+
| id | select_type        | table       | type | possible_keys          | key            | key_len | ref                           | rows  | Extra                                        |
+----+--------------------+-------------+------+------------------------+----------------+---------+-------------------------------+-------+----------------------------------------------+
|  1 | PRIMARY            | rw_page     | ALL  | PRIMARY                | NULL           | NULL    | NULL                          | 38806 | Using where; Using temporary; Using filesort | 
|  1 | PRIMARY            | rev         | ref  | PRIMARY,page_timestamp | PRIMARY        | 4       | rationa1_wiki.rw_page.page_id |    13 | Using where                                  | 
|  3 | DEPENDENT SUBQUERY | rw_revision | ref  | PRIMARY,page_timestamp | PRIMARY        | 4       | rationa1_wiki.rw_page.page_id |    13 | Using where; Using filesort                  | 
|  2 | DEPENDENT SUBQUERY | rev_aux     | ref  | PRIMARY,page_timestamp | page_timestamp | 4       | rationa1_wiki.rev.rev_page    |    13 | Using index                                  | 
+----+--------------------+-------------+------+------------------------+----------------+---------+-------------------------------+-------+----------------------------------------------+

And here it is with limit

+----+--------------------+-------------+--------+------------------------+----------------+---------+-------------------------------+--------+----------------------------------------------+
| id | select_type        | table       | type   | possible_keys          | key            | key_len | ref                           | rows   | Extra                                        |
+----+--------------------+-------------+--------+------------------------+----------------+---------+-------------------------------+--------+----------------------------------------------+
|  1 | PRIMARY            | rev         | ALL    | PRIMARY,page_timestamp | NULL           | NULL    | NULL                          | 533968 | Using where; Using temporary; Using filesort | 
|  1 | PRIMARY            | rw_page     | eq_ref | PRIMARY                | PRIMARY        | 4       | rationa1_wiki.rev.rev_page    |      1 | Using where                                  | 
|  3 | DEPENDENT SUBQUERY | rw_revision | ref    | PRIMARY,page_timestamp | PRIMARY        | 4       | rationa1_wiki.rw_page.page_id |     13 | Using where; Using filesort                  | 
|  2 | DEPENDENT SUBQUERY | rev_aux     | ref    | PRIMARY,page_timestamp | page_timestamp | 4       | rationa1_wiki.rev.rev_page    |     13 | Using index                                  | 
+----+--------------------+-------------+--------+------------------------+----------------+---------+-------------------------------+--------+----------------------------------------------+

Note the reversed order in the second query. I set allowUnlimitedResults to true to avoid the limit (at least when it is not specified manually), and now it works. This of course isn't a good solution, but at least it's something. -- Nx / talk 11:48, 29 March 2010 (UTC)

Very interesting. Congratulations that you found out how something that was intended to LIMIT the use of CPO power in fact INCREASES the load in some situations ... 08:05, 30 March 2010 (UTC)