Difference between revisions of "Allowing true caching by DPL dev page"
Revision as of 17:14, 10 August 2008
Hello, my name is EmuWikiAdmin1. I recently realised that my website will need DPL to cache its output if it is to handle high traffic without overloading the server with DPL requests. I therefore started this page and began modifying the DPL code to add a caching system. Eventually this code could be included in the official DPL releases, or it can stay separate and be used only by those who really need it; that decision is Gero's.
Changed my mind
I just realised that MediaWiki already has everything we want for a cache system in its objectcache table. I'm now changing the project so that, instead of developing our own caching system, we use the general MediaWiki caching system (which we already use when we choose the option allowcachedresults=true) and only modify what is needed to keep this cache up to date and make the appropriate items expire. The rest of the page is about the old project; the part about dpldependencies will be kept, but we won't use a dplcache table.
- This sounds very interesting. Are you making progress? Is there something to discuss at the moment? Gero 17:05, 8 August 2008 (CEST)
- Nope, nothing to discuss. I have now moved the whole system to this new method; there are just a couple of update rules to adjust and I'll be ready for you to look at it... It's muuuuuuuch more lightweight. -- EmuWikiAdmin1, 10 August 2008
What's been done
- A table named dplcache is added to the MediaWiki database following DPL installation (if not already created). It contains a cacheID (mediumInt) and an output field (long text) which holds the DPL output.
- Another table named dpldependencies is added to the MediaWiki database. It contains 4 fields: cacheids, titledeps, categorydeps, and templatedeps. cacheids contains a ':'-separated list of cache IDs that need to be updated either when an article whose title matches titledeps is updated, or when an article in a category matching categorydeps is deleted, created, or edited.
- A new parameter is added to DPL: |CacheID=. The user can either specify a six-digit cache ID (e.g. 924924) or just put CacheID=true. When the user specifies a cache ID, DPL tries to find the cached content associated with this ID. The advantage is that, if you have multiple pages with the same DPL invocation, you can give them the same cacheID and avoid maintaining multiple caches that would contain the same thing. When the parameter is CacheID=true, DPL chooses a random available cache ID and replaces the CacheID=true string with CacheID=xxxxxx in the wikitext.
- The render function of the DPL parser function is modified: before rendering, it checks whether the CacheID parameter is used. If it is, the DPL content is not refreshed; the cached content from the database is output instead. If the CacheID parameter is present but the database has no cached content for this cacheID, DPL is refreshed, the result is stored in the cache, and the refreshed content is displayed to the user. So basically, the cache always looks for the content in the database and creates it if it is not there. Our expiration system is therefore very simple: just delete the field values associated with a cacheID in the database when we want it to expire. The next time a user visits a page with the DPL invocation, it will be refreshed.
- Hooks have been added to delete content from the cache when necessary. The hooks look for: category change, article deletion in categories, article creation or edit in categories, and title of article (right now this one stores only the title of the article in which the DPL invocation is present). templatedeps is not used right now but could contain a list of templates whose modification would affect the DPL invocation.
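As a rough illustration of the lookup-or-refresh flow and the dependency-based expiry described in the list above, here is a minimal Python sketch. It is not the extension's PHP code: the class DplCache, its method names, and the in-memory dictionaries standing in for the dplcache and dpldependencies tables are all hypothetical, and the six-digit range 100000 to 999999 is an assumption.

```python
import random


class DplCache:
    """Hypothetical in-memory stand-in for the dplcache and dpldependencies tables."""

    def __init__(self):
        self.output = {}         # cache ID -> rendered DPL output (dplcache)
        self.title_deps = {}     # article title -> set of dependent cache IDs
        self.category_deps = {}  # category name -> set of dependent cache IDs
        self.known_ids = set()   # every cache ID handed out or seen so far

    def allocate_id(self):
        """Pick a random unused six-digit ID (the CacheID=true case).

        Assumption: IDs range from 100000 to 999999.
        """
        while True:
            cache_id = random.randint(100000, 999999)
            if cache_id not in self.known_ids:
                self.known_ids.add(cache_id)
                return cache_id

    def render(self, cache_id, title, categories, run_query):
        """Return the cached output, or run the query and store its result."""
        self.known_ids.add(cache_id)
        if cache_id in self.output:
            return self.output[cache_id]  # hit: no DPL refresh
        result = run_query()              # miss: refresh the DPL output
        self.output[cache_id] = result
        # Record what this entry depends on so that hooks can expire it later.
        self.title_deps.setdefault(title, set()).add(cache_id)
        for category in categories:
            self.category_deps.setdefault(category, set()).add(cache_id)
        return result

    def purge_for_edit(self, title, categories):
        """Expire every entry that depends on the edited article.

        Returns the set of cache IDs whose entries were marked stale.
        """
        stale = set(self.title_deps.get(title, ()))
        for category in categories:
            stale |= self.category_deps.get(category, set())
        for cache_id in stale:
            self.output.pop(cache_id, None)  # next render() refreshes it
        return stale
```

In this sketch, expiring an entry only deletes the stored output; the next call to render() with the same cache ID recreates it, which mirrors the "if it's not there, create it" rule above.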
What's left
- A special page that would allow administrators to empty the entire DPL cache.
- An Article Move hook. Unfortunately the article-move hooks in MediaWiki are not passed the &$article object; we only get a title and have no access to the content. So unless MediaWiki decides to pass this parameter, we would be stuck keeping huge article title lists in the tables and scanning for these titles when an article is moved. It might also be possible to start from the title and detect the categories using different classes, but I don't know how right now.
- Unfortunately, the way I have implemented it for now, if people edit or save articles with Category:{{{VariableCategory}}}, category detection will work but category removal won't. We need to find a way to get the &$article object information at the InternalParseBeforeLinks hook level if we want to extract the last revision's content and compare it to the new revision's content. For the same reason, it will not work for deletion of a variable category either.
- Maybe a user-controllable button that would make the content of the cache expire.
- Detect templates that could affect the DPL invocation and add these templates names to templatedeps
- Add support for the logical & and | of category relations in the DPL invocation. Right now, if your DPL invocation calls Category = 1&2&3, the caching system says: this cacheID is dependent on category 1, on category 2, and on category 3. So if someone edits an article that is in category 1 only, the cache gets purged. That is not good: we only want to purge the cache when an article in category 1 AND 2 AND 3 is edited.
- I added $parser->disableCache(); at the beginning of the parser function because I had problems with some caching that was left in MediaWiki. I don't know if that breaks anything; tell me what you think. It will certainly slow things down for people who do not use the cache, so if you have an alternative, tell me!
- Clean the code.
- I tried to just copy and paste your version check for backward compatibility of the parser function. It didn't work: it breaks the output in 1.12. I don't know what I've done wrong; if you want to have a look at it, that would be nice.
- Extend the functionality to the <DPL> tags, right now it's limited to parser function.
- Create a cron job that would clean the cache?
- Find a way to keep the dpldependencies table clean. There is no problem with the dplcache table, because its content is deleted as soon as there is a change in the page where the DPL invocation is, or in the categories concerned. But the dpldependencies table as it is right now is an accumulator that will grow without bound.
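For the category-logic item in the list above, one possible way to decide whether an edited article should purge a cache entry is sketched below in Python (not the extension's PHP). The function names are hypothetical, and the sketch assumes that & binds tighter than |, so an expression is read as an OR of AND groups; the page itself does not say how mixed expressions should be interpreted.

```python
def parse_category_expr(expr):
    """Split an expression such as '1&2|3' into AND groups: [{'1', '2'}, {'3'}].

    Assumption: '&' binds tighter than '|'.
    """
    groups = []
    for alternative in expr.split("|"):
        names = {name.strip() for name in alternative.split("&") if name.strip()}
        if names:
            groups.append(names)
    return groups


def should_purge(expr, article_categories):
    """Return True only if the article satisfies at least one whole AND group."""
    present = set(article_categories)
    return any(group <= present for group in parse_category_expr(expr))


# An article in category 1 only does not affect a "1&2&3" invocation.
print(should_purge("1&2&3", ["1"]))
# An article in all three categories does.
print(should_purge("1&2&3", ["1", "2", "3"]))
```

With this check, an edit to an article in category 1 only would leave the cache for Category = 1&2&3 untouched, while an edit to an article in all three categories would purge it.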
Discussion
Add your input here.
Code
I'm now at a point where I need some review, because this is only my third PHP program ever and you will probably find the code very dirty. Go ahead and test it if you have the time, and don't hesitate to clean it up or give me comments about what to do.
This is the current state of the code. Improvements are welcome. If you want to see only the differences from the original DPL code, search for //EmuWikiAdmin. This is alpha-quality code. Do not use it on a production server:
cachecode.txt -- Updated July 28th 2008 - 11:30 AM (New York time)
Note
Great that you started this!
- Check if the MW12 hack ("parser must be coaxed ..") is really needed. I think you can just return the cached content.
  - Yep, really needed on 1.12. Actually the output seems OK, except that if your DPL invocation calls a parser function, the parser function refuses to output.
  - I was thinking about downward compatibility. We must make sure that your solution also runs with MW 1.9, 1.10, 1.11. Gero 00:22, 27 July 2008 (CEST)
  - OK, I'll just put in the same version check that was in the original DPL.
- Please use a new revision number; I recommend 1.8.0.
  - Done.
- I don't think the ArticleSave hook will do the job. Try ParserAfterTidy or something similar. You must get access to the parser, and you need access to data structures which tell you to which categories the article being edited belongs.
  - Great, I finally realised that ParserAfterTidy does not have category information. I used InternalParseBeforeLinks instead.
- You must also catch the "delete article" operation.
  - Done. Article save, article edit, and change between last revision and current revision have been implemented too.
Good luck!
Gero 18:44, 26 July 2008 (CEST)