DPL:Manual - DPL parameters: Criteria for page selection

From FollowTheScore
Revision as of 14:23, 8 September 2007 by Eep² (talk | contribs) (distinct: spelling, links, monospace)


  • You can select articles based on
    • the category/categories they are assigned to
    • the number of categories they are assigned to
    • their namespace
    • their usage of templates
    • their title
    • their references to other articles
    • their character (#redirect or normal article)
    • their revision date
  • You can restrict the number of articles to a certain limit
    • via configuration settings within the DPL2 source
    • via a specific parameter for a given invocation of DPL2
  • You can select a subset from the result list at random

category

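category Select articles based on categories. You can specify more than one category, separated by the pipe character '|' (logical OR). To combine categories with logical AND, use several category lines.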

Syntax:

category=1st category name|2nd category name|3rd category name|...


Example 1:

<DPL>
  category=Africa|Europe
  category=Politics and conflicts
</DPL>

This list will output pages that have [[Category:Africa]] OR [[Category:Europe]], AND [[Category:Politics and conflicts]] listed.
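The AND/OR combination described above can be modelled outside of DPL. The following Python sketch is only an illustration and not DPL's implementation; the function name `matches_category_lines` and the sample data are made up. Each inner list stands for one category line of the query.

```python
def matches_category_lines(page_categories, category_lines):
    """Return True if the page satisfies every category line.

    Each line is a list of alternatives (logical OR);
    all lines must be satisfied (logical AND).
    """
    page_categories = set(page_categories)
    return all(
        any(cat in page_categories for cat in alternatives)
        for alternatives in category_lines
    )


# Models: category=Africa|Europe  plus  category=Politics and conflicts
query = [["Africa", "Europe"], ["Politics and conflicts"]]

print(matches_category_lines({"Europe", "Politics and conflicts"}, query))
print(matches_category_lines({"Africa", "Sports"}, query))
```

The first page is selected because it satisfies both lines; the second satisfies only the first line and is rejected.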

You can specify the set of Uncategorized pages as a normal category, with an empty string (e.g. 'category=' for uncategorized pages only, 'category=|Animals' or 'category=Animals|' for the Uncategorized or the Animals category, 'category=Mammals||Insects' for the Mammals category, uncategorized pages or the Insects category, etc.). See Source and Installation for the extra required installation steps.

If ordermethod=category,... and headingmode are enabled, you can restrict the categories you want as headings in the result by preceding the list of categories (specified with the category parameter) with a '+' or '-'. See the example below.

  • A '+' means that only the categories listed in that statement are allowed to appear as headings in the output.
  • A '-' means that the categories listed in that statement are NOT allowed to appear as headings in the output (but all others are).

If you put a "*" before the name of a category, DPL will add all DIRECT subcategories of that category to your statement. This provides some minimal support for hierarchies of categories. The syntax and/or semantics of this feature might be changed in a future version.

Example 2:

<DPL>
  category=+Africa|Europe
  category=*Politics and conflicts
  ordermethod=category,sortkey
  headingmode=ordered
</DPL>

This list will output pages that have [[Category:Africa]] OR [[Category:Europe]], AND ( [[Category:Politics and conflicts]] or a direct subcategory of the latter ) listed. The list will be ordered (OL tag) and organized in 2 main items/headings: the Africa one and the Europe one (Politics and conflicts or its resp. subcategories will not be used as a heading). Under each item/heading you will see a sublist of pages ordered by their sortkey for the category used as heading.

Notes:

If you want to use magic words like {{CURRENTMONTHNAME}}, {{CURRENTDAY}}, {{CURRENTYEAR}} etc in the category name, you must use the parser function syntax variant.

To prevent a DPL query from returning huge output (or consuming too many resources) there are some configuration variables in the source code of the extension module like $wgDPL2MaxCategoryCount, $wgDPL2AllowUnlimitedCategories, $wgDPL2MinCategoryCount.

categorymatch

categorymatch Select articles based on categories. You can specify one or more patterns (SQL LIKE); a page will be selected if at least one of its categories matches at least one of the patterns.

Syntax:

categorymatch=1st category pattern|..

A "%" is used to denote "any number of any characters".

Example 1:

<DPL>
  categorymatch=Africa%|Europe%
</DPL>

This list will output pages that belong to categories like Africa, Africans, Europe, Europeans etc.
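To get a feeling for how such patterns behave, here is a small Python sketch that translates a SQL LIKE pattern into a regular expression. It is an approximation under stated assumptions, not the code DPL runs: the helper names are hypothetical, escape characters are ignored, and the comparison is assumed to be case-sensitive (in a real database this depends on the collation). Besides "%", standard SQL LIKE also treats "_" as "exactly one character".

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into a compiled regular expression.

    '%' matches any run of characters, '_' matches exactly one character,
    everything else is taken literally.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def matches_any(category, patterns):
    """True if the category name matches at least one LIKE pattern."""
    return any(like_to_regex(p).fullmatch(category) for p in patterns)


patterns = "Africa%|Europe%".split("|")
for name in ["Africa", "Africans", "Europeans", "South Africa"]:
    print(name, matches_any(name, patterns))
```

"South Africa" is rejected because the pattern Africa% has no leading "%"; a pattern like %Africa% would be needed to match it.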

categoryregexp

categoryregexp Select pages with a category matching a regular expression.

The complete text behind "categoryregexp" will be taken as ONE argument and used in a SQL REGEXP clause, i.e. "|" characters can be used as a normal part of the regexp.

notcategory

notcategory Much like the category parameter, but requires that every page listed not be in a particular category. Unlike in 'category' you cannot combine several categories using logical OR in this parameter.

Syntax:

notcategory=category name

Example:

<DPL>
  category=Africa
  notcategory=Zimbabwe
  notcategory=Kenya
</DPL>

This list will output pages that have [[Category:Africa]] but do not have either [[Category:Zimbabwe]] or [[Category:Kenya]] listed.

Notes:

If you use the parser function syntax you will be able to use Magic words like {{CURRENTMONTHNAME}}, {{CURRENTDAY}}, {{CURRENTYEAR}} etc in the category name.

Related DPL extension variables: $wgDPL2MaxCategoryCount, $wgDPL2AllowUnlimitedCategories, $wgDPL2MinCategoryCount.

notcategorymatch

notcategorymatch Works like notcategory but is based on SQL LIKE.

notcategoryregexp

notcategoryregexp Works like notcategory but is based on SQL REGEXP.

categoriesminmax

categoriesminmax Restricts the search to articles which are assigned to at least [min] and at most [max] categories.

Syntax:

categoriesminmax=[min],[max]

Example:

<DPL>
  category=Africa
  categoriesminmax=3
</DPL>

The list will only contain articles which belong to category "Africa" and to at least two other categories.

<DPL>
  category=Africa
  categoriesminmax=,1
</DPL>

The list will only contain articles which belong to category "Africa" and are not assigned to any other category.
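The two examples suggest that either part of [min],[max] may be left out. The Python sketch below models that reading; it is an assumption drawn from the examples, not the extension's parsing code, and the function names are hypothetical.

```python
def parse_minmax(value):
    """Parse '[min],[max]' into (min, max); a missing part becomes None."""
    parts = value.split(",")
    lo = int(parts[0]) if parts[0].strip() else None
    hi = int(parts[1]) if len(parts) > 1 and parts[1].strip() else None
    return lo, hi


def within_minmax(category_count, value):
    """True if a page with `category_count` categories passes the filter."""
    lo, hi = parse_minmax(value)
    if lo is not None and category_count < lo:
        return False
    if hi is not None and category_count > hi:
        return False
    return True


print(parse_minmax("3"))    # only a minimum is given
print(parse_minmax(",1"))   # only a maximum is given
print(within_minmax(3, "3"), within_minmax(2, ",1"))
```

A page in "Africa" plus two other categories has three categories in total and passes categoriesminmax=3; a page with two categories fails categoriesminmax=,1.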

namespace

namespace Restricts the list to articles which are in one of the given namespaces.

Syntax:

namespace=1st namespace name|2nd namespace name|3rd namespace name|...

The namespace name may be any one, assuming it represents a valid namespace in the system, including custom ones, BUT no pseudo-namespace such as Media, Special which have negative namespace ids. The empty string is the main article namespace (e.g. 'namespace=' for pages in Main ns only, 'namespace=|Help' or 'namespace=Talk|' for Main or Talk ns, 'namespace=User||Category' for User, Main or Category, etc.).
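The role of the empty string can be illustrated with a short Python sketch. It only models how the argument splits on "|"; the label "(Main)" and the trimming of surrounding spaces are assumptions made for this illustration, not DPL behaviour taken from the source.

```python
def parse_namespace_argument(value):
    """Split a namespace argument on '|'; an empty entry denotes Main."""
    return [name.strip() or "(Main)" for name in value.split("|")]


for argument in ["", "|Help", "Talk|", "User||Category"]:
    print(repr(argument), "->", parse_namespace_argument(argument))
```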

Namespace names are case-sensitive: namespace=User_talk will work, but namespace=User_Talk will not.

Namespace ids are no longer allowed as namespace arguments, because they caused a conflict between a namespace with a number as its name and the namespace with that same number as its id. E.g. if you created a custom namespace with '1' as title (yes, it's possible), namespace=1 would give you pages in '1' and not pages in the namespace with id=1 (Talk). In that case, you could not get pages in the Talk ns anymore. The solution was to accept only ids or only names, and names are more user-friendly.

Example 1:

<DPL>
  category=Policy
  namespace=Wikinews|Discussion
</DPL>

This list will output pages that are in the Wikinews or Discussion namespace and belong to [[Category:Policy]].

Example 2 (with magic word):

  {{#dpl: category = Policy | namespace= {{NAMESPACE}} }}

This list will output pages that are in the namespace the current page is in - whatever it is - and belong to [[Category:Policy]].

notnamespace

notnamespace Much like the notcategory parameter, but for namespaces. Requires that every page listed not be in one of given namespaces.

Syntax:

notnamespace=namespace name

Example 1:

<DPL>
  notnamespace=Wikinews
  notnamespace=Discussion
</DPL>

This list will output pages that are NEITHER in the Wikinews NOR in the Discussion namespace.

Example 2 (with magic word):

  {{#dpl: notnamespace = Wikinews | notnamespace = {{NAMESPACE}} }}

This list will output pages that are NEITHER in the Wikinews NOR in the namespace the current page is in.

linksfrom

linksfrom Selects articles which are referenced from at least one of the specified pages.

Syntax:

linksfrom=full page name|..

The page mentioned in the DPL query can be retrieved via %PAGESEL%.

Example 1:

<DPL>
  category = Poets
  linksfrom  = Dublin|Cork
</DPL>

This list will output pages that are mentioned (with a hyperlink) in article Dublin or Cork in the Main namespace and which belong to category "Poets".

Example 2 (with magic word):

  {{#dpl: category = Poets | linksfrom = {{FULLPAGENAME}} }}

This list will output pages that are in category "Poets" and which are referenced by the current page, whatever it is. Note that normally 'linksfrom' will only show existing pages. With 'openreferences=yes' this can be changed.

openreferences

openreferences Extends 'linksfrom' to unresolved references, i.e. links to pages which do not exist.

Syntax:

openreferences=yes

Example 1:

<DPL>
  linksfrom  = Dublin|Cork
  openreferences=yes
</DPL>

This list will output pages that are mentioned (with a hyperlink) in article Dublin or Cork in the Main namespace, regardless of whether these pages exist or not.

Note that the vast majority of DPL parameters depend on the existence of a page. If you set openreferences to 'yes', none of those parameters can be used. Examples of conflicting parameters are all parameters which relate to categories, revisions, authors or redirections, and some other parameters.

notlinksfrom

notlinksfrom Selects articles which are NOT referenced from any of the specified pages.

Syntax:

notlinksfrom=full page name|..

linksto

linksto Selects articles which link to at least one of the specified pages.

Syntax:

linksto=full page name|..

The page mentioned in the DPL query can be retrieved via %PAGESEL%.

Example 1

Parser extension
<DPL>
  category = Poets
  linksto  = Dublin|Cork
</DPL>
Parser function
{{#dpl: category = Poets | linksto = Dublin{{!}}Cork}}

This list will output pages that are in category "Poets" and link to a page with title Dublin or Cork in the Main namespace (by default). Note the use of {{!}}, a template call that yields "|", to pass multiple values in one parameter when using DPL as a parser function (otherwise DPL would interpret "|Cork" as another parameter and give an error).
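If you generate parser function calls with a script, the escaping rule can be applied mechanically. The following Python helper is a hypothetical sketch (the name `dpl_parser_call` and the exact spacing are choices made here); it joins multiple values of one parameter with {{!}} and separates parameters with "|".

```python
def dpl_parser_call(params):
    """Build a {{#dpl: ...}} call from a dict of parameter -> list of values.

    Values of one parameter are joined with {{!}} so that the pipe
    between them is not mistaken for a parameter separator.
    """
    parts = []
    for name, values in params.items():
        joined = "{{!}}".join(values)
        parts.append(f"{name} = {joined}")
    return "{{#dpl: " + " | ".join(parts) + " }}"


print(dpl_parser_call({"category": ["Poets"], "linksto": ["Dublin", "Cork"]}))
```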

Example 2 (with magic word)

  {{#dpl: category = Poets | linksto = {{FULLPAGENAME}} }}

This list will output pages that are in category "Poets" and link to the current page, whatever it is.

notlinksto

notlinksto Selects articles which do NOT link to any of the specified pages.

Syntax:

notlinksto=full page name|..

Example:

<DPL>
  category   = Poets
  notlinksto = London|Paris
</DPL>

This list will output pages that are in category "Poets" and do not have a link pointing to a page with title London or Paris in the Main namespace.

Note:

The implementation of this feature is not very efficient. Use with care and avoid huge result sets.


uses

uses Selects articles which use at least one of the specified templates ( wiki syntax: {{...}} ).

Syntax:

uses=Template:name|Template:..

The Template namespace must be specified. You can also specify another namespace if you like.

Example 1:

<DPL>
  uses = Template:Poet|Template:Painter
</DPL>

This list will output pages that use a template called Poet and/or another template called "Painter".

notuses

notuses Selects articles which do not use any of the specified templates.

Syntax:

notuses=Template:name|Template:..

Example 1:

<DPL>
  category = Poet
  notuses = Template:Poet
</DPL>

This list will output pages about poets which do not use the corresponding template.

Caution:

The implementation of this feature is not very efficient. Use with care and avoid huge result sets.


createdby

createdby Selects articles which were created by the specified user.

Syntax:

createdby=username

Note (applies for all user related selection criteria):

  • You can combine user related selections. For example you could search for pages which were not created by user1 but modified by him, or you could search for pages which were created by user1 and last modified by user2. You can also show several or all versions of such articles by specifying one or more of the "revision" group of parameters like allrevisionsbefore.
  • Currently there is no mechanism to make a distinction between minor edits and normal modifications.


notcreatedby

notcreatedby Selects articles which were NOT created by the specified user.

Syntax:

notcreatedby=username

Note:

To avoid huge result sets this will typically be accompanied by other selection criteria.


modifiedby

modifiedby Selects articles which were created or at least once modified by the specified user.

Syntax:

modifiedby=username

Note:

modifiedby will always be a superset of createdby as the creation of a page is interpreted as its first modification.

notmodifiedby

notmodifiedby Selects articles which were NOT (created or) modified by the specified user.

Syntax:

notmodifiedby=username

Note:

To avoid huge result sets this will typically be accompanied by other selection criteria.


lastmodifiedby

lastmodifiedby Selects articles where the last modification was done by the specified user.

Syntax:

lastmodifiedby=username

notlastmodifiedby

notlastmodifiedby Selects articles where the last modification was NOT done by the specified user.

Syntax:

notlastmodifiedby=username

Note:

To avoid huge result sets this will typically be accompanied by other selection criteria.

title

title Select one single page by its (namespace and) title.

Syntax:

title=pagetitle

If you specify a "title", the "mode" will be automatically set to "userformat" which means that you will get no output by default. Specifying an exact "title" makes sense if you want to transclude contents from one specific other page, e.g. the whole text, a chapter, labeled sections or template calls.

Thus DPL may serve as a more flexible alternative to Labeled Section Transclusion.

Examples:

{{#dpl:title=My Page|include=#First Chapter}}
{{#dpl:title=My Page|include={My Template}.dpl|multisecseparator=\n----\n}}

The first example will include the contents of the chapter "First Chapter" of an article named "My Page" in the main namespace.

The second example will take all invocations of template "My Template" in article "My Page" and apply "Template:My Template.dpl" instead of "Template:My Template". The output will be separated by horizontal lines.

titlematch

titlematch Select pages with a title matching at least one of the specified patterns. The patterns are used as a LIKE argument in an SQL query. Namespaces are ignored as the namespace parameter can be used to further narrow the selection.

Syntax:

titlematch=pattern|..

Example:

<DPL>
  titlematch=%foo%|bar%
</DPL>

This will output all pages (regardless of namespace) which have a name that contains "foo" somewhere in the title or starts with "bar".

Example:

<DPL>
  namespace=
  titlematch=A%
</DPL>

This will output all pages in the main namespace which begin with "A".

The match is case-sensitive, even with respect to the first character.

titleregexp

titleregexp Select pages with a title matching the specified regular expression. The pattern will be used as a REGEXP argument in a SQL query. Namespaces are ignored as the namespace= parameter can be used to further narrow the selection.

Syntax:

titleregexp=regular expression

Example:

<DPL>
  titleregexp=[0-9]+.*y$
</DPL>

This will output all pages (regardless of namespace) which have a digit in their name and end with a "y".
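You can try out an expression before putting it into a query. The Python sketch below uses Python's `re` module as a stand-in, assuming it agrees with SQL REGEXP for this simple pattern; like REGEXP, `search` finds a match anywhere in the title unless the pattern is anchored. The sample titles are made up.

```python
import re

# Same expression as in the example: a run of digits, anything, then a final "y".
title_pattern = re.compile(r"[0-9]+.*y$")

for title in ["Route 66 highway", "1984 diary", "Highway 66", "Diary"]:
    print(title, bool(title_pattern.search(title)))
```

"Highway 66" fails because it does not end with "y", and "Diary" fails because it contains no digit.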

nottitlematch

nottitlematch Select pages with a title NOT matching any of the specified patterns. The patterns are used as a LIKE argument in a SQL query. Namespaces are ignored as the namespace= parameter can be used to further narrow the selection. Normally you would want to use this selection only in combination with other criteria. Otherwise output could become huge.

Syntax:

nottitlematch=pattern|..

Example:

<DPL>
  nottitlematch=%e%|%u%
</DPL>

This will output all pages (regardless of namespace) which contain neither an "e" nor a "u" in their title.


nottitleregexp

nottitleregexp Select pages with a title that does NOT match the specified regular expression. The expression will be used as a REGEXP argument in a SQL query. Namespaces are ignored as the namespace= parameter can be used to further narrow the selection. Normally you would want to use this selection only in combination with other criteria. Otherwise output could become huge.

Syntax:

nottitleregexp=regular expression

includesubpages

includesubpages Controls the inclusion or exclusion of pages which have a '/' in their name. Default is true.

Syntax:

includesubpages=false

As subpages are by default always included, only 'no' or 'false' makes sense as an argument for includesubpages.

redirects

redirects Controls the inclusion or exclusion of redirect pages in the output. By default redirections are NOT shown.

Syntax:

redirects=criteria

criteria can be one of:

  • exclude — excludes redirect pages from lists — (default)
  • include — allows redirect pages to appear in lists
  • only — lists only redirect pages in lists

Example:

<DPL>
  category  = Africa
  redirects = include
</DPL>

The result will consist of content pages and redirect pages tagged with [[Category:Africa]]. Note: this parameter does not show pages that link to the redirect (as Special:Whatlinkshere/DPL:Discussion does); only redirect pages themselves.

includematch

includematch Controls the selection of pages based on contents which shall be included from these pages.

Syntax:

includematch=regexp1,regexp2,..

The idea is that a page will only be selected (and its contents included) if the contents to be included matches a regular expression. In case of (heading based) chapter inclusion and labeled section inclusion the relevant contents of the page must match the pattern; in case of template based matching it is the complete wikitext of the calling code of your template which is tested against your regular expression. Be careful to design your regexp in a proper way so that it can match all syntactical variations, and note that we use Perl regular expressions. This means that you must delimit your regexp with two identical characters that are not part of the regexp itself, e.g. with '/'. Otherwise you will see strange error messages from the PHP interpreter.

If you are not familiar with regular expressions and/or do not know the specifics of Perl regexp used in PHP, you should definitely have a look into the PHP manual before using 'includematch'.

You may want to match named parameters or unnamed parameters. In the first case you should use something like

includematch=/\|\s*myParameter\s*=\s*myPattern/s 

to be on the safe side. Thus you can put spaces around the '=' and use linebreaks in your original article when calling the template - and still the pattern will match.

In case the template expects unnamed parameters you would specify something like

includematch=/\|\s*myPattern/s

If the parameter is not the last one in your template call you might use

includematch=/\|\s*myPattern\s*\|/s

See the includepage parameter.

Example:

<DPL>
  category  = Africa
  includepage  = #myChapter,{countryProfile}.dpl
  includematch = ,/Name\s*=\s*[Kk]amerun/s
</DPL>

This will match articles which contain a call to the template "countryProfile" and use the "Name" parameter of that template with an argument that contains "Kamerun" or "kamerun" as a text string. Note that there is no pattern specified for the first element of the includepage statement. "KAMERUN" would not match; we could use the "i" modifier with the regexp to match without case sensitivity if we wanted to.
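The effect of the pattern on a template call that is spread over several lines can be checked with a few lines of Python. This is a sketch using Python's `re` module, whose syntax matches Perl-style regular expressions for the constructs used here; the "s" modifier corresponds to `re.DOTALL` and the "i" modifier to `re.IGNORECASE`. The sample template calls are invented.

```python
import re

# Equivalent of /Name\s*=\s*[Kk]amerun/s (the delimiters are not part of the pattern).
pattern = re.compile(r"Name\s*=\s*[Kk]amerun", re.DOTALL)

# A template call with line breaks and spaces around the '='.
multi_line_call = """{{countryProfile
| Name
  = Kamerun
| Capital = Yaounde
}}"""
upper_case_call = "{{countryProfile|Name=KAMERUN}}"

print(bool(pattern.search(multi_line_call)))
print(bool(pattern.search(upper_case_call)))

# With the "i" modifier the upper-case spelling matches as well.
pattern_i = re.compile(r"Name\s*=\s*kamerun", re.DOTALL | re.IGNORECASE)
print(bool(pattern_i.search(upper_case_call)))
```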

includenotmatch

includenotmatch Controls the selection of pages based on contents which shall be included from these pages.

Syntax:

includenotmatch=regexp1,regexp2,..

The idea is that a page will only be selected (and its contents included) if the contents to be included does not match a given regular expression. In case of (heading based) chapter inclusion and labeled section inclusion the relevant contents of the page must not match the pattern; in case of template based matching it is the calling code of your template which must not match the regular expression. Be careful to design your regexp in a proper way so that it covers all syntactical variations. You should use something like

includenotmatch=/myParameter\s*=\s*myPattern/s

to be on the safe side. Thus you can put spaces around the '=' and use linebreaks in your original article when calling the template - and still the pattern will do its job.

See the includepage parameter.

Example:

<DPL>
  category  = Africa
  include  = #myChapter,{countryProfile}.dpl
  includenotmatch = ,/Name\s*=\s*[Kk]amerun/s
</DPL>

This will match articles which contain a call to the template "countryProfile" and use the "Name" parameter of that template with an argument that does not contain "Kamerun" or "kamerun" as a text string. Note that there is no pattern specified for the first element of the includepage statement. "KAMERUN" would not match the pattern, so an article using that spelling would still be selected; we could use the "i" modifier with the regexp to match without case sensitivity if we wanted to.

lastrevisionbefore

lastrevisionbefore Shows only articles which existed before the specified date. The date of the last revision before that date will be shown (and will be available as %REVISION% in mode=userformat).

Syntax:

lastrevisionbefore=dateandoptionaltime

dateandoptionaltime is a numeric string of up to 14 digits, like "200812041300" (4th of Dec, 2008, 13:00). The string may contain separation characters like "2008/12/04--13:00".

Note: if this parameter is used the variable %REVISION% will contain the revision of the selected page(s).
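The date format can be illustrated with a small Python sketch that strips the separation characters and keeps the digits. It is an assumption about how such a string could be normalized, not the extension's code; in particular, treating every non-digit character as a separator and the function name `normalize_timestamp` are choices made for this illustration.

```python
import re


def normalize_timestamp(text):
    """Remove separator characters and return the remaining digits."""
    digits = re.sub(r"\D", "", text)
    if not 1 <= len(digits) <= 14:
        raise ValueError(f"expected 1 to 14 digits, got {len(digits)}")
    return digits


print(normalize_timestamp("2008/12/04--13:00"))
print(normalize_timestamp("200812041300"))
```

Both calls yield the same digit string, which is the form shown in the first example of the format description.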

firstrevisionsince

firstrevisionsince The date of the first revision after the specified date will be shown (and will be available as %REVISION% in mode=userformat).

Syntax:

firstrevisionsince=dateandoptionaltime

dateandoptionaltime is a numeric string of up to 14 digits, like "200812041300" (4th of Dec, 2008, 13:00). The string may contain separation characters like "2008/12/04--13:00".

Note: if this parameter is used the variable %REVISION% will contain the revision of the selected page(s).

allrevisionsbefore

allrevisionsbefore shows all revisions which existed before the specified date. The date of each revision will be shown (and will be available as %REVISION% in mode=userformat).

Syntax:

allrevisionsbefore=dateandoptionaltime

dateandoptionaltime is a numeric string of up to 14 digits, like "200812041300" (4th of Dec, 2008, 13:00). The string may contain separation characters like "2008/12/04--13:00".

Note: if this parameter is used the variable %REVISION% will contain the revision of the selected page(s).

allrevisionssince

allrevisionssince shows all revisions which were created after the specified date. The date of each revision will be shown (and will be available as %REVISION% in mode=userformat). If there was no new revision of an existing article after the specified date that article will not appear in the output.

Syntax:

allrevisionssince=dateandoptionaltime

dateandoptionaltime is a numeric string of up to 14 digits, like "200812041300" (4th of Dec, 2008, 13:00). The string may contain separation characters like "2008/12/04--13:00".

Note: if this parameter is used the variable %REVISION% will contain the revision of the selected page(s).

count

count Controls the number of results that are shown.

Syntax:

count=n, with n a positive integer

Use a blank value (count=) for an unlimited number of results. By default the number of results is limited to 500, depending on the extension variables $wgDPL2MaxResultCount and $wgDPL2AllowUnlimitedResults.

Example:

<DPL>
  category=Africa
  ordermethod=pagetouched
  count=2
</DPL>

This list will output the two pages most recently changed that have [[Category:Africa]].

offset

offset Shows only a portion of a big result list; typically used in combination with "count=".

Syntax:

offset = n with n = number of result lines to skip, (integer), default = 0

Example:

<DPL>
  category=Africa
  offset = 10
  count  =  5
</DPL>

This will show articles #11 .. #15 of category Africa; order is determined by alphabet as we did not give any specific ordermethod.

Note:

  1. You could put a DPL query into a template and make count & offset parameters. Calling this template with different values will allow you to display different portions of the result list.
  2. Implementation of "offset=" is very primitive as we create a list of (offset+count) entries and skip the first (offset) entries when formatting the result. Performance should nevertheless be acceptable in normal cases.
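The approach described in note 2 can be sketched in a few lines of Python. This models the described behaviour only; the function name `page_of_results` and the sample titles are hypothetical.

```python
def page_of_results(sorted_titles, offset=0, count=None):
    """Take the first offset+count entries, then skip the first `offset`."""
    limit = None if count is None else offset + count
    fetched = sorted_titles[:limit]   # the list of (offset+count) entries
    return fetched[offset:]           # skip the first (offset) entries


titles = [f"Article {i:02d}" for i in range(1, 31)]
print(page_of_results(titles, offset=10, count=5))
```

With offset=10 and count=5 the call returns entries #11 to #15, as in the example above. The cost grows with offset because the skipped entries are still fetched.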

randomcount

randomcount Creates the complete result set and then selects a subset for display at random.

Syntax:

randomcount=n, with n a positive integer

If randomcount is larger than the number of results, the complete result set will be displayed.

Example:

<DPL>
  category=Africa
  ordermethod=size
  count=20
  randomcount=3
</DPL>

This list will output three random articles from the group of the 20 largest articles on Africa.
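The two-step behaviour (build the result set first, then pick at random) can be modelled as follows. This Python sketch is an illustration with hypothetical names; whether DPL keeps the original order within the random subset is not stated in this manual, and the sketch makes no attempt to preserve it.

```python
import random


def random_subset(results, randomcount, rng=random):
    """Pick `randomcount` distinct entries; return everything if too few."""
    if randomcount >= len(results):
        return list(results)
    return rng.sample(results, randomcount)


# Stand-in for the result set already limited by count=20.
pages = [f"Page {i}" for i in range(1, 21)]

print(random_subset(pages, 3))
print(len(random_subset(pages, 50)))
```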

distinct

distinct Allows or suppresses duplicates in the output.

Syntax:

distinct=true | false | strict

Normally distinct is set to true. This means that a page will occur only once in the output.

In connection with linksto and linksfrom, however, a page can occur more than once in the output. This happens if you specify more than one page for the linksfrom/linksto parameter and the same page contains links to more than one of them (linksto) or is referenced by more than one of them (linksfrom). If you want to see a page only once in these cases as well, use distinct=strict.

On the other hand, if you wish to see multiple result entries you should switch this to false. This may make sense in combination with linksto or linksfrom if you want to see how many links from one document to another document exist.