Latest revision as of 14:57, 28 January 2008
DocTypes: A light-weight approach to semantics
MediaWiki uses text articles to represent knowledge. It offers ways to assign articles to categories and supports links between articles. Templates can be used to add a set of parameters to an article.
{{#wgraph: name=DocTypes_1| svg|thumb=50|
node p1 {label "Page 1" type page } node p2 {label "Page 2" type page } node p3 {label "Page 3" type page } node cA {label "Cat A" type cat} node cB {label "Cat B" type cat} node cC {label "Cat C" type cat} node Tx {label "Template X" type tpl } edge p1 cA { type is_a } edge p2 cB { type is_a } edge p3 cB { type is_a } edge p3 cC { type is_a } edge p3 Tx { type uses } edge p1 p2 { type link } nodetype page { color lightgray bordercolor darkgray } nodetype cat { shape ellipse color lightred bordercolor red } nodetype tpl { color lightgreen bordercolor green } edgetype is_a { color darkred } edgetype uses { color darkgreen } edgetype link { color black kind near}
}}
Sometimes this is not sufficient. To represent knowledge in a more structured way a typing concept is needed. Instead of Articles you want to have semantically meaningful entities like Person, Trip or Location. Instead of Links you want to have semantically meaningful relationships like takes part in or starts at. A conceptual scheme for our example might look like this:
{{#wgraph: name=DocTypes_2| svg|thumb=60|
node Trip { type Type shape hexagon } node Person { type Type shape lparallelogram } node Location { type Type shape trapezoid } nodetype Type { color lightyellow bordercolor darkyellow }
edge Person Trip { type reference label takes_part_in } edge Trip Location { type reference label starts_at } edgetype reference { color black }
orientation left_to_right
}}
While the above diagram acts on type level ('class level', 'relation type level'), the real pieces of knowledge are instances (objects) of the above types and they are related by instances (relations) of the above relation types.
So on the level of individual instances ('objects', 'relations') we would see the following:
{{#wgraph: name=DocTypes_3| svg|thumb=60|
node Trip_4711 { type Trip;Object } node Henry { type Person;Object } node Paris { type Location;Object } node Trip_8552 { type Trip;Object } node Maria { type Person;Object } node Rome { type Location;Object } nodetype Trip { shape hexagon } nodetype Person { shape lparallelogram } nodetype Location { shape trapezoid } nodetype Object { color lightmagenta bordercolor darkmagenta }
edge Henry Trip_4711 { type takes_part_in;reference } edge Maria Trip_8552 { type takes_part_in;reference } edge Trip_4711 Paris { type starts_at;reference } edge Trip_8552 Rome { type starts_at;reference } edgetype takes_part_in { color blue label takes_part_in} edgetype starts_at { color blue label starts_at} edgetype reference { color black }
orientation left_to_right
}}
How does it map to MediaWiki?
DocTypes is a simple and conservative approach to represent semantically meaningful objects and relations within the world of MediaWiki.
- Objects are defined by calling a MediaWiki template which is named after the Type. There is also a help page for the user which explains the semantics of the Type.
- Apart from that, there are some other Type-related templates which take care of XML export and reporting.
- Articles are seen as containers which store one or more objects (usually of the same class).
- As soon as an Article contains an object of some class the article will become part of a category which has the same name as the template used to define the object.
- Relations are basically links between pages, but they point directly to objects using the object ID as a link target.
That's all.
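The mapping rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DocTypes implementation (which consists of wiki templates): the parameter name `id`, the `Page#ObjectID` link form, and all class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WikiObject:
    """One object: an instance of a Type stored inside an article."""
    type_name: str    # e.g. "Trip"; doubles as template name and category name
    object_id: str    # e.g. "Trip_4711"; used as the link target
    page: str         # the article acting as container
    properties: dict = field(default_factory=dict)


def template_call(obj: WikiObject) -> str:
    """Render the object as a call to the template named after its Type.

    The 'id' parameter name is an assumption made for this sketch.
    """
    lines = ["{{" + obj.type_name, f"|id={obj.object_id}"]
    lines += [f"|{name}={value}" for name, value in obj.properties.items()]
    lines.append("}}")
    return "\n".join(lines)


def category_of(obj: WikiObject) -> str:
    """The containing article joins the category named like the template."""
    return f"Category:{obj.type_name}"


def reference_link(target: WikiObject) -> str:
    """A relation is a page link that points at the object ID.

    The 'Page#ObjectID' anchor form is an assumption made for this sketch.
    """
    return f"[[{target.page}#{target.object_id}]]"
```

For example, a Trip object whose `starts_at` property holds `reference_link(paris)` would be rendered as a `{{Trip ...}}` call containing a link such as `[[Locations#Paris]]`, and its article would be listed in `Category:Trip`.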
{{#wgraph: name=DocTypes_4| svg|thumb=60|
node dTrip { type Document label Document(s)\ncontaining\nobjects\nof_Type\nTrip } node dPerson { type Document label Document(s)\ncontaining\nobjects\nof_Type\nPerson } node dLocation { type Document label Document(s)\ncontaining\nobjects\nof_Type\nLocation }
node Trip_4711 { type Trip;Object } node Henry { type Person;Object } node Paris { type Location;Object } node Trip_8552 { type Trip;Object } node Maria { type Person;Object } node Rome { type Location;Object } node tTrip { type Template label Template\nTrip } node tPerson { type Template label Template\nPerson } node tLocation { type Template label Template\nLocation } node cTrip { type Category label Category\nTrip } node cPerson { type Category label Category\nPerson } node cLocation { type Category label Category\nLocation } nodetype Category { shape ellipse color lightred bordercolor red } nodetype Template { color lightgreen bordercolor green } nodetype Trip { shape hexagon } nodetype Person { shape lparallelogram } nodetype Location { shape trapezoid } nodetype Document { color lightgray bordercolor darkgray } nodetype Object { color lightmagenta bordercolor darkmagenta }
edge Henry Trip_4711 { type takes_part_in;reference } edge Maria Trip_8552 { type takes_part_in;reference } edge Trip_4711 Paris { type starts_at;reference } edge Trip_8552 Rome { type starts_at;reference }
edge dPerson dTrip { type takes_part_in;dummy} edge dTrip dLocation { type starts_at;dummy} edge tPerson tTrip { type takes_part_in;dummy} edge tTrip tLocation { type starts_at;dummy} edge cPerson cTrip { type takes_part_in;dummy} edge cTrip cLocation { type starts_at;dummy}
edgetype dummy { textcolor white color white } edgetype takes_part_in { color blue label takes_part_in} edgetype starts_at { color blue label starts_at} edgetype reference { color black }
orientation left_to_right
}}
How about OWL, RDF, Semantic Wiki etc.?
DocTypes is somewhat less abstract and less generic than these concepts. It does not introduce ontologies and annotations and there is no general abstract query language for traversing relations. DocTypes is based on the idea of semantic triples but it does not put them in the foreground.
Instead, DocTypes is straightforward and rather easy to use for the average MediaWiki user, as there is nothing new for him to learn. There is no additional syntax and no need to qualify relationships while writing documents. Instead, the author fills his text into the parameter list of a template. So he is essentially being guided by a 'form' but still has the full power of expressing himself with rich text and embedded media.
Note that we are not talking about a traditional screen form. This would be too rigid and would put too much burden onto the DocTypes designer. Rather, we mean creating a template, which essentially amounts to listing the attributes that will make up an object.
In general, you should not expect the full power of semantic modelling (OWL/RDF etc.) from DocTypes, but you may be astonished how much can be done. The biggest benefit of DocTypes is probably its simplicity.
Comparison between the traditional way and DocTypes
Today a wiki author basically uses rich text when writing. If he wants to add a set of standardized descriptive attributes to his text, he will create a template and use the attribute values as parameters. The template will insert these values into his text, typically as a nice little table.
The core idea of DocTypes is to reverse that principle. Using DocTypes you put your whole piece of knowledge into a template call. While some of your parameters may be quite simple (a word, a number, a sentence, a link) others may consist of several text paragraphs including headlines on various levels and images.
Of course this only makes sense if there is an appropriate structure which will be accepted by the authors because it is considered to be helpful for a certain knowledge domain. A typical wiki may have 70% of its articles in traditional form and 30% of its articles containing DocTypes.
The good thing is that it doesn't make a difference to the authors. But, of course, it makes a difference for the designer of the wiki.
The following table gives a summary:
Aspect | Standard Wiki | DocTypes |
---|---|---|
Paradigm 1 | a collection of stories | a collection of fact sheets |
Paradigm 2 | things are somehow connected to each other by 'free association' | objects have distinct typed references between each other |
When to use | Broad range of topics, weakly structured text, no common scheme applicable | High degree of structural similarity between certain instances of your knowledge domain. Commonly agreed 'reasonable' scheme on how to present information |
Output / appearance | heterogeneous, totally left to the user (apart from the sporadic use of templates which produce some standardized pieces of text) | homogeneous, standardized scheme how information is presented; there may be areas where "stories" are embedded, but they have their fixed place in the overall schema design. |
Navigation | The author puts hyperlinks where he feels it makes sense. The kind of relationship which goes along with a link can only be derived from carefully reading the text around the link. | The system expects references at some pre-defined positions and assigns a semantic meaning to them. The reader will always find such references in the same place and can specifically traverse them backwards. Even reports are possible. |
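Burden for the average article writer | | |
Burden for the wiki designer | EX POST approach: high effort because he must invent proper categories and assign them to existing articles; he must invent and apply templates after having recognized similarities in certain articles; systematic changes must be done manually | EX ANTE approach: |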
Import / Export | The contents can technically be exported as XML, but the content is opaque, i.e. it is nothing more than a sequence of characters in the XML scheme. | The text can be exported as semantically structured XML or as a CSV file with named columns. |
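To make the last row of the table concrete, the following Python sketch shows what "semantically structured XML" and "a CSV with named columns" could look like for two Trip objects. It only illustrates the shape of such output; the element names, column names and sample data are hypothetical, and the real export is done by the Type-related wiki templates rather than by code like this.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample objects of Type "Trip", one dict per object.
TRIPS = [
    {"id": "Trip_4711", "starts_at": "Paris", "participant": "Henry"},
    {"id": "Trip_8552", "starts_at": "Rome", "participant": "Maria"},
]


def to_xml(type_name, objects):
    """One element per object, one child element per named property."""
    root = ET.Element(type_name + "List")
    for obj in objects:
        node = ET.SubElement(root, type_name, id=obj["id"])
        for name, value in obj.items():
            if name != "id":
                ET.SubElement(node, name).text = value
    return ET.tostring(root, encoding="unicode")


def to_csv(objects):
    """One row per object; the header row carries the property names."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(objects[0]), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(objects)
    return buffer.getvalue()
```

In contrast, exporting a standard wiki article yields one text element whose content is the raw article source, so a consumer would have to parse the prose to find out where a trip starts.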
Glossary
Before showing an example and giving more details, we need short definitions of the DocTypes terminology:
- Page: A page (article) in your wiki which is designed in alignment with DocTypes principles. Pages contain one or more Objects of a certain Type.
- Type: A definition of common Properties for all Objects (Instances) belonging to that Type.
- Object: A piece of knowledge contained in a Page which has a certain Type.
- Property: An attribute of an Object; it can be a plain value, a complex value (consisting of Instances of other Types) or a Reference to another Object.
- Reference: A Property which points to another Object.
- ReferenceInfo: Some text which can go along with a Reference; it explains more about the kind of relationship.
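The glossary terms can be read as a small data model. The Python sketch below (Python 3.9 or later) is one possible reading, not part of DocTypes itself: all class and field names are hypothetical, and complex values consisting of Instances of other Types are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


@dataclass
class Reference:
    """A Property value which points to another Object."""
    target_id: str
    info: Optional[str] = None  # ReferenceInfo: text describing the relationship


@dataclass
class DocType:
    """A Type: the common Property names of all its Objects."""
    name: str
    property_names: list[str]


@dataclass
class DocObject:
    """An Object: a piece of knowledge of a certain Type."""
    object_id: str
    doc_type: DocType
    # Plain values are strings here; References point to other Objects.
    properties: dict[str, Union[str, Reference]] = field(default_factory=dict)


@dataclass
class Page:
    """A Page: a container for one or more Objects."""
    title: str
    objects: list[DocObject] = field(default_factory=list)


def referrers(pages, target_id):
    """Traverse References backwards: ids of Objects pointing at target_id."""
    return [
        obj.object_id
        for page in pages
        for obj in page.objects
        for value in obj.properties.values()
        if isinstance(value, Reference) and value.target_id == target_id
    ]
```

Because References are typed Property values rather than links buried in prose, a question such as "who takes part in Trip_4711?" becomes a simple lookup over all Pages.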
Technology
DocTypes is basically a series of clever templates which use standard MediaWiki features and some existing MediaWiki extensions like DPL and Variables. DocTypes is more a certain way of using existing MediaWiki technology than a new technology.
Continue reading with DocTypes Design or DocTypes Example.
Access DocTypes defined in this wiki: Category:DocType
Look at the template scripts which implement DocTypes: Category:DocTypeScript
Other Approaches
If you are interested in Semantic MediaWiki, you can play around with it in this wiki, too. See SMW Demo.