The online wiki Rock in China has been online for a number of years and focusses on Chinese underground music. Before we started implementing Semantic MediaWiki, our wiki had roughly 4000 content pages covering over 1800 artists and 900 records. We used a number of templates for bands, CDs, venues and labels, but apart from using numerous categories and the DynamicPageList extension for a few joins, we were not able to make tangible use of the available data.
DPL example for a join between two MediaWiki categories:
<DynamicPageList>
category = Metal Artists
category = Beijing Artists
mode = ricstyle
order = ascending
</DynamicPageList>
{{SemanticVenue |Image= |ImageDescription= |City= |Address= |Phone= |Opened= |Closed= |GeoLocation= }}
As can be seen from the above template, both predicates (City) and properties (Opened) are proposed for the semantic class VENUE. Semantic MediaWiki implements this decisive difference in a very user-friendly way by setting the TYPE of each SMW property either to PAGE or to something else. As good as this is, it can cause confusion when discussing the semantic concept in principle with someone else.
A major problem was the implementation of external ontologies, which was not sufficiently documented on the Semantic MediaWiki page, most probably due to changes between versions. The cross-referencing to the URI in particular was a major problem. According to the Semantic MediaWiki documentation, aliases would be allowed; however, trial and error revealed that only a property with a domain prefix, e.g. foaf:phone or owl:sameas, would be correctly recognized. We used the Special:RDFExport function to find most of these errors: every time our URI referencing was wrong, we would get a parser function error.
First, the wrong way for the following two wiki pages:
Mediawiki:smw_import_mo:
http://purl.org/ontology/mo/ |[http://musicontology.com/ Music Ontology Specification]
 activity_end|Type:Date
 activity_start|Type:Date
 MusicArtist|Category
 genre|Type:Page
 Genre|Category
 track|Type:String
 media_type|Type:String
 publisher|Type:Page
 origin|Type:Page
 lyrics|Type:Text
 free_download|Type:URL
Property:genre:
[[Has type::Page]][[Imported from::mo:genre]]
And now the correct way, as it actually has to be implemented in order to work:
Mediawiki:smw_import_mo:
http://purl.org/ontology/mo/|[http://musicontology.com/ Music Ontology Specification]
 activity_end|Type:Date
 activity_start|Type:Date
 MusicArtist|Category
 genre|Type:Page
 Genre|Category
 track|Type:String
 media_type|Type:String
 publisher|Type:Page
 origin|Type:Page
 lyrics|Type:Text
 free_download|Type:URL
[[Has type::Page]][[Imported from::mo:genre]]
The ontology that caused the most problems was DBpedia, whose documentation did not tell us what the correct URI was. Luckily, the mailing list provided support and we learned the correct URI:
http://www.dbpedia.org/ontology/
With that in place, we were able to implement a number of semantic properties for a number of classes and start updating our wiki pages to get the data into our semantic database.
To utilize semantic properties within a wiki, a number of extensions are available, such as Semantic Forms, Semantic Result Formats and Semantic Maps. The benefits we gained were tremendous. For example, the join query that we ran with DPL at the beginning of this blog post could now be expressed as the following ASK query:
{{#ask: [[Category:Artists]] [[mo:origin::Beijing]]
|format=list
}}
The major benefit, however, is that the <references/> tag is NOT broken when an inline query is placed within a page. Dynamic Page List breaks <references/>, so that a lot of information is lost. Another example of how we benefited from semantics: previously we were only able to use categories and read information by joining one or two of them, e.g. artist pages categorized both as BEIJING artists and as METAL artists. Now, with semantic properties, we had a lot more data to play around with and could create mashup pages such as ROCK or Category:Records, on which we were able to show random videos from any ROCK artist or include a TIMELINE view of released records.
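For readers who want to run the same kind of query from outside a wiki page, the following is a minimal Python sketch that only builds the request URL for Semantic MediaWiki's `ask` API module. The helper name `build_ask_url` and the `api.php` location are assumptions made for illustration; the actual endpoint depends on how the wiki is installed, and the sketch does not send any request.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint: the real api.php path depends on the wiki setup.
EXAMPLE_API_URL = "http://www.rockinchina.com/wiki/api.php"


def build_ask_url(api_url, conditions, printouts=()):
    """Build a URL for the SMW 'ask' API module.

    conditions: query conditions such as "[[Category:Artists]]"
    printouts:  optional property names to print, e.g. "mo:genre"
    """
    query = "".join(conditions)
    for prop in printouts:
        query += "|?" + prop
    params = {"action": "ask", "query": query, "format": "json"}
    return api_url + "?" + urlencode(params)


if __name__ == "__main__":
    url = build_ask_url(
        EXAMPLE_API_URL,
        ["[[Category:Artists]]", "[[mo:origin::Beijing]]"],
    )
    print(url)
    # Decode the query string again as a sanity check on the encoding.
    print(parse_qs(urlsplit(url).query)["query"][0])
```

The conditions are passed as a list so that further properties can be added without string surgery; the result format of the API is not covered here.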
Initially we used the property
Owl:sameas (or sameas)
on various of our pages to link directly to Wikipedia pages.
However, we learnt that the property
foaf:primaryTopic
is a much better and more accurate property for this. The sameas property should be used for semantic RDF pages, i.e. the respective DBpedia RESOURCE page (not the PAGE page). Luckily, we had already implemented the sameas property mostly in templates, so it was easy enough to exchange the properties.
Having figured out this issue, we checked the Freebase page as well as other pages, such as DBpedia or MusicBrainz, but there seems to be no “submit RDF” form. Hence we decided that the best way of getting recognized in the Semantic Web is to include more links to other RDF resources, e.g. for our Category:Artists we set sameas links to DBpedia and Music Ontology. For DBpedia we linked to the class and for Music Ontology to the URI for the class.
As a side note: when checking on sameas.org, it seems that Music Ontology is NOT cross-linked to DBpedia so far.
Following the recommendations set forth at Sindice, we changed our robots.txt to include our semantic sitemap(s):
Sitemap: http://www.music-china.org/wiki/index.php?title=Special:RecentChanges&feed=atom
Sitemap: http://www.rockinchina.com/wiki/index.php?title=Special:RecentChanges&feed=atom
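A quick way to check that such Sitemap lines are picked up by a standard robots.txt parser is sketched below in Python, using only the standard library. The helper name `list_sitemaps` is hypothetical, and the robots.txt body is pasted in as a string rather than fetched from the live site.

```python
from typing import List
from urllib.robotparser import RobotFileParser

# The two Sitemap lines from our robots.txt, one directive per line.
ROBOTS_TXT = """\
Sitemap: http://www.music-china.org/wiki/index.php?title=Special:RecentChanges&feed=atom
Sitemap: http://www.rockinchina.com/wiki/index.php?title=Special:RecentChanges&feed=atom
"""


def list_sitemaps(robots_txt: str) -> List[str]:
    """Return the Sitemap URLs declared in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() (Python 3.8+) returns None when no Sitemap line exists.
    return parser.site_maps() or []


if __name__ == "__main__":
    for sitemap_url in list_sitemaps(ROBOTS_TXT):
        print(sitemap_url)
```

If the two directives end up on a single line, a parser will typically read them as one long URL, which is why each Sitemap entry should sit on its own line.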
As the next step, we analyzed how we could include external data in our SMW, e.g. from MusicBrainz or from YouTube. As we are a music-oriented site, YouTube was of particular interest to us. We found the SMW extension External Data, which we could use to connect to the Google API:
{{#get_web_data:
url=https://www.googleapis.com/youtube/v3/search?part=snippet&q=carsick+cars&topicId=%2Fm%2F03cmgbv&type=video&key=Googlev3API&maxResults=50
|format=JSON
|data=videoId=videoId,title=title
}}
And
{{#for_external_table:
{{Youtube|ID={{{videoId}}}|title={{{title}}} }}<br/>
{{{videoId}}} and {{{title}}}<br/>
}}
See our internal TESTPAGE for the live example.
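To make it clearer what the two parser functions are doing, here is a rough Python sketch of the same two steps outside the wiki: building the search URL and pulling `videoId` and `title` out of the JSON. The function names are hypothetical, `Googlev3API` stands in for a real API key, and the sample response is made-up illustrative data that only mirrors the `items[].id.videoId` and `items[].snippet.title` layout of a YouTube Data API v3 search result.

```python
import json
from urllib.parse import urlencode

API_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def build_search_url(query, topic_id, api_key, max_results=50):
    """Assemble the same search URL that is passed to #get_web_data."""
    params = {
        "part": "snippet",
        "q": query,
        "topicId": topic_id,   # Freebase ID of the artist
        "type": "video",
        "key": api_key,        # placeholder, supply your own key
        "maxResults": max_results,
    }
    return API_ENDPOINT + "?" + urlencode(params)


def extract_videos(response_text):
    """Return (videoId, title) pairs from a search response body."""
    payload = json.loads(response_text)
    videos = []
    for item in payload.get("items", []):
        video_id = item.get("id", {}).get("videoId")
        title = item.get("snippet", {}).get("title")
        if video_id and title:
            videos.append((video_id, title))
    return videos


# Illustrative sample only, not real API output.
SAMPLE_RESPONSE = json.dumps({
    "items": [
        {"id": {"kind": "youtube#video", "videoId": "EXAMPLE_ID_1"},
         "snippet": {"title": "Example title 1"}},
        {"id": {"kind": "youtube#video", "videoId": "EXAMPLE_ID_2"},
         "snippet": {"title": "Example title 2"}},
    ]
})

if __name__ == "__main__":
    print(build_search_url("carsick cars", "/m/03cmgbv", "Googlev3API"))
    for video_id, title in extract_videos(SAMPLE_RESPONSE):
        print(video_id, "and", title)
```

The loop at the end plays the role of #for_external_table: one output row per video found.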
YouTube uses its in-house Freebase ID system to generate auto-channels filled with official music videos of bands and singers. The Freebase ID can be found on the individual Freebase RESOURCE page after pressing the EDIT button. Alternatively, one could use the Google API to retrieve the ID, but that would first require a YouTube-internal HC ID. The implementation for our wiki was easy: include the Freebase ID as a semantic property on artist pages within our Definitions template:
{{Definitions |wikipedia= |dbpedia= |freebase= |freebaseID= |musicbrainz= |youtubeautochannel= }}
Voilà: with the additional SQL-based caching of request queries (e.g. JSON), our API load on Google is extremely low, and pages on our wiki load faster as well. Using this method, we were able to increase our saved YOUTUBE ID tags from the original 500 to well over 1000 within half a day.
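The idea behind SQL-based caching of request results can be sketched in a few lines of Python. This is only an illustration of the principle, not how the External Data extension implements its cache: the class name `JsonCache`, the table layout and the one-day default lifetime are all assumptions, and the `fetch` argument stands for whatever function performs the real HTTP request.

```python
import sqlite3
import time


class JsonCache:
    """Tiny SQL-backed cache that stores response bodies keyed by URL."""

    def __init__(self, path=":memory:", max_age_seconds=86400):
        self.max_age = max_age_seconds
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            "url TEXT PRIMARY KEY, body TEXT NOT NULL, fetched_at REAL NOT NULL)"
        )

    def get(self, url, fetch, now=None):
        """Return the cached body for url, calling fetch(url) only if stale."""
        now = time.time() if now is None else now
        row = self.db.execute(
            "SELECT body, fetched_at FROM cache WHERE url = ?", (url,)
        ).fetchone()
        if row is not None and now - row[1] < self.max_age:
            return row[0]  # fresh entry: no remote API call needed
        body = fetch(url)
        self.db.execute(
            "INSERT OR REPLACE INTO cache (url, body, fetched_at) VALUES (?, ?, ?)",
            (url, body, now),
        )
        self.db.commit()
        return body


if __name__ == "__main__":
    def fake_fetch(url):
        # Stand-in for a real HTTP request.
        print("fetching", url)
        return '{"items": []}'

    cache = JsonCache(max_age_seconds=60)
    cache.get("https://example.org/api", fake_fetch)  # triggers a fetch
    cache.get("https://example.org/api", fake_fetch)  # served from cache
```

Every page view within the cache lifetime is answered from the local database, which is where both the low API load and the faster page loads come from.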