China – Data Science, Data Analytics and Machine Learning Consulting in Koblenz Germany
https://www.rene-pickhardt.de
Extract knowledge from your data and be ahead of your competition

Experiences on semantifying a Mediawiki for the biggest resource about Chinese rock music: rockinchina.com
https://www.rene-pickhardt.de/experiences-on-semantifying-a-mediawiki-for-the-biggest-recource-about-chinese-rock-music-rockinchina-com/
Mon, 07 Jan 2013 09:38:45 +0000

During my trip to China I visited Beijing on two weekends and Macau on another. These trips were mainly motivated by meeting old friends, especially the heads behind Rock in China, the biggest English resource on Chinese rock music: Max-Leonhard von Schaper and Yang Yu, the founder of the biggest Chinese rock print magazine. Their wiki is pure gold in terms of content but consists mainly of plain text, so after looking at it I introduced them to the idea of putting semantics inside the project. I consulted a little and pointed them to the right resources, but Max did basically the entire work (by taking a one-month holiday from his job. Boy, this is passion!).
I am very happy to announce that the data of Rock in China is published as linked open data and that the process of semantifying the website is in great shape. Below you can read about Max’s experiences doing the work. This is particularly interesting because Max has no scientific background in semantic technologies, so we can learn a lot about how to improve these technologies until they are ready to be used by everybody:

Max’s report on semantifying

Max-Leonhard von Schaper in Beijing.
To summarize, for a non-scientific greenhorn experimenting with semantic mediawiki and the semantic data principle in general, a good two months were required to bring our system to the point where it is today. As easy as it seems in the beginning, there is still a lot of manual coding and changing to be done as well as trial-and-error to understand how the new system is working.
Apart from the great learning experience and the availability of our data in RDF format, our own website expanded in the process by ~20% in content pages (from 4000 to above 5000), adding over 10000 real property triples and gaining an additional 300 thousand pageviews.
Lessons learnt, in condensed form:

  • DBpedia resources are to be linked using “resource” in the URI, not “page”
  • SMW requires a prefix such as “foaf:” or “mo:” for EACH imported property
  • Check Special:ExportRDF early to see if your properties work
  • Properties and predicates: SMW makes no difference between the two
  • How to get data into Freebase depends on backlinks and sameAs links to other ontologies, as well as on entering data in semantic search engines
  • Forms for user data entry are very important!
  • As a non-scientific person I would not have been able to implement this without feedback.
  • DBpedia and the Music Ontology are NOT interlinked via sameAs (as checked on sameas.org).
  • Factbox only works with the standard skin (monoskin). For other skins one has to include it in the PHP code oneself.
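
The first lesson in the list, linking DBpedia under “resource” rather than “page”, can be sketched in a few lines of Python. This is only an illustration and not part of the wiki setup; the function name dbpedia_resource_uri is made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def dbpedia_resource_uri(url: str) -> str:
    """Return the /resource/ form of a DBpedia URL.

    DBpedia shows a human-readable view under /page/<Name> and identifies
    the thing itself under /resource/<Name>. Only the latter should be
    used as the target of a semantic link.
    """
    parts = urlsplit(url)
    if parts.path.startswith("/page/"):
        # Swap the leading path segment, keep everything else untouched.
        new_path = "/resource/" + parts.path[len("/page/"):]
        parts = parts._replace(path=new_path)
    return urlunsplit(parts)


print(dbpedia_resource_uri("http://dbpedia.org/page/Beijing"))
```

URLs that already use /resource/ pass through unchanged, so the helper can safely be run over a mixed list of links.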

Main article

The online wiki Rock in China has been online for a number of years and focuses on Chinese underground music. Before we started implementing Semantic MediaWiki, our wiki had roughly 4000 content pages with over 1800 artists and 900 records. We used a number of templates for bands, CDs, venues and labels, but apart from using numerous categories and the DynamicPageList (DPL) extension for a few joins, we were not able to make tangible use of the available data.
DPL example for a join between two wiki categories:

<DynamicPageList>
category = Metal Artists
category = Beijing Artists
mode     = ricstyle
order  = ascending
</DynamicPageList>

Results of a simple mashup query: display venues in Beijing on a Google Map

After having had an interesting discussion with Rene on the benefits of semantic data and Linked Open Data, we decided to go semantic. As total greenhorns in the field and with only limited programming skills and time available, we started off by googling the respective key terms and quickly enough came to the websites of the Music Ontology and Semantic MediaWiki, which we decided to install.
Being an electrical engineer with a basic IT background and many years of working on the web with PHP, HTML, Joomla and Mediawiki, I still found it a challenge to get used to the new semantic way of talking and to understand the principles behind it. Not so much because there are not enough tutorials or information on the web, but because the guiding principle is out there somewhere, just not where I was looking. Without the help of Rene and several feedback discussions I don’t think it would have been possible for us to implement this system within the month that it took us.
Our first difficulty (after getting the extension onto our FTP server) was to upgrade our existing Mediawiki from version 1.16 to version 1.19. The upgrade used up the better part of two days, including updating all other extensions as well (five of which no longer worked at all, as they are not being developed any further) and finally getting our first semantic property running.
When I started implementing the semantic approach, I read a lot online about the various ontologies available and checked the Music Ontology intensively. However, the Music Ontology is a poor fit for our use case: it goes more into the musical creation process, whereas Rock in China describes the development of the scene. All our implementations were tracked on the wiki page Rock in China – Semantic Approach, so that other team members could understand the current process and so that workarounds and problems were documented.
Our first test class was Venue, a category in which we had 40 – 50 live houses in China with varying levels of data depth, which we could put into the following template SemanticVenue:

{{SemanticVenue
|Image=
|ImageDescription=
|City=
|Address=
|Phone=
|Opened=
|Closed=
|GeoLocation=
}}

As can be seen from the above template, both predicates (City) and properties (Opened) are proposed for the semantic class VENUE. Semantic Mediawiki handles this decisive difference in a very user-friendly way: each SMW property is simply given a TYPE, which is either PAGE or something else. As good as this is, it causes some confusion when one talks with someone else about the semantic concept in principle.
A major problem was the import of external ontologies, which was not sufficiently documented on the Semantic Mediawiki page, most probably because of changes between versions. Especially the cross-referencing to the URI was a major problem. According to the Semantic Mediawiki documentation, aliases would be allowed; however, trial and error revealed that only a property with a domain prefix, e.g. foaf:phone or owl:sameas, would be correctly recognized. We used the Special:ExportRDF function to find most of these errors: every time our URI referencing was wrong, we would get a parser function error.
First, the wrong way for the following two wiki pages:

  • Mediawiki:smw_import_mo
  • Property:genre

Mediawiki:smw_import_mo:

http://purl.org/ontology/mo/ |[http://musicontology.com/ Music Ontology Specification]
activity_end|Type:Date
activity_start|Type:Date
MusicArtist|Category
genre|Type:Page
Genre|Category
track|Type:String
media_type|Type:String
publisher|Type:Page
origin|Type:Page
lyrics|Type:Text
free_download|Type:URL

Property:genre:

[[Has type::Page]][[Imported from::mo:genre]]

And now the correct way, which actually works (note that the property page itself carries the mo: prefix):
Mediawiki:smw_import_mo:

http://purl.org/ontology/mo/|[http://musicontology.com/ Music Ontology Specification]
activity_end|Type:Date
activity_start|Type:Date
MusicArtist|Category
genre|Type:Page
Genre|Category
track|Type:String
media_type|Type:String
publisher|Type:Page
origin|Type:Page
lyrics|Type:Text
free_download|Type:URL

Property:mo:genre:

[[Has type::Page]][[Imported from::mo:genre]]
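
To make the naming rule concrete, here is a minimal Python sketch that reads the text of such an import page and derives the prefixed property page names. It is only an illustration of the page layout shown above, not code that Semantic Mediawiki itself runs; the function name parse_smw_import and the shortened sample text are assumptions made for this example.

```python
def parse_smw_import(page_text: str, prefix: str) -> dict:
    """Parse the text of a MediaWiki:smw_import_<prefix> page.

    The first non-empty line holds the namespace URI and a description,
    separated by "|". Every following line is "<name>|<declared type>".
    """
    lines = [ln.strip() for ln in page_text.splitlines() if ln.strip()]
    namespace = lines[0].split("|", 1)[0].strip()
    elements = {}
    for line in lines[1:]:
        name, _, declared = line.partition("|")
        # The wiki property page must carry the prefix, e.g. "mo:genre".
        elements[f"{prefix}:{name.strip()}"] = declared.strip()
    return {"namespace": namespace, "elements": elements}


# Shortened copy of the import page from the text above.
SAMPLE = """http://purl.org/ontology/mo/|[http://musicontology.com/ Music Ontology Specification]
genre|Type:Page
lyrics|Type:Text
"""

vocab = parse_smw_import(SAMPLE, "mo")
for name, declared in vocab["elements"].items():
    local_name = name.split(":", 1)[1]
    print(f"Property:{name} -> {vocab['namespace']}{local_name} ({declared})")
```

Each printed line pairs the wiki page that has to exist (Property:mo:genre) with the external URI it stands for.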

The ontology that caused the most problems was DBpedia, whose documentation did not tell us what the correct URI was. Luckily the mailing list provided support and told us the correct URI:

http://www.dbpedia.org/ontology/

With that information, we were able to implement a number of semantic properties for a number of classes and start updating our wiki pages to get the data into our semantic database.
To utilize semantic properties within a wiki, there are a number of extensions available, such as Semantic Forms, Semantic Result Formats and Semantic Maps. The benefits we were able to gain were tremendous. For example, the join query that we ran with DPL at the beginning of this blog post could now be expressed as the following ASK query:

{{#ask: [[Category:Artists]] [[mo:origin::Beijing]]
|format=list
}}

This comes with the major benefit that the <references/> extension is NOT broken when the inline query is placed within a page. Dynamic Page List breaks <references/>, so that a lot of information is lost. Another example of how we benefitted from semantics: previously we were only able to use categories and read information by joining one or two of them, e.g. artist pages that were categorized as both BEIJING artists and METAL artists. Now, with semantic properties, we had a lot more data to play around with and could create mashup pages such as ROCK or Category:Records, on which we were able to show random videos of any ROCK artist or include a TIMELINE view of released records.
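
The same kind of query can also be sent from outside the wiki. The sketch below builds a request URL for the ask module of the Semantic Mediawiki web API in Python; it assumes that this module is enabled on the installation, and the endpoint address is a placeholder, not the real API address of our wiki. Note the double colon that SMW uses between a property name and its value.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_ask_url(api_endpoint, conditions, printouts=()):
    """Build a URL for an SMW "ask" API request.

    conditions: strings such as "Category:Artists" or "mo:origin::Beijing";
                each one is wrapped in [[...]].
    printouts:  optional property names to return, each sent as "|?name".
    """
    query = "".join(f"[[{c}]]" for c in conditions)
    query += "".join(f"|?{p}" for p in printouts)
    params = {"action": "ask", "query": query, "format": "json"}
    return f"{api_endpoint}?{urlencode(params)}"


# Placeholder endpoint, assumed for this example only.
url = build_ask_url(
    "http://www.example.org/wiki/api.php",
    ["Category:Artists", "mo:origin::Beijing"],
)
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0])
```

Building the URL through urlencode keeps the brackets and colons properly escaped, which is easy to get wrong when pasting a query into a URL by hand.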

Mashup Page with a suitable video

With the help of the Semantic Mediawiki mailing list (which was of great help when we were struggling) we implemented inline queries using templates, to avoid having to change data on multiple pages later. With that step taken, the basic semantic structures were set up on our wiki and it was time for our next step: bringing the semantic data of our wiki to others!
And here we were, asking ourselves: How will Freebase or DBpedia actually find our data? How will they include it? Discussing this with Rene, a few structural problems became apparent. Being used to working with Wikipedia, we usually set the property

Owl:sameas (or sameas)

on various of our pages, pointing directly to Wikipedia pages.
However, we learnt that the property

foaf:primaryTopic

is a much better and more accurate property for this. The sameas property should be used for semantic RDF pages, i.e. the respective DBpedia RESOURCE page (not the PAGE page). Luckily we had already implemented the sameas property mostly in templates, so it was easy enough to exchange the properties.
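
What the two kinds of links look like as RDF can be sketched with a few lines of Python that print N-Triples. All entity URIs below are placeholders chosen for illustration. One assumption needs stating loudly: FOAF defines foaf:primaryTopic as pointing from a document to the thing it is about (foaf:isPrimaryTopicOf is the inverse), so this sketch puts the Wikipedia page in the subject position. That direction is a reading of the FOAF vocabulary and not necessarily what SMW exports.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
FOAF_PRIMARY_TOPIC = "http://xmlns.com/foaf/0.1/primaryTopic"


def triple(subject, predicate, obj):
    """Serialize one triple of three URIs as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def link_triples(entity_uri, dbpedia_resource, wikipedia_page):
    """Link an entity to DBpedia (same thing) and Wikipedia (a document)."""
    return [
        # Entity and DBpedia resource denote the same real-world thing.
        triple(entity_uri, OWL_SAME_AS, dbpedia_resource),
        # The Wikipedia article is a document ABOUT the entity (assumed direction).
        triple(wikipedia_page, FOAF_PRIMARY_TOPIC, entity_uri),
    ]


lines = link_triples(
    "http://example.org/id/Carsick_Cars",             # placeholder entity URI
    "http://dbpedia.org/resource/Carsick_Cars",       # illustrative target
    "http://en.wikipedia.org/wiki/Carsick_Cars",      # illustrative target
)
print("\n".join(lines))
```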
Having figured out this issue, we checked Freebase as well as other sites such as DBpedia and MusicBrainz, but there seems to be no “submit RDF” form. Hence we decided that the best way of getting recognized in the Semantic Web is to include more links to other RDF resources, e.g. for our Category:Artists we set sameas links to DBpedia and the Music Ontology. For DBpedia we linked to the class and for the Music Ontology to the URI of the class.
A side note: when checking on sameas.org, it seems that the Music Ontology is NOT cross-linked to DBpedia so far.
Following the recommendations set forth at Sindice, we changed our robots.txt to include our semantic sitemap(s):

Sitemap: http://www.music-china.org/wiki/index.php?title=Special:RecentChanges&feed=atom
Sitemap: http://www.rockinchina.com/wiki/index.php?title=Special:RecentChanges&feed=atom
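
Adding such lines by hand is easy, but for several domains a small script avoids duplicates. The following Python sketch is an assumption about how this could be automated, not something we actually ran; the function name add_sitemaps and the sample robots.txt content are invented for this example.

```python
def add_sitemaps(robots_txt: str, sitemap_urls) -> str:
    """Append "Sitemap:" lines to robots.txt content, skipping duplicates."""
    lines = robots_txt.splitlines()
    # Collect the sitemap URLs that are already declared.
    declared = {
        ln.split(":", 1)[1].strip()
        for ln in lines
        if ln.lower().startswith("sitemap:")
    }
    for url in sitemap_urls:
        if url not in declared:
            lines.append(f"Sitemap: {url}")
            declared.add(url)
    return "\n".join(lines) + "\n"


FEED = "http://www.rockinchina.com/wiki/index.php?title=Special:RecentChanges&feed=atom"
existing = "User-agent: *\nDisallow: /private/\n"  # hypothetical starting content

updated = add_sitemaps(existing, [FEED])
print(updated)
```

Running the function a second time on its own output changes nothing, so it can be part of a regular deployment step.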

Going one step further, we analyzed how we could include external data in our SMW, e.g. from MusicBrainz or from YouTube. As a music-oriented site, we found YouTube of particular interest. We found the SMW extension External Data, which we could use to connect to the Google API:

{{#get_web_data:
url=https://www.googleapis.com/youtube/v3/search?part=snippet&q=carsick+cars&topicId=%2Fm%2F03cmgbv&type=video&key=Googlev3API&maxResults=50
|format=JSON
|data= videoId=videoId,title=title
}}

And

{{#for_external_table:
{{Youtube|ID={{{videoId}}}|title={{{title}}} }}<br/>
{{{videoId}}} and {{{title}}}<br/>
}}

See our internal TESTPAGE for the live example.
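
For readers who want to see what the two wiki calls above do with the data, here is a rough Python equivalent of the extraction step. It assumes the documented shape of a YouTube Data API v3 search response, in which each entry of items carries id.videoId and snippet.title; the sample response is hand-written with placeholder values and was not fetched from the API.

```python
import json

# Hand-written sample in the shape of a v3 search response (placeholder values).
SAMPLE_RESPONSE = """
{
  "items": [
    {"id": {"kind": "youtube#video", "videoId": "VIDEO_ID_1"},
     "snippet": {"title": "First example title"}},
    {"id": {"kind": "youtube#video", "videoId": "VIDEO_ID_2"},
     "snippet": {"title": "Second example title"}}
  ]
}
"""


def extract_videos(response_text):
    """Return one {"videoId", "title"} row per search result."""
    payload = json.loads(response_text)
    rows = []
    for item in payload.get("items", []):
        video_id = item.get("id", {}).get("videoId")
        title = item.get("snippet", {}).get("title")
        # Skip results that lack either field (e.g. channels or playlists).
        if video_id and title:
            rows.append({"videoId": video_id, "title": title})
    return rows


videos = extract_videos(SAMPLE_RESPONSE)
for row in videos:
    print(row["videoId"], "and", row["title"])
```

The loop at the end mirrors the second line inside #for_external_table, which prints the ID and the title of each result.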
YouTube uses its in-house Freebase ID system to generate auto-channels filled with official music videos of bands and singers. The Freebase ID can be found on the individual Freebase RESOURCE page after pressing the EDIT button. Alternatively one could use the Google API to retrieve the ID, but would need a YouTube-internal HC ID before that. An easy implementation for our wiki: include the FreebaseID as a semantic property on artist pages within our Definitions template:

{{Definitions
|wikipedia=
|dbpedia=
|freebase=
|freebaseID=
|musicbrainz=
|youtubeautochannel=
}}

Voilà: with the additional SQL-based caching of request results (e.g. JSON), our API load on Google is extremely low and pages on our wiki load faster. Using this method we were able to increase our saved YOUTUBE ID tags from the original 500 to way over 1000 within half a day.
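
The idea behind such a cache can be sketched in Python with SQLite. This is not the caching code of the External Data extension, only a minimal illustration of the principle; the class name UrlCache, the table layout and the one-day lifetime are assumptions made for this example.

```python
import sqlite3
import time


class UrlCache:
    """Cache response bodies by URL in an SQL table."""

    def __init__(self, fetch, max_age_seconds=86400, db_path=":memory:"):
        self.fetch = fetch              # function that performs the real request
        self.max_age = max_age_seconds  # how long a cached body stays valid
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS url_cache ("
            "url TEXT PRIMARY KEY, body TEXT NOT NULL, fetched_at REAL NOT NULL)"
        )

    def get(self, url):
        row = self.db.execute(
            "SELECT body, fetched_at FROM url_cache WHERE url = ?", (url,)
        ).fetchone()
        if row is not None and time.time() - row[1] < self.max_age:
            return row[0]  # fresh cache hit, no remote request needed
        body = self.fetch(url)
        self.db.execute(
            "INSERT OR REPLACE INTO url_cache (url, body, fetched_at) "
            "VALUES (?, ?, ?)",
            (url, body, time.time()),
        )
        self.db.commit()
        return body


calls = []


def fake_fetch(url):
    """Stand-in for a real HTTP request; records every call."""
    calls.append(url)
    return '{"items": []}'


cache = UrlCache(fake_fetch)
first = cache.get("https://example.org/api?q=carsick+cars")
second = cache.get("https://example.org/api?q=carsick+cars")
print(len(calls))
```

The second lookup is answered from the table, so the remote API is contacted only once per URL within the cache lifetime.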

A big variety of videos for an act like Carsick Cars is now available thanks to semantifying

With these structures in place it was time to inform the people in our community not only on the changes that have been made but also on the additional benefits and possibilities. We used our own blog as well as our Facebook page and Facebook group to spread the word.

How to do a presentation in China? Some of my experiences
https://www.rene-pickhardt.de/how-to-do-a-presentation-in-china-some-of-my-experiences/
Fri, 02 Nov 2012 08:33:22 +0000

So the culture is different from Western culture, we all know that! I am certainly not an expert on China, but after living in China for almost 2 years, knowing some of the language, working in a Chinese company, seeing presentations every week and also visiting over 30 Western and Chinese companies located in China, I think I have some insights into how you should organize your presentation in China.
I recently went to Shanghai for a research exchange with Jiaotong University, where I was to give a presentation to introduce my institute and myself. Here you can find my rather uncommon presentation and some remarks on why some slides were designed the way they are.
http://www.rene-pickhardt.de/wp-content/uploads/2012/11/ApexLabIntroductionOfWeST.pdf

Guanxi – your relations

First of all, I think it is really important to understand that in China everything is related to your relations (http://en.wikipedia.org/wiki/Guanxi). A Chinese business card will always name a few of your best and strongest contacts. This is more important than your address, for example. When a conference starts, people exchange name cards before they sit down and discuss.
This principle of Guanxi is also reflected in the style in which presentations are made. Here are some basic rules:

  • Show pictures of people you worked together with
  • Show pictures of groups while you organized events
  • Show pictures of the panels that run events
  • Show your partners (for business not only clients but also people you are buying from or working together with in general)

My way of respecting these principles:

  • I first showed a group picture of our institute!
  • I also showed for almost every project where I could get hold of it pictures of the people that are responsible for the project
  • I did not only show the European research projects our university is in but listed all the different partners and showed logos of them

Family

The second thing is that in China the concept of family is very important. I would say, as a rule of thumb, that if you want to do business with someone in China and you haven’t been introduced to their family, things are not going the way you might expect.
For this reason I included some slides with a world map, zooming in on the place where I was born, where I studied and where my parents still live!

Localizing

When I chose a world map, I not only took one in Chinese but also one in which China was centered. In my contact data I also listed Chinese social networks. Remember that Twitter, Facebook and many other sites are blocked in China. So if you really want to communicate with Chinese people, why not get a QQ number or a Weibo account?

Design of the slides

You have seen this at conferences many times: Chinese people just put a heck of a lot of stuff on a slide. I strongly believe this is due to the fact that reading and recognizing Chinese characters is much faster than reading Western characters. So if your presentation is in Chinese, don’t be afraid to stuff your slides with information. I have seen many talks by Chinese people who were literally reading out word by word what was written on the slides. Whereas in Western countries this is considered bad practice, in China it is all right.

Language

Speaking of language: of course, if you know some Chinese, it shows respect if you at least try to include some. I split my presentation into 2 parts, one in Chinese and one in English.

Have an interesting take away message

In my case I included the fact that we have open PhD positions and scholarships, and that our institute is really international with English as the working language. Of course I also included some slides about my past and current research, like Graphity and Typology.

During the presentation:

In China it is not rude at all if one’s cellphone rings and one has more important stuff to do. You as the presenter should switch off your phone, but you should not be disturbed or annoyed if people in the audience receive phone calls and leave the room to take care of that business. This is very common in China.
I am sure there are many more rules on how to give a presentation in China, and maybe I even made some mistakes in mine, but at least I have the feeling that the reaction was quite positive. So if you have questions, suggestions or feedback, feel free to drop a line. I am more than happy to discuss cultural topics!
