Privacy – Data Science, Data Analytics and Machine Learning Consulting in Koblenz Germany
https://www.rene-pickhardt.de
Extract knowledge from your data and be ahead of your competition
Tue, 17 Jul 2018 12:12:43 +0000

Smartphones of Policemen could give criminals a competitive advantage
https://www.rene-pickhardt.de/smartphones-of-policemen-could-give-criminals-a-competitive-advantage/
Fri, 27 Apr 2012 09:31:43 +0000

If I were a criminal, I would create a smartphone app that would give me the possibility to track policemen geographically and socially. Here is some background on this thought.
Yesterday I attended the German summit on “Facebook Google & Co – Chances and Risks” (which I will blog about soon). Today, on the train to the second day of the summit, I talked to a very friendly police officer. He agreed with what was said at the summit. The police are using social networks to find potential criminals. They also use cellphone tracking, together with mobile providers, to find people they are looking for. Nothing new or special so far. But now to my interesting observation.
The police officer proudly told me that he does not use any social networking service because he enjoys his privacy. I understood that he believed this to be necessary in his job. While telling me this, he was holding his iPhone in his hand. This again shows one of the most crucial points in the entire privacy discussion: even highly educated people often lack an understanding of how much private information they implicitly give to third parties.
So I asked him if he used it during work hours. He told me that he did, since only mobile providers would know where he is and they could not give away data that easily. I was amazed! A policeman using an iPhone at work. That is such a security hole. If I were a terrorist organization, I would create an iPhone and Android app (or, if possible, an open mobile HTML5 app like Tim Berners-Lee suggests; you see, the ethics overwhelm me, I am just not a criminal (-:). I would design this app to support policemen: help them communicate, or offer a cool map integration, anything that is useful for the police. In this way I would create a database with real movement data of policemen. I could use this data for a different service, similar to http://girlsaround.me/, displaying the current position and face of policemen (including, if requested, a list of people they recently communicated with and their phone numbers) on a map to any member of my terrorist organization. The police could never catch me, since I would always know where they are (without asking any mobile provider!). I could even make fake phone calls to them, pretending to be one of the people they recently communicated with, feeding them false information or just distracting them.

Of course this setting is only half realistic:

  • Every policeman would have to have a smartphone and use it during work time
  • Every policeman would have to install the app of the criminal
  • The criminal can distinguish between policemen and other people using the app (should be possible with data mining)
  • The criminal can decide whether the policeman is currently working or off duty

But it should show and demonstrate the dangers…

To conclude:

We have to forbid policemen from using private smartphones during work! Or, if they do so, they must not install any applications from a source they don’t trust. And here is the crucial point: whom to trust and whom not? Trust is usually created through social ties. So if the app is there and some policemen like the service and recommend it to their coworkers, trust is created. Who really asks about the source of an app and about who is running or owning the data servers? A service that is well known on the web can easily be run by 2 or 3 people, and even if they are nice, it is easy to manipulate or blackmail them in order to get access to this very sensitive data.
And on another more technical topic: We need a decentralized mobile space. There has to be a frequency on which people are able to set up their own transmitters and create decentralized mobile networks. It is a shame that those frequencies are all owned by companies creating centralized services.
By the way this would be a good solution since it would also enable the police to have their own decentralized mobile networks giving them privacy against third parties!

Disclaimer:

I never thought I would write an article in this paranoid way, telling people what is possible and where the risks are. I almost feel like a member of the CCC, Anonymous or finally like a real pirate. But one year of PhD in a very data-driven environment, with social networks, information retrieval and the web as a focus, really makes me understand more and more what is possible (and in particular easy to achieve). Also, the low awareness of society about these dangers (probably due to the complex technologies) overwhelms me and makes me feel like I have to act and at least inform people.
Too bad that mostly people who are already aware of these topics read my blog. Maybe I have to go geek and create this app to demonstrate the functionality in order to really raise awareness. There are just too many interesting things to do during a PhD program, so I think this time only writing about it has to be sufficient.

Risks and criticisms of Google's Data liberation front
https://www.rene-pickhardt.de/risks-and-criticisms-of-googles-data-liberation-front/
Mon, 18 Jul 2011 19:19:04 +0000

I have wanted to write about Google’s data liberation front for some time. The data liberation front is an effort by Google to make your data from Google products available to you once you decide not to use them any more. Today my former student Martin was faster on his blog and wrote an excellent article on Google’s data liberation front, which you really should check out since it saved me a lot of effort! I agree with almost all the points he makes. Let us be honest: even though Google had many chances to be evil, most of the time they weren’t evil at all. But it is also true that they never had the need to be evil. I think Google’s data liberation front is great and probably this company will really not turn evil. But I would love to extend Martin’s article with some potential risks.

Data liberation front needs to be improved

The data liberation front is nice but could also be much better. Loads of data are still inaccessible. I am not sure whether it is anything more than a marketing gag which is nice to have as long as they rule the web anyway. If business gets worse, Google will still have the chance to be like Zuckerberg and change policies. We never know, and there is certainly no guarantee!
I was not surprised by Google+ and Google’s efforts to launch a social network. Even though the network seems to have great privacy settings, it is still centralized, which is about the biggest criticism you can make of Google. If Google were really all that great and really cared about open data and our privacy, why didn’t they use all their engineering effort to create a decentralized social network?
One last example: Google’s blogging services allow you to take your blog posts and export them. But what about the backlinks you have collected? Let us assume you run a successful blog and want to move to your own website. How is Google helping you here? As far as I know they don’t let you put 301 redirects from your old blog posts to your new blog. Exporting your content is only half the way to not being locked in!
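To make the missing piece concrete, here is a minimal sketch of what a 301 redirect from old post URLs to new ones looks like on a server you control yourself (which, as described above, a hosted blogging service may not give you). The paths, the domain `new-blog.example` and the names `REDIRECTS`, `redirect_target` and `RedirectHandler` are all invented for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping from old post paths to their new homes.
REDIRECTS = {
    "/2011/07/my-old-post.html": "https://new-blog.example/my-old-post/",
}


def redirect_target(path):
    """Return the new URL for an old path, or None if the path is unknown."""
    return REDIRECTS.get(path)


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = redirect_target(self.path)
        if target is None:
            self.send_error(404)
            return
        # 301 = Moved Permanently: browsers follow it and search engines
        # can credit existing backlinks to the new address.
        self.send_response(301)
        self.send_header("Location", target)
        self.end_headers()


# To try it locally (blocks until interrupted):
# HTTPServer(("localhost", 8000), RedirectHandler).serve_forever()
```

The point of the sketch is only that the redirect has to be answered by the old address, so an export alone cannot provide it.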

What happens if Page & Brin aren’t there anymore?

Not much to say about this risk! Right now the Google founders clearly have a vision. But what happens if they don’t run the company anymore? There is certainly a lot of trust we – or at least I – put in Google, or at least in the two founders. I think it is important to be aware of this, even though right now they seem to have good ethics and morals.
Please don’t misunderstand me. I love Google, and the data liberation front is a very cool feature and a great step towards an open web and a better society and life! I do not know of any company on the web offering similar services. I am just pointing out that we should not blindly trust Google with their data liberation front and that there is still room for improvement! In this way I encourage the Google engineers to continue with their great job!
First privacy impressions of my new android phone
https://www.rene-pickhardt.de/first-privacy-impressions-of-my-new-android-phone/
Wed, 22 Jun 2011 17:29:44 +0000

My new cellphone finally arrived today. Being a fan of Google products, I was excited to test Android and get a feeling for everything. I wasn’t sure whether I would really need a smartphone or whether it was rather a time-wasting but cool toy. After a bit of testing and playing around, I have to admit that I will probably use the option to retrieve feeds and read more news and blogs while on the train or bus. I might also work on my Chinese more frequently with Anki for Android, and there are some other features that will most certainly enrich my life.
One of them was that Google offered to synchronize my Gmail address book and calendar with Android. The data is with Google anyway, so I decided that it is not a big deal. And voilà, all my contacts, including phone numbers, are on my new phone. Amazing, considering the heart attack I had when my old cell, for which I did not have any backups, broke down.

All this comes at a very high price.

Since I started blogging and working on my PhD, I also started to use Twitter. So I wanted to download a Twitter app from the Android app market. It is incredible that the official Twitter app asks permission to access my phone’s address book. Remember, my phone’s address book is just a copy of my Gmail address book. I see how it helps Twitter to improve their service, but to me it has become just too easy to share very sensitive data with companies that you might not (?) trust. I wonder whether the service Twitter offers us will really improve that much if we share our private address book with the company. In my opinion, the small improvement we get does not justify their need to access my private address book. What would I have to promise someone to get a copy of his address book?

Can I escape?

I decided not to install the Twitter app. But does that really make sense? I guess most people don’t mind. After all, it is Twitter, a well-known brand, that asks for the data. Additionally, Twitter is a communication service, so it makes sense to share this kind of data. However, even if I don’t share, Twitter can still guess the entries of my address book. Most of my friends who use Twitter with an Android phone will probably accept the terms and conditions of the Twitter app. Does not installing the app really help to protect my and my friends’ (!) privacy?
It is amazing that I am thinking right now about the consequences of blogging about my experience of interviewing with Google, when exactly this company creates structures that make us all sit in a glass house! I am very sure that this is intentional. Please don’t misunderstand me. My first impression of Android is very good, and I knew before that it encourages you in several ways to share data with anyone. Still, Android is probably one of the most useful tools which were brought to customers within the last ten years. I am only pointing out that things are changing very fast these days.

Which Android apps do I need?

So far I have:

  • Google Maps
  • Gmail
  • Google Search
  • Google Voice Recognition
  • Google Reader
  • Google News
  • Tweetdeck (without sharing my address book !)
  • Ankidroid
  • Google Docs
  • Google Calendar

What else would you suggest? And no, I don’t want a Facebook app. (-:

Filterbubble appeared on Eli Parisers Moveon.org!
https://www.rene-pickhardt.de/filterbubble-appeared-on-eli-parisers-moveon-org/
Tue, 24 May 2011 17:53:21 +0000

Yesterday I clicked on Eli Pariser’s site moveon.org and right away I was surprised that the AddThis plugin under the headline story offered Facebook, email, Twitter AND meinVZ (one of Germany’s biggest social networking sites).
MeinVZ logo can be found in the German Version of Moveon.org

I asked a friend in the States to visit moveon.org. For him the plugin offered Facebook, email, Twitter and Google, and of course no MeinVZ for the Americans.
Instead of MeinVZ there is a link to Google on the American Moveon website

Well, of course this isn’t really the kind of personalization that Eli Pariser is telling us to watch out for. In my opinion it is a rather useful personalization of technology and not a personalization of information. But I was sure Eli was not aware that this was actually happening. Since Eli always points out that the filter bubble is invisible, I thought it was an excellent little example of how INVISIBLE the filter bubble actually is and how easily people can contribute to it. So I contacted him and told him about it.

Eli Pariser’s reply to my mail

It’s a good catch — I wasn’t aware that AddThis was doing that. In the long run, we’re hoping to move away from that plugin for a number of reasons, but it’s fine to point out if you’d like — it does underscore the point that one can miss this happening under one’s own nose.

What are the 57 signals google uses to filter search results?
https://www.rene-pickhardt.de/google-uses-57-signals-to-filter/
Tue, 17 May 2011 22:58:16 +0000

Since my blog post on Eli Pariser’s TED talk about the filter bubble became quite popular, and a lot of people seem to be interested in which 57 signals Google would use to filter search results, I decided to extend the list from my article and list the signals I would use if I were Google. It might not be 57 signals, but I guess it is enough to get an idea:

  1. our search history
  2. our location – verified
  3. the browser we use
  4. the browser’s version
  5. the computer we use
  6. the language we use
  7. the time we need to type in a query
  8. the time we spend on the search result page
  9. the time between selecting different results for the same query
  10. our operating system
  11. our operating system’s version
  12. the resolution of our computer screen
  13. average number of search requests per day
  14. average number of search requests per topic (to finish a search)
  15. distribution of search services we use (web / images / videos / real time / news / mobile)
  16. average position of search results we click on
  17. time of the day
  18. current date
  19. topics of ads we click on
  20. frequency with which we click on advertising
  21. topics of AdSense advertising we click on while surfing other websites
  22. frequency with which we click on AdSense advertising on other websites
  23. frequency of searches for domains on Google
  24. use of google.com or the Google toolbar
  25. our age
  26. our sex
  27. use of the “I’m Feeling Lucky” button
  28. whether we use the enter key or the mouse to send a search request
  29. whether we use keyboard shortcuts to navigate through search results
  30. whether we use advanced search commands (and how often)
  31. whether we use iGoogle (which widgets / topics)
  32. where on the screen we click besides the search results (and how often)
  33. where we move the mouse and mark text in the search results
  34. number of typos while searching
  35. how often we use related search queries
  36. how often we use autosuggestion
  37. how often we use spell correction
  38. distribution of short / general queries vs. specific / long tail queries
  39. which other Google services we use (Gmail / YouTube / Maps / Picasa / …)
  40. how often we search for ourselves

Uff I have to say after 57 minutes of brainstorming I am running out of ideas for the moment. But this might be because it is already one hour after midnight!
If you have some other ideas for signals or think some of my guesses are totally unreasonable, why don’t you tell me in the comments?
Disclaimer: this list of signals is a pure guess based on my knowledge and education in data mining. It is possible that not a single signal I name corresponds to one of the 57 signals Google is using. In the future I might discuss why each of these signals could be interesting. But remember: as long as you have a high diversity in the distribution you are fine with any list of signals.
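One possible way to make the remark about diversity concrete: a signal only helps to tell users apart if its values are spread out. The following is a minimal sketch, assuming Shannon entropy as the measure of diversity (an illustrative choice, not something Google has confirmed) and using made-up sample values for two signals.

```python
import math
from collections import Counter


def entropy_bits(values):
    """Shannon entropy (in bits) of the observed values of one signal."""
    counts = Counter(values)
    total = len(values)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Invented observations of two signals across eight users.
browsers = ["firefox", "chrome", "firefox", "ie", "chrome", "safari", "opera", "ie"]
languages = ["de", "de", "de", "de", "de", "de", "de", "de"]

print(entropy_bits(browsers))   # 2.25 -> diverse values, useful for telling users apart
print(entropy_bits(languages))  # 0.0  -> everyone looks the same, no information
```

Under this reading, any list of signals works as long as enough of them have a high entropy; which concrete signals are on the list matters much less.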

Apple I phone Location tracking: That's not a bug
https://www.rene-pickhardt.de/apple-i-phone-location-tracking-thats-not-a-bug/
Wed, 27 Apr 2011 15:34:44 +0000

Today Apple finally published a press release with a statement about their tracking of the geo locations of iPhone users. They clearly say that they did not intend to save this data on an iPhone, nor did they purposely send the data back to the Apple servers. In their press release they state that this is a bug in the iPhone firmware iOS.

Apple:
“6. People have identified up to a year’s worth of location data being stored on the iPhone. Why does my iPhone need so much data in order to assist it in finding my location today?
This data is not the iPhone’s location data-it is a subset (cache) of the crowd-sourced Wi-Fi hotspot and cell tower database which is downloaded from Apple into the iPhone to assist the iPhone in rapidly and accurately calculating location. The reason the iPhone stores so much data is a bug we uncovered and plan to fix shortly (see Software Update section below). We don’t think the iPhone needs to store more than seven days of this data.
7. When I turn off Location Services, why does my iPhone sometimes continue updating its Wi-Fi and cell tower data from Apple’s crowd-sourced database?
It shouldn’t. This is a bug, which we plan to fix shortly (see Software Update section below).”

This is ridiculous! This is no bug!

I have been developing software for many years. There are bugs in software development. In this case it is very obvious that someone acted with full awareness of what he was doing. Thanks to this article and this piece of software I was able to extract the geo data from Nasir Naveed’s iPhone. Have a look at the table that he made available.

iPhone geo locations
The iPhone collects timestamps, longitude and latitude (thanks to Nasir Naveed)

In this data format Apple stores timestamps, which are equivalent to a date and time, together with the longitude and latitude where the phone was situated at that very moment. Additionally, some other data is saved!
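To read such a table, the raw timestamps have to be turned into dates. Here is a minimal sketch, assuming the timestamps count seconds since 1 January 2001 UTC (the reference date Apple’s frameworks commonly use; the table above does not state the epoch) and using an invented sample row rather than real data. The names `APPLE_EPOCH` and `to_datetime` are made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: timestamps are seconds since 2001-01-01 00:00:00 UTC.
APPLE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def to_datetime(apple_seconds):
    """Convert a raw timestamp into a timezone-aware datetime."""
    return APPLE_EPOCH + timedelta(seconds=apple_seconds)


# Invented sample row: (timestamp, latitude, longitude). Not real data.
rows = [(325000000.0, 50.3569, 7.5890)]

for timestamp, latitude, longitude in rows:
    print(to_datetime(timestamp).isoformat(), latitude, longitude)
```

If the converted dates do not fall into a plausible range for the phone in question, the assumed epoch is wrong and should be checked against the extraction tool’s documentation.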

Apple is lying and distributing wrong information.

In their press release they state the following:

Apple:
“3. Why is my iPhone logging my location?
The iPhone is not logging your location. Rather, it’s maintaining a database of Wi-Fi hotspots and cell towers around your current location, some of which may be located more than one hundred miles away from your iPhone, to help your iPhone rapidly and accurately calculate its location when requested. Calculating a phone’s location using just GPS satellite data can take up to several minutes. iPhone can reduce this time to just a few seconds by using Wi-Fi hotspot and cell tower data to quickly find GPS satellites, and even triangulate its location using just Wi-Fi hotspot and cell tower data when GPS is not available (such as indoors or in basements). These calculations are performed live on the iPhone using a crowd-sourced database of Wi-Fi hotspot and cell tower data that is generated by tens of millions of iPhones sending the geo-tagged locations of nearby Wi-Fi hotspots and cell towers in an anonymous and encrypted form to Apple.”

Note the lie that location tracking with GPS can take several minutes! If that were true, how come every GPS in a car is working right away? Modern GPS devices in cars are able to quickly calculate the location even of moving objects that change their position quite frequently.

Apple is receiving your profile of motion

Apple:
“5. Can Apple locate me based on my geo-tagged Wi-Fi hotspot and cell tower data?
No. This data is sent to Apple in an anonymous and encrypted form. Apple cannot identify the source of this data.”

It is true that for most of the research use cases I can think of off the top of my head, Apple only needs the anonymous data. But let’s be realistic. Even if Apple really does send this data in an anonymous way, they could still do a mashup with the usual data they collect (your phone calls) and would probably be able to identify you quickly. In any case they have never asked for your permission!

Apple is obviously not even thinking of changing its policy

Apple:
“In the next major iOS software release the cache will also be encrypted on the iPhone.”

OK, this is the coolest part! In future iOS versions Apple will still store the data on your phone. They will probably still send it back to the Apple servers. But don’t worry, the data on your phone will be encrypted so that no one else but Apple can access the data (not even you)! Thank you Apple! You must really be telling the truth when you say:

10. Does Apple believe that personal information security and privacy are important?
Yes, we strongly do. For example, iPhone was the first to ask users to give their permission for each and every app that wanted to use location. Apple will continue to be one of the leaders in strengthening personal information security and privacy.

Help to stop Apple spreading misleading information

It is very sad to see how a lot of big news sites are just skimming Apple’s press release and thereby help Apple to spread wrong and confusing information. Please pass on what you have just discovered and don’t let Apple win over your privacy. Just use your Facebook account or Twitter.
Btw I don’t know about the law in other countries, but in Germany you have a two-year warranty on your product. So if this really is a bug, you could probably return your iPhone and get your money back. Have you ever considered switching away from your iPhone, or have you been unsatisfied with it? Well, now is a great moment to switch!

Please send me your iPhone’s location data

The quoted blog article and piece of software explain how you can extract your movement data from your iPhone. Apple has this data without even asking you. I am a researcher and I can think of many scientific use cases. So please do not only share your data with Apple but also with some people here at the university. Just attach the logfile you extract from your iPhone to an email to r.pickhardt@gmail.com. I promise I will anonymize your data and only share it within the scientific community.
Btw you can find the full press release here

Algorithmic Information Filter from Eli Pariser’s TED Talks
https://www.rene-pickhardt.de/algorithmic-information-filter-from-elis-parisers-ted-talks/
Sun, 13 Mar 2011 13:34:06 +0000

Just today an interesting story came up on a German news site which goes back to Eli Pariser’s (Homepage, follow @Twitter) talk on TED about a thing he calls the Filter Bubble and how personalization is changing the Internet. Before commenting on his talk I want to personally thank him for using his reputation to start a discussion on such a fundamental and important topic!
UPDATE: most likely you are looking for my list of almost 57 signals Google might use to filter
I had a short mail conversation with Eli. He asked me to temporarily remove his TED talk since his book isn’t on sale yet. I found a very similar talk by him which he allowed me to make public on my blog. So here you go folks:

Google is filtering and personalizing search results

Eli is pointing out a thing some people might have already noticed. If two different people search for the same thing on Google, it is very probable that the search results will be very different. Google is doing this without telling the user that it is actually filtering the results based on what the algorithm thinks the user might like. According to Eli Pariser, Google is using 57 signals to determine our interests. Among those we find:

Of course this kind of personalization has its good sides. When I am about to buy a new notebook computer, I definitely want to see different websites depending on whether I live in Germany or in the US. This could be due to taxes and shipping fees, which means that I am most probably interested in local stores and not in overseas shops. Still, this personalization and filtering has a huge potential for serious problems. Let me ask a few questions:

  • What happens if Google misinterprets our 57 signals?
  • What happens if I only receive results of a certain type?
  • What if I rely on the assumption that I have access to all kinds of information?

We might think we get all the information we need. But in reality we are becoming blinded by the filters Google is using. We have no chance to determine what other information on a certain topic is filtered out and potentially available. On the other hand, due to the amount of information, we need filters and computers to help us. But the systems should be more transparent!

Facebook is also filtering the news stream from your friends:

I have always thought that Facebook’s huge success is strongly correlated with the fact that there is hardly any spam on Facebook and that the information economy is very smart and user-friendly. The attention users pay to status updates is very high, making Facebook a great place for every company to do online and viral marketing. This of course contributes to Facebook’s reach. In fact, the information architecture on Facebook is so smart that your 20,000 followers on Facebook might not receive your status updates, since Facebook’s EdgeRank algorithm decides they are not relevant to your fans or friends. EdgeRank might not have 57 signals, but it still takes into consideration:

  • who your fans are friends with
  • what other news they like
  • how heavily they have interacted with you in the past
  • the time passed since your last status update
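How factors like these could decide whether an update gets shown can be sketched as follows. This is a toy model only, assuming the commonly described form of EdgeRank as affinity × weight × time decay; the function name `edge_score`, the half-life of 24 hours and the threshold are invented for illustration and are not Facebook’s actual values.

```python
def edge_score(affinity, weight, hours_since_update, half_life_hours=24.0):
    """Toy relevance score: past interaction x content weight x freshness."""
    decay = 0.5 ** (hours_since_update / half_life_hours)
    return affinity * weight * decay


# Hypothetical cut-off below which an update is not shown to a fan.
VISIBILITY_THRESHOLD = 0.3


def is_shown(affinity, weight, hours_since_update):
    return edge_score(affinity, weight, hours_since_update) >= VISIBILITY_THRESHOLD


# A fan who interacts a lot sees a fresh update; a passive fan does not.
print(is_shown(affinity=0.9, weight=1.0, hours_since_update=2.0))
print(is_shown(affinity=0.1, weight=1.0, hours_since_update=2.0))
```

In such a model, the same status update reaches one follower and silently disappears for another, which is exactly the invisible filtering described above.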

Great news isn’t it? Just compare this with my statement in a recent blog post about creating newsletters as a musician in order to communicate with your fans and not solely rely on other services like Facebook or MySpace.
You don’t believe the Facebook thing? There is a video about the EdgeRank algorithm used by Facebook to determine which status updates should reach us and which shouldn’t. Feel free to have a look and thanks to the guys from Klurig Analytics for producing such a great video resource:

So what can we do?

  1. We should join the discussion in order to push Google, Facebook and others to become more transparent.
  2. We should be aware of the fact that a lot of information might not reach us.
  3. Even though more and more information is made available through the Internet, we should not become lazy and rely on all these great web services.
  4. Last but not least, you can help to spread the information about this topic! As we have seen, information only breaks through the filtering system if a lot of people spread it. And this topic is worth spreading!

Again, thanks a lot to Eli Pariser for starting this discussion!
