When heart rate becomes the goal of practice. Will professional sports lead the way in self-tracking?

Even the City bikes in Helsinki give you a report of the trip, which is also added to your personal use history along with the pick-up and return stations.

Monitoring our performance with wearable devices has become popular despite the lack of scientific evidence on its benefits to our health. GPS tracking, for instance, is no longer an activity of goal-oriented athletes alone as anyone from a casual paddler to a committed bicycle commuter can easily record routes with a sports watch or smartphone application. Provided that the privacy of individuals is guaranteed, prospects for re-using the data for societal good are plentiful. Many of us would find novel value-added services enabling for example smarter route planning [1] or optimization of a city’s cycling network and its maintenance [2] very welcome.

As part of my doctoral work, I aim to understand the representativeness of crowd-sourced movement data. In terms of cycling, activity tracking data is typically expected to be biased towards faster recreational riders, which has raised questions regarding its value in the urban planning context. Other biases may not be as obvious, but can be at least as problematic. One example is participation inequality, by which we refer to the uneven distribution of recorded tracks between users. This is a typical characteristic of online communities based on volunteered collaboration, such as Wikipedia and OpenStreetMap, in which a small active minority of users is responsible for a large share of the contributions.
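To make participation inequality concrete, here is a minimal Python sketch that computes the share of all tracks recorded by the most active users. The function name `top_share`, the 10% cut-off and the sample data are illustrative assumptions, not taken from the cited studies.

```python
from collections import Counter


def top_share(track_owners, top_fraction=0.1):
    """Share of all tracks recorded by the most active `top_fraction` of users."""
    counts = sorted(Counter(track_owners).values(), reverse=True)
    if not counts:
        return 0.0
    # Always look at at least one user, even in very small communities.
    n_top = max(1, round(len(counts) * top_fraction))
    return sum(counts[:n_top]) / sum(counts)


# Hypothetical data: one user label per recorded track (100 tracks, 10 users).
owners = (["u1"] * 60 + ["u2"] * 20 + ["u3"] * 5 + ["u4"] * 5 + ["u5"] * 4
          + ["u6"] * 2 + ["u7", "u8", "u9", "u10"])

print(top_share(owners))        # share of the single most active user
print(top_share(owners, 0.2))   # share of the two most active users
```

In this made-up sample the single most active user accounts for 60 of the 100 tracks, so patterns derived from the data would largely reflect that one person's habits.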

Our findings suggest that individual cyclists, particularly commuters who repeatedly track their route between home and work, may have a greater, potentially biasing, effect on the spatial and temporal cycling patterns derived from public mobile sports tracking application data than mass events and other group journeys [2,3]. Regarding privacy, biases induced by individual-level phenomena bring us to an important decision: even though from a privacy perspective it would seem justifiable to remove all user identifiers from the data, in terms of data usability it is necessary to be able to associate tracks belonging to the same user via pseudo-IDs [2,3]. Whether this is possible or not will affect routing [1] and other analyses, and ultimately determine the value of the data in building sustainable smart cities.
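One common way to produce such pseudo-IDs is a keyed hash of the original user identifier. The sketch below is only an illustration of that general idea, not the method used in [2,3]; the function name `pseudo_id`, the key and the 16-character length are assumptions.

```python
import hashlib
import hmac


def pseudo_id(user_id, secret_key):
    """Map a user identifier to a stable pseudo-ID using HMAC-SHA256.

    The same user always gets the same pseudo-ID, so tracks can still be
    grouped per user, but the original identifier is not stored in the data.
    """
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


# Hypothetical key; in practice it would be kept by the data provider only.
key = b"keep-this-secret"

tracks = [("alice@example.org", "track-1"),
          ("bob@example.org", "track-2"),
          ("alice@example.org", "track-3")]

released = [(pseudo_id(user, key), track) for user, track in tracks]
print(released)
```

Note that a pseudo-ID alone does not guarantee privacy: repeated home-to-work tracks can still point to an individual, which is exactly the tension described above.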

It is likely that privacy and data ownership issues will only proliferate in the future once biometric tracking devices move into the consumer sector and are plugged in for data fusion. In professional sports, wearables, and specifically biometric trackers, are already a highly controversial topic. The battle is tough, for example, in the NBA, where Lauri Markkanen, who played his way into the hearts of all Finns in EuroBasket 2017, has also impressed Chicago Bulls fans.

The advantages of biometric data may seem so obvious, considering their benefits in recovery and injury prevention, that we forget the potential, and not necessarily desirable, effects in the longer term. This is where the players’ union has stepped in, reminding us that such data can also be used against the player. In a recent article in The Atlantic, “The Upcoming Privacy Battle Over Wearables in the NBA”, Jeremy Venook described the current agreement [4]:

“[…] use in practice is strictly voluntary […]. Any team requesting a player wear one must explain, in writing, what’s being tracked and how the team will use the information […] Most importantly, the agreement says that ‘data collected from a Wearable worn at the request of a team may be used for player health and performance purposes and Team on-court tactical and strategic purposes only. The data may not be considered, used, discussed or referenced for any other purpose such as in negotiations regarding a future Player Contract or other Player Contract transaction,’ under penalty of a $250,000 fine.”

Obviously, collecting data for one purpose and later using it in a completely different context may be problematic. In the context of professional sports, this could have serious economic consequences for the athletes. Although data collection is voluntary, it is rare (or impossible) for any player to refuse it. Therefore, it is important that everything be transparent and controlled by the player, Venook explains. In reality, the ownership of biometric data is currently very fuzzy [5], and the lack of transparency and trust can lead to cheating on the player’s side, for example by faking explosive movement or tying the sleep monitor around a pillow [6].

How can we increase the usability of movement tracking data in a privacy-respecting manner in order to build smarter and more sustainable cities? How will the battles in professional sports affect future decisions on the use of biometric tracking regarding consumer-level devices? Can the NBA’s players’ union become “a radical potential of the big data era”? [7]

In my opinion, these worlds are tightly connected. After all, we are all humans. Yes, even Lauri Markkanen (2.13 m, born in Vantaa).

[1] Bergman, C., & Oksanen, J. (2016). Conflation of OpenStreetMap and Mobile Sports Tracking Data for Automatic Bicycle Routing. Transactions in GIS, 20(6), 848-868. Available at:

[2] Oksanen, J., Bergman, C., Sainio, J., & Westerholm, J. (2015). Methods for deriving and calibrating privacy-preserving heat maps from mobile sports tracking application data. Journal of Transport Geography, 48, 135–144. Available at:

[3] Bergman, C., & Oksanen, J. (2016). Estimating the Biasing Effect of Behavioural Patterns on Mobile Fitness App Data by Density-Based Clustering. In Sarjakoski, T., Santos, M. Y., & Sarjakoski, L. T. (Eds.), Geospatial Data in a Changing World. Lecture Notes in Geoinformation and Cartography, Springer International Publishing, pp. 199–218. Available at:

[4] Venook, J. (2017). The Upcoming Privacy Battle Over Wearables in the NBA. The Atlantic.

[5] Karkazis, K., & Fishman, J. R. (2017). Tracking U.S. Professional Athletes: The Ethics of Biometric Technologies. The American Journal of Bioethics, 17(1), 45–60.

[6] LHN Presents: SXsports “1984 Meets Moneyball: Who Owns Player Data”

[7] Crawford, K. (2014). The Anxieties of Big Data. The New Inquiry.

Does privacy matter?


Text by José Vallet

When I discuss privacy issues with other people, I find that many of them don’t understand why it matters so much to me and why I am so careful with it on so many fronts. For example, they do not understand why I make my life “so difficult” by reading terms of agreement and avoiding most (if not all) of the popular “free” applications and services available on the Internet today. Nor do they understand why I don’t use a modern smartphone. I believe that the main reason for this difference in attitudes is a lack of awareness, e.g. regarding:

  • What data can be and is being collected from us.
  • What information it conveys about us as individuals.
  • How that information can be and is being used.
  • And, ultimately, how this affects our societies and us as individuals, which is the real non-monetary price that we pay for using those non-transparent services and devices.

This blog post is not meant to convince you of the importance of privacy, but rather to invite you to think about it for yourself. It contains a proposition for a reflection, a reading and a post-reflection. It will not take long, and I think it will be worth a few minutes of your time.

Imagine an average Internet user, let’s call him John, who expresses his opinions using tools provided by popular social media platforms, e.g. Facebook, and whose activity and presence on the Internet is being tracked by this and other applications and tools developed by companies that collect users’ data. Please think about the following questions:

  • How much information about his personality do you think the “likes” that John gives throughout the Internet convey?
  • How many “likes” are necessary to build up a detailed profile of John’s personality?
  • How detailed can the profile be if it is built using data such as his connections with friends, the pictures that he posts, what he reads and thinks, what his opinions are and, in general, the fingerprint that he leaves on the Internet? Can this profile contain information about John that he does not know about himself?
  • Can a personality profile built with that information be used to predict his behavior?

And can a computer do this in an automated, fast and efficient manner? Can these predictions be more accurate than what John believes of himself?

  • What could a for-profit company use a particular individual’s detailed personality profile for? What if this company had a detailed profile of every individual in an entire country, or even in the whole world? Is that feasible at all?
  • Lastly, do you think that you are clever and well informed enough to avoid being manipulated?

Now I kindly invite you to read this post and to find answers to some of the previous questions.

After the reading, I kindly invite you to think again about the previous questions, but now adding to the arsenal of collected data the individuals’ position at all times with an accuracy of under a meter, indoors and outdoors. Please also think about what information our position carries about us as individuals.

Does privacy matter to you?

Are you following your neighbourhood?

Twitter makes it easy for us to follow other people and organizations. You just press follow and you will stay up to date on what’s going on. But what if you would like to follow what’s happening in your neighbourhood or some other small area you are interested in? How would you do that?

There are many interesting things happening near us, but we sometimes miss them because there really isn’t an easy and efficient way to keep up with what’s going on around us. We miss the small local things like the blueberry pie tasting at the local bakery, the event night at the local library or the little league match on the local sports field – the types of things we can just walk to.

This kind of information, referred to as hyperlocal information, is usually shared through channels that reach people that are near the information or somehow connected to it. There are two ways to come across hyperlocal information: being there or hearing about it.

You could come by hyperlocal information, for example, by taking a look at the local bulletin board, or you could walk by a bulletin attached to a lamp post. You might also get lucky and walk right into something interesting that’s happening. All of these require you to physically be at a specific location at a specific time, and being at the right place at the right time isn’t something that happens without effort. Information found this way is relevant to you in terms of location, but you might not necessarily be interested in it. So, even if you are at the right place at the right time, you still might not be interested in what’s going on.

Another way to get hyperlocal information is to hear about it. You could, for example, be talking to your neighbour and they mention something about a local event. This is great, but your neighbour might not be aware of your interests or of all the things going on around you. Actually, most of our neighbours aren’t walking bulletin boards, so we tend to rely on social networks. Seeing something posted to a group on a social network site usually means that it’s interesting to us, but then the location aspect could be a little vague. Facebook, for example, thinks “near” means the city you live in. In most cities this “near” might mean hours of travel. Currently, social networks tend to do a poor job of linking their information to a specific location. Established places such as cafés, restaurants and museums can post their information to their social media outlets, but then you have to be part of that group to receive the information.

Let’s say there’s an open picnic at the local park. The organizer of this event doesn’t own the park, nor does the park have any kind of social media presence. How would the organizer reach the people who might be interested? This kind of grassroots information is also usually missed due to the sheer amount of information posted to social media sites. Unless you scroll all the way down the notification lists of all your social networks, you might be missing out on fun stuff.

On the one hand, if you are there, you might not be interested. On the other hand, you might be interested, but you are not there. Due to its nature, hyperlocal information stays hidden and is usually only available to the people already spending time in or near the area where it happens. Furthermore, even spending time in the area is sometimes not enough, since you have to be at the right place at the right time to receive the information. You might walk by the poster on the lamp post and not notice it. You might even be part of the social network group that shares the hyperlocal information of your area, but you might miss the update. Often the social network group you are in is spread out over such a large area that most of the posts aren’t that relevant to you. Thus, hyperlocal information is usually presented in a way that makes it difficult to reach.

What if there was a way to stay in touch with what’s happening near you without having to worry about privacy or missing things you are interested in? That’s what we are trying to do with #hylo. Check out more about #hylo at

– Mikko Rönneberg

You are where we are

What information does our geographical position convey about us? What information about myself and others am I giving (voluntarily or not) to those who have access to my location history? These are questions that I often ask myself when I try to understand and explain why it is so important for companies like Google to keep track of people’s position at all times. The obvious reason is money: they want to monetize that vast amount of information. But how? What information can they obtain, and how do they make a business out of it?

The reality nowadays is that most of us are unaware of the details regarding what information is being squeezed out of our personal location data and how it is being monetized. What is clear, though, is that, in terms of information, the aim is to create a profile of each one of us and a model that can be used to predict our preferences and behavior individually, and that location information is only one source of data used to create and exploit this model. However, with the advancements in localization and mapping technologies, it will soon be possible to estimate the position of mobile devices, and through them our personal location, outdoors and indoors with an accuracy of less than a meter. This unprecedented, ubiquitous level of accuracy will make it possible to track directly where and how we spend time in close detail. This will expose more private information than ever before about our daily habits and activities, and about how we live our lives in general. Furthermore, cross-analyzing this type of location data collected from larger population groups gives access to information about how we relate to others, perhaps the most important factor that defines our personality. Therefore, position information is becoming a key explanatory input variable in that complex artificial model of our personality, and thus its value is increasing in the eyes of companies creating and exploiting people’s profiles.

But what personal information could one obtain from accurate location data records? The answer is that it depends on the creativity and skills of the experts who design the algorithms, who in turn work according to for-profit corporate business strategies. There are no implicit limits as to what information can be distilled from raw data, only a general rule of thumb: the more detailed the data source is, the more information it carries and the more accurate the models and predictors that can be built. As a scientist familiar with simple statistical, machine learning and artificial intelligence methods, it is not difficult for me to come up with possible probabilistic models and algorithms that one could devise and implement on a computer in order to infer personal information about people based on their geographical position history. Here I collect a few thoughts from one simple and informal brainstorming session.

Our spatiotemporal patterns reveal who we are

Let’s assume that we have a uniquely identifiable device (e.g. a mobile phone) with us all the time, whose geographic position is estimated periodically (e.g. once per second) with the above-mentioned accuracy of 1 m and stored together with a timestamp in a database. This is what I will refer to as our location history record. Let’s now think about our daily life in terms of our geographic position and spatiotemporal behavior, all embodied in this collected data. It is clear that we all have repetitive patterns that are easy to identify, and these carry rich personal information about us. Let’s see some examples.

One clear repetitive behavioral pattern is that most of us work in the same place during roughly the same time interval, and that every evening or night we come back home, where we stay for a while and eventually spend the night and the early morning. These patterns make our home and workplace easy to recognize from location history records: for example, one can simply plot a dot on a map for each of the points of your location history and look for the centers of the two areas with the highest density of points (see Illustration 1). Of course, you can also refine the identification by making two separate plots: one with points corresponding to positions during the evening and night, and another one with positions during roughly working hours.

Illustration 1: Example of a representation of a three-day location history record of a fictitious person. Looking at this figure, what can you say about this person? Where does he work and live? How does he get to his workplace? What does he do when he is located in the right-most cloud of points (which corresponds to Mäkelänrinne, a well-known swimming facility in Helsinki)?
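The density idea described above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: positions are given as (hour of day, x, y) with coordinates in meters in some local grid, and the function names, the 50 m cell size and the hour ranges are all hypothetical choices.

```python
from collections import Counter


def densest_cell(points, cell_size=50.0):
    """Return the center (x, y) of the grid cell that holds the most points."""
    cells = Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)
    (cx, cy), _ = cells.most_common(1)[0]
    return ((cx + 0.5) * cell_size, (cy + 0.5) * cell_size)


def guess_home_and_work(records, night=(22, 6), office=(9, 17)):
    """records: iterable of (hour_of_day, x, y); returns (home, work) estimates."""
    records = list(records)
    # Night wraps around midnight, so it is "late evening OR early morning".
    night_pts = [(x, y) for h, x, y in records if h >= night[0] or h < night[1]]
    work_pts = [(x, y) for h, x, y in records if office[0] <= h < office[1]]
    return densest_cell(night_pts), densest_cell(work_pts)


# Hypothetical one-day record: nights near (10, 20), office hours near (1010, 2020).
records = [(h, 10 + h, 20) for h in (22, 23, 0, 1, 2, 3, 4, 5)]
records += [(h, 1010 + h, 2020) for h in range(9, 17)]
records += [(7, 300, 600), (8, 700, 1400), (18, 500, 1000)]  # commute samples

home, work = guess_home_and_work(records)
print(home, work)
```

Splitting the points by time of day corresponds to the refinement with two separate plots mentioned above; with real data one would look at many days rather than one.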

In fact, speaking of sleeping, we typically do it in the same bed and bedroom, perhaps with the same person, who is very likely our partner. Some might also have kids sharing the same common space (home) but sleeping in other subspaces (rooms). Broadly speaking, we can say that, given the structure of our society, it is very likely that two or more persons who typically spend the nights in the same space form a couple or family. There are other possibilities, for example flatmates in a student dorm, but these tend to be located in known places and have different mid-term spatial behavioral patterns through which they can be easily distinguished.

This type of general “broad” information known beforehand is what statisticians call “prior knowledge”. Using the previously described prior knowledge and individuals’ accurate position history records, it would be relatively easy to infer things like who sleeps with whom and who belongs to which family group. For example, one possible way to do it in practice is to form small spatial clusters (about the size of a room) containing the locations of different persons during the night. One can argue that most of the clusters containing two elements correspond to people sleeping together or in the same room. Increasing the cluster size to, e.g., the average size of flats in a certain neighborhood, one could identify with a relatively high success rate the whole set of family groups in that neighborhood and which family each individual belongs to (see Illustration 2).

Illustration 2: Small (in light green) and larger (in light blue) clusters containing the positions of people during the night. These clusters reveal information about the relations among individuals. (Background floor map courtesy of RoomSketcher.)
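The two cluster sizes can be imitated with a very simple grouping rule: two people belong to the same group if their typical night-time positions are closer than a chosen radius. The Python sketch below (requires Python 3.8+ for `math.dist`) is an assumption-laden illustration; the names, positions and radii are made up, and it ignores floors by working in two dimensions only.

```python
from itertools import combinations
from math import dist


def colocated_groups(night_positions, radius):
    """Group people whose typical night-time positions lie within `radius` meters.

    night_positions: dict mapping a (pseudo) ID to an (x, y) position in meters.
    Grouping is transitive: A and C share a group if both are close to B.
    """
    parent = {p: p for p in night_positions}

    def find(p):
        # Follow parent links to the group representative (with path halving).
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in combinations(night_positions, 2):
        if dist(night_positions[a], night_positions[b]) <= radius:
            parent[find(a)] = find(b)

    groups = {}
    for p in night_positions:
        groups.setdefault(find(p), set()).add(p)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical typical night positions, in meters.
positions = {"anna": (0, 0), "ben": (1.5, 0.5), "child": (6, 1), "neighbour": (40, 0)}

print(colocated_groups(positions, radius=3))    # room-sized clusters
print(colocated_groups(positions, radius=12))   # flat-sized clusters
```

With the small radius only the two people sharing a room are grouped; with the larger radius the whole household is grouped while the neighbour stays separate.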

Let’s keep our attention on our homes for a while longer and think about how we use their different subspaces (rooms). We wake up and typically have breakfast with the same people (probably family members) and in the same space; let me guess: either the kitchen or the living room. Furthermore, it might be the same place where the family has dinner together (again, prior knowledge). Thus, one can identify a common space within the house and associate it with a certain usage or activity (preparing and eating food). Similarly, the place where people watch TV, play games or spend free time can be identified and labeled as “living room”. After all, people usually watch TV on a sofa, not moving much and “in clusters” (with other group members). If, in addition, you have a smart TV, be sure that its position can be estimated using WiFi signals, and, if you are watching it, you are likely to be close to it.

The main point is that, by analyzing the spatiotemporal patterns of groups of people from position history records, one can create a map of all the spaces in which they interact (e.g. buildings), with abstract subdivisions to which one can attach attributes, such as the activities that the group members carry out there. And once these maps are created, one can do the inverse process: infer information about a person (e.g. activity or interests) from his position in that space. Two aspects of these maps are worth considering. First, they do not need to be the typical visually appealing floor maps designed to show spatial information to humans: they only need to be understood by the computer that runs the inference algorithms and to serve the purpose for which they have been designed. Second, it is not necessary to send specialized surveying personnel to build the map on-site: it is built automatically by the group members that share the space (the family), who are probably completely unaware of what is going on.

A close-up in public spaces

The same rationale can be applied to other, less private spaces. If the space is completely public, the prior knowledge on what activity is normally carried out in that particular space can be more accurate and retrieved manually beforehand. For example, cafeterias, restaurants, cinemas, gyms, libraries, theaters, museums and shopping malls (and the shops inside them) are all spaces in which we carry out well-defined and different activities. Imagine that you are in a museum. A typical floor map of the museum would be divided into rooms, each room containing different paintings. Another possible, more detailed map to be stored in a computer could consist of associations between paintings and the areas where visitors are most likely to be standing when looking at them. Once this map is created, it would be easy to know which paintings the visitors are most interested in from their timestamped position records.

The same principles can be applied, e.g., to a supermarket. In this case the map can consist of associations between stands and the spaces in front of them where people look at the products on display. So, from our position records, one could infer information about how we move in the supermarket, what catches our attention, what products we are interested in, how long we spend deciding whether to buy something or not, etc., all at an individual level. This is actually the target of retail analytics, a concept similar to web analytics but applied to traditional physical shops. Its aim is to propose and study the effectiveness of different selling strategies, like the position of ads, stands and so on, including personalized marketing. At the moment, tracking shoppers’ movements is done, e.g., using devices attached to shopping carts and trolleys as well as smartphones, but there is also research on how to track the eyes of the consumer while shopping.

Our mode of transportation can also be inferred from location information. If you go to work by car, you will use roads, starting and finishing the trip at slightly different positions around the same area, with a certain, but small, degree of randomness. If you take the bus, you enter and leave the line only at bus stops located in well-known fixed spots, and the buses always follow a well-known route. If you use a bike, you cannot go as fast as cars, but you go faster than pedestrians. And if you walk… well, you are lucky, but still traceable! Again, these are all activities (in this case transportation modes) characterized by different spatiotemporal patterns that can be easily identified. And, again, our means of transportation convey information about our social status and values, at least to some extent, don’t they?
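The speed part of this reasoning is easy to sketch in Python. The example below classifies a track by its median speed only; the thresholds (7 and 30 km/h) and the function names are illustrative assumptions, and telling a bus from a car would additionally require matching the track against bus stops and routes, which is not attempted here.

```python
from math import dist
from statistics import median


def median_speed_kmh(track):
    """track: list of (t_seconds, x_m, y_m) samples ordered by time."""
    speeds = [
        dist((x0, y0), (x1, y1)) / (t1 - t0) * 3.6   # m/s converted to km/h
        for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:])
        if t1 > t0
    ]
    return median(speeds)


def guess_mode(track, walk_max=7.0, bike_max=30.0):
    """Very rough guess of the transport mode from the median speed."""
    speed = median_speed_kmh(track)
    if speed <= walk_max:
        return "walking"
    if speed <= bike_max:
        return "cycling"
    return "motorised"


# Hypothetical one-minute tracks sampled once per second along a straight line.
walker = [(t, 1.4 * t, 0.0) for t in range(60)]    # about 5 km/h
cyclist = [(t, 5.0 * t, 0.0) for t in range(60)]   # 18 km/h
driver = [(t, 15.0 * t, 0.0) for t in range(60)]   # 54 km/h

print(guess_mode(walker), guess_mode(cyclist), guess_mode(driver))
```

The median is used instead of the mean so that short stops, such as waiting at traffic lights, do not dominate the estimate.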

Mutual location history reveals relationships

One important idea that has been floating around but has not been explicitly mentioned in this post is that the type of relation that we have with other persons can be inferred from our mutual location history. Without much effort, one could come up with simple classification criteria for types of human relations based on this history. For example, we can form the following classification groups:

  • Colleagues: persons that we share the working time with.
  • Acquaintances: persons that we see during leisure time only every now and then and mostly outside our place of residence.
  • Friends: persons that we see and spend time with more often, occasionally also in our respective places of residence. We might go to public spaces together (e.g. cinemas, parks, gym, etc.).
  • Relatives/flat mates: persons spending time regularly in the same space during the night.
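The groups above can be turned into a toy rule-based classifier. The Python sketch below assumes that someone has already computed, from two location histories, how many hours per week two persons spend close to each other in each context; the context keys, the thresholds and the function name are all hypothetical.

```python
def classify_relation(hours_together):
    """Guess the type of relation from weekly hours spent close to each other.

    hours_together: dict keyed by context ('night', 'work', 'leisure_home',
    'leisure_public'); missing contexts count as zero hours.
    """
    night = hours_together.get("night", 0)
    work = hours_together.get("work", 0)
    at_home = hours_together.get("leisure_home", 0)
    in_public = hours_together.get("leisure_public", 0)

    if night >= 20:                                  # regularly share the night
        return "relative/flatmate"
    if work >= 10 and work > at_home + in_public:    # mostly working time
        return "colleague"
    if at_home > 0 and at_home + in_public >= 3:     # often, also at home
        return "friend"
    if at_home > 0 or in_public > 0:                 # only every now and then
        return "acquaintance"
    return "unknown"


print(classify_relation({"night": 50, "leisure_home": 20}))
print(classify_relation({"work": 35, "leisure_public": 1}))
print(classify_relation({"leisure_home": 2, "leisure_public": 3}))
print(classify_relation({"leisure_public": 1}))
```

A real system would more likely learn such rules from data than hard-code them, but even these crude thresholds show how little is needed to label a relation.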

The information about how we relate to others is especially revealing because we influence and are influenced by others in different ways and to different degrees depending on what type of relation we maintain. For example, we tend to keep closer relations with persons with whom we share something, e.g. interests, hobbies or values. This, in turn, means that my own profile can be used to add information to the profiles of others who relate to me, even if they have not consented to the collection of their data (see shadow profiles). This creates a fundamental problem in privacy management: my personal attitudes and choices affect the privacy of others. It is like smoking: if I smoke, others around me will smell of smoke, whether they choose to smoke or not. This raises an interesting issue regarding our own responsibility towards the privacy of others, especially of those closest to us: privacy is not a personal, individual choice anymore.

In conclusion: the location history of a population is a very rich source of information that contains very detailed private information about the activities, habits and relations of each of its members, from which more elaborate conclusions can be drawn regarding their status, religion, interests, health, personal attitudes and values, and other characteristics of their individual and collective personality. A record of our position history tells who we, and others around us, are.
