Thursday, 27 June 2013

Why Mass Surveillance Does Not Work

by Dirk Helbing (ETH Zurich)

These days, it is often claimed that we need mass surveillance to ensure a high level of security. While the idea sounds plausible, I will explain why this approach cannot work well, even if secret services have the very best intentions and their sensitive knowledge is never misused. This is a matter of statistics: no detection method is perfect.
For the sake of illustration, let us assume there are 2,000 terrorists in a country with 200 million inhabitants. Moreover, let us assume that the secret service manages to identify terrorists with an amazing 99% accuracy. Then there is a 1% rate of false negatives (type II errors), which means that 20 terrorists go undetected, while 1,980 are caught. The actual numbers are much smaller: it has been declared that 50 terror acts were prevented in about 12 years, while a few terrorist attacks could not be stopped (although the terrorists were often listed as suspects).
It is also important to ask how many false positives ("false alarms") we have. If the type I error rate is just 1 in 10,000, there will be 20,000 wrong suspects; if it is 1 per mille, there will be 200,000 wrong suspects; and if it is 1 percent, there will be 2 million wrong suspects. Recent figures I have heard on TV spoke of 8 million suspects in the US in 1996, which would correspond to an error rate of about 4 percent. If these figures are correct, it would mean that for every terrorist caught, about 4,000 innocent citizens would be wrongly categorized as (potential) terrorists.
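The arithmetic above can be restated as a short calculation. This is only a sketch using the article's illustrative figures (2,000 terrorists, 200 million inhabitants, 99% detection accuracy); the numbers are assumptions, not real data:

```python
# Base-rate illustration with the article's assumed figures (not real data).
population = 200_000_000   # inhabitants
terrorists = 2_000         # assumed number of terrorists
innocents = population - terrorists

true_positive_rate = 0.99  # 99% of terrorists are correctly flagged

missed = round(terrorists * (1 - true_positive_rate))  # false negatives (type II)
caught = terrorists - missed
print(f"Terrorists caught: {caught:,}, missed: {missed}")

# False alarms (type I errors) for several error rates, including the
# ~4% rate implied by the "8 million suspects" figure quoted in the text.
for false_positive_rate in (1 / 10_000, 1 / 1_000, 1 / 100, 4 / 100):
    false_alarms = round(innocents * false_positive_rate)
    per_caught = false_alarms / caught
    print(f"Type I rate {false_positive_rate:.2%}: {false_alarms:,} innocent "
          f"suspects (~{per_caught:,.0f} per caught terrorist)")
```

Even with an unrealistically accurate classifier, the rarity of actual terrorists means false alarms swamp true detections by three to four orders of magnitude.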
Hence, large-scale surveillance is not an effective means of fighting terrorism. It rather tends to restrict the freedom rights of millions of innocent citizens. It is not reasonable to apply surveillance to the whole population, for the same reason that it is not sensible to run a certain medical test on everybody: there would be millions of false positives, i.e. millions of people who would be wrongly treated, with negative side effects on their health. For this reason, patients are tested for diseases only if they show worrying symptoms.
In the very same way, it creates more harm than benefit if everybody is screened as a potential future terrorist. This causes unjustified discrimination and harmful self-censorship at a time when unconventional, new ideas are needed more than ever. It impairs the ability of our society to innovate and adapt, thereby promoting instability. Thus, it is time to pursue a different approach, namely to identify the social, economic and political factors that promote crime and terrorism, and to change these factors. Just two decades back, we saw comparatively few security problems in most modern societies. Overall, people tolerated each other and coexisted peacefully, without massive surveillance and policing. We were living in a free and happy world, where people of different cultural backgrounds respected each other and did not have to live in fear. Can we have this time back, please?


Type I and type II errors, see

Monday, 17 June 2013

How to Ensure that the European Data Protection Legislation Will Protect the Citizens

by Dirk Helbing (ETH Zurich)
(an almost identical version has been forwarded to some Members of the European Parliament on April 7, 2013)

Some serious, fundamental problems to be solved 

The first problem is that, when two or more anonymous data sets are combined, this may allow deanonymization, i.e. the identification of the individuals whose data have been recorded. Mobility data, in particular, can be easily deanonymized.

A second fundamental problem is that it must be assumed that the large majority of people in developed countries, including the countries of the European Union, have already been profiled in detail, given that individual devices can be identified with high accuracy through their individual configurations (including the software used and its settings). There are currently about 700 million commercial data sets about users, specifying an estimated 1,500 variables per user.

A third problem is that both the CIA and the FBI have revealed that, besides publicly or semi-publicly available data on the Web or in social media, they are or will be storing and processing private data, including Gmail and Dropbox data. The same applies to many secret services around the world. It has also become public that the NSA appears to collect all data it can get hold of.

A fourth fundamental problem is that Europe currently does not have the technical means, algorithms, software, data and laws to counter foreign dominance regarding Big Data and its potential misuse.

General principles and suggested approach to address the above problems

The age of information will only be sustainable if people can trust that their data are used in their interest. The spirit and goal of data regulations should be to ensure this.

Personal data are data characterizing individuals, or data derived from them. People should be the primary owners of their personal data. Individuals, companies or government agencies that gather, produce, process, store, or buy data should be considered secondary owners. Whenever personal data concern European citizens, or are stored, processed, or used in a European country or by a company operating in a European country, European law should apply.

Individuals should be allowed to use their own personal data in any way compatible with fundamental rights, including sharing them with others, for free or at least for a small monthly fee covering the use of ALL their personal data (like the radio and TV fee). [Note: This is important to unleash the power of personal data for the benefit of society and to close the data gap that Europe has.]

Individuals should have a right to access a full copy of all their personal data through a central service and be suitably protected from misuse of these data.

They should have a right to limit the use of their personal data at any time and to request their correction or deletion in a simple and timely way, free of charge.

Fines should apply to any person, company or institution gaining financial or other advantages from the misuse of personal data.

Misuse particularly includes sensitive uses that carry a certain probability of violating human rights or justified personal interests. Therefore, it must be recorded what error rate the processing (and, in particular, the classification) of personal data has, specifying what fraction (per mille) of users feel disadvantaged.

A central institution (which might be an open Web platform) is needed to collect user complaints. Sufficient transparency and decentralized institutions are required to take effective, timely and affordable action to protect the interests of users.

The exercise of user rights must be easy, not time-consuming, and cheap (essentially free). For example, users must not be flooded with requests regarding their personal data. They must be able to ensure a self-determined use of their personal data with little individual effort.

To limit misuse, transparency is crucial. For example, it should be required that large-scale processing of personal data (i.e. at least the queries that were executed) must be made public in a machine-readable form, such that public institutions and NGOs can determine how dangerous such queries might be for individuals.

Proposed definitions

As indicated above, there are practically no data that cannot be deanonymized if combined with other data. However, the following may be considered a practical definition of anonymity:

Anonymous data are data in which a person of interest can only be identified with a probability smaller than 1/2000, i.e. there is no way to find out which one among two thousand individuals has the property of interest.
Hence, the principle is to dilute persons with a certain property of interest among 2,000 persons with significantly different properties, in order to make it unlikely that persons with the property of interest can be identified. This principle is guided by the way election data and other sensitive data are used by public authorities. It also makes sure that private companies do not have a data-processing advantage over public institutions (including research institutions).
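The dilution rule above amounts to a minimum-group-size check, similar in spirit to k-anonymity. The following is an illustrative sketch (the function name and the release/reject framing are my assumptions, not part of the proposal):

```python
# Sketch of the proposed 1/2000 anonymity threshold: a person with the
# property of interest must be "diluted" among at least K candidates,
# so the probability of identifying any single one is at most 1/K.
K = 2000  # minimum group size implied by the 1/2000 identification probability

def is_sufficiently_anonymous(group_size: int, k: int = K) -> bool:
    """A query result is treated as anonymous if the group of matching
    individuals contains at least k persons."""
    return group_size >= k

# Example: a query matching only 150 persons would have to be rejected,
# while one matching 5,000 persons could be released.
assert not is_sufficiently_anonymous(150)
assert is_sufficiently_anonymous(5000)
```

In practice such a check would have to be applied to every released statistic, since combining several small, individually harmless queries can still single out individuals.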

I would propose to characterize pseudonymous data as data not suited to reveal or track the user, or properties correlated with the user that he or she has not explicitly chosen to reveal in the specific context. I would furthermore suggest characterizing pseudonymous transactions as processing and storing only the minimum amount of data required to perform a service requested by a user (which particularly implies not processing or storing technical details that would allow one to identify the user's device and software). Essentially, pseudonymous transactions should not be suited to identify the user or variables that might identify him or her. Typically, a pseudonym is a random or user-specified variable that allows one to sell a product or perform a service for a user anonymously, typically in exchange for an anonymous money transfer.

To allow users to check pseudonymity, the data processed and stored should be fully shared with the user via an encrypted webpage (or similar) that is accessible for a limited, but sufficiently long, time period through a unique and confidential decryption key made accessible only to the respective user. It should be possible for the user to easily decrypt, view, copy, download and transfer the data processed and stored by the pseudonymous transaction, in a way that is not tracked.

Further information:

  • Difficulty to anonymize data
  • Danger of surveillance society
  • New deal on data, how to consider consumer interests
  • HP software allowing personalized advertisement without revealing personal data to companies, contact: Prof. Dr. Bernardo Huberman:
  • FuturICT initiative
Information on the proposer

Dirk Helbing is Professor of Sociology, in particular of Modeling and Simulation, and member of the Computer Science Department at ETH Zurich. He is also elected member of the German Academy of Sciences. He earned a PhD in physics and was Managing Director of the Institute of Transport & Economics at Dresden University of Technology in Germany. He is internationally well-known for his work on pedestrian crowds, vehicle traffic, and agent-based models of social systems. Furthermore, he is coordinating the FuturICT Initiative, which focuses on the understanding of techno-socio-economic systems, using Big Data. His work is documented by hundreds of well-cited scientific articles, dozens of keynote talks and hundreds of media reports in all major languages. Helbing is also chairman of the Physics of Socio-Economic Systems Division of the German Physical Society, co-founder of ETH Zurich's Risk Center, and elected member of the World Economic Forum's Global Agenda Council on Complex Systems.

Saturday, 8 June 2013

Qualified Trust, not Surveillance, is the Basis for a Stable Society - Dirk Helbing

Peaceful citizens and hard-working taxpayers are under government surveillance. Confidential communication of journalists is intercepted. Civilians are killed by drones, without a chance to prove their innocence.[1] How could it come to this? Since September 11, freedom rights have been restricted step by step in most democracies. Each terrorist threat has delivered new reasons to extend the security infrastructure, which is now reaching Orwellian dimensions. Through its individual configuration, every computer has an almost unique fingerprint, allowing one to record our use of the Web. Privacy is gone. Over the past years, up to 1,500 variables about half a billion citizens in the industrialized world have been recorded. Google and Facebook know us better than our friends and families do.

Nevertheless, governments have so far failed to get terrorism, drug trafficking, cybercrime and tax evasion under control. Would an omniscient state be able to change this and create a new social order?[2] It seems, at least, to be the dream of secret services and security agencies.
Ira "Gus" Hunt, the CIA Chief Technology Officer, recently said:[3]

"You're already a walking sensor platform… You are aware of the fact that somebody can know where you are at all times because you carry a mobile device, even if that mobile device is turned off. You know this, I hope? Yes? Well, you should… Since you can't connect dots you don't have, it drives us into a mode of, we fundamentally try to collect everything and hang on to it forever… It is really very nearly within our grasp to be able to compute on all human generated information." 

Unfortunately, connecting the dots often does not work. As complex systems experts point out, such "linear thinking" can be totally misleading. It's the reason why we often want to do the right things, but take the wrong decisions.

I agree that our world has been destabilized. However, this is not the result of external threats, but of system-immanent feedback effects: the increasing interdependencies, connectivity and complexity of our world, among other trends, are causing this.[4] Trying to centrally control this complexity is destined to fail. We must rather learn to embrace the potential of complexity, which requires a step towards decentralized, self-regulatory approaches. Many of us believe in Adam Smith's "invisible hand", according to which the best societal and economic outcome is reached if everybody just does what is best for himself or herself. However, under certain circumstances this principle is known to produce market failures, financial meltdowns, and other "tragedies of the commons" (such as environmental degradation). The classical approach is to try to "fix" these problems through top-down regulation by a powerful state.

However, self-regulation based on decentralized rules can be learned. This has been demonstrated for modern traffic control concepts; it is equally relevant for smart grids and will be even more important for the financial system. The latter, for example, needs built-in breaking points, similar to the fuses in our electrical networks at home, and it requires additional control parameters to reach equilibrium.

There is an alternative to uncoordinated bottom-up organization and excessive top-down regulation -- a better one: the "economy 2.0". Taking the step towards a self-regulating, participatory market society can unleash the unused potential of the complexity and diversity we are currently trying to fight.[5] This step can boost our societies and economies as much as the transition from centrally regulated societies to the market societies inspired by Adam Smith. But after 300 years, it is time for a new paradigm. Societies based on surveillance and punishment are not sustainable in the long term. When controlled, people get angry, and the economy never thrives. Qualified trust is a better basis for resilient societies. But how to build it? Reputation systems are now spreading all over the Web. If properly designed, they could become the basis of a self-regulating societal and market architecture. Further success principles of decentralized self-regulating systems can be learned from ecological and immune systems. They can also be the basis for a trustable Web, which can successfully neutralize harmful actions and contain cybercrime.

Rather than in surveillance technology, governments should invest their money in the creation of self-regulating architectures. This will be crucial for a successful transition to a new era -- the era of information societies. If we take the right decisions, the 21st century can be an age of creativity, prosperity and participation. If we take the wrong ones, we will end up in economic and democratic depression. It is our choice.

[2] The subject is discussed in my essay "Google as God?", see
[4] D. Helbing, Globally Networked Risks and How to Respond, Nature 497, 51-59 (2013), see
[5] D. Helbing, Economics 2.0: The Natural Step towards a Self-Regulating, Participatory Market Society (2013), see