Friday, November 26, 2010

Imagine if tennis had different rules in every country: Cookie Confusion comes to the Continent


A decade ago, European policymakers debated the level of consent required for data protection purposes when a website uses a cookie. Common sense ultimately prevailed. Policymakers realized that an opt-in regime would drive users mad, as every website would be forced to serve up pop-ups asking users to opt in, annoying everyone. Alternatively, websites could just stop using cookies, but that's unworkable in basic technology terms. So, a Directive was adopted mandating an opt-out regime, together with clear notice in privacy policies of the use of cookies. All browsers introduced cookie controls too. After a decade more experience with the Web, rather than seeing more wisdom about it, we're seeing the status quo common-sense approach up-ended by contradictory policy agendas in Europe. So, the question is back on the policy agenda: should interest-based advertising be opt-in, or opt-out?

What are the rules now? The 2002 E-Privacy Directive was significantly changed in 2009 (Directive 2009/136/EC). Specifically, the wording for cookies was modified:
  • Article 5(3): Member States shall ensure that the storing of information, or the gaining of access to information already stored, in the terminal equipment of a subscriber or user is only allowed on condition that the subscriber or user concerned has given his or her consent, having been provided with clear and comprehensive information, in accordance with Directive 95/46/EC, inter alia, about the purposes of the processing...
  • Recital 66 (non-binding): ...Where it is technically possible and effective, in accordance with the relevant provisions of Directive 95/46/EC, the user’s consent to processing may be expressed by using the appropriate settings of a browser or other application.
While non-binding, Recital 66 clearly indicates that the directive does not intend to make cookies opt-in. The guidance from the Commission on this question has, however, been ambiguous.
  • In a June 2010 opinion, the Article 29 Working Party contended that in the case of interest-based advertising, at least:

The Article 29 Working Party is of the view that prior opt-in mechanisms, which require an affirmative data subject's action to indicate consent before the cookie is sent to the data subject, are more in line with Article 5(3). In a reference to consent as legal grounds for processing, the Article 29 Working Party recently confirmed these views: "The technological developments also ask for a careful consideration of consent. In practice, Article 7 of Directive 95/46/EC is not always properly applied, particularly in the context of the internet, where implicit consent does not always lead to unambiguous consent (as required by Article 7 (a) of the Directive). Giving the data subjects a stronger voice ‘ex ante’, prior to the processing of their personal data by others, however requires explicit consent (and therefore an opt-in) for all processing that is based on consent."

  • In a speech in September 2010, Neelie Kroes (European Commissioner for the Digital Agenda) acknowledged the value created by the online advertising sector and signalled that she was “open to all creative ideas” to develop self-regulation that works for the advertising industry. When directly asked, she also said she was "not in favour of opt-in" for interest-based ads.
  • Alexander Alvaro, the MEP who drafted the revised ePrivacy Directive, has written that it “does not require websites to obtain prior consent for cookies to be placed on users’ terminals.”
  • The Commission’s DG Legal Service has not yet expressed an opinion on whether the revised E-Privacy Directive requires explicit opt-in for interest-based ad cookies.
But this is only the start. The scope for confusion will increase exponentially as individual Member States transpose the law. With no clear guidance to Member States, it is inevitable that 27 different national parliaments will diverge in how they transpose these rules. All of which means that global websites will face far more policy and legal confusion in Europe in the years ahead, and users will face very different privacy "protections" across geographies. How all of this is supposed to work in the real world is anyone's guess. Messy and contradictory laws and regulations are nothing new in politics, but if you're an engineer, what are you supposed to code?
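If I had to answer that engineer's question today, the honest answer would be per-country configuration. Here's a minimal sketch of what the divergence looks like in practice; the country codes, cookie values and rule sets are hypothetical, not a statement of any Member State's actual transposition.

```python
from typing import Optional

# Hypothetical transposition outcomes; real ones will vary by Member State.
OPT_IN_COUNTRIES = {"AA"}   # transposed Article 5(3) as prior opt-in
OPT_OUT_COUNTRIES = {"BB"}  # transposed Article 5(3) as opt-out

def may_set_ad_cookie(country: str, consent: Optional[str]) -> bool:
    """Decide whether an interest-based advertising cookie may be set,
    given the user's country and any stored consent signal."""
    if country in OPT_IN_COUNTRIES:
        # Opt-in regime: no ad cookie until the user affirmatively agrees.
        return consent == "yes"
    if country in OPT_OUT_COUNTRIES:
        # Opt-out regime: ad cookie allowed unless the user has objected.
        return consent != "no"
    # 27 national transpositions and no common rule: the engineer's dilemma.
    raise LookupError(f"no known transposition rule for {country!r}")
```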
I always think transparency and user choice are the linchpins of privacy. But a legislated solution which forces people to click like mad on cookie consent pop-ups is hardly the right way forward. At least tennis has clear rules, and they're the same everywhere.

Thursday, September 16, 2010

Privacy: a numbers game?



How do you measure privacy protections? There are many important questions to ask, including these:

What data is collected?
Who has access to this data?
How is this data used?
Is this data transferred to third-parties?
Can the data subject see and control this data?
Is this data protected by adequate security safeguards?
How long is this data retained before it is either destroyed or anonymized?

In reviewing this list, I think the last one is the least important in terms of measuring meaningful privacy protections. But curiously, it's precisely the one I hear most often as I move around Continental Europe, listening to media and regulatory concerns in the online privacy debates of recent years. Why is that?

European privacy law has clear provisions that personal data should not be retained "longer than necessary". Naturally, this time period is left vague in the laws, since it would be impossible to prescribe precise time periods for myriad different contexts, especially since retention must always be justified by "legitimate purposes". I think there's a temptation to try to boil privacy down into something simple and numerical, and what could be simpler and more measurable than a time period? In practice, there's a vast spectrum of legitimate retention periods, even for similar services, when those periods are designed to respect the very different legitimate purposes for which the data is retained. To take some Google services as examples: Search logs (9 months), Instant Search logs (2 weeks), Suggest logs (24 hours), etc. To me, it's absurd to think that the most important privacy issue in Search is whether Search logs are retained for 6 or 9 months.
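To make the purpose-based approach concrete, here's a minimal sketch of how those differentiated periods might be expressed in code. The retention periods are the ones cited above; the record structure and policy names are my own illustration.

```python
from datetime import datetime, timedelta, timezone

# Purpose-specific retention periods, per the examples above.
RETENTION_PERIODS = {
    "search_logs": timedelta(days=270),         # ~9 months
    "instant_search_logs": timedelta(weeks=2),  # 2 weeks
    "suggest_logs": timedelta(hours=24),        # 24 hours
}

def has_expired(log_type: str, created_at: datetime) -> bool:
    """True once a record has outlived the retention period tied to the
    legitimate purpose it was collected for (created_at must be tz-aware)."""
    now = datetime.now(timezone.utc)
    return now - created_at > RETENTION_PERIODS[log_type]

# Records that come back True would then be deleted or anonymized,
# depending on the documented policy for that service.
```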

To take a different example: data retention rules in Europe (for government and law enforcement access) range from 6 months to 24 months, with each country in Europe picking and debating different time periods. Germany for example picked 6 months (but the German Constitutional Court struck down its version of data retention on other grounds), while France picked 12.

Curiously, the time dimension of data retention is almost entirely a Continental European privacy concern. It rarely registers as a meaningful vector in other countries, even in countries with very intense privacy debates. Of course, the euro-time-period debate is also intimately tied up with the debate about the so-called "right to be forgotten", the "droit a l'oubli", a well-intentioned idea that people should somehow be able to have parts of their own past (presumably the disagreeable parts) edited out of their personal histories. And, not coincidentally, this debate is most intense in countries with historical chapters that many people consciously or unconsciously want to forget: like Spanish society's conflict between remembering or forgetting the crimes of the Franco era.

I've spent a fair amount of time engaging in the time-period debate of "how many months is OK?" It gets pretty repetitive after a while. Lots of people who can't be bothered to think about the issues will just say: "oh, that's too long". I strongly believe that personal data should not be retained "longer than necessary", as required by European privacy law, and I generally believe it's an important debate for data controllers to justify their retention according to "legitimate purposes". Beyond that, reducing the online privacy debate to a numbers game risks focusing all the attention on only one aspect of the broader privacy debate (and in my opinion, the least important aspect to boot). And I am very much not in the superficial privacy school of thinking that "shorter is always better".

To clear my head, I spent some time playing tennis this summer. Now that's a numbers game. By the way, I lost.


Tuesday, September 7, 2010

Face recognition software

How should we handle face recognition software?

Every so often a new technology comes along that has the ability to alter fundamentally the private/public balance, with profound implications for privacy. Face recognition is one of them, in my opinion.

We're already seeing highly accurate face recognition software provided by companies like face.com in the Facebook community. Some online photo albums offer it too, as a tool that lets a user tag one photo and have the software come back with face matches and propose auto-tagging those as well.

But what will we do about face recognition software in the wild? Any Internet-connected smart phone with a camera could in theory do a real-time face recognition search on a person walking down the street, without their knowledge, and get web-based search results. Google declined to include face recognition in the version of Goggles that it launched a few months ago, precisely because of the unresolved privacy implications.

Over the last few months, I've spoken about face recognition with a number of privacy experts. Everyone quickly understands how it could be a useful tool, and how it could be a freaky tool, depending on how it's used. But essentially no one has a clue what to do about it. One could imagine a "solution" where users would upload their photos to a company offering this service, with either an opt-in or an opt-out, in other words, telling the company "yes", you can run searches against my photo, or "no", please do not run searches against my photo. In either case, the company has to maintain a central database of these people and their faces. Moreover, the database is essentially a biometric database, since the software runs against algorithmic "face prints". Neither of these "solutions", opt-in or opt-out, seems very palatable. In addition, it's hard to imagine how different countries might regulate such global services according to different standards, if, as one might realistically expect, one country wants to mandate an opt-in model, while another wants an opt-out model, while yet a third wants to prohibit such services entirely. How would that work?
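To see why neither consent model is palatable, consider a minimal sketch of the opt-out variant. Everything here is hypothetical, and the point is the comment in the middle: honoring opt-outs requires keeping exactly the biometric "face print" database that worries everyone.

```python
import hashlib

class FaceSearchService:
    """Hypothetical face-search service with an opt-out registry."""

    def __init__(self):
        # The paradox: respecting opt-outs means maintaining a central
        # biometric database of the very people who wanted out.
        self._opted_out_prints = set()

    def _face_print(self, photo: bytes) -> str:
        # Stand-in for a real biometric embedding. Hashing raw bytes only
        # matches identical files; real face matching is far more powerful,
        # which is precisely what makes the database sensitive.
        return hashlib.sha256(photo).hexdigest()

    def opt_out(self, photo: bytes) -> None:
        self._opted_out_prints.add(self._face_print(photo))

    def search(self, photo: bytes):
        if self._face_print(photo) in self._opted_out_prints:
            return None  # respect the opt-out
        return []  # placeholder for the actual matching step
```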

Well, as we reflect, the technology is developing rapidly, and is already on the market, offered by many different companies. Once again, the technology will evolve faster than our legal, political and sociological response to it. Hang on, this one will be interesting. If you have an idea about how to handle it, I'd welcome your comments, which you're free to submit, anonymously, of course.

Monday, September 6, 2010

Exhibitionism, or Self-Expression?



In privacy circles, we all try to make sure that people are sensitive about what they post online. I remember a chat I had with a journalist at SFGate.com back in 2007:

"Before posting anything online, Peter Fleischer asks himself: Is this something I want to make public forever? ...

he thinks a lot about the implications of sharing information with the world. As a result, in his private life, he takes a cautious approach...

But he's uncomfortable sharing photos online..."


I generally advise people not to post things publicly without thinking about whether they're likely to regret having posted them. I also advise people not to post anything about other people (like pictures or videos) unless those people have agreed to have it posted. But that doesn't mean that I think people should stop posting stuff about themselves and their friends online. In fact, I'm wildly enthusiastic about these social platforms that empower people to publish things about themselves and their friends to the world. The interesting risk-debate is about stuff in a gray zone, where one person's self-expression is another person's exhibitionism. This sort of gets summed up in a question that helps kids understand the consequences of posting things online: "even if you think this photo/video etc. is cool, what will a future employer think about it when you start looking for a job?"

Digital natives are creating a part of their identity online. What they publish, or don't publish, is a self-created, highly edited version of their "identity" that they'd like to project. Digital natives are used to seeing lots of stuff about themselves and their friends online. The older generation isn't. So, rather than a technology clash, this strikes me more as a classic generational clash. The older generation warns the younger generation about putting too much of themselves out there, because, well, they never did, didn't have the opportunity, and no one in their generation did either. Perhaps that's why some people are calling the younger crowd Generation Xhibitionists.

Curiously, every time I've done an image search on my own name (and hey, regular "vanity" searches on your own name are an essential part of privacy hygiene, to know what's out there about yourself), I see a highly-ranked image search result of a guy in a bathing suit...who isn't me. Since I'm a believer in the principle that the best answer to bad speech (or bad content) is to confront it with better content, I figure I might as well post a picture of myself in a bathing suit too. The other guy is younger and better-looking, but hey, at least this is me. And to all those people who say I'm never willing to share anything personal online, well, call me Gen X.

Sunday, September 5, 2010

10 paths and they're all hard



We spent a couple days on mountain bikes in Switzerland recently. We got lost a lot. We didn't use GPS or geo-location-apps. We didn't really know where we were going, but we sort of had faith in our legs and our bicycles that we'd somehow get up and back down.

It was good to get out on a mountain. It clears my head. I was trying to think of the big privacy challenges this year.

And like choosing a mountain path that you don't know, these privacy challenges may turn out to be easy, or they may turn out to be the hardest ride of your life.

Here's my list of this year's cliff-hangers. And like any good cliff-hanger, I'll be back to comment on all of them in the months ahead.

1. Location: who should know where you are and where you've been and how can you control it?

2. Face recognition: how to enable useful apps without creating a mass surveillance device?

3. Data minimization: can we (or should we) restrict some data collection in the age of data ubiquity?

4. Notice and consent in machine-to-machine processing: e.g., how can a user meaningfully exercise control and consent when apps instantly share data?

5. Communicating with end users: everyone agrees privacy policies aren't human-friendly, but does anyone have a better idea?

6. Social graph: what can algorithms know or deduce from your public social graph and what can you do about it?

7. Online mapping: what's private in a public place?

8. Droit a l'Oubli: can a line be drawn between "forgetfulness" and censorship?

9. Conflicts of laws: how can sites on the global web comply with conflicting rules from country to country, and is the global web balkanizing?

10. Anonymization: in the age of data mining, what is "anonymous", or is everything somewhere on a spectrum to identifiability, and what does that mean for privacy practices?

Saturday, July 31, 2010

Policy Frameworks for Protecting Privacy in the Cloud


I had the privilege of sharing a podium in Dublin last week at the Institute of International and European Affairs with the Irish Data Protection Commissioner. We were invited to discuss policy frameworks for protecting privacy in the Cloud. The talks are posted at the IIEA's site.



Monday, June 21, 2010

Berlin, and its ghosts


I'm back from another few days in Berlin. As usual, I met some political leaders to talk about privacy. I also took a personal side trip to visit the villa where the Wannsee Conference took place in 1942 (the infamous "final solution" conference). The German privacy debate, which I think is the most intense in the world, simply makes no sense to my ears without the backdrop of Germany's two totalitarian traumas in living memory. Privacy is always a cultural concept, and it varies from country to country, based on history and self-perception. Hardly any country, thank heaven, has Germany's history.

Even so, it was a bit of a surprise when I heard a political leader tell me clearly: "in Germany, we want innovation, but we want you to ask for permission first". Innovation and permission. In fact, I wonder if the pair is an oxymoron. I think of innovation as serendipitous, almost the opposite of bureaucratic/political process. But in a nutshell, there it was. I sensed the frustration of politicians and regulators who want (or feel the responsibility) to regulate the profoundly disruptive phenomenon of Internet innovation, but feel dis-empowered to do so. It's hard indeed to control a phenomenon like innovation on the Internet, especially if it happens outside your borders. You can't grab the Internet by the ears and shake it, but you can grab one guy, or one company, and shake them hard.

Innovation requires you to take risks, to try new things, to accept failure, to iterate, and to move on. All of these depend on a culture that accepts novelty and failure as necessary learning steps on the way to success. "Launch and iterate" has become the innovation model for the Internet. Some people and countries are more comfortable with that than others, perhaps for very valid historical and cultural reasons. As one Berliner told me: "of course Americans think differently about privacy...so would we if we had had two centuries of stable democracy."

At the Wannsee Conference villa, the Nazi officials spent a lot of time discussing how to deal with "mixed race people", categorizing each permutation of people like me with one Jewish grandparent into a box. I saw the memo that clarified how I would have been classified as a "second-degree mongrel", with a full catalog of the legal "rights" to which I was entitled. I think of my dad, "a first-degree mongrel", who amazingly lived in Berlin throughout those years. I have lots of pictures of him as a little boy, in the early 1930s, heading off for his first day in school, petting a tiger cub in the Berlin zoo, with his dog. But then nothing, not a single picture, no record at all, for the next decade.

There's a lot of debate about the potential evils that the Internet might enable in the future, as vast amounts of data are retained and publicly available. Those issues are serious, indeed, and I can't get my head around them. Many of the people who argue most passionately about the need for a "right to be forgotten" on the Internet are thinking about these potential evils. But at the same time, so much information also has a disinfectant quality for people who believe in free speech and transparency. There are no records that I can find of that missing decade of my dad's life. In many ways, I'm more a supporter of a "right not to be forgotten" than the opposite.

I doubt the horrors of Wannsee would have been possible in the age of the Internet. Imagine Anne Frank writing a daily blog. Or the Wannsee Conference proceedings leaked onto YouTube. Or maybe I have it all wrong, and the future will cook up evils using the same technologies that seem so benign to me now. I walk around Berlin shaking my head in incredulity, no matter how often I've been. I can understand the intense urge there to forget. Surely, that influences the concept of "privacy" too.

Friday, May 7, 2010

Which privacy laws should apply on the global Internet?

Given the nature of the Internet, all web services are inherently global. All companies doing business on the Internet rely on the collection, storage and analysis of information generated by users, and all of them are confronted by the lack of consistency in the applicability and content of privacy laws across jurisdictions. So, I’ve struggled with the following three questions:

What are the current rules establishing the application of privacy laws around the world?

Do the current rules work?

How could we create clearer rules, to provide greater consistency and certainty?

There are three different jurisdictional approaches to determining the applicability of privacy and data protection laws around the world.

1.1 Location of the organization using the data

This is the principle under Article 4(1)(a) of the EU Data Protection Directive, which looks at the place of origin of the organization that makes decisions about the uses of the data and determines the applicability of the law on that basis. This approach is also used in Canada, where the Federal Personal Information Protection and Electronic Documents Act (“PIPEDA”) controls the collection, use and disclosure of personal information in the course of the commercial activities of organizations that are federal works, undertakings or businesses.

In both cases, the law applies to an organization established in that particular jurisdiction irrespective of where in the world the actual processing takes place. In the EU, an organization established in several EU countries must take the necessary measures to ensure that each of these establishments complies with local law obligations. Under PIPEDA, Canadian entities transferring data outside the country must have provisions in place to ensure a level of protection comparable to that granted by the law.

1.2 Location of the people whose data is being used

This is typically the US approach under the Federal Children’s Online Privacy Protection Act (“COPPA”) and the data breach notification laws enacted by the majority of individual states. For example, COPPA applies to operators of websites directed at children within the USA, while a serious data breach affecting a Californian resident must be notified to that person irrespective of who is responsible for the data or where the data breach occurred. This is also the approach in the laws of other jurisdictions like Australia and New Zealand, where certain provisions apply in respect of Australian citizens and New Zealand residents respectively.

1.3 Place where the actual processing happens

The EU Data Protection Directive relies on this approach in Article 4(1)(c) to claim jurisdiction on the basis of the use of equipment situated in the EU where the organization is not located in the EU. Many other jurisdictions around the world follow this approach, like Argentina (i.e. law applies to any processing in the national territory), Israel (i.e. law applies to acts that occur in Israel) and even new laws like South Africa’s Protection of Personal Information Act which follows the EU Article 4 model (i.e. law applies both to when a party is domiciled in South Africa and when not domiciled but using means situated in South Africa).
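Stacked together, the three tests behave like a union, not a choice. Here's a minimal sketch of the combinatorics; the country codes are illustrative, and this is a sketch of the problem, not legal advice.

```python
def applicable_regimes(org_countries, user_countries, equipment_countries):
    """Union of the regimes claiming jurisdiction under tests 1.1-1.3."""
    return (set(org_countries)            # 1.1 where the organization sits
            | set(user_countries)         # 1.2 where the data subjects are
            | set(equipment_countries))   # 1.3 where the equipment/processing is

# One French-established site, with Californian users, on US servers:
regimes = applicable_regimes({"FR"}, {"US-CA"}, {"US"})
print(sorted(regimes))  # ['FR', 'US', 'US-CA'] -- three regimes, one service
```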

As a result of the different approaches mentioned above (which are often combined - as in the EU), organizations using the Internet, multinational organizations and those engaging global service providers find themselves caught by the laws of many different jurisdictions. Examples of the practical problems caused by this include the following:

2.1 Multinational operations

Multinationals with established operations in many parts of the world face different rules affecting each subsidiary or affiliate. Since there is no international consistency determining the content and obligations under data protection and privacy laws, to be compliant a multinational must review the specific obligations under local law in each case. This is even the case within the EU despite the fact that EU data protection law at a local level emanates from the same source – the EU Data Protection Directive. The result is that a global company seeking to develop a consistent approach across all of its operations is required to create a tailored solution for specific jurisdictions according to the quirks of local law. This is not simple for companies operating standardized global web services.

Internet businesses that transact with individuals based in jurisdictions that claim jurisdiction when their citizens’ or residents’ data is being used will find themselves subject to laws that bear no connection to the place of establishment of that business. For instance, an EU-based internet business should be alert to any customers who are Californian residents, since Californian data breach laws apply to an organization wherever it is located. Internet businesses must therefore anticipate the application of laws with which they have no real connection. Alternatively, an Internet business might consider putting in place a defensive measure to ensure that it does not transact with individuals from those jurisdictions, to protect itself from the application of foreign laws, but that approach violates the spirit of the open global Internet.

2.2 Use of equipment

Relying on the use of equipment in a particular jurisdiction (perhaps including the computers of end users) to determine the application of the law could mean that the laws of every single EU Member State apply to every website operator in the world that uses cookies to gather browsing-related information. This result follows from the interpretation of the scope of ‘equipment’ under EU law and the view of EU regulators that website operators that place cookies on a user’s computer based in the EU, without the control of the user, make use of equipment in a way that is caught by EU law. This shows that relying on ‘equipment’ to establish jurisdiction is unworkable.

2.3 Cloud computing: where the processing happens

Cloud computing is directly affected, because the dynamic nature of the practice is at odds with an approach based on where the actual processing happens. Part of its agile functionality is that cloud computing can switch processing from one location to another so that customers are provided with an efficient, affordable and consistent service. Where the processing of data switches in this way, it could have the knock-on effect of changing which law applies to the processing, introducing uncertainty.

2.4 Cloud computing: where the equipment is located

Another problem for cloud computing is that if the servers of the service provider are based in Europe, any overseas customer could be subject to EU law. Due to the structure of cloud computing technology and the network of servers that are used to deal with demand, a customer based outside the EU may find their data being stored on an EU server. Consequently, under EU rules the equipment (i.e. the server) is located in the EU and EU law applies even though the customer has no other connection with the EU.

Current models for determining the application of privacy law present complicated problems and unintended consequences which are unsuitable to deal with the changing pace of technology and the realities of global business. It is vital that more appropriate and flexible ways are found to address the practical problems created by the different jurisdictional approaches. Alternative approaches could include:

3.1 International privacy standards

The most obvious way of resolving the conflicts created by the different regulatory regimes would be to have just one global privacy regime. The initiative led by the AEPD and approved in Madrid during the International Privacy Commissioners’ Conference is a step in that direction. The initiative recognises that the current approaches in reality provide less protection for individuals and more complexity for businesses.

3.2 Treaty dealing with conflicts of law

As with other areas like contractual disputes, there could be an international treaty setting out which law would apply in the event of a potential conflict. Establishing such a treaty would help to provide certainty for businesses and individuals when situations of conflict arise.

3.3 Country of origin and accountability principle

A key rule to be established by an international treaty would be to apply the law of the country where the main operations reside (e.g. place of establishment of parent company, HQ, etc.) and make the provisions of that law follow the use of the data globally. Following a country of origin principle would bring data protection rules into line with the underlying principle governing e-commerce in the EU. Furthermore it would allow businesses to develop a coherent and consistent global compliance framework to deal with customers on the same terms wherever a customer is located. Adopting a consistent approach would also encourage greater accountability as the business would adopt one defined standard.

3.4 Voluntary submission to one regime

Governments and/or regulators could agree to allow organizations to choose one lead jurisdiction (based on objective, pre-established criteria). In the context of the EU, this is certainly viable as demonstrated by the "lead regulator" concept used in the area of Binding Corporate Rules applications. By submitting to one lead regime or jurisdiction, the organization would then abide by the rules of that regime enabling the business to be certain which law applies to its operations.

Thursday, April 22, 2010

Transparency: now for government requests too


This is my personal blog, and I try hard to keep my Google work-life out of it. I try to resist the temptation to turn this into a running daily diary of privacy at Google, since that would be a different blog. But sometimes, Google launches something that is so important in privacy terms that I can't resist some personal comments.

The most recent launch answers a basic question: how many requests does Google get from governments for user data? Take a look at the map and the country-by-country data.


This is an important step on the road to transparency. Users should be able to see their own data. And they should be able to get maximum information too about who else can see their data, including, perhaps more important than anyone else, governments. I haven't seen any other company provide this level of transparency. Hopefully some others will be inspired to do this too.

Wednesday, April 21, 2010

The data deluge


One of the most provocative things a privacy geek can say is "data minimization is dying". Data minimization has been one of the foundations of traditional privacy-think. The idea is basic and appealing: privacy is better protected when less data is collected, when less data is shared, when data is kept for shorter periods of time. This explains the endless debates in privacy circles about how many months computer or phone logs or passenger-name records should be retained, as though a numbers game about retention was the key issue in privacy. It isn't, but a debate over numbers is simple and appealing, and can be relayed by the press in a simple manner.

But whether you like it or not, we're entering an age of data ubiquity. Clearly, technology trends are making this possible: computing power, storage capacity, and Internet transmission have all allowed this to happen. And like all trends in technology, it will have good and bad applications: the same ease of transmission that enables billions of people to access information from around the globe makes it easy to transmit malicious viruses as well.

Statistics about the scale of the data deluge are indeed sobering, even if they reflect scales that human brains can't really understand. There are over a trillion web pages now, growing by billions per day. I read that there are now over 40 billion photos on Facebook alone. YouTube users upload over 24 hours of video every minute. The Economist reported that the total amount of data in the world is growing by 60% per year. No matter where you turn on the web, the scale of data growth is stunning. Even if you find concrete steps to advance data minimization, you're just taking a few drops out of the ocean of the data deluge.
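One piece of back-of-the-envelope arithmetic makes The Economist's figure vivid: at 60% annual growth, the world's data roughly doubles every year and a half.

```python
import math

annual_growth = 0.60  # The Economist's reported yearly growth rate
doubling_time = math.log(2) / math.log(1 + annual_growth)
print(f"doubling time: {doubling_time:.2f} years")  # ~1.47 years
```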

There's no doubt that the Information Age is doing a lot of great stuff with this data deluge. It's also true that this data deluge is posing unprecedented challenges to privacy. I've struggled with this conundrum for many years. I don't think there's a better solution than trying to create maximum transparency and putting control over data back into people's hands, as best as possible. Trying to stop the data deluge is either Sisyphean or chimerical. But trying to decide on behalf of people also undermines the fundamental dignity and choice that each individual should be able to exercise over his/her own data. Of course, not all people can or will exercise responsible control over their own data. But putting transparency and control into users' hands is much like democracy. It fundamentally empowers the individual to make choices and trade-offs about data: making choices between data benefits and privacy. It's not perfect, of course, but it's still better than putting someone else (like governments or companies) in charge of those decisions. I think companies, governments and privacy professionals should define success foremost by whether we contribute to putting people in charge of their own data. As Churchill said: "It has been said that democracy is the worst form of government except all the others that have been tried."

Thursday, April 15, 2010

To tweet or to delete?

How would you resolve the conflict between the cultural imperative to archive human knowledge and the privacy imperative to delete some of it? To put this in perspective, compare the approaches of the US Library of Congress and the French Senate.

As reported by The New York Times, "the Library of Congress, the 210-year-old guardian of knowledge and cultural history, ...will archive the collected works of Twitter, the blogging service, whose users currently send a daily flood of 55 million messages, all that contain 140 or fewer characters."

Meanwhile, the French Senate is moving in the opposite direction, as it explores a law to legislate "the right to be forgotten". The French Senate has been considering a proposed law which would amend the current data protection legislation to include, among other things, a broader right for individuals to insist on deletion of their personal information. The proposed law in France would require organisations to delete personal information after a specified length of time or when requested by the individual concerned.

To take another example, this time from Germany: a court there was recently asked to consider a legal action by two convicted murderers (now released from prison) seeking to force Wikipedia to remove their names from an article documenting their criminal past. While the case is ongoing (as far as I know), the German-language version of Wikipedia has agreed to remove the names from the article in question. The two men are now seeking to force the Wikimedia Foundation to delete their names from the English-language version as well.

Well, I think we'll be blogging and tweeting about this dilemma for some time, knowing that our tweets will be archived. I testified to French Senators recently that I could never support a privacy "right to be forgotten" that amounted to censorship. I wonder if they tweet in the French Senate, and if they know their tweets are being archived in the US Library of Congress?


Which photos reveal "sensitive" personal data?


There are hundreds of billions of photos and videos online now. As a matter of common sense and common courtesy, users should not upload pictures or videos of other people to hosting platforms without their consent. Moreover, users should take particular care when uploading photos which might reveal "sensitive" personal data.

Privacy laws provide lots of extra legal protections to "sensitive" personal data. Trying to define what is "sensitive" is no easy task. The EU Data Protection Directive uses this definition: "personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, trade-union membership, and the processing of data concerning health or sex life."

But what is "sensitive personal data" in the context of photos or videos? In one extreme logical sense, any photo of a person reveals "racial or ethnic origin". A picture of my face reveals that I am a middle-aged Caucasian male of European descent, revealing my racial or ethnic origin, as well as the fact that I usually wear glasses, indicative of the health issue of myopia. Does that mean that every photo or video of a person should be treated according to the legal standards of "sensitive personal data"? Most people would assume that is neither possible nor desirable, since it could require the explicit consent of data subjects (in writing, in some countries, and subject to prior approval by the DPA, in other countries) before their photos could be uploaded to the web. Clearly, this is not the way that the web works today, and indeed it would be completely unworkable.

I've discussed this issue with many people, in particular in the context of photos taken on public streets. Some privacy regulators have shared their (rather extreme) opinion with me that a photo or video of someone sitting in a wheelchair, or even someone walking in the vicinity of a hospital, should be treated as "sensitive", since it might reveal "health" status. Similarly, a photograph of a person appearing on a street near a mosque should be treated as "sensitive", since it might reveal "religious beliefs". But it's hard for me to imagine a crude solution like drawing a no-photograph zone around mosques and hospitals. It also seems wrong to me to apply the legal standard of "sensitive" personal data to situations which merely increase the likelihood of associations. So, many people take a more nuanced approach. A photo or video often lacks the context to make it meaningful: a photograph of myself in front of a cathedral doesn't automatically mean that I'm Catholic, and isn't necessarily revealing "sensitive" personal data. A photograph of people praying there maybe is. But does the fact that such photos are taken in a public place, and are widely considered banal, change the analysis of whether they should fit into the more restrictive categories of "sensitive" personal data?

All in all, it's very hard to know where to draw the lines. Hopefully, people who take photos and videos will be respectful of the very serious issues that the legal concept of "sensitive" personal data is meant to protect. But the lines separating "sensitive" from "normal" personal data will usually be fuzzy and contextual. Think of the simple example of a photo of two people holding hands. Is this indicative of their sexual orientation, and hence "sensitive" personal data, or really just two people holding hands? I suppose it depends on the context. This is not something that photo or video hosting platforms or software filters are able to know. Ultimately, this is all about protecting people's human dignity, and that, fundamentally, is a human judgment.

Thursday, March 18, 2010

Privacy Audits


In theory, privacy audits are a sensible and useful thing. Regardless of whether they're conducted internally or externally, they can provide insights into data handling systems, identify shortcomings, and help prioritize resources. They can provide external, independent validation of compliance with privacy laws and contractual commitments. And they can be a useful source of transparency. Sometimes, they're even mandated by privacy law, e.g., in some controller-processor outsourcing arrangements under EU data protection rules. Considering how many good reasons there are to conduct privacy audits, it's a bit of a mystery to me why there isn't more of an industry to provide them. Indeed, if you were looking to hire external experts to conduct privacy audits, and if you asked me for a recommendation, well, I'd be kind of stuck to give you a name. I've asked a bunch of my peers at other companies too, and privately, they're stumped too.

Lots of people purport to be able to do privacy audits. Law firms, accounting firms, and consulting firms are all ready to sell this service, at sometimes astronomical cost, but in practice, if you ask around amongst people who have tried to hire them, you often hear complaints about high-priced pay-as-you-learn tutorials for junior professionals. There are also a few "low-cost" versions floating around, but they are often rudimentary checklists (e.g., "do you have a written privacy policy in place? yes, check!"). There must be more room for the happy middle ground between the super-high-cost customized audit and the self-audit checklist models.

So, here's a business idea. Why don't some enterprising people work to establish a privacy auditing business, combining some deep technical understanding with process rigor, offer the service at a competitive cost, and help fill a vacuum? Almost everyone in the profession whom I know agrees that privacy audits are, in theory, a useful tool for privacy hygiene, but in practice, it's hard to find the right level of professional service.

There seems to be a clear market failure here. Over time, surely, the idea of privacy audits will become more integrated into good privacy practice. Whoever figures out how to provide this service well will be contributing to the privacy profession and will probably end up making a lot of money. Good luck!

Wednesday, March 10, 2010

A new chance to get the Working Party to work better?

I'm delighted to see a new Chairman, Jacob Kohnstamm, assume the helm at the Working Party, the group of all of Europe's national Data Protection Authorities, created to try to achieve common approaches to privacy across Europe. Mr. Kohnstamm is a privacy leader whom I've known for years, and whom I greatly admire, even when we find ourselves on opposite sides of a debate. I'm confident he'll bring new leadership and relevance to the Working Party. I also think it's healthy for European institutions to break away from the alternating Franco-German leadership that has so dominated the Working Party over many years.

As a privacy professional, people sometimes ask me why I take the Working Party seriously, and why I would want to see it play a greater role in privacy matters in Europe. The answer is simple: with all its institutional flaws, any body that contributes to more harmonized data protection across Europe is better than the alternative of 27 different approaches and inconsistent cacophony. Since the Working Party is the best instrument we've got in Europe to try to do things in a coherent way, I think it's worth taking a moment to make suggestions about how it could work better. My comments are strictly focused on only one aspect of its role, namely, the extent to which it interacts with the private sector in a semi-regulatory context. My critiques are offered in a spirit of constructive feedback.

So, what are the key issues that deserve attention to make the Working Party work better in the future?

Public Transparency: the Working Party operates behind closed doors. It rarely involves outsiders in its deliberations. It almost never publishes draft opinions for external review, and rarely (if ever) opens its meetings to the public. As far as I know, it never publishes the range of consenting/dissenting views with its opinions, and it publishes little more than a summary agenda and adopted Opinions. I strongly believe that transparent government is good government, and the Working Party is simply not transparent today.

Accountability and Review: the Working Party's opinions are not "binding" and therefore have never, to my knowledge, been subject to judicial review. Sometimes Working Party opinions make sense, sometimes not. Sometimes they're insightful, sometimes they're gibberish. External, objective, academic, technical, maybe even judicial review, is much needed.

Technical expertise: The Working Party has many times embarked on issues which turn on Internet technical architecture. There is not enough technical expertise at the Working Party level, which is unsurprising, considering that the members generally come from political or administrative backgrounds. But to have well-informed discussion about Internet regulation, a foundation of technical knowledge must be in place, or must be provided from the outside.

Confidentiality: To deal with confidential business matters in a semi-regulatory context, any regulatory body needs to be able to respect business secrets submitted to it. Maintaining confidentiality has not been a strong point of the Working Party, given that its documents are routinely distributed amongst 27 countries. But leaks damage the ability of the Working Party to be effective.

Speed: In tech circles, things move fast. This is an innovation business, after all. But a discussion with the Working Party can often take years, with rather stilted exchanges of letters, each exchange punctuated by multi-month pauses. Surely, there must be a faster, less formalistic, way to collaborate.

All in all, these critiques are meant to be constructive. I think privacy would be well-served by a more realistic and collaborative dialogue between the Working Party and industry. The old Working Party made some progress, but there's room for more. I'm hopeful about the future.

Friday, March 5, 2010

Billions of photos online, billions of privacy offenders?


With the proliferation of Internet platforms for user-generated content, people are increasingly seeing examples where one person's right to freedom of expression may infringe someone else's right to privacy, and vice-versa. If I upload my holiday pictures to the Internet, taken from a public place, and if they capture you lounging by your pool, does my freedom of expression trump your right to privacy, or the other way around? Whatever you think, there are already billions of such photos online and publicly accessible.

Both freedom of expression and privacy are fundamental human rights. But those rights are not both equally enforced, protected or policed. There are literally thousands of data protection bureaucrats in Europe whose job is to enforce European data protection regulations. As far as I can tell, there is not a single government official in all of Europe whose sole job is to do the same for freedom of expression. Curious, no?

As I go to privacy-centric conferences where people invariably talk about the problems and risks of social networking sites, I'm often the odd guy out who seems to think that they're also precious platforms for freedom of expression. Lots of guys in power lecture about how lives or careers or futures are jeopardized by a single embarrassing photo posted to a platform.

Well, I'm not so sure. I was thinking about what this guy showed when he was young, and he just got elected Senator, so maybe things are changing.

A privacy regulator in Europe told me the other day that he thought it was a data protection violation for anyone to post a photo online if it captured someone's face or property without their consent. I asked him whether he thought this restricted the right to freedom of expression. He didn't seem to understand the question.

Tuesday, March 2, 2010

Grazie! for your support


I'm thinking about Italy a lot these days. Many of you have expressed your support, and I'm gratified by your concern and your solidarity.

I see this case has prompted an important debate and passionate expressions of support for the principles of freedom of expression that I have always felt are at stake in this prosecution. We'll get the Judge's written opinion within 90 days of last week's verdict, so probably around mid-May. Until then it's hard to speculate about his precise legal reasoning, even if the implications of this conviction are already being widely discussed in terms of the potential liability of employees working for Internet platforms that host user-generated content. As for me, I'm not really at liberty to comment much publicly, because anything I say about it can be (and has been!) used against me.

Many thanks to you, my many friends in the privacy community who have reached out to me. Grazie!

Wednesday, February 24, 2010

Today's astonishing verdict in Milan

Google has already reacted to today's astonishing verdict in Milan. I'd like to add a few personal words.


I will vigorously appeal today's verdict in Milan. The judge has decided I am criminally responsible for the actions of some Italian teenagers who uploaded a reprehensible video to Google Video. I knew nothing about the video until after it was removed by Google in compliance with European and Italian law. I was very saddened by the plight of the boy in the video, not least because I have devoted my professional life to preserving and protecting personal privacy rights. Despite this, a public prosecutor in Milan has spent three years investigating, indicting and successfully prosecuting me and two other Google colleagues.


This ruling also sets a very dangerous precedent. If company employees like me can be held criminally liable for any video on a hosting platform, even when they had absolutely nothing to do with the video in question, then our liability is unlimited. The decision today therefore raises broader questions about the continued operation of the many Internet platforms that are essential foundations of freedom of expression in the digital age. I recognize that I am just a pawn in a larger battle of forces, but I remain confident that today’s ruling will be overturned on appeal.

Monday, February 22, 2010

Austrian insights

I've been thinking about the conundrum of trying to fit all of the world's data into two rather arbitrary black-and-white categories: "personal" data or "non-personal" data, or personally identifiable information and non-PII if you prefer. The reason we're all trying to do this is that most of the world's legal regimes create these two categories, and only these two categories, even though it's obvious that many things sit uncomfortably in the gray zone between them. The big privacy debates generally turn on these gray-zone categories, which identify some things about an individual (e.g., speaks Spanish) but don't identify an actual human being. Think of the privacy debates around IP addresses, cookies, RFIDs, etc., and you see that the debates can't be settled using only these two categories.

I think the way forward is the creation of a third category, something we could call "indirectly identifiable data". Interestingly, Austrian law has already done that. Here are some insights into the Austrian Federal Act concerning the Protection of Personal Data (Datenschutzgesetz 2000). Under Austrian law, data is "only indirectly personal" for a controller, a processor or a recipient of a transmission when "the data relate to the subject in such a manner that the controller, processor or recipient of a transmission cannot establish the identity of the data subject by legal means". In other words, the identity of the individual can be retraced, but not by legal means.

When introducing the concept of indirectly personal data, the Austrian legislators referred, on the face of the bill before Parliament, to Article 2(a) of the Directive and, in particular, to the phrase ‘…an identifiable person is one who can be identified, directly or indirectly…’. This suggests that a deliberate decision was made to distinguish between persons who can be identified directly (to whom the full force of the Austrian law applies) and those persons who can only be identified indirectly – hence the concept of indirectly personal data. In the eyes of the legislators, indirectly personal data did not require the full range of protection that directly personal data required. The legislators may also have weighed commercial and practical reasons why requiring organisations to treat indirectly personal data in the same way as directly personal data made no sense.

Here is how I've been told Austrian law treats indirectly personal data:

Section 8(2): Use of only indirectly personal data shall not constitute an infringement of the fundamental interest in secrecy that deserves protection under s. 1(1).

Section 9(1)(2): Use of sensitive data does not infringe interests in secrecy deserving protection only and exclusively if the data are used only in indirectly personal form.

Section 12(3): Transborder data exchange shall not require authorisation if the data transferred or committed are only indirectly personal to the recipient.

Section 17(2): There is no requirement to notify the Data Protection Commission where the data application contains only indirectly personal data.

Section 24(4): There is no duty to provide information to data subjects when collecting data where such data is not subject to notification under s. 17, which includes the use of indirectly personal data.

Section 29: The rights granted under ss. 26–28 (s. 26: right of access; s. 27: right of rectification/erasure; s. 28: right to object) cannot be exercised insofar as only indirectly personal data are used.

Section 46(1): For the purpose of scientific or statistical research projects where the goal is not to obtain results in a form relating to specific data subjects, the controller shall have the right to use all data that are only indirectly personal for the controller.

Section 46(5): Where the use of data in a form which permits identification of data subjects is legal for purposes of scientific research or statistics, the data shall be coded without delay so that the data subjects are no longer identifiable, if specific phases of the scientific or statistical work can be performed with indirectly personal data only.


All of this is interesting, because I think privacy law will never adapt to the nuances of the real world if the entire real world has to be fit into only two black-and-white categories. Finding a legal category to deal with the gray zone is essential to getting privacy laws right, and the Austrian model is one of the most promising I've seen.
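To show how a third category changes the shape of a compliance rule, here's a minimal sketch. The category names follow the Austrian model; the two rules paraphrase s. 17(2) and s. 29 from the list above, and the simplification is mine, not a statement of the law.

```python
from enum import Enum

class DataCategory(Enum):
    DIRECTLY_PERSONAL = "directly personal"      # full force of the law
    INDIRECTLY_PERSONAL = "indirectly personal"  # reduced obligations
    NON_PERSONAL = "non-personal"                # outside the regime

def dpa_notification_required(category: DataCategory) -> bool:
    # Paraphrasing s. 17(2): no notification for applications that
    # contain only indirectly personal data.
    return category is DataCategory.DIRECTLY_PERSONAL

def subject_rights_apply(category: DataCategory) -> bool:
    # Paraphrasing s. 29: access, rectification/erasure and objection
    # (ss. 26-28) cannot be exercised over only indirectly personal data.
    return category is DataCategory.DIRECTLY_PERSONAL
```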