Tuesday, November 05, 2013

The Greatest Enemy of Privacy Is Ambiguity

Superb article.

---------

The Greatest Enemy of Privacy Is Ambiguity
If we can't define what it means to be left alone, don't be surprised when the government comes knocking.
J.M. Berger
31/10/2013
Foreign Policy Mag

The details of who is listening to what and who knew may be decidedly unclear, but it's hard to escape the clamor over one of our most cherished possessions: privacy.

But that word, and the concept behind it, is fluid, subjective, contextual, and self-referential -- shifting sands that have come to the forefront in our tortuous public debate about NSA surveillance.

In their formative years, most people discover the dictionary game -- in which you look up a word and then look up all the words used in the first word's definition.

Properly executed, this exercise teaches us something important about the power and challenge of language -- how some of our most important words drift into meaninglessness because their definitions are circular, while others bloat into unmanageable complexity due to the number of additional, definable terms needed to describe them.

Depending on what dictionary you choose, the word "privacy" has multiple definitions, and the border between one and another is not always clear. Random House provides the following three definitions that are relevant to our current national drama:

1. the state of being private; retirement or seclusion.
2. the state of being free from intrusion or disturbance in one's private life or affairs: the right to privacy.
3. secrecy.

"Intrusion" and "disturbance" are highly subjective, and privacy as the state of being private or in reference to one's private affairs immediately invokes the circular problem. As an adjective, "private" offers up five more definitions, four of which are relevant to this discussion:

1. belonging to some particular person: private property.
2. pertaining to or affecting a particular person or a small group of persons; individual; personal: for your private satisfaction.
3. confined to or intended only for the persons immediately concerned; confidential: a private meeting.
4. personal and not publicly expressed: one's private feelings.

There is a long road from "belonging to a particular person" to "personal and not publicly expressed." The definitions of intrusion and disturbance take on a similarly wide range. We don't have to take this game to its invariably frustrating conclusion to grasp the point.

Countless volumes have documented the history of privacy as a concept, from open-air latrines to individual bedrooms. The idea of privacy many of us grew up with -- the proposition that people are entitled to locks on a door -- has few precedents in the long history of mankind, and it has mutated considerably in my lifetime.

Locks are devices that protect our physical selves, our belongings, and our "privacy" -- the idea of secrecy pertaining to what actually goes on behind those closed doors. But within a generation, many of our most cherished secrets have migrated from the physical realm to the immaterial.

Our secrets are now represented as data, in an unfathomably complex infrastructure of servers and lines of communication (and we didn't need the NSA leaks to let us know that the locks on those doors are not deadbolts).

But secrets are no longer something we scrupulously hoard. They have become currency. We barter personal data for convenience on a near-continual basis, whether by using a bank card, providing an email address in exchange for a coupon, applying for a mortgage, managing a Facebook page -- or supporting a phantasmagoric government mandate of zero-tolerance for terrorism.

The ethereal web of our intertwined personal currencies has now reached a tipping point, and it's time to start making tough choices.

Unlike paper money, each datum we spend is not a discrete, anonymous unit. It is collected, stored, aggregated, and analyzed to create a set of increasingly accurate inferences about personal thoughts and preferences that people never intended to disclose.

The debate is arguably already out of control, the product of a media and social media machine that chews up issues, spits them out in wads on the sidewalk, and proudly points at them, saying "Look what I did!"

But it's still worth trying to nail down concrete definitions that can enable us to talk about exactly what it is that we want and what we fear.

The most relevant definition of "privacy" at play in current events -- debates over NSA surveillance, stop-and-frisk, the NYPD's massive intelligence apparatus, and the FBI's widespread use of informants -- is this: "the state of being free from intrusion or disturbance in one's private life or affairs," with "private" defined as "personal and not publicly expressed."

This sets up three basic questions we need to answer, just to get started:

1) What is public expression?
2) What is intrusion?
3) What is disturbance?

Let's look at each in turn.

Public Expression

Almost everyone agrees that our personal information -- whether thoughts, views, or phone numbers -- is entitled to some sort of protection. But we engage in a wide variety of transactions in which the line between public and private is ambiguous or inconsistent.

The cell phone is the exemplar of the spiraling complexity of this question. Although it is by no means the only place where privacy transactions occur, the tradeoffs one accepts in carrying and using a cell phone in a typical manner are breathtaking.

Your cell phone number is "private" by default -- which is to say unlisted. This protects you from the most gross and unsubtle level of intrusions and disturbances -- unsolicited calls and texts from total strangers.

But you continually barter your cell phone's information in exchange for services, and the information you offer is increasingly comprehensive.

For starters, your service provider unavoidably has access to your number, which means they can (and do) call and text you with sales pitches, surveys, and billing reminders. That's the tip of the iceberg.

If you keep GPS and WiFi turned on, enabling a host of services, your carrier has access to your precise location at virtually all times -- and not just longitude and latitude. This data is easily and automatically correlated to the kind of places you visit -- what grocery store you use, which restaurants you visit, what bookstores you patronize.

If your email and social media accounts flow into the phone, that's a whole new level of information for barter: the books you order online, the news sources on which you rely, who your friends are, where they live, and their phone numbers, email addresses, and social media accounts.

Each piece of data by itself is relatively limited in what it says about you, and each feels innocuous when you give it up. Of course, you provide an email when ordering from Amazon.com. Why wouldn't you? And checking Facebook or Twitter on your phone is just so convenient. All it costs is your account information.

For the most part, we presume, this information just dumps into an anonymous sea of data, disassociated from our individual lives, or at least from any important aspect. But regardless of whether this is true, simply owning and using a cell phone represents a continual disclosure of information -- an expression of thoughts, views, and preferences.

Is this expression public? Not exactly, but it's not private either.

Each vendor of convenience requires you to express something about yourself -- sometimes individually, such as when you type in an address for GPS navigation, and sometimes in aggregate, such as when you use the Google Now phone app, which scans a broad spectrum of your data in order to tell you what restaurants are nearby, which gate your next flight will depart from, and when your packages will arrive.

Based on the privacy settings you have selected (or failed to select), you disclose information so rich that it's hard to truly appreciate how much it says about you.

You also allow certain groups of companies to share that data with other companies and services. Each time you do anything with your phone, you give up a little shred of data, or perhaps a whole bucketful. Few people are rigorous about tracking where that data goes and what a company does with it.

Intrusion

You have expressed information by using your phone, and while it's not exactly public, it's also not secret. It's been given to a third party on the basis of an explicit contract that dictates how the information can be used -- even though you probably haven't read it.

When you agree to the terms of service for Google Now, for instance, you explicitly agree to let it analyze your life and draw inferences about you. It then serves those inferences back to you in the form of reminders and helpful prompts, such as how long it will take you to get home from your current location using your normal mode of transportation.

This situation gets dicier when you agree to share information with companies that share with third parties. If you spend a few minutes Googling information about where to buy widgets and click onto a few promising sites, you will probably notice that ads for widgets miraculously start displaying on other websites you visit.

This feels intrusive, at least depending on how you parse the definition, but you agreed to this, either explicitly by okaying a terms-of-service contract, or implicitly by enabling cookies in your web browser.

If you're not careful about settings and practices, you might later find emails about widgets in your inbox, or tweets and Facebook posts about widgets in your timeline, a higher degree of intrusion. If you're extraordinarily careless, you might get robotic phone calls about widget bargains in your area.

Most people tolerate a certain amount of intrusion in exchange for convenience, a fact that is cited by some defenders of the NSA's massive data collection program, who argue that American citizens have consented to this collection by voting, or failing to vote -- in the same way that they may or may not read the terms of service for Google Now.

While there is some validity to this argument, democracy is a very indirect form of consent, full of complications and considerations. And consent to intrusion is not the only issue at play.

Disturbance

Where the NSA differs substantially from Google or the phone company is its potential for disturbance -- which is almost never consensual. One might support stop-and-frisk as a policy, but nearly everyone would opt out for themselves, given a choice.

No one disputes that telecommunications and Internet service providers can and do use your data in intrusive ways. But they have neither a compelling motive nor a commercial interest in disturbing your private life.

In contrast, government surveillance is entirely predicated on an intent to disturb.

One of the primary reasons for surveillance is to determine whether the government should actively interfere in your life -- by arresting you, infiltrating your social circles, freezing your assets, taking away your security clearance, prohibiting you from air travel, or even targeting you for death.

For the vast majority of people, government data collection is used to rule out such disturbances. For a much smaller number, that data will be used to justify hostile action.

And for a smaller number still, the data will be used to justify that which is unjust -- disturbances of people who have been wrongly identified as a threat. This can mean anything from an intrusive investigation to a wrongful arrest, or even a targeted killing.

The government is not simply collecting data; it's using that data to judge the people represented therein, whether as potential targets for military action overseas or for a referral to the FBI.

Collecting data -- the focus of most NSA debates -- carries different considerations than using that data to judge people.

For many, the act of collection is enough to constitute an intrusion. But judgment is inextricably tied to disturbance, a much more serious issue.

For nearly everyone in the world, the NSA's judgment is "we're not interested." But when the stakes are so high, who wants to be judged at all?

All of this takes place in deep secrecy, which is also reasonable cause for concern. Complicating matters further still is the method by which the vast majority of such judgments are rendered. Here, we veer into entirely new and largely untested waters.

Judgment by Inference

There are two reasons the NSA wants so very much data. The first reason is prosaic. If the data has already been collected, it can be searched almost instantaneously, as opposed to starting with a phone number or email, obtaining a warrant, then waiting for one or several service providers to return the requested information days or even weeks later.

The second reason is exotic and far more problematic.

If someone picks up your cell phone and reads everything on it, they will likely come away with a rich portrait of your life and personality. But -- law and conscience aside -- it's physically impossible for the NSA or Google or anyone to manually read each subscriber's content.

It's different when you collect a massive amount of data. Even when the content of communications is not reviewed, an analyst can use complicated mathematical techniques to draw inferences about individual users.

These algorithms are difficult for non-mathematicians to understand, but not necessarily difficult to employ. It's not totally clear what the NSA is doing with the data it collects, but it is clear this type of analysis is part of the agency's toolkit.

The tools that allow Amazon.com to accurately guess what books you would like can also be used to infer your race, religion, sexual orientation, and political party, even though you have never provided that information to the company.

It's possible to do the same thing with the largest source of data the NSA collects (as of the latest disclosures). Known as metadata, this information ignores the content of your communications -- such as an audio recording of a phone call -- and instead focuses on who you call, when, where, and for how long.

On the face of it, this seems less intrusive than listening to the content. But content-free metadata can be used to make inferences about your religious or political views -- even if you never explicitly stated them during the call. Metadata is currently being used to determine whether callers are probably terrorists.

In theory, the massive quantities of data collected should make these inferences pretty accurate, but the degree of certainty will never reach 100 percent. More troubling, there are no public records about exactly what the success rate is. Ninety-nine percent? Sixty-five percent? Twenty? Does the NSA itself even know?

Instead, we're asked to have blind faith in the power of math and the integrity and competence of a government that all too often disappoints us in spectacular fashion.

Where do we go from here?

So far, the privacy debate sparked by the NSA's surveillance program has foundered in a morass of emotional reactions, bad and incomplete information, and personality-driven analysis. On Twitter and on TV, heated reactions fly as each nugget of new information emerges, and theses are rewritten each time the picture changes.

But the discussion is equally hobbled by the rapidly changing landscape of privacy definitions and expectations, layered onto a woefully outdated legal framework.

The Fourth Amendment's guarantee against "unreasonable searches and seizures" grows more ambiguous by the day, setting an extremely subjective standard (reasonableness) for searches of what the Constitution's authors certainly understood as material things.

A concept that would have been foreign to our forefathers now dominates the debate over privacy. Who owns our personal data?

Ownership is crucial to both the legal and philosophical questions raised by the development of social media and by the NSA's surveillance collection, much of which is informed by a 1979 Supreme Court ruling, Smith v. Maryland, which, at its most simplistic, held that companies own most of the data we share with them.

When Smith v. Maryland was written, "metadata" consisted mainly of paper sheets listing phone calls that had been dialed, which typically had to be correlated with subscriber information through manual searches of large printed directories. The contents of a phone call were ephemeral if not purposefully recorded, compared to the contents of an email, which persist indefinitely on servers.

Today's metadata is richer and more informative, and it can be processed using instantaneous analytical techniques that were only science fiction in 1979. Even today, we can barely grasp how much personal information metadata reveals.

Rather than embark on a new round of laws and court rulings that cover the current technologies and will require revision in another 10 or 30 years, it's time to start thinking about underlying principles.
  • What is privacy? Do we define privacy in its strictest sense as "secrecy," or broadly, as being free from intrusion and/or disturbance in our individual lives?
  • When we share information with a third party, do we automatically forfeit secrecy? Is providing information to a company a limited form of public expression? And if so, does that mean the information is no longer private?
  • Who owns the data we provide to companies? Can we take our data back after bartering it for a service?
  • Companies like Amazon and Google use algorithms to make inferences about our preferences -- what we like to read, where we like to go, and where we get our news and information. Can the government access these inferences, along with the data that lies beneath them?
  • Are such inferences an acceptable pretext for starting an investigation?
  • How do we define intrusion and disturbance, legally speaking?
  • How much intrusion are we willing to tolerate when there is no unjust disturbance? Regardless of how one defines it, most people tolerate intrusion on a nearly continuous basis, in exchange for services or simply due to thoughtlessness. When the government intrudes on our privacy, is that the same kind of barter on a society-wide basis? Should we require a more explicit form of consent?
  • Regardless of where we fall on the question of intrusion, should we focus government oversight more intensely on disturbances, such as wiretaps, physical surveillance, arrests, and investigations? Isn't that where the harm is most imminent and grievous?
  • If our data is fair game, what about the government's? Shouldn't we force the government to disclose in broad numbers how many conversations it actually listens to, how many of those conversations involve American citizens, how many investigations are opened as a result of mass surveillance, and how many of those leads ultimately prove pertinent to national security?

The answers to these questions will determine the direction of policies. If we want broad changes to government surveillance practices that protect as many people as possible, we have to focus on intrusion. If we want narrower changes designed to protect most people while preserving the utility of these techniques for law enforcement and counterterrorism, we need to focus on disturbance.

This question will only become more pressing. While mass collection of data is still primarily the province of counterterrorism, it is almost impossible to imagine that it will not find applications in more ordinary law enforcement -- the war on drugs, gun control, fraud detection, drunk driving, and more. It would be foolish to defer dealing with this issue until local police are using predictive algorithms to search car trunks and correlate high school truancy patterns with the likelihood of drug use.

In an age of reactionary politics and infamous gridlock, it may be too much to expect that the U.S. government can tackle these issues in a responsible and comprehensive manner, but if we fail to do so, we are only delaying the moment of truth. These questions will continue to return, in murkier and more complex forms.

It may be satisfying to demand quick fixes in reaction to specific programs, policies, and technologies, but such correctives do little to address the underlying ambiguities that stem from our loose definition of privacy, and our urgent but as yet inchoate expectations of control over our personal information.

For those who worry about government power, ambiguity is a far greater enemy than any individual or institution, even if corrupt. We will never have a government entirely free from honest mistakes and intentional overreach. But when we fail to define our expectations, we invite abuse and make accountability next to impossible.

If we can't clearly articulate what we want and need, don't be surprised when the government takes full advantage of the gaps.