This document is merely a W3C-internal document. It has no official standing of any kind and does not represent consensus of the W3C Membership. It is a strawman that has been produced without significant consultation with the wider privacy community, for the purpose of starting a discussion about what makes sense in the space of possibilities for privacy rulesets.

This document proposes a scheme for defining privacy rulesets: bundles of user privacy preferences that can be conveyed together with user data in the context of a web site or application interaction.

Introduction

There have been many previous efforts at encapsulating privacy policies in reduced forms [[GEOPRIV-ARCH]], [[ID-MGM]], [[LIC-PRIV]], [[P3P11]], [[FIN-PRIV-NOTICE]], [[MOZ-ICONS]], [[PRIV-ICONS]], [[PRIV-ICONSET]], [[PRIV-LABEL]]. The lessons learned from all of these support the notion, which the DAP WG discussed in Prague, that two constraints are key for user-defined expressions of privacy preferences: they need to be simple, and they need to make sense to both users and developers. The experience of the Creative Commons licenses [[CC-ABOUT]] reinforces this as well: CC started with four license conditions and six licenses, and it seems that in practice only two of the license conditions tend to vary from license to license [[CC-CHOOSE]].

Using the term "license" to describe user privacy preferences doesn't seem quite right, since "license" implies some rights being transferred between the user and the organization collecting his or her data. Many other terms have been suggested: privacy rulesets, bundles, baskets, preferences, policies, expectations. None of them seems perfect, but this proposal will call them privacy rulesets. Other key terms are defined in the glossary below.

A scheme is proposed below for defining privacy rulesets that cover three elements of privacy that seem to matter most to users, are easiest to encapsulate in brief form, and have been addressed by similar previous efforts: sharing, secondary use, and retention of user data. Each element has three or four possible attributes. When one or more of these attributes are combined, they produce a privacy ruleset. The scheme assumes that a ruleset would be somehow conveyed together with user data in the context of an interaction with a web site or application owned by a company or other organization (known as the data collector), so that the ruleset tells the organization what the user's preferences are about the data being conveyed. A given ruleset is meant to govern only the data that gets conveyed with it.

Scope

For simplicity, the rulesets only apply to identified data -- information that can reasonably be tied to an individual. What data collectors do with other kinds of data that are not linkable to an individual or are held in the aggregate is out of scope.

Privacy Elements

The elements and their attributes are defined below.

Sharing

internal: The data can be shared internally within the data collector's organization and with other organizations that help the data collector provide the service requested in the current interaction.

affiliates: The data can be shared with other organizations that the data collector controls or is controlled by.

unrelated-companies: The data can be shared outside of the data collector's organization with other organizations that it does not control and is not controlled by.

public: The data can be made public.

It is important to note that none of the sharing attributes are mutually exclusive -- any of them may be combined to form more permissive grants of sharing abilities than any single one of them on its own.
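
Because the sharing attributes can be combined, one way to picture a ruleset's sharing element is as a set of grants. The following is a minimal sketch, not part of the proposal: the names `Sharing`, `parse_sharing`, and `may_share_with` are hypothetical, and it assumes that each attribute is an independent grant that does not imply any other (the text does not say, for instance, whether public implies affiliates).

```python
from enum import Enum
from typing import FrozenSet, Iterable


class Sharing(Enum):
    """The four sharing attributes named in the text."""
    INTERNAL = "internal"
    AFFILIATES = "affiliates"
    UNRELATED_COMPANIES = "unrelated-companies"
    PUBLIC = "public"


def parse_sharing(values: Iterable[str]) -> FrozenSet[Sharing]:
    """Turn attribute strings into a set of grants.

    An unknown string raises ValueError (standard Enum lookup behaviour).
    """
    return frozenset(Sharing(v) for v in values)


def may_share_with(granted: FrozenSet[Sharing], recipient: Sharing) -> bool:
    """Assumption: each attribute is an independent grant; none implies another."""
    return recipient in granted


# A ruleset that combines two sharing attributes.
granted = parse_sharing(["internal", "affiliates"])
print(may_share_with(granted, Sharing.AFFILIATES))
print(may_share_with(granted, Sharing.PUBLIC))
```

Modelling the grants as a set keeps combination trivial: adding an attribute can only widen what is permitted, which matches the statement that combinations are more permissive than any single attribute.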

Secondary Use

It can sometimes be difficult to distinguish between "primary" uses of user data and "secondary" uses. What users believe to be primary uses and what application providers believe to be primary uses are not always the same, because not all of the functionality that contributes to providing a particular application or service is evident to users. The attributes below are crafted with the user's conception of secondary use in mind, and therefore attempt to cover all uses of user data that users might want to express a preference about (without making the attributes overly granular).

contextual: The data may only be used for the purpose of completing the current interaction. Contextual uses may include securing, troubleshooting or improving the service being provided or providing advertising in the context of the current interaction.

customization: The data may be used to customize, personalize, or otherwise tailor the current interaction for the user.

marketing-or-profiling: The data may be used for marketing and/or profiling purposes. Marketing may occur over time and via any channel (web, email, telemarketing, etc.). Profiling involves the creation of a collection of information about an individual and applies to profiles created for any purpose other than customization (e.g., for research, to sell to other organizations, etc.).

None of the secondary-use attributes are mutually exclusive.

Retention

The fact that most web servers automatically record logs of user activity -- and that many of these logs are never deleted -- can complicate the task of having applications abide by user-defined retention policies. The retention attributes defined below assume that as a general matter, all data collectors may retain user data for a baseline period of 35 days for the purposes of maintenance, security, and troubleshooting. The attributes express user preferences that apply to retention practices that go beyond this baseline period.

no: The data may only be retained for the baseline period.

short: The data may be retained beyond the baseline period, but only for a limited time.

long: The data may be retained beyond the baseline period for an unspecified or indefinite amount of time.

The retention attributes are mutually exclusive.
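
As an illustration of how a data collector might check a record against the retention attributes, here is a minimal sketch. The function name `retention_allowed` and the `short_limit` parameter are hypothetical: the text fixes only the 35-day baseline and leaves the length of "short" unspecified, so the sketch takes it as an input and assumes it is counted from the end of the baseline period.

```python
from datetime import timedelta

# Baseline period stated in the text.
BASELINE = timedelta(days=35)


def retention_allowed(retention: str, age: timedelta, short_limit: timedelta) -> bool:
    """Return True if data of the given age may still be held.

    retention   -- one of "no", "short", "long"
    age         -- how long the data has been held
    short_limit -- hypothetical: how long "short" extends past the baseline
    """
    if retention not in ("no", "short", "long"):
        raise ValueError(f"unknown retention attribute: {retention!r}")
    if age <= BASELINE:
        # Everyone may retain data during the baseline period.
        return True
    if retention == "no":
        return False
    if retention == "short":
        return age <= BASELINE + short_limit
    # "long": unspecified or indefinite retention.
    return True
```

For example, with a hypothetical `short_limit` of 90 days, data held for 100 days would be acceptable under `short` but not under `no`.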

Privacy Rulesets

The attributes listed above could be combined in many different ways. Not all combinations are possible or sensible (e.g., allowing marketing-or-profiling but not retention), and, as with Creative Commons licenses, there are likely only a handful that users would want to employ regularly. A list of these potentially common rulesets is proposed below.
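
The constraints mentioned so far could be checked mechanically. The sketch below is illustrative only: `validate_ruleset` is a hypothetical name, and the checks are limited to what the text states (known attribute names, mutually exclusive retention attributes, and the example of marketing-or-profiling without retention beyond the baseline). Other combinations may also turn out not to make sense.

```python
from typing import List, Set

SHARING = {"internal", "affiliates", "unrelated-companies", "public"}
SECONDARY_USE = {"contextual", "customization", "marketing-or-profiling"}
RETENTION = {"no", "short", "long"}


def validate_ruleset(sharing: Set[str], secondary_use: Set[str],
                     retention: Set[str]) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    checks = (
        ("sharing", sharing, SHARING),
        ("secondary-use", secondary_use, SECONDARY_USE),
        ("retention", retention, RETENTION),
    )
    for name, chosen, allowed in checks:
        unknown = chosen - allowed
        if unknown:
            problems.append(f"unknown {name} attribute(s): {sorted(unknown)}")
    if len(retention) > 1:
        problems.append("retention attributes are mutually exclusive")
    if "marketing-or-profiling" in secondary_use and retention == {"no"}:
        # The example of a combination that does not make sense, from the text.
        problems.append("marketing-or-profiling without retention beyond the baseline")
    return problems
```

Returning a list of problems rather than raising on the first one is a design choice of this sketch; it lets a caller report every issue with a ruleset at once.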

(The formatting of these is arbitrary: they could just as easily be declared as two-letter codes like Creative Commons attributes, or like URI parameters, or in XML, or some other way).
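
To make the URI-parameter option concrete, the following sketch round-trips a ruleset through a query string using Python's standard `urllib.parse` module. The helper names `to_query` and `from_query` are hypothetical, and nothing here is a proposed wire format; it only shows that repeated keys (such as several sharing attributes) fit naturally into this style.

```python
from urllib.parse import parse_qs, urlencode


def to_query(ruleset):
    """Serialize a list of (element, attribute) pairs as URI query parameters.

    Repeated elements are allowed, e.g. two "sharing" pairs.
    """
    return urlencode(ruleset)


def from_query(query):
    """Parse a query string back into {element: [attributes, ...]}.

    strict_parsing=True makes malformed input raise ValueError.
    """
    return parse_qs(query, strict_parsing=True)


pairs = [
    ("sharing", "internal"),
    ("sharing", "affiliates"),
    ("secondary-use", "contextual"),
    ("retention", "short"),
]
query = to_query(pairs)
print(query)
# sharing=internal&sharing=affiliates&secondary-use=contextual&retention=short
print(from_query(query))
```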

Least permissive:
sharing=internal
secondary-use=contextual
retention=no

The least permissive ruleset says that the user wants her data shared only internally by the data collector and organizations that help the data collector deliver the service, only used for contextual purposes (which includes contextual advertising), and not retained beyond the baseline period.

Internal customization/personalization:
sharing=internal
secondary-use=customization
retention=short

Some users may want to permit their data to be used internally by the data collector to do individualized analytics or provide some personalization based on recent activity, but not for marketing purposes. This ruleset, which allows data to be retained for a limited period and used for customization but not shared beyond the data collector and the organizations that help it provide the service, corresponds to that set of preferences.

Profile-based advertising:
sharing=internal
secondary-use=marketing-or-profiling
retention=long

If users want to allow the data collector to use their data in profiles that are later used to target ads back to them, this ruleset would allow for that, with sharing still limited to internal use but with marketing, profiling, and long-term retention allowed.

Public:
sharing=public
secondary-use=contextual
retention=long

This ruleset lets users express their permission to have their data shared publicly, but not used by the data collector for non-contextual purposes.

Most permissive:
sharing=internal
sharing=affiliates
sharing=unrelated-companies
secondary-use=contextual
secondary-use=customization
secondary-use=marketing-or-profiling
retention=long

The most permissive ruleset allows sharing internally, with affiliates, and with unrelated companies, all three kinds of secondary use, and indefinite retention.
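
The five rulesets listed above can also be written down as data, which makes it easy to recognize when an incoming ruleset is one of the common ones. This is a sketch only: the dictionary keys, the `Ruleset` tuple layout, and the functions `ruleset` and `name_of` are hypothetical names chosen for illustration, while the attribute values are copied from the listings above.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

# (sharing attributes, secondary-use attributes, retention attribute)
Ruleset = Tuple[FrozenSet[str], FrozenSet[str], str]


def ruleset(sharing: Iterable[str], secondary_use: Iterable[str],
            retention: str) -> Ruleset:
    """Build an order-insensitive, hashable representation of a ruleset."""
    return (frozenset(sharing), frozenset(secondary_use), retention)


COMMON_RULESETS: Dict[str, Ruleset] = {
    "least-permissive": ruleset(["internal"], ["contextual"], "no"),
    "internal-customization": ruleset(["internal"], ["customization"], "short"),
    "profile-based-advertising": ruleset(
        ["internal"], ["marketing-or-profiling"], "long"),
    "public": ruleset(["public"], ["contextual"], "long"),
    "most-permissive": ruleset(
        ["internal", "affiliates", "unrelated-companies"],
        ["contextual", "customization", "marketing-or-profiling"],
        "long"),
}


def name_of(candidate: Ruleset) -> Optional[str]:
    """Return the name of the matching common ruleset, or None if there is none."""
    for name, known in COMMON_RULESETS.items():
        if known == candidate:
            return name
    return None
```

For instance, `name_of(ruleset(["internal"], ["contextual"], "no"))` would identify the least permissive ruleset, while a combination not in the list returns `None`.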

There are a number of open implementability questions about the rulesets. As discussed at the London F2F, these include:

Glossary

affiliate
An organization that controls, is controlled by, or is under common control with another organization. This matches the definition of this term in [[AD-INDUSTRY]].
data collector
The organization that owns or otherwise controls the web site or application with which the user interaction occurs.
identified data
Information that can reasonably be tied to an individual. See the definition of this term in [[P3P11]].
privacy ruleset
A combination of privacy rules describing the user's preferences about the sharing, secondary use, and retention attributes of his or her data.
primary use
A use of data that is directly necessary to complete the user's interaction with the web site or application.
profile
A collection of data about an individual.
secondary use
Any use of the user's data other than the primary use(s).
unrelated company
Any organization that is distinct from the data collector and is not an affiliate of the data collector.