This document is merely a W3C-internal document. It has no official standing of any kind and does not represent consensus of the W3C Membership. It is a strawman, produced without significant consultation with the wider privacy community, whose purpose is to start a discussion about what makes sense in the space of possibilities for privacy rulesets.
There have been many previous efforts at encapsulating privacy policies in reduced forms [[GEOPRIV-ARCH]], [[ID-MGM]], [[LIC-PRIV]], [[P3P11]], [[FIN-PRIV-NOTICE]], [[MOZ-ICONS]], [[PRIV-ICONS]], [[PRIV-ICONSET]], [[PRIV-LABEL]]. The lessons learned from all of these support the notion, discussed by the DAP WG in Prague, that two constraints are key for user-defined expressions of privacy preferences: they need to be simple, and they need to make sense to both users and developers. The experience of the Creative Commons licenses [[CC-ABOUT]] reinforces this as well: CC started with four license conditions and six licenses, and it seems that in practice only two of the license conditions tend to vary from license to license [[CC-CHOOSE]].
Using the term "license" to describe user privacy preferences doesn't seem quite right, since "license" implies some rights being transferred between the user and the organization collecting his or her data. Many other terms have been suggested: privacy rulesets, bundles, baskets, preferences, policies, expectations. None of them seems perfect, but this proposal will call them privacy rulesets. Other key terms are defined in the glossary below.
A scheme is proposed below for defining privacy rulesets that cover three elements of privacy that seem to matter most to users, are easiest to encapsulate in brief form, and have been addressed by similar previous efforts: sharing, secondary use, and retention of user data. Each element has three or four possible attributes. When one or more of these attributes are combined, they produce a privacy ruleset. The scheme assumes that a ruleset would be somehow conveyed together with user data in the context of an interaction with a web site or application owned by a company or other organization (known as the data collector), so that the ruleset tells the organization what the user's preferences are about the data being conveyed. A given ruleset is meant to govern only the data that gets conveyed with it.
For simplicity, the rulesets only apply to identified data -- information that can reasonably be tied to an individual. What data collectors do with other kinds of data that is not linkable to an individual or is held in the aggregate is out of scope.
The elements and their attributes are defined below.
internal
: The data can be shared
internally within the data collector's organization and
with other organizations that help the data collector
provide the service requested in the current interaction.
affiliates
: The data can be shared with
other organizations that the data collector controls or is
controlled by.
unrelated-companies
: The data can be
shared outside of the data collector's organization
with other organizations that it does not control and is not
controlled by.
public
: The data can be made public.
It is important to note that none of the sharing
attributes are mutually exclusive -- any of them may be combined
to form more permissive grants of sharing abilities than any
single one of them on its own.
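As a rough illustration of how combinable sharing attributes might be modeled, the sketch below uses a Python flag enumeration. The names `Sharing` and `may_share_with` are hypothetical and not part of this proposal. The sketch also assumes that each attribute is an independent grant, so that a wider attribute does not silently imply a narrower one; the text above does not settle that question.

```python
from enum import Flag, auto


class Sharing(Flag):
    """Sharing attributes; member names mirror the attribute list above."""
    INTERNAL = auto()
    AFFILIATES = auto()
    UNRELATED_COMPANIES = auto()
    PUBLIC = auto()


def may_share_with(granted: Sharing, recipient: Sharing) -> bool:
    """Return True if the user's grant includes the recipient category.

    Assumption: attributes are independent grants; none implies another.
    """
    return bool(granted & recipient)


# A user who allows internal and affiliate sharing, but nothing wider.
grant = Sharing.INTERNAL | Sharing.AFFILIATES
print(may_share_with(grant, Sharing.AFFILIATES))
print(may_share_with(grant, Sharing.PUBLIC))
```

Combining attributes with `|` corresponds to the more permissive grants described above.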
It can sometimes be difficult to distinguish between "primary" uses of user data and "secondary" uses. What users believe to be primary uses and what application providers believe to be primary uses are not always the same, because not all of the functionality that contributes to providing a particular application or service is evident to users. The attributes below are crafted with the user's conception of secondary use in mind, and therefore attempt to cover all uses of user data that users might want to express a preference about (without making the attributes overly granular).
contextual
: The data may only be used for
the purpose of completing the current interaction. Contextual
uses may include securing, troubleshooting or improving the
service being provided or providing advertising in the context
of the current interaction.
customization
: The data may be used to customize, personalize, or otherwise tailor the current interaction for the user.
marketing-or-profiling
: The data may be
used for marketing and/or profiling purposes. Marketing may
occur over time and via any channel (web, email, telemarketing,
etc.). Profiling involves the creation of a collection of
information about an individual and applies
to profiles created for any purpose other
than customization (e.g., for research, to sell to other
organizations, etc.).
None of the secondary-use
attributes are mutually exclusive.
The fact that most web servers automatically record logs of user activity, and that many of these logs are never deleted, can complicate the task of having applications abide by user-defined retention policies. The retention attributes defined below assume that, as a general matter, all data collectors may retain user data for a baseline period of 35 days for the purposes of maintenance, security, and troubleshooting. The attributes express user preferences that apply to retention practices that go beyond this baseline period.
no
: The data may only be retained for the
baseline period.
short
: The data may be retained beyond
the baseline period, but only for a limited time.
long
: The data may be retained beyond the
baseline period for an unspecified or indefinite amount of
time.
The retention
attributes are mutually exclusive.
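To make the relationship between the baseline period and the three retention attributes concrete, here is a minimal sketch of a retention check. The 35-day baseline comes from the text above. The length of a "short" retention period is not defined in this proposal, so `ASSUMED_SHORT_EXTRA_DAYS` is a purely hypothetical placeholder, as are the names `Retention` and `may_retain`.

```python
from datetime import date, timedelta
from enum import Enum

BASELINE_DAYS = 35  # baseline period stated in the text

# Hypothetical value: the proposal does not say how long "short" lasts.
ASSUMED_SHORT_EXTRA_DAYS = 90


class Retention(Enum):
    NO = "no"
    SHORT = "short"
    LONG = "long"


def may_retain(retention: Retention, collected: date, today: date) -> bool:
    """Return True if data collected on `collected` may still be held."""
    age = (today - collected).days
    if age <= BASELINE_DAYS:
        return True  # every collector may retain during the baseline
    if retention is Retention.NO:
        return False
    if retention is Retention.SHORT:
        return age <= BASELINE_DAYS + ASSUMED_SHORT_EXTRA_DAYS
    return True  # Retention.LONG: unspecified or indefinite


collected = date(2010, 1, 1)
print(may_retain(Retention.NO, collected, collected + timedelta(days=36)))
```

Because the attributes are mutually exclusive, a single enumeration value is enough to represent the user's retention preference.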
The attributes listed above could be combined in many different ways. Not all combinations are possible or sensible (e.g., allowing marketing-or-profiling but no retention beyond the baseline period), and, as with Creative Commons licenses, there are likely only a handful that users would want to employ regularly. A list of these potentially common rulesets is proposed below.
(The formatting of these is arbitrary: they could just as easily be declared as two-letter codes like Creative Commons attributes, as URI parameters, in XML, or in some other way.)
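Since the formatting is arbitrary, the following is only one possible way a data collector might read the line-oriented `element=attribute` form used in this document. The `Ruleset` class and `parse_ruleset` function are hypothetical names introduced for illustration. The sketch enforces the one structural rule stated above, that retention attributes are mutually exclusive, while sharing and secondary-use attributes may repeat.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

SHARING = {"internal", "affiliates", "unrelated-companies", "public"}
SECONDARY_USE = {"contextual", "customization", "marketing-or-profiling"}
RETENTION = {"no", "short", "long"}


@dataclass
class Ruleset:
    sharing: Set[str] = field(default_factory=set)
    secondary_use: Set[str] = field(default_factory=set)
    retention: Optional[str] = None


def parse_ruleset(text: str) -> Ruleset:
    """Parse one `element=attribute` pair per line into a Ruleset."""
    ruleset = Ruleset()
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected element=attribute, got {line!r}")
        if key == "sharing" and value in SHARING:
            ruleset.sharing.add(value)
        elif key == "secondary-use" and value in SECONDARY_USE:
            ruleset.secondary_use.add(value)
        elif key == "retention" and value in RETENTION:
            if ruleset.retention is not None:
                raise ValueError("retention attributes are mutually exclusive")
            ruleset.retention = value
        else:
            raise ValueError(f"unknown element or attribute: {line!r}")
    return ruleset


least = parse_ruleset("sharing=internal\nsecondary-use=contextual\nretention=no")
print(least)
```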
sharing=internal
secondary-use=contextual
retention=no
The least permissive ruleset says that the user wants her data shared only internally by the data collector and organizations that help the data collector deliver the service, only used for contextual purposes (which includes contextual advertising), and not retained beyond the baseline period.
sharing=internal
secondary-use=customization
retention=short
Some users may want to permit their data to be used internally by the data collector to do individualized analytics or provide some personalization based on recent activity, but not for marketing purposes. This ruleset, which allows data to be retained for a limited period and used for customization but not shared beyond internal use, corresponds to that set of preferences.
sharing=internal
secondary-use=marketing-or-profiling
retention=long
If users want to allow the data collector to use their data in profiles that are later used to target ads back to them, this ruleset would allow for that, with sharing still limited to internal use but with marketing, profiling, and long-term retention allowed.
sharing=public
secondary-use=contextual
retention=long
This ruleset lets users express their permission to have their data shared publicly, but not used by the data collector for non-contextual purposes.
sharing=internal
sharing=affiliates
sharing=unrelated-companies
secondary-use=contextual
secondary-use=customization
secondary-use=marketing-or-profiling
retention=long
The most permissive ruleset allows sharing with all three kinds of organizations (though not making the data public), all three kinds of secondary use, and indefinite retention.
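The document notes earlier that some attribute combinations do not make sense, giving marketing-or-profiling without retention as its example. A minimal sketch of such a coherence check, applied to the five rulesets listed above, might look as follows. The function name `incoherent_reasons` and the tuple layout are hypothetical, and the single rule encoded here is the only one the text actually names; any further rules would need to be agreed separately.

```python
# Each entry: (sharing attributes, secondary-use attributes, retention).
COMMON_RULESETS = [
    ({"internal"}, {"contextual"}, "no"),
    ({"internal"}, {"customization"}, "short"),
    ({"internal"}, {"marketing-or-profiling"}, "long"),
    ({"public"}, {"contextual"}, "long"),
    (
        {"internal", "affiliates", "unrelated-companies"},
        {"contextual", "customization", "marketing-or-profiling"},
        "long",
    ),
]


def incoherent_reasons(sharing, secondary_use, retention):
    """Return a list of reasons why a combination does not make sense."""
    reasons = []
    # The one example given in the text: marketing or profiling happens
    # over time, so it conflicts with retention limited to the baseline.
    if "marketing-or-profiling" in secondary_use and retention == "no":
        reasons.append(
            "marketing-or-profiling requires retention beyond the baseline"
        )
    return reasons


for sharing, secondary_use, retention in COMMON_RULESETS:
    print(retention, incoherent_reasons(sharing, secondary_use, retention))
```

Under this single rule, none of the five proposed rulesets is flagged.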
There are a number of open implementability questions about the rulesets. As discussed at the London F2F, these include: