Version 1 (modified by wilander@apple.com, 5 years ago)

Initial commit.

[Draft Spec] Ad Click Attribution for the Web

This document specifies a web technology for ad click attribution, i.e. attribution sent to a source of an ad click as the result of user activities on the destination of the same click.

Ad Click Attribution and Privacy

A popular business model for the web is to get attribution and payment for conversions, for instance purchases or sign-ups, which result from a click on an ad. Traditionally, such attribution has been facilitated by user-identifying cookies sent in third-party HTTP requests to the click source. However, the same technology can be and has been used for privacy-invasive cross-site tracking of users. The technology described in this document is intended to allow for ad click attribution while disallowing arbitrary cross-site tracking.

Terminology

Ad click. This document will use the term “ad click” for any kind of user gesture on an ad that invokes the navigation to a link destination, such as clicks, taps, and activation through accessibility tools.

Conversion. A user activity that is notable such as a purchase, a sign-up to a service, or the submission of personal information such as an email address.

The four parties involved in this technology are:

  1. The user. They click on an ad, end up on a destination website, and perform what's deemed to be a conversion, such as a purchase.
  2. The user agent. The web browser that acts on behalf of the user and facilitates ad click attribution.
  3. The ad click source. The first-party website where the user clicks on the ad.
  4. The ad click destination. The destination website where the conversion happens.

Ad Campaign Id. A 6-bit identifier, written in hexadecimal, for an ad campaign associated with the ad click destination. This means support for 64 concurrent ad campaigns per ad click destination on the ad click source. Example: merchant.example can run up to 64 concurrent ad campaigns on search.example. The valid hexadecimal values are 00 to 3F.

Ad Attribution Data. A 6-bit value, written in hexadecimal, encoding the details of the attribution. This data may contain things like specific steps in a sales funnel or the value of the sale in buckets, such as $0, $1-5, $6-20, $21-50, and so on. The valid hexadecimal values are 00 to 3F.

Ad Attribution Priority. An optional 6-bit value, written in hexadecimal, encoding the priority of the attribution. The priority is only intended for the user agent to be able to pick the most important attribution request if there are multiple. One such case may be after the user has taken steps 1 through 3 in a sales funnel and the third step is the most important to get attribution for. The valid hexadecimal values are 00 to 3F.
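
To illustrate the value range shared by the three definitions above, here is a minimal Python sketch of a validator. A 6-bit value spans 0 to 63, i.e. hexadecimal 00 to 3F. The function name parse_6_bit_hex and the requirement of exactly two hexadecimal digits are assumptions made for this example; the draft does not prescribe a parsing algorithm.

```python
import string
from typing import Optional

MAX_6_BIT = 0x3F  # 63, the largest value that fits in 6 bits


def parse_6_bit_hex(text: str) -> Optional[int]:
    """Return the integer for a two-digit hex string in 00..3F, else None.

    Hypothetical helper: the two-digit requirement is an assumption.
    """
    if len(text) != 2 or any(c not in string.hexdigits for c in text):
        return None
    value = int(text, 16)
    return value if value <= MAX_6_BIT else None
```

Under these assumptions, "3F" is accepted as 63 while "40" is rejected because it needs a seventh bit.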

A High Level Scenario

A high-level example of a scenario where the described technology is intended to come into play is this:

  1. The user makes an online search on search.example's website.
  2. The user is shown an ad for a product and clicks it.
  3. The ad click source informs the user agent:
    1. That it will want ad click attribution for this click.
    2. What the intended ad click destination is.
    3. What the attribution campaign id is.
  4. The user agent follows the link and takes note that the user landed on the intended ad click destination.
  5. The user's activity on the ad click destination leads to a conversion.
  6. A third-party HTTP request is made on the ad click destination website to https://search.example/.well-known/ad-click-attribution
  7. The user agent checks for pending ad click attributions for the ad click source/destination pair and if there's a hit, makes or schedules an HTTP request to https://search.example/.well-known/ad-click-attribution with the ad click attribution data. One thing to consider here is whether there should be an option to send the attribution data to the ad click destination too.

Ad Click Source Link Format

The ad click source link needs to be an anchor element with the following attributes: <a adcampaignid="[6-bit ad campaign id]" addestination="[ad click destination eTLD+1]">

Formally:

partial interface HTMLAnchorElement {
    [CEReactions=NotNeeded, Reflect] attribute DOMString adcampaignid;
    [CEReactions=NotNeeded, Reflect] attribute DOMString addestination;
};

If an ad click on the above link triggers a top frame navigation that lands, possibly after HTTP redirects, on the [ad click destination eTLD+1], the user agent stores the request for ad click attribution as the triple { [ad click source eTLD+1], [ad click destination eTLD+1], [6-bit ad campaign id] }. If any of the conditions do not hold, such as the ad campaign id not fitting in 6 bits, the request for ad click attribution is ignored.
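
The storage rule can be sketched as follows. This is only an illustration of the conditions in the paragraph above, not the WebKit implementation. The names record_ad_click and Triple are hypothetical, and the sketch assumes the caller has already reduced every host to its eTLD+1 and parsed the campaign id to an integer.

```python
from typing import Optional, Set, Tuple

# (ad click source eTLD+1, ad click destination eTLD+1, ad campaign id)
Triple = Tuple[str, str, int]


def record_ad_click(
    store: Set[Triple],
    source: str,
    declared_destination: str,
    landed_on: str,
    campaign_id: int,
) -> Optional[Triple]:
    """Store the triple only if all conditions hold; otherwise ignore the request."""
    # The navigation must land on the destination declared in the link.
    if landed_on != declared_destination:
        return None
    # The campaign id must fit in 6 bits (0..63).
    if not 0 <= campaign_id <= 0x3F:
        return None
    triple = (source, declared_destination, campaign_id)
    store.add(triple)
    return triple
```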

Legacy Triggering of Ad Click Attribution

Triggering of attribution is what happens when there is a conversion.

Existing ad click attribution relies on third-party HTTP requests to the click source and these requests are typically the result of invisible image elements or "tracking pixels" placed in the DOM solely to fire HTTP GET requests. To allow for a smooth transition from these old pixel requests to the new Ad Click Attribution technology, we propose a server-side redirect to a well-known location as a legacy triggering mechanism.

To make an existing pixel request trigger an ad click attribution from the user agent, the following needs to happen in the top frame context of an ad click destination page:

  1. A secure HTTP GET request to the [ad click source eTLD+1]. This HTTP request may be the result of an HTTP redirect, such as searchUK.example responding with an HTTP 302 redirect to search.example. The use of HTTP GET is intentional in that existing “pixel requests” can be repurposed for this and in that the HTTP request should be idempotent.
  2. A secure HTTP GET redirect to [ad click source eTLD+1]/.well-known/ad-click-attribution/[6-bit ad attribution data]/[optional 6-bit ad attribution priority]. This ensures that the [ad click source eTLD+1] is in control of who can trigger click attribution on its behalf and optionally what the priority of the attribution is.

If the user agent gets such an HTTP request, it will cancel the network load, check its stored requests for click attribution, and if there's a match for { [ad click source eTLD+1], [ad click destination eTLD+1] }, it will make or schedule a secure HTTP POST request to [ad click source eTLD+1]/.well-known/ad-click-attribution/[6-bit ad attribution data]/[6-bit ad campaign id] with the referer header set to [ad click destination eTLD+1]. The use of HTTP POST is intentional in that it differs from the HTTP GET redirect used to trigger the attribution and in that it is not expected to be idempotent. If any of the conditions do not hold, such as the ad attribution data not fitting in 6 bits, the request for ad click attribution is ignored. We may have to add a nonce to the HTTP POST request to prohibit double counting in cases where the user agent decides to retry the request.
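
As a rough sketch of the handling described above, the following Python parses the path of the triggering redirect and builds the URL for the later POST request. The function names, the default priority of 0 when the priority segment is omitted, the two-digit hexadecimal segments, and the lowercase output formatting are all assumptions made for illustration; the draft leaves these details open.

```python
from typing import Optional, Tuple

PREFIX = "/.well-known/ad-click-attribution/"
HEX_DIGITS = "0123456789abcdefABCDEF"


def parse_trigger_path(path: str) -> Optional[Tuple[int, int]]:
    """Return (attribution data, priority), or None if the path is invalid."""
    if not path.startswith(PREFIX):
        return None
    parts = path[len(PREFIX):].split("/")
    if len(parts) not in (1, 2):
        return None
    values = []
    for part in parts:
        # Assumption: each segment is exactly two hexadecimal digits.
        if len(part) != 2 or any(c not in HEX_DIGITS for c in part):
            return None
        value = int(part, 16)
        if value > 0x3F:  # does not fit in 6 bits, so the request is ignored
            return None
        values.append(value)
    # Assumption: a missing priority is treated as the lowest priority, 0.
    priority = values[1] if len(values) == 2 else 0
    return values[0], priority


def report_url(source: str, data: int, campaign_id: int) -> str:
    """Build the URL the user agent would POST the attribution to."""
    return f"https://{source}{PREFIX}{data:02x}/{campaign_id:02x}"
```

For example, a redirect to /.well-known/ad-click-attribution/20/3f would parse to attribution data 32 with priority 63 under these assumptions.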

If there are multiple ad click attribution requests for the same { [ad click source eTLD+1], [ad click destination eTLD+1] } pair, the one with the highest Ad Attribution Priority will be the one sent and the rest discarded.
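
The selection rule can be illustrated with a few lines of Python. The name pick_conversion and the tuple layout are hypothetical. The draft does not say how ties are broken; this sketch simply keeps the first conversion with the highest priority.

```python
from typing import Iterable, Optional, Tuple

Conversion = Tuple[int, int]  # (ad attribution data, ad attribution priority)


def pick_conversion(pending: Iterable[Conversion]) -> Optional[Conversion]:
    """Return the pending conversion with the highest priority, or None if empty."""
    # max() returns the first maximal element, so earlier entries win ties.
    return max(pending, key=lambda conversion: conversion[1], default=None)
```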

Modern Triggering of Ad Click Attribution

We envision a JavaScript API that is called on an ad click destination page as a modern means to trigger attribution at a conversion. This API call removes the need for third-party "pixels", which is great for ad click sources that do not want to be third-party resources.

Privacy Considerations

The total entropy in ad click attribution HTTP requests is 12 bits (6+6), which means 4096 unique values can be managed for each pair of ad click source and ad click destination. Example: search.example and merchant.example can track up to 4096 distinct user activities emanating from an ad click. We believe this avoids general cross-site tracking while still providing useful ad click attribution at web scale.
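
The arithmetic behind the 4096 figure, written out as a short Python check (the constant names are ours, chosen for this example):

```python
CAMPAIGN_ID_BITS = 6       # ad campaign id
ATTRIBUTION_DATA_BITS = 6  # ad attribution data

TOTAL_BITS = CAMPAIGN_ID_BITS + ATTRIBUTION_DATA_BITS
# 64 campaign ids times 64 attribution data values per source/destination pair.
DISTINCT_VALUES = (2 ** CAMPAIGN_ID_BITS) * (2 ** ATTRIBUTION_DATA_BITS)

print(TOTAL_BITS, DISTINCT_VALUES)  # prints "12 4096"
```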

In the interest of user privacy, user agents are encouraged to deploy the following restrictions to when and how they make secure HTTP POST requests to [ad click source eTLD+1]/.well-known/ad-click-attribution/[6-bit ad attribution data]/[6-bit ad campaign id]:

  • The user agent targets a delay of ad click attribution requests by 24-48 hours. However, the user agent might not be running or the user's device may be disconnected from the Internet, in which case the request may be delayed further.
  • The user agent only holds on to the triple { [ad click source eTLD+1], [ad click destination eTLD+1], [6-bit ad campaign id] } for 7 days, i.e. one week of potential ad click attribution.
  • The user agent doesn't guarantee any specific order in which multiple ad click attribution requests for the same ad click destination are sent, since the order itself could be abused to increase the entropy.
  • The user agent uses an ephemeral session (a.k.a. private or incognito mode) to make ad click attribution requests.
  • The user agent doesn't use or accept any credentials such as cookies, client certificates, or Basic Authentication in ad click attribution requests.
  • The user agent may use a central clearinghouse to further anonymize ad click attribution requests, should a trustworthy clearinghouse exist.
  • The user agent offers users a way to turn ad click attribution on and off. The default setting is on to encourage businesses to move to this technology and abandon general cross-site tracking.
  • The user agent doesn't support ad click attribution in private/incognito mode.
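
Two of the restrictions above, the 24-48 hour delay and the 7-day lifetime of a stored ad click, can be sketched as follows. The function names are hypothetical, and picking the delay uniformly at random within the window is an assumption; the draft only states the target range.

```python
import random
from datetime import datetime, timedelta

MIN_DELAY = timedelta(hours=24)
MAX_DELAY = timedelta(hours=48)
MAX_AGE = timedelta(days=7)


def schedule_report(conversion_time: datetime, rng: random.Random) -> datetime:
    """Pick an earliest send time 24-48 hours after the conversion.

    Assumption: the delay is drawn uniformly from the window.
    """
    delay_seconds = rng.uniform(MIN_DELAY.total_seconds(), MAX_DELAY.total_seconds())
    return conversion_time + timedelta(seconds=delay_seconds)


def is_expired(click_time: datetime, now: datetime) -> bool:
    """True once a stored ad click is more than 7 days old."""
    return now - click_time > MAX_AGE
```

The scheduled time is only a lower bound: if the user agent is not running at that point, the request would go out later.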

Performance Considerations

The user agent may want to limit the amount of stored ad click attribution data. Limitations can be set per ad click source, per ad click destination, and on the total amount of ad click attribution data.
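
One possible shape for such limits, as a hedged Python sketch. The function try_store, its cap parameters, and the policy of rejecting new entries once a cap is reached are all assumptions made for illustration; the draft does not define concrete limits or an eviction policy.

```python
from typing import List, Tuple

# (ad click source eTLD+1, ad click destination eTLD+1, ad campaign id)
Triple = Tuple[str, str, int]


def try_store(
    stored: List[Triple],
    new: Triple,
    per_source_cap: int,
    total_cap: int,
) -> bool:
    """Append `new` unless the total cap or the per-source cap is already reached."""
    same_source = sum(1 for source, _, _ in stored if source == new[0])
    if len(stored) >= total_cap or same_source >= per_source_cap:
        return False
    stored.append(new)
    return True
```

A per-destination cap could be added in the same way by counting on the second element of the triple.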

Open Source Status in WebKit

We have an experimental implementation in WebKit which we continue to refine. Here are a few useful links: