开发人员指南概述

亚马逊广告 API 概览

聚合并联接由 Amazon Marketing Stream 产生的流量和转化数据

Amazon Marketing Stream 可保证至少一次信息传递,因此部分信息可以多次发送。鉴于此原因,流量和转化数据集使用名为 idempotency_id 的字段。为确保对这些数据集中的增量记录进行准确求和,在聚合之前必须删除重复记录。

操作顺序

要获取最准确的结果,必须按以下顺序完成步骤:

  1. 使用 idempotency_id 键识别重复数据。
  2. 聚合并删除流量表中的重复数据。
  3. 聚合并删除转化表中的重复数据。
  4. 联接流量表与转化表。

示例

以下代码示例显示了如何对 sp-traffic 和 sp-conversion 数据集执行此顺序的操作。请注意,来自其他广告程序的流量和转化数据集的架构可能略有不同。请参阅数据指南 以确认每个可用数据集的架构。

注意

此示例采用包含窗函数 row_number() 以及 idempotency_id 键来识别重复项。进一步了解窗函数

-- Identify duplicates in both tables

with sp_conv as
(
    SELECT *, row_number() OVER (PARTITION BY idempotency_id ORDER BY idempotency_id ASC) AS rn
    FROM [INSERT CONVERSION TABLE HERE]

   --Input initial WHERE filter criteria here.
),
sp_traff as
(
    SELECT *, row_number() OVER (PARTITION BY idempotency_id ORDER BY idempotency_id ASC) AS rn
    FROM [INSERT TRAFFIC TABLE HERE]

    --Input initial WHERE filter criteria here.
),

-- Aggregate and remove duplicates from sp-conversions

sp_conv_agg as
(
    SELECT    "advertiser_id"
            , "marketplace_id"
            , "time_window_start"
            , "campaign_id"
            , "ad_group_id"
            , "ad_id"
            , "keyword_id"
            , "placement"
            , "currency"
            , sum("attributed_sales_1d") as "attributed_sales_1d"
            , sum("attributed_sales_1d_same_sku") as "attributed_sales_1d_same_sku"
            , sum("attributed_sales_7d") as "attributed_sales_7d"
            , sum("attributed_sales_7d_same_sku") as "attributed_sales_7d_same_sku"
            , sum("attributed_sales_14d") as "attributed_sales_14d"
            , sum("attributed_sales_14d_same_sku") as "attributed_sales_14d_same_sku"
            , sum("attributed_sales_30d") as "attributed_sales_30d"
            , sum("attributed_sales_30d_same_sku") as "attributed_sales_30d_same_sku"
            , sum("attributed_conversions_1d") as "attributed_conversions_1d"
            , sum("attributed_conversions_1d_same_sku") as "attributed_conversions_1d_same_sku"
            , sum("attributed_conversions_7d") as "attributed_conversions_7d"
            , sum("attributed_conversions_14d_same_sku") as "attributed_conversions_14d_same_sku"
            , sum("attributed_conversions_30d") as "attributed_conversions_30d"
            , sum("attributed_conversions_30d_same_sku") as "attributed_conversions_30d_same_sku"
            , sum("attributed_units_ordered_1d") as "attributed_units_ordered_1d"
            , sum("attributed_units_ordered_1d_same_sku") as "attributed_units_ordered_1d_same_sku"
            , sum("attributed_units_ordered_7d") as "attributed_units_ordered_7d"
            , sum("attributed_units_ordered_7d_same_sku") as "attributed_units_ordered_7d_same_sku"
            , sum("attributed_units_ordered_14d") as "attributed_units_ordered_14d"
            , sum("attributed_units_ordered_14d_same_sku") as "attributed_units_ordered_14d_same_sku"
            , sum("attributed_units_ordered_30d") as "attributed_units_ordered_30d"
            , sum("attributed_units_ordered_30d_same_sku") as "attributed_units_ordered_30d_same_sku"
    FROM sp_conv

    --Ensures only 1 unique idempotency_id record is used (in the event of duplicates)
    WHERE rn = 1

    --GROUP BY all the keys identified above
    GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9

),

-- Aggregate and remove duplicates from sp-traffic

sp_traff_agg as
(
    SELECT    "advertiser_id"
            , "marketplace_id"
            , "time_window_start"
            , "campaign_id"
            , "ad_group_id"
            , "ad_id"
            , "keyword_id"
            , "keyword_text"
            , "placement"
            , "currency"
            , sum("cost") as "cost"
            , sum("impressions") as "impressions"
            , sum("clicks") as "clicks"
    FROM sp_traff

    --Ensures only 1 unique idempotency_id record is used (in the event of duplicates)
    WHERE rn = 1

    --GROUP BY all the keys identified above
    GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
),

-- Join sp-traffic and sp-conversion

SELECT    spt."advertiser_id"
        , spt."marketplace_id"
        , spt."time_window_start"
        , spt."campaign_id"
        , spt."ad_group_id"
        , spt."ad_id"
        , spt."keyword_id"
        , spt."keyword_text"
        , spt."placement"
        , spt."currency"
        , spt."cost"
        , spt."impressions"
        , spt."clicks"
        , spc."attributed_sales_1d"
        , spc."attributed_sales_1d_same_sku"
        , spc."attributed_sales_7d"
        , spc."attributed_sales_7d_same_sku"
        , spc."attributed_sales_14d"
        , spc."attributed_sales_14d_same_sku"
        , spc."attributed_sales_30d"
        , spc."attributed_sales_30d_same_sku"
        , spc."attributed_conversions_1d"
        , spc."attributed_conversions_1d_same_sku"
        , spc."attributed_conversions_7d"
        , spc."attributed_conversions_14d_same_sku"
        , spc."attributed_conversions_30d"
        , spc."attributed_conversions_30d_same_sku"
        , spc."attributed_units_ordered_1d"
        , spc."attributed_units_ordered_1d_same_sku"
        , spc."attributed_units_ordered_7d"
        , spc."attributed_units_ordered_7d_same_sku"
        , spc."attributed_units_ordered_14d"
        , spc."attributed_units_ordered_14d_same_sku"
        , spc."attributed_units_ordered_30d"
        , spc."attributed_units_ordered_30d_same_sku"
FROM sp_traff_agg spt
LEFT JOIN sp_conv_agg spc

--START OF JOIN KEY LOGIC --
ON  spt.advertiser_id   = spc.advertiser_id
AND spt.marketplace_id  = spc.marketplace_id
AND spt.campaign_id     = spc.campaign_id
AND spt.ad_group_id     = spc.ad_group_id
AND spt.ad_id           = spc.ad_id
AND spt.keyword_id      = spc.keyword_id
AND spt.placement       = spc.placement
AND spt.time_window_start = spc.time_window_start
AND spt.currency        = spc.currency
--END OF JOIN KEY LOGIC --

--Add any additional WHERE filtering criteria here

--Change ORDER clause to desired order
ORDER BY time_window_start

警告

为了获得准确的结果,每个 CTE 级别之间的筛选条件必须匹配。在以上示例中,sp_conv 筛选条件应与 sp_traff 筛选条件匹配。同样,两个聚合的 CTE(sp_traffic_aggsp_conv_agg)也必须使用相同的筛选条件。通常,如果在命令链中进行更高级别的预筛选,则效果将更出色。建议对 CTE 之间共享的所有筛选条件进行参数化,以减少出错的机会。