Reach Frequency with Digital and Projection Scaling UDF

You can use the LiveRamp-provided "Reach Frequency with Digital and Projection Scaling" UDF (lr_reach_frequency_projected_scaled) to generate the following metrics based on your audience and exposure data, date range, segments, and other details.

Impressions represent the number of times your ad is seen.
Reach indicates the total number of people who see your ad.
Frequency refers to how often people see your ad.

This UDF supports multi-channel and single-channel measurements, combines linear and digital ads, and offers the following scaling methods:

Digital scaling: This method scales the in-panel measurements back to the recorded totals, primarily to match the total impressions for digital. You can apply this type of scaling to any exposure type (not just for "digital") where you have a record for all exposures seen in the campaign, not just a sample. Scaling is performed daily by exposure type and property.
Projection scaling: Use this scaling method when you only have a sample of the actual exposures. To estimate reach and impressions accurately, this UDF determines the "weight" of each household to project reach/impressions back to the total population. This projection corrects for bias in the exposure sample by redefining the sample for use in measurements. Exposures are joined to the projection and any exposures that are not included in the projection are marked as "unmatched".
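The two scaling methods can be illustrated with a minimal Python sketch. It is not the UDF's implementation: the function names, the tuple layout, and the exact formula (recorded impressions divided by weighted in-panel impressions, per day, exposure type, and property) are assumptions chosen to mirror the descriptions above.

```python
from collections import defaultdict

def projected_impressions(exposures, weights):
    """Projection sketch: weight each matched exposure by its household weight.

    exposures: list of (rampid, exposure_type, property, exposure_date); rampid may be None.
    weights:   dict mapping rampid -> projection weight (hypothetical layout).
    Returns weighted impressions per (date, type, property) and the unmatched count.
    """
    weighted = defaultdict(float)
    unmatched = 0
    for rampid, exposure_type, prop, day in exposures:
        if rampid in weights:
            weighted[(day, exposure_type, prop)] += weights[rampid]
        else:
            unmatched += 1  # not in the projection, so counted as "unmatched"
    return dict(weighted), unmatched

def digital_scale_factors(exposures, weighted):
    """Digital scaling sketch: factor that restores each day's recorded total."""
    recorded = defaultdict(int)
    for _rampid, exposure_type, prop, day in exposures:
        recorded[(day, exposure_type, prop)] += 1  # every record counts, matched or not
    return {key: recorded[key] / value for key, value in weighted.items() if value > 0}

# Made-up data: four recorded exposures, one of them without a RampID.
exposures = [
    ("A", "Digital OLV", "site1", "2023-01-01"),
    ("A", "Digital OLV", "site1", "2023-01-01"),
    ("B", "Digital OLV", "site1", "2023-01-01"),
    (None, "Digital OLV", "site1", "2023-01-01"),
]
weights = {"A": 1.5, "B": 0.5}

weighted, unmatched = projected_impressions(exposures, weights)
factors = digital_scale_factors(exposures, weighted)
```

In this made-up data, multiplying the weighted in-panel impressions by the factor gives back the four recorded impressions for that day, exposure type, and property.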

To collaborate with your partners using the lr_reach_frequency_projected_scaled UDF, you and your partners need to reference data assets with columns that are consistent with the following definition, which is declared at the top of the UDF's SQL code:

    exposure_table data<rampid STRING, cut_value_column STRING, exposure_type STRING, property STRING, exposure_date DATE>,
    audience_table data<rampid STRING, audience_segment STRING>,
    projection_table data<rampid STRING, weight FLOAT>,
    audience_segments ARRAY<STRING>,
    cut_type STRING,
    measurement_start_date DATE,
    measurement_end_date DATE,
    scale_target_segment STRING,
    exposure_types_to_scale ARRAY<STRING>,
    max_frequency INTEGER

Where:

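exposure_table: The name of your exposure table, which includes columns for RampID strings, cut values (column names for certain parameters), exposure types, properties, and exposure dates
audience_table: The name of your audience table, which includes columns for RampID strings and audience segment strings
projection_table: The name of your projection table, which includes columns for RampID strings and household weights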
audience_segments: An array of segment strings
cut_type: A string representing the parameter you want to use in the analysis, such as campaign, platform, or creative
measurement_start_date and measurement_end_date: The date range for your analysis
scale_target_segment: The audience segment from the audience_table that represents the total population in the analysis
exposure_types_to_scale: A list of the exposure types (in the exposure_type column) that should be scaled. Usually, this is the list of digital exposure types for which you have all exposure records.
max_frequency: The maximum frequency to report on. The output reports reach and impressions at each frequency, and this value caps the highest frequency in the output. If you don't want a cap, use a large number, such as 1000000. The cap limits only which frequencies appear in the output: the reach_plus and impressions_plus values still reflect exposures and households at all frequencies.
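The effect of max_frequency can be sketched in Python. This is one possible reading rather than the UDF's code: it assumes that reach at frequency f counts weighted households with exactly f exposures and that reach_plus counts those with f or more. The function name and input layout are hypothetical.

```python
def frequency_table(exposure_counts, weights, max_frequency):
    """Build one output row per frequency from 1 to max_frequency.

    exposure_counts: dict rampid -> number of exposures (hypothetical layout).
    weights:         dict rampid -> projection weight.
    """
    rows = []
    for f in range(1, max_frequency + 1):
        # Assumption: reach at f = weighted households seen exactly f times.
        reach = sum(weights[r] for r, n in exposure_counts.items() if n == f)
        # Assumption: reach_plus at f = weighted households seen f or more times,
        # including households whose frequency is above max_frequency.
        reach_plus = sum(weights[r] for r, n in exposure_counts.items() if n >= f)
        rows.append({"frequency": f, "reach": reach, "reach_plus": reach_plus})
    return rows

counts = {"A": 1, "B": 2, "C": 5}
weights = {"A": 1.0, "B": 2.0, "C": 1.0}
table = frequency_table(counts, weights, max_frequency=3)
```

Household C has five exposures, so it gets no row of its own when max_frequency is 3, yet it still contributes to reach_plus in every row.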

Consider the following assumptions when configuring the lr_reach_frequency_projected_scaled UDF:

The audience_table and projection_table have no records with unmatched RampIDs. Filter out such records upon input by using the RAMPID_MATCHED function.
The exposure_table has no missing/NULL values in any of its specified input columns except RampID (which can be, and often is, missing/NULL).
Each record in the exposure_table corresponds to a unique exposure. If the exposures might contain duplicates, dedupe them before ingesting them into LiveRamp.
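A pre-ingestion check for the last two assumptions might look like the following Python sketch. It assumes your upstream data carries a unique exposure_id field, which is hypothetical and not part of the UDF's input. Deduping on the UDF's input columns alone would wrongly drop genuine repeat exposures that share a RampID, property, and date.

```python
# Columns that must never be NULL; rampid is deliberately excluded.
REQUIRED = ("cut_value_column", "exposure_type", "property", "exposure_date")

def prepare_exposures(rows):
    """Reject rows with missing required values and drop repeated exposure IDs.

    rows: list of dicts with the UDF's exposure columns plus a hypothetical
    upstream "exposure_id" key that identifies each exposure uniquely.
    """
    seen_ids = set()
    cleaned = []
    for row in rows:
        missing = [col for col in REQUIRED if row.get(col) is None]
        if missing:
            raise ValueError(f"exposure {row.get('exposure_id')} is missing {missing}")
        if row["exposure_id"] in seen_ids:
            continue  # same exposure delivered twice upstream
        seen_ids.add(row["exposure_id"])
        cleaned.append(row)
    return cleaned
```

Rows with a NULL rampid pass through unchanged, because the UDF accepts them.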

This UDF's output fields provide metadata and metrics.

Metadata fields include:

cut_type: The parameter you want to use in the analysis, such as campaign, platform, or creative
cut_value: The value of the corresponding cut_type parameter
frequency: How often people see your ad
measurement_start_date and measurement_end_date: The date range for your analysis
segment: An array of segment strings

Metrics include:

impressions: The number of times your ad is seen
impressions_plus: Impressions that reflect calculated exposures and households at all frequencies
impressions_plus_unscaled: impressions_plus with projection but without digital scaling
impressions_plus_unweighted_unscaled: impressions_plus without digital scaling or projection scaling
impressions_unscaled: Impressions with projection but without digital scaling
impressions_unweighted_unscaled: Impressions without digital scaling or projection scaling
reach: The total number of people who see your ad
reach_plus: Reach that reflects calculated exposures and households at all frequencies
reach_plus_unscaled: reach_plus with projection but without digital scaling
reach_plus_unweighted_unscaled: reach_plus without digital scaling or projection scaling
reach_unscaled: Reach with projection but without digital scaling
reach_unweighted_unscaled: Reach without digital scaling or projection scaling

Tip

The "_unscaled" and "_unweighted_unscaled" versions of these metrics are included to help you verify the digital scaling and projection metrics.

Sample Call

select * from lr_reach_frequency_projected_scaled(
    -- exposure_table
    (select entity_id as rampid, 'campaign' as cut_value_column, exposure_flow_id as exposure_type, property_id as property, exposure_date_utc as exposure_date
    from dpm_sim_exposures),
    -- audience_table
    (select entity_id as rampid, audience_definition_name as audience_segment
    from dpm_sim_audiences),
    -- projection_table
    (select entity_id as rampid, weight
    from dpm_sim_projections),
    -- audience_segments
    array('SmartTV All Households', 'Demographic Audience'),
    -- cut_type
    'campaign',
    -- measurement_start_date
    '2022-12-26',
    -- measurement_end_date
    '2023-03-26',
    -- scale_target_segment
    'SmartTV All Households',
    -- exposure_types_to_scale
    array('Digital OLV', 'Digital Web'),
    -- max_frequency
    20)

FAQs

What if I only want to report on maintained RampIDs?

When specifying the RampID column in your exposure table, specify a column that contains the RampID only when that RampID is maintained and is NULL otherwise.

Why don't impressions equal reach * frequency?

Impressions and reach are calculated separately, so reach * frequency = impressions is not necessarily accurate at a specified frequency.

Reach is a modeled value because not all exposures can be identified, so we don't yet know the true reach. Records with a valid identifier need to have some weight attached to them. The entity or RampID may appear in multiple digital platforms, so the chosen weight must be well-behaved in this cross-channel setting.

Impressions can be calculated directly in the digital case or if you are working with a subsample that is projected to the larger population by associating weights independent of the platform. So, impressions can be reported accurately, but reach is a modeled approximation.

Why are there some cases where reach is greater than impressions?

Reach and impressions are calculated with different weights. Reach can exceed impressions when the reach weight of an entity or RampID exceeds its impressions weight, which can happen when it is present on several platforms. This situation usually arises only for smaller properties with low sample sizes.
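A small numeric illustration in Python shows how this can happen. The weights are made up, and the way the UDF derives reach and impressions weights is not shown here: the sketch only assumes that each record carries both weights.

```python
def weighted_totals(records):
    """records: list of (exposures, reach_weight, impressions_weight) tuples.

    All three values are hypothetical inputs for illustration only.
    """
    reach = sum(reach_w for _n, reach_w, _imp_w in records)
    impressions = sum(n * imp_w for n, _reach_w, imp_w in records)
    return reach, impressions

# Two households on a small property, each seen once, with reach weights
# larger than their impressions weights.
small_property = [(1, 3.0, 1.2), (1, 2.5, 1.0)]
reach, impressions = weighted_totals(small_property)
```

With these made-up weights, reach totals 5.5 while impressions total about 2.2, so reach exceeds impressions even though every household was exposed.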

What if I don't want to scale?

To turn off digital scaling, pass in an empty array to exposure_types_to_scale. To turn off projection scaling, pass in a set of all known RampIDs with a weight of 1.
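As a rough Python sketch, a projection table that disables projection scaling could be assembled as follows. The function name and the choice of sources for "all known RampIDs" are assumptions.

```python
def unit_weight_projection(*rampid_sources):
    """Return projection rows that give every known RampID a weight of 1.

    rampid_sources: any number of iterables of RampIDs, for example the
    RampID columns of your exposure and audience tables (an assumption).
    """
    known = set()
    for source in rampid_sources:
        known.update(r for r in source if r is not None)  # skip NULL RampIDs
    return [{"rampid": r, "weight": 1.0} for r in sorted(known)]

projection_rows = unit_weight_projection(["A", None, "B"], ["B", "C"])
```

Each household then counts once, so the projected and unweighted metrics should agree.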

What if I don't have audience segments?

If you lack audience segments, pass in all distinct RampIDs in the exposure and projection tables as the audience table.
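A minimal Python sketch of that fallback is below. The segment name "All RampIDs" is a hypothetical placeholder; presumably you would pass the same name in audience_segments and scale_target_segment.

```python
def default_audience(exposure_rampids, projection_rampids, segment_name="All RampIDs"):
    """Build audience rows that place every distinct RampID in one segment.

    segment_name is a hypothetical label, not a value the UDF requires.
    """
    distinct = {r for r in exposure_rampids if r is not None}
    distinct |= {r for r in projection_rampids if r is not None}
    return [{"rampid": r, "audience_segment": segment_name} for r in sorted(distinct)]

audience_rows = default_audience(["A", "A", None], ["A", "B"])
```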
