Skip to main content

Overview of LiveRamp’s Lookalike Modeling

When you want to expand your targeting audience size, you might want to use lookalike modeling to reach people who are similar to a subset of your current customers. For example, you might have identified a subset of your customers who are high spenders and wish to find a larger pool of consumers who share enough similar characteristics to that subset to be effective to target.

With LiveRamp’s lookalike modeling, you can create a modeled audience consisting of the consumers in the population dataset pool who share the characteristics of your seed audience. See below for an example showing how LiveRamp's lookalike modeling works, or see "Create a Lookalike-Modeled Audience" for detailed instructions.

Lookalike Modeling Example

Let’s say that about a fourth of your existing customer base are high spenders (shown in orange):


High Spenders (shown in orange)

Scoring the Characteristics of the Seed Segment

Once you create a seed segment that includes only those high spenders, we then append Data Marketplace data that includes multiple attributes (such as age, gender, income, and more) to those records. Using those attributes, we create a profile of the members of that seed segment that includes all the attributes that enough of those audience members share to be considered predictive.

For example, your seed audience of high spenders might share the attributes of "likes dogs", "has children", and "takes two or more vacations per year" (to varying degrees). So another consumer in the greater population who shares some or all of those attributes is likely to be a potential high spender and will be a good candidate for targeting.

Let's say half of your seed audience members share the attribute "likes dogs" (shown in orange):


Dog Lovers (shown in orange)

And a fourth of them have children (shown in orange):


Has Children (shown in orange)

And almost all of them take two or more vacations per year (shown in orange):


Takes 2+ Vacations Per Year (shown in orange)

Attributes that are shared across most or all of the seed audience members (in this example, taking 2 or more vacations per year) are considered strongly predictive, meaning that attribute likely correlates to the high spending behavior of the seed audience. Attributes that are not widely shared (in this example, having children) are considered weakly predictive.

Scoring the Population Dataset Characteristics

We then compare the predictive characteristics of the seed segment against the characteristics for the records in the population dataset. Each record in the population dataset is then scored based on the number and strength of any shared predictive attributes.

In this simplified example, the image below shows a possible scoring scheme on population data:

  • Gray shows consumers that don’t match the seed audience attributes in any way

  • Blue shows consumers that are a weak match (such as records that have only one weakly predictive attribute, like "has children")

  • Orange shows consumers that are a medium match (such as records that have two predictive attributes or one strongly predictive attribute, like "takes two or more vacations per year")

  • Red shows consumers that are a good match (such as records that have two or more strongly predictive attributes)

  • Green shows consumers that are a strong match (such as records that have all predictive attributes).


Based on the modeling, the consumers in green are more likely to become a high spender than the consumers in blue or orange. If you're mostly concerned with accuracy, you might want to target only the consumers in green. If you'd like more reach, you might include the consumers in red and orange. And if you want maximum reach while still preserving some of the benefits of lookalike modeling, you might also include the consumers in blue.

Determining the Modeled Audience Size

From the characteristics of the seed audience, we create a modeled audience. Once the modeled audience has been created, you adjust the size of the modeled audience to your desired level, based on your goals

  • A smaller size is more likely to reach only consumers who strongly share the characteristics of the seed audience (in this example, the consumers in green). In general, smaller modeled audiences have higher precision but lower reach.

  • A larger size reaches more consumers but might reach more consumers who only weakly share those characteristics (such as the consumers in blue). In general, larger modeled audiences have lower precision but higher reach.


What if the seed audience is not distinctive? There might be times when the members of your seed audience do not share enough characteristics to be predictive. We can still create a modeled audience but will warn you when we feel that there isn’t enough predictive power for that modeled audience to be effective no matter the size.

Once you've created a modeled audience, you can use that modeled audience to target those consumers on your desired destinations.

LiveRamp Population Dataset

You can use our LiveRamp population dataset to build lookalike models. The LiveRamp population dataset includes audiences and simple CPM/percent media licensing and is easy to create with our lookalike user interface in Customer Profiles.


This feature is only available to U.S. accounts. To request this feature, create a Safe Haven case in the LiveRamp Community portal.