Einstein Segmentation FAQ

Definitions of terms used in Einstein Segmentation
  • Persona: A machine-generated segment of devices that share the most frequent pattern of attributes.
  • Attribute: A segment or attribute from your first-party data, or an attribute from a third-party data provider.
  • Rank: Einstein ranks the personas based on the representativeness and uniqueness of the attributes that define them.
  • Line width: Indicates the number of overlapping devices between two personas.
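
As a rough illustration of the line width definition, the overlap between two personas can be thought of as the size of the intersection of their device sets. The device IDs and the function name below are made up for this sketch; they are not taken from the product.

```python
# Hypothetical device-ID sets for two personas (illustrative data only).
persona_a = {"d1", "d2", "d3", "d4"}
persona_b = {"d3", "d4", "d5"}


def overlap_count(a: set[str], b: set[str]) -> int:
    """Number of devices that belong to both personas."""
    return len(a & b)


# Devices d3 and d4 appear in both personas.
line_width = overlap_count(persona_a, persona_b)
print(line_width)  # 2
```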


How often is my data updated?

Weekly. However, when you change the configuration through “Modify”, the results are updated overnight.


What is the pre-configuration?   

The analysis takes third-party data from Acxiom, Eyeota, or VisualDNA, in descending order of priority, and runs it together with all of your first-party data (1PD).

If none of the three data providers are enabled for your instance, the analysis will only take your 1PD.  
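
The default behavior described above can be sketched as a simple priority fallback. This is only one reading of the answer: it assumes that the first enabled provider in the listed order is the one used, and the function and variable names are hypothetical rather than part of any product API.

```python
# Priority order as listed in the answer above (highest priority first).
PROVIDER_PRIORITY = ["Acxiom", "Eyeota", "VisualDNA"]


def default_sources(enabled_providers: set[str]) -> list[str]:
    """Return the data sources a default run would use (assumed logic)."""
    sources = ["1PD"]  # first-party data is always included
    for provider in PROVIDER_PRIORITY:
        if provider in enabled_providers:
            sources.append(provider)  # assumption: first enabled provider wins
            break
    return sources


print(default_sources({"Eyeota", "VisualDNA"}))  # ['1PD', 'Eyeota']
print(default_sources(set()))                    # ['1PD']
```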


What if I want to modify the pre-configuration of the data providers?  

Click Modify at the top right to open the Config page, where you can choose the data provider you want and run the analysis.

Reconfiguration triggers overnight processing, so you can see the results the following day.

My org's Einstein segments are almost entirely based on existing segments that they've already built. How is that useful?

For "data rich" orgs, meaning those that have already built plenty of segments, one of the primary use cases we planned for is discovery: finding new combinations of segments that the marketer might not have thought of already.

For instance, while your org may have gender and age demographic segments created, Einstein Segmentation might surface Persona 6: Male · Gamers as well as Male · Frequent Trader. While the argument can be made that these are far from "nonintuitive", the product has surfaced them along with their reach and made it possible to activate them immediately as segments, which can then form the basis for targeted campaigns.


My org has a robust amount of viewing and authentication attributes (show, video complete, authenticate, etc.), but the system has 49/50 recommendations entirely fixated on one attribute set (e.g. User: Visit Number). Can I get more diversity?

This is an area of active development. The cause is that some user attributes are overwhelmingly more common than the other items in the user attribute category, and they crowd out the results you probably want to see. The fix is to statistically infer categories of attributes and ensure that highly prevalent and less prevalent categories are equally represented in the final result. We're developing several approaches (it's more the "R" in R&D), but unfortunately we can't yet commit to a completion date.

One possible workaround: if your org is willing to categorize these attributes into buckets (say < 5, between 5 and 10, and > 10), a lot of the noise would be stripped out of the 1P results.
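
As a minimal sketch of the bucketing workaround, the snippet below maps raw visit numbers onto the three example buckets. The function name, the bucket labels, the sample values, and the choice to treat "between 5 and 10" as inclusive are all assumptions made for illustration.

```python
from collections import Counter


def visit_bucket(visit_number: int) -> str:
    """Map a raw visit number onto one of three coarse buckets."""
    if visit_number < 5:
        return "< 5"
    if visit_number <= 10:  # assumption: 5 and 10 are both included
        return "5-10"
    return "> 10"


# Hypothetical raw attribute values: eight distinct values before bucketing.
visits = [1, 2, 3, 7, 8, 12, 40, 41]

# After bucketing, only three distinct attribute values remain.
bucket_counts = Counter(visit_bucket(v) for v in visits)
print(bucket_counts)
```

Collapsing many distinct values into a few buckets means fewer near-duplicate attributes compete for space in the results.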

Another workaround is to rerun with 3P data only to see if it improves things.


I only see a small number of personas for my org. Why?

You might want to reconfigure with 1P data selected or with more 3P data providers enabled. Note that adding more 3P data providers could result in potentially contradictory personas, depending on 3P data quality.

Several of the personas consist of a single attribute that's already a segment. How is that useful?

This is currently expected behavior, and for low numbers of personas, it's indeed not useful. When there are more results that are combinations of attributes, these single attributes can provide visually quantitative context for how much reach you'll sacrifice by activating a persona with > 1 attribute. When we ship V2, these short-length personas will function as "reach" opportunities, vs. longer-length personas.


What is the algorithm that drives Einstein Segmentation?

Frequent pattern mining. Most of the market uses clustering algorithms for machine-learning-driven segmentation, whereas our product features a patented frequent pattern engine to analyze the audience.

Because frequent patterns can overlap, the personas are NOT mutually exclusive.


What are the key differences between clustering and Frequent Pattern (FP), and why does Einstein choose FP?

Clustering separates data points and groups them into a number of partitions.

FP is the process of finding the commonalities and key attributes (or attribute combinations) associated with the data.

Classic clustering typically separates data into non-overlapping partitions. FP extracts the key commonalities, which can then be used to define personas; those personas can in turn be used to find users for clustering or segmentation.

FP is the most effective way to find common attributes or traits associated with data (e.g. users), and it explores the combinatorial space efficiently. Its output can later be used for segmentation, for extracting association rules, or as features fed into more sophisticated ML models when necessary.
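
To make the contrast with clustering concrete, here is a deliberately naive frequent pattern sketch. It is not the Einstein engine: it simply counts every attribute combination up to a given size and keeps those shared by a minimum number of devices. The device data, the function names, and the `min_support` and `max_size` parameters are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical devices and their attributes (illustrative data only).
devices = {
    "d1": {"Male", "Gamers"},
    "d2": {"Male", "Gamers", "Frequent Trader"},
    "d3": {"Male", "Frequent Trader"},
    "d4": {"Female", "Gamers"},
}


def frequent_patterns(devices, min_support, max_size=2):
    """Brute-force count of attribute combinations up to max_size,
    keeping those shared by at least min_support devices."""
    counts = Counter()
    for attrs in devices.values():
        for size in range(1, max_size + 1):
            # Sort so that each combination has one canonical tuple form.
            for combo in combinations(sorted(attrs), size):
                counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_support}


def members(devices, pattern):
    """Devices whose attributes include every attribute in the pattern."""
    return {d for d, attrs in devices.items() if set(pattern) <= attrs}


patterns = frequent_patterns(devices, min_support=2)
gamers = members(devices, ("Gamers", "Male"))
traders = members(devices, ("Frequent Trader", "Male"))
print(gamers & traders)  # {'d2'}
```

Device d2 matches both patterns, which is why pattern-based personas can overlap, whereas a classic clustering would assign d2 to exactly one partition.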


What is proprietary about the Einstein FP model?

Current state-of-the-art FP scales well enough for conventional applications, but it struggles to scale to DMP level. We designed and implemented our own version of the algorithm to fully utilize our distributed machinery for maximum efficiency.

Current state-of-the-art FP only works on data with homogeneous distributions, whereas we deal with very diverse distributions across segments and attributes. Our FP algorithm determines the best cut-off points and importance levels across a very diverse set of attributes (e.g. demographic vs. conversion), and makes sure that important but rare attributes are surfaced appropriately.

Our FP explores combinations beyond first-order combinations when necessary: it can explore combinations on top of combinations (properties of properties).
