30 Data & Analytics Terms Every Marketer Should Know


Whether you’re new to the game or a seasoned pro, the number of terms and acronyms in the marketing industry can be overwhelming. A rapidly evolving landscape of technology advancements, emerging channels and new regulations has only amplified this challenge. Curious marketers trying to better understand platforms or tech might be found muttering “WTF is a vMVPD?”

Don’t get too caught up in the acronym du jour. Instead, focus on mastering the key data and analytics terms that power modern marketing solutions. We’ve compiled a list of definitions to help you go beyond top-level terms like “machine learning” to develop a deeper understanding and maybe impress a colleague or two.

Without further ado, here’s our list of 30 data and analytics terms every marketer should know.

      1. 1st Party Data: The crown jewel of data - information the marketer has collected directly about their current and prospective customers. This can range from something as basic as an email address to robust unified customer profiles covering demographics, lifestyle and detailed purchase histories.
      2. 2nd Party Data: Data that is shared in a dedicated environment with a clearly defined set of permissions and rights set between each of the parties. As marketing evolves with the deprecation of 3rd party cookies and IDFA, collaborative data partnerships resulting in 2nd party data are becoming more prevalent.
      3. 3rd Party Data (3PD): Data that comes primarily from businesses that do not have a direct relationship with consumers. It is typically drawn from publicly available records (e.g. census data), websites and mobile devices, or the licensing of 1st and 2nd party assets.
      4. Algorithm: A process or set of rules to be followed in calculations or other problem-solving operations. In marketing, data scientists turn to algorithms such as random forests, support vector machines and neural networks to predict consumer behavior and how groups of people may respond to a brand’s message.
      5. API: Stands for application programming interface, a common method of exchanging data. With tons of data being generated in real-time, marketers have to respond quickly or risk missing opportunities to connect with their customers. APIs allow data to flow efficiently between the different tools and platforms they use.
      6. Bias: Describes how well a model matches the development sample. Low bias means the model matches the sample closely.
      7. Binary Model: A model that predicts the probability of a single action (e.g. how likely the consumer is to respond to an offer). The answer is always on a scale of No to Yes, 0 to 1. Binary models are proven performers for predicting response, product propensity and price sensitivity in marketing.
      8. Cluster Analysis: A technique to group similar observations into a number of clusters based on the observed values of several variables. Records within a cluster are as similar to each other as possible, whereas records in different clusters are as dissimilar as possible. Clustering is often used to develop audience “personas”.
      9. Custom Model: A solution that is built specifically for a marketer. Custom models often incorporate the marketer’s own data along with other supporting data sources to achieve a specific KPI.
      10. Data Dictionary: Provides detailed information about data that is available for analytics, such as standard definitions of data elements, their meanings, and allowable values. Dictionaries are typically designed for more technical teams and partnership discussions, but are very important for keeping data organized and accessible throughout an organization.
      11. Data Enrichment: The process of appending supplemental, predictive data attributes to complete a customer profile. Providing flexibility and customization, data enrichment impacts all stages of the customer journey.
      12. Data Hygiene: The process of ensuring all data is accurate and up-to-date and that all unused data is migrated into the appropriate lifecycle stage for storage, archival or destruction on an ongoing basis. Strong hygiene practices are paramount for accurate audience targeting and meeting compliance regulations.
      13. Deterministic: Data attributes that are known or observed, and not derived from modeling or inferences. Deterministic data is the ideal but achieving enough scale for marketing campaigns is often a challenge, even for the world’s largest brands. Marketers typically use this data set in modeling to find larger audiences that look similar.
      14. Development Sample: A fully mature set of data (such as from a campaign) that includes historical performance information. The data sample is used to build a model and to verify the estimated model performance.
      15. Ensemble Methods: The process of combining multiple machine-learning models for improved marketing performance. Ensembles add more flexibility in custom solutions as each model can identify the best audiences in different ways.
      16. Gains Chart: A type of report that is generated from a custom model build and demonstrates the strength of the model by score groups. This table-based report is often paired with other graphical visualizations for analysis.
      17. Look-alike Model: Evaluates the traits and behaviors common to the “best customers” group, resulting in a large, qualified group of prospects.
      18. Match Key: A unique identifier that is used to join two or more data sets.
      19. Machine Learning: We all hear it, probably daily for a lot of you, but we’ve included it just in case. Algorithms or statistical models that find patterns or make predictions from data without explicit human instructions.
      20. Multi-behavioral Model: A complex model that simultaneously predicts the probability of many actions. Multi-behavioral models can incorporate profit values for each possible outcome to predict the estimated profit value of an audience.
      21. On-Demand Model: A model that has been previously built but can be used to score other data sets without any customization.
      22. Pre-selection: The process of defining a narrower data set for scoring (e.g. a specific gender, age or income).
      23. Probabilistic: Based on or adapted to a theory of probability; subject to or involving chance variation. For example, ‘a likely match’.
      24. Propensity Model: Similar to a look-alike model, but more deliberate in looking specifically at the variables that drive consumers to take certain actions or hold specific opinions.
      25. QA: Stands for “quality assurance” and represents procedures to check for completeness and accuracy of data both before and after a model build. Models don’t always do what you expect and having a human element in the process can ensure decisions are being made with the best data available.
      26. Random Allocation: A process that allows marketers to fairly evaluate results when a record scores equally well across two or more models, by randomly assigning that record to one of the models.
      27. Scoring: Using a model to calculate the estimated probability of a desired outcome.
      28. Seed: A set of data that describes a desired audience and is used to generate a look-alike or propensity model (e.g. social followers of a brand).
      29. Suppressions: Records excluded from development samples so as not to skew results (e.g. recent buyers).
      30. Variable: A derived data element (e.g. purchased 3 pairs of shoes) that is used for data enrichment or as a predictor in model development.
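
To see how a few of these terms fit together, here is a minimal Python sketch that joins two data sets on a match key, applies a suppression, and builds a simple gains chart from model scores. Everything in it is a hypothetical illustration: the field names (`id`, `score`, `responded`), the toy records and the helper functions are assumptions made for this example, and the gains chart layout (response rate and cumulative share of responders per score group) is one common convention rather than a definitive format.

```python
# Toy illustration only: all names, records and numbers are hypothetical.

def join_on_match_key(left, right, key):
    """Inner-join two lists of records on a shared match key."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


def apply_suppressions(records, suppressed_keys, key):
    """Drop records whose match key is on the suppression list."""
    return [row for row in records if row[key] not in suppressed_keys]


def gains_chart(records, n_groups):
    """Rank records by score and summarize responders per score group."""
    ranked = sorted(records, key=lambda row: row["score"], reverse=True)
    total_responders = sum(row["responded"] for row in ranked)
    chart = []
    cumulative = 0
    for g in range(n_groups):
        # Integer boundaries keep the groups as even in size as possible.
        start = g * len(ranked) // n_groups
        end = (g + 1) * len(ranked) // n_groups
        group = ranked[start:end]
        responders = sum(row["responded"] for row in group)
        cumulative += responders
        chart.append({
            "group": g + 1,
            "records": len(group),
            "response_rate": responders / len(group) if group else 0.0,
            "cumulative_capture": (
                cumulative / total_responders if total_responders else 0.0
            ),
        })
    return chart


# Scored records: a model's estimated probability of response.
scores = [
    {"id": "a", "score": 0.91}, {"id": "b", "score": 0.85},
    {"id": "c", "score": 0.72}, {"id": "d", "score": 0.64},
    {"id": "e", "score": 0.40}, {"id": "f", "score": 0.33},
    {"id": "g", "score": 0.21}, {"id": "h", "score": 0.10},
    {"id": "i", "score": 0.05},
]

# Observed campaign outcomes (1 = responded). "z" has no score, so it
# falls out of the join.
outcomes = [
    {"id": "a", "responded": 1}, {"id": "b", "responded": 1},
    {"id": "c", "responded": 0}, {"id": "d", "responded": 1},
    {"id": "e", "responded": 0}, {"id": "f", "responded": 0},
    {"id": "g", "responded": 1}, {"id": "h", "responded": 0},
    {"id": "i", "responded": 0}, {"id": "z", "responded": 1},
]

joined = join_on_match_key(scores, outcomes, key="id")
modeled = apply_suppressions(joined, suppressed_keys={"i"}, key="id")
chart = gains_chart(modeled, n_groups=4)

for row in chart:
    print(row)
```

In this toy data the top score group contains half of all responders, which is the kind of concentration a gains chart is meant to surface. Real model builds use far larger samples and more score groups.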

Are there any definitions that you think we missed? Feel free to reach out to Team Alliant with any questions or leave a comment to help us expand the list!