Data-Driven Marketers’ Guide to Data Management
Marketing has always been highly dependent on data. And in today’s rapidly moving world, the importance of managing vast quantities of diverse data from disparate sources is growing.
“There are three main issues when it comes to data management for marketers,” explains Dr Min Sun, Chief AI Scientist at Appier. “The first is data quality – the maxim of ‘garbage in, garbage out’ is critical for marketers. Once data quality is assured, the second factor to consider is the usefulness of the data. What data is valuable and which data sets can be used together? Finally, companies need to ensure they have robust governance in place. These cover the legal and corporate obligations including jurisdictional frameworks, such as General Data Protection Regulation (GDPR) in Europe, the Personal Data Protection Act (PDPA) in Singapore and others.”
What Is ‘Good’ Data?
There are several dimensions to data quality. Dr Sun says that issues such as consistent errors and inconsistent noise in data can significantly decrease the usefulness and value of even huge data sets.
However, a more subtle issue can arise even when the data is completely correct, if the assumptions underlying the selection criteria are biased or skewed. As a result, the data collected can cause an AI model to produce poor results. While some bias in data selection can be useful, it is critical that the bias is recognized and understood.
“Once the bias is known, it is possible to un-bias the data to remove its effect on the machine learning and AI algorithms,” he adds.
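One common way to correct a known selection bias is to reweight records so that each group counts in proportion to its true share of the audience. The sketch below illustrates that idea only; it is not necessarily the method Dr Sun has in mind, and the group names, the 50/50 population shares and the `reweight` helper are hypothetical.

```python
from collections import Counter

def reweight(sample_groups, population_share):
    """Return one weight per group: true population share / observed sample share.

    sample_groups: list of group labels, one per collected record.
    population_share: dict mapping each group to its known share (assumed to sum to 1).
    """
    counts = Counter(sample_groups)
    total = len(sample_groups)
    return {
        group: population_share[group] / (count / total)
        for group, count in counts.items()
    }

# Hypothetical sample: mobile users are over-represented (80% of records)
# although they are assumed to make up only half of the real audience.
sample = ["mobile"] * 8 + ["desktop"] * 2
weights = reweight(sample, {"mobile": 0.5, "desktop": 0.5})

# Each record's weight can then be passed to a model that accepts sample weights.
record_weights = [weights[group] for group in sample]
```

Over-represented groups receive a weight below 1 and under-represented groups a weight above 1, so the weighted data reflects the assumed population mix.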
The frequency, recency and coverage of data also impact its quality. While both new and old data are useful, marketers will typically prefer newer data as it is more relevant. However, there are times when having both old and new data is useful for trend analysis and for understanding cause and effect. Having data covering a broad time horizon can be powerful for marketers, says Dr Sun.
What Data Sources Can Marketers Use?
With so much data available to marketers, knowing which sources are most valuable and how they can be used is critical. In most cases, the data can be broadly split into two groups: customers’ online behavior data and offline behavior data. Both are equally valuable for understanding existing customers and making compelling offers to them.
“These days, marketers have access to a large volume of online data, such as customer behavior from websites. For example, they can learn what products a customer buys, the completion of a transaction and then return visits,” explains Dr Sun. “However, there are also offline data sources. These can include visits to physical stores, phone calls made to customer centers and other information that is managed through a CRM.”
Typically, the pool of online data is very large, but offline data sources like CRM and finance applications can help marketers understand things such as how customers paid for purchases. This can help develop models that drive sales by knowing when to offer preferred incentives, such as discount coupons, free shipping or convenient payment methods like ‘Buy Now, Pay Later’ services.
Businesses can also, through the use of APIs and other tools, reach audiences from other sources like social media, leveraging the knowledge extracted from their own data to interact with audiences on other platforms.
“We should combine the power of first-party data and the knowledge learned to reach various types of new customers through APIs of other platforms,” Dr Sun adds.
Pulling the Data Together
Finding and collecting the right data is the start of the process. The next hurdle to overcome is integrating the data from different sources so it can be used effectively. That means finding common elements between the data from each source, so customer records are complete and there is no duplication. The ideal way to do that is to find a common identifier that can be used to link the data.
“The first step is to aggregate the data while maintaining data quality,” says Dr Sun. “Ideally, to match Data Point A from one source, to Data Point B from another source by finding a simple identifier.”
For example, your CRM data may identify someone by their name and cell phone number, and an online data source may also hold that cell phone number. In that case, you can reasonably assume that the cell phone number identifies both records as belonging to the same person. In other cases, you may need a combination of attributes to determine definitively that two records from different systems are for the same person.
In reality, things may be more complicated. People may enter numbers incorrectly, either by accident or on purpose. They may also use their full name in some systems but a pseudonym or abbreviated name in others. That can make the task of correctly aggregating the data quite complex.
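The matching step described above can be sketched as a join on a normalized phone number. This is a minimal illustration under stated assumptions: the field names (`name`, `phone`, `last_product`), the sample records and the digits-only normalization rule are all hypothetical, and real data would usually need fuzzier matching to cope with mistyped numbers or abbreviated names.

```python
import re

def normalize_phone(raw):
    """Reduce a phone number to digits so formatting differences do not block a match."""
    return re.sub(r"\D", "", raw or "")

def link_records(crm_records, online_records):
    """Join two lists of dicts on normalized phone number.

    Returns (matched, unmatched_online), where matched holds merged records.
    """
    crm_by_phone = {}
    for rec in crm_records:
        key = normalize_phone(rec.get("phone"))
        if key:
            # Keep the first CRM record per number to avoid duplicates.
            crm_by_phone.setdefault(key, rec)

    matched, unmatched = [], []
    for rec in online_records:
        key = normalize_phone(rec.get("phone"))
        crm = crm_by_phone.get(key)
        if crm is not None:
            matched.append({**crm, **rec, "phone": key})
        else:
            unmatched.append(rec)
    return matched, unmatched

# Hypothetical records: the same number is formatted differently in each system.
crm = [{"name": "Alex Tan", "phone": "+65 9123-4567"}]
online = [
    {"phone": "6591234567", "last_product": "sneakers"},
    {"phone": "6590000000", "last_product": "socks"},
]
matched, unmatched = link_records(crm, online)
```

Records left in `unmatched` are the ones that would need a second pass, for example matching on a combination of attributes.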
Another factor in aggregating the data is timing. When two data records for the same person exist, it is important to know which is the most recent. However, Dr Sun points out that the most current record may not always be the one you want. Two data records may be indicative of a process where one record shows an effect while the other reflects the cause. In such cases, both records are valuable.
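One way to honor both points is to keep every record in a per-customer timeline, sorted by time, and read off the latest entry only when it is needed. The following is a sketch under assumptions: the `customer_id`, `timestamp` and `event` fields, and the sample events, are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def build_timelines(records):
    """Group records by customer and sort each group from oldest to newest."""
    timelines = defaultdict(list)
    for rec in records:
        timelines[rec["customer_id"]].append(rec)
    for events in timelines.values():
        events.sort(key=lambda r: datetime.fromisoformat(r["timestamp"]))
    return dict(timelines)

def latest(timeline):
    """Return the most recent record in an already sorted timeline."""
    return timeline[-1]

# Hypothetical events: a coupon (possible cause) followed by a purchase (effect).
records = [
    {"customer_id": "c1", "timestamp": "2021-03-02T10:00:00", "event": "purchase"},
    {"customer_id": "c1", "timestamp": "2021-03-01T09:00:00", "event": "coupon_sent"},
]
timelines = build_timelines(records)
most_recent = latest(timelines["c1"])
```

Because the older record is retained rather than overwritten, the sequence of events stays available for cause-and-effect analysis.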
It is also possible that the data you are bringing into your customer data platform is only a subset, which can introduce bias depending on the selection criteria.
“Once you have the data, you can put it into the models and figure what data is useful and if there is missing data that would be valuable to be proactively collected for improving the models,” says Dr Sun.
Data Governance Is Crucial
A crucial element in data management is governance. This covers internal policies, processes and procedures which, in turn, are informed by jurisdictional and legal frameworks.
For example, the European Union’s GDPR requires that organizations holding personal data about people in the EU follow a set of specific guidelines and rules. In Singapore, the PDPA has specific provisions regarding consent to collect and use information, limits on how data may be used and limits on retention. Many countries now have specific laws pertaining to the reporting of data loss. Australia has its Notifiable Data Breaches scheme and New Zealand recently amended its data privacy laws.
In practice, this means organizations must be very clear with customers about what data is being collected and how it will be used, and must have a process in place for data deletion as well as robust information security. This would be covered within a data governance policy or framework that clearly states this process and makes it clear who has responsibility for each element of the policy.
Data governance is about much more than compliance. Good data governance is about applying best practices that cover everything from data architecture to security and data retention, ensuring there are appropriate procedures in place to support the best possible data quality. From a governance perspective, this means taking a lifecycle view of data.
If data is the new oil, then it is incumbent on marketers to ensure they collect, refine and use data as a precious resource. Ensuring that marketing data is governed correctly with respect to jurisdictional obligations is also critical. There is a cost to collecting, storing, managing and using data. It is a tangible asset that can support marketers in their ability to understand customers and make them compelling offers that result in increased sales.