Let Machine Learning choose your targets for a marketing campaign. Data Analytics with Oracle Analytics Cloud (OAC)

Welcome to a new entry in our series of posts on Advanced Analytics (AA) use cases. In our previous article, we applied Machine Learning (ML) to estimate employee attrition; in this article we’ll go through an Advanced Analytics use case for marketing practitioners.


Imagine this situation: the sales director and marketing director of a fictitious ecommerce company called MaxSell are looking at their corporate finance dashboard, worried because it indicates that the prediction for end-of-year attainment is not on track to meet the objectives. They agree that an end-of-year short-term campaign could potentially improve the attainment KPI, but before launching such a short-term campaign, the marketing director and his team ask themselves the following questions:


  1. Will the number of products sold increase or decrease after the campaign?
  2. Which customers should be targeted for the campaign?
  3. Which products sell best together? (Understanding customer buying behaviour is key in making product recommendations when customers visit the website or use the mobile app).


These are some of the typical questions the marketing team would love to get the answers to, and here is where a data scientist can take the initiative. Clearly, the first question can be answered with forecasting methods to help the marketers understand how the number of products sold will evolve. The second question will require supervised Machine Learning techniques to predict which customers are more likely to respond to the marketing campaign. The last question depends on identifying the strength of association between pairs of products purchased together, and in this case, unsupervised Machine Learning techniques such as market basket analysis are the ones to apply. In this article we’ll focus on the second question and we’ll run through how to apply supervised Machine Learning techniques using Oracle Analytics Cloud (OAC).


ClearPeaks is already helping many businesses in their AA adoption, and in this article we’ll review, as an illustrative example, an AA use case involving Machine Learning techniques. The success of the digital transformation and AA adoption in any business depends on the participation of most of its departments and, of course, marketing departments are essential in increasing revenue and reaching new clients; we cannot think of a better use case for this article than one which will help companies target the right customers every time they launch a new marketing campaign.


1. Use Case


As mentioned earlier, in this blog article we’ll look at how to find the right target customers for a marketing campaign. Targeting customers for a marketing campaign requires demographic data about the customers and response data from previous campaigns. Our input datasets are the Customer.csv and the Response.csv files, which are directly extracted from the MaxSell Enterprise Data Warehouse (EDW).

The input customer dataset is a flat file with information about 58,656 customers; for each customer there are features such as age, annual income, credit score, country, education, number of children, household size, gender, marital status, etc. The response dataset is also a flat file, with information about past marketing campaigns; for each marketing campaign there are attributes like channel, time of day, day of week, conversion flag, source, timestamp, customer id, etc. Find below the headers and sample records of these files:


Figure 1: Customer.csv header and sample record


Figure 2: Response.csv header and sample record


If you would like to reproduce the exercise you can also download the files:




We’ll use a Machine Learning model to predict which customers are more likely to respond to the marketing campaign, thus cutting campaign costs and also increasing the conversion rate.


We will use Oracle Analytics Cloud (OAC), an Oracle Cloud service with modern, AI-powered, self-service analytics capabilities for data preparation, visualization, enterprise reporting, augmented analysis, and natural language processing/generation. With OAC you can connect to flat files, RDBMS or Big Data environments. Among other things, with OAC you can preprocess data and create interactive reports and dashboards, as well as train and apply Machine Learning models. There is no need for coding – just drag and drop the built-in transformations and Machine Learning models onto pipeline canvases.


Now we’ll detail the steps we took with OAC to implement an ML pipeline to identify the target customers for our marketing campaign.


2. Data preprocessing and modeling


First, let’s create a connection to our datasets. Log in to your OAC and create two new datasets.


Figure 3: Create datasets


Select the Customer.csv and Response.csv files, then check that they appear in the Data tab.


Figure 4: Imported datasets


We can now create the data flow to prepare the data and train our Machine Learning model. Create a data flow: click on Create → Data Flow and select the Customer dataset.


Figure 5: Add dataset to the data flow


Add a new step to add the Response dataset; this will generate a join step. Make sure it automatically identifies the CustomerID field as the matching column.


Figure 6: Configure the join step
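
For readers who prefer to see the join step as code, here is a minimal sketch of what matching the two datasets on CustomerID amounts to. It assumes an inner join (only customers with at least one recorded response are kept) and uses tiny made-up extracts; the column names other than CustomerID and Conversion_Flag, and the helper inner_join, are hypothetical and not part of OAC.

```python
import csv
import io

# Hypothetical miniature extracts; the real files have many more columns.
CUSTOMER_CSV = """CustomerID,Age,Annual_Income
1,34,52000
2,51,87000
3,27,31000
"""

RESPONSE_CSV = """CustomerID,Channel,Conversion_Flag
1,Email,yes
3,SMS,no
3,Email,yes
"""


def inner_join(left_rows, right_rows, key):
    """Inner-join two lists of dicts on `key`, one output row per match."""
    # Index the right-hand rows by key so each lookup is a dict access.
    right_by_key = {}
    for row in right_rows:
        right_by_key.setdefault(row[key], []).append(row)

    joined = []
    for left in left_rows:
        for right in right_by_key.get(left[key], []):
            merged = dict(left)
            # Copy the right-hand columns, skipping the duplicate key column.
            merged.update({k: v for k, v in right.items() if k != key})
            joined.append(merged)
    return joined


customers = list(csv.DictReader(io.StringIO(CUSTOMER_CSV)))
responses = list(csv.DictReader(io.StringIO(RESPONSE_CSV)))
training_rows = inner_join(customers, responses, "CustomerID")

for row in training_rows:
    print(row)
```

In this toy data customer 2 has no campaign history and drops out, while customer 3 contributes one row per past campaign contact.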


Add a select columns step:


Figure 7: Select columns


Select the following columns to be removed from our data flow: Phone_No, Campaign_Id, Channel, Time_of_Day, Day_of_Week, Product_LOB, Source, Time_Stamp, Comm_Id, CustomerID_1, Customer_ID. Click on remove selected.


Now we will choose the Machine Learning model to train on our dataset. Our use case is a binary classification problem, i.e. the model will predict whether or not a customer will respond to the marketing campaign. Add a new Train Binary Classification step and then choose CART as the model training option.


Select Conversion_Flag as the target column and, in the Positive Class in Target field, type “yes” instead of “Yes”. The positive class here is the value “yes”, which marks the customers who responded positively to the campaign. Select Over Sample as the balancing method, then name the model “Campaign_Train” in the save step.
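
To give an intuition for the balancing option, the sketch below shows plain random oversampling: rows of the rarer class are duplicated at random until both classes are the same size, so the classifier does not simply learn to predict the majority answer. This is an illustration of the general technique, not a description of how OAC implements Over Sample internally; the function random_oversample and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, target, seed=0):
    """Duplicate random minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row in rows:
        by_class.setdefault(row[target], []).append(row)

    largest = max(len(group) for group in by_class.values())

    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Sample with replacement to top the class up to the majority size.
        balanced.extend(rng.choice(group) for _ in range(largest - len(group)))

    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: 2 responders out of 10 customers.
rows = [
    {"Age": 30 + i, "Conversion_Flag": "yes" if i < 2 else "no"}
    for i in range(10)
]
balanced = random_oversample(rows, "Conversion_Flag")
print(Counter(r["Conversion_Flag"] for r in balanced))
```

Oversampling should only be applied to the data used for training; the quality statistics are more trustworthy when measured on data with the original class proportions.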


Figure 8: Save model


Save the Data Flow and name it as Campaign_Train_DF. Click on Run Data Flow to execute the pipeline and train our model. Once the pipeline has been executed, we can check the quality of our model. Click on Machine Learning in the menu and then inspect the model.


Figure 9: Inspect the model quality


Figure 10: Confusion matrix


A confusion matrix is a table that is often used to describe the performance of a classification model on a set of test data for which the true values are known; the matrix itself is relatively easy to understand.


Figure 11: Confusion Matrix explained


The statistics that evaluate the quality of the model are defined below:


  • Accuracy is the most intuitive performance measure and is simply a ratio of correctly predicted observations to the total observations.

Accuracy = (TP+TN)/ (TP+FP+FN+TN)


  • Precision is the ratio of correctly predicted positive observations to the total predicted positive observations.

Precision = TP / (TP+FP)


  • Recall is the ratio of correctly predicted positive observations to all the observations that are actually in the positive class.

Recall = TP / (TP+FN)


  • The F1 score conveys the balance between the precision and the recall. This score takes both false positives and false negatives into consideration. It is not as easy to understand as accuracy, but F1 is usually more useful than accuracy when you have an uneven class distribution.

F1 Score = 2*(Recall * Precision) / (Recall + Precision)
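
The four formulas above can be checked by hand with a few lines of code. The sketch below counts TP, FP, FN and TN from lists of actual and predicted labels and then applies the definitions; the function names and the example counts are made up for illustration and are not taken from the model trained in this article.

```python
def confusion_counts(actual, predicted, positive="yes"):
    """Return (tp, fp, fn, tn) for two equally long label sequences."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Apply the accuracy, precision, recall and F1 definitions."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = recall + precision
    f1 = 2 * (recall * precision) / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 40 TP, 10 FP, 20 FN, 30 TN.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts accuracy is 0.700, precision 0.800, recall 0.667 and F1 0.727, which shows how F1 sits between precision and recall rather than following accuracy.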


We can now apply the model to our customer base to determine which customers are most likely to respond positively to our campaign. Create a new data flow and add the Customer dataset. Add a select columns step and remove the Phone_No column. Add the Apply Model step and select the model we trained, Campaign_Train. Then add the Save Data step to save the output dataset. It will contain each customer’s details, together with the predicted response to the campaign and a confidence percentage for that prediction.


Figure 12: Campaign customer scoring
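
Once the scored dataset exists, choosing the campaign targets is a matter of filtering on the prediction and its confidence. The sketch below assumes the output has been exported as rows with a predicted label and a confidence percentage; the column names PredictedValue and PredictionConfidence, the 70% threshold and the sample rows are assumptions made for this example.

```python
def select_targets(scored_rows, min_confidence=70.0):
    """Keep predicted responders at or above the threshold, most confident first."""
    targets = [
        row
        for row in scored_rows
        if row["PredictedValue"] == "yes"
        and row["PredictionConfidence"] >= min_confidence
    ]
    return sorted(targets, key=lambda row: row["PredictionConfidence"], reverse=True)


# Hypothetical scored output for five customers.
scored_rows = [
    {"CustomerID": 1, "PredictedValue": "yes", "PredictionConfidence": 91.0},
    {"CustomerID": 2, "PredictedValue": "no", "PredictionConfidence": 88.0},
    {"CustomerID": 3, "PredictedValue": "yes", "PredictionConfidence": 64.0},
    {"CustomerID": 4, "PredictedValue": "yes", "PredictionConfidence": 77.5},
    {"CustomerID": 5, "PredictedValue": "no", "PredictionConfidence": 55.0},
]

targets = select_targets(scored_rows)
print([row["CustomerID"] for row in targets])
```

Raising the threshold shrinks the target list and the campaign cost, at the price of missing some customers who would have responded.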


3. Visualization of results


Now we have a dataset with our customers and the predictions about whether they will respond to our campaign or not. If we were the marketing director of the company, we would like to have a dashboard in which we could see what to expect regarding the effectiveness of the next marketing campaign, and therefore be able to adopt the correct strategy to choose the best campaign and the best target.


We’ll create a dashboard with OAC; the following simple dashboard is an example of what this could look like, containing the percentage of customers who will respond positively to our campaign and a table with the details of these customers. A couple of filters on age bucket and annual income bucket have been added.


Figure 13: Visualization of results
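
The headline figure on such a dashboard, the percentage of customers predicted to respond, is a simple aggregation that can also be broken down by the filter buckets. The sketch below shows one way to compute it per age bucket; the bucket boundaries, the column names Age and PredictedValue and the sample rows are assumptions made for this example, not the definitions used in the dashboard above.

```python
from collections import Counter


def age_bucket(age):
    """Map an age to a coarse, hypothetical bucket label."""
    if age < 30:
        return "Under 30"
    if age < 50:
        return "30-49"
    return "50+"


def response_rate_by_bucket(scored_rows):
    """Return the percentage of predicted responders for each age bucket."""
    totals, positives = Counter(), Counter()
    for row in scored_rows:
        bucket = age_bucket(row["Age"])
        totals[bucket] += 1
        if row["PredictedValue"] == "yes":
            positives[bucket] += 1
    return {bucket: 100.0 * positives[bucket] / totals[bucket] for bucket in totals}


# Hypothetical scored customers.
scored_customers = [
    {"Age": 25, "PredictedValue": "yes"},
    {"Age": 28, "PredictedValue": "no"},
    {"Age": 35, "PredictedValue": "yes"},
    {"Age": 45, "PredictedValue": "yes"},
    {"Age": 60, "PredictedValue": "no"},
    {"Age": 62, "PredictedValue": "no"},
]

rates = response_rate_by_bucket(scored_customers)
overall = 100.0 * sum(r["PredictedValue"] == "yes" for r in scored_customers) / len(scored_customers)
print(rates, overall)
```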




In this blog article we have detailed the various steps taken when implementing an Advanced Analytics use case in marketing, specifically campaign targeting. We used the Oracle Cloud service OAC to prepare the data, train a model and visualize the results in a dashboard; OAC offers all these capabilities in a single platform. The dashboard would help any marketing director to increase the conversion rate of marketing campaigns and to reduce costs by not targeting customers who are not interested in the product offered. This step-by-step article is just an example of what Advanced Analytics can do for your business, and of how easy it is to do with the proper tools.


Here at ClearPeaks we have a team of data scientists who have implemented many use cases for different industries using the Oracle stack as well as other AA tools. If you are wondering how to start leveraging AA to improve your business, contact us and we will help you along your AA journey.


Stay tuned for future posts!


Marc G