Is something lacking in the current paradigm of audience mapping and message testing in social media? Markus Reimegård and Mehrdad Mamaghani of Prime’s advanced analytics unit, Prime Development, suggest we take a wider view, and demonstrate how by presenting a new large-scale study of information from Facebook Pages, with illustrations from Prime Studio’s Helena Melander.
Something is off about the way companies today obtain actionable knowledge about their audiences and test messages in social media. For one thing, we tend to look at audiences the way a pilot looks at a radar: ourselves in the center, watching the actors closest at hand, being served information about what “people who like our page also like”. Once the dots on the radar are defined and we know our nearest neighbors, we push out countless combinations of texts, colors and themes to see what sticks in a certain micro-segment. It’s scary and stupid at the same time, and while sometimes effective, it is ultimately a race to the bottom. Once we have micro-targeted our way down to a single person, we start dividing his or her day and life into bits, and things get even more uncomfortable and, to put it statistically, grossly overfit.
This tendency to capitulate in the face of the challenge of creating an actual, interpretable model of any version of reality is not unique to marketing. Our fascination with neural networks, for instance, is another display of the same pattern. It’s not that they don’t work (they really do), but we have no clue what really goes on inside the immensely complex models that we create. They never aspire to describe anything; they just try to mimic the “true” output from a series of inputs. We can find countless applications of neural networks in cases where a much simpler model would have done just as well, and where the analytical approach doesn’t add to our understanding of the basic mechanisms.
So, what about lifting our eyes off the radar screen and looking out the window at the landscape? It’s a valid question not just from a marketer’s perspective, but also because it could tell us something about the state of things; it has implications for how we look at the political landscape. In the face of a marketing challenge, the answer could be to take the CRM, web analytics and the brand survey for a spin, and dive deep into your existing customer segments. Or it could be to go on an exploratory journey to understand why customers or non-customers behave as they do, and whether your messages, products and services could be a fit for them. Our suggestion herein is simple: when trying to understand your brand’s current situation, in addition to the microscope, for once try the binoculars.
In line with this thinking, we decided to make a map of the relations between audiences of American media outlets, using Facebook as the proxy. We soon realized we could just as well add more things (brands, companies and influencers) to make the landscape richer and more nuanced. But let’s start with the media.
Facebook gives anyone the opportunity to fetch information about open Pages. Part of what you get from Facebook’s data platform is a long list of posts and interactions, together with the texts of the posts and the comments and reactions to them. You can do a lot of things with this information, but we were focused on producing our Map. We decided to go for the US, since it is the home market for many major brands and is particularly interesting given its recent upheaval of any remaining distinction between business and politics. We hand-picked a selection of the biggest news outlets, from left to right, and started out by trying to understand their respective audiences a bit better.
If you feed a computer with a large corpus of text from a certain author, or a certain group of people, you could make the computer good at recognizing their type of writing; that’s a typical task for an AI. If you feed it a big chunk of comments from the Wall Street Journal, it learns to recognize the words and phrases en vogue among WSJ commenters. If you do the same with comments from a whole list of different news outlets and then expose the computer to new comments and ask it where those comments belong, you get a sense of how separated their authors/audiences are. We did this pairwise for our list of outlets. Here’s what we saw.
The graph provides us with pairwise comparisons between the audiences of the news outlets. It tells us that the Wall Street Journal has a distinct type of audience, quite different from all the rest, and that Breitbart and Fox News have quite similar audiences, semantically that is. The remaining outlets form a third group with a common type of comments. We could argue that, roughly speaking, there’s a conservative Republican audience that follows and interacts with the WSJ, a populist right that follows Fox and Breitbart, and a Democratic, or perhaps cosmopolitan Republican, audience that follows the rest.
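For the technically curious, the pairwise comparison can be illustrated with a minimal sketch. It is written in Python rather than the R used for the actual analysis, and it swaps in a single, very simple naive Bayes classifier as a stand-in; it is not the setup behind the graph. The outlet names and comments below are invented toy data, and every function name is our own assumption. The idea is only this: train on comments from two outlets, then check how often held-out comments are assigned to the right one. A score near 0.5 suggests the two audiences write alike, a score near 1.0 that they are easy to tell apart.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (deliberately crude)."""
    return text.lower().split()


def train(comments_by_outlet):
    """Count word frequencies per outlet and collect the shared vocabulary."""
    counts = {outlet: Counter() for outlet in comments_by_outlet}
    for outlet, comments in comments_by_outlet.items():
        for comment in comments:
            counts[outlet].update(tokenize(comment))
    vocab = set()
    for word_counts in counts.values():
        vocab.update(word_counts)
    return counts, vocab


def predict(model, comment):
    """Return the outlet whose word distribution best explains the comment."""
    counts, vocab = model
    scores = {}
    for outlet, word_counts in counts.items():
        total = sum(word_counts.values())
        score = 0.0  # equal priors: assumes equally sized samples per outlet
        for word in tokenize(comment):
            if word in vocab:  # words never seen in training are skipped
                # Laplace (add-one) smoothing avoids log(0)
                score += math.log((word_counts[word] + 1) / (total + len(vocab)))
        scores[outlet] = score
    return max(scores, key=scores.get)


def separability(train_set, test_set):
    """Share of held-out comments assigned to the outlet they came from."""
    model = train(train_set)
    results = [
        predict(model, comment) == outlet
        for outlet, comments in test_set.items()
        for comment in comments
    ]
    return sum(results) / len(results)


# Invented toy comments for two hypothetical outlets (not real data).
train_set = {
    "outlet_a": [
        "markets rally as rates hold",
        "earnings beat forecasts this quarter",
        "bond yields and rates rise",
    ],
    "outlet_b": [
        "great game last night",
        "the team won the final",
        "what a goal in the game",
    ],
}
test_set = {
    "outlet_a": ["rates and earnings outlook"],
    "outlet_b": ["the team played a great game"],
}

score = separability(train_set, test_set)
print(score)
```

Repeating this for every pair of outlets gives a table of separability scores, which is the kind of pairwise picture the graph summarizes. A real run would need far more comments per outlet, proper text cleaning and a stronger classifier.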
That’s a good starting point. Next, we matched the anonymous Facebook IDs of active followers between publications to measure the similarity of news outlets based on the overlap between the audiences that interact with them. We thought that if we did this on a large scale, we would be able to sketch out the Map. This demands computing power, just like the text analysis above did, with billions of rows of data to cross-link. It’s doable with a decent desktop computer, but it takes quite some time. We ran the analysis, and added some of the Facebook Pages with the largest following in the US, according to various top lists, as well as a few of our own choice.
Among the ones we couldn’t resist adding were some brands that have recently been hurled into the political fray: brands that have supported or criticized the US administration, that have expressed support for parts of its policies and incurred criticism from the opposition, or that have been attacked on Twitter by the US President more or less out of the blue. Think of brands like Patagonia, New Balance, Under Armour and Google.
The whole list of Pages, with their respective lists of active followers, underwent a statistical learning pipeline aimed at distilling and representing high-dimensional data. Some information is usually sacrificed when visualizing complex relationships and phenomena on a 2D map, and our pipeline was built to keep this loss to a minimum. This analysis provided us with the Pages’ positions on the map.
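To make the overlap idea concrete, here is a minimal Python sketch (again, not the R code used for the analysis) of one plausible first step in such a pipeline: turning each Page’s set of active follower IDs into pairwise cosine distances. For sets of IDs, cosine similarity reduces to the number of shared IDs divided by the square root of the product of the two audience sizes. The Page names, the IDs and the function names are hypothetical, and treating Pages as plain sets of IDs is our simplifying assumption.

```python
import math
from itertools import combinations


def cosine_distance(ids_a, ids_b):
    """1 - cosine similarity between two audiences given as sets of IDs."""
    if not ids_a or not ids_b:
        return 1.0  # an empty audience shares nothing with anyone
    shared = len(ids_a & ids_b)
    return 1.0 - shared / math.sqrt(len(ids_a) * len(ids_b))


def distance_matrix(followers_by_page):
    """Symmetric Page-by-Page distance table as a nested dict."""
    pages = sorted(followers_by_page)
    dist = {p: {q: 0.0 for q in pages} for p in pages}
    for p, q in combinations(pages, 2):
        d = cosine_distance(followers_by_page[p], followers_by_page[q])
        dist[p][q] = dist[q][p] = d
    return dist


# Hypothetical Pages with made-up anonymous follower IDs.
followers_by_page = {
    "page_a": {1, 2, 3, 4},
    "page_b": {3, 4, 5, 6},
    "page_c": {7, 8, 9, 10},
}

dist = distance_matrix(followers_by_page)
for page, row in dist.items():
    print(page, row)
```

A distance table like this is what a 2D embedding method would then try to flatten into map coordinates, placing Pages with heavily overlapping audiences close together and Pages with no shared followers far apart.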
Lastly, we added up the number of interactions from unique Facebook IDs for each of the Pages to get a grasp of their relative size. We made an illustration of the results, reflecting the Pages’ positions and sizes. Behold the Map of the Islands of Facebook America.
This is where the first phase of this adventure draws to a close, but it’s also where the interesting discussions can really start. We can see challenges for marketers trying to increase their company’s footprint with “your fans may also like” as their tool of choice, like the pilots staring at their radars. What if they knew that people sitting in a neighboring media bubble, unaware of their efforts and speaking in a different way, might be receptive to their message if it came in a different tone of voice?
More importantly: by navigating this map, we get a sense of the state of convergence between business and politics. We can see the partisan divide and the supposedly apolitical actors pulled into that divide. We can see the Fox News/Breitbart/Trump cluster and its dominance. We can also see groups that are far from any political discourse. What if we could inspire those people, less politically invested, to think about the importance of politics, and perhaps make them more likely to go to the polls and vote (44 percent didn’t in the last US election)? What if we could nudge people dug deep into either of the partisan bubbles to consider how they could bridge the chasms between their, and our, digital realities? Maybe then we could start uniting people, and do something about the current mess?
- Marketers should strive for a balance between detail and outlook. Not only the instruments but also the hypotheses are key to understanding audiences, their expectations and drivers.
- Seek understanding. Many state-of-the-art methods for message testing and audience analysis are black boxes. They often function well, but rarely provide marketers with an understanding of their broader context.
- Build bridges, not bubbles. Rather than reinforcing bubbles of information exchange, those responsible should help bridge and connect them. Science, culture and democracy thrive in an open societal climate with the entire public as participants.
Analysis: Mehrdad Mamaghani, Data Scientist, Prime Development
Writing: Markus Reimegård, Head of Unit, Prime Development
Illustrations: Helena Melander, Prime Studio
All analysis performed in R.
Semantic analysis conducted on processed text corpora using pairwise text classification ensembles over weekly periods from January 2016 to August 2017 with up to 3500 comments/week randomly sampled from each media outlet.
Map coordinates produced via t-SNE 2D embedding based on cosine distances between thousands of posts, with a sample of 40 million entries chosen at random from ~3 billion.