Could Cambridge Analytica Have Delivered Donald Trump’s 2016 Presidential Victory? An Anthropologist’s Look at Big Data and Political Campaigning

I first provide some context about Cambridge Analytica’s (ca) activities, linking them to ca parent company, scl Group, which specialised in “public relations” campaigns around the world across multiple sectors (from politics to defence and development), with the explicit aim of behavioural change. I then analyse in more detail the claims made by mathematician and machine learning scholar David Sumpter, who dismisses the possibility that ca might have successfully deployed internet psychographics (e.g. online personality profiling) in the winning 2016 Trump presidential campaign in the US. I critique his arguments, pointing at the need to focus on the bigger picture and on the totality of ca methods, rather than analysing psychographics in isolation. This is followed by a section where I use ca whistleblower Christopher Wylie’s 2019 memoir to show the important role that in-depth qualitative research and methods akin to ethnographic immersion might have played in building ca big data capabilities. I provide an angle on big data that sees it as complementary to, rather than in opposition to, the human insight that comes from qualitative immersion in the social realities targeted by ca. The concluding section discusses additional questions that should be explored to gain a deeper understanding of how big data is changing political campaigning, with an emphasis on the important contribution that anthropology can make to these crucial debates.


Keywords
Cambridge Analytica - big data - political campaigns - psychometrics - internet psychographics - qualitative research - 2016 US presidential election - anthropology of algorithms

More than four years have passed since the political earthquake caused by Donald Trump's US presidential victory in 2016, and yet the debate over the causes and conditions that made such a victory possible continues unabated. Was it a backlash against the rise of the first and only US president of colour, Barack Obama? Did Trump manage to mobilise white working class voters with a mix of "economic nationalism" and anti-immigration rhetoric? Was the vote about economic anxiety or racial mobilisation? Whatever one thinks about these thorny questions, one discussion seems particularly hard to settle: could Trump have won, after all, because of data campaigning by the now defunct UK-based political consulting firm Cambridge Analytica (ca from now onwards)? Did big data and online personality profiling allegedly used by ca deliver his victory?
Trump lost the November 2020 presidential contest to the Democratic Party's candidate Joe Biden. But the tycoon's support in actual votes rose by around 11 million compared with 2016, indicating that his 2020 campaign (also, according to many media investigations, driven by data campaigning and social media messaging) might have failed to deliver his re-election, but was nonetheless highly effective.1 Trump's propaganda machinery was working against rather unfavourable material conditions. Yet, even after a disastrous handling of the pandemic which resulted in hundreds of thousands of deaths and massive negative economic consequences for the majority of Americans, Trump's campaign was able to embolden his base and gain new voters, and to convince a majority of his supporters that the US response to the pandemic was going well and that the US economy was in good or excellent condition.2 At the local level, the 2020 Trump campaign reached the media spotlight in the aftermath of the election for having successfully targeted Florida's Latinx communities (many of Cuban and Venezuelan origin) with messaging depicting Biden as somebody who would bring Cuban or Venezuelan-style socialism to the US. Trump did win Florida, and Latinx support for him there increased significantly compared with 2016.3
We will likely see more academic studies of the 2020 Trump campaign in the coming months and years, but for now we already have at our disposal a large and constantly increasing amount of evidence on the 2016 campaign and on the role played by ca. This information comes from journalistic investigations, whistleblowers' accounts, document leaks, and commissions of inquiry from parliaments and government agencies in the US and the UK. While of course we cannot unproblematically assume that the methods used by ca in 2016 are the same as those deployed by the Trump campaign in 2020, the 2016 events provide crucial background for subsequent campaigns.
Despite the available evidence, our academic understanding of these issues is still in formation, and there is a relative dearth of in-depth social scientific analyses of the ca data scandal, especially more qualitatively oriented ones, and anthropology is no exception.4 The available academic literature often avoids in-depth assessments of whether ca methods could have delivered Trump's election. Some of those who have taken a clearer stance in this matter suggest that ca methods could have indeed shifted voter behaviour in unethical, if not illegal, ways. These analyses are frequently informed by concerns about privacy and the integrity of the democratic process.5 In his analysis of the ca case, anthropologist Roberto J. González expresses scepticism that big data could have played a decisive role in delivering Trump's election.6 In this, he joins a chorus of other scholars and public intellectuals who have dismissed that possibility.7 Scholars who are more worried about the possible influence of ca tactics on voter behaviour have not on the whole tried to debunk the arguments of the sceptics.8 In this article, I will try to fill this gap by providing a detailed critique of some of the arguments put forward by sceptics. My work joins the growing scholarly literature that has been sounding alarm bells about ca and similar digital campaigning operations.9 My approach to public anthropology is interdisciplinary. I weave together anthropological and social scientific concerns to show the limits of understanding complex cases such as the ca data scandal exclusively through disciplinary lenses.
2 See cnn exit polls for the US 2020 presidential election, https://edition.cnn.com/election/2020/exit-polls/president/national-results (Accessed 22 November 2020).
3 Agiesta, J., Luhby, T., Sparks, G., Struyk, R. (2020). More Latino Voters Support Trump in 2020 than 2016, but Young Americans Favor Biden, Early cnn Exit Polls Show. cnn, 5 November; Santiago, F. (2020). In
The contribution anthropology can make to these issues assumes a bifocal approach: a reflexive look at our own discipline and its conceptual and empirical tools, in parallel with an interdisciplinary dialogue that makes such insights relevant to a wider range of academic and non-academic audiences beyond the disciplinary boundaries of anthropology.
We will probably never be able to provide a definitive answer to the question of whether ca data capabilities delivered Trump's victory, mostly because we do not have the smoking gun: the models and datasets used by ca are not available for scrutiny.10 We can however productively assess the significant amount of evidence available so far, especially if we reformulate the issue in the following way: is it possible that ca developed effective political campaigning tools, including, but not limited to, granular data-driven techniques, that could have given a significant advantage to Trump in the 2016 election? Two recent books by ca whistleblowers published in October 2019, Christopher Wylie's Mindf*ck and Brittany Kaiser's Targeted, provide us with nuanced and layered testimonies of how ca worked from the inside.11 The accounts move beyond, but also help us make sense of, the fragmented evidence from the various commissions of inquiry, leaks and journalistic investigations.
The fact that both authors were social science PhD candidates at the time of working for ca might help to explain why their views are particularly relevant for a social scientific analysis of the whole affair. Wylie was doing a PhD in machine learning and fashion at Central Saint Martins in London when he was hired by scl Group, ca parent company, at the end of 2013.12, 13 In his book, he highlights the connections between his interests in fashion and politics, as he was keen to develop big data models that explored how aspects of social and cultural identity such as fashion styles are related to political attitudes and behaviour.14 In March 2018, his whistleblowing activities became public, and in January 2019, true to his passion, Wylie took up the post of research director of Swedish fashion retailer H&M.15 There was also a pragmatic reason that seemed to make employment with scl attractive: scl agreed to pay the tuition fees for his PhD, which were substantial, given his status as a Canadian international student in the UK.16 Financial considerations seemed to weigh heavily on Kaiser's decision to join scl too. When she was offered a job by scl director Alexander Nix in October 2014, she was in need of money due to an ongoing financial crisis in her family.
Kaiser claims that it was her economic situation that made her set aside her reservations about working for a company involved with US Republican campaigning.17 She describes herself as a "lifelong Democrat and a devoted activist who had worked for years in support of progressive causes."18 Like Wylie, Kaiser was interested in the relationship between big data and politics, and she juggled her consultancy activities for scl and ca with PhD studies focused on "preventive diplomacy," which included a focus on how big data could inform international peacekeeping organisations to prevent conflict in high-risk areas around the world.19 Thanks to these whistleblowers' accounts, we can now get a better sense of how ca worked as a coherent, complex organisation made up first and foremost of humans, in a way that video bites, short media pieces and parliamentary inquiry reports cannot quite do. The books have an ethnographic feel that can push anthropological analysis further. While I use both works as sources to inform my analysis, I focus in this article primarily on Wylie's account, as he discusses the research methodologies employed by ca from the perspective of a qualitatively oriented data scientist. Kaiser's book is more focused on the business and marketing aspects, as she worked with ca business development operations.20
I first provide some context about ca activities, linking them to ca parent company, scl Group, which specialised in "public relations" campaigns around the world across multiple sectors (from politics to defence and development), with the explicit aim of behavioural change. I then analyse in more detail the claims made by mathematician and machine learning scholar David Sumpter. In a recent book, Sumpter dismisses the possibility that ca might have successfully deployed internet psychographics (e.g. personality profiling from online user data) in the 2016 Trump campaign.21 I critique his arguments, pointing at the need to focus on the bigger picture and on the totality of ca methods, rather than analysing psychographics in isolation. This is followed by a section where I use ca whistleblower Christopher Wylie's memoir to show the important role that in-depth qualitative research might have played in building ca big data capabilities.22 I provide an angle on big data that sees it as complementary to, rather than in opposition to, the human insight that comes from qualitative immersion in the social realities targeted by ca. The concluding section discusses additional questions that should be explored to gain a deeper understanding of how big data is changing political campaigning, with an emphasis on the important contribution that anthropology can make to these crucial debates.
The analysis is informed by insights from critical propaganda studies and from work on dual use anthropology.23 The ca story is not just one of effective (or ineffective) aggressive tactics in voter manipulation and of spectacular breaches of citizens' privacy. It is also one of many instances where scientific research and academic personnel have been deployed from the beginning as a weapon for unethical and often illegal propaganda activities, ranging from foreign interference in the domestic politics of Northern and Southern countries, to work in military and security operations directed by powerful and unaccountable government and corporate interests.
20 The two memoirs complement each other temporally as well.
Together with González, my article contributes towards filling an important gap in the nascent anthropology of algorithms: the study of the use of algorithms in political campaigns.24 It provides one possible answer to Nick Seaver's question "What should an anthropology of algorithms do?": it should critically explore, among other things, the use and abuse of algorithms in election campaigning and political communication more broadly.25 Besteman and Gusterson have edited an important collection advancing the state of the art in the discipline.26 The style of the collection is accessible and, for the most part, keeps explicit scholarly citations to a substantial set of endnotes.
There is generally little canon anthropology mentioned in the chapters, except for the introduction by Gusterson, who acknowledges that the critique of algorithms has, so far, not seen anthropologists in a prominent position. This is the reason why most of the academic literature explicitly discussed by Gusterson in his introduction, by Besteman in her afterword, and by their colleagues in other chapters, is not anthropological.27 While some chapters are based on ethnographic data, several others (such as one on ubiquitous surveillance and another on the logics of quantification) are not, showing that, to study what many now refer to as the "black box" of algorithms, we are frequently pushed beyond ethnography.28 In the introduction to a seminal short collection of essays on the anthropology of big data, Boellstorff and Maurer make a similar point, and mention Tim Ingold's influential intervention "Anthropology is not ethnography."29
In terms of its ethnographic contribution, Besteman and Gusterson's edited collection provides an important range of empirical cases from the US which focus on the use of algorithms in home foreclosures, standardised testing and evaluation in education, detention and deportation of minors in immigration custody, and felony convictions.30 These ethnographic accounts are centred around the notion of "roboprocess," which, in Gusterson's theorisation, emphasises the interaction between people and algorithms as an automated algorithmic process that effectively traps humans in specific situations where the "common sense and situational logic of humans is displaced by and subordinated to the logic of automation and bureaucracy."31 The primary focus of most chapters is on the interface between humans and technology in people's daily encounters with algorithms: the people that the authors engage with tend to be on the receiving end of the negative and often unintended effects of the algorithm, or include the middle-ranking clerks and officials executing the algorithms' decisions.
My article fills an important gap in this respect: the focus in my discussion is on the methods and techniques that those who create and manipulate algorithms employ, and on the effects of the algorithms seen from the perspective of those who are in the control room, so to speak. In this I follow Seaver's call for studying the human in and behind the algorithms, and for avoiding the fetishisation of a supposed "autonomy" of algorithms "freed" from human factors.32 While my analysis is not ethnographic (I did not have direct access to the main players of the Cambridge Analytica data scandal and their activities), it remains nonetheless anthropological in style and scope. The anthropology envisaged here is one that is not tied to a particular canon, and in line with Besteman and Gusterson, aims to develop an anthropological sensibility directed to a wider audience which includes both anthropologists and other social scientists interested in these topics.33
29 Boellstorff, T., Maurer, B. (2015)
I have followed the Cambridge Analytica scandal and the 2016 US presidential election from early on, and my insight is coupled with a long-term engagement as a political commentator with the rise of social media politics, spontaneous protests, and the social media messages of political movements across the political spectrum in the US and Italy.34 In recent years, I have also participated as an activist scholar in various transnational workshops, conferences and informal conversations with progressive party and social movement activists from around the world, and especially from Africa, Europe and North America. In all these events, discussions about the uses and abuses of social media for political campaigning were key, and I had the privilege of rapidly updating my knowledge with expert insiders, including social media campaigners and cybersecurity privacy activists.
It is difficult to call this knowledge "ethnographic" in the classic sense, but there is no doubt that I am speaking to some extent "from the inside," and have been autoethnographically immersed in these flows and events (and the "hybrid networks" that they encompass) since my early days as a direct action activist in Cambridge in 2009-2010.35

What Did Cambridge Analytica Actually Do?
Cambridge Analytica and its parent company scl Group shut down in May 2018. ca was hired by Trump's successful 2016 presidential campaign, and is alleged to have played an important part in the Leave campaigns of the Brexit referendum, although its role in the latter remains unclear.36 The firm was tightly knit with, and often indistinguishable in practice from, its parent company scl Group. Both companies were involved in several more election
33 Besteman, Gusterson, Life by Algorithms.
scl Group ran four divisions: scl Defence (running defence contracts primarily with US and UK state agencies), scl Social (running projects funded from development aid), scl Elections (the division most closely tied to ca activities, but predating ca by several years) and scl Commercial (running contracts with the private sector, which ca also did).41 Despite the broad variety of sectors the two companies engaged with, the common thread was that they specialised in "public relations" campaigns aimed at behavioural change. The latter is a label well known to anthropologists of development: think, for instance, of behavioural change campaigns designed to increase condom use in hiv-affected countries.
scl expertise had developed on the back of the British think tank Behavioural Dynamics Institute (bdi), founded in 1990 by Nigel Oakes.42 In 2010, the bdi presented itself as an "academic institute that specialises in understanding influence and persuasion in order to change audiences' attitudes and behaviour. The institute specialises in applying its methodology to military and political campaigns, where the audiences are hostile or friendly, national or international."43 The bdi "methodology draws extensively from group and social psychology and incorporates semiotics, semantics and many elements of cultural anthropology."44 The bdi stated goals hinted at ambitions that went well beyond academic research, and in practice the distinction between bdi and scl/ca was blurred, with several personnel and projects straddling across the academic and business operations. Named academic members and contributors on their archived website ranged across many disciplines, including clinical and social psychology, sociology, political science, international relations, and media & communications. Many were based in academic social science departments in allegedly "independent" units in well renowned universities, while others had overt ties with military and security agencies in the UK and the US. Some had strong affiliations with both worlds. Anthropology featured in some of the academic interests listed, but none of the people profiled with their bios in the bdi website seem to have had graduate training in anthropology. 
However, there are indications that anthropology might have played a more direct role, as journalist and disinformation expert Peter Pomerantsev claims in his 2019 book on social media and propaganda:
Oakes pioneered surveys by teams of anthropology students, who, usually without revealing their mission, spent long periods penetrating a community, enquiring about who people hated and trusted, what they most desired, which friends would influence them, what dictated how they behaved within a group.45
Pomerantsev's account of the pre-social media days of scl seems in fact to give quite a lot of weight to the influence of long-term fieldwork on the development of scl research techniques.46 And of course the countries where scl worked in political, and later military and development, campaigns had long been, at least in the Anglo-American academe, the remit of anthropological knowledge. In its more recent data-driven incarnation, ca itself deployed several data scientists and scientific knowledge from academia, in particular from the University of Cambridge. The choice of the name was not simply an act of "prestige appropriation"; it also indicates an active recruitment of Cambridge data scientists and deployment of cutting edge research on internet psychographics developed at the university.47
According to leaked documents, ca used the bdi methodology, centred around Target Audience Analysis (taa), a method that is well established in Anglo-American military propaganda circles and has been discussed in many defence publications.48 The alleged innovation of this method in the propaganda and psychological operations (psyop) field is that it aims to study individual, social, cultural, political and economic characteristics of specific groups and their members before a specific campaign is launched.
This then informs the crafting of messages that are directed at such groups or some segments of them, and that build on the knowledge gained about a combination of various psychological and socio-cultural traits emerging from the analysis.
Messages are tailored to their target audiences, and knowing such audiences becomes a key priority for successful behavioural change campaigns. The original insights of taa were further developed by ca, which coupled them with the power of big data (including, but not limited to, personality profiling of social media users) with the aim of significantly increasing the effectiveness and reach of targeted messages.49

Internet Psychographics and Beyond
A whole host of arguments have been put forward by academics and public intellectuals to the effect of dismissing the possibility that ca might have in fact delivered Trump's victory. One common element in many of these analyses has been to focus on ca claim that it used internet psychographics to develop successful targeted messages that would bring a candidate to victory.
Many scholars argued that it is unlikely that this was the case, and some highlighted that this was probably just a hyped business message to promote the company, and that buying into that would in itself help the company get more clients.50 Others pointed out that the moral panic around the use and abuse of personal data from online use in the ca case was somewhat misleading, as microtargeting and the use of big data to influence people's behaviour have been a staple of commercial companies and the tech sector for quite some time.51 Few among the sceptics, however, stress the close links between the military and security establishment and ca, and the history of applied social scientific propaganda studies carried out by scl.52 On the whole, critics tend to focus narrowly on internet psychographics, no doubt driven by legitimate doubts about some of ca marketing spin possibly overstating their case.53 Many scholars who take more seriously the possibility that ca methods might have been effective also focus on internet psychographics and big data.54
For the purposes of this article, I would like to consider the arguments put forward by David Sumpter, a mathematician who works on modelling social behaviour and machine learning and published the popular nonfiction book Outnumbered. Sumpter's work provides a compelling and informative explanation of the complex statistical techniques behind internet algorithms, discussing a vast array of cases, including ca personality profiling claims.55 I argue that Sumpter's analysis, and some important contradictions in it, provide key elements to interpret ca whistleblowers' accounts. By translating complex technical matters for a non-specialist audience, Sumpter helps us humanise the ca story, and evaluate the scientific logics at work in the use and manipulation of statistical data in algorithms and of what is commonly lumped in the category of big data.
I will critique his claim that ca could not actually have done much with psychographics, if they ever used them at all. At the same time, I will use some of Sumpter's arguments about social statistics and big data to show why ca could have indeed gone quite far in successfully influencing voter behaviour to Trump's advantage. It should be noted that Sumpter published his book in June 2018, more than a year before the whistleblowers' accounts and the most recent company document leaks by one of the whistleblowers, so it is quite possible that the mathematician's scepticism about ca data capabilities was informed by the evidence that was available to him then.56
Both Christopher Wylie and Brittany Kaiser have no doubt that a massive amount of data was collected on the US population. Their estimates range from detailed profiles of "tens of millions of Americans …, with potentially hundreds of millions more to come"57 to "some 5,000 data points on every single American over the age of eighteen."58, 59 These data did not only come from Facebook, but also from the national census, credit histories and anything else ca could get its hands on. The exact amount and nature of the data collected remain unclear, and, according to Kaiser, the company would not go into details over what datasets it owned and how it obtained them.60 Building on rapidly growing research on the usage of social media data to determine personality traits, and in particular work carried out by Michal Kosinski, David Stillwell and other colleagues, Cambridge Analytica allegedly used the ocean personality model (Openness, Conscientiousness, Extraversion, Agreeableness and Neuroticism) to profile "large numbers of US voters."61, 62 Wylie claims that research from the Department of Psychology
56 In January and in October 2020, Brittany Kaiser leaked a substantial number of ca company documents through the Twitter account @HindsightFiles. The files leaked in January 2020 are not available through the original links provided in the Twitter account anymore, but most of them have been stored by the Organized Crime and Corruption Reporting
According to widely popularised research carried out by Youyou, Kosinski and Stillwell, computer models based on Facebook "likes" were better than friends, family, spouses and co-workers at assessing somebody's personality traits.65 Among other things, ca allegedly developed campaign messages for their American clients building on this body of research, based on the assumption that such messages were able to change the target's voting behaviour.66 Put this way, it does sound like a pretty bold claim, especially for a model that is discussed in the literature in rather crude terms. In a meta-analysis of research on personality assessment from social media data, Azucar, Marengo and Settanni state that: "individuals with high openness tend to have larger networks;" "[i]ndividuals with high conscientiousness appear to be cautious in managing their social media profiles;" "individuals with high extraversion have been characterized by higher levels of activity on social media … and have a greater number of friends … than introverted individuals;" and "[i]ndividuals with high neuroticism … use more negative words in their posts [while] agreeable individuals tend to use fewer swear words and express positive emotions more frequently in their posts."67 From an anthropological perspective, ocean looks like little more than a simplistic straitjacket that surely cannot capture and describe the nuances and complexities of individual behaviour in real-life social settings.
Simplistic or not, it is still worth taking a look at the available evidence: how was ocean actually used to craft campaign messages?
There is some evidence on the 2016 Trump campaign: in the pile of documents leaked by Kaiser in October 2020, a ca internal report dated November 2016 analyses ca activities during the campaign and their outcomes after the victory. As with other ca company documents, reading this report gives an eerie feeling, as it discusses in professional and technical language the effectiveness of campaign messages based on insults, smears and fake news. For instance, the project ca was tasked with by the Make America Number One Super pac (primarily funded by Robert Mercer) is referred to in the internal report with the rather explicit label "Defeat Crooked Hillary", then uncannily shortened, in classic business fashion, to dch.68 In the same vein, the report boasts that "[t]he entire [Make America Number One] team should take confidence in the knowledge that we did work other groups and individuals were unwilling to do in defeating Hillary Clinton."69 The main focus of the project was to target around 9 million voters nationwide, with "a special focus on New Hampshire, Pennsylvania, Virginia, North Carolina, Florida, Ohio, Iowa, Colorado, Nevada, and Michigan," battleground states which could have gone in either direction (i.e. Republican or Democrat) in the election.70 The details provided on the target audience are sketchy, but the report suggests that the bulk of the voters targeted by ca digital media campaigns included Clinton voters who could be persuaded either not to vote for Clinton, or to vote for Trump.71 Elsewhere in the same document, however, it is suggested that Republican voters were also targeted in "get out the vote" campaigns aimed at increasing their chances of showing up to vote.72 The report claims that the most successful ads were run on Facebook and Google Search, but the full range of platforms used included YouTube, Twitter, Google Display Network, SnapChat, Pandora's internet radio, email and traditional television.73
Model. In: Costa, P.T. and Widiger, T.A. eds. Personality Disorders and
Psychographics are discussed towards the end, in a section on email advertising. Here it is claimed that ca tested its ocean psychographics with two email campaigns.74 One email had a subject line that "was designed to be reassuring to people who ordinarily might have a propensity to worry": it read "Preserve Freedom and Overcome Hillary's Candidacy" and was crafted with voters scoring high on neuroticism in mind. The email was sent to a group of people who scored high on neuroticism and another group who did not. The first group had 20% higher open rates than the second.
The second email campaign targeted only people scoring high on neuroticism, and had three types of email subject: one with a reassuring message (for example, "Calm the storm, stop Hillary"), another one with a fear-driven message (for example, "Electing Hillary destroys our nation"), and a last one with a generic message (for example, "Information from Make America Number 1"). The fear-based subject line had a 10% higher open rate than the generic one, and 20% more opens than the reassuring one.
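The relative figures quoted from the report can be related to one another with a little arithmetic. The sketch below is only an illustration: it assumes that "higher open rate" and "more opens" both denote a relative (not percentage-point) difference between groups of equal size, and the 10% baseline open rate is a made-up placeholder, since the passages cited here give no absolute rates. The function name `relative_lift` is likewise hypothetical.

```python
def relative_lift(rate: float, baseline: float) -> float:
    """Relative difference of `rate` over `baseline` (0.20 means '20% higher')."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return rate / baseline - 1.0


# Hypothetical baseline: the cited passages give no absolute open rates.
generic_rate = 0.10

# Reported relations for the second email campaign, read as relative lifts:
#   fear-based subject = 10% higher open rate than the generic subject
#   fear-based subject = 20% more opens than the reassuring subject
fear_rate = generic_rate * 1.10
reassuring_rate = fear_rate / 1.20

# Implied comparison that the report, as quoted, does not state directly.
reassuring_vs_generic = relative_lift(reassuring_rate, generic_rate)

print(f"fear vs generic:       {relative_lift(fear_rate, generic_rate):+.1%}")
print(f"fear vs reassuring:    {relative_lift(fear_rate, reassuring_rate):+.1%}")
print(f"reassuring vs generic: {reassuring_vs_generic:+.1%}")
```

Under these assumptions the implied result does not depend on the placeholder baseline: among recipients scoring high on neuroticism, the reassuring subject line would have been opened roughly 8% less often than the generic one (1.10 / 1.20 ≈ 0.917).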
The report concludes that "[t]hese email campaigns demonstrate the effectiveness of psychographic profiling for enhancing email marketing campaigns."75 Kaiser's leaks also contain documents on the John Bolton Super pac, which funded victorious Republican candidates in the 2014 US mid-term elections in Arkansas and North Carolina.76 This set of documents provides important evidence about how ca allegedly approached the use of psychographics. In one example from North Carolina, the company documents discuss two groups that were targeted with different messages about the same issues: North Carolina Group 3 consisted of young, female voters who displayed high neuroticism and cared most about the economy, national security and immigration. These voters were shown advertisements that highlighted the failures of the current administration's national security policy.
North Carolina Group 4 consisted of an even split of male and female voters who displayed high conscientiousness and agreeableness. These voters cared most about the economy and education, so were shown advertisements that positioned national security as a family and social issue.77

Perhaps there is nothing here as mind-blowing as some sensationalistic accounts seem to imply, but nor is there anything that suggests outright that such an approach did not or could not work. At the very least, this evidence should push us to inquire further about the effectiveness of microtargeting in election campaigning.

75 Ibid., p. 89.
76 You can find the link to the John Bolton Super pac files leaked in January 2020 here: https://aleph.occrp.org/entities/43385707.074609bfc8dff1b3c9c63236c811d9964fdc3d05 (Accessed 24 October 2020). There is also a substantial set of additional documents on the use of psychographics in the same work for John Bolton Super pac in the October 2020 leaks; Cambridge Analytica - Select 2016 Related Documents.
77 The source document is available here: https://aleph.occrp.org/entities/d81dc9c2d6096cf748ad735624e3e181e9c93334.2c4a3fa5e945bf6ed82f2e580df940052f8b05ff#page=1 (Accessed 24 October 2020).

From the perspective of an expert modeller and statistician versed in such algorithmic methods, Sumpter tells us as much: computer models of human behaviour are often too simplistic to have high predictive validity, and when you compare humans with machine learning algorithms, usually humans match or beat the algorithms' predictions; he provides a vast array of examples to make this point.78 When it comes to ca, Sumpter tested the Facebook-based ocean model of Michal Kosinski and colleagues with an anonymised dataset of 20,000 users they released for psychology students.79 For four of the five ocean traits, the Facebook-based model recreated by Sumpter correctly predicted those traits in 60% of the cases, only slightly better than leaving it to chance (50%). For one trait (openness), the model predicted correctly in two thirds of the cases. Based on these tests, Sumpter's conclusion is that personality assessment does not work, and hence would have been unlikely to work in the case of ca as well.
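The 60% and two-thirds figures can be read in terms of audience composition. The sketch below is illustrative only: it assumes each trait is treated as a binary high/low label, that half the population is "high" (matching the 50% chance baseline quoted above), and that the model is equally accurate on both classes. None of these assumptions comes from Sumpter's tests; the function name and parameters are hypothetical.

```python
def share_truly_high(accuracy_on_high: float,
                     accuracy_on_low: float,
                     base_rate: float) -> float:
    """Share of people flagged as 'high' on a trait who really are high (precision)."""
    true_positives = accuracy_on_high * base_rate
    false_positives = (1 - accuracy_on_low) * (1 - base_rate)
    return true_positives / (true_positives + false_positives)


BASE_RATE = 0.5  # assumption: half the population scores 'high' on the trait

# Four traits at 60% accuracy, openness at two thirds (figures from the text).
for label, accuracy in (("60% accuracy", 0.60), ("two-thirds accuracy", 2 / 3)):
    flagged = share_truly_high(accuracy, accuracy, BASE_RATE)
    enrichment = flagged / BASE_RATE
    print(f"{label}: {flagged:.0%} of the flagged audience is truly 'high' "
          f"({enrichment:.2f}x a randomly chosen audience)")
```

Under these assumptions, an audience selected by the model contains proportionally more people with the targeted trait than a random audience does, while a large minority of those selected are still misclassified.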
There are, however, a number of other factors emerging from the available evidence, including claims Sumpter himself makes, that run counter to such a clear-cut conclusion. First, both Wylie's and Kaiser's accounts show that ca data operations were much more than just psychographics: with allegedly massive amounts of data about possibly hundreds of millions of Americans, ocean was just one tool in the data arsenal of the firm.80 Audience segmentation of voters carried out by ca was not just based on ocean, but included geolocation, census, health, credit and myriad other personal data that ca could get hold of. Both accounts also clearly say that internet psychographics were involved in only some of the models they used. The general point with algorithms, as Sumpter himself shows throughout his book, is that they can find all kinds of patterns and statistical correlations in user data that would not otherwise be detected with traditional statistical and qualitative research methods.
This means that algorithms can find valid patterns that humans who work with them, or other social scientists, might have no explanation for. Unlike earlier forms of artificial intelligence that tried to reproduce certain human logics, algorithms behind contemporary machine learning work on probability and do not need a "theory" to work.81 In some instances the colour of a button in political Facebook ads can be a decisive factor in clickthrough rates (how many times people click on the ad), and hence live experiments with the ads allow the online campaign team to make quick and continuous adjustments to improve the ads' effectiveness.82 Why do people with certain characteristics extracted from algorithmic analysis react better to one colour than another? That could be a whole research project, but fundamentally why that is so is irrelevant to how effective it is.

This also means that models based on the integration of ocean with other data and assumptions can potentially improve the efficacy of the political messages, regardless of how sound the assumptions are from the perspective of social scientific knowledge.

78 Sumpter, Outnumbered.
79 Ibid., pp. 50-54.
80 Wylie, Mindf*ck; Kaiser, Targeted.
The descriptions of the ca virtual dashboards available to campaigners suggest something akin to nsa-like real-time in-depth surveillance that can inform anything from physical campaigners knocking on somebody's door already knowing very detailed personal information about the house residents, to real-time automatic adjustments of the kind and quantity of political ads, and who they are targeted to, through monitoring sophisticated feedback statistics about their effects.83 The internal report on the effectiveness of ca tactics for the 2016 Trump campaign mentioned before throws light on what looks like a fine-tuned, relatively complex but also quite clear and well-planned digital methodology that supports and scales up human insight and decision-making beyond what human agency on its own could have achieved, with a relatively small and focused amount of resources. When it comes to money, we need to take ca claims with even more caution than usual, but the ca internal report states that they received around usd 5.6 million from Make America Number One Super pac, a relatively small amount for the reach the campaign claimed to have had.

81 Boellstorff makes a similar point when he notes that "[a]lgorithmic living is displacing artificial intelligence as the modality by which computing is seen to shape society: a paradigm of semantics, of understanding, is becoming a paradigm of pragmatics, of search. Contemporary computational language translation, for instance, does not work by trying to get a computer to intelligently understand language: systems like Google Translate work by matching texts from a vast corpus, without the computer ever 'knowing' what is said"; Boellstorff, T. (2015). Making Big Data, in Theory. In: Boellstorff, T. and Maurer, B., Data, Now, Denning, S.
If that were not enough, in 2014 ca allegedly started to experiment with the covert creation of pages and groups on Facebook and other social media to test campaign messages, without declaring the real owners and administrators, or their algorithmic work. When a critical mass of members or followers was reached, administrators and troll accounts would suggest physical meet-ups that would further invigorate, and often inflame, the targets, in a highly invasive process of behavioural change, without the targets having any real knowledge of the experiments they had become part of "in the wild."84 We do not know whether such experiments were carried out during the 2016 Trump election campaign, but there is no reason to exclude such a possibility.
Another aspect that Sumpter glosses over is that ca was not indiscriminately targeting all Americans. This holds even if the prediction rates of the models he tested were as low as his findings suggest, which might not be the case given that the data in ca's possession was likely far larger and perhaps of better quality than the student training dataset tested by Sumpter. As mentioned before, according to the ca internal report, they were targeting 9 million voters in several battleground states. With the distortions of the electoral college, they only needed to influence the voting behaviour of a relatively small fraction of people. Trump's victory came down to around 80,000 votes in Michigan, Pennsylvania and Wisconsin. That gave him enough electoral college votes to win the election, even though he trailed behind Clinton by nearly 3 million votes in the national count.85 Two of those three states (Michigan and Pennsylvania) were targeted by ca.
This means that, if the audience segmentation and big data analysis worked well enough, even a small increase in prediction rates over random chance could have shifted that number of voters with enough money invested in internet ads. Beyond the figure provided by the ca internal report of around usd 5.6 million, we know that the 2016 Trump campaign as a whole did spend big on Facebook targeted ads: usd 100 million, according to the campaign digital director Brad Parscale.86 Interestingly enough, Sumpter himself makes the same point later in his book, citing approvingly another piece of research about voter turnout.87 Bond and others ran a randomised controlled trial of political mobilisation messages on Facebook, in collaboration with the tech company.88 Around 60 million US Facebook users were shown a message with information about their polling station and a button that users could press to indicate that they had voted. The message was accompanied by the faces of Facebook friends who had already pressed the button. The researchers selected a smaller group of around 600,000 users: these users were shown the same message, but without their friends' faces. There was also a control group of around 600,000 users who were shown no message at all. They tested the hypothesis that the "social message" (the one with the friends' faces) was more likely to bring people to the polls than the message without the social element of the friends' faces. They found that this difference was small (0.39%), but when scaled up through such big numbers thanks to the power of social media, the effective difference in voting behaviour was substantial: the researchers estimated that the social message might have brought 60,000 new voters to the polls through the direct effect of the messaging, and 280,000 more voters through the online social contagion effect, as the messages influenced not only the Facebook users directly targeted, but also their friends and friends of friends. 
In his analysis of these results, Sumpter concludes that "[a] small nudge made a big change to the number of people participating in democracy."89 So why would the Cambridge Analytica case be any different, at least in principle? Why could ca not have influenced, through their data campaigning, the few tens of thousands of votes that delivered Trump's victory in battleground states?
The 2012 study that Sumpter mentions in such a positive light reviews other studies of voter mobilisation, arguing for a position that fits the ca case rather well: Voter mobilization experiments have shown that most methods of contacting potential voters have small effects (if any) on turnout rates, ranging from 1% to 10%. However, the ability to reach large populations online means that even small effects could yield behaviour changes for millions of people. Furthermore, as many elections are competitive, these changes could affect electoral outcomes.90
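The scale argument running through this section can be put as back-of-the-envelope arithmetic. The sketch below uses two figures quoted in the text (9 million targeted voters and a decisive margin of around 80,000 votes) and a set of effect rates that are purely hypothetical: 0.39% and the 1% to 10% range are borrowed from the turnout studies cited above, not measured for ca. It also treats the targeted voters as a single pool, ignoring that the states ca focused on and the three decisive states only partly overlap, so it shows orders of magnitude rather than estimating what happened.

```python
TARGETED_VOTERS = 9_000_000   # figure claimed in the ca internal report
DECISIVE_MARGIN = 80_000      # approximate combined margin cited in the text


def votes_shifted(pool: int, effect_rate: float) -> int:
    """Voters whose behaviour changes if `effect_rate` of the pool is influenced."""
    return round(pool * effect_rate)


# Share of the targeted pool that would equal the decisive margin.
break_even_rate = DECISIVE_MARGIN / TARGETED_VOTERS
print(f"break-even effect rate: {break_even_rate:.2%}")

# Hypothetical effect rates, borrowed from the turnout literature cited above.
for rate in (0.0039, 0.01, 0.10):
    shifted = votes_shifted(TARGETED_VOTERS, rate)
    verdict = "exceeds" if shifted > DECISIVE_MARGIN else "falls short of"
    print(f"effect rate {rate:.2%}: about {shifted:,} voters, "
          f"which {verdict} the margin")
```

On these assumptions, an effect of just under 1% of the targeted pool would match the margin. Whether ca's messaging achieved anything of that order is precisely what the available evidence cannot settle.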

Big Data, Human Insight and In-depth Qualitative Research
At another point in his book, Sumpter captures the essence of his argument: we are too easily swayed by the power of big data, and we often buy into hyped messages that emphasise the power of the machine, somewhat implying that it is the machine that provides insights and solutions, and not the humans working with it.
Commenting on the limitations of election polls and on the criticisms levelled against popular election polling blog FiveThirtyEight's mistaken prediction of a Clinton victory in 2016, Sumpter concludes that: Academic research has shown that polls are typically less accurate than prediction markets. As a result, FiveThirtyEight has to find a way of improving its predictions. There is no rigorous statistical methodology for making these improvements; they depend much more on the skill of the individual modeller in understanding what factors are likely to be important in the election. It is data alchemy: combining the statistics from the polls, with an intuition for what is going on in the campaign.91 The "data alchemy" Sumpter refers to is an important reminder that big data are used and manipulated by humans, and humans ultimately make the adjustments and the interpretations needed to make the models and the results useful and usable. But it would be incorrect to use this point as a criticism of the ca method.
Wylie's account confirms just that: it was ca employees, not the algorithms, that were using and abusing models and big data in conjunction with other methodologies, ultimately driven by their human insight. Wylie devotes several pages of his book to reporting on in-depth qualitative data collection with Americans in focus groups but also unstructured everyday settings.92 In a rather ethnographic fashion, Wylie ended up watching tv with some of the participants in their homes.93 These activities allegedly informed the data-driven models that were going to be deployed to collect data about, predict and ultimately modify voter behaviour. That is why the US fieldwork conducted in early 2014 was carried out by "sociologists and anthropologists, none of whom were American."94 Deploying a common argument held by many anthropologists, Wylie offers a paean to anthropology as a discipline that understands the deep otherness of societies such as the US, something that, according to the whistleblower, Americans themselves would not be able to do as uncritical insiders to their own society:

There's a tendency among Americans to see their country as exceptional, but we wanted to study it like we would study any country, using the same language and sociological approaches. It was fascinating to explore America this way, and because I am not American myself, I felt I was more able to cut through unquestioned assumptions of American culture and notice things that Americans don't see in themselves.95

Wylie goes on to exoticise Americans' "fetishistic" attitude towards guns with a vignette from a trip to rural Virginia, in ways that are now seen as rather questionable within anthropology itself, but would feature well in older anthropological accounts.96 A similar approach is used in other parts of the book to describe his encounters with right-wing Americans, as he and his colleagues were laying down the qualitative groundwork for big data modelling, personality profiling and other sophisticated forms of audience segmentation.

90 Bond et al., p. 295.
91 Sumpter, Outnumbered, p. 99.
With its numerous references to specific concepts and findings from qualitative social science, Wylie's narration is a clear example of the social theories and human insight behind big data. It makes sense, because by the time we start reading about the power of big data, we have been given the social scientific, human-centred conceptual frameworks to see what ca scientists were looking for, and how they allegedly weaponised that large amorphous mass of data they had gotten hold of. For anthropology and other cognate in-depth qualitative disciplines, the descriptions of ca's journey through qualitative fieldwork cannot but sound alarming: many ethnographers and other researchers carrying out in-depth qualitative studies will recognise elements of their own work, and yet to see where our data and analysis could end up is no pretty sight. It provides another angle on dual use anthropology: not only the dual use of those like Wylie and colleagues who consciously went on to work for companies that weaponised traditional qualitative research methods, but also the unwitting dual use of the research of academics who are not working for such companies and interests, but who might one day see their scientific findings redeployed for highly unethical purposes.
Again, Wylie's story should be read in context. And the context here is provided by the qualitative methods that scl, ca's parent company, had been using for decades. The basic insight behind Target Audience Analysis (taa) might not be groundbreaking, but perhaps this is exactly why the method seems feasible: if you want to change people's opinions and actions about something, do not bombard them indiscriminately with one message, but carry out in-depth studies of your target group, and then craft a variety of messages targeted at specific segments of that group.
Data is collected to find out the group's basic social, economic, psychological, political and cultural characteristics, and to understand its further articulation into subgroups with people holding different views and opinions, and various traits that can be exploited for behavioural change operations. As the 2016 Trump campaign and other populist campaigns across the world have shown us, you can target the most disparate groups with different messages, each tailored to what that group is susceptible to, in order to gather their support, or to suppress support for rival politicians.
The role of big data here is not that of some magical all-powerful tool that does away with human insight altogether, but rather of a powerful granular information infrastructure that can scale up and increase the effectiveness of such humanly crafted messages to reach out to people and social groups that would otherwise not be reached by conventional campaign methods. Two features make the US an ideal playground for this kind of operation: a very high internet and social media penetration that produces immense amounts of personal data of all kinds; and the absence of regulatory mechanisms protecting personal data from capture by private companies, which leaves companies free to do as they wish with the large masses of personal data made available.

Cambridge Analytica, Data Campaigning and the Future of the Human
To go back to the original question, my answer is that it is quite possible that ca could have effectively provided a significant advantage for Trump in his 2016 presidential election campaign. The discussion is relevant well beyond the confines of the 2016 US election and of the now defunct ca or its successors. ca operations should not be seen as a sensationalised story of "rogue" scientists, but rather as an effort to advance common techniques of contemporary digital capitalism such as audience segmentation and microtargeting in the field of election campaigning, closely related to the relentless rendering of human experience into data for the dual purpose of commodifying and securitising potentially all aspects of humanity.97 If companies like ca can carry out "persuasion" campaigns of such magnitude and depth, there is reason to worry: such methods are largely deployed without meaningful and truly informed consent, and without affected and targeted actors knowing that they have been manipulated and enrolled for purposes that are not their own. Zuboff's compelling treatment of "surveillance capitalism" (another label for digital capitalism) conceptualises this as the repurposing of human experience for others' ends; in this case the others are the corporations, the colluding politicians and the technological cadres that actively support the infrastructure of digital capitalism.
Many questions remain unanswered and the debates I have drawn upon in this article are still in their nascent form. The epistemological, methodological and empirical insights of anthropology can be of great service.
We need to dig deeper into what it is that makes big data valid and applicable and under which conditions, and explore in more detail how humans are effectively reduced to data points and reconstructed into modelled versions of the "real thing." Such an analysis would need to go back to key epistemological questions about the nature of modelling, about the status of qualitative vis-à-vis quantitative data, and about holism, complexity and discreteness. How are real people broken down into discrete units of data, and then recomposed in all kinds of assemblages through algorithms? What happens to humans and humanity in that process?
A subplot running through this article has to do with dual use and the role of anthropologists and other social scientists in furthering the scope and applicability of such data-driven tools. The question of ethical responsibility and the ongoing collusion between universities and military, intelligence and corporate interests is as cogent as ever.98 We need to build on the crucial work carried out on these topics by anthropologists, but we also have to revise and adapt their insights to a different organisational and political economy context.99 The ca data scandal was of a quite different magnitude, in terms of media coverage and popular reactions, from the Wikileaks saga and Edward Snowden's nsa exposé. In the latter cases, governments have been active in prosecuting and categorising whistleblowers as criminals and traitors, while this did not happen with ca and scl. If anything, the government actors who have been involved with ca and scl refuse to claim responsibility for their actions, and the companies' status as private contractors creates a grey area of responsibility, where governments' doctrine of plausible deniability is pushed to new heights. The ca whistleblowers' accounts read more like stories of corporate professionals breaking ethical guidelines than of agents of military and intelligence interests interfering in domestic and foreign politics.
What is dual use when the boundaries between corporate and academic research are increasingly blurred, and industry applications are seen as less militarised than classic forms of recruitment from state military and security agencies? What kind of ethical smoke and mirrors, self-justifications and public perceptions are at play when somebody is hired by ca rather than the cia?100

Finally, we should not lose focus on the more immediate goals of campaigns such as those carried out by ca for Trump in the US. This is probably the topic for another article, but the ca whistleblowers' accounts confirm that the ca case is as much about big data and dual use as it is about the weaponisation of racial, ethnic, gender, sexual and religious identities, with the return of "culture" to the centre-stage of political campaigning. Anthropologists have largely grown weary of the culture concept and of its essentialist connotations, but is it perhaps time to critically engage with "culture" again? Is Stroeken right when he claims that "the anthropologists' silence [on culture] condones for the larger public the hierarchy of cultures that is used to justify military intervention in Iraq, Afghanistan, and soon Africa?"101 Does this apply to today's right-wing propaganda operations and their weaponisation of culture too? Why do so many ethnographies of contemporary far right and populist movements not focus on the media and communication infrastructure that produces right-wing populist propaganda?102

98 See also Laterza, Cambridge Analytica.
99 See for instance Price, Cold War on dual use in the cold war period, and González, American Counterinsurgency on the recent deployment of social scientists in "human terrain" studies.
100 On anthropologists working for the cia during the Cold War, see Price, Cold War.
101 Stroeken, K. ed. (2012). War, Technology, Anthropology. Berghahn, p. 2. Consider for instance political scientist Samuel P. Huntington's notorious theory of the "clash of civilizations" as one example of a gross oversimplification of cultural difference that has been widely used to justify the military and security activities springing from the War On Terror launched by the US in the aftermath of the September 11 attacks; Huntington, S.P. (2011 [1996]).

Without strict scrutiny and regulatory frameworks, the kind of methods used by ca are likely to be deployed in current and future political campaigns. Online persuasion operations have certainly not been put on hold because of the pandemic, but could in fact become tools to weaponise the virus for geopolitical and biopolitical agendas that go beyond the immediacy of the Covid-19 health threat. These tactics are also likely to thrive in a context of even deeper and wider internet penetration, as social distancing and stay-at-home measures have pushed more and more people online and for longer parts of their daily life. In the 2020 US presidential election campaign, we have seen Trump's public messages repeatedly mention Covid-19 as the "China virus," further fuelling his hostile geopolitics against the Asian superpower.103 This was coupled with calls for "herd immunity" as a response to the pandemic: the proposal, rejected by most scientists, that letting the virus run through the community would build immunity and be a better alternative to the social and economic damages caused by lockdowns.104

This article and the recent work by Besteman and Gusterson, Boellstorff and Maurer, and Seaver, have hopefully shown the promise of an anthropological understanding of what Genevieve Bell calls "the socio-technical imagination" that underpins big data and algorithms.105 With an emphasis on human agency and on the multiple human and non-human logics that span across technology and society, anthropologists are particularly well positioned to study the processes of continuity and change that sustain rapidly evolving human-technology relations. Anthropology can help us understand what it means to be human in today's technological world and reflect on what the future has in store for humans as social, cultural, political and economic beings. Anthropologists should not miss the opportunity to contribute to these crucial debates. They should push the conversation further, within and beyond their disciplinary boundaries, and should engage with multiple audiences inside and outside academia.