<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta name="generator" content="pandoc" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<meta name="author" content="Ruben van de Ven, Ildikó Zonga Plájás, Cyan Bae, Francesco Ragazzi" />
<title>Algorithmic Security Vision: Diagrams of Computer Vision Politics</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
span.underline{text-decoration: underline;}
div.column{display: inline-block; vertical-align: top; width: 50%;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
.display.math{display: block; text-align: center; margin: 0.5rem auto;}
/*Filenames with code blocks: https://stackoverflow.com/a/58199362*/
div.sourceCode::before {
content: attr(data-filename);
display: block;
background-color: #cfeadd;
font-family: monospace;
font-weight: bold;
} </style>
<link rel="stylesheet" href="paper.css" />
</head>
<body>
<header id="title-block-header">
<h1 class="title">Algorithmic Security Vision: Diagrams of Computer
Vision Politics</h1>
<p class="author"><em>Ruben van de Ven, Ildikó Zonga Plájás, Cyan Bae,
Francesco Ragazzi</em></p>
<p class="date">December 2023</p>
</header>
<section id="abstract" class="level1">
<h1>Abstract</h1>
<div data-custom-style="Body Text">
<p>More images than ever are being processed by machine learning
algorithms for security purposes. Yet what technical and political
transformations do these sociotechnical developments create? This paper
charts the development of a novel set of practices which we term
"algorithmic security vision" using a method of diagramming-interviews.
Based on descriptions by activists, computer scientists and security
professionals, this article marks three shifts in security politics: the
emergence of synthetic data; the increased importance of movement,
creating a cinematic vision; and the centrality of error in the design
and functioning of these systems. The article then examines two tensions
resulting from these shifts: a fragmentation of accountability through
the use of institutionalized benchmarks, and a displacement of
responsibility through the reconfiguration of the human-in-the-loop. The
study of algorithmic security vision thus engenders a rhizome of
interrelated configurations. As a diagram of research, algorithmic
security vision invites security studies to go beyond a singular
understanding of algorithmic politics, and think instead in terms of
trajectories and pathways through situated algorithmic practices.</p>
</div>
</section>
<section id="introduction" class="level1">
<h1>Introduction</h1>
<div data-custom-style="Body Text">
<p>In cities and at borders around the world, algorithms process streams
of images produced by surveillance cameras. For decades, <em>computer
vision</em> has been used to analyze security imagery using arithmetic
to, for example, send an alert when movement is detected in the frame,
or when a perimeter is breached. The increases in computing power and
advances in (deep) machine learning have reshaped the capabilities of
such security devices. These devices no longer simply quantify vast
amounts of image sensor data but qualify it to produce interpretations
in previously inconceivable ways. Pilot projects and off-the-shelf
products are intended to distinguish individuals in a crowd, extract
information from hours of video footage, gauge emotional states,
identify potential weapons, discern normal from anomalous behavior, and
predict intentions that may pose a security threat. Security practices
are substantially reconfigured through the use of machine learning-based
computer vision, or "algorithmic security vision."</p>
</div>
<div data-custom-style="Body Text">
<p>Algorithmic security vision represents a convergence of security
practices and what Rebecca Uliasz calls <em>algorithmic vision</em>: the
processing of images using machine learning techniques to produce a kind
of "vision" that does not make sense of the “visual” but that makes
realities actionable (Uliasz, 2020). It does not promise to eradicate
human sense making, but rather allows a reconsideration of how human and
nonhuman perception is interwoven with sociotechnical routines.
Algorithmic security vision thus draws together actors, institutions,
technologies, infrastructures, legislations, and sociotechnical
imaginaries (see Bucher, 2018: 3). Yet how does algorithmic security
vision work — <em>how</em> does it draw together these entities — and
what are the social and political implications of its use? In this
article we explain how “algorithmic vision” and “security” can map out
sociotechnical practices and explore how their coming together reframes
what it means to see and suspect. We are not concerned with the
technical features of the systems, but with the societal and political
projects that are embedded in technical choices made in their
construction.</p>
</div>
<div data-custom-style="Body Text">
<p>We ground this article in Lucy Suchman's notions of
<em>figuration</em> and <em>configuration</em> as both a conceptual
frame and a method of analysis which offer insight into the interplay of
technology, imaginaries, and politics. <em>Configuration</em> allows
access to key assumptions about the boundaries that are negotiated in
the practices of algorithmic security vision, and how entities solidify
and stabilize as they circulate. The realities that are made possible by
algorithms are performed and perpetuated in the design and description
of these systems (Suchman, 2006: 239; see also Barad, 2007: 91).</p>
</div>
<div data-custom-style="Body Text">
<p>To grasp the specificities of algorithmic security vision, we turn to
the professionals who work with those technologies. How do people
working with algorithmic security vision make sense of, or
<em>figure</em>, their practices? An important dimension of this paper
is therefore methodological. Suchman, drawing on Haraway, mobilizes the
trope of the <em>figure</em> to examine the construction and circulation
of concepts: "to figure is to assign shape, designate what is to be made
noticeable and consequential, to be taken as identifying” (Suchman,
2012: 49). To expand on traditional textual analysis of such figurations
we introduce time-based diagramming where we combine qualitative
interviews with drawing. With these diagrams that record both voice and
the temporal unfolding of the drawing, the figurations appear in spatial
and temporal dimensions.</p>
</div>
<div data-custom-style="Body Text">
<p>We begin by situating our research in the debates on sensors,
algorithms and power, and outlining our theoretical and methodological
approach. Then, drawing on the time-based diagrams, we discuss three
figurations that challenge us to rethink our understandings of
algorithmic vision in security: algorithmic vision as synthetically
trained and cinematic, and the error as an inherent feature of
algorithmic vision. In a second step, we outline the fragmentation of
accountability through the use of benchmarks, and the reconfigurations
of the human-in-the-loop.</p>
</div>
</section>
<section id="sensors-algorithms-power" class="level1">
<h1>Sensors, Algorithms, Power</h1>
<div data-custom-style="Body Text">
<p>Critical reflection on the politics of algorithmic security
systems is not novel in Geography or in interdisciplinary debates
spanning Science and Technology Studies, Critical Security Studies and
Media Studies (Fourcade and Gordon, 2020; Graham, 1998; Mahony, 2021; Schurr
et al., 2023). Yet aside from a few exceptions (Andersen, 2018;
Bellanova et al., 2021), the politics specific to computer vision in the
security field have been overlooked.</p>
</div>
<div data-custom-style="Body Text">
<p><em>Computer vision</em> is a term used to designate a multiplicity
of algorithms that can process still or moving images, producing
information upon which human or automated systems can make decisions. It
is meant to replicate certain aspects of human cognition. Algorithms can
be used to segment parts of an image, detect and recognize objects or
faces, track people or objects, estimate motion in a video, or
reconstruct 3D models based on multiple photo or video perspectives
(Dawson-Howe, 2014).</p>
</div>
<div data-custom-style="Body Text">
<p>Some scholars working on algorithmic security have addressed the role
of "operative images," which are "images that do not represent and
object, but rather are part of an operation" (Farocki, 2004). Authors
have shown how algorithms organize the regimes of visibility in
platforms such as YouTube and Facebook (Andersen, 2015), in war and
especially in military drone strikes (Bousquet, 2018; Suchman, 2020;
Wilcox, 2017). Others have focused on machine-mediated vision at the
European border by analyzing the functioning of EUROSUR (Dijstelbloem et
al., 2017; Tazzioli, 2018) and SIVE (Fisher, 2018).</p>
</div>
<div data-custom-style="Body Text">
<p>These studies have contributed to theoretical debates around novel
practices of algorithmic power (Bucher, 2018), surveillance capitalism
(Srnicek and De Sutter, 2017; Zuboff, 2019) and platform politics
(Carraro, 2021; Gillespie, 2018). Some works have described the social
and political effects of surveillance and social sorting (Gandy, 2021;
Lyon, 2003), as well as the reinforcement of control and marginalization
of post-colonial, gendered and racialized communities (Fraser, 2019;
Thatcher et al., 2016), defined by Graham as "software-sorted
Geographies" (Graham, 2005).</p>
</div>
<div data-custom-style="Body Text">
<p>These debates have highlighted the entanglement of these technologies
with risk assessment and pre-emptive security logics (Amoore, 2014;
Aradau and Blanke, 2018). Critical work has started catching up with
machine learning as an algorithmic technique (Amoore, 2021; Mackenzie,
2017), marking a shift from the management of "populations" to
"clusters," the acceleration of knowledge feedback loops (Isin and
Ruppert, 2020), foregrounding the normalization of behavior through the
regulation of the "normal" and the "anomaly" (Aradau and Blanke, 2018).
Or, in Pasquinelli's words, how algorithms "normalize the abnormal
<em>in a mathematical way</em>" (Pasquinelli, 2015: 8, emphasis in
original).</p>
</div>
<div data-custom-style="Body Text">
<p>Yet what characterizes the state of the literature is a segmentation
between work on the politics of the “sensor,” and those on the political
specificities of deep learning models.</p>
</div>
<div data-custom-style="Body Text">
<p>On the one hand, using the notion of “sensor society,” Mark
Andrejevic and Mark Burdon (2015) have noted the prevalence of embedded
and distributed sensors. They describe a shift from targeted,
purposeful, and discrete forms of information collection to always-on,
ubiquitous, opportunistic, ever-expanding forms of data capture.
Andrejevic and Burdon insist that the sensors are only part of the
story; infrastructures are also critical: “It is […] the potential of
the automated processing of sensor-derived data that underwrites the
productive promise of data analytics in the sensor society: that the
machines can keep up with the huge volumes of information captured by a
distributed array of sensing devices” (Andrejevic and Burdon, 2015: 27).
Yet their focus is more on the sensors than on the underlying
algorithmic infrastructures.</p>
</div>
<div data-custom-style="Body Text">
<p>In their work on “sensory power,” Engin Isin and Evelyn Ruppert have
analyzed the effects of recent developments in technological
software and infrastructure. Unlike the three traditional forms of power
identified by Foucault (sovereign, disciplinary, and regulatory), they
argue that sensory power operates through apps, devices, and platforms
to collect and analyze data about individuals' bodies, behaviors, and
environments. For Isin and Ruppert, the central notion of sensory power
is the cluster. Clusters do not merely constitute "new" representations
of "old" populations, but rather “intermediary objects of government
between bodies and populations that a new form of power enacts and
governs through sensory assemblages” (Isin and Ruppert, 2020: 7).
Despite their contribution to thinking about sorting techniques and
their relations to new forms of power, Isin and Ruppert, like Andrejevic
and Burdon, bracket the specificities of the underlying deep learning
models.</p>
</div>
<div data-custom-style="Body Text">
<p>A growing body of literature has explored the politics of machine
learning techniques. In her latest work on the “deep border,” Louise
Amoore revisits her 2006 essay on the “biometric border.” Her focus is
on “deep machine learning,” and the “capacity to abstract and to
represent the relationships in high-dimensional data” such as in image
recognition (Amoore, 2021: 6). She shows that the change in border
technologies, from simple IF-THEN algorithmics with pre-determined
variables, to complex, deep, “neural networks” characterized by the
indeterminacy of variables marked a profound change in the logic, and
thus the political effects of these technologies. Like Isin and Ruppert,
she is interested in the notion of the “cluster,” which, “with its
attendant logic of iterative partitioning and rebordering, loosens the
state's application of categories and criteria in borders and
immigration” (Amoore, 2021: 6). Yet her approach overlooks the
importance of sensorial data posited by Andrejevic and Burdon, and Isin
and Ruppert.</p>
</div>
<div data-custom-style="Body Text">
<p>In sum, we still have only a rudimentary understanding of the
politics of algorithmic security vision. So, how does one think
politically about the new relations among sensors, algorithmic vision,
and politics? We propose a methodology for exploratory research that can
help outline a research agenda.</p>
</div>
</section>
<section id="methodology" class="level1">
<h1>Methodology</h1>
<section id="configuration-as-a-methodological-device" class="level2">
<h2>Configuration as a methodological device</h2>
<div data-custom-style="Body Text">
<p>Recent scholarship on technology and security has emphasized the
importance of algorithmic systems as enacted through relations between
human and nonhuman actors (Aradau and Blanke, 2015; Bellanova et al.,
2021; Hoijtink and Leese, 2019; Suchman, 2006). Sociotechnical systems
act in "co-production" (Jasanoff, 2004), as "actants" in a network
(Latour, 2005), or in "intra-action" (Barad, 2007). In these
understandings, technology forms an ontological assemblage, in which
human agency is tied in with the sociomaterial arrangements of which it
is part. Humans and non-humans, technological objects and
infrastructures, all populate complex, sometimes messy networks where
the boundaries between entities are enacted in situated practices
(Haraway, 1988). This conception of technology "draws attention to the
fact that these relations are not a given but that they are constructed
— and thereby relates them back to cultural imaginaries of what
technology should look like and how it should be positioned vis-à-vis
humans and society" (Leese, 2019: 45)</p>
</div>
<div data-custom-style="Body Text">
<p>In this context, how can we understand the characteristics and
effects of security systems built on the analysis of sensor data through
“deep learning,” and the new security politics that they introduce? On
the technical level, the novelty of “algorithmic security vision” does
not lie in the sensors themselves, but in the new abilities of
“artificial intelligence software” (McCosker and Wilken, 2020). The
promise of these systems is that the multiplication of sensors and
modalities of knowing, and the ability to create information feeds
under the scrutiny of automated systems, mean that data collection and
data analysis are no longer separated; surveillance can happen in real
time, capturing life as it unfolds, so that operators can act on
hotspots, clusters, or the moods and emotions of a crowd (Andrejevic and
Burdon, 2015; Isin and Ruppert, 2020).</p>
</div>
<div data-custom-style="Body Text">
<p>To make sense of such developments, Suchman's concept of
configuration is a useful methodological “toolkit.” It helps in
“delineating the composition and bounds of an object of analysis”
(Suchman, 2012: 48) and allows us to conceptualize algorithmic security
vision as heterogeneous assemblages of human and nonhuman elements whose
agency is "an effect of practices that are multiply distributed and
contingently enacted" (Suchman, 2006: 267). We are interested here in a
framework that underscores "how the entities that come into relation are
not given in advance, but rather emerge through the encounter with one
another" (van de Ven and Plájás, 2022: 52).</p>
</div>
<div data-custom-style="Body Text">
<p>Suchman also draws our attention to the “ways in which technologies
materialize cultural imaginaries, just as imaginaries narrate the
significance of technical artefacts” (2012: 48). For Suchman,
“configuration” is a tool for “studying technologies with particular
attention to the imaginaries and materialities that they join together”
(2012: 48). The configuration of humans and machines is constructed
through discourse and practice, which, drawing on Haraway, she
conceptualizes as “figurations.” Sociotechnical systems thus do not
exist without their intended uses and users. Such discourses are an
important part of individual experience, collective professional
practices, and narratives about technology. Technologies bring together
elements from various registers into stable material-semiotic
arrangements. Those configurations draw attention to the political
effects of everyday practices and how they institute bounded entities
and their relations. If we take Suchmans suggestion that algorithmic
security vision is complex and multiple, how can we get to "know" it as
an object of research, while acknowledging its partiality? When taking
the coming together of algorithms, vision and (in)security as
configuring imaginaries and practices in heterogeneous and complex
networks, how can we explore their politics?</p>
</div>
</section>
<section id="time-based-diagramming" class="level2">
<h2>Time-based Diagramming</h2>
<div data-custom-style="Body Text">
<p>Suchman defines <em>figuration</em> as “action that holds the
material and the semiotic together in ways that become naturalized over
time, and in turn requires unpacking to recover its constituent
elements” (2012: 49). The first step in her methodology therefore
requires us to “reanimate the figure at the heart of a given
configuration, in order to recover the practices through which it comes
into being and sustains its effects.”</p>
</div>
<div data-custom-style="Body Text">
<p>In her work, Suchman has used a variety of methods of inquiry to
“reanimate the figure.” Qualitative interviews and ethnography have been
instrumental in producing the raw material for the analysis. In this
paper, we expand the methodological toolkit envisaged by Suchman to
multimodal methods that go beyond text to capture the materiality of
imaginaries and practices. We explore the epistemic possibilities of
capturing figurations as both semiotic and material traces.</p>
</div>
<div data-custom-style="Body Text">
<p>The result of our theoretical and methodological quest is a tool
for “time-based diagramming.” We use this method for
both elicitation and multimodal data collection. We presented our
participants with a large digital tablet, and asked them to draw a
diagram while answering our questions. Ruben van de Ven programmed an
interface that could play back the recorded conversation in drawing and
audio. The participants could not delete or change their drawings, so
their hesitations and corrections remained. The ad hoc <em>figuring
out</em> of the participants' descriptions thus remains part of the
recording.<a href="#fn1" class="footnote-ref" id="fnref1"
role="doc-noteref"><sup>1</sup></a> In the phase of data analysis, the
software allows the diagrams to be annotated, creating short clips. The
diagrams thus enable a practice of combination and composition
(O'Sullivan, 2016), providing a material-semiotic support to analyze
various imaginaries of algorithmic security vision.</p>
</div>
<div data-custom-style="Body Text">
<p>Diagramming is a key method in the field of technology, most notably
in the conceptualization and design of computational practices
(Mackenzie, 2017; Soon and Cox, 2021: 221). We don't assume that the
materiality of the drawings brings us any closer to the materiality of
the actors' practices, which are of a different order. Our interest is
in the possibilities offered by the diagrams: they are composed of
elements that are not necessarily similar, but are connected by their
mere appearance on the same plane, thus allowing heterogeneous elements
to co-exist. Diagrams are composed of parts that can be separated and
recombined in different ways, creating new formations and expressions
(O'Sullivan, 2016). Using such a multimodal tool seemed a pertinent
methodological setup to capture <em>figurations</em> and
<em>configurations</em> (see van de Ven and Plájás, 2022)<em>.</em></p>
</div>
<div data-custom-style="Body Text">
<p>We interviewed twelve professionals who developed, deployed or
contested computer vision technologies in the field of (in)security.<a
href="#fn2" class="footnote-ref" id="fnref2"
role="doc-noteref"><sup>2</sup></a> We asked them to describe the coming
together of computer, vision and (in)security from their professional
vantage points.<a href="#fn3" class="footnote-ref" id="fnref3"
role="doc-noteref"><sup>3</sup></a> In what follows, we focus on three
figurations and two configurations that emerged from the diagrams.</p>
</div>
</section>
</section>
<section id="figurations-of-algorithmic-security-vision" class="level1">
<h1>Figurations of algorithmic security vision</h1>
<div data-custom-style="Figure">
<p><img src="assets//media/image1.png"
style="width:1.67431in;height:1.11736in" /><img
src="assets//media/image2.png"
style="width:2.04861in;height:1.11736in" /><img
src="assets//media/image3.png"
style="width:2.27986in;height:1.08056in" /></p>
</div>
<div data-custom-style="Caption">
<p>Diagram 1. Collage of excerpts from the conversations. Computer
vision is often depicted as camera-based. The third drawing depicts a
"sensor hotel" on top of a light post in the Burglary-Free
Neighborhood.</p>
</div>
<div data-custom-style="Body Text">
<p>To understand the politics involved in the introduction of
algorithmic vision in security practices, the first step was to see how
the practitioners we spoke with <em>figured</em> their own practices
through the use of our diagramming method. Our aim was to capture
through shapes, relations, associations, and descriptions, the actors,
institutions, technical artifacts, and processes in situated practices
of algorithmic security vision.</p>
</div>
<div data-custom-style="Body Text">
<p>When we asked our interviewees what unites practices of computer
vision in (in)security, they started by foregrounding the camera and the
(algorithmically processed) visual image. However, when they began
drawing these assemblages based on examples, complexities emerged. In an
example of crowd detection developed for the securitization of the
Hague's seaside boulevard, multiple sensors are installed on lampposts
and benches to count passersby. Based on behaviors and movement patterns
in the public space, operators can know how many people are on the
boulevard at a certain moment, and whether these are individuals, or
small or large groups — the latter of which might be seen as a potential
security threat. The Burglary-Free Neighborhood in Rotterdam uses a
“sensor hotel” installed under the hood of street lamps (Diagram 1)
where the trajectories of pedestrians are analyzed, alongside sounds such as
breaking glass, gunshots or screams. In the security assemblages
described by our interviewees, the camera is but one element. During the
diagramming, the figure of the visual is pushed out of focus.</p>
</div>
<div data-custom-style="Body Text">
<p>In analyzing the twelve diagrams, three central figurations in
camera-based algorithmic security practices emerged that help us to
rethink some central notions of the literature on algorithmic security:
(1) a figure of “vision” as increasingly trained synthetically, not
organically; (2) a figure of vision as cinematic and moving in time, not
photographic; (3) a figure of the error as a permanent dimension of
algorithmic vision, not as something that could be solved or
eliminated.</p>
</div>
<section id="from-skilled-vision-to-synthetic-vision" class="level2">
<h2>1. From skilled vision to synthetic vision</h2>
<div data-custom-style="Figure">
<p><img src="assets//media/image4.png"
style="width:3.38681in;height:2.57708in" /></p>
</div>
<div data-custom-style="Caption">
<p>Diagram 2. Sergei Miliaev distinguishes three sources of training
data for facial recognition technologies.</p>
</div>
<div data-custom-style="Body Text">
<p>In most of our conversations, algorithmic security vision is
understood to involve a particular subset of algorithms: deep neural
networks.<a href="#fn4" class="footnote-ref" id="fnref4"
role="doc-noteref"><sup>4</sup></a> Such a machine learning-based vision
brings to the fore one key dimension of security practices: the question
of training, and the ability to “see.” Training has been addressed as part
of the discussion around the socialization of security professionals
(Amicelle et al., 2015; Bigo, 2002), and algorithmic systems (Fourcade
and Johns, 2020) but scant attention has been paid to how training
elaborates upon and incorporates specific sets of skills.</p>
</div>
<div data-custom-style="Body Text">
<p>Some authors have explored the way in which the “seeing” of security
agents is trained at the border, by building on the literatures on
“skilled vision” (Maguire et al., 2014) or “vision work” (Olwig et al.,
2019). Maguire mobilized work in anthropology that locates human vision
as an “embodied, skilled, trained sense” (Grasseni, 2004: 41) that
informs standardized practices of local “communities of vision” (see
also Goodwin, 1994). Skilled vision is useful in that it draws attention
to the sociomaterial circumstances under which vision becomes a trained
perception (Grasseni, 2018: 2), and how it becomes uniform in
communities through visual apprenticeship. This literature examines the
production of “common sense” by taking training, exercise, peer
monitoring and other practices of visual apprenticeship as the locus of
attention. Yet these works fail to capture the specificities of the type
of machine learning we encountered in our research. How then is visual
apprenticeship reconfigured under algorithmic security vision?</p>
</div>
<div data-custom-style="Body Text">
<p>In our conversations, the “training of the algorithms” figures as a
key stake of algorithmic security vision. The participants in our
diagram interviews explained how deep learning algorithms are trained on
a multiplicity of visual data, which provides the patterns a system
should discriminate on. In Diagram 2, Sergei Miliaev, head of the facial
recognition research team at VisionLabs in Rotterdam, illustrated this
point.</p>
</div>
<div data-custom-style="Body Text">
<p>Miliaev distinguishes three sources for training images: web
scraping, “operational” data collected through its partners or clients,
and “synthetic” data. The first two options, Miliaev argues, have some
limitations. Under European data protection regulation it is very
difficult to obtain or be allowed to use data “from the wild” because it
is often illegal to collect data of real people in the places where the
algorithm will be used. Additionally, partners sometimes resist sharing
their operational footage outside of their own digital infrastructures.
Finally, when engineering a dataset, one cannot control what kind of
footage is encountered in “the wild.” This has led to the emergence of a
new phenomenon: training data generated in the lab.</p>
</div>
<div data-custom-style="Body Text">
<p>Synthetic training data is often collected by acting in front of a
camera. We see this in the case of intelligent video surveillance
(<em>Intelligente</em> <em>Videoüberwachung</em>) deployed in Mannheim
since 2018. Commenting on this case, chief of police
(<em>Polizeidirektor)</em> Dirk Herzbach explains that self-defense
trainers imitated 120 body positions to create the annotated data used
to train the behavior recognition technology. In another example, Gerwin
van der Lugt, developer of software that detects violent behavior,
stated that given the insufficiency of data available, they “rely on
some data synth techniques,” such as simulating violent acts in front of a
green screen. Sometimes even the developers, computer scientists or
engineers themselves re-enact certain movements or scenes for training
their algorithms. In Diagram 3, two developers involved in the project
at the seaside boulevard in Scheveningen give a striking example of how such
enactments of suspicious events require the upfront development of a
threat model that contains visual indicators that distinguish threat (a
positive detection) from non-threat (a negative detection). The acting
of the developers embeds these desirable and undesirable traits into the
computer model.</p>
</div>
<div data-custom-style="Figure">
<p><img src="assets//media/image5.png"
style="width:3.95in;height:3.075in" /></p>
</div>
<div data-custom-style="Caption">
<p>Diagram 3. Two developers involved in a project at the seaside
boulevard in Scheveningen describe the use of computer vision to
distinguish the legal use of balloons from their illegal use for inhaling
nitrous oxide gas.</p>
</div>
<div data-custom-style="Body Text">
<p>Sometimes the meaning of “synthetic” or “fake” data is stretched further. Sergei
Miliaev explains how in the context of highly sensitive facial
recognition algorithms, software companies use faces generated entirely
through artificial neural networks to train their algorithms. Miliaev
mentions Microsoft's DigiFace-1M (Bae et al., 2023), a training dataset
containing one million algorithmically generated faces. Such synthetic
training sets complicate the borders between sensor-originated and other
types of images. In the use of artificially generated images, one GPU
generates bytes that are interpreted by another. Algorithmic vision
occurs without direct reference to people or things that live outside
electronic circuits.</p>
</div>
<div data-custom-style="Body Text">
<p>These technical developments offer a changing figure of the skilled
vision of security that calls for new research directions. While these
technologies are still in their infancy, our interviewees see synthetic data as a
token of “better” and “fairer” technology that can circumvent racial
bias, as any minority can be generated to form an equal distribution in
the training dataset (Stevens and Keyes, 2021). But with an emerging
concern for algorithmic hallucination (Ji et al., 2023), glitches or
undesirable artifacts in the generated data, one wonders what kind of
vision is trained using such collections. Learning from synthetic data
thus produces an internalized vision, providing insights by circulating
data through a chain of artificial neural networks. While appearing in
new technological assemblages, the processing of images to form
archetypes is reminiscent of the composite photographs created by Galton
(1879). His composites were used to train police officers to identify
people as belonging to a particular group, circulating and reinforcing
the group boundaries based on appearance (Hopman and Mcharek, 2020).
Which boundaries does “fake” or synthetized training data perpetuate?
Skilled vision shifts attention to the negotiations that happen before
algorithmic vision is trained, such as how algorithmic vision depends on
access to data and regulations around data protection.</p>
</div>
<div data-custom-style="Body Text">
<p>With the use of synthetic data, the question of a "community of
lookers" — the embodied social and material practices through which
apprenticeship is perpetuated — appears in a new light. Such a community
becomes more dispersed as generative models circulate freely online. For
instance, a generative model from Microsoft, trained on images shared
online, is used for training an authentication system in the Moscow
Metro system. Such models are informed by communities of looking from
which their training data is sourced, and the norms of that platform.
These norms then circulate with the model and become "plugged-in" to
other systems. Algorithmic vision, trained on synthetic data, is thus a
composable vision, in which different sources of training data mobilize
imagery from all kinds of aesthetic apprenticeships. The cascading of
generative and discriminative models thus reshapes security practices.
Furthermore, to comprehend changes in the politics of vision, attention
to the training of vision, as a moment of standardization and
operationalization, could be extended to the training of security
professionals.</p>
</div>
</section>
<section id="figuring-time-from-photographic-to-cinematic-vision"
class="level2">
<h2><span data-custom-style="Heading 3 Char">2. Figuring time: from
photographic to cinematic vision</span></h2>
<div data-custom-style="Body Text">
<p>Conversations with practitioners revealed yet another dimension of
the figure of vision in flux: its relation to time and movement. Deep
learning-based technologies distinguish themselves from earlier
algorithmic security systems based on their status as prediction models,
which by definition raises questions on the temporal dimensions of their
processing (Sudmann, 2021). Yet, how algorithmic security vision
reconfigures temporalities has yet to receive scholarly attention in
critical security studies and related disciplines. While literature on
border studies has located border security in multiple places and
temporalities (e.g. Bigo and Guild, 2005), scholarship on image-based
algorithmic security practices has often focused on a
photography-centric paradigm: biometric images (Pugliese, 2010), facial,
iris and fingerprint recognition (Møhl, 2021),
and body scanners (Leese, 2015). These technologies capture immutable
features of suspect identities. In the diagrams, however, vision appears
less static. Instead, two central dimensions of the figure of vision
appear: the ability to capture and make sense of the movement of the
bodies in a fixed space, and the movement of bodies across spaces.</p>
</div>
<div data-custom-style="Body Text">
<p>On the first point, we notice increasing attention to corporeality,
how physical movements render certain individuals suspicious. This
process takes place through the production and analysis of motion by
composing a sequence of frames. Gerwin van der Lugt, who helped develop
a violence detection algorithm at Oddity.ai, stresses how “temporal
information integration” is the biggest technical challenge in detecting
violence in surveillance footage: a raised hand might be either a punch
or a high-five. In Diagram 4, van der Lugt visualizes the differences
between the static and dynamic models. A first layer of pose or object
detection often analyzes a merely static image. Oddity.ai then uses
custom algorithms to integrate individual detections into one that
tracks movement. It is then the movement that can be assessed as violent
or harmless. From these outputs, Oddity.ai runs “another [...] process
that [they] call temporal information integration—it's quite
important—to [...] find patterns that are [even] longer.” This case
illustrates how algorithmic security vision temporarily attributes risk
to bodies, in accordance with the ways violence is imagined and
choreographed in the training data.</p>
</div>
<div data-custom-style="Figure">
<p><img src="assets//media/image6.png"
style="width:5.80694in;height:4.61944in" /></p>
</div>
<div data-custom-style="Caption">
<p>Diagram 4. Top: Frame 1 is processed by YOLO, an object detection
model, producing Output 1 (O1). Other frames are processed
independently. Bottom: Frames 1 to 10 are combined for processing by the
customized model (“M”), where it produces outputs (O 1-10, O 11-20).
These outputs are then processed in relation to one another by the
temporal information integration to find body patterns over longer
periods. Drawn by Gerwin van der Lugt.</p>
</div>
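<div data-custom-style="Body Text">
<p>A minimal sketch can illustrate the two-stage structure described in
Diagram 4: per-frame detections are first produced independently, and
only a second, temporal step aggregates them over a window of frames to
assess movement. This is our own illustration, not Oddity.ai's code;
the detector stub, window size and threshold are hypothetical.</p>
</div>
<div class="sourceCode">
<pre class="sourceCode python"><code class="sourceCode python">from typing import List, Dict

def detect_frame(frame) -&gt; Dict[str, float]:
    """Stage 1 (illustrative stand-in for a per-frame detector such as YOLO):
    returns a static, single-frame score, e.g. 'raised hand detected'."""
    return {"raised_hand": frame.get("raised_hand", 0.0)}

def temporal_integration(outputs: List[Dict[str, float]], window: int = 10,
                         threshold: float = 0.6) -&gt; List[bool]:
    """Stage 2 (illustrative): combine per-frame outputs over a sliding
    window so that only sustained movement patterns, not single frames,
    can be flagged as violent."""
    flags = []
    for start in range(0, len(outputs) - window + 1):
        chunk = outputs[start:start + window]
        mean_score = sum(o["raised_hand"] for o in chunk) / window
        flags.append(mean_score &gt; threshold)  # a punch, not a high-five?
    return flags

# Hypothetical stream of 20 frames with a burst of movement in the middle.
frames = [{"raised_hand": 0.9 if 5 &lt;= i &lt;= 14 else 0.1} for i in range(20)]
per_frame_outputs = [detect_frame(f) for f in frames]   # O1, O2, ...
print(temporal_integration(per_frame_outputs))           # window-level flags</code></pre>
</div>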
<div data-custom-style="Body Text">
<p>Our interviewees figured movement in a second way. Bodies are tracked
in space, leading to an accumulation of suspicious data over time. Ádám
Remport explained how facial recognition technology (FRT) works by
drawing a geographical map featuring building blocks and streets. In
this map (Diagram 5), Person A could visit a bar, a church, or an
NGO.</p>
</div>
<div data-custom-style="Figure">
<p><img src="assets//media/image7.png"
style="width:5.13194in;height:2.37014in" /></p>
</div>
<div data-custom-style="Caption">
<p>Diagram 5. Ádám Remport explains how a person's everyday routes can
be inferred when facial recognition technology is deployed in various
sites, montaging a local, photographic vision into spatio-temporal,
cinematic terms.</p>
</div>
<div data-custom-style="Body Text">
<p>If “FRT is fully deployed and constantly functioning,” explains
Remport, people can be “followed wherever [they] go.” Remport's drawing
therefore suggests that in this setting it is not important to be able
to identify the person under surveillance; what matters is that this
person can be tracked over different surveillance camera feeds. The
trajectories of bodies and their “signature” marked through the
reconstruction of their habitual movements through space are used as a
benchmark for the construction of suspicion. Cinematic vision is thus
made possible thanks to the broader infrastructure that allows for the
collection and analysis of data over longer periods of time, and their
summarization through montage.</p>
</div>
<div data-custom-style="Body Text">
<p>The emerging centrality of movement thus opens up a new research
agenda for security, focused not on <em>who</em> and what features are
considered risky, but <em>when</em>, and through which movements
specific bodies become suspicious. While earlier studies on biometric
technologies have located the operational logic in identification,
verification, and authentication, thus <em>knowing</em> the individual
(Ajana, 2013; Muller, 2010), figuring algorithmic security vision as
cinematic locates its operational logic in the mobility of embodied life
(see Huysmans, 2022). While many legal and political debates revolve
around the storage of images as individual frames, and the privacy
issues involved, less is known about the consequences of putting these
frames into a sequence on a timeline and the movements that emerge
through the integration of frames over time.</p>
</div>
</section>
<section id="managing-error-from-the-sublime-to-the-risky-algorithm"
class="level2">
<h2>3. Managing error: from the sublime to the risky algorithm</h2>
<div data-custom-style="Body Text">
<p>Our third emerging figuration concerns the place of the error. A
large body of literature examines actual and speculative cases of
algorithmic prediction based on self-learning systems (Azar et al.,
2021). Central to these analyses is the boundary-drawing performed by
such algorithmic devices, enacting (in)security by rendering their
subjects as more- or less-risky others (Amicelle et al., 2015: 300;
Amoore and De Goede, 2005; Aradau et al., 2008; Aradau and Blanke, 2018)
based on a spectrum of individual and environmental features (Calhoun,
2023). In other words, these predictive devices conceptualize risk as
something produced by, and thus external to, security technologies.</p>
</div>
<div data-custom-style="Body Text">
<p>In this critical literature on algorithmic practices, practitioners
working with algorithmic technologies are often critiqued for
understanding software as “sublime” (e.g. Wilcox, 2017: 3). However, in
our diagrams, algorithmic vision appears as a practice of managing
error. The practitioners we interviewed are aware of the error-prone
nature of their systems, know they will never be perfect, and treat error as
a key metric that needs to be acted upon.</p>
</div>
<div data-custom-style="Body Text">
<p>The most prominent way in which error figures in the diagrams is in
its quantified form of the true positive and false positive rates, TPR
and FPR. The significance and definition of these metrics are stressed by
CTO Gerwin van der Lugt (Diagram 6). In camera surveillance, the false
positive rate could be described as the number of false positive
classifications relative to the number of video frames being analyzed.
Upon writing them down, van der Lugt corrected his initial
definitions, as they determine the work of his development
team, the ways in which his clients — security operators — engage with
the technology, and whether they perceive the output of the system as
trustworthy.</p>
</div>
<div data-custom-style="Figure">
<p><img src="assets//media/image8.png"
style="width:4.36111in;height:2.29028in" /></p>
</div>
<div data-custom-style="Caption">
<p>Diagram 6. Gerwin van der Lugt corrects his initial definitions of
the true positive and false positive rates, and stresses the importance
of their precise definition.</p>
</div>
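<div data-custom-style="Body Text">
<p>To make the arithmetic behind these metrics concrete, the following
minimal sketch (ours, not Oddity.ai's code) computes a true positive
rate and a frame-based false positive rate from hypothetical detection
counts, following the definitions described above. All variable names
and numbers are illustrative assumptions.</p>
</div>
<div class="sourceCode">
<pre class="sourceCode python"><code class="sourceCode python"># Illustrative sketch: the evaluation metrics discussed in the interview.
# All counts are hypothetical; real systems aggregate them per video stream.

true_positives = 45           # violent incidents correctly flagged
false_negatives = 5           # violent incidents missed
false_positives = 12          # harmless frames flagged as violent
frames_analyzed = 10_000_000  # total video frames processed

# True positive rate: share of actual incidents that were detected.
tpr = true_positives / (true_positives + false_negatives)

# Frame-based false positive rate, as described above:
# false alarms relative to the number of frames analyzed.
fpr_per_frame = false_positives / frames_analyzed

print(f"TPR: {tpr:.2%}")
print(f"False positives per million frames: {fpr_per_frame * 1e6:.1f}")</code></pre>
</div>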
<div data-custom-style="Body Text">
<p>The figuration of algorithmic security vision as inherently imprecise
affects the operationalization of security practices. Van der Lugt's
example concerns whether the violence detection algorithm developed by
Oddity.ai should be trained to categorize friendly fighting
(<em>stoeien</em>) between friends as “violence” or not. In this
context, van der Lugt finds it important to differentiate what counts as
false positive in the algorithm's evaluation metric from an error in the
algorithm's operationalization of a security question.</p>
</div>
<div data-custom-style="Body Text">
<p>He gives two reasons to do so. First, he anticipates that the
exclusion of <em>stoeien</em> from the category of violence would
negatively impact the TPR. In the iterative development of self-learning
systems, the TPR and FPR, together with the true and false
<em>negative</em> rates, must perform a balancing act. Van der Lugt
outlines that with their technology they aim for fewer than 100 false
positives per 100 million frames per week. The FPR becomes indicative of
the algorithm's quality, as too many faulty predictions will desensitize
the human operator to system alerts.</p>
</div>
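<div data-custom-style="Body Text">
<p>To give a sense of scale, a short back-of-the-envelope calculation
(our own arithmetic, assuming a 25 fps camera, a figure not given in the
interview) translates this target of fewer than 100 false positives per
100 million frames per week into false alarms per camera.</p>
</div>
<div class="sourceCode">
<pre class="sourceCode python"><code class="sourceCode python"># Reading of the stated target: fewer than 100 false positives
# per 100 million frames per week. The frame rate is an assumption.

max_false_positives = 100
frames_target = 100_000_000

fp_per_frame = max_false_positives / frames_target           # 1e-6

fps = 25                                                      # assumed
frames_per_camera_week = fps * 60 * 60 * 24 * 7               # 15,120,000

# Implied ceiling of false alarms for one always-on camera in a week.
fp_per_camera_week = fp_per_frame * frames_per_camera_week    # ~15

print(f"Allowed false positives per frame: {fp_per_frame:.0e}")
print(f"Implied false alarms per camera per week: {fp_per_camera_week:.1f}")</code></pre>
</div>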
<div data-custom-style="Body Text">
<p>This leads to van der Lugt's second point: he fears that the
exclusion of <em>stoeien</em> from the violence category might cause
unexpected biases in the system. For example, instead of distinguishing
violence from <em>stoeien</em> based on people's body movements, the
algorithm might make the distinction based on their age. For van der
Lugt, this would be an undesirable and hard-to-notice form of
discrimination. In developing algorithmic (in)security, error is figured
not merely as a mathematical concept but (as shown in Diagram 6) as a
notion that invites pre-emption — a mitigation of probable failure — for
which the developer is responsible. The algorithmic condition of
security vision is figured as the pre-emption of error.</p>
</div>
<div data-custom-style="Figure">
<p><img src="assets//media/image9.png"
style="width:3.91944in;height:3.06806in" /></p>
</div>
<div data-custom-style="Caption">
<p>Diagram 7. By drawing errors on a timeline, van Rest calls attention
to the pre-emptive nature of error in the development process of
computer vision technologies.</p>
</div>
<div data-custom-style="Body Text">
<p>According to critical AI scholar Matteo Pasquinelli, “machine
learning is technically based on formulas for error correction” (2019:
2). Therefore, any critical engagement with such algorithmic processes
needs to go beyond citing errors, “for it is precisely through these
variations that the algorithm learns what to do” (Amoore, 2019: 164),
pushing us to reconsider any argument based on the inaccuracy of the
systems.</p>
</div>
<div data-custom-style="Body Text">
<p>The example of <em>stoeien</em> suggests that it is not so much a
question of whether, or how much, these algorithms err, but how these errors are
anticipated and negotiated. Thus, taking error as a hallmark of machine
learning, we can see how practices of (in)security become shaped by the
notion of mathematical error well beyond their development stages. Error
figures centrally in the development, acquisition and deployment of such
devices. As one respondent indicated, predictive devices are inherently
erroneous, but the quantification of their error makes them amenable to
"risk management.”</p>
</div>
<div data-custom-style="Body Text">
<p>While much has been written about security technologies as devices
<em>for</em> risk management, little is known about how security
technologies are conceptualized as objects <em>of</em> risk management.
What happens then in this double relation of risk? The figure of the
error enters the diagrams as a mathematical concept; throughout the
conversations we see it permeate the discourse around
algorithmic security vision. By figuring algorithmic security vision
through the notion of error, risk is placed at the heart of the security
apparatus.</p>
</div>
</section>
</section>
<section
id="con-figurations-of-algorithmic-security-vision-fragmenting-accountability-and-expertise"
class="level1">
<h1>Con-figurations of algorithmic security vision: fragmenting
accountability and expertise</h1>
<div data-custom-style="Body Text">
<p>In the previous section we explored the changing <em>figurations</em>
of key dimensions of algorithmic security vision; in this section we
examine how these figurations <em>configure</em>. For Suchman, working
with configurations highlights “the histories and encounters through
which things are figured <em>into meaningful existence</em>, fixing them
through reiteration but also always engaged in the perpetuity of coming
to be that characterizes the biographies of objects as well as
subjects” (Suchman, 2012: 50, emphasis ours). In other words, we are
interested in the practices and tensions that emerge as figurations
become embedded in material practices. We focus on two con-figurations
that emerged in the interviews: the delegation of accountability to
externally managed benchmarks, and the displacement of responsibility
through the reconfiguration of the human-in-the-loop.</p>
</div>
<section id="delegating-accountability-to-benchmarks" class="level2">
<h2>Delegating accountability to benchmarks</h2>
<div data-custom-style="Body Text">
<p>The first configuration is related to the evaluation of the error
rate in the training of algorithmic vision systems: it involves
datasets, benchmark institutions, and the idea of fairness as equal
representation among different social groups. Literature on the ethical
and political effects of algorithmic vision has notably focused on
the distribution of errors, raising questions of ethnic and racial bias
(e.g. Buolamwini and Gebru, 2018). Our interviews reflect the concerns
of much of this literature as the pre-emption of error figured
repeatedly in relation to the uneven distribution of error across
minorities or groups. In Diagram 8, Ádám Remport draws how different
visual traits have often led to different error rates. While the general
error metric of an algorithmic system might seem "acceptable," it
actually privileges particular groups, which is invisible when only the
whole is considered. Jeroen van Rest distinguishes such errors, as
systemic biases (Diagram 7), from the inherent algorithmic imprecision
of deep machine learning models, since they perpetuate inequalities in the
society in which the product is being developed.</p>
</div>
<div data-custom-style="Body Text">
<p>To mitigate these concerns and manage their risk, many of our
interviewees who develop and implement these technologies externalize
the reference against which the error is measured. They turn to a
benchmark run by the American National Institute of Standards and
Technology (NIST), which ranks facial recognition technologies from
different companies by their error metrics across groups. John Riemen,
who is responsible for the use of forensic facial recognition technology
at the Center for Biometrics of the Dutch police, describes how their
choice of software is driven by a public tender that demands a "top-10"
score on the NIST benchmark. The mitigation of bias is thus outsourced
to an external, and in this case foreign, institution.</p>
</div>
<div data-custom-style="Figure">
<p><img src="assets//media/image10.png"
style="width:6.05417in;height:2.16389in" /></p>
</div>
<div data-custom-style="Caption">
<p>Diagram 8. Ádám Remport describes that facial recognition
technologies are often most accurate with white male adult faces,
reflecting the datasets they are trained with. The FPR is higher with
people with darker skin, children, or women, which may result in false
flagging and false arrests.</p>
</div>
<div data-custom-style="Body Text">
<p>We see in this outsourcing of error metrics a form of delegation that
brings about a specific regime of (in)visibility. While a particular
kind of algorithmic bias is rendered central to the NIST benchmark, the
mobilization of this reference obfuscates questions on how that metric
was achieved. That is to say, questions about training data are
invisibilized, even though that data is a known site of contestation.
For example, the NIST benchmark datasets are known to include faces of
wounded people (Keyes, 2019). The Clearview company is known to use
images scraped illegally from social media, and IBM uses a dataset that
is likely in violation of European GDPR legislation (Bommasani et al.,
2022: 154). Pasquinelli (2019) argued that machine learning models
ultimately act as data compressors: enfolding and operationalizing
imagery of which the terms of acquisition are invisibilized.</p>
</div>
<div data-custom-style="Body Text">
<p>Attention to this invisibilization reveals a discrepancy between the
developers and the implementers of these technologies. On the one hand,
the developers we interviewed expressed concerns about how their
training data is constituted to achieve an optimal balance between false
positive and true positive rates (FPR/TPR), while showing concern for the legality of
the data they use to train their algorithms. On the other hand,
questions about the constitution of the dataset have been virtually
non-existent in our conversations with those who implement software that
relies on models trained with such data. Occasionally this knowledge was
considered part of the developers' intellectual property that had to be
kept a trade secret. A high score on the benchmark is enough to settle
questions of fairness, legitimizing the use of the algorithmic model.
Thus, while an algorithm indirectly relies on the source data, that data
is no longer deemed relevant when the algorithm is considered. This illustrates
well how the invisibilization of the dataset “compressed,” in
Pasquinelli's terms, into a model, together with the formalization of guiding
metrics into a benchmark, permits a bracketing of accountability. One
does not need to know how outcomes are produced, as long as the
benchmarks are in order.</p>
</div>
<div data-custom-style="Body Text">
<p>The configuration of algorithmic vision's bias across a complex
network of fragmented locations and actors, from the dataset to the
algorithm to the benchmark institution, reveals the selective processes
of (in)visibilization. This opens up fruitful avenues for new empirical
research: What are the politics of the benchmark as a mechanism of
legitimization? How does the outsourcing of assessing the error
distribution impact attention to bias? How has the critique of bias been
institutionalized by the security industry, resulting in the
externalization of accountability, through dis-location and
fragmentation?</p>
</div>
</section>
<section id="reconfiguring-the-human-in-the-loop" class="level2">
<h2>Reconfiguring the human-in-the-loop</h2>
<div data-custom-style="Body Text">
<p>A second central question linked to the delegation of accountability
is the configuration in which the security operator is located. The
effects of delegation and fragmentation, in which the mitigation of
algorithmic errors is outsourced to an external party, become visible
in the ways in which the role of the security operator is configured in
relation to the institution they work for, the software's assessment,
and the affected publics.</p>
</div>
<div data-custom-style="Body Text">
<p>The public critique of algorithms has often construed the
<em>human-in-the-loop</em> as one of the last lines of defense in the
resistance to automated systems, able to filter and correct erroneous
outcomes (Markoff, 2020). The literature in critical security studies
has however problematized the representation of the security operator in
algorithmic assemblages by discussing how the algorithmic predictions
appear on their screen (Aradau and Blanke, 2018), and how the embodied
decision making of the operator is entangled with the algorithmic
assemblage (Wilcox, 2017). Moreover, the operator is often left guessing
at the working of the device that provides them with information to make
their decision (Møhl, 2021).</p>
</div>
<div data-custom-style="Body Text">
<p>What our participants' diagrams emphasized is how a whole spectrum of
system designs emerges in response to similar questions, for example the
issue of algorithmic bias. A primary difference can be found in the
degree of understanding of the systems that is expected of security
operators, as well as their perceived autonomy. Sometimes, the human
operator is central to the system's operation, forming the interface
between the algorithmic systems and surveillance practices. Gerwin van
der Lugt, developer of software at Oddity.ai that detects criminal
behavior, argues that “the responsibility for how to deal with the
violent incidents is always [on a] human, not the algorithm. The
algorithm just detects violence—that's it—but the human needs to deal
with it.”</p>
</div>
<div data-custom-style="Body Text">
<p>Dirk Herzbach, chief of police at the Police Headquarters Mannheim,
adds that when alerted to an incident by the system, the operator
decides whether to deploy a police car. Both Herzbach and Van der Lugt
figure the human-in-the-loop as having full agency and responsibility in
operating the (in)security assemblage (cf. Hoijtink and Leese,
2019).</p>
</div>
<div data-custom-style="Body Text">
<p>Some interviewees drew a diagram in which the operator is supposed to
be aware of the ways in which the technology errs, so they can address
them. Several other interviewees considered the technical expertise of
the human-in-the-loop to be unimportant, even a hindrance.</p>
</div>
<div data-custom-style="Body Text">
<p>Chief of police Herzbach prefers an operator to have patrol
experience to assess which situations require intervention. He is
concerned that knowledge about algorithmic biases might interfere with
such decisions. In the case of the Moscow metro, in which a facial
recognition system has been deployed for ticket purchase and access
gates, the human-in-the-loop is reconfigured as an end user who needs to
be shielded from the algorithm's operation (cf. Lorusso, 2021). On
these occasions, expertise on the technological creation of the suspect
becomes fragmented.</p>
</div>
<div data-custom-style="Body Text">
<p>These different figurations of the security operator are held
together by the idea that the human operator is the expert of the
subject of security, and is expected to make decisions independent from
the information that the algorithmic system provides.</p>
</div>
<div data-custom-style="Figure">
<p><img src="assets//media/image11.png"
style="width:5.80694in;height:2.79375in" /></p>
</div>
<div data-custom-style="Caption">
<p>Diagram 9. Riemen explains the process of information filtering that
is involved in querying the facial recognition database of the Dutch
police.</p>
</div>
<div data-custom-style="Body Text">
<p>Other drivers exist, however, to shield the operator from the
algorithm's functioning, challenging individual expertise and
acknowledging the fallibility of human decision making. In Diagram 9,
John Riemen outlines the use of facial recognition by the Dutch police.
He describes how data from the police case and on the algorithmic
assessment is filtered out as much as possible from the information
provided to the operator. This, Riemen suggests, might reduce bias in
the final decision. He adds that there should be no fewer than three
humans-in-the-loop who operate independently to increase the accuracy of
the algorithmic security vision.</p>
</div>
<div data-custom-style="Body Text">
<p>Instead of increasing their number, there is another configuration of
the human-in-the-loop that responds to the fallibility of the operator.
For the Burglary-Free Neighborhood project in Rotterdam, project manager
Guido Delver draws surveillance as operated by neighborhood residents,
through a system that they own themselves. By involving different
stakeholders, Delver hopes to counter government hegemony over the
surveillance apparatus. However, residents are untrained in assessing
algorithmic predictions, which raises new challenges. Delver illustrates
a scenario in which the algorithmic signaling of a potential burglary
may have dangerous consequences: “Does it invoke the wrong behavior from
the citizen? [They could] go out with a bat and look for the guy who has
done nothing [because] it was a false positive.” In this case, the worry
is that the erroneous predictions will not be questioned. Therefore, in
Delver's project the goal was to actualize an autonomous system, “with
as little interference as possible.” Human participation or
“interference” in the operation is potentially harmful. Thus, figuring
the operator, whether police officer or neighborhood resident, as risky
can lead to the relegation of direct human intervention.</p>
</div>
<div data-custom-style="Body Text">
<p>By looking at the figurations of the operator that appear in the
diagrams, we see multiple and heterogeneous configurations of
regulations, security companies, and professionals. In each
configuration, the human-in-the-loop appears in different forms. The
operator often holds the final responsibility for the ethical
functioning of the system. At times they are configured as experts in
sophisticated but error-prone systems; at others they are figured as end
users who are activated by the alerts generated by the system, and who
need not understand how the software works and errs, or who can be left
out.</p>
</div>
<div data-custom-style="Body Text">
<p>These configurations remind us that there cannot be any theorization
of “algorithmic security vision,” of either its empirical workings or
its ethical and political consequences, without close attention to the
empirical contexts in which the configurations are arranged. Each
organization of datasets, algorithms, benchmarks, hardware and operators
has specific problems. And each contains a specific politics of
visibilization, invisibilization, responsibility and accountability.</p>
</div>
</section>
</section>
<section id="a-diagram-of-research" class="level1">
<h1>A diagram of research</h1>
<div data-custom-style="Body Text">
<p>In this conclusion, we reflect upon a final dimension of the method
of diagramming in the context of figurations and configurations: its
potential as an alternative to the conventional research program.</p>
</div>
<div data-custom-style="Body Text">
<p>Indeed, while writing this text, the search for a coherent structure
through which to map the problems that emerged from analyzing the
diagrams into a straightforward narrative proved elusive. We considered
various organizational frameworks, but consistently encountered
resistance from one or two sections. It became evident that our
interviews yielded a rhizome of interrelated problems, creating a
multitude of possible inquiries and overlapping trajectories. Some
dimensions of these problems relate to one another, but not to every
other problem.</p>
</div>
<div data-custom-style="Body Text">
<p>If we take, for example, the understanding of algorithmic security
vision as a practice of error management as our starting point, we see
how the actors we interviewed have incorporated the societal critique of
algorithmic bias. This critique serves as a catalyst for novel
strategies aimed at mitigating the repercussions of imperfect systems.
It has driven the development of synthetic datasets, which promise
equitable representation across diverse demographic groups. It has also
motivated the reliance on institutionalized benchmarks to assess the
impartiality of algorithms. Moreover, different configurations of the
human-in-the-loop emerge, each promising to rectify algorithmic
fallibility. Here we see a causal chain.</p>
</div>
<div data-custom-style="Body Text">
<p>But how does the question of algorithmic error relate to the shift
from photographic to cinematic vision that algorithmic security vision
brings about? Certainly, there are reverberations. The relegation of
stable identity that we outlined could be seen as a way to mitigate the
impact of those errors. But it would be a leap to identify these
questions of error as the central driver for the increased incorporation
of moving images in algorithmic security vision.</p>
</div>
<div data-custom-style="Body Text">
<p>However, if we take as our starting point the formidable strides in
computing power and the advancements in camera technologies, we face
similar problems. These developments make the analysis of movement
possible, and they help to elucidate the advances in real-time analysis
that are required to remove the human-in-the-loop, as trialed in the
Burglary-Free Neighborhood. They also account for the feasibility of
synthetic data generation, a computing-intensive process which opens a
vast horizon of possibilities for developers to detect objects or
actions. Such an account, however, does not address the need for such a
synthetic dataset. A focus on the computation of movement, by contrast,
would highlight how a lack of training data necessitates many of the
practices described. Synthetic data is necessitated by the glaring
absence of pre-existing security datasets that contain moving bodies.
While facial recognition algorithms could be trained and operated on
quickly repurposed photographic datasets of national identity cards or
driver's license registries, no dataset of moving bodies has been
available to be repurposed by states or corporations. This absence of
training data requires programmers to stage scenes for the camera. Thus,
while one issue contains echoes of the other, the network of
interrelated problematizations cannot be flattened into a single
narrative.</p>
</div>
<div data-custom-style="Body Text">
<p>The constraints imposed by the linear structure of an academic
article certainly necessitate a specific ordering of sections. Yet the
different research directions we highlight form something else. The
multiple figurations analyzed here generate fresh tensions when put in
relation with security and political practices. What appears from the
diagrams is a network of figurations in various configurations. Instead
of a research <em>program</em>, our interviews point toward a larger
research <em>diagram</em> of interrelated questions, which invites us to
think in terms of pathways through this dynamic and evolving network of
relations.</p>
</div>
</section>
<section id="interviewees" class="level1">
<h1>Interviewees</h1>
<ul>
<li><div data-custom-style="Normal">
<p>Gerwin van der Lugt develops software that detects “high-impact
crimes” in camera streams.</p>
</div></li>
<li><div data-custom-style="Normal">
<p>András Lukács is a senior researcher and coordinator of the AI Lab at
the Department of Mathematics of the Eötvös Loránd University in
Budapest, and an AI adviser for the Hungarian Ministry of Technology and
Innovation.</p>
</div></li>
<li><div data-custom-style="Normal">
<p>Guido Delver is an engineer and coordinator of the Rotterdam-based
Burglary-Free Neighborhood that builds autonomous systems into street
lamps to reinforce public security.</p>
</div></li>
<li><div data-custom-style="Normal">
<p>Attila Batorfy is a journalist and data visualization expert who
teaches journalism, media studies and information graphics at the Media
Department of Eötvös Loránd University.</p>
</div></li>
<li><div data-custom-style="Normal">
<p>Peter Smith is a senior security expert working for a European
organization employing border technologies.</p>
</div></li>
<li><div data-custom-style="Normal">
<p>Adam Remport is a Hungarian legal expert and activist working on
state actors' use of biometric technologies.</p>
</div></li>
<li><div data-custom-style="Normal">
<p>John Riemen is head of the Center for Biometrics for the Dutch
police.</p>
</div></li>
<li><div data-custom-style="Normal">
<p>Jeroen van Rest is a safety expert and senior consultant in
risk-based security at TNO, the Netherlands.</p>
</div></li>
<li><div data-custom-style="Normal">
<p>Two anonymous respondents involved in the Living Lab Scheveningen, in
The Hague.</p>
</div></li>
<li><div data-custom-style="Normal">
<p>Sergei Miliaev is principal researcher and facial recognition team
lead at VisionLabs in Amsterdam.</p>
</div></li>
<li><div data-custom-style="Normal">
<p>Dirk Herzbach is chief of police of the Polizeipräsidium
Mannheim.</p>
</div></li>
</ul>
</section>
<section id="references" class="level1">
<h1>References</h1>
<div data-custom-style="Bibliography">
<p>Ajana B (2013) <em>Governing Through Biometrics</em>. London:
Palgrave Macmillan UK. DOI: <a
href="https://doi.org/10.1057/9781137290755"><span
data-custom-style="Hyperlink">10.1057/9781137290755</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Amicelle A, Aradau C and Jeandesboz J (2015) Questioning security
devices: Performativity, resistance, politics. <em>Security
Dialogue</em> 46(4): 293–306. DOI: <a
href="https://doi.org/10.1177/0967010615586964"><span
data-custom-style="Hyperlink">10.1177/0967010615586964</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Amoore L (2014) Security and the incalculable. <em>Security
Dialogue</em> 45(5). SAGE Publications Ltd: 423–439. DOI: <a
href="https://doi.org/10.1177/0967010614539719"><span
data-custom-style="Hyperlink">10.1177/0967010614539719</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Amoore L (2019) Doubt and the algorithm: On the partial accounts of
machine learning. <em>Theory, Culture &amp; Society</em> 36(6). SAGE
Publications Ltd: 147–169. DOI: <a
href="https://doi.org/10.1177/0263276419851846"><span
data-custom-style="Hyperlink">10.1177/0263276419851846</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Amoore L (2021) The deep border. <em>Political Geography</em>.
Elsevier: 102547.</p>
</div>
<div data-custom-style="Bibliography">
<p>Amoore L and De Goede M (2005) Governance, risk and dataveillance in
the war on terror. <em>Crime, Law and Social Change</em> 43(2): 149–173.
DOI: <a href="https://doi.org/10.1007/s10611-005-1717-8"><span
data-custom-style="Hyperlink">10.1007/s10611-005-1717-8</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Andersen RS (2015) <em>Remediating Security</em>. 1. oplag.
Ph.d.-serien / Københavns Universitet, Institut for Statskundskab.
Copenhagen: Københavns Universitet, Institut for Statskundskab.</p>
</div>
<div data-custom-style="Bibliography">
<p>Andersen RS (2018) The art of questioning lethal vision: Mosse's
Infra and militarized machine vision. In: <em>Proceedings of EVA
Copenhagen 2018</em>, 2018.</p>
</div>
<div data-custom-style="Bibliography">
<p>Andrejevic M and Burdon M (2015) Defining the sensor society.
<em>Television &amp; New Media</em> 16(1): 19–36. DOI: <a
href="https://doi.org/10.1177/1527476414541552"><span
data-custom-style="Hyperlink">10.1177/1527476414541552</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Aradau C and Blanke T (2015) The (big) data-security assemblage:
Knowledge and critique. <em>Big Data &amp; Society</em> 2(2):
2053951715609066. DOI: <a
href="https://doi.org/10.1177/2053951715609066"><span
data-custom-style="Hyperlink">10.1177/2053951715609066</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Aradau C and Blanke T (2018) Governing others: Anomaly and the
algorithmic subject of security. <em>European Journal of International
Security</em> 3(1). Cambridge University Press: 1–21. DOI: <a
href="https://doi.org/10.1017/eis.2017.14"><span
data-custom-style="Hyperlink">10.1017/eis.2017.14</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Aradau C, Lobo-Guerrero L and Van Munster R (2008) Security,
technologies of risk, and the political: Guest editors' introduction.
<em>Security Dialogue</em> 39(2-3): 147–154. DOI: <a
href="https://doi.org/10.1177/0967010608089159"><span
data-custom-style="Hyperlink">10.1177/0967010608089159</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Azar M, Cox G and Impett L (2021) Introduction: Ways of machine
seeing. <em>AI &amp; SOCIETY</em>. DOI: <a
href="https://doi.org/10.1007/s00146-020-01124-6"><span
data-custom-style="Hyperlink">10.1007/s00146-020-01124-6</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Bae G, de La Gorce M, Baltrušaitis T, et al. (2023) DigiFace-1M: 1
million digital face images for face recognition. In: <em>2023 IEEE
Winter Conference on Applications of Computer Vision (WACV)</em>, 2023.
IEEE.</p>
</div>
<div data-custom-style="Bibliography">
<p>Barad KM (2007) <em>Meeting the Universe Halfway: Quantum Physics and
the Entanglement of Matter and Meaning</em>. Durham: Duke University
Press.</p>
</div>
<div data-custom-style="Bibliography">
<p>Bellanova R, Irion K, Lindskov Jacobsen K, et al. (2021) Toward a
critique of algorithmic violence. <em>International Political
Sociology</em> 15(1): 121–150. DOI: <a
href="https://doi.org/10.1093/ips/olab003"><span
data-custom-style="Hyperlink">10.1093/ips/olab003</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Bigo D (2002) Security and immigration: Toward a critique of the
governmentality of unease. <em>Alternatives</em> 27. SAGE Publications
Inc: 63–92. DOI: <a
href="https://doi.org/10.1177/03043754020270S105"><span
data-custom-style="Hyperlink">10.1177/03043754020270S105</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Bigo D and Guild E (2005) Policing at a distance: Schengen visa
policies. In: <em>Controlling Frontiers. Free Movement into and Within
Europe</em>. Routledge, pp. 233–263.</p>
</div>
<div data-custom-style="Bibliography">
<p>Bommasani R, Hudson DA, Adeli E, et al. (2022) On the opportunities
and risks of foundation models. Available at: <a
href="http://arxiv.org/abs/2108.07258"><span
data-custom-style="Hyperlink">http://arxiv.org/abs/2108.07258</span></a>
(accessed 2 June 2023).</p>
</div>
<div data-custom-style="Bibliography">
<p>Bousquet AJ (2018) <em>The Eye of War</em>. Minneapolis: University
of Minnesota Press.</p>
</div>
<div data-custom-style="Bibliography">
<p>Bucher T (2018) <em>If...Then: Algorithmic Power and Politics</em>.
New York: Oxford University Press.</p>
</div>
<div data-custom-style="Bibliography">
<p>Buolamwini J and Gebru T (2018) Gender shades: Intersectional
accuracy disparities in commercial gender classification.
<em>Proceedings of Machine Learning Research</em> 81.</p>
</div>
<div data-custom-style="Bibliography">
<p>Calhoun L (2023) Latency, uncertainty, contagion: Epistemologies of
risk-as-reform in crime forecasting software. <em>Environment and
Planning D: Society and Space</em>. SAGE Publications Ltd STM:
02637758231197012. DOI: <a
href="https://doi.org/10.1177/02637758231197012"><span
data-custom-style="Hyperlink">10.1177/02637758231197012</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Carraro V (2021) Grounding the digital: A comparison of Waze's ‘avoid
dangerous areas’ feature in Jerusalem, Rio de Janeiro and the US.
<em>GeoJournal</em> 86(3): 1121–1139. DOI: <a
href="https://doi.org/10.1007/s10708-019-10117-y"><span
data-custom-style="Hyperlink">10.1007/s10708-019-10117-y</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Dawson-Howe K (2014) <em>A Practical Introduction to Computer Vision
with OpenCV</em>. 1st edition. Chichester, West Sussex, United Kingdom;
Hoboken, NJ: Wiley.</p>
</div>
<div data-custom-style="Bibliography">
<p>Dijstelbloem H, van Reekum R and Schinkel W (2017) Surveillance at
sea: The transactional politics of border control in the Aegean.
<em>Security Dialogue</em> 48(3): 224–240. DOI: <a
href="https://doi.org/10.1177/0967010617695714"><span
data-custom-style="Hyperlink">10.1177/0967010617695714</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Farocki H (2004) Phantom images. <em>Public</em>. Available at: <a
href="https://public.journals.yorku.ca/index.php/public/article/view/30354"><span
data-custom-style="Hyperlink">https://public.journals.yorku.ca/index.php/public/article/view/30354</span></a>
(accessed 6 March 2023).</p>
</div>
<div data-custom-style="Bibliography">
<p>Fisher DXO (2018) Situating border control: Unpacking Spain's SIVE
border surveillance assemblage. <em>Political Geography</em> 65: 67–76.
DOI: <a href="https://doi.org/10.1016/j.polgeo.2018.04.005"><span
data-custom-style="Hyperlink">10.1016/j.polgeo.2018.04.005</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Fourcade M and Gordon J (2020) Learning like a state: Statecraft in
the digital age.</p>
</div>
<div data-custom-style="Bibliography">
<p>Fourcade M and Johns F (2020) Loops, ladders and links: The
recursivity of social and machine learning. <em>Theory and Society</em>:
1–30. DOI: <a href="https://doi.org/10.1007/s11186-020-09409-x"><span
data-custom-style="Hyperlink">10.1007/s11186-020-09409-x</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Fraser A (2019) Curating digital geographies in an era of data
colonialism. <em>Geoforum</em> 104. Elsevier: 193–200.</p>
</div>
<div data-custom-style="Bibliography">
<p>Galton F (1879) Composite portraits, made by combining those of many
different persons into a single resultant figure. <em>The Journal of the
Anthropological Institute of Great Britain and Ireland</em> 8. [Royal
Anthropological Institute of Great Britain and Ireland, Wiley]: 132–144.
DOI: <a href="https://doi.org/10.2307/2841021"><span
data-custom-style="Hyperlink">10.2307/2841021</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Gandy OH (2021) <em>The Panoptic Sort: A Political Economy of
Personal Information</em>. Oxford University Press. Available at: <a
href="https://books.google.com?id=JOEsEAAAQBAJ"><span
data-custom-style="Hyperlink">https://books.google.com?id=JOEsEAAAQBAJ</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Gillespie T (2018) <em>Custodians of the Internet: Platforms, Content
Moderation, and the Hidden Decisions That Shape Social Media</em>.
Illustrated edition. Yale University Press.</p>
</div>
<div data-custom-style="Bibliography">
<p>Goodwin C (1994) Professional vision. <em>American
Anthropologist</em> 96(3).</p>
</div>
<div data-custom-style="Bibliography">
<p>Graham S (1998) Spaces of surveillant simulation: New technologies,
digital representations, and material geographies. <em>Environment and
Planning D: Society and Space</em> 16(4). SAGE Publications Sage UK:
London, England: 483–504.</p>
</div>
<div data-custom-style="Bibliography">
<p>Graham SD (2005) Software-sorted geographies. <em>Progress in Human
Geography</em> 29(5). Sage Publications Sage CA: Thousand Oaks, CA:
562–580.</p>
</div>
<div data-custom-style="Bibliography">
<p>Grasseni C (2004) Skilled vision. An apprenticeship in breeding
aesthetics. <em>Social Anthropology</em> 12(1): 41–55. DOI: <a
href="https://doi.org/10.1017/S0964028204000035"><span
data-custom-style="Hyperlink">10.1017/S0964028204000035</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Grasseni C (2018) Skilled vision. In: Callan H (ed.) <em>The
International Encyclopedia of Anthropology</em>. 1st ed. Wiley, pp. 1–7.
DOI: <a href="https://doi.org/10.1002/9781118924396.wbiea1657"><span
data-custom-style="Hyperlink">10.1002/9781118924396.wbiea1657</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Haraway D (1988) Situated knowledges: The science question in
feminism and the privilege of partial perspective. <em>Feminist
Studies</em> 14(3). Feminist Studies, Inc.: 575–599. DOI: <a
href="https://doi.org/10.2307/3178066"><span
data-custom-style="Hyperlink">10.2307/3178066</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Hoijtink M and Leese M (2019) How (not) to talk about technology:
International relations and the question of agency. In: Hoijtink M and
Leese M (eds) <em>Technology and Agency in International Relations</em>.
Emerging technologies, ethics and international affairs. London; New
York: Routledge, pp. 1–24.</p>
</div>
<div data-custom-style="Bibliography">
<p>Hopman R and M'charek A (2020) Facing the unknown suspect: Forensic
DNA phenotyping and the oscillation between the individual and the
collective. <em>BioSocieties</em> 15(3): 438–462. DOI: <a
href="https://doi.org/10.1057/s41292-020-00190-9"><span
data-custom-style="Hyperlink">10.1057/s41292-020-00190-9</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Hunger F (2023) <em>Unhype artificial intelligence! A proposal to
replace the deceiving terminology of AI.</em> 12 April. Zenodo. DOI: <a
href="https://doi.org/10.5281/zenodo.7524493"><span
data-custom-style="Hyperlink">10.5281/zenodo.7524493</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Huysmans J (2022) Motioning the politics of security: The primacy of
movement and the subject of security. <em>Security Dialogue</em> 53(3):
238–255. DOI: <a href="https://doi.org/10.1177/09670106211044015"><span
data-custom-style="Hyperlink">10.1177/09670106211044015</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Isin E and Ruppert E (2020) The birth of sensory power: How a
pandemic made it visible? <em>Big Data &amp; Society</em> 7(2). SAGE
Publications Ltd: 2053951720969208. DOI: <a
href="https://doi.org/10.1177/2053951720969208"><span
data-custom-style="Hyperlink">10.1177/2053951720969208</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Jasanoff S (2004) <em>States of Knowledge: The Co-Production of
Science and Social Order</em>. Routledge Taylor &amp; Francis Group.</p>
</div>
<div data-custom-style="Bibliography">
<p>Ji Z, Lee N, Frieske R, et al. (2023) Survey of hallucination in
natural language generation. <em>ACM Computing Surveys</em> 55(12):
1–38. DOI: <a href="https://doi.org/10.1145/3571730"><span
data-custom-style="Hyperlink">10.1145/3571730</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Keyes O (2019) The gardener's vision of data: Data science reduces
people to subjects that can be mined for truth. <em>Real Life Mag</em>.
Available at: <a
href="https://reallifemag.com/the-gardeners-vision-of-data/"><span
data-custom-style="Hyperlink">https://reallifemag.com/the-gardeners-vision-of-data/</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Latour B (2005) <em>Reassembling the Social: An Introduction to
Actor-Network-Theory</em>. Clarendon Lectures in Management Studies.
Oxford; New York: Oxford University Press.</p>
</div>
<div data-custom-style="Bibliography">
<p>Leese M (2015) ‘We were taken by surprise’: Body scanners, technology
adjustment, and the eradication of failure. <em>Critical Studies on
Security</em> 3(3). Routledge: 269–282. DOI: <a
href="https://doi.org/10.1080/21624887.2015.1124743"><span
data-custom-style="Hyperlink">10.1080/21624887.2015.1124743</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Leese M (2019) Configuring warfare: Automation, control, agency. In:
Hoijtink M and Leese M (eds) <em>Technology and Agency in International
Relations</em>. Emerging technologies, ethics and international affairs.
London; New York: Routledge, pp. 42–65.</p>
</div>
<div data-custom-style="Bibliography">
<p>Lorusso S (2021) The user condition. Available at: <a
href="https://theusercondition.computer/"><span
data-custom-style="Hyperlink">https://theusercondition.computer/</span></a>
(accessed 18 February 2021).</p>
</div>
<div data-custom-style="Bibliography">
<p>Lyon D (2003) <em>Surveillance as Social Sorting: Privacy, Risk, and
Digital Discrimination</em>. Psychology Press. Available at: <a
href="https://books.google.com?id=yCLFBfZwl08C"><span
data-custom-style="Hyperlink">https://books.google.com?id=yCLFBfZwl08C</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Mackenzie A (2017) <em>Machine Learners: Archaeology of a Data
Practice</em>. The MIT Press. DOI: <a
href="https://doi.org/10.7551/mitpress/10302.001.0001"><span
data-custom-style="Hyperlink">10.7551/mitpress/10302.001.0001</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Maguire M, Frois C and Zurawski N (eds) (2014) <em>The Anthropology
of Security: Perspectives from the Frontline of Policing,
Counter-Terrorism and Border Control</em>. Anthropology, culture and
society. London: Pluto Press.</p>
</div>
<div data-custom-style="Bibliography">
<p>Mahony M (2021) Geographies of science and technology 1: Boundaries
and crossings. <em>Progress in Human Geography</em> 45(3): 586–595. DOI:
<a href="https://doi.org/10.1177/0309132520969824"><span
data-custom-style="Hyperlink">10.1177/0309132520969824</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Markoff J (2020) Robots will need humans in future. <em>The New York
Times: Section B</em>, 22 May. New York. Available at: <a
href="https://www.nytimes.com/2020/05/21/technology/ben-shneiderman-automation-humans.html"><span
data-custom-style="Hyperlink">https://www.nytimes.com/2020/05/21/technology/ben-shneiderman-automation-humans.html</span></a>
(accessed 31 October 2023).</p>
</div>
<div data-custom-style="Bibliography">
<p>McCosker A and Wilken R (2020) <em>Automating Vision: The Social
Impact of the New Camera Consciousness</em>. 1st edition. Routledge.</p>
</div>
<div data-custom-style="Bibliography">
<p>Møhl P (2021) Seeing threats, sensing flesh: Human–machine ensembles
at work. <em>AI &amp; SOCIETY</em> 36(4): 1243–1252. DOI: <a
href="https://doi.org/10.1007/s00146-020-01064-1"><span
data-custom-style="Hyperlink">10.1007/s00146-020-01064-1</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Muller B (2010) <em>Security, Risk and the Biometric State</em>.
Routledge. DOI: <a href="https://doi.org/10.4324/9780203858042"><span
data-custom-style="Hyperlink">10.4324/9780203858042</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>O'Sullivan S (2016) On the diagram (and a practice of diagrammatics).
In: Schneider K, Yasar B, and Lévy D (eds) <em>Situational Diagram</em>.
New York: Dominique Lévy, pp. 13–25.</p>
</div>
<div data-custom-style="Bibliography">
<p>Olwig KF, Grünenberg K, Møhl P, et al. (2019) <em>The Biometric
Border World: Technologies, Bodies and Identities on the Move</em>. 1st
ed. Routledge. DOI: <a
href="https://doi.org/10.4324/9780367808464"><span
data-custom-style="Hyperlink">10.4324/9780367808464</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Pasquinelli M (2015) Anomaly detection: The mathematization of the
abnormal in the metadata society. Panel presentation.</p>
</div>
<div data-custom-style="Bibliography">
<p>Pasquinelli M (2019) How a machine learns and fails – a grammar of
error for artificial intelligence. Available at: <a
href="https://spheres-journal.org/contribution/how-a-machine-learns-and-fails-a-grammar-of-error-for-artificial-intelligence/"><span
data-custom-style="Hyperlink">https://spheres-journal.org/contribution/how-a-machine-learns-and-fails-a-grammar-of-error-for-artificial-intelligence/</span></a>
(accessed 13 January 2021).</p>
</div>
<div data-custom-style="Bibliography">
<p>Pugliese J (2010) <em>Biometrics: Bodies, Technologies,
Biopolitics</em>. New York: Routledge. DOI: <a
href="https://doi.org/10.4324/9780203849415"><span
data-custom-style="Hyperlink">10.4324/9780203849415</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Schurr C, Marquardt N and Militz E (2023) Intimate technologies:
Towards a feminist perspective on geographies of technoscience.
<em>Progress in Human Geography</em>. SAGE Publications Ltd:
03091325231151673. DOI: <a
href="https://doi.org/10.1177/03091325231151673"><span
data-custom-style="Hyperlink">10.1177/03091325231151673</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Soon W and Cox G (2021) <em>Aesthetic Programming: A Handbook of
Software Studies</em>. London: Open Humanities Press. Available at: <a
href="http://www.openhumanitiespress.org/books/titles/aesthetic-programming/"><span
data-custom-style="Hyperlink">http://www.openhumanitiespress.org/books/titles/aesthetic-programming/</span></a>
(accessed 9 March 2021).</p>
</div>
<div data-custom-style="Bibliography">
<p>Srnicek N and De Sutter L (2017) <em>Platform Capitalism</em>. Theory
redux. Cambridge, UK; Malden, MA: Polity.</p>
</div>
<div data-custom-style="Bibliography">
<p>Stevens N and Keyes O (2021) Seeing infrastructure: Race, facial
recognition and the politics of data. <em>Cultural Studies</em> 35(4-5):
833–853. DOI: <a
href="https://doi.org/10.1080/09502386.2021.1895252"><span
data-custom-style="Hyperlink">10.1080/09502386.2021.1895252</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Suchman L (2006) <em>Human-Machine Reconfigurations: Plans and
Situated Actions</em>. 2nd edition. Cambridge University Press.</p>
</div>
<div data-custom-style="Bibliography">
<p>Suchman L (2012) Configuration. In: <em>Inventive Methods</em>.
Routledge, pp. 48–60.</p>
</div>
<div data-custom-style="Bibliography">
<p>Suchman L (2020) Algorithmic warfare and the reinvention of accuracy.
<em>Critical Studies on Security</em> 8(2). Routledge: 175–187. DOI: <a
href="https://doi.org/10.1080/21624887.2020.1760587"><span
data-custom-style="Hyperlink">10.1080/21624887.2020.1760587</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Sudmann A (2021) Artificial neural networks, postdigital
infrastructures and the politics of temporality. In: Volmar A and Stine
K (eds) <em>Media Infrastructures and the Politics of Digital Time</em>.
Amsterdam University Press, pp. 279–294. DOI: <a
href="https://doi.org/10.1515/9789048550753-017"><span
data-custom-style="Hyperlink">10.1515/9789048550753-017</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Tazzioli M (2018) Spy, track and archive: The temporality of
visibility in Eurosur and Jora. <em>Security Dialogue</em> 49(4):
272–288. DOI: <a href="https://doi.org/10.1177/0967010618769812"><span
data-custom-style="Hyperlink">10.1177/0967010618769812</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Thatcher J, O'Sullivan D and Mahmoudi D (2016) Data colonialism
through accumulation by dispossession: New metaphors for daily data.
<em>Environment and Planning D: Society and Space</em> 34(6). SAGE
Publications Ltd STM: 990–1006. DOI: <a
href="https://doi.org/10.1177/0263775816633195"><span
data-custom-style="Hyperlink">10.1177/0263775816633195</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Uliasz R (2020) Seeing like an algorithm: Operative images and
emergent subjects. <em>AI &amp; SOCIETY</em>. DOI: <a
href="https://doi.org/10.1007/s00146-020-01067-y"><span
data-custom-style="Hyperlink">10.1007/s00146-020-01067-y</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>van de Ven R and Plájás IZ (2022) Inconsistent projections:
Con-figuring security vision through diagramming. <em>A Peer-Reviewed
Journal About</em> 11(1): 50–65. DOI: <a
href="https://doi.org/10.7146/aprja.v11i1.134306"><span
data-custom-style="Hyperlink">10.7146/aprja.v11i1.134306</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Wilcox L (2017) Embodying algorithmic war: Gender, race, and the
posthuman in drone warfare. <em>Security Dialogue</em> 48(1): 11–28.
DOI: <a href="https://doi.org/10.1177/0967010616657947"><span
data-custom-style="Hyperlink">10.1177/0967010616657947</span></a>.</p>
</div>
<div data-custom-style="Bibliography">
<p>Zuboff S (2019) <em>The Age of Surveillance Capitalism: The Fight for
a Human Future at the New Frontier of Power</em>. First edition. New
York: Public Affairs.</p>
</div>
</section>
<section class="footnotes footnotes-end-of-document"
role="doc-endnotes">
<hr />
<ol>
<li id="fn1" role="doc-endnote"><div data-custom-style="Footnote Text">
<p><span data-custom-style="Footnote Characters"></span> The interface
software and code are available at <a
href="https://git.rubenvandeven.com/security_vision/svganim"><span
data-custom-style="Hyperlink">https://git.rubenvandeven.com/security_vision/svganim</span></a>
and <a href="https://gitlab.com/security-vision/chronodiagram"><span
data-custom-style="Hyperlink">https://gitlab.com/security-vision/chronodiagram</span></a></p>
</div>
<a href="#fnref1" class="footnote-back" role="doc-backlink">↩︎</a></li>
<li id="fn2" role="doc-endnote"><div data-custom-style="Footnote Text">
<p><span data-custom-style="Footnote Characters"></span> The interviews
were conducted in several European countries: the majority in the
Netherlands, but also in Belgium, Hungary and Poland. Based on an
initial survey of algorithmic security vision practices in Europe we
identified various roles that are involved in such practices. Being a
rather small group of people, these interviewees do not serve as
“illustrative representatives” (Mol &amp; Law 2002, 16-17) of the field
they work in. However, as the interviewees have different cultural and
institutional affiliations, and hold different positions in working with
algorithms, vision and security, they cover a wide spectrum of
engagements with our research object.</p>
</div>
<a href="#fnref2" class="footnote-back" role="doc-backlink">↩︎</a></li>
<li id="fn3" role="doc-endnote"><div data-custom-style="Footnote Text">
<p><span data-custom-style="Footnote Characters"></span> The interviews
were conducted by the first two authors, and at a later stage by Clemens
Baier. The conversations were largely unstructured, but began with two
basic questions. First, we asked the interviewees if they use diagrams
in their daily practice. We then asked: “when we speak of security
vision we speak of the use of computer vision in a security context.
Can you explain from your perspective what these concepts mean and how
they come together?” After the first few interviews, we identified some
recurrent themes, which we then specifically asked later interviewees to
discuss.</p>
</div>
<a href="#fnref3" class="footnote-back" role="doc-backlink">↩︎</a></li>
<li id="fn4" role="doc-endnote"><div data-custom-style="Footnote Text">
<p><span data-custom-style="Footnote Characters"></span> Using
anthropomorphizing terms such as “neural networks,” “learning” and
“training” to denote algorithmic configurations and processes is
suggested to hype “artificial intelligence.” While we support the need
for an alternative terminology as proposed by Hunger (2023), here we
preserve the language of our interviewees.</p>
</div>
<a href="#fnref4" class="footnote-back" role="doc-backlink">↩︎</a></li>
</ol>
</section>
</body>
</html>