1. Who is the author? Who are they writing for and/or against?
2. Identify and quote a main claim from the reading that you agree or disagree with. Explain your position. (Include the page number, so that you can refer back to this quote later.)
3. Offer an example of the kind of evidence the author uses to support this claim. Is it convincing?

Original Research Article
A place for Big Data: Close and
distant readings of accessions data
from the Arnold Arboretum
Big Data & Society
July–December 2016: 1–20
© The Author(s) 2016
DOI: 10.1177/2053951716661365
bds.sagepub.com
Yanni Alexander Loukissas
Abstract
Place is a key concept in environmental studies and criticism. However, it is often overlooked as a dimension of situatedness in social studies of information. Rather, situatedness has been defined primarily as embodiment or social context. This
paper explores place attachments in Big Data by adapting close and distant approaches for reading texts to examine the
accessions data of the Arnold Arboretum, a living collection of trees, vines and shrubs established by Harvard University in
1872 (The original interactive data visualizations can be found online: http://www.lifeanddeathofdata.org). Although it is an
early and unconventional example of the phenomenon, there are several reasons that the Arboretum is a useful site for
investigating the relationship between Big Data and place. First, the category of place is embedded in a range of data fields
used in the Arboretum’s records. Second, the Arboretum has long sought to be a place in which scientists and citizens alike
can encounter large collections of data firsthand. Third, the place has shaped fluctuations in the daily production of data
over the course of the Arboretum’s 144-year history. Furthermore, Arboretum data can help us see place in ways not
necessarily tied to geolocation. Each of these place attachments suggests a different way in which data can be environmental: by being about, in, from, or generative of place. Taken together, these attachments offer a model for examining
other data in relation to their environments. Moreover, the paper contends that rather than being detached from place, as
prevailing discourses suggest, Big Data bring together more and further reaching place attachments than data sets of
smaller sizes.
Keywords
Environmental data, place, situated knowledge, close reading, distant reading
Introduction
A key concept in environmental criticism, ‘place’ is
often overlooked as a dimension of situatedness in
social studies of information. In this paper, I reflect
on the place of Big Data through an analysis of accessions records from the Arnold Arboretum. Established
in 1872, and located on 281 acres within the Boston
neighbourhood of Jamaica Plain, the Arboretum is a
long-lived collection of trees, vines, and shrubs managed by Harvard University. Equal parts urban laboratory and ‘zoo for plants’, it is one of the most
comprehensive and well-documented collections of its
kind in the world¹ (Figure 1).
Although seemingly modest in size – hosting around
15,000 living plants today and about 70,000 over the
course of its history – the Arboretum is an apt site for
investigating Big Data’s attachments to place for several
reasons. First, place itself is an important kind of data
for the Arboretum. Indeed, its collections are assembled
from sites of scientific and cultural significance around
the world. Second, the Arboretum has long sought to be
a place in which scientists and citizens alike can encounter large collections of data first hand, simply by walking
the landscape and discovering the variety of carefully
tagged plants. Third, when understood as a set of conditions for production, the place has shaped fluctuations
School of Literature, Media and Communication, Georgia Institute of
Technology, Atlanta, GA, USA
Corresponding author:
Yanni Alexander Loukissas, Program in Digital Media, School of
Literature, Media and Communication, Georgia Institute of Technology,
TSRB 85 5th Street NW, Room 318A, Atlanta, GA 30308, USA.
Email: yanni.loukissas@lmc.gatech.edu
Creative Commons CC-BY: This article is distributed under the terms of the Creative Commons Attribution 3.0 License (http://www.creativecommons.org/licenses/by/3.0/) which permits any use, reproduction and distribution of the work without further
permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
Figure 1. Map of the Arnold Arboretum. Courtesy of the Arnold Arboretum Archives. © President and Fellows of Harvard College.
in botanical data over the course of the Arboretum’s
long history. Finally, when looked at abstractly, the
Arboretum’s data can help us see place in new ways,
which are not limited to aspects of geolocation. As I
will show, each of these attachments to place is a different way in which data are subject to environmental criticism. Moreover, the dimensions of place attachment
identified in this paper – and the means of identifying
them – suggest a place-based approach that might influence other studies of Big Data. If we are to illuminate
what is distinctive about Big Data as a cultural form, we
must attend to the relationships between data and place
that they manifest. Because of their size and scope, Big
Data have more and further reaching place attachments
than data at other scales.
Having said this, the accessions data of the Arnold
Arboretum do not conform to present-day definitions
of Big Data as high magnitude in a variety of dimensions: volume (terabytes or petabytes), velocity, variety,
scope, resolution, flexibility, and relations with other
data sets (Kitchin and Lauriault, 2014). However, this
litany of attributes accounts for only the most ambitious of contemporary practices with Big Data (Kitchin
and McArdle, 2016). My use of the term is more in line
with the work of boyd and Crawford, who characterise
Big Data as a phenomenon with not only technological
but also cultural and scholarly dimensions (boyd and
Crawford, 2012). I approach Big Data as an epistemological and performative shift in ways of doing
research, with a long history involving data sets that
were previously unmanageable. Seen in this way, we
might say that the Arnold Arboretum has been
making Big Data for over a century.
In the 19th and early 20th centuries, arboreta – as
well as libraries, museums, and zoos – held the Big
Data of their day. Institutions like the Arnold
Arboretum prefigured Big Data by drawing together
representative specimens from far and wide. The most
ambitious of these institutions sought to establish
themselves as comprehensive models of the world
(Battles, 2004). As with contemporary holders of Big
Data, these institutions continually outstripped strategies for managing all the records necessary to organise,
preserve, and study their contents. The Arboretum’s
historical data illustrate, better than most, a variety
of environmental issues in Big Data. At the
Arboretum, data are about place, in place, from
place, and even generative of place. Learning about
these long-standing forms of place attachment can
prompt us to challenge settled conceptions about the
relationship between data and place in contemporary
life.
Data and place
There is a long history of scholarship on the place of
information within discourses on cyberspace (Kalay
and Marx, 2001), cities (Mitchell, 1995), networking
(Graham, 1998), interaction (Dourish, 2006), and
development (Irani et al., 2010). However, discussions
of Big Data often downplay the significance of place.
Meanwhile, popular media depict Big Data as increasingly commonplace: a ubiquitous tool for governments
(Morozov, 2014), science (Anderson, 2008), and business management (Lohr, 2012). In scholarship, Big
Data and place are sometimes treated as incompatible
concepts. An influential article by Dalton and Thatcher
argues that Big Data distracts from attention to place.
‘Relying solely on ‘‘Big Data’’ methods’, they write,
‘can obscure concepts of place and place-making
because places are necessarily situated and partial’
(Dalton and Thatcher, 2014: 6). Rather, I understand
Big Data as situated and partial because of their attachments to distributed places.
Although place has been an important topic of
interest in the social sciences (Gieryn, 2000), my readings of data in this paper are influenced by literary and
cultural studies. Buell, a leading voice for ecocriticism,
draws together many conceptions of place in his book,
The Future of Environmental Criticism (Buell, 2009).
Perhaps the most succinct of these is offered by
Agnew, who writes of places as ‘discrete if ‘‘elastic’’
areas in which settings for the constitution of social
relations are located and with which people can identify’ (Agnew, 2013: 263). Buell also expounds on the
multiple dimensions of place attachment in texts,
including temporal and imagined conceptions of
place. My development of the notion of place attachment for social studies of information builds on these
important precedents but is grounded in readings of Big
Data manifest at the Arnold Arboretum. In this article,
I define place as a framework with both social and spatial dimensions, in which data are created, displayed,
and/or managed, and which, reciprocally, is shaped by
those practices. Indeed, data are not simply site-specific
tools; they have the power to reconfigure place.
My reflections on the relationship between data and
place at the Arboretum only serve to refine existing
scholarship on the grounding of data within social studies of information and science, technology, and society
(STS). Scholars of information have examined how
the meaning and significance of the term ‘data’ has
evolved over the past few centuries (Day, 2014;
Drucker, 2011; Gitelman, 2013, 2014) as well as how
it differs in use across academic and professional
domains (Borgman, 2015; Star and Griesemer, 1989).
Borgman traces data to its earliest use in theology in
1646, when it was applied as a plural of the term datum.
It was not until the late 18th century, writes Borgman,
that data was used to describe the results of empirical
observations of the kind associated with scientific practice at the Arboretum.
Meanwhile, scholars in STS have developed empirical accounts of how data are situated in specific scientific contexts (Bowker and Star, 1999; Latour, 1987).
This scholarship has largely sought to complicate a
widely held, but simplistic perspective: that data are
universal, invariable, and altogether immaterial.
Latour deftly captures this purified conception of
data in the concept of ‘inscription’. In a frequently
referenced paper entitled ‘Visualisation and
Cognition: Drawing Things Together’, Latour (1990)
explains inscriptions as things created for the production of scientific arguments. As he writes, ‘you have to
invent objects, which have the properties of being
mobile but also immutable, presentable, readable and
combinable with one another’ (Latour, 1990: 7).
Many scholars have challenged this instrumentalised
definition by exposing ways in which data practices,
and data themselves, vary from one context to the
next. Research on the diversity of data has been conducted in studies of laboratories (Cetina, 1999; Keller,
2003; Latour and Woolgar, 1979), museums (Star and
Griesemer, 1989), healthcare (Bowker and Star, 1999),
climate debates (Edwards, 2010), and space exploration
(Vertesi and Dourish, 2011).
Today, in the varied work practices at the Arnold
Arboretum, data are used as scientific evidence but also
simply as a tool for landscape management. I rely on a
grounded approach to data, with the aim of studying
the term as it is used in multiple ways in practice.
Aligned with this thinking, Borgman suggests that
understanding data means asking, ‘when are data?’
(Borgman, 2015). She writes, ‘entities become data only
when someone uses them as evidence of a phenomenon,
and the same entities can be evidence of multiple phenomena’ (Borgman, 2015: 28). In other words, data
must be performed. Moreover, data are more than
merely representational. In common parlance, the term
data can be used to mean secondary, digital representations of objects that hold scientific and cultural import.
However, my findings support the view that data are
part of an ontological ‘looping effect’ whereby they
help to shape the practices and institutions that create
them (Hacking, 1991; Kitchin and Lauriault, 2014).
Finally, I have found that prior scholarship in information studies and STS scrutinises data primarily
through case studies of discrete technological moments
or controversies; these studies provide an event-based
reading of data. In contrast, this paper contributes to
the development of an emergent place-based perspective (Galison and Thompson, 1999; Kirsch, 2011;
Livingstone, 2003). Though the concept of place
has been important to environmental criticism, it has
been largely overlooked in discourses on situatedness
(Buell, 2009). Rather, situatedness is defined primarily
as embodiment or social context (Haraway, 1988;
Suchman, 2007).
I contend that all data can be studied through a local
lens, in terms of their place attachments. Even Big Data
are connected to ‘local knowledge’, grounded in and
inseparable from their social, material, and spatial conditions (Geertz, 1985). Although data are reliably transferred across global communication networks,
everywhere they remain marked by local artefacts:
traces of the conditions and values that are particular
to their origins. Accepting this claim necessitates a significant shift in our expectations of digital data, given
that the digital was invented to be independent of any
substrate (Hayles, 2008). Indeed, all data – not just
those created at arboreta and other sites for documenting nature – can be read through their attachments to
environments. However, data do not speak for themselves. Reading is a means of enacting data, which is
also locally situated. In this paper, I use close and distant readings not only to discover but also to produce
salient connections between Big Data and place.
Close and distant readings
Examining place attachments in data requires adopting
appropriate methods. In this paper, I make use of a
combination of techniques, which I will refer to as
close readings and distant readings. These are complementary ways of interpreting accessions records from
the Arboretum: one up-close, the other from a distance.
This hybrid model of analysis owes much to developments in ‘close’ and ‘distant’ reading as methods of
interrogating texts in literary and cultural studies
(Jänicke et al., 2015). Culler explains that, when used as a method of analysis for literary texts, close readings attend to ‘how meaning is produced or conveyed’
(2010: 22). Meanwhile, distant reading aims, paradoxically, not to read. Instead, the latter technique, pioneered in literature by Moretti, aims to ‘generate an
abstract view by shifting from observing textual content
to visualizing global features of a single or of multiple
text(s)’ (Jaenicke and Franzini, 2015: 2). Moretti uses
traditional methods of graphical display, such as maps,
graphs, and trees to illuminate large-scale narrative and
geographic patterns in texts. Both close and distant
readings reveal not just what is in a data set, but how
that data might be enacted.
Through close and distant readings, I treat data as
texts: cultural expressions subject to interpretive and
speculative examination. However, accessions data
resemble indices more than prose. As such, they require
a great deal more context to decipher. Additionally,
both techniques suggest their own relationship to
place. The terms close and distant seem to describe a
spatial relationship between the analyst and the data.
However, my distance from the Arboretum data is not
so simply summarised. All my readings of the
Arboretum rely on the interpretations of Arboretum
staff members, who use their own local knowledge to
identify place attachments in the data that are not
immediately apparent. Rather, the difference between
close and distant reading techniques, applied to data,
hinges on the pervasiveness of the features being investigated. Close readings focus on isolated features in a
data set; distant readings illuminate features common
throughout.
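The distinction can be made concrete with a minimal Python sketch. It assumes that accession records are available as dictionaries keyed by field names the Arboretum uses (such as ACC_NUM and ACC_YR). The three records and the helper names `distant_reading` and `close_reading` are invented for illustration; they are not drawn from the Arboretum's data, nor do they reproduce the visualisations developed for this study.

```python
from collections import Counter

# Invented placeholder records (NOT real Arboretum data). Only the field
# names ACC_NUM, ACC_YR and COUNTRY_FULL are taken from the Arboretum's usage.
RECORDS = [
    {"ACC_NUM": "0001", "ACC_YR": 1900, "COUNTRY_FULL": "Japan"},
    {"ACC_NUM": "0002", "ACC_YR": 1900, "COUNTRY_FULL": "China"},
    {"ACC_NUM": "0003", "ACC_YR": 1950, "COUNTRY_FULL": None},
]


def distant_reading(records, field):
    """Summarise a feature found throughout the set: count each value of `field`."""
    return Counter(r[field] for r in records if r.get(field) is not None)


def close_reading(records, acc_num):
    """Isolate one record so that its individual fields can be interpreted."""
    return next((r for r in records if r.get("ACC_NUM") == acc_num), None)


if __name__ == "__main__":
    # Distant: how many accessions per year across the whole (toy) set.
    print(distant_reading(RECORDS, "ACC_YR"))
    # Close: one record, including its gaps (here, a missing country).
    print(close_reading(RECORDS, "0003"))
```

The counting function stands in for any view of pervasive features, while the lookup returns a single record whose gaps and oddities still need local knowledge to interpret.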
Creating both kinds of readings for this paper relied
on a prolonged ethnographic engagement with the
Arnold Arboretum. During the period of 2012 to
2014, I lived and worked in close proximity to the
Arboretum. I conducted nine semi-structured interviews with researchers, administrators, and technologists at the institution and did archival work at their
library. But more importantly, I was a participant
observer in both formal and informal engagements,
including: a course on landscape architecture, a series
of outings to map a ‘wild’ portion of the Arboretum,
and an intensive two-day workshop that brought
together Arboretum staff with STS scholars (http://stsdesignworkshop.tumblr.com).
Over the course of
the final year of this engagement, I worked with
Arboretum staff to develop close and distant reading
techniques appropriate for looking at their data.
Beyond the findings about place attachments in Big
Data, this approach furthers the development of interpretive digital methods and their adaptation from traditional humanities subjects to the study of other forms
of data.
Reading accessions data as texts
Seeing data as texts accessible to traditions of hermeneutic inquiry means reading them within an interpretive
context. I argue that it would be difficult to understand
these records without considering the way they are culturally and materially situated in place. Indeed, accessions records have a long history of development and use
at the Arboretum. For one thing, they were not always
recognisable as data. The Arboretum has weathered
many successive regimes of documentation. Thus, each
organism has germinated within a social and technological setting, its care and curation managed through
the instruments and information structures deployed
during its lifetime. These place-based practices, and the
documents they produce, register what is valued about
individual organisms at the Arboretum and, in turn,
how those values change over time (Figure 2).
Today, plants collected from around the world and
across time are held together at the Arboretum by a
custom digital record system called BG-Base, a database system developed specifically for this collection.
Each entry in the Arboretum’s data set includes an
accession number, an extensive list of scientific,
common, and abbreviated names, redundant ways of
identifying the time of accession, the form and mechanism of reception, individuals associated with the
plant, various descriptions of the place the accession
hails from, its condition in the wild, and an additional
catch-all category. A list of fields used by the
Arboretum includes the following:
ACC_NUM, HABIT, HABIT_FULL, NAME_NUM, NAME, ABBREV_NAME,
COMMON_NAME_PRIMARY, GENUS, FAMILY, FAMILY_COMMON_NAME_PRIMARY,
APG_ORDER, LIN_NUM, ACC_DT, ACC_YR, RECD_HOW, RECD_NOTES,
PROV_TYPE, PROV_TYPE_FULL, PSOURCE_LABEL_ONE_LINE, COLLECTOR,
COLL_ID, COLLECTED_WITH, COUNTRY_FULL, SUB_CNT1, SUB_CNT2, SUB_CNT3,
LOCALITY, LAT_DEGREE, LAT_MINUTE, LAT_SECOND, LAT_DIR,
LONG_DEGREE, LONG_MINUTE, LONG_SECOND, LONG_DIR,
ALTITUDE, ALTITUDE_UNIT, DESCRIPTION, COLLECTION_MISC
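As one illustration of how place is encoded in these fields, the sketch below converts the degree, minute, second, and direction fields into signed decimal degrees. It is a minimal sketch under stated assumptions: that the numeric fields hold numbers and that LAT_DIR and LONG_DIR hold the letters N, S, E, or W, neither of which is confirmed here. The function names and the sample record are hypothetical.

```python
def dms_to_decimal(degree, minute, second, direction):
    """Convert degrees/minutes/seconds plus a hemisphere letter to signed decimal degrees.

    Assumption: `direction` is one of N, S, E, W (case-insensitive).
    """
    direction = direction.strip().upper()
    if direction not in ("N", "S", "E", "W"):
        raise ValueError(f"unexpected direction: {direction!r}")
    magnitude = degree + minute / 60 + second / 3600
    # South and west are conventionally negative.
    return -magnitude if direction in ("S", "W") else magnitude


def record_to_lat_long(record):
    """Read the eight coordinate fields of one record and return (lat, long)."""
    lat = dms_to_decimal(record["LAT_DEGREE"], record["LAT_MINUTE"],
                         record["LAT_SECOND"], record["LAT_DIR"])
    lon = dms_to_decimal(record["LONG_DEGREE"], record["LONG_MINUTE"],
                         record["LONG_SECOND"], record["LONG_DIR"])
    return lat, lon


# Hypothetical record with invented values, for illustration only.
SAMPLE = {
    "LAT_DEGREE": 10, "LAT_MINUTE": 30, "LAT_SECOND": 0, "LAT_DIR": "N",
    "LONG_DEGREE": 20, "LONG_MINUTE": 15, "LONG_SECOND": 0, "LONG_DIR": "W",
}

if __name__ == "__main__":
    print(record_to_lat_long(SAMPLE))
```

Such a conversion covers only the geolocated aspect of place. Fields such as LOCALITY, COLLECTOR, and DESCRIPTION carry place attachments that cannot be reduced to a coordinate pair.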
If found within a library, museum, or archive, many of
these fields would be incorporated into metadata: the
information necessary to catalogue a book or other
object, such as details of their contents, context, quality, structure, and accessibility. At the Arboretum, this
locally defined selection of fields is known simply as
‘accessions data’. However, accessions data are
shaped by many of the same local forces that affect
metadata (Edwards et al., 2011; Mayernik et al.,
2011). Furthermore, as with metadata, each accession
record exists as part of …