The Pleiadic Gaze

Uncorrected script for my presentation at the 19th International Congress of Classical Archaeology, Cologne/Bonn, 22 – 26 May 2018. Entitled "The Pleiadic gaze: Looking at archaeology from the perspective of a digital gazetteer," it was scheduled to be delivered on Saturday, 26 May 2018 in Panel 12.1 | Part 2: Classical Archaeology in a Digital World (The AIAC presidential panel).

Slides may be had here -- for the moment only in Microsoft Powerpoint format (sorry) -- with corresponding slide numbers in the text below. I'll re-edit this page anon to link everything together in a more open way, and to provide live hyperlinks to web resources mentioned in the text and depicted in the slides.

Herewith, the text:

{2} Pleiades is an on-line, open gazetteer and graph of ancient places, spaces, and peoples.

Unlike traditional gazetteers it includes not only named features, but also

{3} geographic entities whose historical names are lost

{4} and even geographic entities that may never have had a name.

{5} Unlike traditional geographic information systems, Pleiades catalogs not only those places that can be abstracted into a point, a line, or a polygon on a map,

{6} but also those whose locations are lost, uncertain, disputed, or comprehended only through the aggregate locations of other, related places.

{7} This flexibility arises from fundamental decisions taken at an early stage in the project (that is, between 1999 and 2006, when we got our first Pleiades grant from the US National Endowment for the Humanities).

{8} We were certainly at pains to preserve and extend the information that had been carefully gathered during the Classical Atlas Project's 12-year, 200-person quest to produce the Barrington Atlas of the Greek and Roman World, which was published by Princeton University Press in the year 2000. Some digital techniques had been used in the preparation of the Atlas (not least email!). But given the transformation in data management and geospatial computing technologies that transpired during the lifetime of the Atlas project, there could be no question that newer techniques would be needed in any follow-on effort to keep the information underpinning the atlas up-to-date.

I should pause to point out that we are blessed to have present today at least one member of that sacred band of Barrington Atlas heroes: Liz Fentress.

{9} In thinking about a follow-on effort, we were influenced then in no small way by the recent emergence of the so-called "wiki way": we wondered if the crowd-sourcing approach being pioneered by Wikipedia could be adapted to the task of updating and expanding all our atlas information. The era of steady funding for longue durée academic projects was ending. Maybe we could use a wiki approach to assemble a globally distributed nerd army to succeed that Atlas Project warrior band. But if web-mediated, incremental, and asynchronous content creation could be a way forward for any descendant of this great atlas, it was obvious that we would have to change the fundamental structure of the information. The Atlas and its maps would have to be digitized, reorganized into little, discrete pieces, and somehow surfaced on line in a readily edited format. And we would need processes for both pre- and post-publication review of changes.

Beyond data management, there were more and different user needs. How would one work effectively and economically with the restructured data? How would it appear online for human use? Could it be made available for other projects, digital or otherwise? How much could be done automatically?

It was clear by the turn of the century that a paper atlas and its twelve-hundred-page, two-volume companion directory could not adequately serve the users of new media and new computing tools.

{10} Indeed, in a laudatory (but critical) review of the Atlas for the Journal of Roman Archaeology in 2001, Susan Alcock, Hendrik Dey, and Grant Parker imagined a digital Barrington Atlas -- more portable and less constrained by the limitations of map frame and scale. Such a thing exists now, thanks to Princeton University Press, who brought out an iPad version of the Barrington Atlas a few years ago. It is a beautiful and wonderful thing, as far as it goes, but I think that if you've used it at all, you'll agree that it doesn't do much other than save you from having to carry around a big double folio of dead trees.

{11} Those of us gathered around the Atlas's birthplace in Chapel Hill, North Carolina, wanted portability and digital versions of the maps too. But we also wanted an environment in which other, more conceptual issues could be addressed. Among these was a concern also raised by Alcock, Dey, and Parker: the Atlas's selective omission in many areas of the results of archaeological survey. In their view, the Atlas could be forgiven somewhat since:

to replicate survey data to scale in hard copy would be an ordeal

{12} but

(electronic formats ... will have no such excuse)

No pressure.

We would need something that would let us add new features and refine existing coordinate pairs. To record finer or different temporal characteristics. To classify places more flexibly and indicate change in use over time. To add toponyms. To indicate relationships between places. To express uncertainty. To link information to scholarly literature, primary sources, physical objects, and archaeological data.

And so, in the midst of a technological phase of the widely remarked, twentieth-century "spatial turn" in the humanities -- a moment that most people equated with on-line maps, historical GIS, and spatial computing -- we took the road less traveled. We pushed the map to the side and put places themselves first.

In pursuing place, we had helpful guides in Yi-Fu Tuan and other human geographers of the late 20th century who explored the idea of place as a cognitive or experiential construct. In Pleiades, therefore, places are conceptual entities: we apply the term to any locus of human attention, material or intellectual, in a real-world geographic context, whether or not it can be named or mapped or visited today. The spatial aspects of Pleiades places (that is, latitude and longitude coordinates in space), as well as their ancient and modern names, are subordinated to this idea of place, becoming optional attributes in the information construct, rather than first-class entities.
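For readers who think in data structures, the idea can be sketched in a few lines of Python. This is a simplified illustration, not the actual Pleiades data model: the class names, field names, identifiers, and example places are all assumptions made up for the purpose. The point is only that the place stands on its own, while names and locations are lists that are allowed to be empty.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Location:
    # Hypothetical, simplified location: a single coordinate pair in decimal degrees.
    lon: float
    lat: float


@dataclass
class Place:
    # The place is the first-class entity; names and locations are optional attributes.
    identifier: str
    description: str
    names: List[str] = field(default_factory=list)           # empty if the name is lost or never existed
    locations: List[Location] = field(default_factory=list)  # empty if the location is lost or disputed

    def is_named(self) -> bool:
        return bool(self.names)

    def is_mapped(self) -> bool:
        return bool(self.locations)


# Invented examples: one place with a location but no name, one with a name but no location.
unnamed_villa = Place(
    identifier="example-1",
    description="Villa whose ancient name is lost",
    locations=[Location(lon=10.0, lat=36.0)],
)
unlocated_town = Place(
    identifier="example-2",
    description="Town attested in texts, location unknown",
    names=["Exampleia"],
)
```

Both records are perfectly valid places in this sketch, which is exactly what a map-first or name-first system would struggle to express.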

Various technical architectures and associated editorial processes were considered for dealing with all these demands. You'll perhaps be happy to hear that we do not have time today to discuss any of them in detail, because I want to talk about a topic that Ortwin Dally introduced earlier in this panel.

{13} Suffice it to say that we decided to use and customize an open-source, web-based content management system in order to put Pleiades on line. It was this plan that earned us our first round of funding and that attracted Sean Gillies to the project.

{14} Sean served as Pleiades' chief engineer for over 7 years. Now employed by Mapbox.com, he still deserves credit for the shape and function of the Pleiades web application as it appears today, as well as its underlying data. This credit is especially due in three areas:

First, Sean designed and built the code we needed to support geospatial indexing and mapping, functions that our content management framework didn't handle natively;

{15} Second, Sean led work on the linked data and export formats we needed to meet user needs, including one that evolved into one of the most widely used web formats for spatial data today: GeoJSON.

Third, Sean kept a relentless focus on clean, clear data structures and the paths to them, hiding implementation detail and privileging stability.
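If you have not met GeoJSON before, here is a minimal sketch of what a single place might look like when serialized as a GeoJSON Feature. It uses only the Python standard library. The function name, the property names, the placeholder identifier, and the coordinates are assumptions for illustration; this is not the schema of the actual Pleiades export. The one detail worth remembering is that GeoJSON puts longitude before latitude.

```python
import json


def place_to_geojson(uri, title, lon, lat):
    """Build a GeoJSON Feature (as a plain dict) for a place with a point location."""
    return {
        "type": "Feature",
        "id": uri,  # the citable identifier travels with the geometry
        "geometry": {
            "type": "Point",
            "coordinates": [lon, lat],  # GeoJSON order: longitude first, then latitude
        },
        "properties": {"title": title},
    }


# Placeholder URI and approximate, illustrative coordinates.
feature = place_to_geojson(
    "https://pleiades.stoa.org/places/000000",
    "Example place",
    22.88,
    37.91,
)
print(json.dumps(feature, indent=2))
```

Because the result is ordinary JSON, any web mapping library or script that understands GeoJSON can consume it without knowing anything else about the system that produced it.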

I'll be focusing my remarks for a bit on these last two areas: the formats in which we surface Pleiades data and the mechanisms whereby our users -- both sentient and algorithmic -- interact with the data. Why? Because it turns out that the way we do this is what makes Pleiades worthwhile. It's what makes us more than a big encyclopedia of not-very-consistent information about ancient places. It makes us more than a data management tool for a particular scholarly endeavor. And it's all about citation.

{16} Citation -- the glue that holds together so much of the scholarly enterprise -- was particularly ill served in the so-called "web GISs" of the late 1990s and early 2000s. Whereas at least with a paper atlas, one could refer to map number, grid square, and label in order to cite a specific place, most early on-line map systems seemed almost hermetically sealed.

{17} Despite the ubiquity of hyperlinks -- the central affordance of the World-Wide Web and arguably its only distinguishing feature -- one could not count on making a stable link to a particular place, map view, zoom level, or coordinate location. All the specifics of these interactions were hidden behind the user interface and a simple, top-level web address or some kind of nasty, ephemeral search string. Would that such barbarism had been just a passing fad! But now, twenty years on, many online GIS and mapping environments still behave this way. They mimic desktop mapping software, embodying the assumption that whatever the system can do, it should only do it for the individual person interacting with it right now.

Discovery, reference, and review; collection and reuse of information: these are all fundamental scholarly activities that are completely dependent on stable citation. They cannot function under a regime like this. The tantalizing possibility of computationally actionable citation -- the idea that computer programs might exploit links and connected resources to do complex discovery, correlation, and even reasoning without direct human supervision -- seemed in 2006 like a dream straight out of science fiction.

On the World-Wide Web, the identifiers necessary for citation should be front-and-center: they are the strings of characters that you put into the location bar of your browser in order to retrieve a page. They are the essential magic in a hyperlink. Their technical name is "Uniform Resource Identifier," a phrase usually abbreviated as URI. URIs (or "yourees," as they're sometimes pronounced) are cool.

{18}

They're cool because, if you construct them sensibly and connect them to interesting information and take care of them so they don't rot, they make citation happen. In throwing off the normalizing tyranny of a single map view to embrace the radical equality of all places, Pleiades was citation ready. Because Sean Gillies (and others present at the creation) paid attention to emerging best practice and cared about scholarly communication, Pleiades was born citation-friendly.

{19} May I present a Pleiades URI? We've seen several of them already, at the top of slides from the beginning of my talk. You can think of them as the passport numbers for ancient places. They're simple.

{20} Each one uniquely identifies a Pleiades place resource. And we promise to keep them stable for as long as Pleiades exists. We embed them into all our export formats so that even when Pleiades does die, or when the World-Wide Web is replaced by something else that does things differently, a copy of our dataset can be retrieved from one of several digital archives and put back together with any other data that used our URIs for citation.
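To make that "put back together" step concrete, here is a minimal sketch of how records from independent datasets might be regrouped around the place URIs they cite. Everything in it is an assumption for illustration: the record layout, the `place_uri` key, the sample records, and the placeholder place identifier. Stripping a trailing slash before comparing is also just one possible normalization choice.

```python
from collections import defaultdict


def index_by_place_uri(records, key="place_uri"):
    """Group records by the place URI each one cites, skipping records without one."""
    index = defaultdict(list)
    for record in records:
        uri = record.get(key)
        if uri:
            index[uri.rstrip("/")].append(record)
    return index


# Invented sample data from two hypothetical datasets that cite the same place.
PLACE = "https://pleiades.stoa.org/places/000000"
coins = [{"id": "coin-1", "place_uri": PLACE}]
inscriptions = [
    {"id": "inscription-7", "place_uri": PLACE + "/"},
    {"id": "inscription-8"},  # cites no place, so it cannot be joined
]

joined = index_by_place_uri(coins + inscriptions)
for uri, records in joined.items():
    print(uri, [r["id"] for r in records])
```

Nothing here depends on the Pleiades website being reachable: the URI works as a shared key even when it can no longer be dereferenced.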

There's a growing body of such data.

{21} The Peripleo search engine demonstrates geographic connections between items in scores of different datasets concerned with ancient places and objects. Peripleo is a demonstration tool, developed under the auspices of an international project known as the Pelagios Commons and funded by the Andrew W. Mellon Foundation. Peripleo's principal developer is Rainer Simon, a Senior Scientist at the Austrian Institute of Technology. The datasets indexed by Peripleo include not only Pleiades and a number of other digital gazetteers, but also several numismatic databases, epigraphic websites, university and museum collections, as well as textual resources, and archaeological repositories. There are too many to enumerate here or put on a single slide, so I'll just make mention of those I know to be related to people in this room:

  • {22} The Fasti Online database of archaeological reports
  • {23} The iDAI gazetteer of the German Archaeological Institute
  • {24} The Inscriptions of Ancient Sicily

{ ASK IF ANYONE ELSE HERE HAS A DATASET IN PERIPLEO }

The gazetteers that have been indexed by Peripleo are not just reference points for other datasets. They cite each other, using URIs on the Pleiades model. {25} So, taking Corinth as an example, we can see that eight different gazetteers contain one or more records related to Corinth, and these can all be brought together by way of their mutual citations.

{ TALK THROUGH THE CITATION NETWORK IN THE GRAPH }

{26} With the place entries in the gazetteers collated, it's then possible to present together all the records from the other databases that cite one of those gazetteer entries. As of last night, Peripleo could identify 1,897 object records related to Corinth.
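The two steps just described (collating gazetteer entries by their mutual citations, then gathering every object record that cites any entry in the group) can be sketched as follows. I should stress that this is not how Peripleo is implemented; I am simply using a textbook union-find to compute connected groups, and every URI, field name, and record in the example is invented.

```python
def group_by_mutual_citation(citations):
    """Cluster gazetteer URIs that are linked, directly or indirectly, by citation.

    `citations` is an iterable of (uri_a, uri_b) pairs meaning that entry A
    cites entry B as referring to the same place.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for a, b in citations:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for uri in list(parent):
        groups.setdefault(find(uri), set()).add(uri)
    return list(groups.values())


def objects_citing(cluster, object_records, key="cites"):
    """Return the object records that cite any gazetteer entry in the cluster."""
    return [record for record in object_records if record.get(key) in cluster]


# Invented example: three hypothetical gazetteers, two places.
citations = [
    ("https://gaz-a.example/1", "https://gaz-b.example/corinth"),
    ("https://gaz-b.example/corinth", "https://gaz-c.example/42"),
    ("https://gaz-a.example/2", "https://gaz-b.example/athens"),
]
objects = [
    {"id": "coin-1", "cites": "https://gaz-c.example/42"},
    {"id": "inscription-9", "cites": "https://gaz-a.example/1"},
    {"id": "vase-3", "cites": "https://gaz-b.example/athens"},
]

clusters = group_by_mutual_citation(citations)
corinth = next(c for c in clusters if "https://gaz-b.example/corinth" in c)
print(sorted(r["id"] for r in objects_citing(corinth, objects)))
```

Note that the coin and the inscription cite different gazetteers, yet they end up together because those gazetteers cite each other.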

{27} This slide captures Peripleo with the results list opened and one of the records selected. If this were the live site, we could click right through to the corresponding record in the Mantis database of the American Numismatic Society.

{28} Here's a capture of that page. Notice the URI? Mantis has cool URIs too. So does every other data source indexed by Peripleo. That's why this all works.

{29} So, is Peripleo the tool you need for in-depth research and analysis on an archaeological topic? Unlikely. But it demonstrates a very important fact with significant implications for future research work in archaeology: computationally actionable citation is here. We have scores of datasets on a variety of useful archaeological themes that can be quickly assessed for interrelationships of interest and then combined, as needed, to support a variety of research tasks. Geography is just one of the axes of citation we can exploit. Gazetteers for other things like named time periods, prosopography, materials, or building techniques already exist or are being built. The opportunities and consequences should be obvious: if you use comparative or connective data in your work, learn how to exploit these new tools. If you produce datasets in the course of your research, define URIs for items of interest therein, publish the data on-line under open license, and liberally cite the URIs from other datasets whenever it is appropriate to do so.

There are several ways in which the Pleiades community is working to make this network of actionable citation more robust and more useful for the study of the ancient world. I'd like to use my remaining time to touch on a few of those that have specific bearing for archaeology.

{30} One of our earliest and biggest efforts has been in increasing the precision and improving the accuracy of the spatial coordinates we provide. The scales used in the Barrington Atlas limited the effective precision of any coordinates digitized from the maps to a range between two and twenty kilometers. New Pleiades coordinates have come from a variety of sources, but {31} increasingly, we've come to rely on OpenStreetMap. OSM is a global collaborative resource for high-resolution, real-world mapping that often captures archaeological monuments, structures, and districts that remain in situ. And, despite its name, it takes in much more than just streets.

The data we inherited from the Atlas also had scale-related limits on the types of features mapped, and therefore on the initial set of places created from that data in Pleiades. Things like temples, sanctuaries, churches, monuments, and tombs only appeared in the Barrington when they lay outside settlements. Pleiades has no such limitations, and so our contributors have begun adding these more compact places in many areas and connecting them to each other using a prototype vocabulary of topographical and thematic relationship types.

{32} Recent work on the place resource for Nineveh demonstrates what's possible. Jamie Novotny and colleagues, working under the auspices of Karen Radner at Munich, have added new place resources for palaces, temples, and other features attested at Nineveh, connecting to the place resource for the settlement itself.

What about -- as Alcock, Dey, and Parker labeled it -- "the small stuff" that the Barrington omitted? The findspots of coins and inscriptions, the kilns, olive presses, and agglomérations rurales? The interpreted results of regional survey?

{33} We're making a start by working with Alessandro Battisti on the data published at rusafricum.org. This data derives from the joint Italian and Tunisian Thugga survey directed by Mustapha Khanoussi, Samir Aounallah, and Mariette de Vos.

{34} Rus Africum records have cool URIs and the data has already been put into Pelagios.

{35} The graph view in Peripleo demonstrates that, where a Rus Africum site matches up with a Pleiades place, Alessandro and his colleagues have already noted the equivalence and made an appropriate citation.

{36} Alessandro has also been working with Jeffrey Becker, one of Pleiades' volunteer Associate Editors, to improve Pleiades coordinates on the basis of Rus Africum's data. Where appropriate, they're adding the locations first to OpenStreetMap, and through it, to Pleiades.

{37} What remains to sort out are the Rus Africum features not in Pleiades. Here are a few examples. How much should come into Pleiades (with appropriate citation and provenance, of course) and how much should remain solely in the Rus Africum dataset? I don't know the answer to that question yet, but I'm confident of one thing. As a community, we'll weigh carefully factors like citation reliability and long-term utility as we work toward a solution.

These are a few of the ways in which the Pleiades community is working to support citation, to make Pleiades more useful for archaeologists, and to better use and reflect the results of archaeological work. But my time is up, so I'll conclude with a recruiting pitch.

{38} The Pleiades nerd army is an all-volunteer force. If you're interested in helping build and maintain Open Linked Data for Ancient Studies, please consider joining us. There are many ways to help, either by working on the content in the Pleiades gazetteer itself or by publishing datasets or software applications that use or link to it.

{39} I have stickers.