The Imperiia Project: a spatial history of the Russian Empire

Setting a metadata standard

In summer 2023 the Imperiia team partnered with the Harvard Map Collection on the “Undoing Empire” project. The project was awarded a Harvard Library Advancing Open Knowledge grant to sustain work across a six-month period. It had three goals:
  1. Create a database of biodiversity in 19th-century Ukraine,
  2. Create an inclusive strategy for mapping historical places, and
  3. Develop best practices for producing data that can be preserved via the Harvard Geospatial Library and the Harvard Library (HOLLIS) catalog.
The partnership was (and still is) a natural one. At the Map Collection, librarians Belle Lipton and Marc McGee want to encourage researchers to produce high-quality spatial data, while the Imperiia team wants the spatial data we produce to be discovered and used. Together, we are addressing a common problem in digital scholarship: Most researcher-generated datasets, which are built from a variety of archival and other materials, are not preserved for the long term in open-access library systems. Even in cases where datasets are discoverable, they are frequently missing important contextual information, such as detailed source citations or descriptions of the methods used to convert those sources into data. These problems make it nearly impossible for others to make sense of — let alone use — these pieces of scholarship.

If a book needs great footnotes, a dataset needs great metadata.

Composing either one is anticlimactic at best and tedious at worst. But what if we reinvented the documentation of decision-making? What if we found a way to streamline the production of metadata? What if we developed a model for productive collaboration between researchers and librarians? What if we built real — possibly entertaining — human conversation into the process?

To that end, we tested workflows, documented practices, designed interview questions, and generally attempted to smooth the path from inception to publication of spatial data. We found that as important as it is to iterate on a data model, the best way to produce valuable data is to sit down and talk with other stakeholders.

Key Output: The Harvard Map Collection GIS Data Curation Services Template
