Imperiia: a spatial history of the Russian Empire

V03. The Tsar's Trans-Atlantic Voyagers

What it is

This is an open-access dataset intended for research and analysis. Its source is a dataset provided by the U.S. National Archives consisting of more than half a million passenger arrival records and ship manifests spanning just over six decades, from 1834 to 1897. Our edition consists of 11 related tables (CSV), 6 spatial data files (shapefiles and GeoJSON), and 2 metadata files (ReadMe; codebook). Together the files describe 527,394 passengers, 10,761 voyages, 781 ships, 681 occupations, 182 last known residences, 150 routes, and 78 ports.

Why it matters

The original dataset contains passenger records with name, age, town of last residence, destination, and codes for sex, occupation, literacy, country of origin, and transit and/or travel compartment. It also contains manifest records (think of them as voyage records), which include ship name, arrival date, and arrival port.

As unique and vast as it is, the original data is not easy to use. The records are full of inconsistencies and ambiguities, and are not easily ingested by visualization and GIS tools. They are, like most historical sources, a wonderful example of messy data. Our work to generate an enhanced and usable edition fell into four buckets:

Fear not: inconsistencies and ambiguities still turn up everywhere, but we are relatively confident that with this new edition you will find them productive and stimulating rather than frustrating.


Publication date: March 29, 2023

