This page was created by Yipeng Zhou.  The last update was by Paul Vadan.

The Imperiia Project: a spatial history of the Russian Empire

The Tsar's Trans-Atlantic Voyagers

What it is

This is an open-access dataset intended for research and analysis. Its source is a dataset provided by the U.S. National Archives consisting of half a million passenger arrival records and ship manifests across six decades, from 1834 to 1897. Our edition consists of 11 related tables (CSV), 6 spatial data files (shapefiles and GeoJSON), and 2 metadata files (ReadMe; codebook). Together the files describe 527,394 passengers, 10,761 voyages, 781 ships, 681 occupations, 182 last known residences, 150 routes, and 78 ports.

Why it matters

The original dataset contains passenger records with name, age, town of last residence, destination, and codes for sex, occupation, literacy, country of origin, transit and/or travel compartment. It also contains manifest records (think of them as voyage records), which include ship name, arrival date, and arrival port.

As unique and vast as it is, the original data is not easy to use. The records are full of inconsistencies and ambiguities, and are not easily ingested by visualization and GIS tools. They are, like most historical sources, a wonderful example of messy data. Our work to generate an enhanced and usable edition fell into four buckets:

Fear not: inconsistencies and ambiguities still turn up everywhere, but we are relatively confident that with this new edition you will find them productive and stimulating rather than frustrating.

Usage Notes

This is a relational database. Each feature in a table has a unique identifier. The most important connections across tables are made through the MID and RouteID. See the Data Model.
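
As a rough illustration of how a shared identifier connects two tables, here is a minimal Python sketch using only the standard library. It assumes that passenger rows carry the MID of the manifest (voyage) they belong to. Apart from MID, every column name and value below is a hypothetical stand-in, so check the codebook and the Data Model for the real field names.

```python
import csv
import io

# Hypothetical miniature stand-ins for two of the tables. Only the MID key
# is named in the usage notes; all other column names and values are invented.
MANIFESTS_CSV = """MID,ship_name,arrival_port
1,Example Ship A,Example Port X
2,Example Ship B,Example Port Y
"""

PASSENGERS_CSV = """passenger_id,MID,age
101,1,34
102,1,29
103,2,41
104,9,18
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_mid(passengers, manifests):
    """Attach manifest (voyage) fields to each passenger row via MID.

    Passengers whose MID has no matching manifest are returned separately
    so they can be inspected rather than silently dropped.
    """
    by_mid = {m["MID"]: m for m in manifests}
    joined, unmatched = [], []
    for p in passengers:
        manifest = by_mid.get(p["MID"])
        if manifest is None:
            unmatched.append(p)
        else:
            joined.append({**manifest, **p})
    return joined, unmatched


joined, unmatched = join_on_mid(read_rows(PASSENGERS_CSV), read_rows(MANIFESTS_CSV))
for row in joined:
    print(row["passenger_id"], row["ship_name"], row["arrival_port"])
print("rows without a matching manifest:", len(unmatched))
```

Keeping the unmatched rows in their own list is a deliberate choice: with messy historical data, a record that fails to join is often worth examining in its own right.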

The spatial data files (ports and routes) contain the relevant data from the corresponding csv files (via spatial join).
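
Because the attributes are already joined into the spatial files, a GeoJSON file can be read as a table without any GIS software. The sketch below relies only on the standard GeoJSON layout (a FeatureCollection whose features each have a geometry and properties). The property names and coordinates are hypothetical stand-ins, not the actual fields of the ports file.

```python
import json

# Hypothetical one-feature stand-in for a ports GeoJSON file. The property
# names and coordinates are invented for illustration only.
PORTS_GEOJSON = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [-74.0, 40.7]},
      "properties": {"port_name": "Example Port", "voyages": 12}
    }
  ]
}
"""


def feature_table(geojson_text):
    """Flatten point features into rows of attributes plus lon/lat."""
    collection = json.loads(geojson_text)
    rows = []
    for feature in collection["features"]:
        row = dict(feature["properties"])
        # GeoJSON stores point coordinates as [longitude, latitude].
        lon, lat = feature["geometry"]["coordinates"][:2]
        row["lon"], row["lat"] = lon, lat
        rows.append(row)
    return rows


rows = feature_table(PORTS_GEOJSON)
for row in rows:
    print(row)
```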

Note on the "sex" designation ("passengers" file): in many cases there is an apparent discrepancy among the first name, sex, and occupation/family role. Users wanting to explore ratios of male to female immigrants should cross-check the "passengers" data against the three gender indicator files.


Publication date: March 29, 2023
