The Imperiia Project: a spatial history of the Russian Empire
Kelly O'Neill
The Imperiia Project // Davis Center for Russian and Eurasian Studies, Harvard University
Routes of emigration from the Russian Empire to the USA, 1834-1897
Yipeng Zhou, 2023-03-16
The Tsar's Trans-Atlantic Voyagers
Explore the social history of tsarist migration with passenger records from American ports.
What it is
This is an open-access dataset intended for research and analysis. Its source is a dataset provided by the U.S. National Archives consisting of half a million passenger arrival records and ship manifests across six decades, from 1834 to 1897. Our edition consists of 11 related tables (CSV), 6 spatial data files (shapefiles and GeoJSON), and 2 metadata files (ReadMe; codebook). Together the files describe 527,394 passengers, 10,761 voyages, 781 ships, 681 occupations, 182 last known residences, 150 routes, and 78 ports.
Why it matters
The original dataset contains passenger records with name, age, town of last residence, destination, and codes for sex, occupation, literacy, country of origin, transit and/or travel compartment. It also contains manifest records - think of them as voyage records - including ship name, arrival date, and arrival port.
As unique and vast as it is, the original data is not easy to use. The records are full of inconsistencies and ambiguities, and are not easily ingested by visualization and GIS tools. They are, like most historical sources, a wonderful example of messy data. Our work to generate an enhanced and usable edition fell into four buckets:
We cleaned and tidied, documenting our process every step of the way.
We reorganized the data according to a relational model.
We added calculated fields as well as context fields drawn from other sources.
We created vector data - port locations, route lines, and locations of last known residences - to facilitate spatial analysis.
Fear not: inconsistencies and ambiguities still turn up everywhere, but we are relatively confident that with this new edition you will find them productive and stimulating rather than frustrating.
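To give a sense of what the relational model makes possible, here is a minimal sketch in Python of linking passengers to voyages and voyages to ports. Everything in it is an assumption made for illustration: the table contents are invented toy rows, and the column names (passenger_id, voyage_id, arrival_port_id, port_id and so on) are hypothetical stand-ins rather than the names used in the actual files. Consult the codebook for the real table and field names.

```python
import csv
import io
from collections import Counter

# Invented toy rows standing in for three of the related tables.
# All column names here are hypothetical; see the codebook for the real ones.
PASSENGERS_CSV = """passenger_id,voyage_id,age,occupation
1,10,34,tailor
2,10,29,farmer
3,11,41,tailor
"""

VOYAGES_CSV = """voyage_id,ship_name,arrival_port_id,arrival_date
10,Example Ship A,100,1891-05-02
11,Example Ship B,101,1893-09-17
"""

PORTS_CSV = """port_id,port_name
100,Port A
101,Port B
"""


def read_table(text, key):
    """Parse CSV text into a dict of rows keyed by the given column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}


def join_passengers(passengers_text, voyages_text, ports_text):
    """Attach ship and arrival-port details to each passenger row."""
    voyages = read_table(voyages_text, "voyage_id")
    ports = read_table(ports_text, "port_id")
    joined = []
    for passenger in csv.DictReader(io.StringIO(passengers_text)):
        voyage = voyages[passenger["voyage_id"]]    # passenger -> voyage
        port = ports[voyage["arrival_port_id"]]     # voyage -> port
        joined.append({
            **passenger,
            "ship_name": voyage["ship_name"],
            "arrival_date": voyage["arrival_date"],
            "arrival_port": port["port_name"],
        })
    return joined


rows = join_passengers(PASSENGERS_CSV, VOYAGES_CSV, PORTS_CSV)

# Count passengers per arrival port across the joined rows.
arrivals_by_port = Counter(row["arrival_port"] for row in rows)
print(arrivals_by_port)
```

With the real files, the same pattern would read each CSV from disk instead of from a string. Because each voyage and port is stored once and referenced by an identifier, a correction to a ship or port name only has to be made in one place.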