Text extracted from digitised maps of eastern Africa circa 1880-1940

This dataset comprises an Excel spreadsheet of text extracted from almost 2,000 digital images of maps and documents held in the War Office Archive, covering a large part of eastern Africa between c.1880 and 1940. The items were catalogued and digitised with generous funding from Indigo Trust. The harvested text includes names of historical settlements and ethnic regions in eastern Africa, descriptions of historical land use, topography and vegetation, and notes on ethnographic, military or administrative context. The text was extracted automatically using the Google Vision API. For each item of text found, the spreadsheet gives the text itself, the Google Vision confidence score for transcription accuracy, the location of the text on the image, and a link to a geographical search interface that gives access to the relevant images and catalogue records hosted on the British Library (BL) website. A large number of erroneous text results were removed from the full set of Google Vision responses, but some errors remain. For example, individual characters within a word may have been transcribed incorrectly, though the word itself should still be identifiable. In addition, not all words appearing on the maps were captured.
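Because some transcription errors remain, a user may want to keep only rows above a chosen confidence score. The following is a minimal sketch, not part of the dataset. It assumes the rows have already been loaded as dictionaries (for example from the spreadsheet via a library such as pandas), that scores lie between 0 and 1, and that the columns are named `text` and `confidence`. Those column names, the `filter_by_confidence` helper and the sample rows are hypothetical; check the spreadsheet's actual header row before use.

```python
# Hypothetical column names; the real spreadsheet headers may differ.
TEXT_COL = "text"
CONFIDENCE_COL = "confidence"


def filter_by_confidence(rows, threshold=0.8):
    """Return the rows whose confidence score is at least `threshold`."""
    kept = []
    for row in rows:
        try:
            score = float(row[CONFIDENCE_COL])
        except (KeyError, TypeError, ValueError):
            # Skip rows with a missing or non-numeric score.
            continue
        if score >= threshold:
            kept.append(row)
    return kept


# Made-up placeholder rows, used only to illustrate the shape of the data.
sample_rows = [
    {"text": "Example A", "confidence": 0.95},
    {"text": "Examp1e B", "confidence": 0.42},
    {"text": "Example C", "confidence": None},
]

high_confidence = filter_by_confidence(sample_rows, threshold=0.8)
print([row[TEXT_COL] for row in high_confidence])
```

The threshold of 0.8 is arbitrary. A lower value keeps more text at the cost of more character-level errors, so it is worth inspecting a sample of rows at several thresholds before settling on one.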

Additional information

UniqueID

117ed6c1-d9ba-481a-bae6-74d389f6a441

BL Dataset Provider

User Access Level

BL Labs Assistance

Authors

Dykes, Nick

Language

Year Added

Contact Person

British Library Labs

Location

Repository Cloud

Official URL

https://doi.org/10.23636/1176

Is It Being Updated

Any Issues With Access

No

Files

War_Office_Archive_Eastern_Africa_Text_Harvested.xlsx, War_Office_Archive_Eastern_Africa_Text_Harvested__2_.xlsx

T&C Needed

Rights Assessment
