Digitised Books. c. 1510 – c. 1900. JSONL (OCR derived text + metadata)
The dataset comprises metadata and OCR-generated text from 49,455 digitised books published between c. 1510 and c. 1900. The books cover a wide range of subject areas, including philosophy, history, poetry and literature. The dataset is in JSON Lines (JSONL) text format, in which each line of a file is one self-contained JSON object.
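Because JSONL stores one JSON object per line, the files can be read line by line without loading a whole file into memory. The following is a minimal Python sketch of that idea, not an official loader for this dataset. The helper name `iter_jsonl` and the sample field names (`title`, `date`) are invented for illustration; the actual fields in the dataset are not described on this page.

```python
import json
from typing import Iterable, Iterator


def iter_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.loads(line)


# Hypothetical sample records: these field names are assumptions,
# not the dataset's documented schema.
sample = [
    '{"title": "Example book", "date": 1850}',
    '',
    '{"title": "Another book", "date": 1799}',
]

records = list(iter_jsonl(sample))
print(len(records))
```

An open text file can be passed in place of `sample`, since iterating over a file object also yields one line at a time.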
Additional information
Field | Value |
---|---|
UniqueID | 7bf6279d-b8b1-45f4-8fe4-a0c06fdba87c |
BL Dataset Provider | |
User Access Level | |
BL Labs Assistance | |
Contributors | van Strien, Daniel, and Filipe Bento |
Institution | British Library Labs |
Language | |
Year Added | |
Contact Person | British Library Labs |
Location | Repository Cloud |
Official URL | |
Is It Being Updated | |
Any Issues With Access | No |
Files | 1700_1799.tar.gz, 1870_1879.tar.gz, 1860_1869.tar.gz, 1890_1899.tar.gz, 1880_1889.tar.gz, 1510_1699.tar.gz, 1800_1809.tar.gz, 1810_1819.tar.gz, 1840_1849.tar.gz, unk.tar.gz, 1820_1829.tar.gz, 1850_1859.tar.gz, 1830_1839.tar.gz |
T&C Needed | |
Rights Assessment | |
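The files listed above are gzip-compressed tar archives, grouped by publication period. As a hedged sketch, the Python standard library can stream records out of such an archive without unpacking it to disk. This assumes each archive member is a plain UTF-8 JSONL file, which this page does not confirm; the function name `iter_records_from_targz` and the demo archive contents are hypothetical.

```python
import io
import json
import tarfile
from typing import BinaryIO, Iterator


def iter_records_from_targz(fileobj: BinaryIO) -> Iterator[dict]:
    """Stream JSON objects from every regular file in a .tar.gz archive.

    Assumption: each member is plain UTF-8 JSONL (one object per line).
    """
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories and other non-file entries
            raw = archive.extractfile(member)
            if raw is None:
                continue
            for line in io.TextIOWrapper(raw, encoding="utf-8"):
                line = line.strip()
                if line:
                    yield json.loads(line)


def build_demo_archive() -> io.BytesIO:
    """Build a tiny in-memory .tar.gz so the sketch runs without the real files."""
    payload = b'{"id": 1}\n{"id": 2}\n'  # invented records for illustration
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        info = tarfile.TarInfo(name="demo.jsonl")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


demo_records = list(iter_records_from_targz(build_demo_archive()))
print(len(demo_records))
```

For a downloaded archive, the same function could be given a file opened in binary mode, for example `open("1800_1809.tar.gz", "rb")`. If the members turn out to be compressed or structured differently, the inner loop would need adjusting.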