User:Bertspaan/maps
Structured Data for Maps
Goal:
Design a metadata format that:
- captures transformations needed for the georectification of (historical) maps;
- as well as the pixel mask that can be used to remove the non-cartographic parts of those maps.
This metadata format should work with Wikidata (d:Q15726418), Wikimedia Commons (File:Amsterdam1688.jpg) and IIIF ([1]).
This metadata format could use JSON Schema (https://json-schema.org/) to describe and verify metadata.
Resources
Phabricator ticket
- Phabricator: phab:T227036
Sample maps
- File:Northern provinces of the United States - drawn and engraved for Thomson's New general atlas, 1817; Hewitt Sc. ... NYPL434391.tiff -> http://maps.nypl.org/warper/maps/13071#Preview_tab (NYPL MapWarper)
- File:Pigot and Co (1842) p2.138 - Map of Lancashire.jpg -> http://britishlibrary.georeferencer.com/id/11020006456 (Klokan version 2)
- File:Larousse, Plan de Paris, 1900 - David Rumsey.jpg -> https://davidrumsey.georeferencer.com/maps/553129769171/view (Klokan version 4)
- File:1768 Jeffreys Wall Map of India and Ceylon - Geographicus - India-jeffreys-1768.jpg -> https://warper.wmflabs.org/maps/1998#Preview_tab (Commons MapWarper)
Documentation / Presentations
Wikidata / SDC properties and resources for maps
Existing properties
- d:Wikidata:WikiProject Maps --> properties, statistics
- d:User:Jheald/BL18C/tracking --> d:User:Jheald/BL18C/queries some queries
- https://tools.wmflabs.org/hay/propbrowse/
Property proposals
- d:Wikidata:Property proposal/external georeferencer URL
- d:Wikidata:Property proposal/georeferencing data
- georeferencing control point data
- georeferencing pixel mask data
- georeferencing mask geoshape
- d:Wikidata:Property proposal/based on tabular data
- d:Wikidata:Property proposal/image revision-id
- d:Wikidata:Property proposal/region within image
Data to be stored
In [Map Warper](http://maps.nypl.org/warper/), the following information is captured for each georectified map:
- A [list of ground control points](http://maps.nypl.org/warper/maps/15345/gcps.json) (GCPs). Each GCP maps a pixel coordinate on the scanned map to a real-world coordinate. In Map Warper, these are always [WGS 84](https://en.wikipedia.org/wiki/World_Geodetic_System) latitude/longitude coordinates.
- A mask to remove the non-cartographic parts of the scanned map, in pixel coordinates. In Map Warper, this is a [GML polygon](http://maps.nypl.org/shared/masks/15345.gml).
- [GDAL](https://gdal.org/programs/gdaltransform.html), the open source software Map Warper uses to georectify maps, also expects only a list of GCPs. (QGIS, another open source GIS application used for map georectification, also uses GDAL.) However, GDAL can transform maps to many other projections, not just WGS 84.
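To illustrate how a list of GCPs turns pixel coordinates into real-world coordinates, here is a minimal sketch that fits an affine (first-order) transform to exactly three GCPs. It is not how Map Warper or GDAL are implemented; GDAL offers several transformation methods and normally works with more points. The function names `fit_affine` and `_solve3`, and the GCP representation as `((x, y), (lon, lat))` tuples, are assumptions made for this example.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
GCP = Tuple[Point, Point]  # ((pixel_x, pixel_y), (lon, lat)), hypothetical layout


def _det3(m: List[List[float]]) -> float:
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _solve3(a: List[List[float]], b: Sequence[float]) -> List[float]:
    """Solve the 3x3 linear system a * x = b with Cramer's rule."""
    d = _det3(a)
    if abs(d) < 1e-12:
        raise ValueError("GCP pixel coordinates are collinear")
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        solution.append(_det3(m) / d)
    return solution


def fit_affine(gcps: Sequence[GCP]) -> Callable[[float, float], Point]:
    """Return a function mapping pixel (x, y) to (lon, lat).

    Three non-collinear GCPs determine an affine transform exactly:
    lon = a*x + b*y + c and lat = d*x + e*y + f.
    """
    if len(gcps) != 3:
        raise ValueError("this sketch needs exactly three GCPs")
    a = [[px, py, 1.0] for (px, py), _ in gcps]
    lon_coef = _solve3(a, [world[0] for _, world in gcps])
    lat_coef = _solve3(a, [world[1] for _, world in gcps])

    def to_world(x: float, y: float) -> Point:
        return (lon_coef[0] * x + lon_coef[1] * y + lon_coef[2],
                lat_coef[0] * x + lat_coef[1] * y + lat_coef[2])

    return to_world


if __name__ == "__main__":
    # Made-up GCPs, for illustration only.
    to_world = fit_affine([
        ((0, 0), (4.0, 52.5)),
        ((1000, 0), (5.0, 52.5)),
        ((0, 1000), (4.0, 52.0)),
    ])
    print(to_world(500, 500))
```

With more than three GCPs, a least-squares fit or a higher-order transform would be needed; that is the part GDAL takes care of in practice.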
The following JSON structure could store the GCPs and mask:
{
  "maps": [
    {
      "gcps": [
        {
          "image": [420, 503],
          "world": [4.900, 52.162]
        },
        {
          "image": [1801, 1700],
          "world": [4.991, 52.362]
        },
        {
          "image": [1001, 1201],
          "world": [4.224, 52.962]
        }
      ],
      "mask": [
        [100, 2010],
        [4032, 2010],
        [4100, 300],
        [100, 200]
      ]
    }
  ]
}
Or we could use a GeoJSON-based format, like so: https://commons.wikimedia.org/wiki/Data:Pigot_and_Co_(1842)_p2.138_-_Map_of_Lancashire.georef.map
- A scanned map can consist of multiple maps (e.g. map sheets, inset maps). We can support this by allowing multiple georectified maps in a single JSON `maps` array.
- Is it necessary to allow for other projections? The GeoJSON standard decided to [only support WGS 84](https://tools.ietf.org/html/rfc7946#section-4), but its arguments carry less weight for this proposal: we're using GDAL anyway, and GDAL works with many map projections.
- The mask polygon is different from a GeoJSON polygon: our mask does not support holes.
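The constraints discussed above (one or more entries in `maps`, GCPs pairing an `image` point with a `world` point, and a mask that is a single ring without holes) could be checked along the following lines. This is only a sketch using hand-written checks; a JSON Schema could express most of the same rules. The function name `validate_georef`, the minimum of three GCPs, the `[longitude, latitude]` order of `world`, and treating `mask` as optional are all assumptions, not decisions made in this proposal.

```python
from typing import Any, List


def _is_point(value: Any) -> bool:
    """True for a two-element list of numbers such as [420, 503]."""
    return (isinstance(value, list) and len(value) == 2
            and all(isinstance(n, (int, float)) and not isinstance(n, bool)
                    for n in value))


def validate_georef(doc: Any) -> List[str]:
    """Return a list of problems found in a georectification document."""
    maps = doc.get("maps") if isinstance(doc, dict) else None
    if not isinstance(maps, list) or not maps:
        return ["'maps' must be a non-empty array"]

    errors: List[str] = []
    for i, entry in enumerate(maps):
        where = f"maps[{i}]"
        if not isinstance(entry, dict):
            errors.append(f"{where} must be an object")
            continue

        gcps = entry.get("gcps")
        if not isinstance(gcps, list) or len(gcps) < 3:
            # Assumption: at least three GCPs are required.
            errors.append(f"{where}.gcps needs at least 3 control points")
        else:
            for j, gcp in enumerate(gcps):
                ok = (isinstance(gcp, dict)
                      and _is_point(gcp.get("image"))
                      and _is_point(gcp.get("world")))
                if not ok:
                    errors.append(f"{where}.gcps[{j}] needs 'image' and 'world' points")
                    continue
                lon, lat = gcp["world"]  # assumed order: [longitude, latitude]
                if not (-180 <= lon <= 180 and -90 <= lat <= 90):
                    errors.append(f"{where}.gcps[{j}].world is outside the WGS 84 range")

        mask = entry.get("mask")  # assumption: the mask is optional
        if mask is not None:
            # A flat list of [x, y] points: one ring, so holes cannot be expressed.
            if (not isinstance(mask, list) or len(mask) < 3
                    or not all(_is_point(p) for p in mask)):
                errors.append(f"{where}.mask must be one ring of at least 3 [x, y] points")
    return errors


if __name__ == "__main__":
    example = {"maps": [{
        "gcps": [
            {"image": [420, 503], "world": [4.900, 52.162]},
            {"image": [1801, 1700], "world": [4.991, 52.362]},
            {"image": [1001, 1201], "world": [4.224, 52.962]},
        ],
        "mask": [[100, 2010], [4032, 2010], [4100, 300], [100, 200]],
    }]}
    print(validate_georef(example))
```

A GeoJSON-style mask (an array of rings, where extra rings are holes) fails the `mask` check here, which matches the note that the mask does not support holes.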
Tools
When we've designed this metadata format, we can:
- Build a web interface to mask scanned maps. This tool should accept IIIF images and Wikimedia Commons images. Example: https://i.imgur.com/6hXf4xZ.png
- Create a command line script that uses [GDAL](https://gdal.org/programs/gdalwarp.html) to rectify and crop maps using the JSON rectification metadata.
- Create a IIIF `manifest.json` file that combines the original map with the georectified map.
- Create software that translates [Web Mercator requests](https://en.wikipedia.org/wiki/Web_Mercator_projection) to IIIF requests, using the JSON rectification metadata.
- Start gathering maps from as many repositories of georectified maps as possible, and convert their metadata to our new format. For example: [NYPL](http://maps.nypl.org/warper/), [Map Warper](http://mapwarper.net/), Wikimedia Commons, [David Rumsey Map Collection](https://www.davidrumsey.com/view/georeferenced-maps), [Stadsarchief Amsterdam](https://beeldbank.amsterdam.nl/beeldbank?f_sk_documenttype=kaart&f_string_geoserver_store%5B0%5D=%2A), and many more!
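As a rough idea of what the command line script could do, the sketch below turns one entry of the `maps` array into two GDAL invocations: `gdal_translate` attaches the GCPs to the image, and `gdalwarp` rectifies the result. The helper name `gdal_commands`, the file names, the intermediate VRT file, and the choice of Web Mercator (EPSG:3857) as the target projection are assumptions for illustration. Cropping with the pixel mask is deliberately left out, and the commands are only printed, not run.

```python
import shlex
from typing import Any, Dict, List


def gdal_commands(georef_map: Dict[str, Any], src: str, dst: str,
                  tmp: str = "with_gcps.vrt") -> List[List[str]]:
    """Build gdal_translate and gdalwarp argument lists for one map entry.

    Assumes 'world' holds [longitude, latitude] in WGS 84 (EPSG:4326).
    """
    translate = ["gdal_translate", "-of", "VRT", "-a_srs", "EPSG:4326"]
    for gcp in georef_map["gcps"]:
        x, y = gcp["image"]
        lon, lat = gcp["world"]
        # -gcp takes: pixel, line, easting, northing
        translate += ["-gcp", str(x), str(y), str(lon), str(lat)]
    translate += [src, tmp]

    # Target projection is an assumption; any SRS GDAL knows could be used.
    warp = ["gdalwarp", "-t_srs", "EPSG:3857", "-r", "bilinear",
            "-dstalpha", tmp, dst]
    return [translate, warp]


if __name__ == "__main__":
    entry = {
        "gcps": [
            {"image": [420, 503], "world": [4.900, 52.162]},
            {"image": [1801, 1700], "world": [4.991, 52.362]},
            {"image": [1001, 1201], "world": [4.224, 52.962]},
        ]
    }
    for command in gdal_commands(entry, "scan.jpg", "rectified.tif"):
        print(shlex.join(command))
```

Returning argument lists rather than a shell string keeps file names with spaces safe if the commands are later passed to `subprocess.run`.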
Currently, only the bounding box is stored as structured data:
https://commons.wikimedia.org/wiki/File:Helsingin_kartta_Nummelin_1876.png
- Map Warper (NYPL): http://maps.nypl.org/shared/masks/15345.gml
- Map Warper (WMF): https://warper.wmflabs.org/mapimages/178.gml
- Klokan (version 2): http://british-library.georeferencer.com/map/pce6av5ibirax6d6SvwIyl/201408122139-LUdazO/visualize (and then use View Source)
- https://docs.google.com/presentation/d/1OkJZVjF471LPywNSCtuTuvjUEBRdHMI0znas_a1Ms6k/edit#slide=id.g59a6f0df93_0_2
- https://observablehq.com/@bertspaan/proposal-for-wikimania-2019-hackathon