ArcGIS REST Services Directory

Layer: BuildingFootprints (ID: 0)

Name: BuildingFootprints

Display Field: OBJECTID

Type: Feature Layer

Geometry Type: esriGeometryPolygon

Description: The building extraction is done in two stages:

Semantic segmentation – recognizing building pixels on the aerial image using DNNs
Polygonization – converting building pixel blobs into polygons

Semantic Segmentation:

DNN architecture
The network foundation is ResNet34, to which we have appended RefineNet upsampling layers in order to produce pixel prediction output. The model is fully convolutional, meaning it can be applied to an image of any size (constrained by GPU memory; 4096x4096 in our case).

Training details
The training set consists of 5 million labeled images. The majority of the satellite images cover diverse residential areas in the US. For the sake of good set representation, we have enriched the set with samples from various areas covering mountains, glaciers, forests, deserts, beaches, coasts, etc. Images in the set are 256x256 pixels at 1 ft/pixel resolution. The training is done with the CNTK toolkit using 32 GPUs.

Metrics
These are the intermediate-stage metrics we use to track DNN model improvements; they are pixel based. The pixel error on the evaluation set is 1.15%, and pixel recall and precision are both 94.5%.
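For illustration, here is a minimal PyTorch sketch of a fully-convolutional ResNet34 segmenter in the spirit of the setup described above. The 1x1 head is a deliberately simplified stand-in for the RefineNet upsampling layers, and the original pipeline was trained with CNTK, so the class and variable names here are assumptions rather than Microsoft's actual model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet34

class BuildingSegmenter(nn.Module):
    """Fully-convolutional binary segmenter: ResNet34 trunk + simple upsampling head (hypothetical)."""

    def __init__(self):
        super().__init__()
        trunk = resnet34(weights=None)
        # Drop the average pool and fully-connected classifier so the
        # network stays fully convolutional and accepts any input size.
        self.backbone = nn.Sequential(*list(trunk.children())[:-2])
        # Simplified 1x1 prediction head standing in for RefineNet's
        # upsampling path described above.
        self.head = nn.Conv2d(512, 1, kernel_size=1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = self.backbone(x)      # (N, 512, h/32, w/32)
        logits = self.head(feats)     # (N, 1, h/32, w/32)
        # Restore input resolution; any input size works, bounded only
        # by GPU memory (4096x4096 in the description above).
        return F.interpolate(logits, size=(h, w), mode="bilinear",
                             align_corners=False)

# 256x256 tiles at 1 ft/pixel, matching the training set described above.
model = BuildingSegmenter()
tile = torch.randn(1, 3, 256, 256)
building_prob = torch.sigmoid(model(tile))  # per-pixel building probability
```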
Polygonization:

Method description
We developed a method that approximates the prediction pixels into polygons, making decisions based on the whole prediction feature space. This is very different from standard approaches, e.g. the Douglas-Peucker algorithm, which are greedy in nature. The method tries to impose a priori building properties, which are, at the moment, manually defined and automatically tuned. Some of these properties are (the sketch at the end of this description illustrates the dominant-angle constraint):

A building edge must be of at least some length, both relative and absolute, e.g. 3 meters
Consecutive edge angles are likely to be 90 degrees
Consecutive angles cannot be very sharp, i.e. smaller than some auto-tuned threshold, e.g. 30 degrees
Buildings likely have very few dominant angles, meaning all building edges form an angle of (dominant angle ± nπ/2)

In the near future, we will be looking to deduce these properties automatically from existing building information.

Metrics
Building matching metrics:

Metric      Value
Precision   99.3%
Recall      93.5%

We track various metrics to measure the quality of the output:

Intersection over Union – the standard metric measuring overlap quality against the labels
Shape distance – measures the polygon outline similarity
Dominant angle rotation error – measures the polygon rotation deviation

Our evaluation set contains ~15k buildings. The metrics on the set are: IoU 0.85, shape distance 0.33, average rotation error 1.6 degrees. These metrics are better than or similar to the metrics of OSM buildings against the same labels.

Data Vintage
The vintage of the footprints depends on the vintage of the underlying imagery. Because Bing Imagery is a composite of multiple sources, it is difficult to know the exact dates for individual pieces of data.

How good are the data?
Our metrics show that in the vast majority of cases the quality is at least as good as hand-digitized buildings in OpenStreetMap. It is not perfect, particularly in dense urban areas, but it is still awesome.

What is the coordinate reference system?
EPSG:4326

Will Microsoft be open sourcing the models?
Yes. We are working through the internal process to open source the segmentation models and polygonization algorithms.

Will there be more data coming for other geographies?
Maybe. This is a work in progress.

Why is the data being released?
Microsoft has a continued interest in supporting a thriving OpenStreetMap ecosystem.
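As a concrete illustration of the dominant-angle property from the polygonization section above, here is a toy sketch that estimates one dominant direction for an outline and snaps every edge to (dominant angle ± nπ/2). It illustrates only that stated constraint, not the released algorithm; the helper names are hypothetical, and a real implementation would also enforce ring closure, the minimum-edge-length rule, and the sharp-angle threshold.

```python
import math

def dominant_angle(points):
    # Length-weighted circular mean of edge directions, folded to a
    # period of pi/2 (multiply-by-4 trick), so perpendicular edges
    # reinforce the same dominant direction.
    sx = sy = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        length = math.hypot(x2 - x1, y2 - y1)
        theta = math.atan2(y2 - y1, x2 - x1)
        sx += length * math.cos(4 * theta)
        sy += length * math.sin(4 * theta)
    return math.atan2(sy, sx) / 4

def snap_edges(points, angle):
    # Snap each edge to the nearest multiple of 90 degrees relative to
    # the dominant angle: rotate into the dominant frame, zero out the
    # smaller component of each edge vector, and rotate back. Vertices
    # are rebuilt from cumulative edge vectors (closure not enforced).
    c, s = math.cos(-angle), math.sin(-angle)
    rot = [(x * c - y * s, x * s + y * c) for x, y in points]
    snapped = [rot[0]]
    for (x1, y1), (x2, y2) in zip(rot, rot[1:]):
        dx, dy = x2 - x1, y2 - y1
        if abs(dx) >= abs(dy):   # closer to horizontal: drop dy
            dy = 0.0
        else:                    # closer to vertical: drop dx
            dx = 0.0
        px, py = snapped[-1]
        snapped.append((px + dx, py + dy))
    c, s = math.cos(angle), math.sin(angle)
    return [(x * c - y * s, x * s + y * c) for x, y in snapped]

# A slightly skewed quadrilateral: snapping squares it up.
outline = [(0.0, 0.0), (10.0, 0.4), (9.6, 8.4), (-0.4, 8.0)]
print(snap_edges(outline, dominant_angle(outline)))
```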

Service Item Id: 1ad4d390100448c39926b120594c0946

Copyright Text: Microsoft

Default Visibility: true

MaxRecordCount: 2000

Supported Query Formats: JSON, geoJSON, PBF

Min Scale: 0

Max Scale: 0

Supports Advanced Queries: true

Supports Statistics: true

Has Labels: false

Can Modify Layer: true

Can Scale Symbols: false

Use Standardized Queries: true

Supports Datum Transformation: true

HasZ: false

HasM: false

Has Attachments: false

HTML Popup Type: esriServerHTMLPopupTypeAsHTMLText

Type ID Field: null

Fields:
Supported Operations: Query, Query Attachments, Query Analytic, Generate Renderer, Return Updates

Child Resources: Iteminfo, Thumbnail, Metadata
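Since the layer supports the Query operation with JSON, geoJSON, and PBF output and a MaxRecordCount of 2000, a client can page through footprints for an area of interest using the standard ArcGIS REST query parameters. A minimal sketch follows; the service URL is a placeholder (this page does not show the full endpoint), and paging via resultOffset assumes the service supports pagination, which its advanced-query support suggests.

```python
import requests

# Hypothetical endpoint; substitute the real host and service path.
LAYER_URL = "https://<host>/arcgis/rest/services/BuildingFootprints/FeatureServer/0"

def fetch_footprints(bbox, page_size=2000):
    """Yield building footprints intersecting bbox = (xmin, ymin, xmax, ymax) in EPSG:4326."""
    offset = 0
    while True:
        params = {
            "f": "geojson",                          # one of the supported query formats
            "where": "1=1",
            "geometry": ",".join(map(str, bbox)),
            "geometryType": "esriGeometryEnvelope",
            "inSR": 4326,                            # layer CRS is EPSG:4326
            "spatialRel": "esriSpatialRelIntersects",
            "outFields": "*",
            "resultOffset": offset,                  # page past MaxRecordCount (2000)
            "resultRecordCount": page_size,
        }
        features = requests.get(LAYER_URL + "/query", params=params).json()["features"]
        yield from features
        if len(features) < page_size:
            break
        offset += page_size

# Example: print the display field for footprints in a small bounding box.
for f in fetch_footprints((-122.27, 47.55, -122.25, 47.57)):
    print(f["properties"].get("OBJECTID"))
```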