Yasunori Kirimoto for MIERUNE

Spatial Search of Overture Maps Data Using Wherobots Cloud

About Wherobots

Have you heard of the geospatial service called Wherobots?

Wherobots is a cloud-based platform for analyzing and processing geospatial data and applying AI to it. Its strength is handling global-scale data in a scalable way, and it supports a wide range of use cases, including remote sensing, optimization, and risk analysis. It also makes tasks such as satellite image analysis, change detection, and traffic optimization easier. It provides a UI and a CLI, and an SDK for TypeScript is also available. You can try the basic functions on the free plan and move to the Professional plan or above for more advanced ones. The company was founded by the developers of Apache Sedona, so it has deep ties to open source. It has been attracting attention recently, including reports that it raised 21.5 million dollars in funding.

Meeting at AWS re:Invent 2024

I first heard about Wherobots shortly before this year's AWS re:Invent. CARTO, which hosted a "Geo Party" at last year's AWS re:Invent, invited me to this year's event, and the invitation mentioned that it would be held jointly with Wherobots.


During AWS re:Invent 2024, I visited the Wherobots booth in the EXPO Hall. The staff explained the Wherobots geospatial data analysis platform and its technical strengths in detail. That conversation inspired this article.


Joining the Overture Maps Foundation

Wherobots has also joined the Overture Maps Foundation this year.

https://wherobots.com/overture-maps-foundation-wherobots-new-member-cloud-native-geospatial-intelligence/

Wherobots' products

Wherobots has four products.


1. Wherobots Cloud
This integrated platform performs ETL (extract, transform, load), analysis, and AI processing of geospatial data in the cloud. Users can develop and analyze with Spatial SQL, Python, Java, and other languages in Jupyter Notebook, covering everything from pre-processing geospatial data to applying advanced models. It currently supports workloads on AWS, and expansion to other cloud environments such as Azure and GCP is reportedly planned.

2. WherobotsDB
A cloud-native, serverless analytics engine optimized for geospatial data. It is said to run everything from small-scale analyses to global-scale geospatial queries up to 20 times faster, and it is expected to cost less than conventional cloud analytics engines. It supports various geospatial data formats and projections, and it handles workloads flexibly through on-demand automatic scaling.

3. WherobotsAI
This module provides computer vision functions dedicated to remote sensing data. It enables multifaceted analysis, such as segmentation, change detection, object detection, land use classification, and climate analysis from satellite images, and is expected to be applied in a wide range of fields, including infrastructure monitoring, natural disaster prediction, and agricultural optimization.

4. Wherobots Spatial Catalog
This catalog provides unified management of and access to large geospatial datasets. Users can find the data they need through keyword search and metadata filtering, then use it for analysis in Wherobots Cloud and WherobotsDB. Public, commercial, and customer-specific data can be handled together, which supports data-driven decision-making.


In this article, we will try out Wherobots Cloud. Wherobots Cloud can be used together with the other products listed above.

Wherobots Cloud Pricing

In this article, we will use the Community version. The more advanced Professional version is available through AWS Marketplace. As of December 2024, the Professional version offers a free trial of up to $400 for the first 30 days.


Advance Preparation

Create an account

First, create an account.

Click on "Try Wherobots."

Enter your account information → Click on "Create Account."

Enter your organization name → Click on "Submit."

After logging in, the dashboard will be displayed.

Start the Notebook

Next, start the Notebook that will be used to process location data.

Click on "Start".

After starting the Notebook → Click on "Open."

The Notebook will be displayed.

Spatial search using Overture Maps data

Using WherobotsDB

To use WherobotsDB, create a "SedonaContext" object.

from sedona.spark import *

config = SedonaContext.builder().getOrCreate()
sedona = SedonaContext.create(config)

Check the Open Data Catalog in the Wherobots Spatial Catalog

The Open Data Catalog provides Overture Maps and Foursquare data as ready-to-use datasets.

List the schemas in the Open Data Catalog.

sedona.sql("SHOW SCHEMAS IN wherobots_open_data").show()
+--------------------+
|           namespace|
+--------------------+
|            overture|
| overture_2024_02_15|
| overture_2024_05_16|
| overture_2024_07_22|
|overture_2024_01_...|
|overture_2024_05_...|
|overture_2023_07_...|
|overture_2024_06_...|
|overture_2024_09_...|
|overture_2024_08_...|
|overture_2024_10_...|
|overture_2024_03_...|
|overture_2024_04_...|
|overture_2023_10_...|
|overture_2023_11_...|
|overture_2024_06_...|
|overture_2024_07_...|
|overture_2024_02_...|
|overture_2023_12_...|
|foursquare_2024_1...|
+--------------------+
only showing top 20 rows
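
Several schema names above end in `...` because `show()` shortens long strings by default. The sketch below mimics that behaviour in plain Python, assuming Spark's default limit of 20 characters; `truncate_cell` and the long sample name are hypothetical and only serve as an illustration. Passing `truncate=False` to `show()`, as the next query does, prints the full values.

```python
def truncate_cell(value: str, limit: int = 20) -> str:
    """Mimic how a long string cell is shortened for display (assumed limit: 20)."""
    if limit <= 0 or len(value) <= limit:
        return value
    if limit < 4:
        return value[:limit]
    # Keep limit - 3 characters and mark the cut with an ellipsis.
    return value[: limit - 3] + "..."


# Hypothetical schema name, not taken from the catalog.
long_name = "example_schema_2024_12_release"
print(truncate_cell(long_name))
# A 19-character name fits within the limit and is shown in full.
print(truncate_cell("overture_2024_02_15"))
```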

Display the tables in the Overture Maps database.

sedona.sql("SHOW tables IN wherobots_open_data.overture").show(truncate=False)
+---------+-----------------------------+-----------+
|namespace|tableName                    |isTemporary|
+---------+-----------------------------+-----------+
|overture |admins_administrativeBoundary|false      |
|overture |admins_locality              |false      |
|overture |buildings_building           |false      |
|overture |places_place                 |false      |
|overture |transportation_connector     |false      |
|overture |transportation_segment       |false      |
+---------+-----------------------------+-----------+

Display the schema of the "places_place" table in Overture Maps.

sedona.table("wherobots_open_data.overture.places_place").printSchema()
root
 |-- id: string (nullable = true)
 |-- updatetime: string (nullable = true)
 |-- version: integer (nullable = true)
 |-- names: map (nullable = true)
 |    |-- key: string
 |    |-- value: array (valueContainsNull = true)
 |    |    |-- element: map (containsNull = true)
 |    |    |    |-- key: string
 |    |    |    |-- value: string (valueContainsNull = true)
 |-- categories: struct (nullable = true)
 |    |-- main: string (nullable = true)
 |    |-- alternate: array (nullable = true)
 |    |    |-- element: string (containsNull = true)
 |-- confidence: double (nullable = true)
 |-- websites: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- socials: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- emails: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- phones: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- brand: struct (nullable = true)
 |    |-- names: map (nullable = true)
 |    |    |-- key: string
 |    |    |-- value: array (valueContainsNull = true)
 |    |    |    |-- element: map (containsNull = true)
 |    |    |    |    |-- key: string
 |    |    |    |    |-- value: string (valueContainsNull = true)
 |    |-- wikidata: string (nullable = true)
 |-- addresses: array (nullable = true)
 |    |-- element: map (containsNull = true)
 |    |    |-- key: string
 |    |    |-- value: string (valueContainsNull = true)
 |-- sources: array (nullable = true)
 |    |-- element: map (containsNull = true)
 |    |    |-- key: string
 |    |    |-- value: string (valueContainsNull = true)
 |-- bbox: struct (nullable = true)
 |    |-- minx: double (nullable = true)
 |    |-- maxx: double (nullable = true)
 |    |-- miny: double (nullable = true)
 |    |-- maxy: double (nullable = true)
 |-- geometry: geometry (nullable = true)
 |-- geohash: string (nullable = true)

Spatial SQL for spatial search

You can perform various spatial searches using Spatial SQL.

Create a "places" view from the "places_place" table in Overture Maps.

sedona.table("wherobots_open_data.overture.places_place").createOrReplaceTempView("places")

Get point data with the name, category, and coordinates of each place.

sedona.sql("SELECT categories.main AS category, names.common[0].value AS name, geometry FROM places LIMIT 20").show(truncate=False)
+---------------------+----------------------------------+------------------------------+
|category             |name                              |geometry                      |
+---------------------+----------------------------------+------------------------------+
|taiwanese_restaurant |台湾料理四季紅                    |POINT (136.885079 35.343057)  |
|japanese_restaurant  |いろ川                            |POINT (139.848961 35.739373)  |
|farm                 |株式会社阿部農園                  |POINT (139.0049956 37.7329319)|
|train_station        |岩波駅                            |POINT (138.919082 35.215899)  |
|japanese_restaurant  |甲南そば                          |POINT (135.27482 34.728534)   |
|restaurant           |Cuatro                            |POINT (138.8529754 35.1067167)|
|smoothie_juice_bar   |ゴクゴク 横浜ワールドポーターズ店|POINT (139.6385154 35.4541338)|
|hotel                |Curation Hotel                    |POINT (139.0785116 35.1066403)|
|asian_restaurant     |集来軒                            |POINT (135.470227 34.717611)  |
|professional_services|株式会社CDF                       |POINT (135.1936619 34.6962007)|
|public_plaza         |深見歴史の森スポーツ広場          |POINT (139.464748 35.491567)  |
|pet_store            |うさぎ専門店ちゅらうさぎ          |POINT (138.2966425 34.8537383)|
|japanese_restaurant  |無添くら寿司横浜長津田店          |POINT (139.497489 35.521179)  |
|health_and_medical   |大滝漢方堂                        |POINT (136.5 36.05)           |
|japanese_restaurant  |焼き鳥酒場 ちょりちょり          |POINT (139.5350146 36.0305335)|
|eat_and_drink        |味のじゅん天                      |POINT (140.386382 37.399886)  |
|japanese_restaurant  |まご茶亭                          |POINT (139.071774 35.09486)   |
|gym                  |スマートフィット100大塚店         |POINT (140.408632 36.383624)  |
|NULL                 |法隆寺駅                          |POINT (135.739107 34.601617)  |
|bar                  |布施酒場かい                      |POINT (135.564394 34.663048)  |
+---------------------+----------------------------------+------------------------------+
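
The expression `names.common[0].value` walks the nested `names` column from the schema shown earlier: the `common` key of the map, the first element of the array (bracket indexing is zero-based in Spark SQL), then the `value` key of the inner map. Below is a minimal plain-Python sketch of the same lookup on a hand-written record shaped like that schema; the helper `first_common_name` and the `language` key are assumptions made for illustration.

```python
from typing import Optional

# Hand-written record shaped like one row of the places table (illustrative).
record = {
    "names": {"common": [{"value": "岩波駅", "language": "local"}]},
    "categories": {"main": "train_station", "alternate": []},
}


def first_common_name(row: dict) -> Optional[str]:
    """Return names.common[0].value, or None when any level is missing."""
    common = (row.get("names") or {}).get("common") or []
    if not common:
        return None
    return common[0].get("value")


print(first_common_name(record), record["categories"]["main"])
# A row without names yields None, similar to a NULL in the query result.
print(first_common_name({"names": None}))
```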

Get places in the "hiking_trail" category within a 10 km radius of a specified point.

sedona.sql("""
SELECT names.common[0].value AS name, categories.main AS category, geometry
FROM places
WHERE ST_DistanceSphere(ST_GeomFromWKT('POINT (139.7645 35.6811)'), geometry) < 10000
  AND categories.main = 'hiking_trail'
LIMIT 20
""").show(truncate=False)
+-----------------------------+------------+------------------------------+
|name                         |category    |geometry                      |
+-----------------------------+------------+------------------------------+
|玉川上水旧水路幡ヶ谷緑道     |hiking_trail|POINT (139.678356 35.676208)  |
|向日葵の小径                 |hiking_trail|POINT (139.756216 35.674825)  |
|荒木坂                       |hiking_trail|POINT (139.7393195 35.7115361)|
|目黒川緑道池尻大橋駅側入口   |hiking_trail|POINT (139.6854 35.651325)    |
|芝浦アイランド 遊歩道        |hiking_trail|POINT (139.750809 35.63663)   |
|大塚バラロード               |hiking_trail|POINT (139.72557 35.729693)   |
|解剖坂                       |hiking_trail|POINT (139.759613 35.721601)  |
|足立の平成五色桜             |hiking_trail|POINT (139.770207 35.759781)  |
|芸術の散歩道                 |hiking_trail|POINT (139.77332 35.7163)     |
|玉川上水旧水路世田谷緑道     |hiking_trail|POINT (139.662694 35.670916)  |
|三段坂                       |hiking_trail|POINT (139.769306 35.718932)  |
|飛鳥大坂                     |hiking_trail|POINT (139.737485 35.750156)  |
|目黒川東海禅寺裏遊歩道       |hiking_trail|POINT (139.739045 35.615981)  |
|山王男坂                     |hiking_trail|POINT (139.740707 35.674686)  |
|レインボープロムナード 台場口|hiking_trail|POINT (139.775662 35.635127)  |
|大横川の桜並木               |hiking_trail|POINT (139.794909 35.671448)  |
|白鷺坂                       |hiking_trail|POINT (139.736717 35.723212)  |
|三平坂                       |hiking_trail|POINT (139.731583 35.758)     |
|玉川上水旧水路初台緑道       |hiking_trail|POINT (139.687152 35.680788)  |
|レインボープロムナード 芝浦口|hiking_trail|POINT (139.759193 35.637868)  |
+-----------------------------+------------+------------------------------+
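
`ST_DistanceSphere` returns the distance in meters between two lon/lat geometries on a spherical Earth, which is why the threshold `10000` corresponds to 10 km. The following is a rough plain-Python equivalent based on the haversine formula, assuming a mean Earth radius of 6,371,008 m; Sedona's exact radius and implementation may differ, and `distance_sphere_m` is a hypothetical helper.

```python
import math

EARTH_RADIUS_M = 6_371_008.0  # assumed mean Earth radius in meters


def distance_sphere_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


center = (139.7645, 35.6811)     # query point used in the SQL above
trail = (139.678356, 35.676208)  # first row of the result table
d = distance_sphere_m(*center, *trail)
print(f"{d:.0f} m, within 10 km: {d < 10000}")
```

The first trail in the result lies a little under 8 km from the query point, so it passes the 10 km filter.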

Create a DataFrame of the spatial search results.

trails_df = sedona.sql("""
SELECT names.common[0].value AS name, categories.main AS category, geometry
FROM places
WHERE ST_DistanceSphere(ST_GeomFromWKT('POINT (139.7645 35.6811)'), geometry) < 10000
  AND categories.main = 'hiking_trail'
""")

Visualize the spatial search results

You can use SedonaKepler or SedonaPyDeck to visualize the spatial search results.

Visualize the spatial search results using SedonaKepler.

SedonaKepler.create_map(trails_df, "Hiking Trails")


Convert Overture Maps data to PMTiles

Use WherobotsDB

Create a "SedonaContext" object to use WherobotsDB.

from sedona.spark import *

config = SedonaContext.builder().getOrCreate()
sedona = SedonaContext.create(config)

Extract data for a specified range

Define a polygon around central Tokyo as the area to extract.

from sedona.sql.st_constructors import ST_GeomFromText
from sedona.sql.st_predicates import ST_Intersects

region_wkt = "POLYGON ((139.7543 35.7044, 139.7371 35.6921, 139.7377 35.6758, 139.7498 35.6600, 139.7734 35.6658, 139.7841 35.6793, 139.7845 35.7023, 139.7543 35.7044))"
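
`region_wkt` describes a closed ring of lon/lat vertices (the first and last coordinates are identical). As a quick sanity check outside Spark, the sketch below parses that ring and runs a standard ray-casting test in plain Python. `parse_polygon_ring` and `point_in_ring` are hypothetical helpers, and the parser only handles a simple single-ring `POLYGON` like this one.

```python
import re

region_wkt = "POLYGON ((139.7543 35.7044, 139.7371 35.6921, 139.7377 35.6758, 139.7498 35.6600, 139.7734 35.6658, 139.7841 35.6793, 139.7845 35.7023, 139.7543 35.7044))"


def parse_polygon_ring(wkt):
    """Return the outer ring of a simple WKT POLYGON as (lon, lat) tuples."""
    inner = re.search(r"\(\((.*?)\)", wkt).group(1)
    return [tuple(float(v) for v in pair.split()) for pair in inner.split(",")]


def point_in_ring(lon, lat, ring):
    """Ray casting: count how many edges a ray going east from the point crosses."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


ring = parse_polygon_ring(region_wkt)
# The point used in the earlier distance query lies inside the region.
print(point_in_ring(139.7645, 35.6811, ring))
# An arbitrary point further west lies outside it.
print(point_in_ring(139.70, 35.69, ring))
```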

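Create a DataFrame from the "buildings_building" table in Overture Maps, keeping the geometry, a fixed "buildings" layer name, and the dataset of the first source.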

import pyspark.sql.functions as f

buildings_df = (
    sedona.table("wherobots_open_data.overture_2024_02_15.buildings_building")
    .select(
        f.col("geometry"),
        f.lit("buildings").alias("layer"),
        f.element_at(f.col("sources"), 1).dataset.alias("source")
    )
)
buildings_df.show()
+--------------------+---------+--------------------+
|            geometry|    layer|              source|
+--------------------+---------+--------------------+
|POLYGON ((-49.438...|buildings|Microsoft ML Buil...|
|POLYGON ((-49.438...|buildings|Google Open Build...|
|POLYGON ((-49.438...|buildings|Microsoft ML Buil...|
|POLYGON ((-49.438...|buildings|Google Open Build...|
|POLYGON ((-49.441...|buildings|Google Open Build...|
|POLYGON ((-49.440...|buildings|Google Open Build...|
|POLYGON ((-49.441...|buildings|Microsoft ML Buil...|
|POLYGON ((-49.441...|buildings|Google Open Build...|
|POLYGON ((-49.440...|buildings|Microsoft ML Buil...|
|POLYGON ((-49.442...|buildings|Google Open Build...|
|POLYGON ((-49.442...|buildings|Google Open Build...|
|POLYGON ((-49.442...|buildings|Google Open Build...|
|POLYGON ((-49.442...|buildings|Google Open Build...|
|POLYGON ((-49.442...|buildings|Microsoft ML Buil...|
|POLYGON ((-49.442...|buildings|Google Open Build...|
|POLYGON ((-49.438...|buildings|Google Open Build...|
|POLYGON ((-49.438...|buildings|Google Open Build...|
|POLYGON ((-49.439...|buildings|Google Open Build...|
|POLYGON ((-49.439...|buildings|Google Open Build...|
|POLYGON ((-49.438...|buildings|Microsoft ML Buil...|
+--------------------+---------+--------------------+
only showing top 20 rows

Create a DataFrame of roads from the "transportation_segment" table in Overture Maps, using the same three columns.

roads_df = (
    sedona.table("wherobots_open_data.overture_2024_02_15.transportation_segment")
    .select(
        f.col("geometry"),
        f.lit("roads").alias("layer"),
        f.element_at(f.col("sources"), 1).dataset.alias("source")
    )
)
roads_df.show()
+--------------------+-----+-------------+
|            geometry|layer|       source|
+--------------------+-----+-------------+
|LINESTRING (7.034...|roads|OpenStreetMap|
|LINESTRING (7.037...|roads|OpenStreetMap|
|LINESTRING (7.032...|roads|OpenStreetMap|
|LINESTRING (7.033...|roads|OpenStreetMap|
|LINESTRING (7.031...|roads|OpenStreetMap|
|LINESTRING (7.031...|roads|OpenStreetMap|
|LINESTRING (7.031...|roads|OpenStreetMap|
|LINESTRING (7.033...|roads|OpenStreetMap|
|LINESTRING (7.034...|roads|OpenStreetMap|
|LINESTRING (7.030...|roads|OpenStreetMap|
|LINESTRING (7.037...|roads|OpenStreetMap|
|LINESTRING (7.037...|roads|OpenStreetMap|
|LINESTRING (7.041...|roads|OpenStreetMap|
|LINESTRING (7.051...|roads|OpenStreetMap|
|LINESTRING (7.037...|roads|OpenStreetMap|
|LINESTRING (7.050...|roads|OpenStreetMap|
|LINESTRING (7.054...|roads|OpenStreetMap|
|LINESTRING (7.051...|roads|OpenStreetMap|
|LINESTRING (7.052...|roads|OpenStreetMap|
|LINESTRING (7.052...|roads|OpenStreetMap|
+--------------------+-----+-------------+
only showing top 20 rows
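
Note that `element_at(..., 1)` selects the first entry of `sources`: unlike the zero-based `[0]` bracket syntax used in the SQL queries earlier, `element_at` counts array positions from 1. A plain-Python sketch of that convention follows. The `element_at` function here is a stand-in written for illustration (returning `None` for an out-of-range index is an assumption about non-ANSI behaviour), and the sample `sources` value is made up apart from the dataset name seen in the output above.

```python
def element_at(array, index):
    """1-based array lookup; negative indexes count from the end."""
    if index == 0:
        raise ValueError("element_at indexes start at 1")
    pos = index - 1 if index > 0 else len(array) + index
    if 0 <= pos < len(array):
        return array[pos]
    return None  # assumed out-of-range behaviour


# Illustrative sources array; the second entry is invented.
sources = [{"dataset": "OpenStreetMap"}, {"dataset": "example-second-source"}]
print(element_at(sources, 1)["dataset"])
print(element_at(sources, -1)["dataset"])
```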

Combine the roads and buildings DataFrames, then keep only the features that intersect the Tokyo region.

features_df = roads_df.union(buildings_df)

# Keep only the features that intersect the Tokyo region polygon.
features_df = features_df.filter(ST_Intersects(f.col("geometry"), ST_GeomFromText(f.lit(region_wkt))))

features_df.count()
38957

Create vector tiles

Create vector tiles of the extracted data.

from wherobots import vtiles

tiles_df = vtiles.generate(features_df)
tiles_df.show(3, 150, True)
-RECORD 0----------------------------------------------------------------------------------------------------------------------------------------------------------
 tile     | {56, 25, 6}                                                                                                                                            
 features | [1A DA B8 02 0A 05 72 6F 61 64 73 12 16 12 02 00 00 18 02 22 0E 09 A4 36 82 0D 22 01 01 01 00 01 00 01 01 12 10 12 02 00 00 18 02 22 08 09 98 36 F2... 
-RECORD 1----------------------------------------------------------------------------------------------------------------------------------------------------------
 tile     | {14552, 6453, 14}                                                                                                                                      
 features | [1A 8E 01 0A 05 72 6F 61 64 73 12 3E 12 02 00 00 18 02 22 36 09 FE 0A 8A 08 AA 01 01 0F 03 27 1E 8F 03 04 2D 06 27 22 83 02 14 9D 01 06 31 06 35 1E... 
-RECORD 2----------------------------------------------------------------------------------------------------------------------------------------------------------
 tile     | {29104, 12900, 15}                                                                                                                                     
 features | [1A 90 20 0A 05 72 6F 61 64 73 12 14 12 02 00 00 18 02 22 0C 09 D4 23 D6 3D 1A 05 56 03 38 29 0E 12 43 12 02 00 00 18 02 22 3B 09 88 21 D0 3A C2 01... 
only showing top 3 rows
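
Each `tile` value above appears to be an (x, y, zoom) triple in the usual web-map tiling scheme, where zoom level z splits the world into 2^z × 2^z tiles. Assuming that scheme (the output itself does not confirm it), the sketch below converts a lon/lat position to tile indexes in plain Python; `lonlat_to_tile` is a hypothetical helper, not part of `vtiles`.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Convert WGS84 lon/lat to Web Mercator (XYZ) tile indexes."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


# A point inside the Tokyo region used in this article, at zoom 14.
print(lonlat_to_tile(139.7645, 35.6811, 14))
```

At zoom 14 this point falls in column x = 14552, the same x as the second record shown above, which is consistent with the tiles covering the extracted Tokyo area.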

Convert vector tiles to PMTiles

Write the vector tiles to a PMTiles file.

import os

full_tiles_path = os.getenv("USER_S3_PATH") + "tiles.pmtiles"
vtiles.write_pmtiles(tiles_df, full_tiles_path, features_df=features_df)
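
The path above is built by plain string concatenation, so it presumably relies on `USER_S3_PATH` being set and ending with a slash. A more defensive sketch using only the standard library is shown below; `build_output_path` is a hypothetical helper and the bucket name in the example is made up.

```python
import os


def build_output_path(filename, base=None):
    """Join a base prefix and a file name with exactly one slash."""
    base = base if base is not None else os.getenv("USER_S3_PATH")
    if not base:
        raise RuntimeError("USER_S3_PATH is not set")
    return base.rstrip("/") + "/" + filename


# Made-up prefix for illustration only.
print(build_output_path("tiles.pmtiles", base="s3://example-bucket/user/"))
```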

Visualizing PMTiles data

Visualize PMTiles data using Leafmap.

vtiles.show_pmtiles(full_tiles_path)


Visualizing simplified PMTiles data

To preview large amounts of data, generate a simplified PMTiles file.

sample_tiles_path = os.getenv("USER_S3_PATH") + "sampleTiles.pmtiles"
vtiles.generate_quick_pmtiles(features_df, sample_tiles_path)


References
Wherobots
