AWS Partner Network (APN) Blog

Enriching Snowflake Data with Amazon Location Service and AWS Lambda

By Andrea Amorosi, Sr. Partner Solutions Architect – AWS
By Pascal Vogel, Solutions Architect – AWS
By Bosco Albuquerque, Sr. Partner Solutions Architect – AWS

Snowflake-AWS-Partners-2.1
Snowflake
Connect with Snowflake-2

Organizations are collecting more data than ever, including location data such as street addresses, zip codes, GPS coordinates, and other geospatial data types. Combining this location data with existing business data can lead to the discovery of valuable insights and enables organizations to solve meaningful business challenges.

The integration of geospatial data into the broader business intelligence and decision-making process is referred to as location intelligence. This enables organizations to optimize marketing and sales activities with geo-targeting, improve supply chain and route planning processes, or offer unique localized customer experiences.

While transactional business data is oftentimes readily accessible for consumption by data analysts or downstream business processes through organizational data platforms, it may require enrichment to be actionable for location intelligence. For instance, street addresses may need to be geocoded into coordinates or vice versa for certain data-driven use cases.

On Amazon Web Services (AWS), you can use the Snowflake Data Cloud to integrate fragmented data, discover and securely share data, and execute diverse analytic workloads. Snowflake is an AWS Specialization Partner and AWS Marketplace Seller with the Data and Analytics Competency. It supports analyzing structured and unstructured data and offers comprehensive support for analyzing geospatial data.

This post shows how you can enrich your existing Snowflake data with location-based insights using Amazon Location Service for location intelligence workloads.

Amazon Location Service is a location-based service that developers can use to add geospatial data and location functionality to applications. Customers can visualize data on a map, recommend routes, use geocoding to convert plain text addresses into geographic coordinates, use reverse geocoding to convert latitude and longitude coordinates into addresses, and monitor and track assets such as fleets of vehicles.

Amazon Location Service provides cost-effective location-based services (LBS) using high-quality data from global, trusted providers Esri, HERE, Open Data, and GrabMaps.

Solution Overview

The solution presented in this post leverages Snowflake external functions to set up a Snowflake API integration between Snowflake and Amazon Location Service via Amazon API Gateway and AWS Lambda.

External functions are a type of user-defined function (UDF) that calls code stored and executed outside of Snowflake. UDFs allow you to extend built-in system functionality through and can be reused across your code once defined.Architecture diagram showing Snoflake connection with AWS

Figure 1 – Integration architecture overview.

Implementing the external functions requires resources both in your Snowflake account and outside of Snowflake; in this case in your AWS account.

In your Snowflake account:

  • An external function is stored as a database object in Snowflake. Its definition contains the URL of the proxy service endpoint used when the function is called, which arguments it accepts, and which API integration to use to connect to the external service.
  • An API integration describes the interface between Snowflake and an external service. This Snowflake object stores configuration and security information necessary to connect to the external service, such as access credentials. In case of this integration, this includes the AWS Identity and Access Management (IAM) role used to set up a trust-relationship for invoking the external service.

In your AWS account:

  • Amazon API Gateway acts as a proxy service between Snowflake and the external function implementation in AWS Lambda. It receives and authenticates requests from Snowflake and passes back results.
  • Lambda functions implement the external function logic for calling Amazon Location Service. By doing so, they act as the Snowflake remote service.
  • AWS Secrets Manager is used to securely store and programmatically access the credentials used to access your Snowflake account.
  • AWS AppConfig is used to store securely and programmatically access parameters such as the name of your Snowflake database and warehouse.

Using the Amazon Location Service Integration

Deploying the Integration

Follow the detailed instructions in this GitHub repository to deploy the solution to your AWS and Snowflake accounts. Once deployed, take a look at the available external functions or try out the following examples to get started.

Geocoding Location Data

Amazon Location Service supports geocoding location data, such as determining the coordinates (described as latitude and longitude) from a street address or place name via the SearchPlaceIndexForText API action. This integration supports several location data providers.

Geocoding a Single Address

To use the Esri data provider for geocoding, use the following external function:

geocode_amazon_location_service_provider_esri(address VARCHAR)

In a Snowflake query, you can use the external function as follows:

SELECT geocode_amazon_location_service_provider_esri('440 Terry Ave N, Seattle, WA 98109, United States');

The result of this query is raw JSON data:

Image showing a SQL query result of an address being geocoded

Figure 2 – Result of geocoding a single address.

Using the Snowflake GET and parse_json functions, you can parse the raw JSON data and select individual keys from the query result:

SELECT
    GET(result, 'latitude') AS LATITUDE,
    GET(result, 'longitude') AS LONGITUDE
FROM (
    SELECT parse_json(geocode_amazon_location_service_provider_esri('440 Terry Ave N, Seattle, WA 98109, United States')) AS result
);

Take a look at querying semi-structured data to learn more about handling JSON-formatted result data.

Geocoding Multiple Addresses

You can also use the external functions to geocode multiple rows of data in one query. To try this out, create a table to hold sample data:

USE WAREHOUSE MY_WAREHOUSE;
USE DATABASE MY_DATABASE;
CREATE OR REPLACE TABLE MY_SCHEMA.COORDINATES (
    ID INTEGER,
    LONGITUDE FLOAT,
    LATITUDE FLOAT,
    ADDRESS VARCHAR(255),
    CONTROL VARCHAR(255)
);

Insert sample data into the table:

INSERT INTO MY_SCHEMA.COORDINATES (ID, ADDRESS, CONTROL)
VALUES (1, 'Potsdamer Platz 1, Berlin, Germany', 'Berlin'),
    (2, '166 Rue de Rivoli, 75001, 1er Arrondissement, Paris, Île-de-France, FRA', 'Paris'),
    (3, '44 W 77th St, New York, NY, 10024, USA', 'New York'),
    (4, '57 Great Russell Street, London, England, WC1B 3BA, GBR', 'London'),
    (5, 'Via Alberico II 35, 00193, Roma, RM, ITA', 'Rome'),
    (6, 'Avenida de la República 32, Tabacalera, Cuauhtémoc, Ciudad de México, 06030, MEX', 'Mexico City'),
    (7, '2-3-6, Uchikanda, Chiyoda, Tokyo, 1010047, JPN', 'Tokyo'),
    (8, 'Bongcheon-dong, Seoul, KOR', 'Seoul'),
    (9, 'Four Seasons Hotel Beijing, Beijing, CHN', 'Beijing'),
    (10, 'El Doqi Bridge, El Dokki, Giza, EGY', 'Cairo');

Call the geocoding external function on the data:

SELECT
    ADDRESS,
    GET(result, 'latitude') AS LATITUDE,
    GET(result, 'longitude') AS LONGITUDE
FROM(
    SELECT
       ADDRESS,
       parse_json(geocode_amazon_location_service_provider_esri(ADDRESS)) AS result
    FROM LOCATION_DATA.COORDINATES);

As a result, we receive the geocoded LATITUDE and LONGITUDE of the addresses:

Image showing a SQL query result of an address being reverse geocoded

Figure 3 – Result of geocoding multiple addresses.

Reverse Geocoding Location Data

Amazon Location Service supports reverse geocoding a location (described by geographic coordinates as latitude and longitude) to a human-readable address via the SearchPlaceIndexForPosition API action.

Reverse Geocoding a Single Coordinate

To reverse geocode a single coordinate—for instance with the Esri data provider—this integration providers the following external function:

reverse_geocode_amazon_location_service_provider_esri(lng FLOAT, lat FLOAT)

The external function returns multiple address data points such as city, municipality, and the full name and address of the point of interest as label. See the documentation for a full overview of the external function response.

Access the results in a Snowflake query as follows:

SELECT
    GET(result, 'Country') AS CITY,
    GET(result, 'Municipality') AS MUNICIPALITY,
    GET(result, 'Label') AS LABEL
FROM (
    SELECT parse_json(reverse_geocode_amazon_location_service_provider_esri(-118.321969, 34.134011)) AS result
);

Image showing a SQL query result of an address

Figure 4 – Result of reverse geocoding a single coordinate.

Reverse Geocoding Multiple Coordinates

You can also reverse geocode multiple coordinates in a single query. To demonstrate this, empty the sample table and insert some new sample data:

DELETE FROM MY_SCHEMA.COORDINATES;
 INSERT INTO MY_SCHEMA.COORDINATES (ID, LONGITUDE, LATITUDE, CONTROL)
VALUES (1, 13.404954, 52.520008, 'Berlin'),
       (2, 2.352222, 48.856614, 'Paris'),
       (3, -74.005973, 40.712775, 'New York'),
       (4, -0.127758, 51.507351, 'London'),
       (5, 12.496366, 41.902782, 'Rome'),
       (6, -99.133209, 19.432608, 'Mexico City'),
       (7, 139.691706, 35.689487, 'Tokyo'),
       (8, 126.977969, 37.566535, 'Seoul'),
       (9, 116.407395, 39.904211, 'Beijing'),
       (10, 31.235712, 30.044420, 'Cairo');

To use Esri to reverse geocode an address from this set of coordinates, call an external function as follows:

SELECT
    LATITUDE,
    LONGITUDE,
    GET(result, 'Label') AS LABEL
FROM(
    SELECT
        LATITUDE,
        LONGITUDE,
        parse_json(reverse_geocode_amazon_location_service_provider_esri(LONGITUDE, LATITUDE)) AS result
    FROM LOCATION_DATA.COORDINATES);

As a result, we receive a human-readable address for each coordinate:

Image showing a SQL query result of a set of addresses

Figure 5 – Result of reverse geocoding multiple coordinates.

Best Practices for Location Intelligence at Scale

Managing Amazon Location Service Quotas

The default Amazon Location Service quotas implement rate limits for the API operations performed through the external functions. For instance, the default quotas allow 50 requests per second for both the SearchPlaceIndexForPosition and SearchPlaceIndexForText operations.

Depending on the size of your queries, you need to increase the quotas that apply to your AWS account. Request a quota increase by following the instructions in the Amazon Location Service Developer Guide.

Scaling Your AWS Lambda Functions

With automatic scaling, built-in high availability, and a pay-for-use billing model, AWS Lambda is well-suited as the compute platform for location intelligence workloads.

To ensure your Lambda functions can handle large numbers of invocations—for instance, when geocoding large datasets—make sure to manage Lambda function concurrency appropriately. In Lambda, concurrency is the number of requests your function can handle at the same time. For your Lambda functions, you can configure reserved concurrency and provisioned concurrency.

Configure reserved concurrency to guarantee the maximum number of concurrent instances for the Lambda function. Allocate provisioned concurrency before an increase in invocations to ensure all requests are served by initialized Lambda instances with low latency. Lambda also integrates with application auto scaling, allowing you to manage provisioned concurrency on a schedule or based on utilization.

Learn more about Lambda function scaling and configuring reserved and provisioned concurrency in the AWS Lambda Developer Guide.

Consider enabling API Gateway caching to increase the responsiveness of the integration and to optimize the cost of repeat queries.

Securing Your API Endpoints

The integration presented in this post follows security best practices such as storing your Snowflake credentials in AWS Secrets Manager and utilizing IAM to secure access to resources in your AWS account. To protect your Snowflake account, you should regularly rotate your Snowflake credentials and update them in AWS Secrets Manager.

It’s also possible to set up key pair authentication for your Snowflake account and configure key pair rotation following instructions in the documentation.

Furthermore, set up access logging for the API Gateway with Amazon CloudWatch to keep a record of who accessed your API endpoint and how. For an overview of security recommendations for API Gateway, refer to security best practices in Amazon API Gateway.

Extending the Amazon Location Service Integration

The integration presented in this post serves as a starting point for integrating Snowflake and Amazon Location Service. Depending on your location intelligence use case, there are more capabilities of Amazon Location Service you can leverage.

These include, for example, geofencing, routing capabilities such as determining distance or travel time, generating autocomplete suggestions for locations, and more. Consult the Amazon Location Service API Reference to understand all of the available API actions and data types at your disposal.

This solution can be extended using the Snowflake Native Application Framework which allows user to build, distribute, and use data applications on Snowflake. Application providers or producers can use Snowflake’s core functionality to build data applications, and use Snowflake Marketplace to distribute and monetize them.

These data applications are deployed inside a consumer’s Snowflake account. Data application producers get a wide global audience of Snowflake customers, while customers who consume these data applications keep their data centralized, simplifying their application procurement process.

Conclusion

Location intelligence can contribute to drawing value from your business data by integrating geospatial data into the broader business intelligence and decision-making process.

This post demonstrates how you can enrich your business data stored in the Snowflake Data Cloud with location-based insights using Amazon Location Service for location intelligence workloads.

Consult the GitHub repository for the Amazon Location Service with Snowflake solution presented in this post to learn more about its architecture, capabilities, and for detailed deployment instructions. Open a GitHub issue to provide your feedback or create a pull request to extend the solution.

The Amazon Location Service Developer Guide provides an overview of concepts, functionality, and usage instructions.

To learn more about working with location data in Snowflake, take a look at geospatial data types and the following posts on the Snowflake blog:

Contact Snowflake to learn more about realizing the full potential of your data for location intelligence workloads and beyond.

.
Snowflake-APN-Blog-Connect-2023
.


Snowflake – AWS Partner Spotlight

Snowflake is an AWS Partner that supports analyzing structured and unstructured data and offers comprehensive support for analyzing geospatial data.

Contact Snowflake | Partner Overview | AWS Marketplace | Case Studies