Posted On: Apr 28, 2021
Amazon Redshift native support for JSON and semi-structured data is now generally available. It is based on the new data type ‘SUPER’ that allows you to ingest and store semi-structured data in your Amazon Redshift data warehouses. Amazon Redshift also includes support for PartiQL for SQL-compatible access to relational, semi-structured and nested data. Using the SUPER data type and PartiQL in Amazon Redshift, you can perform advanced analytics that combine classic structured SQL data (such as string, numeric, and timestamp) with the semi-structured SUPER data (such as JSON) with superior performance, flexibility, and ease-of-use.
The SUPER data type is schemaless in nature and allows for storage of nested values that could consist of Redshift scalar values, nested arrays or other nested structures. Amazon Redshift supports the parsing of JSON data into SUPER and up to 5x faster insertion of JSON/SUPER data in comparison to inserting similar data into classic scalar columns. PartiQL is an extension of SQL that is adopted across multiple AWS services. PartiQL allows access to schemaless and nested SUPER data via efficient object and array navigation, unnesting, and flexibly composing queries with classic analytic operations such as JOINs and aggregates. This enables new advanced analytics that discover combinations of structured and semi-structured data. Data engineers can achieve simplified and low latency ELT (Extract, Load, Transform) processing of the inserted semi-structured data directly in their Redshift cluster without integration with external services. PartiQL features that facilitate ELT include schemaless semantics, dynamic typing and type introspection abilities in addition to its navigation and unnesting. These make ingesting and querying schemaless data much easier now that users do not have to pre-discover data types for each ingested source, handle evolving schemas or write complex SQL to account for different types when querying the data. Users can also easily shred the semi-structured data by creating materialized views and can achieve orders of magnitude faster analytical queries, while keeping the materialized views automatically and incrementally maintained.
Native support for JSON and semi-structured data in Amazon Redshift is available in all commercial regions
Learn more about native support for JSON and semi-structured data in Amazon Redshift in documentation.