When I try to read JSON data in Amazon Athena, I receive NULL or incorrect data errors. How can I resolve this?

To remedy NULL or incorrect data errors, check the following common issues:

Use the correct JSON SerDe with Amazon Athena to read your data

Athena is able to read JSON data using one of two JSON SerDes:

  • The native Hive / HCatalog JsonSerDe (org.apache.hive.hcatalog.data.JsonSerDe)
  • The OpenX SerDe (org.openx.data.jsonserde.JsonSerDe)

If you're not sure which SerDE you used, try both of the available SerDE versions.

Be sure your JSON record is on a single line

Athena doesn't currently support multi-line JSON records.

Generate your data with case-insensitive columns

Athena is case-insensitive, so if you have column names that differ only by case (for example, “Column” and “column”), Athena generates an error, and your data will not be visible in Athena.

Provide an option to ignore malformed records

If you use the “org.openx.data.jsonserde.JsonSerDe” SerDe, you can provide an option to ignore malformed records. This will return malformed records as NULL:

CREATE EXTERNAL TABLE json (
    a string,
    b int
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ( 'ignore.malformed.json' = 'true')
LOCATION 's3://bucket/path/';

After the table is created, run your new query on the new table.


Did this page help you? Yes | No

Back to the AWS Support Knowledge Center

Need help? Visit the AWS Support Center

Published: 2016-12-15

Updated: 2017-09-25