When I query CSV data in Athena, I get the error "HIVE_BAD_DATA: Error parsing field value '' for field X: For input string: """

Last updated: 2020-03-30

When I query CSV data in Amazon Athena, I get an error like this:

Your query has the following error(s):
HIVE_BAD_DATA: Error parsing field value '' for field 0: For input string: ""
This query ran against the "default" database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: b99xxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.

Short Description

There are several versions of this error. If the error message includes an input string with a value (for example, "For input string: "12312845691"), the data type in the file doesn't match the expected data type in the schema. For more information, see My Amazon Athena query fails with the error "HIVE_BAD_DATA: Error parsing field value for field X: For input string: "12312845691"".

Errors that specify a null or empty input string ("For input string: "") happen when both of the following are true:

  • You're using Athena with OpenCSVSerDe, which means that your source data uses double quotes (") as the default quote character.
  • The source data contains null values ("") or empty cells.

For example, the following data returns the "Error parsing field value" error:

"1","1","a3","a4"
"","2","a3","def"
"1","3","abc3","ab4"

Resolution

Define each column as STRING. The parser in Athena parses the values from STRING into actual types based on what it finds. This prevents Athena from throwing an error when it finds null values (empty strings with double quotes and no spaces) or empty cells (no values or double quotes). For more information about how Athena processes CSV files, see OpenCSVSerDe for Processing CSV.

To use a specific data type for a column with a null or empty value, use CAST to convert the column to the desired type:

1.    Create the table and define each column as STRING:

CREATE EXTERNAL TABLE myopencsvtable_example (
   col1 string,
   col2 string,
   col3 string,
   col4 string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
   'separatorChar' = ',',
   'quoteChar' = '"',
   'escapeChar' = '\\'
   )
STORED AS TEXTFILE
LOCATION 's3://awsexamplebucket/';

2.    Query the table:

select * from myopencsvtable_example

Example output:

     col1     col2    col3      col4
1    1        1       a3        a4
2             2       a3        def
3    1        3       abc3      ab4

3.    Cast "col1" as INT and print the default value for the null value (""):

SELECT COALESCE(TRY(CAST(col1 AS integer)),0) as cast_column,* from myopencsvtable_example

Example output:

     cast_column    col1    col2     col3      col4
1    1              1       1        a3        a4
2    0                      2        a3        def
3    1              1       3        abc3      ab4 

Did this article help you?

Anything we could improve?


Need more help?