The SELECT COUNT query in Amazon Athena returns only one record even though the input JSON file has multiple records

Last updated: 2020-10-07

When I execute SELECT COUNT(*) FROM TABLE, the output is "1," but the input file has multiple JSON records. I used org.openx.data.jsonserde.JsonSerDe to create the table. Why does the query return the wrong value?

Resolution

When records aren't separated by a newline character (\n), SELECT COUNT(*) FROM TABLE returns "1." Example:

{"ID":"1","MYTIMESTAMP":"2020-06-09 17:50:46.267","Name":"John"}{"ID":"2","MYTIMESTAMP":"2020-06-10 17:50:46.267","Name":"Joel"}{"ID":"3","MYTIMESTAMP":"2020-06-10 17:50:46.267","Name":"Patrick"}

To correct this problem, be sure that the records are delimited by \n, as shown in the following example:

{"ID":"1","MYTIMESTAMP":"2020-06-09 17:50:46.267","Name":"John"}

{"ID":"2","MYTIMESTAMP":"2020-06-10 17:50:46.267","Name":"Joel"}

{"ID":"3","MYTIMESTAMP":"2020-06-10 17:50:46.267","Name":"Patrick"}

Now, SELECT COUNT(*) FROM TABLE returns "3."


Did this article help?


Do you need billing or technical support?