How do I troubleshoot issues with joining data sources in QuickSight?

4 minute read
0

I want to join data sources in Amazon QuickSight, but I'm encountering issues.

Short description

The following are common scenarios that can prevent you from joining data from different sources in Amazon QuickSight:

  • You can't see the Add data button and grid panel on the Edit dataset page.
  • You're using geographic fields.
  • You're trying to create a third dataset from two Amazon datasets.
  • QuickSight ran out of join memory.
  • You get a duplicate column or ambiguous column error.

Resolution

You can't see the Add data and grid panel on the Edit dataset page

If you can't see the Add data button and editing panel, then you must request access to the data source from the dataset owner.

If you own the dataset and want to share it with other users, then complete the following steps:

  1. Open the Amazon QuickSight console.
  2. Choose Datasets, and then choose New dataset.
  3. Select the dataset that you want to share.
  4. Choose Share data source, and then choose Invite users.
  5. Enter the user name and required permission.
  6. Choose Share.

To add more data to the dataset:

  1. Open the dataset that you want to add data to, and then choose Edit dataset.
  2. On the Data preparation page, choose Add data.
  3. Choose how you want to add the data. You can add it from the dataset, data source, or by uploading a file.
    Note: You can also choose Use Custom SQL to open the query editor and then write a query for the SQL data source.
  4. Choose the red dots to configure the join.
  5. Select your preferred join type: Inner, Left, Right, or Full.
  6. Choose Apply.

You're using geographic fields

Geographical fields aren't supported in the join interface. To resolve this issue, you can:

  1. Change the data type from Geospatial to String.
  2. Apply your preferred join type.
  3. Return to the dataset page, and select the dataset.
  4. Choose Use in a new Dataset to create a new child dataset in the parent dataset.
  5. Change the field type back to Geospatial.
  6. Choose Save.

You're trying to create a third dataset from two datasets

You can't join two datasets to create a third dataset.

However, for Amazon Athena datasets, you can do the following:

  1. Write a query in Athena by joining two tables, and then create a view.
  2. Create an Athena data source on the view.
  3. Write a custom SQL, and then create an Athena dataset in QuickSight using the new Athena data source.

QuickSight ran out of join memory

When you join datasets, one of the datasets must be within 1 GB in size. If more than one dataset is larger than 1 GB, then QuickSight runs out of memory. To resolve this issue, use a custom SQL query to join them.

If you have large table joins, then it's a best practice to run a join conditions query at the database, and create a table or view. Then, create the dataset in QuickSight from the table or view.

For more information, see Joining across data sources on Amazon QuickSight.

You get a duplicate column or ambiguous column error

You receive the following duplicate column error:

ERROR - Duplicate column name 'column name

The following example query results in a duplicate column error:

select * from schema.sales, schema.date where sales.dateid = date.dateid;

You receive the following ambiguous column error:

ERROR - column reference "dateid" is ambiguous

The following example query results in an ambiguous column error:

select sid, dateid from schema.sales, schema.date, schema.users where sales.sid = users.userid and dateid = dateid and year = 2010 and city = 'dallas';

Duplicate and ambiguous column errors occur when the common column name isn't qualified by the table name or alias. To resolve duplicate and ambiguous column errors, specify the joining column name, with the table name or alias as the qualifier.


AWS OFFICIAL
AWS OFFICIALUpdated 2 years ago