AWS Database Blog

Build an Amazon Keyspaces (for Apache Cassandra) data model using NoSQL Workbench

In this post, we build an end-to-end data model for an internet data consumption application. Data modeling provides a means of planning and blueprinting the complex relationship between an application and its data. Creating an efficient data model helps to achieve better query performance. An inefficient data model can slow development and performance, increase costs, and make it hard to respond to new requirements.

Data modeling with Amazon Keyspaces (for Apache Cassandra), however, is quite different from relational data modeling, and requires a mindset adjustment that leaves behind many of the restrictions of relational data modeling. To design an efficient and scalable Amazon Keyspaces data model, you must consider what is the best partition key, how your application accesses data, and how you can support the growing data requirements while maintaining fast query performance.

AWS recently added support for Amazon Keyspaces and Apache Cassandra to NoSQL Workbench. NoSQL Workbench is a client-side application that helps you design and visualize non-relational data models for Amazon Keyspaces more easily. NoSQL Workbench also works with Amazon DynamoDB. NoSQL Workbench clients are available for Windows, macOS, and Linux. You can use NoSQL Workbench with Amazon Keyspaces to do the following:

  • Design data models and create resources automatically – NoSQL Workbench provides a point-and-click interface to design and create Amazon Keyspaces data models. You can easily create new data models from scratch by defining keyspaces, tables, and columns. You can also import existing data models and make modifications (such as adding, editing, or removing columns) to adapt the data models for new applications. NoSQL Workbench enables you to commit the data models to Amazon Keyspaces or Apache Cassandra, and create the keyspaces and tables automatically.
  • Visualize data models – With NoSQL Workbench, you can visualize your data models to help ensure that the models can support your application’s queries and access patterns.

For more information, see Using NoSQL Workbench with Amazon Keyspaces (for Apache Cassandra).

Solution overview

Amazon Keyspaces is a scalable, highly available, and managed Apache Cassandra–compatible database service. With Amazon Keyspaces, you don’t have to provision, patch, or manage servers, and you don’t have to install, maintain, or operate software. Amazon Keyspaces is serverless, so you pay for only the resources that you use, and the service automatically scales tables up and down in response to application traffic.

In this post, we build a new greenfield app that can retrieve information like total internet data consumption by customer and residence. It also supports different search patterns. We use NoSQL Workbench to build this single-table model.

Data modeling in Amazon Keyspaces is query driven. When creating an optimal model in Amazon Keyspaces, we recommend having a strong understanding of Amazon Keyspaces architecture and query patterns.

The access patterns required by the application and facilitated by this data model are as follows:

  • Retrieve total internet data consumption of a customer using the customerId
  • Retrieve total internet data consumption of a customer using the customerId and a consumptionDate
  • Search for a customer by customerId
  • Retrieve customer details
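To make these access patterns concrete, the following is a minimal in-memory sketch (not Amazon Keyspaces itself) of a table keyed by a partition key and a clustering column. The class name, method names, and sample values are hypothetical; the sketch assumes consumption values arrive as text, as in the table defined later in this post, and are converted to numbers on the client.

```python
from collections import defaultdict


class ConsumptionTable:
    """Toy stand-in for a table with partition key customerId and
    clustering column consumptionDate (hypothetical helper, for illustration)."""

    def __init__(self):
        # partition key -> {clustering value -> consumed amount}
        self._partitions = defaultdict(dict)

    def upsert(self, customer_id, consumption_date, consumed_data):
        # Writing the same primary key twice overwrites the earlier row.
        self._partitions[customer_id][consumption_date] = float(consumed_data)

    def rows_for_customer(self, customer_id):
        # All rows of one partition, ordered by the clustering column.
        return sorted(self._partitions.get(customer_id, {}).items())

    def total_for_customer(self, customer_id):
        # Access pattern 1: total consumption by customerId.
        return sum(amount for _, amount in self.rows_for_customer(customer_id))

    def total_for_customer_on(self, customer_id, consumption_date):
        # Access pattern 2: consumption by customerId and consumptionDate.
        return self._partitions.get(customer_id, {}).get(consumption_date, 0.0)
```

Both lookups touch a single partition, which is the property the partition key choice is meant to provide.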

To create our application, we complete the following high-level steps:

  1. Create the data consumption model.
  2. Add sample data to the model.
  3. Commit the model to Amazon Keyspaces.

Prerequisites

Before you get started, complete the following prerequisites:

  1. Download and install NoSQL Workbench on your computer.
  2. Create credentials for Amazon Keyspaces.
  3. Download the Amazon Root CA 1 digital certificate using the following command and save it to a local directory. You use the certificate to connect to Amazon Keyspaces.
curl https://www.amazontrust.com/repository/AmazonRootCA1.pem -O

You can also use this certificate to connect to Amazon Keyspaces using cqlsh (the Cassandra Query Language Shell).
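If you connect from your own code rather than from NoSQL Workbench or cqlsh, the certificate file is typically loaded into a TLS context. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, the default file name assumes the AmazonRootCA1.pem file saved by the curl command above, and the endpoint format is the standard cassandra.<region>.amazonaws.com service endpoint with the port used later in this post.

```python
import ssl
from pathlib import Path

KEYSPACES_TLS_PORT = 9142  # TLS port entered later in this post


def keyspaces_endpoint(region):
    """Return (host, port) for the Amazon Keyspaces endpoint in a Region."""
    return f"cassandra.{region}.amazonaws.com", KEYSPACES_TLS_PORT


def build_tls_context(cert_path="AmazonRootCA1.pem"):
    """Create a client TLS context that trusts only the downloaded certificate."""
    cert = Path(cert_path)
    if not cert.is_file():
        raise FileNotFoundError(f"certificate not found: {cert}")
    # PROTOCOL_TLS_CLIENT enables certificate and hostname verification by default.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.load_verify_locations(cafile=str(cert))
    return context
```

A context built this way can usually be handed to a Cassandra client driver that accepts a standard ssl.SSLContext; check your driver's documentation for the exact parameter.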

Create a data consumption model

To create your model, complete the following steps:

  1. On the Database Catalog page in NoSQL Workbench, choose Amazon Keyspaces.
  2. Choose Launch.
    NoSQL Workbench screenshot with a highlight to the "Launch" button associated with Amazon Keyspaces for Apache Cassandra
  3. Open NoSQL Workbench and choose the Data modeler.
  4. Under Data model, choose the + icon.
    Screenshot of NoSQL Workbench highlighting the "Data modeler" menu and the "+" icon
  5. For Name, enter the name of the data model.
  6. Choose Create.
    screenshot of NoSQL Workbench with the "Name" and "Description" fields filled up with value "Internet Data Consumption Data Model". "Create" button highlighted
  7. Under Keyspace, choose the + icon to add the keyspace.
  8. For Keyspace name, enter a name for the keyspace.
  9. For Replication strategy, choose a replication strategy for the keyspace. Amazon Keyspaces uses SingleRegionStrategy to replicate data three times automatically in multiple AWS Availability Zones. If you want to commit the data model to an Apache Cassandra cluster, you can choose SimpleStrategy or NetworkTopologyStrategy.
  10. Choose Add keyspace definition.
    NoSQL Workbench screenshot highlighting the "Add keyspace definition" button.
  11. Under Tables, choose the + icon to add the table definition.
    NoSQL Workbench screenshot highlighting the menus "Data modeler" on the left panel and the "plus icon" in section "Tables"
  12. Add the column name and data type as per the following screenshot.
    Screenshot of the "Add table definition". Table name is "data_consumption_details" with the following columns, all of type TEXT: customer_id, consumptionDate, customerName, residenceid, houseNumber, consumedData, unit
  13. For Partition key, enter customer_id.
  14. For Clustering columns, enter consumptionDate.
  15. For Capacity mode, select On-demand. With on-demand mode, you pay based on the actual reads and writes you perform. For more information, see Read/Write Capacity Modes in Amazon Keyspaces.
  16. Choose Add table definition.
    Screenshot highlighting the "Add table definition" button. Capacity mode checkbox with "on-demand" option chosen
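For reference, the model defined in the preceding steps corresponds roughly to the CQL data definition statements generated by the following sketch. This is an approximation under stated assumptions, not the exact output of NoSQL Workbench: the helper names are hypothetical, the keyspace and table names are the ones used later in this post, and the CUSTOM_PROPERTIES clause is the Amazon Keyspaces syntax for on-demand capacity. Note that CQL folds unquoted identifiers such as consumptionDate to lowercase.

```python
# Column names as shown in the table definition; all are of type text.
COLUMNS = [
    "customer_id", "consumptionDate", "customerName",
    "residenceid", "houseNumber", "consumedData", "unit",
]


def create_keyspace_cql(keyspace):
    """Keyspace DDL using the Amazon Keyspaces replication strategy."""
    return (
        f"CREATE KEYSPACE {keyspace} "
        "WITH REPLICATION = {'class': 'SingleRegionStrategy'};"
    )


def create_table_cql(keyspace, table, columns, partition_key, clustering_column):
    """Table DDL with one partition key, one clustering column, on-demand capacity."""
    if partition_key not in columns or clustering_column not in columns:
        raise ValueError("key columns must be declared in columns")
    column_defs = ",\n  ".join(f"{name} text" for name in columns)
    return (
        f"CREATE TABLE {keyspace}.{table} (\n"
        f"  {column_defs},\n"
        f"  PRIMARY KEY (({partition_key}), {clustering_column})\n"
        f") WITH CUSTOM_PROPERTIES = "
        f"{{'capacity_mode': {{'throughput_mode': 'PAY_PER_REQUEST'}}}};"
    )


if __name__ == "__main__":
    print(create_keyspace_cql("ks_internet_consumption"))
    print(create_table_cql("ks_internet_consumption", "data_consumption_details",
                           COLUMNS, "customer_id", "consumptionDate"))
```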

Add sample data to the data model

We have completed the data model in the data modeler and we now add some data to visualize the model. Visualizing your data model helps you validate that your data model is well suited for your application’s access patterns. You also want to ensure your data is going to be evenly distributed across partitions. For more information about data modeling best practices and techniques, see Data Modeling in Amazon Keyspaces (for Apache Cassandra).

  1. In NoSQL Workbench, choose Data modeler.
  2. Choose Visualize data model. The data visualizer provides a visual representation of the table’s schema and lets you add sample data.
    Screenshot highlighting the "Visualize data model" button
  3. Choose Add new row to add a new row of data.
  4. Choose Save.
    Screenshot of the Visualizer with a manually added row and highlighting the "Save" button
  5. Choose Aggregate view.
  6. Choose Export to PNG.
    Screenshot highlighting the Visualizer menu on the left pane, the aggregate view button, and the export to PNG button
  7. To export the data model to a JSON file, choose the upload icon under the data model name.
    Screenshot highlighting the "Upload" button, an upward arrow
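One rough way to check the even-distribution goal mentioned at the start of this section is to count sample rows per partition key value. The following sketch is only a heuristic under stated assumptions: the function names and the skew metric are hypothetical, and rows are assumed to be plain dictionaries keyed by column name.

```python
from collections import Counter


def partition_row_counts(rows, partition_key="customer_id"):
    """Count how many sample rows fall under each partition key value."""
    return Counter(row[partition_key] for row in rows)


def skew_ratio(counts):
    """Largest partition size divided by the average size; 1.0 means perfectly even."""
    if not counts:
        return 0.0
    average = sum(counts.values()) / len(counts)
    return max(counts.values()) / average
```

A ratio well above 1.0 on representative data suggests that a few customers hold most of the rows, which may be worth revisiting before you commit the model.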

Commit the data model to Amazon Keyspaces

When we’re satisfied with our data model, we can commit the model to Amazon Keyspaces.

This process automatically creates the server-side resources for keyspaces and tables based on the settings that you defined in the data model. You can follow a similar process to commit to Apache Cassandra.

  1. Choose Commit to Amazon Keyspaces.
    screenshot highlighting the "Commit to Amazon Keyspaces" button
    Amazon Keyspaces requires the use of Transport Layer Security (TLS) to help secure connections with clients. The following steps connect by using IAM credentials.

  2. On the Connect by using IAM credentials tab, for Connection name, enter a name for the connection.
  3. For AWS Region, enter a Region. For available Regions, see Service Endpoints for Amazon Keyspaces.
  4. For Access key ID, enter the access key ID.
  5. For Secret access key, enter the secret access key.
  6. For Port, enter 9142.
  7. For AWS public certificate, point to the AWS certificate you downloaded earlier.
  8. Select Persist connection if you want to save the AWS connection secrets locally.
  9. Choose Commit.
    You should see a success message.
  10. On the Amazon Keyspaces console, in the navigation pane, choose Keyspaces. We see that our ks_internet_consumption keyspace is created and available in AWS.
  11. In the navigation pane, choose Tables.
  12. Choose our data_consumption_details table.
  13. In the navigation pane, choose CQL Editor.
  14. Enter the following command:
    SELECT * FROM ks_internet_consumption.data_consumption_details;
  15. Choose Run command.
    screenshot of the CQL editor with the query from the previous step written and highlighting the "Run command" button

We can see the records in our table.

screenshot of the table view showing two records
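The SELECT * statement above reads the whole table, which is fine for a quick check. The access patterns listed earlier instead filter on the partition key, optionally with the clustering column. The following sketch builds such statements; the helper name and sample values are hypothetical, and the %s placeholders assume a client driver that uses that style for parameter binding (adjust to your driver's convention).

```python
KEYSPACE = "ks_internet_consumption"
TABLE = "data_consumption_details"


def build_query(customer_id, consumption_date=None):
    """Return (cql, params) for a lookup by customer, optionally narrowed to one date."""
    # Filtering on the partition key keeps the read within a single partition.
    cql = f"SELECT * FROM {KEYSPACE}.{TABLE} WHERE customer_id = %s"
    params = [customer_id]
    if consumption_date is not None:
        # The clustering column narrows the read to one row in that partition.
        cql += " AND consumptionDate = %s"
        params.append(consumption_date)
    return cql, tuple(params)
```

Because the columns in this model are of type text, totals such as overall consumption per customer would be computed in the application after converting the returned values to numbers.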

You’ve now successfully built the data model using NoSQL Workbench.

Clean up your resources

Delete the resources you created in this post, like the Amazon Keyspaces tables, to avoid ongoing charges.

  1. In NoSQL Workbench, choose Data modeler.
  2. For Data model, choose your table.
  3. Choose the delete icon.
    NoSQL Workbench screenshot highlighting the Data Modeler menu on the left pane and the delete menu (trash can icon) at the bottom

You can also delete the table via the Amazon Keyspaces console, on the Tables page.

Screenshot of the Tables menu of the Amazon Keyspaces console highlighting the "Delete" button and the created table selected

Conclusion

Data modeling is an important task when designing new applications. In this post, I showed you how to use NoSQL Workbench to design, build, visualize, and commit an Amazon Keyspaces data model. You can also use NoSQL Workbench to build an Amazon Keyspaces model when you convert an existing relational database application.

To learn more about Amazon Keyspaces data modeling, see Advanced data modeling techniques for Cassandra. To learn more about NoSQL Workbench, see Using NoSQL Workbench with Amazon Keyspaces (for Apache Cassandra).


About the Author

Dhiraj Thakur is a Solutions Architect with Amazon Web Services. He works with AWS customers and partners to provide guidance on enterprise cloud adoption, migration, and strategy. He is passionate about technology and enjoys building and experimenting in the analytics and AI/ML space.