AWS Database Blog
Build an Amazon Keyspaces (for Apache Cassandra) data model using NoSQL Workbench
In this post, we build an end-to-end data model for an internet data consumption application. Data modeling provides a means of planning and blueprinting the complex relationship between an application and its data. Creating an efficient data model helps to achieve better query performance. An inefficient data model can slow development and performance, increase costs, and make it hard to respond to new requirements.
Data modeling with Amazon Keyspaces (for Apache Cassandra), however, is quite different from relational data modeling, and requires a mindset adjustment that leaves behind many of the restrictions of relational data modeling. To design an efficient and scalable Amazon Keyspaces data model, you must consider what is the best partition key, how your application accesses data, and how you can support the growing data requirements while maintaining fast query performance.
AWS recently added support for Amazon Keyspaces and Apache Cassandra to NoSQL Workbench. NoSQL Workbench is a client-side application that helps you design and visualize non-relational data models for Amazon Keyspaces more easily. NoSQL Workbench also works with Amazon DynamoDB. NoSQL Workbench clients are available for Windows, macOS, and Linux. You can use NoSQL Workbench with Amazon Keyspaces to do the following:
- Design data models and create resources automatically – NoSQL Workbench provides a point-and-click interface to design and create Amazon Keyspaces data models. You can create new data models from scratch by defining keyspaces, tables, and columns. You can also import existing data models and modify them (for example, by adding, editing, or removing columns) to adapt them for new applications. NoSQL Workbench lets you commit the data models to Amazon Keyspaces or Apache Cassandra, and creates the keyspaces and tables automatically.
- Visualize data models – With NoSQL Workbench, you can visualize your data models to help ensure that the models can support your application’s queries and access patterns.
For more information, see Using NoSQL Workbench with Amazon Keyspaces (for Apache Cassandra).
Solution overview
Amazon Keyspaces is a scalable, highly available, and managed Apache Cassandra–compatible database service. With Amazon Keyspaces, you don’t have to provision, patch, or manage servers, and you don’t have to install, maintain, or operate software. Amazon Keyspaces is serverless, so you pay for only the resources that you use, and the service automatically scales tables up and down in response to application traffic.
In this post, we build a new greenfield app that can retrieve information like total internet data consumption by customer and residence. It also supports different search patterns. We use NoSQL Workbench to build this single-table model.
Data modeling in Amazon Keyspaces is query driven. When creating an optimal model in Amazon Keyspaces, we recommend having a strong understanding of Amazon Keyspaces architecture and query patterns.
The access patterns required by the application and facilitated by this data model are as follows:
- Retrieve total internet data consumption of a customer using the customerId
- Retrieve total internet data consumption of a customer using the customerId and a consumptionDate
- Search for a customer by customerId
- Retrieve customer details
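These access patterns map onto a partition key plus a clustering column. The following is a minimal in-memory sketch in Python, not Amazon Keyspaces code. It assumes the customer ID acts as the partition key and the consumption date as the clustering column (as configured later in this post). The class name, the gigabytes field, and the sample values are hypothetical and used only for illustration.

```python
from bisect import insort
from collections import defaultdict
from datetime import date


class ConsumptionTable:
    """Hypothetical in-memory stand-in for a table partitioned by customer ID
    and ordered within each partition by consumption date."""

    def __init__(self):
        # partition key -> list of (consumption_date, gigabytes), kept sorted
        self._partitions = defaultdict(list)

    def add(self, customer_id, consumption_date, gigabytes):
        # insort keeps rows ordered by date, like a clustering column would
        insort(self._partitions[customer_id], (consumption_date, gigabytes))

    def total_for_customer(self, customer_id):
        """Total consumption for one customer (reads a single partition)."""
        return sum(gb for _, gb in self._partitions.get(customer_id, []))

    def total_for_customer_on(self, customer_id, consumption_date):
        """Consumption for one customer on one date."""
        rows = self._partitions.get(customer_id, [])
        return sum(gb for day, gb in rows if day == consumption_date)

    def has_customer(self, customer_id):
        """Search for a customer by ID."""
        return customer_id in self._partitions


table = ConsumptionTable()
table.add("cust-1", date(2020, 7, 1), 1.5)
table.add("cust-1", date(2020, 7, 2), 2.0)
table.add("cust-2", date(2020, 7, 1), 4.0)
```

Every lookup starts from a single customer ID, which is why the customer ID is a reasonable candidate for the partition key in this model.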
To create our application, we complete the following high-level steps:
- Create the data consumption model.
- Add sample data to the model.
- Commit the model to Amazon Keyspaces.
Prerequisites
Before you get started, complete the following prerequisites:
- Download and install NoSQL Workbench on your computer.
- Create credentials for Amazon Keyspaces.
- Download the Starfield digital certificate and save it to a local directory. You use the certificate to connect to Amazon Keyspaces.
You can also use this certificate to connect to Amazon Keyspaces using cqlsh (the Cassandra Query Language Shell).
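The download command itself is not reproduced in this text. As a hedged sketch, the following Python function fetches the certificate using only the standard library. The URL is an assumption based on the location AWS documents for the Starfield Class 2 root certificate, so verify it against the current Amazon Keyspaces documentation before relying on it. The function name and its parameters are hypothetical.

```python
import shutil
import urllib.request
from pathlib import Path

# Assumed location of the Starfield root certificate; confirm against the
# Amazon Keyspaces documentation before use.
STARFIELD_CERT_URL = "https://certs.secureserver.net/repository/sf-class2-root.crt"


def download_certificate(dest_dir=".", url=STARFIELD_CERT_URL,
                         opener=urllib.request.urlopen):
    """Save the certificate at `url` into `dest_dir` and return its path.

    `opener` is injectable so the function can be exercised without network
    access; by default it performs a real HTTPS request.
    """
    dest = Path(dest_dir) / url.rsplit("/", 1)[-1]
    with opener(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

Calling `download_certificate()` with no arguments writes `sf-class2-root.crt` to the current directory, which is the file you point NoSQL Workbench to when you create the connection.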
Create a data consumption model
To create your model, complete the following steps:
- On the Database Catalog page in NoSQL Workbench, choose Amazon Keyspaces.
- Choose Launch.
- Open NoSQL Workbench and choose the Data modeler.
- Under Data model, choose the + icon.
- For Name, enter the name of the data model.
- Choose Create.
- Under Keyspace, choose the + icon to add the keyspace.
- For Keyspace name, enter a name for the keyspace.
- For Replication strategy, choose a replication strategy for the keyspace. Amazon Keyspaces uses SingleRegionStrategy to replicate data three times automatically in multiple AWS Availability Zones. If you want to commit the data model to an Apache Cassandra cluster, you can choose SimpleStrategy or NetworkTopologyStrategy.
- Choose Add keyspace definition.
- Under Tables, choose the + icon to add the table definition.
- Add the column name and data type as per the following screenshot.
- For Partition key, enter customer_id.
- For Clustering columns, enter consumptionDate.
- For Capacity mode, select On-demand. With on-demand mode, you pay based on the actual reads and writes you perform. For more information, see Read/Write Capacity Modes in Amazon Keyspaces.
- Choose Add table definition.
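For readers who prefer to see the equivalent CQL, the following Python sketch builds the kind of statements that the steps above correspond to. It is an illustration, not the exact output of NoSQL Workbench. The keyspace and table names are the ones that appear later in this post; the column list is an assumption because the column screenshot is not reproduced here, and data_consumed_gb and the snake_case consumption_date spelling are hypothetical.

```python
def create_keyspace_cql(keyspace):
    """CREATE KEYSPACE statement using the Amazon Keyspaces replication strategy."""
    return (f"CREATE KEYSPACE {keyspace} "
            "WITH REPLICATION = {'class': 'SingleRegionStrategy'};")


def create_table_cql(keyspace, table, columns, partition_key, clustering_columns):
    """CREATE TABLE statement with on-demand capacity mode.

    columns: mapping of column name -> CQL type (insertion order is kept).
    """
    column_defs = ", ".join(f"{name} {cql_type}" for name, cql_type in columns.items())
    # Primary key layout: ((partition columns), clustering columns...)
    key_parts = [f"({', '.join(partition_key)})"] + list(clustering_columns)
    primary_key = ", ".join(key_parts)
    return (f"CREATE TABLE {keyspace}.{table} "
            f"({column_defs}, PRIMARY KEY ({primary_key})) "
            "WITH CUSTOM_PROPERTIES = "
            "{'capacity_mode': {'throughput_mode': 'PAY_PER_REQUEST'}};")


# Hypothetical column list for illustration only.
columns = {
    "customer_id": "text",
    "consumption_date": "date",
    "data_consumed_gb": "decimal",
}
keyspace_stmt = create_keyspace_cql("ks_internet_consumption")
table_stmt = create_table_cql(
    "ks_internet_consumption", "data_consumption_details",
    columns, ["customer_id"], ["consumption_date"],
)
print(keyspace_stmt)
print(table_stmt)
```

The PAY_PER_REQUEST throughput mode corresponds to the On-demand option selected for Capacity mode.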
Add sample data to the data model
We have completed the data model in the data modeler and we now add some data to visualize the model. Visualizing your data model helps you validate that your data model is well suited for your application’s access patterns. You also want to ensure your data is going to be evenly distributed across partitions. For more information about data modeling best practices and techniques, see Data Modeling in Amazon Keyspaces (for Apache Cassandra).
- In NoSQL Workbench, choose Data modeler.
- Choose Visualize data model. The data visualizer provides a visual representation of the table’s schema and lets you add sample data.
- Choose Add new row to add a new row of data.
- Choose Save.
- Choose Aggregate view.
- Choose Export to PNG.
- To export the data model to a JSON file, choose the upload icon under the data model name.
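One quick way to sanity-check that sample data is spread evenly across partitions is to count rows per partition key value. The following Python sketch assumes the sample rows are available as dictionaries keyed by column name. The function names and sample rows are hypothetical, and this check is not a NoSQL Workbench feature.

```python
from collections import Counter


def rows_per_partition(rows, partition_key="customer_id"):
    """Count how many rows fall into each partition."""
    return Counter(row[partition_key] for row in rows)


def largest_partition_share(rows, partition_key="customer_id"):
    """Fraction of all rows held by the largest partition (0.0 if no rows)."""
    counts = rows_per_partition(rows, partition_key)
    total = sum(counts.values())
    return max(counts.values()) / total if total else 0.0


# Hypothetical sample rows for illustration only.
sample_rows = [
    {"customer_id": "cust-1", "consumption_date": "2020-07-01"},
    {"customer_id": "cust-1", "consumption_date": "2020-07-02"},
    {"customer_id": "cust-2", "consumption_date": "2020-07-01"},
    {"customer_id": "cust-3", "consumption_date": "2020-07-01"},
]
print(rows_per_partition(sample_rows))
print(largest_partition_share(sample_rows))
```

A share close to 1.0 means a single partition key value holds most of the rows, which is a hint to revisit the choice of partition key.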
Commit the data model to Amazon Keyspaces
When we’re satisfied with our data model, we can commit the model to Amazon Keyspaces.
This process automatically creates the server-side resources for keyspaces and tables based on the settings that you defined in the data model. You can follow a similar process to commit to Apache Cassandra.
- Choose Commit to Amazon Keyspaces.
Amazon Keyspaces requires the use of Transport Layer Security (TLS) to help secure connections with clients. You can connect to Amazon Keyspaces using one of the following options:
- To use service-specific credentials, see Connecting to Amazon Keyspaces with Service-Specific Credentials.
- To use AWS Identity and Access Management (IAM) credentials, see Connecting to Amazon Keyspaces with IAM Credentials.
For this post, we connect using IAM credentials to create a new connection.
- On the Connect by using IAM credentials tab, for Connection name, enter a name for the connection.
- For AWS Region, enter a Region. For available Regions, see Service Endpoints for Amazon Keyspaces.
- For Access key ID, enter the access key ID.
- For Secret access key, enter the secret access key.
- For Port, enter 9142.
- For AWS public certificate, point to the AWS certificate you downloaded earlier.
- Select Persist connection if you want to save the AWS connection secrets locally.
- Choose Commit.
You should see a success message.
- On the Amazon Keyspaces console, in the navigation pane, choose Keyspaces. We see our ks_internet_consumption keyspace is created and available in AWS.
- In the navigation pane, choose Tables.
- Choose our data_consumption_details table.
- In the navigation pane, choose CQL Editor.
- Enter the following command:
- Choose Run command.
We can see the record of our table.
You’ve now successfully built the data model using NoSQL Workbench.
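If you later want to read the table from your own code instead of the console, the connection settings from the steps above carry over. The following Python sketch collects them using only the standard library; it does not open a connection and leaves the choice of Cassandra driver and authentication plugin to you. The host pattern cassandra.<region>.amazonaws.com is an assumption that holds for standard commercial Regions, so check Service Endpoints for Amazon Keyspaces for your Region. The function names are hypothetical, and the SELECT statement is an illustrative query, not necessarily the command used in the CQL Editor step.

```python
import ssl

KEYSPACES_PORT = 9142  # TLS port entered in the connection steps


def keyspaces_host(region):
    """Service endpoint host name for a Region (assumed standard pattern)."""
    return f"cassandra.{region}.amazonaws.com"


def tls_context(cert_path):
    """TLS client context that trusts the downloaded Starfield certificate.

    PROTOCOL_TLS_CLIENT enables certificate and host name verification.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.load_verify_locations(cafile=cert_path)
    return context


def select_all_cql(keyspace, table):
    """Illustrative query that returns every row of a table."""
    return f"SELECT * FROM {keyspace}.{table};"


host = keyspaces_host("us-east-1")
query = select_all_cql("ks_internet_consumption", "data_consumption_details")
print(host, KEYSPACES_PORT)
print(query)
```

A driver would take the host, the port, and the context returned by `tls_context("sf-class2-root.crt")`, along with your credentials, to open the session.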
Clean up your resources
Delete the resources you created in this post, like the Amazon Keyspaces tables, to avoid ongoing charges.
You can also delete the table via the Amazon Keyspaces console, on the Tables page.
Conclusion
Data modeling is an important task when designing new applications. In this post, we showed you how to use NoSQL Workbench to design, build, and visualize an Amazon Keyspaces data model, commit it to Amazon Keyspaces, and query the resulting table. You can also use NoSQL Workbench to convert the data model of an existing relational database application into an Amazon Keyspaces model.
To learn more about Amazon Keyspaces data modeling, see Advanced data modeling techniques for Cassandra. To learn more about NoSQL Workbench, see Using NoSQL Workbench with Amazon Keyspaces (for Apache Cassandra).
About the Author
Dhiraj Thakur is a Solutions Architect with Amazon Web Services. He works with AWS customers and partners to provide guidance on enterprise cloud adoption, migration, and strategy. He is passionate about technology and enjoys building and experimenting in the analytics and AI/ML space.