How to create a DynamoDB table on AWS

Introduction

This article will show you how to create your first DynamoDB table on AWS. This is an important part of building quality cloud services.

Amazon DynamoDB is the primary database in AWS for building serverless applications. It is a fully managed NoSQL database, so you do not have to manage any servers. Unlike most NoSQL databases, DynamoDB also supports strongly consistent reads, but at an additional cost.

Attributes in DynamoDB are synonymous with columns, and items are synonymous with rows in a relational database. However, there is no table-level schema in DynamoDB. You can have a different set of attributes in different items (rows). You can also have an attribute with the same name but different types in different items.

Getting ready

You need a working AWS account and should have installed and configured the AWS CLI with a profile with the necessary permissions. You are also expected to have a decent understanding of AWS CLI commands, Amazon CloudFormation, and basic database concepts. For the complete code files for this article, you can refer to:

https://github.com/PacktPublishing/Serverless-Programming-Cookbook/tree/master/Chapter03/your-first-dynamodb-table/resources

How to do it...

We will create a simple table, check its properties, update it, and finally delete it. We will first use CLI commands to create the table and then use a CloudFormation template to do the same. We will also use CLI commands to check the created table.

Creating a table using CLI commands

  1. We can create a simple DynamoDB table using the aws dynamodb create-table CLI command as follows:
aws dynamodb create-table \
--table-name my_table \
--attribute-definitions 'AttributeName=id, AttributeType=S' 'AttributeName=datetime, AttributeType=N' \
--key-schema 'AttributeName=id, KeyType=HASH' 'AttributeName=datetime, KeyType=RANGE' \
--provisioned-throughput 'ReadCapacityUnits=5, WriteCapacityUnits=5' \
--region us-east-1 \
--profile admin

Here, we define a table named my_table and use the attribute-definitions property to add two fields: id of type string (denoted by S) and datetime of type number (denoted by N). We then define a partition key (or hash key) and a sort key (or range key) using the key-schema property. We also define the maximum expected read and write capacity units per second using the provisioned-throughput property. I have specified the region even though us-east-1 is the default.

  1. List tables using the aws dynamodb list-tables CLI command to verify that our table was created:
aws dynamodb list-tables \
--region us-east-1 \
--profile admin
  1. Use the aws dynamodb describe-table CLI command to see the table properties:
aws dynamodb describe-table \
--table-name my_table \
--profile admin

The initial part of the response contains the table name, attribute definitions, and key schema definition we specified while creating the table.

The latter part of the response contains TableStatus, CreationDateTime, ProvisionedThroughput, TableSizeBytes, ItemCount, TableArn, and TableId.

  1. You may use the aws dynamodb update-table CLI command to update the table:
aws dynamodb update-table \
--table-name my_table \
--provisioned-throughput 'ReadCapacityUnits=10, WriteCapacityUnits=10' \
--profile admin
  1. Finally, you may delete the table using the aws dynamodb delete-table CLI command:
aws dynamodb delete-table \
--table-name my_table \
--profile admin
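
While the table still exists, the describe-table response from the earlier step can be narrowed to the few fields you care about with the CLI's --query option. The following is a minimal sketch, assuming the my_table table and the admin profile used above; the function name table_summary is only illustrative.

```shell
# Print only selected fields of the describe-table response.
# table_summary is a hypothetical helper name; the aws options are standard CLI options.
table_summary() {
  aws dynamodb describe-table \
    --table-name "$1" \
    --query 'Table.{Status: TableStatus, Items: ItemCount, Arn: TableArn}' \
    --output json \
    --profile admin
}
```

Calling table_summary my_table before the delete step should return just the status, item count, and ARN instead of the full table description.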

Creating a table using a CloudFormation template

Now, we will see the components of the CloudFormation template needed for this article. The completed template file is available in the code files.

  1. Start creating the CloudFormation template by defining the template format, the version, and a description:
AWSTemplateFormatVersion: '2010-09-09'
Description: Your First DynamoDB Table
  1. Define the Resources section with the DynamoDB Table type:
Resources:
  MyFirstTable:
    Type: AWS::DynamoDB::Table
  1. Define the properties section with the essential properties: TableName, ProvisionedThroughput, KeySchema, and AttributeDefinitions:
    Properties:
      TableName: my_table
      ProvisionedThroughput:
        ReadCapacityUnits: 1
        WriteCapacityUnits: 1
      KeySchema:
        -
          AttributeName: id
          KeyType: HASH
        -
          AttributeName: dateandtime
          KeyType: RANGE
      AttributeDefinitions:
        -
          AttributeName: id
          AttributeType: S
        -
          AttributeName: dateandtime
          AttributeType: N
  1. Update the table properties with the CloudFormation template:

Change ReadCapacityUnits and WriteCapacityUnits in the template to 5 for each. You can then update the stack using the aws cloudformation update-stack CLI command:

aws cloudformation update-stack \
    --stack-name myteststack \
    --template-body file://resources/your-first-dynamodb-table-cf-template-updated.yml \
    --region us-east-1 \
    --profile admin

Whenever an update is made, CloudFormation compares the template with the existing stack and updates only those resources that are changed.

  1. Verify the table update using the aws dynamodb describe-table CLI command.
  2. Delete the stack using the aws cloudformation delete-stack CLI command.
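
The update step above assumes that a stack named myteststack already exists. The following sketch shows one way the stack could be created, verified, and deleted. The template path in TEMPLATE is an assumption (the text only names the "-updated" file), and the function names are illustrative.

```shell
# Assumed path of the original (non-updated) template; adjust to your checkout.
TEMPLATE="file://resources/your-first-dynamodb-table-cf-template.yml"

create_table_stack() {
  aws cloudformation create-stack \
    --stack-name myteststack \
    --template-body "$TEMPLATE" \
    --region us-east-1 \
    --profile admin &&
  # Block until CloudFormation reports that stack creation has completed.
  aws cloudformation wait stack-create-complete \
    --stack-name myteststack --region us-east-1 --profile admin
}

verify_table() {
  # Show only the throughput settings, which the update step changes.
  aws dynamodb describe-table --table-name my_table \
    --query 'Table.ProvisionedThroughput' \
    --region us-east-1 --profile admin
}

delete_table_stack() {
  aws cloudformation delete-stack \
    --stack-name myteststack --region us-east-1 --profile admin &&
  # Block until the stack, and with it the table, has been removed.
  aws cloudformation wait stack-delete-complete \
    --stack-name myteststack --region us-east-1 --profile admin
}
```

Run create_table_stack first, then the update-stack command, then verify_table, and finally delete_table_stack.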

How it works...

We used the following DynamoDB CLI command actions in this recipe: create-table, list-tables, describe-table, update-table, and delete-table. We use the corresponding components and properties within our CloudFormation template as well. Some of these options will become clear after you read the following notes.

DynamoDB data model

Data in DynamoDB is stored in tables. A table contains items (similar to rows) and each item contains attributes (similar to columns). Each item can have a different set of attributes and the same attribute names may be used with different types in different items. DynamoDB supports the data types string, number, binary, Boolean, null, string set, number set, binary set, list, and map. It does not have a JSON data type; however, you can pass JSON data to DynamoDB using the SDK and it will be mapped to native DynamoDB data types. You can also define indexes (global secondary indexes and local secondary indexes) to improve read performance.
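
To make the schemaless model concrete, the sketch below writes two items to my_table in which the tags attribute has a different type in each, and only the second item has an active attribute. The attribute names tags and active, the sample values, and the function name are assumptions made for illustration; the key attributes match the table created with the CLI earlier.

```shell
put_sample_items() {
  # First item: "tags" is a string set (SS).
  aws dynamodb put-item --table-name my_table --profile admin \
    --item '{"id": {"S": "user-1"}, "datetime": {"N": "1546300800"}, "tags": {"SS": ["a", "b"]}}' &&
  # Second item: "tags" is a plain string (S), plus an extra Boolean attribute.
  aws dynamodb put-item --table-name my_table --profile admin \
    --item '{"id": {"S": "user-2"}, "datetime": {"N": "1546300801"}, "tags": {"S": "solo"}, "active": {"BOOL": true}}'
}
```

Only the key attributes (id and datetime) have to be present with the declared types; everything else can vary from item to item.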

Data model limits

The following are some of the important limits in the DynamoDB data model:

  • There is an initial limit on the number of tables per region for an AWS account (256 when this was written; more recent AWS documentation lists a default of 2,500), and it can be raised by contacting AWS support.

  • Names for tables and secondary indexes must be at least three characters long, but no more than 255 characters. Allowed characters are A-Z, a-z, 0-9, _ (underscore), - (hyphen), and . (dot).

  • An attribute name must be at least one character long but no greater than 64 KB long. Some names, such as those of secondary index key attributes, are more restricted: they must be encoded using UTF-8, and the total size of each encoded name cannot exceed 255 bytes.

  • The size of an item, including all the attribute names and attribute values, cannot exceed 400 KB.

  • You can create a maximum of five local secondary indexes per table. Global secondary indexes are subject to a per-table quota as well (originally five; the default is now 20).
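
The naming rule in the list above is easy to check in a script before calling create-table. The following is a small sketch in POSIX shell; is_valid_table_name is a hypothetical helper, not part of the AWS CLI.

```shell
# Check a table or index name against the limits listed above:
# 3 to 255 characters from A-Z, a-z, 0-9, underscore, hyphen, and dot.
is_valid_table_name() {
  name=$1
  # Length must be between 3 and 255 characters.
  [ "${#name}" -ge 3 ] && [ "${#name}" -le 255 ] || return 1
  # Reject any character outside the allowed set.
  case $name in
    *[!A-Za-z0-9_.-]*) return 1 ;;
  esac
  return 0
}
```

For example, is_valid_table_name my_table succeeds, while a two-character name or a name containing a space fails.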

DynamoDB keys and partitions

Each item is identified with a primary key, which can be either only the partition key if it can uniquely identify the item or a combination of partition key and sort key. The partition key is also called a hash key and the sort key is also called a range key. Primary key attributes (partition and sort keys) can only be string, binary, or number.

Initially, a single partition holds all table data. When a partition's limits are exceeded, new partitions are created and data is spread across them. The current per-partition limits are 10 GB of storage, 3,000 RCU, and 1,000 WCU. Data belonging to one partition key is stored in the same partition; however, a single partition can have data for multiple partition keys. The partition key is used to locate the partition and the sort key is used to order items within that partition.
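
As a rough illustration of how the per-partition limits interact, the sketch below estimates the minimum number of partitions implied by a table's provisioned throughput and size. This is only back-of-the-envelope arithmetic based on the limits quoted above; DynamoDB's actual partitioning is internal and may differ, and estimate_partitions is a hypothetical helper.

```shell
# Usage: estimate_partitions <rcu> <wcu> <size_in_gb>
estimate_partitions() {
  rcu=$1
  wcu=$2
  size_gb=$3
  # ceil(rcu/3000 + wcu/1000) equals ceil((rcu + 3*wcu) / 3000) in integer math.
  by_throughput=$(( (rcu + 3 * wcu + 2999) / 3000 ))
  # ceil(size_gb / 10)
  by_size=$(( (size_gb + 9) / 10 ))
  if [ "$by_throughput" -gt "$by_size" ]; then
    result=$by_throughput
  else
    result=$by_size
  fi
  # A table always has at least one partition.
  [ "$result" -lt 1 ] && result=1
  echo "$result"
}
```

With the values from this article (5 RCU, 5 WCU, and a near-empty table) the estimate is a single partition; throughput or storage has to grow well past the per-partition limits before more are implied.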

Read and write capacity units

We specified the provisioned read and write throughput for the table in read capacity units (RCUs) and write capacity units (WCUs), which cap the reads and writes the table can serve per second. We also updated these values. Updating the table properties is an asynchronous operation and may take some time to take effect.

Waiting for asynchronous operations

The CLI commands create-table, update-table, and delete-table are asynchronous operations. The control returns immediately to the command line but the operation runs asynchronously.

To wait for table creation, you can use the aws dynamodb wait table-exists --table-name <table-name> command, which polls the table until it is active. The wait table-exists command may be used in scripts to wait until the table is created before inserting data. Similarly, you can wait for table deletion using the aws dynamodb wait table-not-exists --table-name <table-name> command, which polls with describe-table until ResourceNotFoundException is thrown. Both the wait options poll every 20 seconds and exit with a 255 return code after 25 failed checks.
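
In a script, the waiter is typically placed between table creation and the first write. The sketch below shows that pattern; create_and_wait is a hypothetical helper, and the single-key table definition is a simplified assumption rather than the table from this article. With the polling settings described above, the wait gives up after roughly 500 seconds (25 checks at 20-second intervals).

```shell
# Usage: create_and_wait <table-name>
create_and_wait() {
  table=$1
  aws dynamodb create-table \
    --table-name "$table" \
    --attribute-definitions AttributeName=id,AttributeType=S \
    --key-schema AttributeName=id,KeyType=HASH \
    --provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1 \
    --profile admin || return 1
  # Polls until the table is active; exits non-zero if the checks run out.
  if aws dynamodb wait table-exists --table-name "$table" --profile admin; then
    echo "table $table is active"
  else
    echo "timed out waiting for $table" >&2
    return 1
  fi
}
```

Because the function returns a non-zero status on failure, a calling script can stop before attempting to insert data into a table that is not ready.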

Other ways to create tables

We created our table by specifying the properties, such as attribute-definitions, key-schema, provisioned-throughput, and so on. Instead, you can specify a JSON snippet or JSON file using the cli-input-json option. The generate-cli-skeleton option returns a sample template as required by the cli-input-json option.
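
As a sketch of the cli-input-json approach, the functions below write the same table definition used earlier to a JSON file and pass that file to create-table. The function names and the file path argument are illustrative assumptions; the JSON keys are the standard create-table request fields.

```shell
# Usage: write_table_definition <path-to-json-file>
write_table_definition() {
  cat > "$1" <<'EOF'
{
  "TableName": "my_table",
  "AttributeDefinitions": [
    {"AttributeName": "id", "AttributeType": "S"},
    {"AttributeName": "datetime", "AttributeType": "N"}
  ],
  "KeySchema": [
    {"AttributeName": "id", "KeyType": "HASH"},
    {"AttributeName": "datetime", "KeyType": "RANGE"}
  ],
  "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5}
}
EOF
}

# Usage: create_table_from_json <path-to-json-file>
create_table_from_json() {
  aws dynamodb create-table --cli-input-json "file://$1" --profile admin
}

# To print an empty skeleton to start from:
#   aws dynamodb create-table --generate-cli-skeleton
```

Keeping the definition in a file makes it easier to review and version than a long command line.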

We created our table using both the AWS CLI and CloudFormation. You can also create DynamoDB tables from Java code using the AWS SDK. However, in most real-world cases, CloudFormation templates are used to create and provision tables and the AWS SDK is used to work with data items.

There's more...

Let's first see some features and limitations of DynamoDB. We will also see some theory on the LSI and GSI.

DynamoDB features

The following are some of the important features of DynamoDB:

  • DynamoDB is a fully managed NoSQL database service. There are no servers to manage.
  • DynamoDB has the characteristics of both the key-value and the document-based NoSQL families.
  • There is virtually no limit on throughput or storage. DynamoDB scales very well, but only according to the provisioned throughput configuration.
  • DynamoDB replicates data into three different facilities within the same region for availability and fault tolerance. You can also set up cross-region replication manually.
  • It supports eventually consistent reads as well as strongly consistent reads.
  • DynamoDB is schemaless at the table level. Each item (row) can have a different set of attributes. Even the same attribute name can be associated with different types in different items.
  • DynamoDB automatically partitions and re-partitions data as the table grows in size.
  • You can store JSON and then do nested queries on that data using the AWS SDK.
  • Data is stored on SSD storage.
  • DynamoDB supports atomic updates and atomic counters.
  • DynamoDB supports conditional operations for put, update, and delete.
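
A few of the features above map directly onto CLI options. The following sketch shows a strongly consistent read, a conditional put, and an atomic counter against my_table. The item key values, the visits attribute, and the function names are assumptions for illustration.

```shell
# Key of a hypothetical sample item in my_table.
ITEM_KEY='{"id": {"S": "user-1"}, "datetime": {"N": "1546300800"}}'

# Strongly consistent read (the default is eventually consistent).
read_item_consistent() {
  aws dynamodb get-item --table-name my_table --profile admin \
    --key "$ITEM_KEY" --consistent-read
}

# Conditional put: fails if an item with this key already exists.
put_if_absent() {
  aws dynamodb put-item --table-name my_table --profile admin \
    --item "$ITEM_KEY" \
    --condition-expression 'attribute_not_exists(#pk)' \
    --expression-attribute-names '{"#pk": "id"}'
}

# Atomic counter: ADD increments "visits" without a read-modify-write cycle.
increment_visits() {
  aws dynamodb update-item --table-name my_table --profile admin \
    --key "$ITEM_KEY" \
    --update-expression 'ADD visits :inc' \
    --expression-attribute-values '{":inc": {"N": "1"}}'
}
```

The #pk placeholder is used so that the expression does not depend on whether the attribute name collides with a DynamoDB reserved word.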

DynamoDB general limitations

Here are some of the general limitations of DynamoDB:

  • DynamoDB does not support complex relational queries such as joins or complex transactions.
  • DynamoDB is not suited for storing a large amount of data that is rarely accessed. S3 may be better suited for such use cases.
  • You cannot select the Availability Zone for your DynamoDB table.
  • Default replication of data for availability and fault tolerance is only within a region.

Local and global secondary indexes

You can define LSIs and GSIs for your tables to improve read performance. An LSI can be considered an alternate sort key for a given partition-key value. A GSI contains attributes from the base table and organizes them by a primary key that is different from that of the base table.

Secondary indexes are useful when you want to query based on non-key parameters. You can create them with the CLI as well as with CloudFormation templates. There is a limit of five LSIs per table, and a separate per-table quota for GSIs (originally five; the default is now 20).
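
As a sketch of working with a GSI from the CLI, the functions below add an index to the existing my_table and then query it. The status attribute, the status-index name, and the function names are assumptions for illustration. Note that a GSI can be added to an existing table like this, whereas an LSI has to be defined when the table is created.

```shell
# Add a GSI keyed on a hypothetical "status" attribute.
add_status_index() {
  aws dynamodb update-table --table-name my_table --profile admin \
    --attribute-definitions AttributeName=status,AttributeType=S \
    --global-secondary-index-updates '[{"Create": {"IndexName": "status-index", "KeySchema": [{"AttributeName": "status", "KeyType": "HASH"}], "Projection": {"ProjectionType": "ALL"}, "ProvisionedThroughput": {"ReadCapacityUnits": 1, "WriteCapacityUnits": 1}}}]'
}

# Usage: query_by_status <status-value>
query_by_status() {
  # "#s" is a placeholder because "status" is a DynamoDB reserved word.
  aws dynamodb query --table-name my_table --index-name status-index --profile admin \
    --key-condition-expression '#s = :s' \
    --expression-attribute-names '{"#s": "status"}' \
    --expression-attribute-values "{\":s\": {\"S\": \"$1\"}}"
}
```

Index creation is asynchronous, so the index has to become active before query_by_status returns results.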

Conclusion

You can read and learn more about LSIs and GSIs from the following links:

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/LSI.html

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html

If you found this article interesting, check out the Serverless Programming Cookbook for building and deploying real-world serverless applications. Explore the exciting world of Cloud offerings including Azure, Google Cloud, and IBM Cloud. The Serverless Programming Cookbook provides solutions to the most common problems faced in the world of serverless applications.

About the author

Heartin Kanikathottu is a Senior Software Engineer and Blogger with around 11 years of IT experience currently working as a Senior Member of Technical Staffing at VMware.
