Amazon DynamoDB

DynamoDB is an Amazon service that differs from Amazon's other services by letting developers pay for throughput rather than storage. Although the database does not scale its throughput automatically, administrators can request more throughput, and DynamoDB will spread the data and traffic over a number of servers backed by solid-state drives to keep performance predictable. It also integrates with Hadoop via Elastic MapReduce.

DynamoDB pros:
  • Scalable. There is no limit on the amount of data, and the service automatically allocates more storage.
  • Flexible. Each data item may have a different number of attributes, and multiple data types are supported (strings, numbers, binary, and sets).
  • Distributed. DynamoDB scales horizontally and can seamlessly spread a single table over hundreds of servers.
  • Cost-effective. It allows more than 40 million database operations per month, and pricing is based on throughput.
  • Easy administration, etc.
DynamoDB cons:
  • 400 KB limit on item (row) size
  • 1 MB limit on the data returned by a single Query or Scan call
  • secondary indexes are limited: local secondary indexes can only be defined when the table is created
  • joins are impossible
  • deployable only on AWS
  • extremely limited querying, especially if you want to query non-indexed data, etc.

DynamoDB is a scalable NoSQL solution from AWS. However, not everybody knows how to use it to its best advantage for their business.

In this post we look at its pros and cons in comparison with other solutions and identify the use cases where DynamoDB works best.

In the NoSQL world, databases have to meet particular requirements.

  1. They have to accept and store massive amounts of data.
  2. Scalability is a primary requirement. An RDBMS is very hard to scale horizontally, and at some point it simply reaches its limit. NoSQL databases, in comparison, are designed for horizontal scaling and can be scaled almost without limit.
  3. There are no relations, so queries can be faster. The trade-off is that NoSQL databases cannot run complicated queries such as joins.

Also, nothing stops you from using an RDBMS and NoSQL at the same time within the same app.

Why DynamoDB is good:
  1. It has predictable speed. You simply define the throughput you need and use it. If you expect 100 requests per second, provision that much capacity; if you need more later, just add more and you are ready to go.
  2. It is hosted by AWS and acts as a web service. For you that means no administration headache: no more fleets of instances with Mongo master/slave nodes, and no more complicated administration hell with load balancers.
  3. Redshift integration. You can export your DynamoDB table into Redshift and run complicated queries for data analytics.
  4. Built-in CloudWatch monitoring. You can watch your table performance in close to real time. This is a bit tricky, and we will cover it in our next post about Dynamo speed checking and speedometers.
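
To make "define the throughput you need" from the list above more concrete, here is a rough sizing sketch. It assumes the standard capacity-unit rules (one read unit covers a strongly consistent read of up to 4 KB per second, an eventually consistent read costs about half, and one write unit covers a write of up to 1 KB per second). The function names are made up for this post, and the result is an estimate, not a billing calculation.

```groovy
// Estimate read capacity units for a given request rate and item size.
// Assumption: 4 KB per read unit; eventually consistent reads cost roughly half.
int readCapacityUnits(int readsPerSecond, int itemSizeBytes, boolean stronglyConsistent = true) {
    int unitsPerRead = (Math.ceil(itemSizeBytes / 4096.0d) as int)
    int units = readsPerSecond * unitsPerRead
    return stronglyConsistent ? units : (Math.ceil(units / 2.0d) as int)
}

// Estimate write capacity units. Assumption: 1 KB per write unit.
int writeCapacityUnits(int writesPerSecond, int itemSizeBytes) {
    int unitsPerWrite = (Math.ceil(itemSizeBytes / 1024.0d) as int)
    return writesPerSecond * unitsPerWrite
}

println readCapacityUnits(100, 3000)         // 100 reads/s of 3 KB items
println readCapacityUnits(100, 6000, false)  // eventually consistent reads of 6 KB items
println writeCapacityUnits(50, 2500)         // 50 writes/s of 2.5 KB items
```

Item size matters as much as request rate here: an item just over 4 KB doubles the read cost, so keeping items small is one of the easiest ways to keep the bill predictable.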

So, if you decide to go with DynamoDB, you can use the AWS web console or the CLI. However, these two are mostly good for monitoring tables. The ultimate tool is the DynamoDB libraries (SDKs): they give you full access to all Dynamo features and, most importantly, let you make Dynamo part of your application through code.
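
The snippets in this post all use a `client` object. As a minimal sketch, assuming the AWS SDK for Java (v1) is on the classpath, it could be created like this. The `createClient` helper and the region are illustrative choices, not something this post prescribes.

```groovy
import com.amazonaws.regions.Regions
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder

// Hypothetical helper: builds a client for one region.
// Credentials are resolved by the SDK's default provider chain
// (environment variables, profile file, instance role).
AmazonDynamoDB createClient(Regions region) {
    return AmazonDynamoDBClientBuilder.standard().withRegion(region).build()
}

// The region is an assumption; use the one where your tables live.
AmazonDynamoDB client = createClient(Regions.US_EAST_1)
```

Building the client does not call AWS yet; the first real request (for example `client.listTables()`) is where missing credentials or permissions would show up.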

Below are a few code snippets that show how to:

Create table

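[code language="groovy"]
import com.amazonaws.services.dynamodbv2.model.CreateTableRequest
import com.amazonaws.services.dynamodbv2.model.CreateTableResult

// `client` is an AmazonDynamoDB client. campaignPkToTableName and the
// createDefault* methods are application helpers that are not shown here.
CreateTableRequest createTableRequest = new CreateTableRequest()
createTableRequest.tableName = campaignPkToTableName(tableName)
createTableRequest.provisionedThroughput = createDefaultProvisionedThroughput()
createTableRequest.setKeySchema(createDefaultKeySchema(hashName, rangeName))
createTableRequest.setAttributeDefinitions(createDefaultAttributeDefinitions(hashName, rangeName))
// Table creation is asynchronous: the table starts in the CREATING state.
CreateTableResult createTableResult = client.createTable(createTableRequest)
[/code]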
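
The `createDefault*` helpers used above are not shown in this post. Here is a minimal sketch of what they could look like, assuming string-typed hash and range keys and an arbitrary starting throughput of 5 read and 5 write units. The bodies are guesses for illustration only, built from the SDK v1 model classes.

```groovy
import com.amazonaws.services.dynamodbv2.model.AttributeDefinition
import com.amazonaws.services.dynamodbv2.model.KeySchemaElement
import com.amazonaws.services.dynamodbv2.model.KeyType
import com.amazonaws.services.dynamodbv2.model.ProvisionedThroughput
import com.amazonaws.services.dynamodbv2.model.ScalarAttributeType

// Assumed defaults: 5 read and 5 write capacity units (illustrative values).
ProvisionedThroughput createDefaultProvisionedThroughput() {
    return new ProvisionedThroughput(5L, 5L)
}

// The hash (partition) key comes first, then the range (sort) key.
List<KeySchemaElement> createDefaultKeySchema(String hashName, String rangeName) {
    return [new KeySchemaElement(hashName, KeyType.HASH),
            new KeySchemaElement(rangeName, KeyType.RANGE)]
}

// Assumption: both key attributes are strings (ScalarAttributeType.S).
// Only key attributes are declared; other attributes need no definition.
List<AttributeDefinition> createDefaultAttributeDefinitions(String hashName, String rangeName) {
    return [new AttributeDefinition(hashName, ScalarAttributeType.S),
            new AttributeDefinition(rangeName, ScalarAttributeType.S)]
}
```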

Add item

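[code language="groovy"]
import com.amazonaws.AmazonClientException
import com.amazonaws.services.dynamodbv2.model.AttributeValue
import com.amazonaws.services.dynamodbv2.model.UpdateItemRequest

// `item` is a Map<String, AttributeValue> that includes the hash and range key values.
List<String> updateExpression = []
Map<String, String> expressionAttributeNames = [:]
Map<String, AttributeValue> expressionAttributeValues = [:]
// Build one "#aN = :vN" pair per non-key attribute; placeholders avoid reserved words.
item.findAll { !(it.key in [HASH_NAME, RANGE_NAME]) }.eachWithIndex { entry, index ->
    expressionAttributeNames["#a${index}".toString()] = entry.key
    expressionAttributeValues[":v${index}".toString()] = entry.value
    updateExpression << "#a${index} = :v${index}".toString()
}
UpdateItemRequest updateItemRequest = new UpdateItemRequest()
updateItemRequest.setTableName(tableName)
updateItemRequest.setUpdateExpression("SET " + updateExpression.join(', '))
updateItemRequest.setExpressionAttributeNames(expressionAttributeNames)
updateItemRequest.setExpressionAttributeValues(expressionAttributeValues)
// The key must contain exactly the table's hash and range attributes.
updateItemRequest.setKey(
        new AbstractMap.SimpleEntry<String, AttributeValue>(HASH_NAME, item[HASH_NAME]),
        new AbstractMap.SimpleEntry<String, AttributeValue>(RANGE_NAME, item[RANGE_NAME]))
boolean saved = true
try {
    // UpdateItem creates the item if it does not exist yet.
    client.updateItem(updateItemRequest)
} catch (AmazonClientException exception) {
    // Record the failure instead of silently swallowing it.
    saved = false
}
[/code]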

Edit table settings.

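[code language="groovy"]
import com.amazonaws.services.dynamodbv2.model.DescribeTableResult
import com.amazonaws.services.dynamodbv2.model.ProvisionedThroughput
import com.amazonaws.services.dynamodbv2.model.UpdateTableRequest

String dynamoTableName = dynamoTableHelperService.campaignPkToTableName(tableName)
// Read the current settings so the write capacity is carried over unchanged.
DescribeTableResult describeTableResult = client.describeTable(dynamoTableName)
UpdateTableRequest request = new UpdateTableRequest()
request.withTableName(dynamoTableName)
ProvisionedThroughput throughput = new ProvisionedThroughput()
throughput.setReadCapacityUnits(maxReadCapacityUnits)
throughput.setWriteCapacityUnits(describeTableResult.table.provisionedThroughput.getWriteCapacityUnits())
request.withProvisionedThroughput(throughput)
// The change takes effect only after client.updateTable(request) is called.
[/code]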

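Update global index throughput

[code language="groovy"]
import com.amazonaws.services.dynamodbv2.model.GlobalSecondaryIndexUpdate
import com.amazonaws.services.dynamodbv2.model.ProvisionedThroughput
import com.amazonaws.services.dynamodbv2.model.UpdateGlobalSecondaryIndexAction

// This changes the throughput of an index that already exists on the table.
GlobalSecondaryIndexUpdate indexUpdate = new GlobalSecondaryIndexUpdate()
UpdateGlobalSecondaryIndexAction updateIndexAction = new UpdateGlobalSecondaryIndexAction()
updateIndexAction.setIndexName(DynamoIndexConstants.GSI_NAME)
ProvisionedThroughput indexThroughput = createDefaultProvisionedThroughput()
indexThroughput.setWriteCapacityUnits(gsiWriteCapacity)
indexThroughput.setReadCapacityUnits(minReadCapacityUnits)
updateIndexAction.setProvisionedThroughput(indexThroughput)
indexUpdate.setUpdate(updateIndexAction)
// `request` is the UpdateTableRequest built in the previous snippet.
request.setGlobalSecondaryIndexUpdates([indexUpdate])
client.updateTable(request)
[/code]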
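
The snippet above uses `UpdateGlobalSecondaryIndexAction`, which adjusts an index that is already on the table. Creating a new global secondary index on an existing table goes through `CreateGlobalSecondaryIndexAction` instead. Below is a minimal sketch, assuming a string-typed index key, a projection of all attributes, and an arbitrary throughput of 5/5 units. The `buildCreateIndexRequest` helper is a name made up for this post.

```groovy
import com.amazonaws.services.dynamodbv2.model.AttributeDefinition
import com.amazonaws.services.dynamodbv2.model.CreateGlobalSecondaryIndexAction
import com.amazonaws.services.dynamodbv2.model.GlobalSecondaryIndexUpdate
import com.amazonaws.services.dynamodbv2.model.KeySchemaElement
import com.amazonaws.services.dynamodbv2.model.KeyType
import com.amazonaws.services.dynamodbv2.model.Projection
import com.amazonaws.services.dynamodbv2.model.ProjectionType
import com.amazonaws.services.dynamodbv2.model.ProvisionedThroughput
import com.amazonaws.services.dynamodbv2.model.ScalarAttributeType
import com.amazonaws.services.dynamodbv2.model.UpdateTableRequest

// Hypothetical helper: builds an UpdateTable request that adds one index.
UpdateTableRequest buildCreateIndexRequest(String tableName, String indexName, String indexHashName) {
    // Assumptions: hash-only index key, all attributes projected, 5/5 capacity units.
    CreateGlobalSecondaryIndexAction createAction = new CreateGlobalSecondaryIndexAction()
            .withIndexName(indexName)
            .withKeySchema(new KeySchemaElement(indexHashName, KeyType.HASH))
            .withProjection(new Projection().withProjectionType(ProjectionType.ALL))
            .withProvisionedThroughput(new ProvisionedThroughput(5L, 5L))

    // Attributes used as index keys must be declared in the same request.
    return new UpdateTableRequest()
            .withTableName(tableName)
            .withAttributeDefinitions(new AttributeDefinition(indexHashName, ScalarAttributeType.S))
            .withGlobalSecondaryIndexUpdates(new GlobalSecondaryIndexUpdate().withCreate(createAction))
}

UpdateTableRequest createIndexRequest = buildCreateIndexRequest('my-table', 'email-index', 'email')
// Send it with client.updateTable(createIndexRequest); the index is backfilled in the background.
```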

Query table
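
A query reads all items that share one hash key value. As a minimal sketch with the SDK v1 `QueryRequest`, assuming a string hash key: the two helper names are made up for this post. Because one call returns at most 1 MB of data, the loop follows `LastEvaluatedKey` until no pages are left.

```groovy
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB
import com.amazonaws.services.dynamodbv2.model.AttributeValue
import com.amazonaws.services.dynamodbv2.model.QueryRequest
import com.amazonaws.services.dynamodbv2.model.QueryResult

// Hypothetical helper: request for every item with the given hash key value.
QueryRequest buildQueryRequest(String tableName, String hashName, String hashValue) {
    return new QueryRequest()
            .withTableName(tableName)
            .withKeyConditionExpression('#h = :h')
            .withExpressionAttributeNames(['#h': hashName])
            .withExpressionAttributeValues([':h': new AttributeValue().withS(hashValue)])
}

// Hypothetical helper: reads page after page until LastEvaluatedKey is empty.
List<Map<String, AttributeValue>> queryAll(AmazonDynamoDB client, QueryRequest request) {
    List<Map<String, AttributeValue>> items = []
    Map<String, AttributeValue> startKey = null
    while (true) {
        QueryResult result = client.query(request.withExclusiveStartKey(startKey))
        items.addAll(result.items ?: [])
        startKey = result.lastEvaluatedKey
        if (!startKey) {
            break
        }
    }
    return items
}
```

With a real client this would be called as `queryAll(client, buildQueryRequest(tableName, HASH_NAME, 'some-id'))`.
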
Scan table
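
A scan walks the whole table, so it is the fallback for non-indexed attributes. As a minimal sketch with the SDK v1 `ScanRequest`, assuming a string attribute: the helper names are made up for this post. Keep in mind that the filter is applied after the items are read, so a scan consumes read capacity for everything it touches, not just for what it returns.

```groovy
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB
import com.amazonaws.services.dynamodbv2.model.AttributeValue
import com.amazonaws.services.dynamodbv2.model.ScanRequest
import com.amazonaws.services.dynamodbv2.model.ScanResult

// Hypothetical helper: scan that keeps only items whose attribute equals the value.
ScanRequest buildScanRequest(String tableName, String attributeName, String value) {
    return new ScanRequest()
            .withTableName(tableName)
            .withFilterExpression('#a = :v')
            .withExpressionAttributeNames(['#a': attributeName])
            .withExpressionAttributeValues([':v': new AttributeValue().withS(value)])
}

// Hypothetical helper: each call covers at most 1 MB of scanned data,
// so continue from LastEvaluatedKey until the table is exhausted.
List<Map<String, AttributeValue>> scanAll(AmazonDynamoDB client, ScanRequest request) {
    List<Map<String, AttributeValue>> items = []
    Map<String, AttributeValue> startKey = null
    while (true) {
        ScanResult result = client.scan(request.withExclusiveStartKey(startKey))
        items.addAll(result.items ?: [])
        startKey = result.lastEvaluatedKey
        if (!startKey) {
            break
        }
    }
    return items
}
```

If the same filter runs often, a global secondary index on that attribute plus a query is usually the cheaper option.
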

If you have any questions or problems with DynamoDB administration, please contact us.
