Sunday, October 14, 2012

Getting started with Amazon's DynamoDB in Python

Amazon provides many cloud services, both for Infrastructure as a Service (running virtual machines in EC2) and Platform as a Service (providing the actual services). One of these services is DynamoDB, a simple NoSQL database hosted on Amazon's servers. DynamoDB, like many NoSQL databases, is a simple key-value store with some added capabilities.
Basically, you store objects (representable in JSON), each matched to a key. Your key can be a string or an int (actually, DynamoDB also allows your key to be composed of two fields, a hash and a range, but here we will only use the hash). Your object is just a dictionary mapping fields to values.
The best Python library for DynamoDB (and for most other AWS services) is boto. The easiest way to install boto is with pip; just run pip install boto or sudo pip install boto.
Amazon offers a free tier for DynamoDB, which should be enough for playing around and for small applications (the DB has to be less than 100 MB in size, and you get 5KB in reads and 1KB in writes per second). So head over to aws.amazon.com and create your account if you don't have one. After creating your account, go to security credentials and find your access key id and your secret access key. You may want to store your credentials in a file called .boto in your home folder, as explained here. After that, you can connect to DynamoDB by calling boto.connect_dynamodb().
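As a rough sketch of that connection step (not necessarily the exact snippet this post first shipped with), something like the following should work with boto 2. The function name connect_to_dynamodb is made up for this example; boto.connect_dynamodb is the real call.

```python
def connect_to_dynamodb(access_key_id=None, secret_access_key=None):
    """Return a boto DynamoDB connection.

    connect_to_dynamodb is a hypothetical helper name used only in this sketch.
    """
    # Imported here so that nothing touches boto or AWS until the call is made.
    import boto

    if access_key_id is None or secret_access_key is None:
        # With no explicit keys, boto looks the credentials up itself,
        # for example in the [Credentials] section of ~/.boto.
        return boto.connect_dynamodb()

    return boto.connect_dynamodb(
        aws_access_key_id=access_key_id,
        aws_secret_access_key=secret_access_key,
    )
```

If your keys are already in .boto, conn = connect_to_dynamodb() is all you need; passing the two keys explicitly is handy for quick experiments, but try not to leave them in source code.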
Now, let's create a table. Our table will be called users, and its key will be a string. We first need to create a schema with the connection's create_schema method (the schema only specifies the name and type of your key; your 'table' can contain any kind of objects, with any attributes), and then pass that schema to the create_table method to create the table.
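Here is a minimal sketch of the schema and table creation just described, assuming conn is the connection object returned by boto.connect_dynamodb(). The key name 'username', the helper name create_users_table and the throughput numbers are assumptions made for illustration; pick values that suit your own table.

```python
def create_users_table(conn, read_units=5, write_units=5):
    """Create the 'users' table with a string hash key.

    create_users_table, the key name 'username' and the default
    throughput values are placeholders chosen for this sketch.
    """
    # The schema only describes the key: its name and its type.
    # Passing the Python type str asks for a string key.
    schema = conn.create_schema(
        hash_key_name='username',
        hash_key_proto_value=str,
    )
    # read_units / write_units are the provisioned throughput of the table.
    return conn.create_table(
        name='users',
        schema=schema,
        read_units=read_units,
        write_units=write_units,
    )
```

Keeping the connection as a parameter makes it easy to try the function against a stub object before pointing it at a real account.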
Once we have created the table, we can move on to adding items: we create a dictionary containing our data, then use the new_item method of our table to create the item, and finally call the put method of that item to actually add it to the database.
After creating several items, we can obtain an element (knowing its key) with the get_item method of the table (notice that it will throw an exception if the key doesn't exist).
To modify an item in the DB, we just modify it in memory (it looks like a dictionary) and then call its put method again.
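The three operations above (add, fetch, modify) could be sketched as follows, assuming table is a boto table object such as the one returned by create_table. The helper names and the 'email' attribute are invented for this example; new_item, put, get_item and DynamoDBKeyNotFoundError come from boto 2.

```python
def add_user(table, username, attrs):
    """Create an item keyed by username and write it to the table."""
    item = table.new_item(hash_key=username, attrs=attrs)
    item.put()  # nothing is stored until put() is called
    return item


def get_user(table, username):
    """Return the item for username, or None if the key does not exist."""
    # Imported here so that boto is only needed when the lookup runs.
    from boto.dynamodb.exceptions import DynamoDBKeyNotFoundError
    try:
        return table.get_item(hash_key=username)
    except DynamoDBKeyNotFoundError:
        return None


def change_email(table, username, new_email):
    """Fetch an item, change one attribute in memory, and save it again."""
    item = table.get_item(hash_key=username)
    item['email'] = new_email  # items behave like dictionaries
    item.put()
    return item
```

For instance, add_user(table, 'mike', {'name': 'Mike', 'email': 'mike@example.com'}) followed by change_email(table, 'mike', 'new@example.com') would store the item and then overwrite its email.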
Most of the time you will be searching for a given key (or at least will know part of the key), but if not, you can use the scan method (notice that this method goes through EVERY object in the table, so it may be slow and consume a lot of I/O). You can pass a list of conditions through the scan_filter argument (as of now, the documentation is incorrect, stating that you specify the conditions with strings; you need to use elements coming from boto.dynamodb.condition).
For more information, check: Or come back :) I should be blogging more about DynamoDB (and other AWS services) soon.
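To make the scan described above a bit more concrete, here is a small sketch that filters on the name attribute of each item. The helper name users_whose_name_contains is made up; scan, scan_filter and CONTAINS are boto 2 names. Note that the condition object is imported from boto.dynamodb.condition rather than written as a string.

```python
def users_whose_name_contains(table, fragment):
    """Scan the whole table for items whose 'name' contains fragment.

    users_whose_name_contains is a hypothetical helper for this sketch.
    """
    # The condition must be an object from boto.dynamodb.condition,
    # not a plain string.
    from boto.dynamodb.condition import CONTAINS

    # scan() walks every item in the table, so this can be slow and
    # can use up a lot of read capacity on large tables.
    results = table.scan(scan_filter={'name': CONTAINS(fragment)})
    return list(results)
```

The match is case-sensitive, so searching for 'o' will not find a name that only contains 'O'.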

3 comments:

  1. users=table.scan(scan_filter={'username':CONTAINS('o')})
    It doesn't return any data for me; it returns "" as the result, and I do have data with that content in my table. Kindly help me.

    Replies
    1. The problem is with your imports. You can do it this way:
      records = table.scan(scan_filter={'username':boto.dynamodb.condition.CONTAINS('Mik')})
      And don't forget to add "import boto.dynamodb.condition".

  2. Are you sure you have data with a lowercase o? Maybe you only have it in uppercase? Did you use 'username' in lowercase?

    Actually, I see a typo in my code: user_data has a name attribute, not username. Try users=table.scan(scan_filter={'name':CONTAINS('o')})
