Three Ways to Interact with DynamoDB

At various points in the p2t processing flow, I need to save state somewhere: the properties of uploaded files, the details of any output, and so on. There are myriad details worth capturing, and for the most part they look like key-value pairs. A logical choice is therefore DynamoDB, the AWS flavor of NoSQL. This bodes well for scaling down the track, as well as (hopefully!) throughput.

I mentioned installing and using the local version of DynamoDB previously, but since most backend components are functioning at this point, I’ve moved to developing with the cloud instance. While the aws-cli tool is much more expedient for bulk/batch operations, the console is useful for creating a new table, including specifying the hash and range primary key details. The console also allows for inspection of table entries and their datatypes.
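For reference, the same table definition can be sketched in node.js rather than clicked together in the console. This is only a sketch: the key names and types (a string `email` hash key and a numeric `time` range key) are inferred from the PHP query later in this post, the throughput numbers are placeholders, and `buildCreateTableParams` is a hypothetical helper name.

```javascript
// Sketch: parameters for creating the jobs table with a composite primary key.
// Key names/types are inferred from the PHP query in this post;
// the throughput values are placeholder assumptions, not what I actually use.
function buildCreateTableParams(tableName) {
  return {
    TableName: tableName,
    AttributeDefinitions: [
      { AttributeName: 'email', AttributeType: 'S' }, // string
      { AttributeName: 'time', AttributeType: 'N' }   // number
    ],
    KeySchema: [
      { AttributeName: 'email', KeyType: 'HASH' },  // hash (partition) key
      { AttributeName: 'time', KeyType: 'RANGE' }   // range (sort) key
    ],
    ProvisionedThroughput: { ReadCapacityUnits: 1, WriteCapacityUnits: 1 }
  };
}

const createParams = buildCreateTableParams('processingJobs');
console.log(JSON.stringify(createParams, null, 2));
```

With the AWS SDK for JavaScript (v2), an object like this would be handed to `new AWS.DynamoDB().createTable(createParams, callback)`.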

At the backend, a variety of lambda functions write to DynamoDB using node.js. For example, here I create an entry for a freshly uploaded document, ready for processing by the workflow:

// dynamodbDoc is the DynamoDB document client; table, time, putEmail etc.
// are set earlier in the lambda function.
var params = {
    TableName: table,
    Item: {
        "time": time,
        "email": putEmail,
        "info": {
            "input_file": fileInput,
            "upload_date": "2013-09-02T00:00:00Z",
            "ip": "127.0.0.1",
            "id": globalLabel,
            "num_files": numFiles,
            "input_size": srcSize,
            // the file list is stored as a JSON string
            "job_files": JSON.stringify(file_array)
        }
    }
};

dynamodbDoc.put(params, function(err, data) {
    if (err) {
        console.error("Unable to add item. Error JSON:", JSON.stringify(err, null, 2));
    } else {
        console.log("Added item:", JSON.stringify(data, null, 2));
    }
});
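The upload date and IP in that snippet are fixed placeholder values. Here is a minimal sketch of how the item could be assembled with real timestamps, assuming `time` is stored as epoch milliseconds (which is what the PHP query further down suggests). The helper name `buildJobItem`, the shape of its `job` argument, and the sample values are all made up for illustration.

```javascript
// Sketch: build the put() parameters for a freshly uploaded job.
// Assumption: "time" holds epoch milliseconds; "now" is injectable for testing.
function buildJobItem(table, email, job, now) {
  const when = now || new Date();
  return {
    TableName: table,
    Item: {
      time: when.getTime(),  // epoch milliseconds
      email: email,
      info: {
        input_file: job.inputFile,
        upload_date: when.toISOString(), // ISO 8601, UTC
        id: job.id,
        num_files: job.files.length,
        input_size: job.inputSize,
        job_files: JSON.stringify(job.files)
      }
    }
  };
}

// Hypothetical sample values, purely for illustration.
const putParams = buildJobItem(
  'processingJobs',
  'example@example.com',
  { inputFile: 'report.pdf', id: 'job-001', inputSize: 1024, files: ['a.pdf', 'b.pdf'] },
  new Date(Date.UTC(2015, 9, 12))
);
console.log(JSON.stringify(putParams, null, 2));
```

The resulting object has the same shape as `params` above, so it could be passed to `dynamodbDoc.put()` unchanged.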

On the frontend, I want a user to be able to browse available output. This depends on the lambda functions firing correctly and passing data through the flow; how the results should be presented is still under consideration. At the very least, I want a user to be able to check on the outcome of the aforementioned upload. To present the contents of DynamoDB tables to the end user, I switch gears to PHP, as in this little snippet:

require 'app/start.php';

use Aws\DynamoDb\Exception\DynamoDbException;

$sdk = new Aws\Sdk([
    'region'  => 'us-east-1',
    'version' => 'latest',
]);

date_default_timezone_set('UTC');

$client = $sdk->createDynamoDb();

$result = $client->describeTable(array(
    'TableName' => 'processingJobs'
));

// milliseconds
$t = strtotime("-2 days") * 1000;

$iterator = $client->getIterator('Query', array(
    'TableName'     => 'processingJobs',
    'KeyConditions' => array(
        'email' => array(
            'AttributeValueList' => array(
                array('S' => 'example@example.com')
            ),
            'ComparisonOperator' => 'EQ'
        ),
        'time' => array(
            'AttributeValueList' => array(
                array('N' => (string) $t)
            ),
            'ComparisonOperator' => 'GT'
        )
    )
));
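For comparison, here is a rough node.js sketch of the same query (jobs for one email address from the last two days). It uses a key condition expression rather than the `KeyConditions` form in the PHP above, so treat it as an equivalent-looking alternative, not a translation I have run against the live table. `buildRecentJobsQuery` is a hypothetical helper name.

```javascript
// Sketch: low-level Query parameters for "jobs for this email newer than sinceMs".
// Attribute names are aliased with ExpressionAttributeNames in case they
// clash with DynamoDB reserved words.
function buildRecentJobsQuery(email, sinceMs) {
  return {
    TableName: 'processingJobs',
    KeyConditionExpression: '#e = :email AND #t > :since',
    ExpressionAttributeNames: { '#e': 'email', '#t': 'time' },
    ExpressionAttributeValues: {
      ':email': { S: email },
      // The low-level API carries numbers as strings, same as the
      // (string) cast on $t in the PHP snippet.
      ':since': { N: String(sinceMs) }
    }
  };
}

const twoDaysMs = 2 * 24 * 60 * 60 * 1000;
const queryParams = buildRecentJobsQuery('example@example.com', Date.now() - twoDaysMs);
console.log(JSON.stringify(queryParams, null, 2));
```

With the AWS SDK for JavaScript (v2), this object would go to `new AWS.DynamoDB().query(queryParams, callback)`.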

It’s been a long time since I used PHP; I found this series incredibly helpful in getting up to speed with AWS, although be aware of a few undocumented and/or non-obvious aspects of the SDK. For instance, one gotcha was the need to cast the number type for time to a string (the API expects 'N' values as strings). An easy way to check whether your script is kosher is to simply run it through the PHP interpreter; I’ve found the debug output really helpful and easy to use. I would also recommend testing with a local Apache instance, in which case the output of /var/log/apache2/error.log (e.g., on Ubuntu) is very useful. For example, this:

[Mon Oct 12 18:18:08.954871 2015] [:error] [pid 12534] [client 127.0.0.1:53302] PHP Fatal error: Uncaught exception 'Aws\\Exception\\CredentialsException' with message 'Error retrieving credentials from the instance profile metadata server.

which was remedied by adding credentials to the aforementioned PHP snippet, like so:

use Aws\DynamoDb\Exception\DynamoDbException;

$sdk = new Aws\Sdk([
    'region'      => 'us-east-1',
    'version'     => 'latest',
    'credentials' => array(
        'key'    => 'AKIAI7UQxxxxxxxxxxxx',
        'secret' => 'iFspwoSBEYr6ln5/xxxxxxxxxxxxxxxxxxxxxxxx'
    )
]);

happy hunting! -bill