DynamoDB Streams locally with arc.codes & Enhance

Introduction

When working on a recent web app project using Enhance, I encountered a requirement for using both transactions and DynamoDB streams, two DynamoDB features that aren't supported by Dynalite - the fast in-memory DynamoDB engine that Architect uses to support local development.

In this article, I'll introduce my new plugin (arc-plugin-sandbox-stream) for working with DynamoDB streams locally within Arc and Enhance projects. This plugin enables developers to use DynamoDB streams locally in their arc or enhance sandbox environment.

Background

Architect got me hooked back in 2018 with its amazing local developer experience, it's a huge DX boon to be able to try ideas out on your own machine without being concerned about provisioning real infrastructure, or being tied to a location where network connectivity is guaranteed.

One of the services Architect spins up locally is Dynalite. It provides a fast in-memory DynamoDB engine. For lots of projects this engine is absolutely ideal, fast to launch, lightweight and not a process hog. But for this particular project I really needed streams and I wasn't prepared to sacrifice the local developer experience. So I set about researching how this could be achieved, if at all. Helpfully it's a question that has cropped up on Arc's discord and the thread of replies helped point me in the right direction.

I'd need to do a few things:

Switch to using DynamoDB Local
Create a middleware to poll the stream for new data
Invoke the corresponding function handler when new data is returned from the stream.

Using the plugin

The plugin kicks in when a @tables-streams pragma is discovered within your arc config, and queries the table meta data / stream meta data to retrieve iterators for each shard of the table's stream.

If data is found the plugin invokes the corresponding lambda function using arc's inbuilt invoke function.

Your function code will receive one or more results off the stream like so (as you'd expect if you've used streams on AWS):

[
  {
    "eventID": "7c1ac231-2c9d-4ba0-b3a2-1de2eb6602c0",
    "eventName": "MODIFY",
    "eventVersion": "1.1",
    "eventSource": "aws:dynamodb",
    "awsRegion": "ddblocal",
    "dynamodb": {
      "ApproximateCreationDateTime": "2023-04-24T12:19:00.000Z",
      "Keys": {
        "sk": {
          "S": "account:si-novi"
        },
        "pk": {
          "S": "account:si-novi"
        }
      },
      "NewImage": {
        "_type": {
          "S": "Organisation"
        },
        "name": {
          "S": "Si Novi"
        },
        "sk": {
          "S": "account:si-novi"
        },
        "created_at": {
          "S": "2023-04-24T12:19:07.409Z"
        },
        "id": {
          "S": "01GYSK850HWHGE36A2VTDG1CPK"
        },
        "pk": {
          "S": "account:si-novi"
        },
        "modified_at": {
          "S": "2023-04-24T12:19:07.409Z"
        },
        "slug": {
          "S": "si-novi"
        }
      },
      "SequenceNumber": "000000000000000000935",
      "SizeBytes": 310,
      "StreamViewType": "NEW_IMAGE"
    }
  }
]

Setup

1. Add the dependency to your project

npm install @hicksy/arc-plugin-sandbox-stream

2. Configure your project to use @tables-streams in .arc file

@tables
example
  pk *String
  sk **String

@tables-streams
example

3. Add the plugin to arc config

@plugins
hicksy/arc-plugin-sandbox-stream

4. Setup DynamoDB Local

There are various ways to use DyanmoDB Local. I opted for the bundled version that comes with AWS's NoSQL Workbench tool. But you can use which ever version you're comfortable with.

Just note that if you use the NoSQL Workbench version there's the following caveats I came accross:

Data storage is persistent in the NoSQL Workbench version - I can't find a way to set the inMemory flag available to the standalone / docker version (you'll need to modify any seed scripts to conditionally insert or drop and re-create the table)
You need to be running NoSQL Workbench while you develop and remember to toggle on DynamoDB Local each time (not a biggie really, but it'd be great to have the means to set it to autostart or run as a start-up background task)
The NoSQL Workbench version runs on port 5500, I can't see a way to change this, but it's not a problem for me.

5. Configure Arc to use DynamoDB Local

Arc environment params

Arc makes it really easy to switch out Dynalite for DynamoDB Local.

There's a couple of env vars you need to set to get you going.

Tell arc that you're using an external db
Change the port used for tables within arc
Set your region to ddblocal

You can do this one of several ways.

Either setting an environment variable ARC_DB_EXTERNAL=true / ARC_TABLES_PORT=5500 in an .env file, or using the prefs.arc file and setting the properties within the @sandbox section see arc's docs for more info

e.g.

.env file

APP_URL=http://localhost:3333
ARC_TABLES_PORT=5500

prefs.arc file

@sandbox
external-db true

Arc data seeds

As DynamoDB Local data is persistent when using the bundled NoSQL Workbench version, you'll need to modify any startup data seeds you have and update them to either conditionally insert data, or what I prefer, to delete the table and re-create it each time - this makes it more in keeping with all my other arc projects.

Arc table create or update

Sadly, as Dynalite doesn't support streams, arc's sandbox doesn't create the table with the required StreamSpecification param, so you'll need to supply that either when re-creating the table using createTable or supplying it via an updateTable call.

An example StreamSpecifcation looks like:

StreamSpecification: {
	StreamEnabled: true,
	StreamViewType: 'NEW_IMAGE' 
},

The project I'm working on currently uses Sensedeep's OneTable. So I can achieve the above like:

const client = new Dynamo({client: new DynamoDBClient(params)})
const table = new Table({
    name: DDB_NAME,
    client: client,
    logger: true,
    schema: AppSchema,
    partial: false
})

if(await table.exists()) {
	await table.deleteTable('DeleteTableForever');
}

await table.createTable({
    StreamSpecification: {
        StreamEnabled: true,
        StreamViewType: 'NEW_IMAGE' 
    },
});

You can also do this using the standard JS dynamodb client for any projects that aren't using an additional tool like onetable.

tip: use @architect/functions to infer the full name for your DynamoDB table's, which will be a combination of the name you've given and it's deployement environment etc

6. Modify the polling interval [optional]

By default the plugin will poll for new stream data every 10 seconds. You can override this by providing an alternate millisecond interval by updating your .arc file

@sandbox-table-streams
polling_interval 1000

Get in touch

Any feedback on your usage of this plugin, bugs, or suggestions for improvements please get in touch on GitHub. I'd love to hear from you.

DynamoDB Streams locally with arc.codes & Enhance

Introduction

Background

Using the plugin

Setup

Arc environment params

Arc data seeds

Arc table create or update

Get in touch

Get in touch

Follow me on...

Work history