What a conditional DynamoDB transaction actually costs
Leonie opened an issue on the suite the other week with a question I couldn't answer off the top of my head: when you run a conditional write inside a DynamoDB transaction, does the condition cost you read capacity on top of the write? Or is it write capacity all the way down?
So I went to the docs. They were... not clear. There's a line on the transaction API page about idempotent retries returning read capacity, and a separate read/write page that circles the topic, but nothing that says plainly: here is what a conditional TransactWriteItems charges you.
This is the sort of thing the suite is for. It runs real DynamoDB operations against live AWS and records what the service actually does, so when the docs go vague you can swap the guesswork for a number. I wrote four tests and ran them against real DynamoDB.
What came back
A passing condition adds nothing. A conditional write that succeeds costs exactly what the unconditional version does: 2 write capacity units per item under 1KB. The condition itself adds no read capacity. If you'd been avoiding attribute_not_exists guards on the assumption they'd cost you extra reads, they don't.
A standalone ConditionCheck is billed as a write. The ConditionCheck action, the one that asserts something about an item without modifying it, costs 2 WCU and counts as write capacity rather than read. A little counterintuitive, given all it does is look at an item and check a condition, but the transaction path treats it as a write.
Idempotent replay splits the accounting. Send a TransactWriteItems with a ClientRequestToken and the first call charges write capacity, 2 WCU. Send the same token again inside the idempotency window and DynamoDB doesn't repeat the write; it hands back the stored result and charges you read capacity for it, 2 RCU, because it's reading what it already wrote. This is the behaviour the docs hint at without ever spelling out, so it's the one I most wanted nailed down.
A cancelled transaction reports nothing. When the condition fails, the transaction cancels with a TransactionCanceledException, and the response carries no ConsumedCapacity at all. AWS still bills the prepare phase, so the test pins what the response reports, not what you pay. Worth keeping the two apart.
All four now live in the suite, characterised against eu-west-2, and they re-run on every pass. If AWS quietly changes any of it, that turns up as a failing test rather than a surprise on the bill.
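If you want to reproduce these numbers yourself, here is a minimal sketch of the request shape and of reading the read/write split back out of the response. It assumes boto3's low-level DynamoDB client; the key attribute `pk` and the helper names are hypothetical and are not taken from the suite. Nothing here touches AWS until you hand `run` a real client.

```python
from typing import Any, Dict, List, Optional


def conditional_put_request(table: str, pk: str,
                            token: Optional[str] = None) -> Dict[str, Any]:
    """Build TransactWriteItems kwargs for a put guarded by attribute_not_exists."""
    request: Dict[str, Any] = {
        "TransactItems": [{
            "Put": {
                "TableName": table,
                "Item": {"pk": {"S": pk}},  # `pk` is a hypothetical key name
                "ConditionExpression": "attribute_not_exists(pk)",
            }
        }],
        # Ask DynamoDB to report consumed capacity in the response.
        "ReturnConsumedCapacity": "TOTAL",
    }
    if token is not None:
        # Reusing the same token on a second call makes it an idempotent replay.
        request["ClientRequestToken"] = token
    return request


def capacity_split(response: Dict[str, Any]) -> Dict[str, float]:
    """Sum the read and write units reported across every table in a response."""
    totals = {"read": 0.0, "write": 0.0}
    for entry in response.get("ConsumedCapacity", []):
        totals["read"] += entry.get("ReadCapacityUnits", 0.0)
        totals["write"] += entry.get("WriteCapacityUnits", 0.0)
    return totals


def run(client: Any, table: str, pk: str, token: str) -> List[Dict[str, float]]:
    """Send the same tokened request twice; return the split for each call.

    `client` is assumed to be boto3.client("dynamodb").
    """
    request = conditional_put_request(table, pk, token)
    first = client.transact_write_items(**request)
    replay = client.transact_write_items(**request)
    return [capacity_split(first), capacity_split(replay)]
```

Going by the results above, a sub-1KB item should come back from `run` as 2 write units on the first call and 2 read units on the replay.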
Loose ends
Those four pinned the everyday case, but each one left a thread hanging, so I went back for them.
The replay test had a hole in it. It ran against a sub-1KB item, and at that size the numbers can't actually prove what they claim. A transactional write costs two units for every 1KB and a transactional read two units for every 4KB, both rounded up, so anything under 1KB lands on 2 either way, and the replay's "2 read units" could just be the write's 2 units wearing a read label. So I pushed the item past 1KB. At around 1.5KB it separates: the first call charges 4 write units, the replay charges 2 read units. The replay is recomputing a real read against the item's size, not handing back the write cost. It reads what it already wrote, and at this size it comes in cheaper than the write it stands in for.
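The rounding is easy to check by hand. This is a small sketch of the arithmetic described above, assuming 1KB means 1024 bytes; the function names are mine, not the suite's.

```python
import math

KB = 1024  # assumption: 1KB is 1024 bytes


def transactional_write_units(size_bytes: int) -> int:
    """Two write units per 1KB of item size, rounded up."""
    return 2 * math.ceil(size_bytes / KB)


def transactional_read_units(size_bytes: int) -> int:
    """Two read units per 4KB of item size, rounded up."""
    return 2 * math.ceil(size_bytes / (4 * KB))


for size in (512, 1536):
    print(size, transactional_write_units(size), transactional_read_units(size))
# prints:
# 512 2 2
# 1536 4 2
```

At 512 bytes the two formulas agree, which is exactly why the original test couldn't tell them apart. At 1536 bytes they diverge.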
ExecuteTransaction, the PartiQL version, takes a ClientRequestToken too, and it behaves the same as TransactWriteItems: replay the token and the statements run once rather than twice, 2 write units on the first call, 2 read on the replay. A counter I incremented inside the transaction reads back 1, not 2, which is how you know the replay didn't re-run it.
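For reference, a tokened ExecuteTransaction request might look something like the sketch below. The table name, the `pk` and `hits` attributes, and the statement text are all assumptions for illustration, and the statement presumes the item already exists with a numeric `hits` attribute. The calls against a real client are left as comments.

```python
import uuid
from typing import Any, Dict


def increment_transaction(table: str, pk: str, token: str) -> Dict[str, Any]:
    """Build ExecuteTransaction kwargs for a single counter increment."""
    return {
        "TransactStatements": [{
            # Hypothetical statement: bump a numeric `hits` attribute by one.
            "Statement": f'UPDATE "{table}" SET hits = hits + 1 WHERE pk = ?',
            "Parameters": [{"S": pk}],
        }],
        "ClientRequestToken": token,
        "ReturnConsumedCapacity": "TOTAL",
    }


token = str(uuid.uuid4())
request = increment_transaction("capacity-demo", "counter-1", token)

# With a real client (assumed to be boto3.client("dynamodb")):
#   client.execute_transaction(**request)
#   client.execute_transaction(**request)  # same token: replayed, not re-run
```

Reading the counter back after both calls is the check that matters: if the replay had re-run the statement, it would read 2.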
The single-item case was the one that caught me out, and not the way I expected. The issue took it as read - and I'd have assumed the same - that a plain GetItem or PutItem reports its capacity split into read and write units, the way transactions do. It doesn't. Real DynamoDB gives single-item operations a single aggregate CapacityUnits and no split at all; the read/write breakdown is a transactions-only thing. So that test pins an absence rather than a number: a strongly-consistent read reports 1 unit, an eventually-consistent read 0.5, a small write 1, and none of them carry the split. I only know that because the suite asks real DynamoDB before it pins anything, rather than trusting the assumption sitting inside the question.
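Pinning an absence is simple enough to sketch. The helper below takes the `ConsumedCapacity` entry from a response and reports the aggregate alongside whether any read/write split is present. The helper name and the sample table name are hypothetical; the sample values are the ones reported above.

```python
from typing import Any, Dict, Tuple


def aggregate_and_split(consumed: Dict[str, Any]) -> Tuple[float, bool]:
    """Return (CapacityUnits, whether a read/write split is present)."""
    has_split = ("ReadCapacityUnits" in consumed
                 or "WriteCapacityUnits" in consumed)
    return consumed["CapacityUnits"], has_split


# Shapes matching what the single-item tests observed: aggregate only.
strong_read = {"TableName": "capacity-demo", "CapacityUnits": 1.0}
eventual_read = {"TableName": "capacity-demo", "CapacityUnits": 0.5}

print(aggregate_and_split(strong_read))    # (1.0, False)
print(aggregate_and_split(eventual_read))  # (0.5, False)
```

One practical wrinkle if you write this against boto3: single-item operations return `ConsumedCapacity` as one object, while the transaction calls return a list of them, one per table.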
A daft aside: don't share your toys
There's a stupid coda to this one. After I'd written the tests and pushed them, CI went red, but only the single job that runs against real AWS. Every emulator was green. The real-AWS job had run clean for the first couple of minutes, then fell into a wall of "resource not found" errors.
The culprit was me. The suite uses a fixed set of table names, and the harness wipes all of them before each run so it starts from a known state. I'd been running those same tests on my laptop, against the same AWS account, to work the numbers out in the first place. My local run fired up, dutifully deleted every conformance table, and pulled them out from under the CI job that was halfway through using them.
Clearing the red was trivial: stop hammering the same account while CI is on it. The lesson underneath is the boring one I keep relearning. Shared mutable state catches up with you, and "it's only my laptop" is usually how.
One release, a lot of edges
The capacity answers shipped as part of the suite's 1.9.0 release, and they were only one of three strands in it. The release pinned around fifty new behaviours, all characterised against real DynamoDB in eu-west-2, and most of them live in the corners no one documents and everyone eventually trips over.
Empty binary key values, for one. Send a key attribute with an empty binary value and real DynamoDB rejects it with a validation error, the same way it rejects an empty string, on every path: single writes, batches, transactions, secondary-index keys. The suite now holds every engine to that.
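On the engine side, the rule amounts to treating empty binary the same as empty string wherever a key value is checked. A rough sketch of such a check, assuming the low-level attribute-value format (`{"S": ...}`, `{"B": ...}`); the function is hypothetical and covers only the emptiness rule, not the rest of key validation.

```python
from typing import Any, Dict


def key_value_is_non_empty(attribute_value: Dict[str, Any]) -> bool:
    """Return False for an empty string or empty binary key value."""
    # An attribute value is a single-entry map of type tag to value.
    (type_tag, value), = attribute_value.items()
    if type_tag in ("S", "B"):
        return len(value) > 0
    return True  # other types are outside the scope of this sketch


print(key_value_is_non_empty({"B": b""}))      # False
print(key_value_is_non_empty({"S": ""}))       # False
print(key_value_is_non_empty({"B": b"\x00"}))  # True
```

An engine would need to apply the same check on every path the tests cover: single writes, batches, transactions, and secondary-index keys.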
The rest is the fine print of expressions and response shapes: a BETWEEN with its bounds the wrong way round, an ExpressionAttributeValues map carrying an entry nothing references, a projection path that doesn't resolve, a hash-only GSI paged across a shared partition key, an UPDATED_NEW response that has to echo back exactly the right sub-paths. None of it breaks the happy path, and all of it is the kind of thing that slips past a lenient emulator and only shows up when you point the same code at the real service.
Where this goes: 2.0.0
There's a bigger problem sitting under all of this, and it's what the next major version is for.
The suite pins eu-west-2 as ground truth. For almost everything that's the right call, because real DynamoDB behaves the same wherever you call it. But not quite everything. A while back I watched the suite catch DynamoDB rewording its own validation errors, and the same dig turned up a handful of cases where the real service disagrees with itself by region. The clearest is the NULL attribute type: send { NULL: false } and eu-west-2 and eu-central-1 store it, while us-east-1 and ap-southeast-2 reject it. Neither is a bug. Both are real DynamoDB, and the split has held for over a month.
That puts the suite in an awkward spot. If your engine matches us-east-1 on that behaviour, right now it gets marked non-conformant for doing exactly what DynamoDB does in Virginia. That isn't a fair call, and it quietly rewards whichever region I happened to pin. Pin one region and you're really testing conformance to that region, not to DynamoDB.
So 2.0.0 is about scoring fairly across regions. The rough shape: ground truth becomes what real DynamoDB does across several regions rather than one, an engine conforms on a behaviour if it matches any real region and only fails if it does something no region does, and the results grow a per-region view so you can see how an engine lines up region by region instead of as one blended number. For the vast majority of behaviours, which are identical everywhere, nothing changes. And a behaviour only counts as region-split once it has held long enough that it isn't just a rollout caught mid-flight.
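To make the rule concrete, here is one way the scoring could be sketched. It is an illustration of the idea rather than the 2.0.0 design: the function name, the result shape, and the outcome labels are all hypothetical, and it leaves out the "held long enough" check entirely.

```python
from typing import Dict


def score_behaviour(engine_outcome: str,
                    region_outcomes: Dict[str, str]) -> Dict[str, object]:
    """Score one behaviour against what each real region was observed to do."""
    per_region = {region: outcome == engine_outcome
                  for region, outcome in region_outcomes.items()}
    return {
        # Matching any real region is enough to conform.
        "conforms": any(per_region.values()),
        # The regions disagree with each other on this behaviour.
        "region_split": len(set(region_outcomes.values())) > 1,
        "per_region": per_region,
    }


# The { NULL: false } case, using the regional outcomes described earlier.
null_false = {
    "eu-west-2": "stored",
    "eu-central-1": "stored",
    "us-east-1": "rejected",
    "ap-southeast-2": "rejected",
}

result = score_behaviour("rejected", null_false)
print(result["conforms"])                 # True
print(result["per_region"]["eu-west-2"])  # False
```

An engine that rejects the value conforms, because that is what us-east-1 does, and the per-region view still shows that it differs from eu-west-2.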
None of it is settled. I've written up the thinking in the open, and if you maintain a DynamoDB engine, especially one you build against a particular region, I'd want your view on how you'd want to be scored.