How close is your DynamoDB emulator to AWS?
My DynamoDB conformance suite has a site now: dynamodb-conformance.org. Eight emulators, 684 tests, scored against live AWS DynamoDB.
The README only ever showed a single snapshot, frozen at whenever I last updated the table. The site shows everything around that snapshot. The homepage is the standings - every target ranked by how closely it matches real DynamoDB, with DynamoDB itself pinned at the top on a flat 100%. Each emulator gets its own page with run-over-run history - not just where a score landed, but where it moved and which tests pushed it. There's a support matrix that breaks every operation down target by target, and a runs archive going back through each scoring pass. Want to know whether LocalStack handles transactions, or where DynamoDB Local diverges on error wording? It's a couple of clicks now, not a repo checkout.
How it works
Every test runs against live AWS DynamoDB first. Whatever the real thing does is recorded as the expected answer, and an emulator only passes if it gives that same answer. The ground truth is never my reading of the docs - it's what DynamoDB actually returned when I asked it.
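As a rough sketch of that record-then-compare idea (not the suite's actual code), the logic fits in a few lines. The helper names `observe` and `conforms` are hypothetical, the two callables stand in for the same SDK request sent to AWS and to an emulator, and the error messages are illustrative only.

```python
def observe(call):
    """Run one call and keep only what an application would see from it."""
    try:
        return {"ok": True, "response": call()}
    except Exception as exc:  # a real harness would catch the SDK's error type
        return {"ok": False, "error": type(exc).__name__, "message": str(exc)}


def conforms(expected, actual):
    """An emulator passes only if its observation equals the recorded one."""
    return expected == actual


# Stand-ins for "the same request sent to AWS and to an emulator".
def aws_call():
    raise ValueError("One or more parameter values were invalid")


def emulator_call():
    raise ValueError("Invalid parameter")


expected = observe(aws_call)       # recorded first, from the real service
actual = observe(emulator_call)    # compared second, never the other way round
print(conforms(expected, actual))  # False: same error type, different wording
```

The point of the shape is the direction of the comparison: the expected value is whatever the real service produced, so nobody's reading of the docs sits in the middle.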
The results split into three tiers, because one number hides too much. Core is the everyday stuff: CRUD, queries, scans, batch operations. Complete adds the documented-but-less-common features like transactions, PartiQL, TTL and streams. Strict is the fiddly end - validation ordering, exact error wording, API limits, legacy shapes. A gap in Core breaks your app. A gap in Strict breaks the CI test that asserts on an error message.
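To make the "one number hides too much" point concrete, here is a small sketch of per-tier scoring. It assumes each result is a plain (tier, passed) pair and that a score is simply passed divided by total; the function name and the sample data are made up for illustration and are not taken from any real run.

```python
from collections import defaultdict


def tier_scores(results):
    """Turn (tier, passed) pairs into {tier: percent passed}."""
    totals = defaultdict(lambda: [0, 0])  # tier -> [passed, total]
    for tier, passed in results:
        totals[tier][0] += int(passed)
        totals[tier][1] += 1
    return {tier: round(100 * p / n, 1) for tier, (p, n) in totals.items()}


# Hypothetical results: strong on Core, weak on Strict.
results = (
    [("core", True)] * 9 + [("core", False)]
    + [("strict", True)] + [("strict", False)]
)

by_tier = tier_scores(results)
overall = round(100 * sum(p for _, p in results) / len(results), 1)
print(by_tier)   # {'core': 90.0, 'strict': 50.0}
print(overall)   # 83.3
```

The overall figure lands between the two tiers and describes neither of them, which is the reason to report them separately.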
Everything is checked through the standard AWS SDK against the target's HTTP endpoint. Nothing reaches inside the implementation. If your application would see it through the SDK, the suite checks it. If it wouldn't, it doesn't care.
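In practice, "through the SDK against the target's HTTP endpoint" can be as small as changing one client argument. The sketch below uses boto3 and assumes an emulator listening on localhost; the helper names are hypothetical, port 8000 is only DynamoDB Local's usual default, and the dummy credentials are an assumption about what a local target will accept.

```python
def client_kwargs(endpoint_url=None):
    """Build boto3 client arguments for real AWS (None) or an emulator URL."""
    kwargs = {"region_name": "us-east-1"}
    if endpoint_url is not None:
        # Emulators typically accept any credentials; these are placeholders.
        kwargs.update(
            endpoint_url=endpoint_url,
            aws_access_key_id="dummy",
            aws_secret_access_key="dummy",
        )
    return kwargs


def make_client(endpoint_url=None):
    """Create a DynamoDB client; the rest of the test code stays identical."""
    import boto3  # imported lazily so client_kwargs works without the SDK

    return boto3.client("dynamodb", **client_kwargs(endpoint_url))


# Same code path for every target; only the endpoint differs.
local = client_kwargs("http://localhost:8000")
print(local["endpoint_url"])
```

Because the only thing that varies is the endpoint, anything the client can observe is in scope and nothing else is.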
And the suite grows. New tests land most weeks, so a score can move because the target changed or because I added coverage that exposed an old gap. A high score only means the target passes the tests that exist; behaviours the suite doesn't cover yet are blind spots, not passes. The methodology page has the full version.
What this run shows
The figures below come from the run dated 26 May 2026 - the site always shows the current ones.
Take LocalStack. It sits at 88% overall, which reads as middling until you split it: 98.6% on Core, 68.7% on Strict. That's not a mediocre emulator, it's a genuinely good one for everyday work that comes apart on error fidelity. DynamoDB Local tells nearly the same story - 97.7% Core, 69.2% Strict. The headline number flattens two completely different shapes of gap into one figure, which is precisely why the tiers exist.
This run also shows the suite breathing. It grew by 29 tests this pass, and the biggest movers all went down - Floci off 1.3 points, Dynalite 1.2, ExtendDB 1.1. Nothing regressed. The new tests just went looking where the old ones hadn't. That's the suite doing its job: coverage sharpens, scores dip, targets fix the gaps and climb back.
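The arithmetic behind a dip without a regression is easy to check. The pass counts below are hypothetical, not any target's real numbers; the only figures borrowed from the text are the 29 new tests and the 684 total, on the assumption that 684 is the count after they landed.

```python
def score(passed, total):
    """Percentage of tests passed, to one decimal place."""
    return round(100 * passed / total, 1)


old_total = 684 - 29          # 655 tests before this pass (assumed)
old_passed = 600              # hypothetical: nothing here changes afterwards

new_passed = old_passed + 20  # hypothetical: 20 of the 29 new tests pass

before = score(old_passed, old_total)
after = score(new_passed, 684)
print(before, after)          # 91.6 90.6
```

Every previously passing test still passes, yet the score drops a point, because nine newly covered behaviours were never right in the first place.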
One result is worth singling out. Ask real DynamoDB for item collection metrics on a write with ReturnItemCollectionMetrics: SIZE and it hands them straight back. Ask DynamoDB Local, LocalStack, Dynalite or Ministack and you get silence - a documented response field they simply don't implement, AWS's own emulator included. Dynoxide, ExtendDB and Floci return it correctly. You'd never know unless something was checking.
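A check like that can be sketched with nothing more than a request and a look at the response. The table name, key names and helper names below are hypothetical, and the sample responses are hand-written stand-ins rather than captured output. One assumption to flag: as I understand the AWS documentation, the metrics only come back for tables that have a local secondary index, so a real test would need such a table.

```python
def put_request(table_name):
    """Keyword arguments for a boto3 put_item call that asks for metrics."""
    return {
        "TableName": table_name,
        "Item": {"pk": {"S": "user#1"}, "sk": {"S": "profile"}},
        "ReturnItemCollectionMetrics": "SIZE",
    }


def returned_metrics(response):
    """True if the response carries a non-empty ItemCollectionMetrics field."""
    return bool(response.get("ItemCollectionMetrics"))


# Hand-written stand-ins for the two kinds of answer described above.
with_metrics = {
    "ItemCollectionMetrics": {
        "ItemCollectionKey": {"pk": {"S": "user#1"}},
        "SizeEstimateRangeGB": [0.0, 1.0],
    }
}
silent = {}

print(returned_metrics(with_metrics), returned_metrics(silent))  # True False
```

With a real client the call would be `client.put_item(**put_request("my-table"))`, run once against AWS and once against each target.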
Run it yourself
The whole thing is Apache-2.0 and every test is in the repo. Anything that speaks the DynamoDB HTTP API can be scored, so if you maintain an emulator, clone the repo, point the suite at your endpoint and see what comes back. You'll almost certainly find something. If you'd rather just see your emulator on the board, there's a "suggest a target" form on the site.
If you spot a behaviour the suite doesn't cover yet, send a PR. The single rule is that the test has to pass against real DynamoDB first - if AWS rejects it, the test is wrong, not the emulator. That rule is the whole point. Targets are measured against real DynamoDB, never against each other, so two emulators agreeing on the same wrong answer can't quietly make it the standard. The suite is meant to be an independent reference the whole field can trust, not something any one project owns. No emulator author, me included, gets to mark their own homework.
And the field is genuinely getting good. Floci has come a long way since late April, from around 61% to the low 90s, with its Strict (third-tier) score hauled up out of the twenties. ExtendDB only joined in late May and landed near the top straight away. It's an AWS-managed open-source project, Postgres-backed, that speaks the DynamoDB wire protocol. Worth a look.
Dynoxide, my own engine, sits in there on the same terms as the rest: same tests, no favours. Floci and ExtendDB are a couple of points behind and still climbing, and the whole field is closing on real DynamoDB.