SillyMuffin

💾 The Amazon Dynamo Paper (or how Amazon never loses what's in your shopping cart)

I see a lot of questions about distributed databases, and I always point people to the Dynamo paper. Here's my attempt to explain why it's so cool in simple terms.

The Problem: Imagine you're Amazon in 2007. Millions of people are shopping at the same time. If someone adds something to their cart, you CANNOT lose it. Ever. Even if servers explode. Even if an entire data centre burns down. The cart must live!

The Traditional Solution (and why it sucked): Old databases were like a single notebook. Only one person could write in it at a time. If the person with the notebook got sick, nobody could write anything. This obviously doesn't work when you're Amazon.

Enter Dynamo:

What it does:

  • Stores simple stuff (like your cart) across many computers
  • ALWAYS lets you write (add/remove items)
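  • Keeps several copies of everything, so one dead machine doesn't lose your data
  • Works even when things break
  • Handled tens of millions of requests in a single day for the shopping cart service (per the paper)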

How it works (ELI5 version):

  1. Instead of one notebook, have many copies
  2. Let people write in any copy they can find
  3. Later, look at all copies and figure out what actually happened
  4. If there are conflicts, use smart rules to fix them
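
The four steps above can be sketched in a few lines. This is a toy model, not Dynamo's actual code: the `Replica` class and the `reconcile` function are made-up names for illustration, and the merge rule (take the union of the items) is the simple cart-merging idea the paper describes for shopping carts.

```python
class Replica:
    """One copy of the cart store (hypothetical, in-memory only)."""

    def __init__(self, name):
        self.name = name
        self.carts = {}  # user id -> set of item names

    def add_item(self, user, item):
        # Any replica accepts the write, without asking the others first.
        self.carts.setdefault(user, set()).add(item)


def reconcile(replicas, user):
    """Merge every replica's view of one cart by set union, then write it back."""
    merged = set()
    for replica in replicas:
        merged |= replica.carts.get(user, set())
    for replica in replicas:
        replica.carts[user] = set(merged)
    return merged


a, b = Replica("a"), Replica("b")
a.add_item("alice", "book")        # this write lands on replica a
b.add_item("alice", "headphones")  # this one lands on replica b
cart = reconcile([a, b], "alice")  # later: both copies agree again
```

The trade-off of a union merge is that an "add" is never lost, but an item you deleted on one copy can come back after the merge. The paper accepts this for carts.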

The Clever Bits:

  1. "Ring Design" (consistent hashing)
  • Imagine all computers standing in a circle
  • Each one is responsible for part of the data
  • If one falls down, the next ones around the circle cover for it
  2. "Eventually Consistent"
  • Instead of making sure everyone has the same info immediately...
  • Just make sure they'll all get the same info eventually
  • Way faster and stays available during failures, but a read can briefly return stale data
  3. "Vector Clocks"
  • Like version counters, one per computer, attached to each piece of data
  • They tell you whether one version came after another, or whether two versions were written independently (a conflict)
  • Helps figure out what actually happened when you have conflicts
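
Here is a rough sketch of the ring idea, assuming a hypothetical `Ring` class with made-up node names. It hashes keys with MD5 (which the paper also uses to place keys) and walks clockwise to pick the machines that hold a key. Real Dynamo adds "virtual nodes" so each machine owns many small slices of the circle; that is left out here to keep it short.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map any string to a point on the circle (a 128-bit integer)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class Ring:
    """Toy consistent-hashing ring: one point per node, no virtual nodes."""

    def __init__(self, nodes, replicas=3):
        self.replicas = replicas
        self.points = sorted((ring_position(n), n) for n in nodes)

    def preference_list(self, key, down=()):
        """Walk clockwise from the key's position, skipping nodes marked down."""
        start = bisect.bisect_right(self.points, (ring_position(key), ""))
        chosen = []
        for i in range(len(self.points)):
            node = self.points[(start + i) % len(self.points)][1]
            if node not in down and node not in chosen:
                chosen.append(node)
            if len(chosen) == self.replicas:
                break
        return chosen


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replicas=3)
owners = ring.preference_list("cart:alice")
# If the first owner is down, the next node around the circle steps in.
fallback = ring.preference_list("cart:alice", down={owners[0]})
```

The nice property is that adding or removing one machine only moves the keys next to it on the circle, instead of reshuffling everything.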

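Vector clocks are easier to see in code than in words. A minimal sketch, assuming a clock is just a dict from node name to counter; the function names `increment`, `descends` and `compare` are mine, not from the paper.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter bumped by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def descends(a, b):
    """True if version a has seen every update that version b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify two versions by their clocks."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    return "conflict"  # written independently; someone has to merge them


v1 = increment({}, "x")    # first write, handled by node x
v2 = increment(v1, "x")    # second write, also via node x
v3 = increment(v2, "y")    # one client updates v2 via node y
v4 = increment(v2, "z")    # another client updates v2 via node z

newer = compare(v2, v1)    # v2 replaces v1
clash = compare(v3, v4)    # neither has seen the other's update
```

When the result is a conflict, Dynamo hands both versions back to the application on the next read and lets it decide how to merge them.
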
Real World Results:

  • 99.9995% of requests got a successful response without timing out (that's CRAZY good)
  • No data loss event reported as of the paper's publication
  • Handled peak holiday shopping no problem
  • Millions of happy customers who never knew how complex it was

Why should you care? If you use:

  • Cassandra
  • Voldemort
  • Riak
  • Many other "NoSQL" databases that borrow Dynamo's ideas

You're basically using Dynamo's grandchildren. This paper changed how we build databases forever.

Fun Fact: The shopping cart example isn't random. Dynamo was built for Amazon's own core services, and the shopping cart is the one the paper keeps coming back to. They needed a way to never lose cart items even during the busiest holiday shopping days.

Happy to answer any questions! Anyone else excited about distributed systems? 😊

7mo ago
JumpyPretzel

Basically DB resiliency? Have multiple copies of data across databases? Not sure what is dramatically different here

TwirlyPancake

it is the first one that did it at scale

ZoomyUnicorn
TCS, 7mo

Can you share a link to this research paper?

JumpyWaffle

Super insightful read

DerpyCupcake

Quite interesting stuff

ZoomyBagel

I would suggest going through the HBase/Bigtable paper as well (for column families and column qualifiers)

QuirkyPretzel

Nicely explained.

If there were a follow option, I would have started following you right now.

ZestyMuffin

How does it combine all 3?
