🔁 What is Replication in Databases?
Replication is the process of copying data from one database server (the source) to one or more other servers (replicas) so that every server holds the same data, either immediately or after a short delay.
It’s used for:
- High availability (if one server fails, others still serve)
- Load balancing (read from replicas)
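- Disaster recovery (a standby copy in another location; this complements backups rather than replacing them, because deletes and corruption replicate too)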
- Geographic distribution (faster access globally)
🧠 How It Works
- One server is designated as the Primary (or Master)
- Other servers are Replicas (or Slaves, or Secondaries)
- Any changes made to the primary are replicated to the replicas
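The flow above can be sketched as a primary that records every change in an ordered log and ships unapplied entries to each replica. This is a toy in-memory model, not how any particular database implements it; the `Node` and `Primary` classes and the server names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy in-memory database server (hypothetical, for illustration only)."""
    name: str
    data: dict = field(default_factory=dict)
    applied: int = 0  # number of log entries applied so far

    def apply(self, entry):
        key, value = entry
        self.data[key] = value
        self.applied += 1


class Primary(Node):
    def __init__(self, name):
        super().__init__(name)
        self.log = []       # ordered record of every change
        self.replicas = []  # nodes that receive the log

    def write(self, key, value):
        entry = (key, value)
        self.log.append(entry)
        self.apply(entry)

    def replicate(self):
        """Ship each replica the log entries it has not applied yet."""
        for replica in self.replicas:
            for entry in self.log[replica.applied:]:
                replica.apply(entry)


primary = Primary("db-1")
replica = Node("db-2")
primary.replicas.append(replica)

primary.write("sku-42", "in stock")
primary.write("sku-42", "sold out")
primary.replicate()
```

After `replicate()` runs, the replica holds the same data as the primary. Between a `write()` and the next `replicate()`, the replica is behind, which is the lag discussed later.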
🔄 Types of Replication
1. Synchronous Replication
- Primary waits for replicas to acknowledge the write
- Guarantees that every acknowledged write already exists on the replicas, but adds latency to each write
- Suitable when data loss is unacceptable
2. Asynchronous Replication
- Primary doesn’t wait; replicas update eventually
- Faster writes, but replicas can lag, and writes not yet replicated are lost if the primary fails
3. Semi-synchronous (Hybrid)
- Primary waits for at least one replica to acknowledge
- Balance between performance and safety
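The difference between the three modes comes down to how many replica acknowledgements the primary waits for before confirming a write. The sketch below is a rough model with made-up numbers. It assumes that "synchronous" means waiting for all replicas and that replicas are contacted in parallel; the function names are hypothetical.

```python
def acks_required(mode, replica_count):
    """How many replica acknowledgements the primary waits for."""
    if mode == "sync":
        return replica_count          # assumption: wait for every replica
    if mode == "semi-sync":
        return min(1, replica_count)  # at least one replica
    if mode == "async":
        return 0                      # do not wait at all
    raise ValueError(f"unknown mode: {mode}")


def commit_latency_ms(mode, local_ms, replica_rtts_ms):
    """Estimated time until the client sees the write confirmed."""
    needed = acks_required(mode, len(replica_rtts_ms))
    if needed == 0:
        return local_ms
    # Replicas are contacted in parallel, so the wait is set by the
    # slowest of the `needed` fastest acknowledgements.
    return local_ms + sorted(replica_rtts_ms)[needed - 1]


rtts = [5, 40, 120]  # illustrative round-trip times to three replicas, in ms
latencies = {mode: commit_latency_ms(mode, 2, rtts)
             for mode in ("sync", "semi-sync", "async")}
```

With these illustrative numbers the model gives 122 ms for synchronous, 7 ms for semi-synchronous, and 2 ms for asynchronous writes, which shows why one distant replica dominates synchronous write latency.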
📦 Examples in Popular Databases
| DB | Replication Type | Tools |
|---|---|---|
| MySQL | Source–replica (formerly master–slave), Group Replication | Built-in |
| PostgreSQL | Streaming replication | pg_basebackup, replication slots |
| MongoDB | Replica sets | Native |
| Cassandra | Peer-to-peer (tunable consistency, eventual by default) | Automatic, no master |
| Redis | Master–replica | Built-in |
| CockroachDB | Consensus-based (Raft), strongly consistent, multi-region | Automatic |
⚖️ Read/Write Behavior
| Role | Responsibility |
|---|---|
| Primary | Handles all writes, plus reads that must see the latest data |
| Replica | Can serve read-only queries to offload traffic |
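A minimal sketch of this split, assuming a hypothetical `Router` that sends writes and critical reads to the primary and rotates other reads across replicas. The "starts with SELECT" check is deliberately naive (it would misroute statements such as `SELECT ... FOR UPDATE`), so treat it as an illustration rather than a production rule.

```python
import itertools


class Router:
    """Hypothetical query router: writes to the primary, reads round-robin over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql, critical=False):
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and not critical and self._replicas is not None:
            return next(self._replicas)
        # Writes, critical reads, and the no-replica case all go to the primary.
        return self.primary


router = Router("primary", ["replica-1", "replica-2"])
targets = [
    router.route("SELECT * FROM products"),
    router.route("SELECT * FROM products"),
    router.route("INSERT INTO orders VALUES (1)"),
    router.route("SELECT balance FROM accounts", critical=True),
]
```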
💥 Why Use Replication?
| Benefit | Description |
|---|---|
| 🔒 Fault Tolerance | If the primary fails, replicas can take over |
| 🚀 Performance | Scale reads across multiple replicas |
| 🌍 Geo Distribution | Serve users from nearest replica (low latency) |
| 📦 Backups & Analytics | Run heavy queries on replicas without affecting primary |
⚠️ Challenges
- Lag: Replicas may be slightly behind (in async setups)
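- Consistency: Reads from a replica may return stale data
- Conflict resolution: Multi-master systems must reconcile concurrent writes to the same data
- Network partition: Nodes that cannot reach each other may both accept writes (split-brain), which can lead to data loss if not handled properly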
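One common way to cope with lag and stale reads is to track how far each replica has applied the primary's log, and to read from a replica only when it has caught up to the client's last write. The sketch below assumes log positions are plain integers; the function names and positions are hypothetical.

```python
def lag(primary_pos, replica_positions):
    """Replication lag per replica, measured in log entries."""
    return {name: primary_pos - pos for name, pos in replica_positions.items()}


def choose_read_node(last_write_pos, replica_positions, primary="primary"):
    """Return a replica that has applied at least `last_write_pos`, else the primary."""
    for name, applied_pos in replica_positions.items():
        if applied_pos >= last_write_pos:
            return name
    return primary


positions = {"replica-1": 95, "replica-2": 100}  # illustrative applied positions
current_lag = lag(100, positions)
fresh_enough = choose_read_node(98, positions)   # client last wrote at position 98
fallback = choose_read_node(101, positions)      # no replica has caught up yet
```

Here `replica-1` is five entries behind, so a client whose last write was at position 98 is sent to `replica-2`, and a client at position 101 falls back to the primary.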
🔄 Advanced Forms
- Multi-master replication: All nodes accept writes (harder to manage consistency)
- Bidirectional replication: Two nodes replicate to each other
- Sharded + Replicated: Data is split into shards and each shard is replicated; used in high-scale systems like MongoDB or Cassandra
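To illustrate why multi-master consistency is harder, here is a sketch of one simple conflict-resolution strategy, last-write-wins. It is only one of several possible strategies and it silently discards the older of two concurrent writes. The state layout (key mapped to a `(timestamp, node_id, value)` tuple) is an assumption made for this example.

```python
def merge_lww(a, b):
    """Merge two node states; each maps key -> (timestamp, node_id, value)."""
    merged = dict(a)
    for key, candidate in b.items():
        current = merged.get(key)
        # Compare by timestamp first, then node_id as a deterministic tie-breaker.
        if current is None or candidate[:2] > current[:2]:
            merged[key] = candidate
    return merged


node_a = {"cart:7": (10, "a", ["book"])}
node_b = {"cart:7": (12, "b", ["book", "pen"]), "cart:9": (3, "b", [])}

merged_ab = merge_lww(node_a, node_b)
merged_ba = merge_lww(node_b, node_a)
```

Both merge orders produce the same result, so the two nodes converge, but the write made on node `a` at timestamp 10 is lost.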
🧠 Example Scenario
You run an e-commerce site. The primary database handles all orders and payments. Replicas serve read-only traffic like product listings and search, ensuring the site remains fast even under load. If the primary crashes, a replica can be promoted with minimal downtime.
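The promotion step in this scenario can be sketched as picking the healthy replica that has applied the most of the primary's log. The data layout, names, and positions below are hypothetical; real failover tooling also has to fence off the old primary and redirect clients.

```python
def promote(replicas):
    """Pick the healthy replica with the highest applied log position.

    `replicas` maps name -> {"healthy": bool, "applied": int}.
    """
    candidates = {n: r for n, r in replicas.items() if r["healthy"]}
    if not candidates:
        raise RuntimeError("no healthy replica available for promotion")
    return max(candidates, key=lambda n: candidates[n]["applied"])


def lost_writes(primary_pos, promoted_pos):
    """Writes the failed primary had confirmed that the new primary never received."""
    return max(0, primary_pos - promoted_pos)


replicas = {
    "replica-1": {"healthy": True, "applied": 980},
    "replica-2": {"healthy": True, "applied": 1000},
    "replica-3": {"healthy": False, "applied": 1005},
}
new_primary = promote(replicas)
missing = lost_writes(1005, replicas[new_primary]["applied"])
```

In this example `replica-2` is promoted, and if the old primary had reached position 1005 under asynchronous replication, five order or payment writes would be missing after failover.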
TL;DR
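Replication copies data across multiple servers to protect against loss, keep the service available when a server fails, and scale read traffic, including for users in different regions.
The trade-off is between write latency and consistency: synchronous modes are safer but slower, while asynchronous modes are faster but can serve stale data or lose recent writes.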