Every database has to solve the same basic problem.
Data lives on disk, and accessing disk is slow. Every read and every write eventually has to reach the disk, and how a database organizes data on that disk determines everything about its performance.
Over decades of research, two dominant approaches have emerged.
-
B-Trees keep data sorted on disk so reads are fast, but pay for it on every write.
-
LSM Trees buffer writes in memory and flush them to disk in bulk, making writes cheap but reads more expensive.
Neither approach is better. They represent two different approaches, and understanding the tradeoff between them is one of the most useful mental models in system design.
In this article, we will look at B-Trees and LSM trees in detail, along with the trade-offs associated with each of them.



