I'm increasingly seeing AI-generated codebases around me.
These codebases are usually broken in fundamental ways. The mistakes are the kind someone makes when they lack a basic understanding of how things work and how they connect together.
The solution is simple - study. And study well.
Classics
- Designing Data-Intensive Applications by Martin Kleppmann
  - THE book for an introduction to data systems.
  - Provides a great tour of the data storage landscape: systems, tradeoffs, etc.
  - It doesn't go very deep, but it is a great book for finding out what you don't know, and what you didn't know you needed to know.
PostgreSQL
I like Postgres.
The Art of PostgreSQL
- Not really a classic, but still an incredible book
- As the name suggests, it teaches you to write beautiful queries
PostgreSQL 14 Internals
- Starts with an overview, then immediately dives into the details
- The knowledge is highly transferable, and leads to better engineering in general
Data modelling and governance
- The Enterprise Data Catalog by Ole Olesen-Bagneux (O'Reilly)
- The Data Warehouse Toolkit by Ralph Kimball
  - One of the classics, and an essential read. The world has moved on in some respects, but the book's ideas still have staying power.
Database Engineering
- Data Structures for Data-Intensive Applications: Tradeoffs and Design Guidelines by Manos Athanassoulis, Stratos Idreos and Dennis Shasha
Kafka
- Effective Kafka: A Hands-On Guide to Building Robust and Scalable Event-Driven Applications with Code Examples in Java
  - A good book to skim through to get up to speed on Kafka.