Data and Computing books list

2024 Oct 07

I'm increasingly seeing AI-generated codebases around me.

These codebases are usually broken in fundamental ways. The mistakes are the kind someone makes when they lack a basic understanding of how things work and how they connect together.

The solution is simple - study. And study well.

Classics

  • Designing Data-Intensive Applications by Martin Kleppmann
    • THE book for an introduction to data systems.
    • Provides a great tour of the data storage landscape: systems, tradeoffs, etc.
    • It doesn't have much depth. But it is a great book for finding out what you don't know, and what you didn't know you needed to know.

PostgreSQL

I like Postgres.

  • The Art of PostgreSQL
    • Not really a classic, but still an incredible book
    • Like the name says, it leads to beautiful queries
  • PostgreSQL 14 Internals
    • Starts with an overview, then immediately dives deep
    • The knowledge is highly transferable, and leads to better engineering in general

Data modelling and governance

  • The Enterprise Data Catalog by Ole Olesen-Bagneux (O'Reilly)
  • The Data Warehouse Toolkit by Ralph Kimball
    • One of the classics, and an essential read. The world has moved on in some regards, but the book's knowledge still has staying power.

Database Engineering

  • Data Structures for Data-Intensive Applications: Tradeoffs and Design Guidelines by Manos Athanassoulis, Stratos Idreos and Dennis Shasha

Kafka

  • Effective Kafka: A Hands-On Guide to Building Robust and Scalable Event-Driven Applications with Code Examples in Java
    • A good book to skim through for Kafka.