Git based development - Standards and Etiquette

2024 Aug 28  |  9 min read  |  tags: git (1) conventions (1)

Work in progress

Branches

First decide what kind of development you will be following:

  • Trunk based development
  • Git flow development

Branch naming rules are simple: you can only use numbers, lowercase letters, and hyphens (-), plus a forward slash (/) to separate the branch-type prefix from the description. Use hyphens to separate words in the branch name. The validation regex looks like this: ^[a-z0-9\-\/]+$.
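As a minimal sketch, the rule above could be checked in Python like this. The function name is_valid_branch_name is hypothetical; the pattern is the regex quoted above, used unchanged.

```python
import re

# The validation regex quoted above, unchanged. Inside a character class the
# escapes on '-' and '/' are not required in Python, but they are harmless.
BRANCH_NAME_PATTERN = re.compile(r"^[a-z0-9\-\/]+$")


def is_valid_branch_name(name: str) -> bool:
    """Return True if the name uses only lowercase letters, digits, '-' and '/'."""
    return BRANCH_NAME_PATTERN.fullmatch(name) is not None
```

Note that version-style names such as release/0.1 contain a dot, so they would not pass this pattern as written. If you want dotted release versions, the character class would need to allow the dot as well.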

Git flow based development

A project of this type will always have 2 core branches:

  1. main - Contains production-ready code. Used to deploy to production (through CI-CD)
  2. dev - Used to develop new features

Optionally, the codebase might also have a release branch. This stores milestones in the project. Release branches are well tested, rock solid, and dependable.

All other branches that you'll create will follow a specific nomenclature. Every branch's name tells the purpose of the branch. This way, you can look at a branch and immediately tell why it exists and what it does.

  • release/<release-version>
    • Used to store milestone releases of the software. It is a stable, tested snapshot of the codebase from main or dev, that can be deployed in production. Usually a culmination of many feature branches.
    • Examples:
      • release/0.1
      • release/0.2
      • release/1.0
      • release/1.1
  • feature/<feature-description>
    • Used to develop new features. Based on a commit in the dev branch.
    • Examples:
      • feature/google-authentication
      • feature/elasticsearch-integration
      • feature/stripe-payment-gateway
  • bugfix/<bug-description>
    • Used to fix discovered bugs. Based on a commit in the dev branch.
    • Examples:
      • bugfix/fix-api-token-retrieval
      • bugfix/fix-pipeline-timeout
      • bugfix/fix-tablet-layout
  • hotfix/<problem-description>
    • Urgent fix rolled out to stop a fire in production. Typically deployed without testing, when the aim is to stop the fire asap. Usually based on a commit in main or release/ branch.
    • Examples:
      • hotfix/fix-critical-memory-leak
      • hotfix/fix-api-token-login
      • hotfix/fix-payment-processor-error
  • experiment/<experiment-description>
    • Branches used to test new ideas (carry out experiments). Can be abandoned if the experiment fails. Usually based on a commit in dev or feature/ branch.
    • Examples:
      • experiment/claude-integration
      • experiment/dark-mode-prototype
      • experiment/queue-based-inference-service
  • docs/
    • Holds the documentation of the project. Could be based on MkDocs, Sphinx, etc.
    • docs branches are very rare. Have never seen it in commercial codebases. But that's just my personal experience.
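The naming scheme above can be expressed as a small lookup table. This is only a sketch: the names USUAL_BASES and usual_bases are hypothetical, and the table simply restates the "based on" notes from the list (docs/ is left out because the list does not say what it is based on).

```python
# Branch-type prefix -> branches it is usually based on, as described above.
USUAL_BASES = {
    "release": ("main", "dev"),
    "feature": ("dev",),
    "bugfix": ("dev",),
    "hotfix": ("main", "release/"),
    "experiment": ("dev", "feature/"),
}


def usual_bases(branch_name: str) -> tuple[str, ...]:
    """Return the usual base branches for a branch, or () if the prefix is unknown."""
    prefix, separator, description = branch_name.partition("/")
    # A name without "prefix/description" shape (e.g. "main") has no entry.
    if not separator or not description:
        return ()
    return USUAL_BASES.get(prefix, ())
```

A helper like this could back a pre-push check or a PR template, for example to warn when a hotfix branch was cut from dev instead of main.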

In a Git flow based workflow, the code flow between branches looks like this:

  • If there is CI-CD, it is set up on the main branch.
  • If there is no CI-CD, then the release branch is used for deployments.

  • Challenges:

    • If developing a feature takes a long time, many commits will have been made to the parent dev branch in the meantime, and the feature branch will fall behind it. So, before finally merging the feature code back into dev, we have to pull the latest changes from dev. This usually causes merge conflicts, which take time and coordination to resolve.
    • Sometimes merging branches to form a release branch is a major task.

Trunk based development

Trunk based development flow looks like this:

  • The main branch is kept production ready. The CI-CD pipeline watches this branch. Every time a new commit comes in, the build pipeline tests it and deploys it.
  • Trunk based development is heavily reliant on having a rock solid test suite. If you don't have dependable tests, you cannot do trunk based development. Trunk based development demands competence from your team, and confidence in your test suite.
  • Feature branches are typically short lived, and are quickly merged back into main.
  • CI-CD is set up on main. The code is sent through the automated build pipelines, and if the build succeeds, it is deployed.
  • This flow makes heavy use of feature flags. Features are enabled/disabled on production using feature flags.
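To illustrate the feature-flag point, here is a minimal sketch. It assumes flags are read from environment variables named FEATURE_<NAME>, which is just one possible convention; the functions is_feature_enabled and checkout and the flag name new_checkout are hypothetical. Real projects often use a flag service or a config store instead.

```python
import os


def is_feature_enabled(flag_name: str, default: bool = False) -> bool:
    """Read a flag from the environment (assumed convention: FEATURE_<NAME>)."""
    raw = os.environ.get(f"FEATURE_{flag_name.upper()}")
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def checkout(cart_total: float) -> str:
    # The new code path is already merged into main, but it stays dark in
    # production until the flag is switched on.
    if is_feature_enabled("new_checkout"):
        return f"new checkout flow: {cart_total:.2f}"
    return f"legacy checkout flow: {cart_total:.2f}"
```

This is what lets short-lived branches merge into main before a feature is finished: the unfinished code ships, but nobody can reach it until the flag is enabled.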

Commit messages

Just think - when do you actually look at commit messages? Most of the time, the answer is:

  • When something has gone wrong and you're trying to find out when it went wrong, by reading through the history of changes.
  • When you want to create a new branch, and are looking for a particular commit.

For this, you need commit messages that actually tell what was done to the codebase in each commit. More specifically, you need commit messages that tell which modules/files were affected, and what change was made in them.

You're telling what changes you did. So the template of a commit message is simple:

<area you affected>: <work you did>

Examples:

  • Readme: Added infrastructure information into architecture section
  • utils: Added docstrings to existing functions
  • cicd: added testing step to build process
  • payments: refactored code, removed unused functions
  • cdc module: added capability to set batch size using enums
  • supernet pipelines: improved performance by optimizing joins with dim tables

As you can see, the area you affected could be a single file, or a module, or something else. But it usually is a single thing that you can put into words.
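If you want to enforce this template, for example in a commit-msg hook, the check could look roughly like the sketch below. The names SUMMARY_PATTERN and parse_summary are hypothetical, and the sketch assumes the area itself never contains a colon. It only looks at the first line of the message and accepts the colon with or without a following space.

```python
import re
from typing import Optional, Tuple

# "<area you affected>: <work you did>", with optional spaces around the colon.
SUMMARY_PATTERN = re.compile(r"^(?P<area>[^:\n]+?)\s*:\s*(?P<work>\S.*)$")


def parse_summary(message: str) -> Optional[Tuple[str, str]]:
    """Split the first line of a commit message into (area, work).

    Returns None when the first line does not follow the template.
    """
    lines = message.splitlines()
    if not lines:
        return None
    match = SUMMARY_PATTERN.match(lines[0].strip())
    if match is None:
        return None
    return match.group("area"), match.group("work").strip()
```

For instance, parse_summary("utils: Added docstrings to existing functions") yields the pair ("utils", "Added docstrings to existing functions"), while a message like "fixed stuff" yields None.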

If you think to yourself "I can't do this. I've worked on too many things. It doesn't fit into a single area affected.", then my friend, your commit is too large. You need to break it down into smaller commits.

Some more points:

Writing paragraphs in the description is fine. This happens often in open source projects.

But always make sure you start with a summary line, followed by a blank line (Git and most tools treat the first line as the commit's subject). It can look like this:

<area you affected>: <work you did>

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi lectus turpis, commodo ut odio a, pretium volutpat justo. Curabitur eros nisl, consequat id consequat et, bibendum at elit. Aliquam fermentum urna in tellus egestas, at faucibus orci tincidunt. Duis et nibh leo. Curabitur varius bibendum ex, ac tincidunt neque suscipit a. Praesent ullamcorper sollicitudin tristique. Praesent in ante turpis.

Pull requests

Use pull requests when you are committing to branches that have other contributors (shared branches). Don't directly commit code to these shared branches.

This serves 4 purposes:

  1. The person whose code you've affected gets to know the changes you made. This person can then decide if the change is acceptable to them or not. If not, they will give comments on what needs to be fixed before they will accept the code.
  2. You can get feedback from more experienced programmers in your team. In PR review, they can comment on things like better ways of implementing the logic, coding standards you didn't follow, etc. It is great for learning, and helps to maintain code quality and standards.
  3. Having someone else go through your code during a PR review is great for finding mistakes you might have missed or bugs you might have mistakenly created. While working with code, it is easy to develop tunnel vision and lose sight of mistakes. Having someone else go through your code helps solve this problem.
  4. If you're new to the codebase, this protects you from making mistakes. When working with large codebases, you often don't know the ins and outs of a module and its interactions with the rest of the system. In this case, you get help from an existing developer who knows these things.

So, it is fine if you directly commit to branches that only you work on. Eg - feature/, bugfix/ etc. But when you are merging this branch back into a parent branch (eg - dev), create a pull request and add relevant people to review the code.

Example scenarios:

  • Suppose your branch has made changes to code in the payment-processor module. In the pull request, add the owner of the payment-processor module as the reviewer. The code gets merged only if this person approves.
  • Your colleague is working on a feature branch. You got an idea and created an experiment branch based on the feature branch. Your experiment was successful, and now you want to merge this code into the parent feature branch. In this case, don't directly commit to your colleague's feature branch. Create a pull request to merge code of your experiment branch into your colleague's feature branch. Assign your colleague as the reviewer of the pull request.
  • You worked on a feature branch. The feature development is complete and you want the feature to be included into the dev branch. Create a pull request to merge the code of your feature branch into the dev branch, and add relevant people as reviewers (usually people who own the modules that your branch modified).
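Picking reviewers based on module ownership, as in the scenarios above, can be sketched like this. The ownership map, the path prefixes, and the handles alice and bob are all made up for illustration, and reviewers_for is a hypothetical helper. Hosting platforms usually offer their own mechanism for this, so treat it as a picture of the idea rather than something to deploy.

```python
# Hypothetical ownership map: path prefix -> owner handle.
MODULE_OWNERS = {
    "payment_processor/": "alice",
    "search/": "bob",
}


def reviewers_for(changed_files: list[str], author: str) -> list[str]:
    """Return the owners of every module touched by the changed files.

    The PR author is excluded, since you cannot meaningfully review yourself.
    """
    owners = {
        owner
        for path in changed_files
        for prefix, owner in MODULE_OWNERS.items()
        if path.startswith(prefix)
    }
    owners.discard(author)
    return sorted(owners)
```

With this map, a pull request from a third developer that touches payment_processor/api.py and README.md would get alice as its reviewer, matching the first scenario.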