It's always exciting to see new approaches to code reviews - GitHub has its strengths, but it’s far from perfect.
For the scenario you’ve outlined, have you thought about splitting the 3 patches into separate, dependent pull requests? While GitHub doesn’t natively support this, the right code review tool (shameless plug - I’m part of a team building one called GitContext) should allow you to keep pull requests small while maintaining dependencies between them. For example, patch 3 can depend on patch 2, which in turn depends on patch 1. The dependency tracking between them - provided by the code review tool - can ensure everything is released in unison if that's required.
Each patch can then be reviewed on its own, making feedback more targeted and easier to respond to. You can even squash commits within a pull request, ensuring a clean commit history with messages that accurately reflect the individual changes. Better still, with the right tool, you can use AI to summarize your pull request and review, streamlining the creation of accurate commit messages without all the manual effort.
A good code review tool also won’t get bogged down by git operations like rebases, merges, or force pushes. Reviewers should always see only the changes since their last review, no matter how many crazy git operations happen behind the scenes. That way, you avoid having to re-review large diffs and can focus on what’s new. The review history stays clean, separate from the commit history.
I'd be curious if this approach to splitting up pull requests and tracking their inter-dependencies would address your needs?
> It's always exciting to see new approaches to code reviews - GitHub has its strengths, but it’s far from perfect.
This is nice sentiment, it's positive reception to an idea and polite to the incumbent.
But it's so thoroughly not a new idea. It's literally the workflow git was designed to support, and is core to many long-standing criticisms about GitHub's approach for as long as GitHub has had pull requests.
And I'm over here wondering why this idea took *checks calendar* over 15 years to graduate from the denigrated mailing list degens and into hip trendy development circles.
I thought we were knowingly choosing shit workflows because we had to support the long-standing refusal by so many software devs to properly learn one of their most-used tools. That's why I chose the tools I chose, and built the workflows I built, when I migrated a company to git. Nobody gets fired for buying IBM after all.
I mean, the answer is simple. Even if email-based flows use range-diff, which is the correct conceptual model, all the actual details of using email are, I would estimate, about 1,000x shittier in 2024 than using GitHub in 2008 when I signed up for the beta as user #3000-something.
Email flows fucking suck ass. Yes I have used them. No, I won't budge on this, and no, I'm not going to go proselytize on LKML or Sourcehut or whatever about it, in Rome I'll just do as the Romans even if I think it sucks. But I've used every strategy you can think of to submit patches, and I can't really blame anyone for not wading through 500 gallons of horrendous bullshit known as the mailing list experience in order to glean the important things from it (like range-diff), even if I'm willing to do it because I have high pain tolerance or am a hired gun for someone's project.
Also, to be fair, Gerrit was released in 2009, and as the creator of ReviewBoard (in this thread!) also noted it supports interdiffs, and supported them for multiple version control backends, was released in 2006! This was not a totally foreign concept, it's just that for reasons, GitHub won, and the defaults chosen by the most popular software forge in history tend to have downstream cultural consequences, both good and bad, as you note.
> all the actual details of using email are, I would estimate, about 1,000x shittier in 2024 than using GitHub in 2008
Disagree about "all". Tracking patches in need of review is better done in a good MUA than on github. I can suspend a review mid-series, and continue with the next patch two days later. Writing comments as manually numbered, plaintext paragraphs, inserted at the right locations of the original patch is also lightyears better than the clunky github interface. For one, github doesn't even let you attach comments to commit message lines. For another, github's data model ties comments to lines of the cumulative diff, not to lines of specific patches. This is incredibly annoying, it can cause your comment for patch X to show up under patch Y, just because patch Y includes context from patch X.
Edited to add: github also has no support for git-notes. git-notes is essential for maintaining patch-level changelogs between rebases. Those patch-level changelogs are super helpful to reviewers. The command line git utilities, such as git-format-patch, git-rebase, git-range-diff, all support git-notes.
Dude, I'm not making a defense of mailing list workflows here. I'm just pondering the nature of the world where despite all the yapping about git I've seen floating around on the internet for as long as I've been lurking social media, the yappers are just recently keying in on something.
If you're asking "Why did this take 15 years for people to understand" and my reply is "Because it was under 1000 layers of other bullshit", then that's the answer to your pontification. It has nothing to do with whether you think email is good or not. You pondered, I answered. That simple.
> Because it was under 1000 layers of other bullshit
Not only because of that.
git-range-diff, while absolutely a killer feature, is a relatively new feature of git as well (a bit similarly to "git rebase --update-refs" -- which I've just learned of from you <https://news.ycombinator.com/item?id=41511241>, so thanks for that :)).
(FWIW, before git-range-diff was a thing, and also before I had learned about git-tbdiff, I had developed a silly little script for myself, for doing nearly the same. Several other people did the same for themselves, too. Incremental review was vital for most serious maintainers, so it was a no-brainer to run "git format-patch" on two versions of a series, and colordiff those. The same workflow is essential for comparing a backport to the original (upstream) version of the series. Of course my stupid little script couldn't recognize reorderings of patches, or a subject line rewrite while the patch body stayed mostly the same.)
Nope, none of it was knowingly done, and plenty of teams are almost trivially convertible to the normal workflow, even without inventing a buzzword like TFA did!
Though plenty aren't. I get it. (But one of the magic phrases that really works well is "this is what git, itself, does, and there's a man page installed on your system at this very moment explaining it")
As far as I know, splitting the series into individual PRs only works if you have commit rights to the repository, so you can base one PR on a different branch (in the main repository) than main.
As an outside contributor, with a fork of the repository, your three PRs will incrementally contain change A, A+B, and A+B+C, making the review of the last two PRs harder, because you need to review diffs for code you're already reviewed in another PR.
Not sure about the fork workflow but otherwise it is possible to change the base branch (manually on GH’s web interface) so that you don’t have to see the original branches and review the changes from A to B and from B to C. Maybe this is not possible with fork workflows?
What's the point of keeping track of commits? I honestly never understood people wanting that. Is this for some kind of weird accounting / social-score system where the number of commits decides your yearly bonus?
It's useful to see how the system evolved (because you might want to go back a bit and redo the newer stuff), but it's pointless to see the mistakes made along the way, for example, unless you have some administrative use for that.
Similarly, if a sequence of commits doesn't make sense as committed, but would make better sense if split into a different sequence: then I see no problem doing that. What's the point of keeping history in a bad shape? It's just harder to work with, if it's in a bad shape, but gives no practical advantages.
Not only do I think that's a pipe dream... I think it's technically impossible... I mean, diff has to show also what happened before whatever change took place. How do they expect not to see what was replaced? Or maybe I just don't understand what they mean by "changes since their last review".
Why not just do good old mergetrains with pullrequest A points to branch B amd then B points to master, merge B into master and thereafter point A back to master or am I missing the point?
This is called "stacked diffs" and it's a good workflow; the issue is that it's annoying to use on GitHub without tooling. The "point A back to master" bit isn't easy/obvious with pull requests.
From the peanut gallery of HN I’ve never understood Stacked Diffs. It looks like they reinvented commits as dependent PRs. Which are stacked on top of each other like commits are.
Fun fact: part of the reason that this article is on HN, I believe, is because I linked it to someone on another site as a means of explaining stacked diffs.
> It looks like they reinvented commits as dependent PRs.
Sort of kind of. It really depends on what you mean by PR: if we're talking about "review this branch please," which is what I would argue most mean by PRs, then yes, in the context of "stacked PRs for GitHub" it's largely about tooling that makes dependent PRs easier.
But there are other, non-GitHub tools. With those tools, you don't say "here's a branch, review it please," you say "here is a stack of commits, review them please." There's no branch going on inside. It's just a sequence of commits. This matters because it centers the commit as the unit of code review, not a branch. This also means that you can merge parts of the stack in at different times: to use the example from the article, once "small refactor" is good to go, it can be landed while "new API" is awaiting review. etc.
I think it takes actually using some of these tools to really "get it." I never understood them either, until I actually messed around with them. I am currently on my project solo, so I don't really stack at the moment, I think it really helps more the larger of a team you're working with is.
Oh hey, thanks for the explanation! I’ve been wondering about this for a while. The linked articles on HN tended to be heavy on arguing how unlike the workflow is to what “you are used to” that the description of what it was about got obfuscated.
I’ve used Git with email a little bit which also lets you review commits in isolation. It’s too bad that so many review tools bury the commits (ask me how many times someone at work has asked “what this is about” on the PR diff when the relevant commit message explains exactly that).
But what I like about email is that the whole series/PR also gets reviewed as a unit. Both worlds.
For the scenario you’ve outlined, have you thought about splitting the 3 patches into separate, dependent pull requests? While GitHub doesn’t natively support this, the right code review tool (shameless plug - I’m part of a team building one called GitContext) should allow you to keep pull requests small while maintaining dependencies between them. For example, patch 3 can depend on patch 2, which in turn depends on patch 1. The dependency tracking between them - provided by the code review tool - can ensure everything is released in unison if that's required.
Each patch can then be reviewed on its own, making feedback more targeted and easier to respond to. You can even squash commits within a pull request, ensuring a clean commit history with messages that accurately reflect the individual changes. Better still, with the right tool, you can use AI to summarize your pull request and review, streamlining the creation of accurate commit messages without all the manual effort.
A good code review tool also won’t get bogged down by git operations like rebases, merges, or force pushes. Reviewers should always see only the changes since their last review, no matter how many crazy git operations happen behind the scenes. That way, you avoid having to re-review large diffs and can focus on what’s new. The review history stays clean, separate from the commit history.
I'd be curious if this approach to splitting up pull requests and tracking their inter-dependencies would address your needs?