The purpose of snapshot testing is to make sure there are no observable changes when you think there shouldn't be any. To that end, a pattern I like is:
- Don't commit the snapshot to the repo and maintain it with an "update" command. Instead, your CI/CD should run both versions of the software (before and after the change) and diff their output. That eliminates a lot of the toil.
- You should have a completely trivial way to mark that a given PR intends to have observable changes. That could be a tag on GitHub, a square-bracket thing in a commit message, etc. Details don't matter a ton. The point is that the test only catches that something changed; a person still needs to decide whether that change is appropriate. That happens often enough that you should make the process easy.
- Culturally, you should split out PRs which change golden-affecting behavior from those which don't. A bundle of bug fixes, style changes, and a couple new features is not a good thing to commit to the repo as a whole.
The net effect is that:
1. Performance improvements, unrelated features, etc. are tested exactly as you expect. If your perf enhancement changes behavior, that was likely wrong, and the test caught it. If it doesn't, the test gives you confidence that it really doesn't.
2. Legitimate changes to the golden behavior are easy to institute. Just toggle a flag somewhere and say that you do intend for there to be a new button or new struct field or whatever you're testing.
3. You have a historical record of the commits which actually changed that behavior, and, because of the cultural shift I proposed, they're all small changes. Bisecting or otherwise diagnosing tricky prod bugs becomes trivial.
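
To make that concrete, here's a rough sketch of the CI check in Python. It's one possible shape, not a definitive implementation: the `[changes-output]` marker and the function names `diff_outputs` and `check` are made up for illustration, and it assumes your CI has already run the base and head builds and captured their output as strings.

```python
from __future__ import annotations

import difflib

# Hypothetical marker; a GitHub label or any other convention works just as well.
INTENT_MARKER = "[changes-output]"


def diff_outputs(base_output: str, head_output: str) -> list[str]:
    """Return a unified diff between the base build's and head build's output."""
    return list(
        difflib.unified_diff(
            base_output.splitlines(),
            head_output.splitlines(),
            fromfile="base",
            tofile="head",
            lineterm="",
        )
    )


def check(base_output: str, head_output: str, commit_message: str) -> tuple[bool, str]:
    """Decide whether the check passes, and explain why.

    Assumption: both outputs were produced by running the two versions of the
    software on the same inputs earlier in the CI job.
    """
    diff = diff_outputs(base_output, head_output)
    if not diff:
        return True, "no observable changes"
    if INTENT_MARKER in commit_message:
        # The author declared the change; a reviewer still has to judge the diff.
        return True, "observable changes, marked as intended:\n" + "\n".join(diff)
    return False, "unexpected observable changes:\n" + "\n".join(diff)
```

A CI step would call `check(...)` with the two captured outputs and the PR's commit message, print the returned reason, and fail the job when the first element is `False`. No snapshot file is ever stored; the base branch itself is the golden.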