Message queues (e.g. SQS) are a poor fit for tracking long-running tasks/workflows, because tracking comes with operational requirements such as:
- Checking the status of a task (queued, pending, failed, cancelled, completed)
- Cancelling a queued task (or pending task if the execution environment supports it)
- Re-prioritizing queued tasks
- Searching for tasks by an attribute (e.g. a tag)
We literally had a major us-east-1 incident on AWS today. The only thing we can do is sit on our butts and wait for it to end so that we can clean up. This happens every few months. I am unimpressed with the "thousands of engineers" argument.
It may be worthwhile to understand where dynamic typing actually helps, since it gets mentioned a lot. Python and other dynamic languages increasingly rely on static type checkers.
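A small illustration of that reliance, as I understand it: the Python interpreter does not enforce annotations at runtime, so the checking has to come from a separate tool such as mypy or pyright. The function name `double` is made up for the example.

```python
def double(x: int) -> int:
    """Annotated as int -> int, but the interpreter never checks this."""
    return x * 2


# Matches the annotation.
print(double(2))

# Violates the annotation, yet runs without error: str * 2 repeats the string.
# A static type checker would report the incompatible argument before the code runs.
print(double("ab"))
```

The second call is the kind of mistake that only shows up far from its cause in an untyped codebase, which is roughly why the checkers have caught on.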
I believe the problem is the lack of proper dependency indexing at PyPI. The resolvers used by Poetry, PDM, or uv often have to download multiple versions of the same dependency, just to read its metadata, before they can find a solution.
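A toy sketch of why that happens, assuming dependency metadata is only visible after fetching a specific version. The index contents, package names, and `fetch_metadata` are all hypothetical, and this is naive backtracking, not how any of the real resolvers work internally.

```python
# Made-up index: package -> version -> {dependency: set of allowed versions}.
INDEX = {
    "app": {1: {"lib": {1, 2, 3}, "util": {1}}},
    "lib": {3: {"util": {2}}, 2: {"util": {2}}, 1: {"util": {1}}},
    "util": {1: {}, 2: {}},
}

downloads = []  # every (package, version) we had to fetch


def fetch_metadata(name, version):
    """Stand-in for downloading a release just to read its dependencies."""
    downloads.append((name, version))
    return INDEX[name][version]


def resolve(requirements, pinned):
    """Naive backtracking: try the newest version first, back off on conflict."""
    if not requirements:
        return pinned
    (name, allowed), rest = requirements[0], requirements[1:]
    if name in pinned:
        # Already chosen: either it satisfies this requirement or we backtrack.
        return resolve(rest, pinned) if pinned[name] in allowed else None
    for version in sorted(allowed & set(INDEX[name]), reverse=True):
        deps = fetch_metadata(name, version)
        result = resolve(rest + list(deps.items()), {**pinned, name: version})
        if result is not None:
            return result
    return None
```

Calling `resolve([("app", {1})], {})` on this index ends up fetching three versions of `lib` before it finds one compatible with the `util` pin. With a queryable dependency index, those conflicts could be ruled out without downloading anything.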
You really do need a database for tracking tasks like this.
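A minimal sketch of what the database buys you for task tracking, using SQLite from the Python standard library. The schema, column names, and helper functions are hypothetical, and "pending" is used here to mean a task that a worker has picked up.

```python
import sqlite3

# Hypothetical schema; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tasks (
        id       INTEGER PRIMARY KEY,
        status   TEXT    NOT NULL DEFAULT 'queued',
        priority INTEGER NOT NULL DEFAULT 0,
        tag      TEXT
    )
    """
)


def enqueue(tag, priority=0):
    cur = conn.execute(
        "INSERT INTO tasks (tag, priority) VALUES (?, ?)", (tag, priority)
    )
    return cur.lastrowid


def status(task_id):
    row = conn.execute(
        "SELECT status FROM tasks WHERE id = ?", (task_id,)
    ).fetchone()
    return row[0] if row else None


def cancel(task_id):
    # Only tasks that have not been picked up yet can be cancelled here.
    cur = conn.execute(
        "UPDATE tasks SET status = 'cancelled' WHERE id = ? AND status = 'queued'",
        (task_id,),
    )
    return cur.rowcount == 1


def reprioritize(task_id, priority):
    conn.execute(
        "UPDATE tasks SET priority = ? WHERE id = ? AND status = 'queued'",
        (priority, task_id),
    )


def find_by_tag(tag):
    rows = conn.execute("SELECT id FROM tasks WHERE tag = ? ORDER BY id", (tag,))
    return [r[0] for r in rows]


def claim_next():
    # Highest priority first, oldest first among equals.
    row = conn.execute(
        "SELECT id FROM tasks WHERE status = 'queued' "
        "ORDER BY priority DESC, id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE tasks SET status = 'pending' WHERE id = ?", (row[0],))
    return row[0]
```

Each requirement becomes a one-line query or update. With a plain queue you would have to drain and re-publish messages to get the same effect. A real deployment would also need locking or transactions around `claim_next` so that two workers cannot claim the same task.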