Some parts of nand2tetris are not self-contained, in the sense that even if you study and master all the preceding content, there's no guarantee you can solve the assignment. That's why I don't like it that much.
Asynchronous and parallel programming are concepts I've never really learned, and I'm admittedly scared to use them, because I never really understand what the code is doing or even the right way to write such code. Does anyone have any good resources that helped you learn this and build a good mental model of the concepts involved?
You’ll never learn it by reading something. You have to experience it. Get your hands dirty.
Write a simple single-threaded http server that takes strings and hashes them with something slow like bcrypt with a high cost value (or, just sleep before returning).
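Something like this is enough for step one — a minimal sketch only, using the stdlib plus bcrypt; the cost of 14 and port 8000 are arbitrary, and a time.sleep(1) works fine if you'd rather skip the dependency:

    # single_threaded_server.py -- one thread, one request at a time
    from http.server import BaseHTTPRequestHandler, HTTPServer
    import bcrypt  # pip install bcrypt; or replace the hash with time.sleep(1)

    class HashHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Read the raw string from the request body
            length = int(self.headers.get("Content-Length", 0))
            data = self.rfile.read(length)
            # Deliberately slow: a high cost factor keeps the CPU busy
            digest = bcrypt.hashpw(data, bcrypt.gensalt(rounds=14))
            self.send_response(200)
            self.end_headers()
            self.wfile.write(digest)

    if __name__ == "__main__":
        # HTTPServer handles requests one after another on a single thread
        HTTPServer(("127.0.0.1", 8000), HashHandler).serve_forever()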
Write some integration tests (doesn’t have to be fancy) that hammer the server with 10, 100, 1000 requests.
Time how performant (or not) the bank of requests is.
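For the hammering and timing, a throwaway client like this works (the URL assumes the sketch above; client-side concurrency is capped so you're measuring the server, not the OS accept queue):

    # hammer.py -- fire N requests at the server and time the whole batch
    import time
    import urllib.request
    from concurrent.futures import ThreadPoolExecutor

    URL = "http://127.0.0.1:8000"

    def one_request(i):
        req = urllib.request.Request(URL, data=f"password-{i}".encode())
        with urllib.request.urlopen(req) as resp:
            return resp.read()

    def hammer(n):
        start = time.perf_counter()
        # Send requests concurrently from the client so the server is the bottleneck;
        # crank the cap up once the server can actually keep up
        with ThreadPoolExecutor(max_workers=min(n, 10)) as pool:
            list(pool.map(one_request, range(n)))
        print(f"{n} requests took {time.perf_counter() - start:.2f}s")

    if __name__ == "__main__":
        for n in (10, 100, 1000):
            hammer(n)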
Now try to write a threaded server where every request spins up a new thread.
What’s the performance like? (Hint: learn about the global interpreter lock (GIL))
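With the standard library this is nearly a one-line change: ThreadingHTTPServer spawns a thread per connection. A sketch, reusing the handler and file name from the first snippet:

    # threaded_server.py -- the stdlib spawns one thread per connection
    from http.server import ThreadingHTTPServer

    # Same handler as in the single-threaded sketch above
    from single_threaded_server import HashHandler

    if __name__ == "__main__":
        ThreadingHTTPServer(("127.0.0.1", 8000), HashHandler).serve_forever()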
Hmm, maybe you’re creating too many threads? Learn about thread pools and why they’re better for constraining resources.
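One way to bound the thread count is to override process_request so it submits to a fixed-size ThreadPoolExecutor; the body below mirrors what socketserver.ThreadingMixIn does, minus the unbounded thread creation (the pool size of 8 is just a starting point to tune):

    # pooled_server.py -- bound the number of worker threads with a pool
    from concurrent.futures import ThreadPoolExecutor
    from http.server import HTTPServer

    from single_threaded_server import HashHandler  # same handler as before

    class PooledHTTPServer(HTTPServer):
        """Dispatch each request to a fixed-size thread pool instead of a new thread."""
        pool = ThreadPoolExecutor(max_workers=8)  # tune this and re-run the hammer script

        def process_request(self, request, client_address):
            self.pool.submit(self.process_request_thread, request, client_address)

        def process_request_thread(self, request, client_address):
            # Same shape as socketserver.ThreadingMixIn.process_request_thread
            try:
                self.finish_request(request, client_address)
            except Exception:
                self.handle_error(request, client_address)
            finally:
                self.shutdown_request(request)

    if __name__ == "__main__":
        PooledHTTPServer(("127.0.0.1", 8000), HashHandler).serve_forever()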
Is performance better? Try a multiprocessing.Pool instead to defeat the GIL.
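A sketch of the multiprocessing version: the server threads stay cheap and just block on pool.apply, while the actual hashing runs in worker processes, each with its own interpreter and its own GIL:

    # process_pool_server.py -- CPU-bound hashing runs in separate processes,
    # so the GIL in the server process stops being the bottleneck
    import multiprocessing
    import bcrypt
    from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

    def slow_hash(data: bytes) -> bytes:
        # Runs inside a worker process
        return bcrypt.hashpw(data, bcrypt.gensalt(rounds=14))

    pool = None  # created under __main__ so the workers start exactly once

    class HashHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            data = self.rfile.read(length)
            # apply() blocks this request's thread, but other threads keep serving
            digest = pool.apply(slow_hash, (data,))
            self.send_response(200)
            self.end_headers()
            self.wfile.write(digest)

    if __name__ == "__main__":
        pool = multiprocessing.Pool(processes=multiprocessing.cpu_count())
        ThreadingHTTPServer(("127.0.0.1", 8000), HashHandler).serve_forever()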
Want to try async? Do the same thing! But because the whole point of async is to do as much work on one thread with no idle time, and something like bcrypt is designed to hog the CPU, you’ll want to replace bcrypt with an await asyncio.sleep() to simulate something like a slow network request. If you wanted to use bcrypt in an async function, you’ll definitely want to delegate that work to a multiprocessing.Pool. Try that next.
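Both halves of that, with the HTTP layer stripped out to keep it short: asyncio.sleep() stands in for the slow network call, and the bcrypt call is handed off via run_in_executor (a concurrent.futures.ProcessPoolExecutor rather than a multiprocessing.Pool here, since that's what the executor API accepts):

    # async_version.py -- the two flavours of "slow work" in asyncio
    import asyncio
    import time
    from concurrent.futures import ProcessPoolExecutor

    import bcrypt

    def slow_hash(data: bytes) -> bytes:
        # CPU-bound: must be pushed off the event loop's thread entirely
        return bcrypt.hashpw(data, bcrypt.gensalt(rounds=14))

    async def handle_io_bound(i: int) -> str:
        # Simulates a slow network call; the event loop runs other tasks meanwhile
        await asyncio.sleep(1)
        return f"request {i} done"

    async def handle_cpu_bound(pool: ProcessPoolExecutor, data: bytes) -> bytes:
        # run_in_executor hands the blocking call to another process and awaits it
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(pool, slow_hash, data)

    async def main() -> None:
        start = time.perf_counter()
        # 100 "slow network" requests overlap almost perfectly on one thread
        await asyncio.gather(*(handle_io_bound(i) for i in range(100)))
        print(f"io-bound batch: {time.perf_counter() - start:.2f}s")

        start = time.perf_counter()
        with ProcessPoolExecutor() as pool:
            await asyncio.gather(*(handle_cpu_bound(pool, b"pw") for _ in range(8)))
        print(f"cpu-bound batch: {time.perf_counter() - start:.2f}s")

    if __name__ == "__main__":
        asyncio.run(main())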
Learning can be that simple. Read the docs for Thread, multiprocessing, and asyncio. The Python docs are usually not long-winded and, most importantly, they're more correct than some random person vibe blogging.
I'm playing around with using hyperlinks in pdfs to get around how much the www sucks for posting serious research with serious working code.
Caveat emptor: I'm first working on getting the basic groundwork out, like a pipeline that shows what you need to do to extract a scanned pdf in a quality that tesseract can actually get text out of.
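(For anyone curious what such a pipeline might look like — this is a rough sketch, not the author's actual pipeline — pdf2image plus pytesseract covers the basics; the DPI and threshold values are placeholders, and real scans usually need deskewing and denoising on top of this.)

    # A rough scan-to-text pipeline: rasterize at high DPI, clean up, then OCR
    from pdf2image import convert_from_path   # wraps poppler's pdftoppm
    from PIL import ImageOps
    import pytesseract                         # wraps the tesseract CLI

    def pdf_to_text(path: str) -> str:
        pages = convert_from_path(path, dpi=300)   # 300+ DPI helps tesseract a lot
        text = []
        for page in pages:
            gray = ImageOps.grayscale(page)
            # Crude binarization; adjust or replace with proper preprocessing
            bw = gray.point(lambda px: 255 if px > 160 else 0)
            text.append(pytesseract.image_to_string(bw))
        return "\n".join(text)

    if __name__ == "__main__":
        print(pdf_to_text("scan.pdf"))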
> I'm first working on getting the basic groundwork out, like a pipeline that shows what you need to do to extract a scanned pdf in a quality that tesseract can actually get text out of.
Sounds like TBL's contribution doesn't suck so much after all.
I have a beginner question: can WebRTC be used as an alternative to sending base64-encoded images to a backend server for image processing? Is this approach recommended?
The OpenAI API is a pretty high-profile example of this existing in the real world. You use it in their conversations interface when you want to include images in the conversation. Discord also uses it for attachments https://discord.com/developers/docs/reference#image-data. More generally it's when you want to send image data as part of a larger JSON blob.
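In code it's just base64 plus JSON; the field names below are made up, but the value follows the data URI shape Discord documents:

    # Packing an image into a JSON payload as a base64 data URI
    import base64
    import json

    with open("photo.png", "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")

    payload = json.dumps({
        "caption": "example",                         # hypothetical field names
        "image": f"data:image/png;base64,{encoded}",  # data URI string
    })
    # `payload` can now be POSTed to a backend as part of a larger JSON body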
Say a website links to 20 images and a dozen mp3s (and other files you're not interested in). How would you grab these files all at once without an extension like DownThemAll?