Hacker News

If you're familiar with Go, there's Colly too [1]. I liked its simplicity and approach and even wrote a little wrapper around it to run it via Docker and a config file:

https://gotripod.com/insights/super-simple-site-crawling-and...

[1] http://go-colly.org/



I used this library to get familiar with Go. It is indeed very powerful, and it makes it really easy to create a scraper.

My main concern, though, was testing. What if you want to write tests to check that your scraper still gets the data you want? Colly allows nested scraping and it's easy to implement, but you end up with all your logic in one big function, making it harder to test.

Did you find a solution to this? I'm considering switching to net/http + GoQuery just to have more freedom.
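One way to address the testability concern above is to keep the crawling callbacks thin and push the extraction logic into pure functions that take raw HTML and return data. Here's a minimal sketch; `extractTitle` is a hypothetical helper (not part of Colly or GoQuery), and a real scraper would likely use GoQuery selectors rather than string searching:

```go
package main

import (
	"fmt"
	"strings"
)

// extractTitle is a hypothetical pure parsing helper: it takes raw HTML
// and returns the <title> text. Because it does no network I/O, it can
// be unit-tested against plain string fixtures, independently of the
// crawler callbacks that call it.
func extractTitle(html string) string {
	start := strings.Index(html, "<title>")
	if start == -1 {
		return ""
	}
	start += len("<title>")
	end := strings.Index(html[start:], "</title>")
	if end == -1 {
		return ""
	}
	return strings.TrimSpace(html[start : start+end])
}

func main() {
	page := "<html><head><title>Go Tripod</title></head><body></body></html>"
	fmt.Println(extractTitle(page)) // prints "Go Tripod"
}
```

The crawler (Colly or net/http) then just fetches bytes and hands them to functions like this, so the "one big function" shrinks to glue code.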


Not yet, but my plan was to have a static HTML site that the tests could run against.



