Scraping meaningful data is totally doable, but yeah, JS-heavy sites suck. I’ve had luck with Pyppeteer (an unofficial Python port of Puppeteer).
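Rough sketch of what that looks like, in case it helps. It assumes you've done `pip install pyppeteer` (it downloads its own Chromium on first run); the function name and the example URL are just placeholders I made up, and the `networkidle2` wait is one reasonable choice, not the only one.

```python
import asyncio


async def fetch_rendered_html(url: str) -> str:
    """Load a page in headless Chromium and return the HTML after JS has run."""
    # Imported here so the module still loads if pyppeteer isn't installed yet.
    from pyppeteer import launch

    browser = await launch(headless=True)
    try:
        page = await browser.newPage()
        # Wait until the network is mostly quiet so JS-rendered content is present.
        await page.goto(url, {"waitUntil": "networkidle2"})
        return await page.content()
    finally:
        # Always close the browser, even if navigation fails.
        await browser.close()


# Example use (placeholder URL, swap in your own):
# html = asyncio.run(fetch_rendered_html("https://example.com"))
```

From there you can hand the HTML to whatever parser you already use.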
For cleanup, OpenRefine is a lifesaver—it’s like magic for messy data.
And legality? Just don’t be a jerk about it. Respect robots.txt and don’t scrape stuff behind logins unless you’re sure it’s cool.
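For the robots.txt part, Python's standard library can do the check for you. Minimal sketch below; the helper name, the `my-scraper` user agent, and the sample rules are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; for a live site you would
    # call parser.set_url(".../robots.txt") and parser.read() instead.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules just to show the behavior.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(allowed_by_robots(SAMPLE_ROBOTS, "my-scraper", "https://example.com/public/page"))
print(allowed_by_robots(SAMPLE_ROBOTS, "my-scraper", "https://example.com/private/data"))
```

The first call prints `True` and the second `False`. Passing robots.txt says nothing about the site's terms of service, so it's a floor, not a green light.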
