I was given an interesting challenge: scrape some data from a specific site. Not to write a completed, packaged solution, but rather just to scrape the data. The rub is that the site uses JavaScript paging, so one couldn't simply use something like Mechanize, which doesn't execute JavaScript. While a self-contained product would require inclusion of V8 (as the JavaScript would need to be run and evaluated), just scraping the data allows making use of whatever is easy and available.

Enter Watir. Watir allows "mechanized/automated" browser control. Essentially, we can script a browser to go to pages, click links, fill out forms, and what have you. Its mainstay is testing, but it's also pretty damned handy in cases where we need some JavaScript on a page processed... like in this case.

Keep in mind, though, that it is literally automating a browser, so you'll see your browser open and navigate to pages, etc. when the script runs. But, there is also a headless browser opti...
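
To make the "script a browser and click through JavaScript paging" idea concrete, here is a rough sketch, not the script actually used for this site. The method names (`scrape_all_pages`, `scrape_with_watir`), the `row_css` and `next_text` parameters, and the `max_pages` safety cap are all made up for illustration, and it assumes the site exposes a plain "next page" link that disappears on the last page. The Watir calls themselves (`goto`, `elements`, `link`, `exists?`, `click`, `text`, `close`) are the regular API.

```ruby
# Collect the text of every element matching `row_css`, then click the
# paging link and repeat until that link is gone. `browser` can be any
# object that answers the Watir calls used below (elements, link).
# NOTE: row_css, next_text and max_pages are hypothetical parameters.
def scrape_all_pages(browser, row_css:, next_text:, max_pages: 50)
  rows = []
  max_pages.times do
    # Grab the text of each matching element on the current page.
    rows.concat(browser.elements(css: row_css).map(&:text))

    next_link = browser.link(text: next_text)
    break unless next_link.exists?

    # Clicking is what runs the site's JavaScript paging for us.
    next_link.click
  end
  rows
end

# Drives a real browser. Needs the watir gem and a matching browser
# driver installed; the require is kept local so the paging logic
# above can be exercised without them.
def scrape_with_watir(url, row_css:, next_text:)
  require 'watir'
  browser = Watir::Browser.new :firefox
  browser.goto url
  scrape_all_pages(browser, row_css: row_css, next_text: next_text)
ensure
  browser&.close
end
```

Something like `scrape_with_watir(url, row_css: 'table tr', next_text: 'Next')` would be the entry point, with the selector and link text swapped for whatever the target page really uses. If the paging loads rows asynchronously, you would probably also need to wait for the new rows to show up after each click before reading them.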