Photo Scraping With Ruby
I love Watir, and I enjoy automating web tasks with it. Today I will show you a small script I use to download photos from a local photography community. It saves all the photos to a directory called ‘images’.
Here’s the code:
The code is pretty straightforward, so I’ll only go over the interesting bits. I started with this sandbox:
This opens a browser that I can work with to examine what I need to do. I quickly realized that all the pages I need to collect can be matched with “https://photo-forum.net/i/<digits>”, so I match those first and assign them to a variable called “links”.
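To make the link matching concrete, here is a minimal sketch that works on a plain HTML string instead of a live Watir browser. The method name `extract_photo_links` and the sample HTML are made up for illustration, and the exact regex in my script may differ slightly.

```ruby
# Hypothetical helper: collects photo page URLs of the form
# https://photo-forum.net/i/<digits> from a chunk of HTML.
def extract_photo_links(html)
  # scan returns every match as a string; uniq drops repeated links
  html.scan(%r{https://photo-forum\.net/i/\d+}).uniq
end

# Invented sample markup, only for demonstrating the match
sample_html = <<~HTML
  <a href="https://photo-forum.net/i/2100001">First</a>
  <a href="https://photo-forum.net/i/2100002">Second</a>
  <a href="https://photo-forum.net/i/2100001">First again</a>
  <a href="https://photo-forum.net/about">About</a>
HTML

links = extract_photo_links(sample_html)
puts links
```

With Watir, the same method could be fed the page source of whatever page the browser has open.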
Once I have all the links, I open each one and match the jpg files with my second regex. Finally, I use wget to download the images. The interesting part here is the -N option, which turns on timestamping: wget skips any file that is already in the directory unless the copy on the server is newer. This lets me schedule regular runs and collect images that I can review locally later.
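The second step can be sketched in the same way. The jpg regex below is an assumption, not necessarily the one from my script, and `extract_jpg_urls` and `wget_command` are hypothetical names. The sketch only builds the wget argument list; the commented line at the end shows how it could be run.

```ruby
# Hypothetical helper: finds .jpg URLs in the HTML of a single photo page.
# The pattern is an assumed one; adjust it to the site's real image URLs.
def extract_jpg_urls(html)
  html.scan(%r{https?://[^\s"'<>]+\.jpg}i).uniq
end

# Hypothetical helper: builds the wget argument list for one image.
# -N enables timestamping, -P sets the directory the file is saved into.
def wget_command(url, directory = 'images')
  ['wget', '-N', '-P', directory, url]
end

# Invented sample markup, only for demonstrating the match
page_html = '<img src="https://example.com/photos/sunset.jpg"> <img src="https://example.com/logo.png">'

extract_jpg_urls(page_html).each do |url|
  puts wget_command(url).join(' ')
  # To actually download: system(*wget_command(url))
end
```

Passing the arguments to `system` as a list rather than one interpolated string keeps odd characters in a URL away from the shell.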
That’s great for building up enough inspiration to get me out with my camera on those cold, cold days…
Enjoy the code!