CLI Data Gem Project

For the CLI Data Gem Project, part of Learn.co’s Full Stack Web Development online course, I was tasked with building a CLI (Command Line Interface) for data scraped from a public website. Being a big Star Wars fan who subscribes to all of Marvel’s Star Wars comics, I decided to go with Wookieepedia’s “Canon Comics” wiki.

Just in case you are unfamiliar with comic books, as I was before Marvel started releasing their Star Wars series, the sorts of metadata associated with each issue are:

  • Issue title
  • Series title (most issues are part of a series, which is usually focused on one character or set of characters)
  • Issue number (chronologically within the series)
  • Publication date
  • Artist info
    • Writer
    • Penciller
    • Inker
    • Letterer
    • Colorist

There are other data, but I decided I’d focus on the above when building my models.

Speaking of models, the main purpose of this project was to work with objects in Ruby. So, once I had decided to make a Star Wars Comics CLI, I started thinking about what my objects would look like.

Models

Because comic books are physical objects, and the people who make them are, well, physical humans, it was pretty easy to figure out my model scheme:

First, I would have a class called Series, representing the various comic book series you might find in the Star Wars canon. Each Series has at least one, and possibly many, constituent Issues.

Then, I would implement an Artist class, which encapsulates the idea of any artist who works on an Issue. Now, even though each artist has a different job (one writes, one draws the letters, etc.), in reality I know I’m not going to need each type represented as a different type of object, because they don’t need unique functionality when my CLI is just going to be listing them…

Actually, since this lab is all about object-oriented programming, let’s do it! So, I’m going to create subclasses of Artist: Writer, Penciller, Letterer, and Colorist. These don’t add additional functionality to Artists, but they will allow me to show that I understand inheritance.
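
To make the model scheme above concrete, here is a minimal sketch of how the classes might fit together. It is not the gem's actual code: the method names (add_issue, add_artist, role), the keyword arguments, and the placeholder titles are all assumptions made up for illustration.

```ruby
# Hypothetical sketch of the models; names and attributes are assumptions,
# not the gem's real implementation.

# A Series owns an ordered collection of Issues.
class Series
  attr_reader :title, :issues

  def initialize(title)
    @title = title
    @issues = []
  end

  # Adds an issue once and returns it.
  def add_issue(issue)
    @issues << issue unless @issues.include?(issue)
    issue
  end
end

# An Artist is anyone credited on an Issue.
class Artist
  attr_reader :name, :issues

  def initialize(name)
    @name = name
    @issues = []
  end

  # The job title comes from the subclass name, e.g. "Writer".
  def role
    self.class.name
  end
end

# The subclasses add no behavior; they only mark the kind of artist.
class Writer < Artist; end
class Penciller < Artist; end
class Letterer < Artist; end
class Colorist < Artist; end

# An Issue belongs to one Series and has many Artists.
class Issue
  attr_reader :title, :series, :number, :publication_date, :artists

  def initialize(title:, series:, number:, publication_date: nil)
    @title = title
    @series = series
    @number = number
    @publication_date = publication_date
    @artists = []
    series.add_issue(self) # keep both sides of the relationship in sync
  end

  # Credits an artist on this issue and records the issue on the artist.
  def add_artist(artist)
    @artists << artist
    artist.issues << self
    artist
  end
end

# Usage with placeholder data (not real comic metadata):
series = Series.new("Example Series")
issue = Issue.new(title: "Example Issue", series: series, number: 1)
issue.add_artist(Writer.new("Example Writer"))
puts issue.artists.map { |a| "#{a.role}: #{a.name}" }
```

Because role is derived from the class name, listing credits needs no per-subclass code, which is about all the subclasses buy me in a CLI that only lists artists.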

Scraping

Man, this turned out to be the worst part. I should have looked at Wookieepedia’s HTML before deciding to use it! All I can say is, its editors don’t follow best practices when it comes to HTML/CSS. Elements either have no class names or have classes that don’t distinguish between the things I need them to. I ended up having to use a lot of selectors based on text formatting tags written directly in the HTML, including having to use the background color of a link to tell whether it pointed to a story (I’m not even dealing with these in my gem) or an issue.

Also, there are a LOT of comic books, so I ran into the problem that my gem was just taking too long to scrape all that metadata! I don’t want my user to sit there for two minutes while the app downloads information about every Star Wars comic ever! I ended up writing the code so that it only scrapes when it needs to. For instance, if the user just wants to list the issues within a particular series, my gem will scrape for Series when the user asks to look at them, and scrape for Issues within that Series only when they’ve selected it. However, if a user wants to look at the issues that a particular artist contributed to, they’ll just have to wait (Artists are only assigned to Issues when Issues are scraped, so in this case all Issues need to be scraped)!
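
The scrape-only-when-needed idea can be sketched roughly like this. It is an illustration under assumptions, not the gem's real code: Catalog, FakeScraper, and their method names are hypothetical, and FakeScraper returns canned placeholder titles instead of hitting Wookieepedia so the example runs offline.

```ruby
# Hypothetical sketch of lazy scraping with caching.
# FakeScraper stands in for the real scraper and counts how often it is called.
class FakeScraper
  attr_reader :calls

  def initialize
    @calls = Hash.new(0)
  end

  # Pretends to scrape the list of series titles.
  def scrape_series
    @calls[:series] += 1
    ["Example Series A", "Example Series B"]
  end

  # Pretends to scrape the issues of one series.
  def scrape_issues(series_title)
    @calls[[:issues, series_title]] += 1
    ["#{series_title} 1", "#{series_title} 2"]
  end
end

class Catalog
  def initialize(scraper)
    @scraper = scraper
    @issues_by_series = {}
  end

  # Scraped the first time the user asks for the series list, then cached.
  def series
    @series ||= @scraper.scrape_series
  end

  # Scraped only for the series the user selected, then cached.
  def issues_for(series_title)
    @issues_by_series[series_title] ||= @scraper.scrape_issues(series_title)
  end

  # Artist lookups need every issue, so this scrapes any series
  # that has not been scraped yet. This is the slow path.
  def all_issues
    series.flat_map { |title| issues_for(title) }
  end
end

# Usage: nothing is scraped until the user asks for it.
catalog = Catalog.new(FakeScraper.new)
puts catalog.series
puts catalog.issues_for("Example Series A")
```

The ||= operator does the memoizing here: the scraper runs only when the cached value is still nil, so repeated menu visits cost nothing after the first one.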