Abusing microformats to test HTML

Testing semantic markup can prove tedious with complex selectors and fragile tests. Littering the code with "js"-prefixed classes is no better to the discerning craftsperson.

There is a more elegant hack that I find myself fond of, and that’s annotating my markup with microformats.

While this works with any language that has a decent HTML parser, we’ll be using Clojure and Hickory.

Let’s start off with some example markup we can test relatively easily. I have a soft spot for Hiccup and Clojure over plain old HTML. If you’ve worked with structural editing, you might feel the same way.

[:article {:class     style/container
           :itemscope true
           :itemtype  "https://schema.org/Article"}
 [:h1 {:class    style/heading
       :itemprop "name"}
  "Example"]
 [:p {:class    style/subheading
      :itemprop "abstract"}
  "Thanks for popping by!"]]

The generated HTML from the data representation above looks something like this:

<article class="container" itemscope itemtype="https://schema.org/Article">
  <h1 class="heading" itemprop="name">Example</h1>
  <p class="subheading" itemprop="abstract">Thanks for popping by!</p>
</article>

And with that in mind, we can write a big ol’ test!

(ns example.web-test
  (:require
   [clojure.test :refer [deftest is]]
   [hickory.core :as hickory]
   [hickory.select :as sel]
   [matcher-combinators.test :refer [match?]]
   [example.test.api :as test.api]
   [example.test.report :as test.report]
   [example.test.system :as test.system]))

(defn- itemprop
  "Selects elements whose itemprop attribute equals prop."
  [prop]
  {:pre [(string? prop)]}
  (sel/attr :itemprop #(= % prop)))

(deftest get-home
  (test.system/with-system [{:keys [web]} (test.system/system)]
    (let [response (test.api/response-for web :get "/")
          doc      (hickory/as-hickory (hickory/parse (:body response)))]
      (when (is (match? {:status  200
                         :headers (assoc secure-headers
                                         "Content-Type" "text/html;charset=utf-8")}
                (test.report/pp response)))
        (is (match? [{:content ["Example"]}]
                    (sel/select (sel/tag :title) doc)))
        (is (match? [{:content ["Example"]}]
                    (sel/select (itemprop "name") doc)))
        (is (match? [{:content ["Thanks for popping by!"]}]
                    (sel/select (itemprop "abstract") doc)))))))

If you’ve never seen Clojure before, the code above might be confusing. To get you off to the races, know that Clojure uses namespaces to group related code, and the use of ns declares a namespace that can require vars in other namespaces. Functions are declared using defn (the trailing dash makes that function private), and the test.system/with-system trick is something I wrote about yesterday.

The language-agnostic trick is the itemprop function, which builds a new selector from the itemprop attribute found in our markup. We could be even more selective and look for a descendant of an Article, but for our purposes matching on itemprop="name" is good enough.
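If you did want to scope a property to its enclosing Article, a minimal sketch might look like the following. The namespace, the itemtype and article-prop helpers, and the sample markup are all made up for illustration; only sel/attr, sel/descendant, and sel/select come from Hickory.

```clojure
(ns example.selector-sketch
  (:require
   [hickory.core :as hickory]
   [hickory.select :as sel]))

(defn itemprop
  "Selects elements whose itemprop attribute equals prop."
  [prop]
  (sel/attr :itemprop #(= % prop)))

(defn itemtype
  "Hypothetical helper: selects elements whose itemtype attribute equals url."
  [url]
  (sel/attr :itemtype #(= % url)))

(defn article-prop
  "Hypothetical helper: selects itemprop elements nested inside an Article."
  [prop]
  (sel/descendant (itemtype "https://schema.org/Article")
                  (itemprop prop)))

;; Sample markup with two itemprop="name" elements, only one inside the Article.
(def doc
  (hickory/as-hickory
   (hickory/parse
    (str "<article itemscope itemtype=\"https://schema.org/Article\">"
         "<h1 itemprop=\"name\">Example</h1>"
         "</article>"
         "<aside><span itemprop=\"name\">Elsewhere</span></aside>"))))

;; The scoped selector should pick up the h1 but skip the span in the aside.
(map :content (sel/select (article-prop "name") doc))
```

The plain itemprop selector would match both elements here, which is why scoping starts to matter once a page carries more than one item.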

I’m also making use of the excellent matcher-combinators library that makes intent clearer and our tests feel more declarative.
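As a rough illustration of why that reads declaratively: matcher-combinators treats an expected map as a subset of the actual map by default, so a test only names the keys it cares about. The namespace and the literal node below are assumptions for the sketch; the node simply mimics the shape Hickory produces for an element.

```clojure
(ns example.matcher-sketch
  (:require
   [clojure.test :refer [deftest is]]
   [matcher-combinators.test :refer [match?]]))

(deftest partial-match
  ;; The expected map names only :content. The extra :type, :tag and :attrs
  ;; keys on the actual node are ignored because maps match as subsets.
  (is (match? [{:content ["Example"]}]
              [{:type    :element
                :tag     :h1
                :attrs   {:itemprop "name"}
                :content ["Example"]}])))
```

Sequences are stricter than maps: the expected vector has to line up with the actual one element for element, so an unexpected second match would still fail the test.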

I find this pattern works exceedingly well, perhaps because these annotations were designed to make markup easier for machines to understand.

If you’re trying to give Google et al. your data more efficiently, JSON-LD is easier still, but because it’s decoupled from your markup, you lose the nice side effect of being able to test that markup more easily.