We blog using Jekyll at MapBox, which means that all of our blog posts are written in code. Sometimes we make mistakes though, and missing or invalid metadata can cause layout quirks or unexpected errors. To catch these problems earlier, we decided to treat our blog like we do our code - automated unit tests now run after every commit.
Jekyll is a static site generator. We keep all content in simple text files and Jekyll reads each file and transforms it into HTML. We use Jekyll for all static content on our site - the blog, developer docs, help pages and much more.
Each bit of content, like a blog post or a help document, is a file composed of two parts: metadata stored in YAML, and content written in Markdown.
Here's what the YAML part of this blog post looks like:
    ---
    layout: blog
    category: blog
    title: "We unit test our blog at MapBox"
    permalink: /blog/unit-test-blog
    image: https://farm4.staticflickr.com/3710/10732224274_a4c27f21fc.jpg
    tags:
      - Mike Morris
    ---
If we forget the author tag, the blog layout breaks. If we write invalid YAML, the blog won't rebuild and the post will stay in limbo.
Content testing prevents these failures ahead of time. Every blog post is submitted as a pull request on GitHub, and with pull request testing hooked up to Travis CI, every change is run through a test suite that gives the green light.
If there's a problem, we know immediately.
Travis CI supports plenty of languages for test suites, and we ended up writing ours in Node.js. Since Jekyll is a Ruby project, Travis installs Jekyll for the compilation and Node for the test runner. We use mocha and assert for our content tests.
Here's our .travis.yml file:

    language: node_js
    before_install:
      - gem install liquid -v 2.5.1 --no-rdoc --no-ri
      - gem install jekyll -v 1.0.2 --no-rdoc --no-ri
      - gem install rdiscount -v 1.6.8 --no-rdoc --no-ri
    script:
      - ./node_modules/.bin/mocha test/test.metadata.js
      - jekyll build
One important consideration is that all tests must be created (but not necessarily run) synchronously in Mocha, which necessitates using the synchronous variants of some Node functions to build tests dynamically. While writing some of the more complex tests, we found that it was more efficient to load all posts using fs.readFileSync before any tests were run, rather than loading each post asynchronously during its corresponding test. This approach allows for testing one-to-many relationships between posts (such as unique permalinks) while minimizing the time spent loading files from disk.
We first construct a posts object and create a test for each post.

    var paths = {
          blog: '_posts/blog/',
          team: '_posts/team/'
        },
        dirs = Object.keys(paths);

    var posts = dirs.reduce(function(prev, dir, index, list) {
      var path = paths[dir];
      describe(path, function() {
        prev[dir] = readDir(path);
      });
      return prev;
    }, {});

    dirs.forEach(function(dir) {
      var path = paths[dir];
      describe(path, function() {
        posts[dir].forEach(function(post) {
          it(post.name, tests[dir](post));
        });
      });
    });
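The metadata parsing is wrapped in a try/catch statement because js-yaml throws an error when parsing invalid YAML.

    var fs = require('fs'),
        assert = require('assert'),
        jsyaml = require('js-yaml');

    function readPost(dir, filename) {
      var buffer = fs.readFileSync(dir + filename),
          file = buffer.toString('utf8');

      try {
        var parts = file.split('---'),
            frontmatter = parts[1];

        // Register a test that fails if the front matter is not valid YAML.
        it(filename, function() {
          assert.doesNotThrow(function() {
            jsyaml.load(frontmatter);
          });
        });

        return {
          name: filename,
          file: file,
          metadata: jsyaml.load(frontmatter),
          content: parts[2]
        };
      } catch (err) {}
    }

    function readDir(dir) {
      return fs.readdirSync(dir).map(function(filename) {
        return readPost(dir, filename);
      }).filter(Boolean); // readPost returns undefined when the YAML fails to parse
    }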
tests['blog'] asserts each necessary property of a blog post: all image links and iframes are HTTPS, the required metadata keys are present with no unexpected extras, and the metadata is valid. The date key, if it exists, must be a valid JavaScript Date object, the permalink must begin with /blog/, and each post needs to contain a <!--more--> tag for generating post excerpts with the excerpt.rb Jekyll plugin.
    // missing(), invalid(), extraneous(), isImage() and Array#diff are helpers
    // that are not shown in this post.
    var tests = {
      // Receives a post object built by readPost() and returns the test body.
      'blog': function(post) {
        return function() {
          var file = post.file,
              metadata = post.metadata,
              content = post.content,
              keys = ['published', 'date', 'layout', 'category', 'title', 'image', 'permalink', 'tags'];

          // HTTPS images & iframes in blog
          var urls = file.match(/https?:\/\/[\w,%-\/\.]+\/?/g);
          if (urls) urls.forEach(function(url) {
            assert.ok(!(/http:[^'\"]+\.(jpg|png|gif)/).test(url), url + ' should be https');
          });

          var iframes = file.match(/<iframe [^>]*src=[\"'][^\"']+/g);
          if (iframes) iframes.forEach(function(iframe) {
            assert.ok(!(/<iframe [^>]*src=[\"']http:/).test(iframe), iframe + ' should be https');
            assert.ok(!(/<iframe [^>]*src=[\"']https:\/\/[abcd]\.tiles\.mapbox\.com.*\.html[^\?]/).test(iframe), iframe + ' is insecure embedded map (add ?secure=1)');
          });

          // Required metadata keys
          assert.equal(typeof metadata, 'object');
          assert.ok('layout' in metadata, missing('layout'));
          assert.ok('category' in metadata, missing('category'));
          assert.ok('title' in metadata, missing('title'));
          assert.ok('image' in metadata, missing('image'));
          assert.ok('permalink' in metadata, missing('permalink'));
          assert.ok('tags' in metadata, missing('tags'));

          // Metadata values
          if (metadata.date) {
            assert.ok(metadata.date instanceof Date, invalid('date', metadata.date));
          }
          assert.equal(metadata.category, 'blog', invalid('category', metadata.category));
          assert.ok(isImage(metadata.image), invalid('image', metadata.image));
          assert.ok(/^\/blog\//.test(metadata.permalink), invalid('permalink', metadata.permalink));
          assert.ok(content.indexOf('<!--more-->') !== -1, missing('<!--more-->'));

          // No keys beyond the allowed list
          var extraKeys = Object.keys(metadata).diff(keys);
          assert.deepEqual(extraKeys, [], extraneous(extraKeys));
        };
      }
    };
We also check the integration between different posts and confirm that the author of each blog post matches the title of a post in _posts/team/.

    // Build a list of team member names
    var team = posts.team.map(function(post) {
      return post.metadata.title;
    });

    // Later, in a test assertion, make sure that
    // the author of a blog post is a team member.
    assert.ok(team.indexOf(author) !== -1, 'no team post found for author ' + author);
We've saved ourselves a lot of frustration by automating this little part of our publishing workflow. The integration between Travis CI and GitHub lets everyone on our team, not just developers, benefit from tests and push new posts with confidence.