Changelog: Improving Discovery and Readability

Daniel Andrlik
Author
Daniel Andrlik lives in the suburbs of Philadelphia. By day he manages product teams. The rest of the time he is a podcast host and producer, writer of speculative fiction, a rabid reader, and a programmer.

Iteration is the thing. Or at least that’s what I tell myself. After all, tinkering is easy to do, especially when you are working with a platform as flexible as Jekyll. So as you would expect, I’ve made several changes to the site since the initial relaunch.

Ditching LSI

Jekyll has built-in support for latent semantic indexing, which should result in an accurate list of related posts for every entry on the site. The results are based upon indexing the language of each post and comparing it to the others. It’s an appealing feature, and should provide results similar to the Xapian/Haystack-powered reading suggestions that were part of my old Django-based version of this site. The performance gains for LSI were actually one of the reasons I initially put in the effort to upgrade to the latest Jekyll version from my aging Octopress installation.

Here’s the thing though: LSI sucks.

It’s true that LSI no longer takes hours to run, but it still adds between 11 and 15 minutes to site generation. That would be acceptable to me in exchange for accurate results that just work, so I happily began deploying with it in place. What I found was that performance was not the only issue with LSI; the results were bad too.

After running a site build with LSI enabled, I found that every post was still linked to virtually the same articles, which also happened to be some of the most recent. This meant that it didn’t matter where in the site you landed, it would recommend the same five or six articles over and over again. The point of this feature is to enable reader discovery of other content, and this was a ridiculously poor result considering the performance tax for using it.

The solution to this problem was to cut out LSI altogether in favor of Lawrence Woodman’s related posts plugin, which uses tag indexing to relate posts to one another by topic.[1] Of course, this means you need to ensure you are assigning tags to each post, which is a best practice anyway. I’m pleased to note that not only are the results more consistently accurate, this method is also radically faster. My entire site now generates in less than 27 seconds.
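To make the idea of relating posts by tag concrete, here is a minimal Python sketch. It is not the plugin's actual code (the plugin is Ruby, and its scoring may differ); it only illustrates the general technique of ranking other posts by how many tags they share. The post titles, tags, and the `related_posts` helper are hypothetical.

```python
from collections import Counter


def related_posts(posts, target, limit=5):
    """Rank other posts by the number of tags they share with `target`.

    `posts` maps a post title to its list of tags (hypothetical data shape).
    """
    target_tags = set(posts[target])
    scores = Counter()
    for title, tags in posts.items():
        if title == target:
            continue
        shared = len(target_tags & set(tags))
        if shared:
            scores[title] = shared
    # Sort by shared-tag count (descending), then by title for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [title for title, _ in ranked[:limit]]


# Hypothetical sample data, for illustration only.
posts = {
    "Ditching LSI": ["jekyll", "lsi", "meta"],
    "Scheduled Posts": ["jekyll", "meta"],
    "Switching To Octopress": ["octopress", "jekyll"],
    "Short Story Notes": ["fiction"],
}

print(related_posts(posts, "Ditching LSI"))
# → ['Scheduled Posts', 'Switching To Octopress']
```

Because the work is a handful of set intersections per post rather than a full-text comparison, it is easy to see why an approach along these lines would be much cheaper than LSI. It also shows why untagged posts never get recommendations.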

Dropping Dates In Permalinks

I recently read Matt Gemmell’s article on permalinks and found his argument compelling. Essentially, his point is that date-based permalinks are a vestigial feature of early journal-style blogging and add little value for the reader. He also asserts, perhaps a little presumptuously, that these URL designs may in fact detract from the authority of the piece itself.[2]

I encourage you to read his piece on the matter to see if you agree. I must admit that I quickly found myself nodding at his reasoning.

There is some concern about duplicate titles causing issues in this scenario, but searching through the 1000+ posts on this site, I found only two duplicate titles, and those two were link-blog entries, which were easy to correct. After adding a simple rewrite rule to my nginx config to redirect legacy inbound links, I deployed a new version of the site using the new permalink model.
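The two steps described above, checking for title collisions and mapping old URLs to new ones, can be sketched in Python. This is an illustration only: the actual redirect lives in nginx, and the legacy URL shape `/YYYY/MM/DD/slug/` is an assumption, since the exact permalink pattern is not stated here. The helper names and sample paths are hypothetical.

```python
import re
from collections import defaultdict

# Assumed legacy pattern: /YYYY/MM/DD/slug/ (the real pattern may differ).
LEGACY_URL = re.compile(r"^/\d{4}/\d{2}/\d{2}/(?P<slug>[^/]+)/?$")


def dateless_url(path):
    """Return the date-free URL for a legacy path, or None if it does not match."""
    match = LEGACY_URL.match(path)
    return f"/{match.group('slug')}/" if match else None


def colliding_slugs(paths):
    """Group legacy paths by slug, keeping only slugs used by more than one post."""
    by_slug = defaultdict(list)
    for path in paths:
        match = LEGACY_URL.match(path)
        if match:
            by_slug[match.group("slug")].append(path)
    return {slug: hits for slug, hits in by_slug.items() if len(hits) > 1}


# Hypothetical sample paths, for illustration only.
legacy = [
    "/2013/05/01/links/",
    "/2014/02/10/links/",
    "/2014/07/20/changelog/",
]

print(dateless_url("/2014/07/20/changelog/"))  # → /changelog/
print(colliding_slugs(legacy))
# → {'links': ['/2013/05/01/links/', '/2014/02/10/links/']}
```

Running a collision check like this before switching permalink styles makes it clear which posts need a new title or slug, since two posts that collapse to the same URL would otherwise overwrite one another.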

I’m happy with the change and agree with Gemmell that the URLs look cleaner without all the date-based cruft. I think they are friendlier to a reader as well.

Featured Images in RSS

The new site theme includes support for featured images in posts, which I quite enjoy. However, those images were not being displayed in the Atom feed, which meant that those reading in a feed reader would never see them. This was a relatively easy thing to add, and now the featured image, when present, is automatically supplied using the media:thumbnail element.

I’ve tested the results with several popular feed readers and they look good.
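For readers unfamiliar with the media:thumbnail element, the following Python sketch builds a stripped-down Atom entry that includes one only when a featured image is present. It is not the site's actual feed template (which would be written for Jekyll); the `build_entry` helper and the example.com URLs are hypothetical, and a real Atom entry would also need elements such as id and updated.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
MEDIA_NS = "http://search.yahoo.com/mrss/"  # Media RSS namespace

# Serialize Atom as the default namespace and Media RSS under the "media" prefix.
ET.register_namespace("", ATOM_NS)
ET.register_namespace("media", MEDIA_NS)


def build_entry(title, url, featured_image=None):
    """Build a minimal Atom <entry>, adding media:thumbnail only if an image is given."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM_NS}}}link", {"href": url})
    if featured_image:
        ET.SubElement(entry, f"{{{MEDIA_NS}}}thumbnail", {"url": featured_image})
    return entry


# Hypothetical post and image URLs, for illustration only.
entry = build_entry(
    "Changelog: Improving Discovery and Readability",
    "https://example.com/changelog/",
    featured_image="https://example.com/images/featured.jpg",
)
print(ET.tostring(entry, encoding="unicode"))
```

Posts without a featured image simply omit the element, so readers see a thumbnail only where one exists.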

There Is Always More

Of course, the project is never really done. I’m sure I’ll continue to make changes to the site in the coming months and years, but that’s half the joy. As with any creative outlet, the day we stop playing is the day we start dying. With that in mind, I look forward to many more years of tinkering here.


  1. Please note that if you want to use it with Jekyll 2.1+, you will need to use this fork of the library until its pull request is merged.

  2. This is clearly an area of importance to Gemmell, as demonstrated in his recent post suggesting that the terminology inherited from blogging does a disservice to online writers.
