The URLs generated by Hakyll, by default also include a
.html
extension. I have never been a fan of this. When things in
reality are driven by the Content-Type
header, it is absolutely
redundant.
Hakyll provides all the utilities with which we can get cleaner URLs,
like this page’s. For this, I rely on the fact that most of the
web servers automatically serve /foo/index.html
for the URL
/foo/
. To generate clean paths, I define a custom route -
cleanRoute
:
cleanRoute :: Routes
cleanRoute = customRoute createIndexRoute
where
createIndexRoute ident = takeDirectory p </> takeBaseName p </> "index.html"
where p = toFilePath ident
This can now be used in in rule definition:
match "pages/*" $ do
route $ cleanRoute
-- the compiler follows
With this, a path say /pages/about.html
will be generated as
/pages/about/index.html
, hence solving the generation problem. We
are only partially done though. The links that Hakyll generates will
also include the /index.html
suffix in every URL. To get rid of that
we define a set of functions:
cleanIndexUrls :: Item String -> Compiler (Item String)
cleanIndexUrls = return . fmap (withUrls cleanIndex)
cleanIndexHtmls :: Item String -> Compiler (Item String)
cleanIndexHtmls = return . fmap (replaceAll pattern replacement)
where
pattern = "/index.html"
replacement = const "/"
cleanIndex :: String -> String
cleanIndex url
| idx `isSuffixOf` url = take (length url - length idx) url
| otherwise = url
where idx = "index.html"
cleanIndexUrls
and cleanIndexHtmls
strip out /index.html
from
all the anchor tags and complete text respectively. These can be used
over a page’s compiler like this:
compile $ pandocCompiler
>>= loadAndApplyTemplate "templates/page.html" pageCtx
>>= saveSnapshot "content"
>>= loadAndApplyTemplate "templates/default.html" pageCtx
>>= relativizeUrls
>>= cleanIndexUrls -- cleanup href in all anchor tags.
This functionality is being used by this blog and irneh/workforpizza off which this blog is actually based.