Bad URLs inferred from titles with non-alpha-numeric characters #174

@simon-brooke

My apologies, this is a bug I've introduced.

What breaks

If a page has no embedded metadata, so that the metadata is inferred, and the title includes non-alpha-numeric characters, the generated URL can include characters that break the link. This is due to URL encoding, but I'm not sure why URL encoding breaks things.

The fix

However, the fix I've implemented is very simple:

index 5061ae3..45e1832 100644
--- a/src/cryogen_core/infer_meta.clj
+++ b/src/cryogen_core/infer_meta.clj
@@ -89,7 +89,7 @@
    hyphens substituted for spaces."
   [^java.io.File page meta config]
   (if (:title meta)
-    (str (:date meta) "-" (replace (lower-case (:title meta)) #" +" "-") ".html")
+    (str (:date meta) "-" (replace (lower-case (:title meta)) #"[^a-z0-9]+" "-") ".html")
     (let [re-root     (re-pattern (str "^.*?(" (:page-root config) "|" (:post-root config) ")/"))
           page-fwd    (replace (str page) "\\" "/")  ;; make it work on Windows
           page-name   (if (:collapse-subdirs? config)
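
To make the effect of the one-line change concrete, here is a minimal sketch that isolates the two expressions from the diff. The function names `old-slug` and `new-slug`, the sample date string, and the sample title are hypothetical and exist only for this illustration; the real code reads `:date` and `:title` from the inferred metadata map.

```clojure
(require '[clojure.string :as string])

;; Hypothetical helper mirroring the expression on the removed line:
;; only runs of spaces are turned into hyphens.
(defn old-slug
  [date title]
  (str date "-" (string/replace (string/lower-case title) #" +" "-") ".html"))

;; Hypothetical helper mirroring the expression on the added line:
;; every run of characters outside a-z and 0-9 becomes one hyphen.
(defn new-slug
  [date title]
  (str date "-" (string/replace (string/lower-case title) #"[^a-z0-9]+" "-") ".html"))

(old-slug "2023-01-15" "What's New? A Q&A")
;; => "2023-01-15-what's-new?-a-q&a.html"

(new-slug "2023-01-15" "What's New? A Q&A")
;; => "2023-01-15-what-s-new-a-q-a.html"
```

With the old pattern, characters such as `'`, `?` and `&` survive into the file name; `?` and `&` in particular have reserved meanings in URLs. With the new pattern, the name contains only lower-case ASCII letters, digits and hyphens. Two side effects follow directly from the regular expression: a title ending in punctuation yields a trailing hyphen before `.html`, and non-ASCII letters are also replaced by hyphens.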

If this fix is acceptable to you, I'll submit a pull request.
