termite: Add patch to avoid matching trailing punctuation in URLs
Patch based on:
thestinger/termite#516

While this might "break" some URLs in the wild,
in return it fixes matching of URLs in Nix code.
Without this patch the trailing semicolon is included as part of the
URL, which creates invalid URLs for nearly all meta.homepage values.
dtzWill authored and jtojnar committed Dec 20, 2017
1 parent 8e6d563 commit c433fef
Showing 2 changed files with 30 additions and 0 deletions.
3 changes: 3 additions & 0 deletions pkgs/applications/misc/termite/default.nix
@@ -13,6 +13,9 @@ let
sha256 = "02cn70ygl93ghhkhs3xdxn5b1yadc255v3yp8cmhhyzsv5027hvj";
};

# https://github.com/thestinger/termite/pull/516
patches = [ ./url_regexp_trailing.patch ];

postPatch = "sed '1i#include <math.h>' -i termite.cc";

makeFlags = [ "VERSION=v${version}" "PREFIX=" "DESTDIR=$(out)" ];
27 changes: 27 additions & 0 deletions pkgs/applications/misc/termite/url_regexp_trailing.patch
@@ -0,0 +1,27 @@
Based on https://github.com/thestinger/termite/pull/516
Modified to apply to v13

From 65a454ffa8e681f3f14729cba7c42e1570a85e8a Mon Sep 17 00:00:00 2001
From: Paul Baecher <pbaecher@gmail.com>
Date: Thu, 7 Sep 2017 22:58:51 +0200
Subject: [PATCH] Do not match punctuation at the end of URLs

Punctuation at the end of URLs is most likely part of natural language
or markup (for example in Markdown). Do not match it as part of the URL.
---
url_regex.hh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/url_regex.hh b/url_regex.hh
index 2ec6be8..3039b53 100644
--- a/url_regex.hh
+++ b/url_regex.hh
@@ -9,7 +9,7 @@
#define PORT "(?:\\:[[:digit:]]{1,5})?"
#define SCHEME "(?:[[:alpha:]][+-.[:alnum:]]*:)"
#define USERPASS USERCHARS_CLASS "+(?:\\:" PASSCHARS_CLASS "+)?"
-#define URLPATH "(?:/[[:alnum:]\\Q-_.!~*'();/?:@&=+$,#%\\E]*)?"
+#define URLPATH "(?:/[[:alnum:]\\Q-_.!~*'();/?:@&=+$,#%\\E]*(?<![\\Q.,:;()!?\\E]))?"

const char * const url_regex = SCHEME "//(?:" USERPASS "\\@)?" HOST PORT URLPATH;
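
The effect of the added negative lookbehind can be sketched outside termite. The snippet below is a rough Python approximation, not the code termite runs: Python's `re` module has no `\Q...\E` quoting or `[[:alnum:]]` classes, so the path character class is spelled out by hand, and `PREFIX` is a simplified, hypothetical stand-in for the `SCHEME`, `HOST` and `PORT` macros (it ignores `USERPASS`). The example hostnames and the unquoted `homepage` line are illustrative assumptions.

```python
import re

# Simplified stand-in for SCHEME "//" HOST PORT (an assumption, not termite's real macros).
PREFIX = r"[A-Za-z][A-Za-z0-9+.-]*://[A-Za-z0-9.-]+(?::[0-9]{1,5})?"

# Hand-written equivalent of [[:alnum:]\Q-_.!~*'();/?:@&=+$,#%\E] for Python's re.
PATH_CHARS = r"[A-Za-z0-9\-_.!~*'();/?:@&=+$,#%]"

# URLPATH before the patch: any run of path characters.
OLD_URLPATH = rf"(?:/{PATH_CHARS}*)?"
# URLPATH after the patch: the run may not end in one of . , : ; ( ) ! ?
NEW_URLPATH = rf"(?:/{PATH_CHARS}*(?<![.,:;()!?]))?"

OLD_URL = re.compile(PREFIX + OLD_URLPATH)
NEW_URL = re.compile(PREFIX + NEW_URLPATH)


def first_url(pattern, text):
    """Return the first URL the pattern finds in text, or None."""
    match = pattern.search(text)
    return match.group(0) if match else None


# An unquoted URL literal in Nix code, terminated by a semicolon.
nix_line = "homepage = https://example.org/project;"
old_match = first_url(OLD_URL, nix_line)
new_match = first_url(NEW_URL, nix_line)

# A URL that legitimately ends in punctuation: the kind the patch may "break".
wiki_line = "see https://en.wikipedia.org/wiki/Foo_(bar)"
wiki_match = first_url(NEW_URL, wiki_line)

print(old_match)
print(new_match)
print(wiki_match)
```

In this sketch the greedy path match first consumes the trailing `;`, the lookbehind rejects that position, and the engine backtracks one character at a time until the match ends on a character outside the punctuation set. The same backtracking drops the closing parenthesis from the second URL, which is the trade-off the commit message accepts.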
