Kohei Nozaki's blog 


jEdit conditional line deleting with regex


Posted on Sunday Jan 31, 2016 at 06:03PM in Technology


How to delete lines that contain a keyword?

Open the Search And Replace dialog, turn on Regular expressions, put ^.*KEYWORD.*$\n in the Search for box, leave Replace with blank, then hit Replace All.

How to delete lines that do NOT contain a keyword?

Do the same as above with ^((?!KEYWORD).)*$\n. For details, see http://stackoverflow.com/questions/406230/regular-expression-to-match-line-that-doesnt-contain-a-word
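
The same two patterns can be tried outside jEdit as well. Below is a minimal sketch in plain Java, assuming the text uses \n line endings. The class and method names (LineFilterSketch, dropLinesContaining, dropLinesNotContaining) are made up for illustration, and the (?m) flag is added on the assumption that ^ and $ should match at every line, as they do in the editor.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of jEdit: applies the two patterns above in plain Java.
class LineFilterSketch {

    // Removes every newline-terminated line that contains the keyword.
    static String dropLinesContaining(String text, String keyword) {
        String k = Pattern.quote(keyword); // treat the keyword as a literal
        // (?m) makes ^ and $ match at line boundaries, not only at the ends of the input.
        return text.replaceAll("(?m)^.*" + k + ".*$\\n", "");
    }

    // Removes every newline-terminated line that does NOT contain the keyword.
    static String dropLinesNotContaining(String text, String keyword) {
        String k = Pattern.quote(keyword);
        // Each character is consumed only if the keyword does not start at that position.
        return text.replaceAll("(?m)^((?!" + k + ").)*$\\n", "");
    }

    public static void main(String[] args) {
        String text = "alpha\nfoo KEYWORD bar\nbeta\n";
        System.out.print(dropLinesContaining(text, "KEYWORD"));    // keeps alpha and beta
        System.out.print(dropLinesNotContaining(text, "KEYWORD")); // keeps the KEYWORD line
    }
}
```

Because both patterns end with \n, a last line without a trailing newline is left untouched. The second pattern also removes empty lines, since an empty line does not contain the keyword either.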


What are lookahead / lookbehind of regex?


Posted on Sunday Feb 15, 2015 at 10:04AM in Technology


I’m learning regex with Mastering Regular Expressions, 3rd Edition. It’s interesting because for a long time I didn’t understand lookahead / lookbehind correctly, so I’m leaving some examples here for better understanding. The tests were run against jEdit 5.2.0 on Oracle Java 1.8.0_31.

Given string

(1) http://blog1.example.com/roller/
(2) http://blog2.example.com/mt/
(3) http://blog3.example.com/wordpress/

Positive lookahead

Positive lookahead ensures that the match is immediately followed by a fragment that matches the regex inside the parentheses. The lookahead itself consumes nothing, so the matched text is still just example.com. example\.com(?=/roller/) matches only in (1).

Negative lookahead

Negative lookahead simply reverses that condition. example\.com(?!/roller/) matches in (2) and (3).
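
Both lookahead forms can also be checked with java.util.regex. This is only a sketch: the class LookaheadDemo and its helper methods are hypothetical names, and the three lines are the given strings from above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: reports which of the given lines a regex matches in.
class LookaheadDemo {
    static final String[] LINES = {
        "(1) http://blog1.example.com/roller/",
        "(2) http://blog2.example.com/mt/",
        "(3) http://blog3.example.com/wordpress/"
    };

    // Returns the "(n)" labels of the lines in which the regex finds a match.
    static List<String> matchingLines(String regex) {
        Pattern p = Pattern.compile(regex);
        List<String> hits = new ArrayList<>();
        for (String line : LINES) {
            if (p.matcher(line).find()) {
                hits.add(line.substring(0, 3));
            }
        }
        return hits;
    }

    // Returns the text of the first match, or null if there is none.
    static String matchedText(String regex, String line) {
        Matcher m = Pattern.compile(regex).matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(matchingLines("example\\.com(?=/roller/)")); // [(1)]
        System.out.println(matchingLines("example\\.com(?!/roller/)")); // [(2), (3)]
        // The lookahead is zero-width, so "/roller/" is not part of the match.
        System.out.println(matchedText("example\\.com(?=/roller/)", LINES[0])); // example.com
    }
}
```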

Positive lookbehind

Positive lookbehind ensures that the match is immediately preceded by a fragment that matches the regex inside the parentheses. (?<=blog1\.)example\.com matches only in (1).

Negative lookbehind

Negative lookbehind simply reverses that condition. (?<!blog1\.)example\.com matches in (2) and (3).
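
The lookbehind forms can be checked the same way. Again this is a sketch with hypothetical names (LookbehindDemo, matchingLines, matchedText), run against the given strings from above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: reports which of the given lines a regex matches in.
class LookbehindDemo {
    static final String[] LINES = {
        "(1) http://blog1.example.com/roller/",
        "(2) http://blog2.example.com/mt/",
        "(3) http://blog3.example.com/wordpress/"
    };

    // Returns the "(n)" labels of the lines in which the regex finds a match.
    static List<String> matchingLines(String regex) {
        Pattern p = Pattern.compile(regex);
        List<String> hits = new ArrayList<>();
        for (String line : LINES) {
            if (p.matcher(line).find()) {
                hits.add(line.substring(0, 3));
            }
        }
        return hits;
    }

    // Returns the text of the first match, or null if there is none.
    static String matchedText(String regex, String line) {
        Matcher m = Pattern.compile(regex).matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(matchingLines("(?<=blog1\\.)example\\.com")); // [(1)]
        System.out.println(matchingLines("(?<!blog1\\.)example\\.com")); // [(2), (3)]
        // The lookbehind is zero-width, so "blog1." is not part of the match.
        System.out.println(matchedText("(?<=blog1\\.)example\\.com", LINES[0])); // example.com
    }
}
```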


Complex string replacing in Java


Posted on Tuesday Feb 10, 2015 at 10:06PM in Technology


Sometimes an annoying string-replacement requirement comes up. Every time, I forget how to do it, so I’m leaving this as a note to myself. There’s also a JUnit test case.

Requirement

Assume we have the following string literals. We have to convert the input string to the expected one.

String input = "<li><a href=\"../../jbatch/hello/\" >anchor</a></li>";
String expected = "<li><a href=\"/entry/articles-jbatch-hello\" >anchor</a></li>";

Solutions

Using numbered groups

Pattern p = Pattern.compile("(<a href=\")\\.\\./\\.\\./(.*)/(.*)/\"");
Matcher matcher = p.matcher(input);
String result = matcher.replaceAll("$1/entry/articles-$2-$3\"");

Using named groups

Pattern p = Pattern.compile("(?<prefix><a href=\")\\.\\./\\.\\./(?<category>.*)/(?<handle>.*)/\"");
Matcher matcher = p.matcher(input);
String result = matcher.replaceAll("${prefix}/entry/articles-${category}-${handle}\"");

Using Matcher#appendReplacement(). This one is the most flexible.

Pattern p = Pattern.compile("(?<prefix><a href=\")(?<url>.*)\"");
Matcher matcher = p.matcher(input);
StringBuffer sb = new StringBuffer();
while (matcher.find()) {
	// Any complex logic can be placed here.
	String url = matcher.group("url");
	// For "../../jbatch/hello/", split("/") yields ["..", "..", "jbatch", "hello"].
	String[] urls = url.split("/");
	matcher.appendReplacement(sb, "${prefix}/entry/articles-" + urls[2] + "-" + urls[3] + "\"");
}
matcher.appendTail(sb);
String result = sb.toString();
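
To check the three solutions side by side, they can be wrapped in methods and compared against expected. The following is a sketch only: the class ReplaceSketch and its method names are hypothetical, and it uses a plain main method rather than JUnit so that it runs without extra dependencies.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical harness: the three solutions above, each wrapped in a static method.
class ReplaceSketch {
    static final String INPUT = "<li><a href=\"../../jbatch/hello/\" >anchor</a></li>";
    static final String EXPECTED = "<li><a href=\"/entry/articles-jbatch-hello\" >anchor</a></li>";

    static String withNumberedGroups(String input) {
        Pattern p = Pattern.compile("(<a href=\")\\.\\./\\.\\./(.*)/(.*)/\"");
        return p.matcher(input).replaceAll("$1/entry/articles-$2-$3\"");
    }

    static String withNamedGroups(String input) {
        Pattern p = Pattern.compile("(?<prefix><a href=\")\\.\\./\\.\\./(?<category>.*)/(?<handle>.*)/\"");
        return p.matcher(input).replaceAll("${prefix}/entry/articles-${category}-${handle}\"");
    }

    static String withAppendReplacement(String input) {
        Pattern p = Pattern.compile("(?<prefix><a href=\")(?<url>.*)\"");
        Matcher matcher = p.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            // "../../jbatch/hello/" splits into ["..", "..", "jbatch", "hello"].
            String[] urls = matcher.group("url").split("/");
            matcher.appendReplacement(sb, "${prefix}/entry/articles-" + urls[2] + "-" + urls[3] + "\"");
        }
        matcher.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(EXPECTED.equals(withNumberedGroups(INPUT)));    // true
        System.out.println(EXPECTED.equals(withNamedGroups(INPUT)));       // true
        System.out.println(EXPECTED.equals(withAppendReplacement(INPUT))); // true
    }
}
```

All three rely on the greedy .* backtracking to the last matching slash or quote, so they assume the input contains a single anchor like the one above.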

Escaping special character for replacement

$ has a special meaning in a replacement string, but sometimes we need to use $ as just a literal. In such cases, we can use Matcher.quoteReplacement() to escape the $ character as follows:

String input = "../../jbatch/hello/";
String expected = "../../$1/${name}/";
Pattern p = Pattern.compile("(?<prefix>\\.\\./\\.\\.)/.*/.*/");
String result = p.matcher(input).replaceAll("${prefix}/" + Matcher.quoteReplacement("$1/${name}") + "/");
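
To see why the quoting matters, it helps to compare the quoted and unquoted replacement. A minimal sketch follows, with the hypothetical class name QuoteReplacementSketch. Without quoting, ${name} is read as a reference to a named group that the pattern does not define, and replaceAll throws an IllegalArgumentException.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo: the same replacement with and without Matcher.quoteReplacement().
class QuoteReplacementSketch {
    static final Pattern P = Pattern.compile("(?<prefix>\\.\\./\\.\\.)/.*/.*/");

    // "$1/${name}" is quoted, so it is inserted literally.
    static String quoted(String input) {
        return P.matcher(input)
                .replaceAll("${prefix}/" + Matcher.quoteReplacement("$1/${name}") + "/");
    }

    // Here "$1" and "${name}" are interpreted as group references.
    static String unquoted(String input) {
        return P.matcher(input).replaceAll("${prefix}/$1/${name}/");
    }

    public static void main(String[] args) {
        String input = "../../jbatch/hello/";
        System.out.println(quoted(input)); // ../../$1/${name}/
        try {
            unquoted(input);
        } catch (IllegalArgumentException e) {
            // The pattern has no group called "name".
            System.out.println("unquoted failed: " + e.getMessage());
        }
    }
}
```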