Java Regex Cheat Sheet

This post is more of a note to myself. I didn’t plan to write it, but I couldn’t find a comprehensive cheat sheet that I liked. So it is not a well-polished post, and I am not going to say anything about regular expressions in general, since there are already plenty of great resources.

Compared to JavaScript, using regex in Java is a bit annoying, and that is why I ended up writing this post.

Syntax

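In a pattern string, Java requires a double backslash \\ wherever the regex itself has a single \, because the backslash is also the escape character of Java string literals.
One good thing, though: I don’t know about other editors, but at least in NetBeans, when you paste code from the clipboard into a string literal, the extra \ is added automatically.
In addition, each input string needs its own Matcher (or a call to reset(CharSequence) on an existing one), unless you use one of the matches() shortcuts.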
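
To make the two points above concrete, here is a minimal sketch. The class name EscapeDemo, the method name, and the year-month pattern are made up for illustration; only Pattern.compile, Matcher.reset(CharSequence) and find() come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeDemo {
    // The regex \d{4}-\d{2} has to be written with doubled backslashes
    // inside a Java string literal.
    static final Pattern YEAR_MONTH = Pattern.compile("\\d{4}-\\d{2}");

    // One compiled Pattern and one Matcher, re-pointed at each input
    // with reset(CharSequence) instead of creating a new Matcher.
    static int countInputsWithMatch(String... inputs) {
        Matcher m = YEAR_MONTH.matcher("");
        int hits = 0;
        for (String input : inputs) {
            m.reset(input);
            if (m.find()) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // "due 2024-01" and "2023-12-31" both contain a year-month.
        System.out.println(countInputsWithMatch("due 2024-01", "no date here", "2023-12-31"));    // 2
    }
}
```

Calling p.matcher(input) for every string works just as well; reset() only avoids allocating a new Matcher each time.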

Methods

Pattern

static Pattern compile(String regex[, int flags])    // flags optional, combine Pattern fields with |
static boolean matches(String regex, CharSequence input)    // true only if the whole input matches
Matcher matcher(CharSequence input)    // creates a Matcher for the input
String[] split(CharSequence input[, int limit])    // limit optional
static String quote(String s)    // returns a literal pattern String for the specified String
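
A short sketch of these Pattern methods in use. The class name PatternMethodsDemo, the helper names, and the sample strings are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PatternMethodsDemo {
    // Flags are int constants on Pattern and are combined with bitwise OR.
    static final Pattern HELLO =
            Pattern.compile("^hello", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

    static boolean hasHelloLine(String text) {
        return HELLO.matcher(text).find();
    }

    // Pattern.split takes the input, not a regex: the regex is the Pattern itself.
    static String[] splitOnCommas(String csv) {
        return Pattern.compile("\\s*,\\s*").split(csv);
    }

    // quote() turns arbitrary text into a pattern that matches it literally.
    static boolean equalsLiterally(String literal, String input) {
        return Pattern.matches(Pattern.quote(literal), input);
    }

    public static void main(String[] args) {
        System.out.println(hasHelloLine("first line\nHello world"));      // true
        System.out.println(Arrays.toString(splitOnCommas("a , b,c")));    // [a, b, c]
        System.out.println(equalsLiterally("1+1", "1+1"));                // true
        // Without quote(), "1+1" means "one or more 1, then 1".
        System.out.println(Pattern.matches("1+1", "1+1"));                // false
    }
}
```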

Matcher

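// named groups are declared in the regex as (?<name>pattern)
int start([int group | String name])    // argument optional
int end([int group | String name])    // argument optional
boolean find([int start])    // start optional; find(start) resets the matcher first
boolean matches()    // true only if the entire input matches
String group([int group | String name])    // argument optional
Matcher reset([CharSequence input])    // input optional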
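
The following sketch shows named groups together with start, end, group and find(int). The class MatcherMethodsDemo, its methods, and the key=value pattern are hypothetical; start(String) and end(String) assume Java 8 or later.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherMethodsDemo {
    // Two named groups: key (group 1) and value (group 2).
    static final Pattern KEY_VALUE = Pattern.compile("(?<key>\\w+)=(?<value>\\d+)");

    // Describes the first match: key text, its start..end offsets, and the value.
    static String describeFirst(String input) {
        Matcher m = KEY_VALUE.matcher(input);
        if (!m.find()) {
            return "no match";
        }
        return m.group("key") + "@" + m.start("key") + ".." + m.end("key")
                + " -> " + m.group(2);
    }

    // find(int) resets the matcher and searches from the given index.
    static String keyFoundFrom(String input, int from) {
        Matcher m = KEY_VALUE.matcher(input);
        return m.find(from) ? m.group("key") : null;
    }

    public static void main(String[] args) {
        String input = "a=1, bb=22";
        System.out.println(describeFirst(input));      // a@0..1 -> 1
        System.out.println(keyFoundFrom(input, 3));    // bb
    }
}
```

A group can be read by name or by number; here group(2) and group("value") refer to the same text.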

String

boolean matches(String regex)
String replaceAll(String regex, String replacement)
String[] split(String regex[, int limit])    // limit optional
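
For quick one-off work the String shortcuts are often enough. A small sketch, with the class name StringRegexDemo and all sample inputs made up for illustration:

```java
import java.util.Arrays;

class StringRegexDemo {
    // Collapses every run of whitespace into a single space.
    static String collapseSpaces(String s) {
        return s.replaceAll("\\s+", " ");
    }

    // $1 and $2 in the replacement refer to the captured groups.
    static String swapAroundAt(String s) {
        return s.replaceAll("(\\w+)@(\\w+)", "$2@$1");
    }

    // A limit of 2 yields at most two pieces; the rest stays in the last one.
    static String[] headAndRest(String s) {
        return s.split(":", 2);
    }

    // String.matches tests the whole string, like Matcher.matches().
    static boolean isDigits(String s) {
        return s.matches("\\d+");
    }

    public static void main(String[] args) {
        System.out.println(collapseSpaces("a   b\t c"));               // a b c
        System.out.println(swapAroundAt("user@host"));                 // host@user
        System.out.println(Arrays.toString(headAndRest("a:b:c")));     // [a, b:c]
        System.out.println(isDigits("123") + " " + isDigits("12a"));   // true false
    }
}
```

Each of these calls compiles the regex again, so for a pattern used in a loop a precompiled Pattern is usually the better choice.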

There are more methods.

Usages

Check whether a match exists

String str = "She sells seashells by the seashore";
String reg = "\\w*se\\w*";
Pattern p = Pattern.compile(reg);
Matcher m = p.matcher(str);
System.out.println(m.find());    // true

Check whether the whole string matches

String str = "She sells seashells by the seashore";
String reg = "\\w*se\\w*";
Pattern p = Pattern.compile(reg);
Matcher m = p.matcher(str);
System.out.println(m.matches());    // false: true only when the pattern
                                    // matches the entire string,
                                    // with no remainder.

System.out.println("seashore".matches(reg));    // true

or,

String str = "She sells seashells by the seashore";
String reg = "\\w*se\\w*";
System.out.println(Pattern.matches(reg, str));    // false

Array of all matches

String str = "She sells seashells by the seashore";
String reg = "\\w*se\\w*";
Pattern p = Pattern.compile(reg);
Matcher m = p.matcher(str);
List<String> matches = new ArrayList<>();
while (m.find()) {
    matches.add(m.group());
}
System.out.println(matches);    // [sells, seashells, seashore]

Count number of matches

String str = "She sells seashells by the seashore";
String reg = "\\w*se\\w*";
Pattern p = Pattern.compile(reg);
Matcher m = p.matcher(str);
int counter = 0;
while (m.find()) {
    counter++;
}
System.out.println(counter);    // 3

I found that shorter, code-golf style alternatives to this loop don’t work well in all cases.
The same is true for collecting the array of matches.
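
If Java 9 or later is available, Matcher.results() is one shorter option for both tasks. This is only a sketch under that assumption; the class name ResultsDemo and its helper methods are made up for illustration.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class ResultsDemo {
    static final Pattern SE_WORD = Pattern.compile("\\w*se\\w*");

    // results() (Java 9+) streams one MatchResult per successive match.
    static long countMatches(String input) {
        return SE_WORD.matcher(input).results().count();
    }

    static List<String> allMatches(String input) {
        return SE_WORD.matcher(input)
                .results()
                .map(MatchResult::group)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String str = "She sells seashells by the seashore";
        System.out.println(countMatches(str));    // 3
        System.out.println(allMatches(str));      // [sells, seashells, seashore]
    }
}
```

On Java 8 and earlier the while (m.find()) loop remains the way to go.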

Flags

All flags are static fields of the Pattern class.
Java has no global (g) flag: repeated find() calls walk through every match, so you can think of every regex in Java as global.

| Java | JavaScript equivalent | Explanation |
| --- | --- | --- |
| CANON_EQ | | Enables canonical equivalence. |
| CASE_INSENSITIVE | i | Enables case-insensitive matching. |
| COMMENTS | | White space and comments are ignored in the pattern. |
| DOTALL | s | Dot (.) matches line terminators as well. |
| LITERAL | | Enables literal parsing of the pattern. |
| MULTILINE | m | Enables multiline mode. |
| UNICODE_CASE | u | Enables Unicode-aware case folding. |
| UNIX_LINES | | Only the ‘\n’ line terminator is recognized in the behaviour of ., ^, and $. |

There may be JavaScript equivalents that I have missed.
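
A small sketch of how the flags can be applied, either as the second argument to Pattern.compile or as inline flags such as (?im) at the start of the regex. The class name FlagsDemo, the helper methods, and the sample log text are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagsDemo {
    // Flags passed as an int, combined with |.
    static final Pattern ERROR_LINE =
            Pattern.compile("^error.*$", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

    // The same two flags written inline: i = CASE_INSENSITIVE, m = MULTILINE.
    static final Pattern ERROR_LINE_INLINE = Pattern.compile("(?im)^error.*$");

    static String firstErrorLine(Pattern p, String log) {
        Matcher m = p.matcher(log);
        return m.find() ? m.group() : null;
    }

    // With DOTALL the dot also matches a line terminator.
    static boolean dotMatchesNewline(boolean dotAll) {
        Pattern p = dotAll ? Pattern.compile("a.b", Pattern.DOTALL) : Pattern.compile("a.b");
        return p.matcher("a\nb").matches();
    }

    public static void main(String[] args) {
        String log = "info: ok\nERROR: disk full\ninfo: done";
        System.out.println(firstErrorLine(ERROR_LINE, log));           // ERROR: disk full
        System.out.println(firstErrorLine(ERROR_LINE_INLINE, log));    // ERROR: disk full
        System.out.println(dotMatchesNewline(false));                  // false
        System.out.println(dotMatchesNewline(true));                   // true
    }
}
```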
