This post is more of a note to myself. I didn’t mean to write a post, but I couldn’t find the comprehensive one I wanted. So it won’t be well polished, and I’m not going to say anything about regular expressions in general, since there are millions of great resources for that.
Compared to JavaScript, using regex in Java is a bit annoying. That is the reason I ended up writing this post.
Syntax
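In a pattern, Java requires a double backslash \\ where regex syntax has a single \, because the pattern is written as a string literal and the backslash itself has to be escaped.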
One good thing, though: I don’t know about other editors, but at least in NetBeans, when you paste code from the clipboard, the extra \ is added automatically.
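To make the escaping rule concrete, here is a small sketch. The class and method names are made up for illustration; only String.matches and Pattern.compile come from the standard library.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // The regex \d+ is written "\\d+" in Java source, because the
    // string literal needs its own backslash escape.
    static boolean isDigits(String input) {
        return input.matches("\\d+");
    }

    // Matching one literal backslash takes four backslashes in source:
    // the literal "\\\\" is the two-character regex \\ (an escaped backslash).
    static boolean containsBackslash(String input) {
        return Pattern.compile("\\\\").matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(isDigits("2024"));              // true
        System.out.println(isDigits("20a4"));              // false
        System.out.println(containsBackslash("C:\\temp")); // true
    }
}
```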
In addition, each input string requires its own Matcher, unless you call one of the matches() convenience methods, which create the matcher for you.
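A rough sketch of what "its own matcher" means in practice: the compiled Pattern can be shared, while the Matcher is bound to one input at a time. Instead of creating a new Matcher per string, one can be rebound with reset(CharSequence). The class name and helper below are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherPerInput {
    // One compiled pattern, shared by every input.
    static final Pattern WORD_WITH_SE = Pattern.compile("\\w*se\\w*");

    // Counts the matches in each input, reusing a single Matcher
    // by pointing it at the next input with reset(CharSequence).
    static int[] countInEach(String... inputs) {
        int[] counts = new int[inputs.length];
        Matcher m = WORD_WITH_SE.matcher("");
        for (int i = 0; i < inputs.length; i++) {
            m.reset(inputs[i]);
            while (m.find()) {
                counts[i]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countInEach("She sells seashells", "by the seashore");
        System.out.println(Arrays.toString(counts)); // [2, 1]
    }
}
```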
Methods
Pattern
static Pattern compile(String regex[, int flags]) // flags optional; combine fields with |
static boolean matches(String regex, CharSequence input) // compiles the regex and matches the whole input in one call
Matcher matcher(CharSequence input) // creates a Matcher bound to the input
String[] split(CharSequence input[, int limit]) // limit optional
static String quote(String s) // returns a literal pattern String for the specified String
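As a quick illustration of the Pattern methods listed above, here is a minimal sketch. The class, the helper names, and the sample inputs are my own and only there for demonstration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PatternMethodsDemo {
    // Static shortcut: compiles the regex and matches the whole input.
    static boolean wholeInputMatches(String input) {
        return Pattern.matches("\\w*se\\w*", input);
    }

    // split with a limit of 2 splits at the first separator only.
    static String[] splitFirst(String input) {
        return Pattern.compile("\\s*,\\s*").split(input, 2);
    }

    // quote turns the needle into a literal pattern, so "." is just a dot.
    static boolean containsLiteral(String haystack, String needle) {
        return Pattern.compile(Pattern.quote(needle)).matcher(haystack).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeInputMatches("seashore"));               // true
        System.out.println(Arrays.toString(splitFirst("a , b,c")));      // [a, b,c]
        System.out.println(containsLiteral("price: 1.50", "1.5"));       // true
        System.out.println(containsLiteral("price: 1x50", "1.5"));       // false
    }
}
```

Without Pattern.quote, the pattern 1.5 would also match "1x5", since the dot would be treated as a wildcard.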
Matcher
// group names are declared in the regex as (?<name>pattern)
int start([int group | String name]) // argument optional
int end([int group | String name]) // argument optional.
boolean find([int start]) // start is optional.
String group([int group | String name]) // argument optional.
Matcher reset()
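Here is a small sketch of the Matcher methods with named groups. The key=value pattern and the class name are assumptions chosen for the example, not something from a real project.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupDemo {
    // (?<key>...) and (?<value>...) declare two named capturing groups.
    static final Pattern PAIR = Pattern.compile("(?<key>\\w+)=(?<value>\\w+)");

    // Describes the first key=value pair found in the input, with its
    // start (inclusive) and end (exclusive) offsets.
    static String describeFirstPair(String input) {
        Matcher m = PAIR.matcher(input);
        if (!m.find()) {
            return "no match";
        }
        return m.group("key") + " -> " + m.group("value")
                + " at [" + m.start() + ", " + m.end() + ")";
    }

    // Offset at which the named group "value" starts, or -1 without a match.
    static int valueStart(String input) {
        Matcher m = PAIR.matcher(input);
        return m.find() ? m.start("value") : -1;
    }

    public static void main(String[] args) {
        System.out.println(describeFirstPair("set color=red now")); // color -> red at [4, 13)
        System.out.println(valueStart("set color=red now"));        // 10
    }
}
```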
String
boolean matches(String regex)
String replaceAll(String regex, String replacement)
String[] split(String regex[, int limit]) // limit optional
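The String shortcuts need no explicit Pattern or Matcher. A minimal sketch, with made-up helper names:

```java
import java.util.Arrays;

class StringRegexDemo {
    // matches() is true only if the whole string fits the regex.
    static boolean isWord(String s) {
        return s.matches("\\w+");
    }

    // replaceAll() replaces every match, here each run of whitespace.
    static String collapseWhitespace(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    // split() with a limit of 2 returns at most two pieces.
    static String[] headAndRest(String csv) {
        return csv.split(",", 2);
    }

    public static void main(String[] args) {
        System.out.println(isWord("seashore"));                          // true
        System.out.println(isWord("sea shore"));                         // false
        System.out.println(collapseWhitespace("  She   sells \t shells ")); // She sells shells
        System.out.println(Arrays.toString(headAndRest("a,b,c")));       // [a, b,c]
    }
}
```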
There are more methods.
Usages
Check whether a match exists
String str = "She sells seashells by the seashore";
String reg = "\\w*se\\w*";
Pattern p = Pattern.compile(reg);
Matcher m = p.matcher(str);
System.out.println(m.find()); // true
Check whether the whole string matches
String str = "She sells seashells by the seashore";
String reg = "\\w*se\\w*";
Pattern p = Pattern.compile(reg);
Matcher m = p.matcher(str);
System.out.println(m.matches()); // false: only true when
// the pattern matches the entire string,
// with no remainder.
System.out.println("seashore".matches(reg)); // true
or,
String str = "She sells seashells by the seashore";
String reg = "\\w*se\\w*";
System.out.println(Pattern.matches(reg, str)); // false
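To put find() and matches() side by side, here is a sketch that also includes lookingAt(), a Matcher method not covered in this note that has to match at the start of the input but may leave a remainder. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class MatchKindsDemo {
    // Returns { find, lookingAt, matches } for the given regex and input.
    // A fresh Matcher is used for each check so the results are independent.
    static boolean[] check(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        return new boolean[] {
            p.matcher(input).find(),      // a match anywhere in the input
            p.matcher(input).lookingAt(), // a match anchored at the start only
            p.matcher(input).matches()    // a match covering the whole input
        };
    }

    public static void main(String[] args) {
        String reg = "\\w*se\\w*";
        System.out.println(Arrays.toString(
                check(reg, "She sells seashells by the seashore"))); // [true, false, false]
        System.out.println(Arrays.toString(
                check(reg, "seashore by the sea")));                 // [true, true, false]
        System.out.println(Arrays.toString(
                check(reg, "seashore")));                            // [true, true, true]
    }
}
```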
Array of all matches
String str = "She sells seashells by the seashore";
String reg = "\\w*se\\w*";
Pattern p = Pattern.compile(reg);
Matcher m = p.matcher(str);
List<String> matches = new ArrayList<>();
while (m.find()) {
    matches.add(m.group());
}
System.out.println(matches); // [sells, seashells, seashore]
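If you are on Java 9 or later (an assumption; this note does not say which version it targets), Matcher.results() gives a stream of matches, so the loop can be replaced by a stream pipeline. A minimal sketch with a made-up helper name:

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class AllMatchesDemo {
    // Collects every match of regex in input, in order of appearance.
    // Requires Java 9+ because of Matcher.results().
    static List<String> allMatches(String regex, String input) {
        return Pattern.compile(regex)
                .matcher(input)
                .results()
                .map(MatchResult::group)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(allMatches("\\w*se\\w*",
                "She sells seashells by the seashore")); // [sells, seashells, seashore]
    }
}
```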
Count number of matches
String str = "She sells seashells by the seashore";
String reg = "\\w*se\\w*";
Pattern p = Pattern.compile(reg);
Matcher m = p.matcher(str);
int counter = 0;
while (m.find()) {
    counter++;
}
System.out.println(counter); // 3
I found that shorter, code-golf style alternatives to this loop don’t work well in all cases.
The same is true for collecting the array of matches.
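I can't say which shortcuts were tried here, but one plausible example of a one-liner that breaks is counting with split(): trailing empty strings are dropped from the result, so a match at the very end of the input goes missing. The sketch below compares it with the find() loop; both helper names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CountPitfallDemo {
    // Reliable: walk through the matches one by one.
    static int countWithFind(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int counter = 0;
        while (m.find()) {
            counter++;
        }
        return counter;
    }

    // Tempting shortcut: number of pieces minus one.
    // Wrong when the input ends with a match, because split()
    // removes trailing empty strings.
    static int countWithSplit(String regex, String input) {
        return input.split(regex).length - 1;
    }

    public static void main(String[] args) {
        String reg = "\\w*se\\w*";
        String str = "She sells seashells by the seashore";
        System.out.println(countWithFind(reg, str));  // 3
        System.out.println(countWithSplit(reg, str)); // 2, one match is lost
        System.out.println(countWithSplit(reg, "She sells seashells today")); // 2, correct here
    }
}
```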
Flags
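All flags are fields of the Pattern class.
In Java, you can think of every regex as global: there is no equivalent of JavaScript’s g flag, and each call to find() moves on to the next match.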
Java | JavaScript equivalent | Explanation |
---|---|---|
CANON_EQ | | Enables canonical equivalence. |
CASE_INSENSITIVE | i | Enables case-insensitive matching. |
COMMENTS | | Whitespace and comments are ignored in the pattern. |
DOTALL | s | Dot matches line terminators as well. |
LITERAL | | Enables literal parsing of the pattern. |
MULTILINE | m | Enables multiline mode. |
UNICODE_CASE | u | Enables Unicode-aware case folding. |
UNIX_LINES | | Only the ‘\n’ line terminator is recognized in the behaviour of ., ^, and $. |
There may be JavaScript equivalents I’ve missed.
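As a sketch of how the flags are used: they are combined with | in Pattern.compile, and some of them can also be switched on inside the regex itself, for example (?i) for CASE_INSENSITIVE. The class, helper names, and sample text are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagsDemo {
    // CASE_INSENSITIVE | MULTILINE: ^ matches at the start of every line,
    // and "she" also matches "She" and "SHE".
    static int countLinesStartingWithShe(String text) {
        Pattern p = Pattern.compile("^she\\b",
                Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        Matcher m = p.matcher(text);
        int counter = 0;
        while (m.find()) {
            counter++;
        }
        return counter;
    }

    // The same case-insensitivity, enabled inline with (?i).
    static boolean startsWithSheSells(String input) {
        return input.matches("(?i)she sells.*");
    }

    public static void main(String[] args) {
        System.out.println(countLinesStartingWithShe(
                "She sells\nseashells\nSHE shore"));                 // 2
        System.out.println(startsWithSheSells("SHE SELLS seashells")); // true
    }
}
```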