RegExp Cheat Sheet - One Piece Of Script

Regular expression syntax quick reference

Flags

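i - case-insensitive
g - global: find every match in the string instead of stopping at the first
m - multiline: ^ and $ also match at the start and end of each line
y - sticky: match only at the position given by lastIndex
u - Unicode mode: enables code point escapes (\u{…})

Flags can be combined (/test/ig)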
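
A minimal sketch of each flag in action. The sample strings and variable names are illustrative assumptions, not part of the original notes.

```javascript
// i: case-insensitive
const ci = /test/i.test("TEST");                      // true

// g: find every occurrence, not just the first
const all = "test Test test".match(/test/g);          // ["test", "test"]

// i and g combined
const allCi = "test Test test".match(/test/ig);       // ["test", "Test", "test"]

// m: ^ and $ also match at line breaks
const lines = "one\ntwo".match(/^\w+$/gm);            // ["one", "two"]

// y: sticky, the match must start exactly at lastIndex
const sticky = /test/y;
sticky.lastIndex = 1;
const stickyMiss = sticky.test("a test");             // false: "test" starts at index 2
sticky.lastIndex = 2;
const stickyHit = sticky.test("a test");              // true

// u: Unicode mode, "." consumes a whole code point instead of one UTF-16 unit
const withU = /^.$/u.test("\u{1F600}");               // true
const withoutU = /^.$/.test("\u{1F600}");             // false
```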

Matching categories

Exact matching

/test/ matches exactly that sequence of characters, one directly after another.

Matching a set of characters

[abc] anything in set
[^abc] anything but set
[abcde] === [a-e]
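
The three set forms above can be tried out as follows; the input strings are made up for illustration.

```javascript
// [abc]: any one character from the set
const inSet = /[abc]/.test("cat");            // true, "c" is in the set

// [^abc]: any one character NOT in the set
const notInSet = /[^abc]/.test("abc");        // false, every character is in the set

// [a-e]: a range, same as [abcde]
const range = "bead fog".match(/[a-e]+/g);    // ["bead"]
```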

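Escaping

\[ and \] match the literal characters [ and ]. More generally, a backslash in front of a special character makes it match literally.

Start and end anchors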
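
A short sketch of escaping, assuming the usual JavaScript behavior that an unescaped [] is an empty character class that can never match.

```javascript
// Escaped brackets match the literal characters "[" and "]"
const literalBrackets = /\[\]/.test("arr[]");   // true

// Unescaped, [] is an empty set and matches nothing
const emptySet = /[]/.test("arr[]");            // false

// The same rule applies to other special characters, such as "."
const parts = "a.b".split(/\./);                // ["a", "b"]
```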

/^test$/
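
As a quick illustration (inputs are assumed examples), anchoring both ends forces the whole string to match, while the unanchored pattern only needs to occur somewhere inside it.

```javascript
const anchored = /^test$/;

const whole = anchored.test("test");          // true
const prefix = anchored.test("testing");      // false: extra characters after "test"
const suffix = anchored.test("a test");       // false: extra characters before "test"

// Without anchors the pattern may match anywhere in the string
const anywhere = /test/.test("a testing");    // true
```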

Controlling repetition

/t?est/ - ? zero or one time
/t+est/ - + one or more times
/t*est/ - * zero or more times
/t{4}est/ - t{n} exactly n times
/t{4,10}/ - t{n,m} between n and m times
/t{4,}/ - t{n,} n or more times
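
A small sketch of the quantifiers, with ^ and $ added so that each test checks the whole string. The shorter counts in the last two patterns are chosen only to keep the samples readable.

```javascript
const results = {
  optionalNone: /^t?est$/.test("est"),       // true: zero t
  optionalTwo:  /^t?est$/.test("ttest"),     // false: two t is too many
  plusNone:     /^t+est$/.test("est"),       // false: + needs at least one t
  plusMany:     /^t+est$/.test("tttest"),    // true
  starNone:     /^t*est$/.test("est"),       // true: * accepts zero t
  exactFour:    /^t{4}est$/.test("ttttest"), // true: exactly four t
  exactThree:   /^t{4}est$/.test("tttest"),  // false: only three t
  rangeLow:     /^t{2,3}$/.test("tt"),       // true
  rangeHigh:    /^t{2,3}$/.test("tttt"),     // false: four is above the maximum
  atLeast:      /^t{2,}$/.test("ttttt"),     // true
};
```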

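Greedy or non-greedy? A greedy quantifier matches as many characters as it can while still letting the rest of the pattern match. A non-greedy (lazy) quantifier, written by appending ?, stops as soon as the pattern is satisfied.

Ex. Matching "aaa" with /a+/ (greedy) and /a+?/ (non-greedy): the former matches aaa, while the latter matches only a, because a single a already satisfies a+.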
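
The same comparison in code, followed by a second case where the difference tends to matter in practice. The `html` string is an assumed sample.

```javascript
const greedy = "aaa".match(/a+/)[0];               // "aaa"
const lazy = "aaa".match(/a+?/)[0];                // "a"

const html = "<b>one</b><b>two</b>";

// Greedy .* runs to the LAST closing tag
const greedyTag = html.match(/<b>.*<\/b>/)[0];     // "<b>one</b><b>two</b>"

// Lazy .*? stops at the FIRST closing tag
const lazyTag = html.match(/<b>.*?<\/b>/)[0];      // "<b>one</b>"
```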

Predefined character classes and escapes

\t - horizontal tab
[\b] - backspace (only inside a character class; elsewhere \b is a word boundary)
\v - vertical tab
\f - form feed
\r - carriage return
\n - new line
\cA : \cZ - Control characters (ASCII codes)
\u0000 : \uFFFF - Unicode hexadecimal
\x00 : \xFF - ASCII hexadecimal
. - Any character, except for line terminators (\n, \r, \u2028, \u2029)
\d - Any decimal digit; equivalent to [0-9]
\D - Any character but a decimal digit; equivalent to [^0-9]
\w - Any alphanumeric character including underscore; equivalent to [A-Za-z0-9_]
\W - Any character but alphanumeric and underscore characters; equivalent to [^A-Za-z0-9_]
\s - Any whitespace character (space, tab, form feed, and so on)
\S - Any character but a whitespace character
\b - A word boundary
\B - Not a word boundary (inside a word)
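
A few of the classes above, sketched against made-up inputs.

```javascript
const digits = "a1 b22".match(/\d+/g);              // ["1", "22"]
const words = "foo_bar baz".match(/\w+/g);          // ["foo_bar", "baz"]
const tokens = "a \t b".split(/\s+/);               // ["a", "b"]

// \b matches at the edge of a word, \B only inside one
const wholeWord = /\bcat\b/.test("a cat sat");      // true
const insideWord = /\bcat\b/.test("concatenate");   // false
const notBoundary = /\Bcat\B/.test("concatenate");  // true

// "." matches a space but not a newline
const dotSpace = /^.$/.test(" ");                   // true
const dotNewline = /^.$/.test("\n");                // false

// Hexadecimal escapes: both denote "A"
const hexEscape = /\x41/.test("A");                 // true
const unicodeEscape = /\u0041/.test("A");           // true
```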

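Grouping

Grouping treats a sequence of characters as a unit. For example, /(ab)+/ matches ab, as a whole, one or more times.
Grouping has a second function: capturing.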

Alternation

For example, /(ab)+|(bc)+/ matches either one or more occurrences of ab or one or more occurrences of bc.
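
A sketch of grouping and alternation together. The inputs are assumed; note that the group belonging to the branch that did not take part in the match is left undefined.

```javascript
// Grouping: + applies to "ab" as a unit
const repeated = /^(ab)+$/;
const okRepeat = repeated.test("ababab");     // true
const badRepeat = repeated.test("aba");       // false: trailing "a" is not a full "ab"

// Alternation: either branch may match
const either = /(ab)+|(bc)+/;
const viaSecond = "xbcbc".match(either);
// viaSecond[0] === "bcbc", viaSecond[1] === undefined, viaSecond[2] === "bc"
const viaFirst = "abab".match(either);
// viaFirst[0] === "abab", viaFirst[1] === "ab", viaFirst[2] === undefined
```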

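Backreferences

A backreference refers to an earlier capture. In /^([bdn])a\1/, for example, \1 is a backreference: what it has to match is only known at match time, because it is whatever the first group captured. "bab" passes, while "ban" does not.

The classic backreference example is matching XML tags: /<(\w+)>(.+)<\/\1>/, which guarantees that the opening and closing tags have the same name.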
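
Both patterns from this section, run against a few assumed inputs.

```javascript
const sameLetter = /^([bdn])a\1/;
const bab = sameLetter.test("bab");       // true: \1 must repeat the captured "b"
const ban = sameLetter.test("ban");       // false: "n" is not the captured "b"
const dad = sameLetter.test("dad");       // true

const tagPair = /<(\w+)>(.+)<\/\1>/;
const m = "<em>hi</em>".match(tagPair);
// m[1] === "em" (tag name), m[2] === "hi" (content)
const mismatched = tagPair.test("<em>hi</b>");   // false: closing tag differs
```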

Getting captures with exec

while ((match = tag.exec(html)) !== null) {...}
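
A runnable sketch of that loop. The names `html`, `tag` and `match` come from the line above; the pattern, the sample markup and the `found` array are assumptions made for illustration. The g flag matters here: without it, lastIndex never advances and the loop would not terminate.

```javascript
const html = "<b>Hello</b> <i>world</i>";   // assumed sample input
const tag = /<(\w+)>(.*?)<\/\1>/g;          // assumed pattern; g makes exec resume at lastIndex
const found = [];
let match;

while ((match = tag.exec(html)) !== null) {
  // match[0] is the full match; match[1], match[2] are the captures
  found.push({ name: match[1], text: match[2], index: match.index });
}
// found holds one entry per tag pair:
//   { name: "b", text: "Hello", index: 0 }
//   { name: "i", text: "world", index: 13 }
```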

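Referencing captures

// camelCase to hyphenated
function makeconnect(str){
	return str.replace(/([A-Z])/g, "-$1").toLowerCase();
}
// hyphenated to camelCase
function upper(all, letter) { return letter.toUpperCase();}
function makeCamel(str){
	// capture only the letter, so the hyphen is dropped from the result
	return str.replace(/-([a-z])/g, upper);
}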
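
The two ways of referencing a capture in replace, sketched as standalone one-liners: `$1` inside a replacement string, and a replacer function that receives the full match followed by each capture. The inputs are assumed examples.

```javascript
// $1 in the replacement string stands for the first capture
const dashed = "backgroundColor".replace(/([A-Z])/g, "-$1").toLowerCase();
// "background-color"

// A replacer function gets (fullMatch, capture1, ...)
const camel = "background-color".replace(/-([a-z])/g, (all, letter) => letter.toUpperCase());
// "backgroundColor"

// Several captures can be reordered with $1, $2, ...
const swapped = "Doe, John".replace(/(\w+), (\w+)/, "$2 $1");
// "John Doe"
```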

Non-capturing groups

(?: …) groups without capturing: the group is skipped when captures are numbered.
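
A brief comparison, using an assumed input, of how a non-capturing group changes the numbering of the captures that remain.

```javascript
// With a capturing group, (ab) takes slot 1 and (c) takes slot 2
const capturing = "ababc".match(/(ab)+(c)/);
// capturing[1] === "ab", capturing[2] === "c"

// With (?:ab), the group still repeats as a unit but (c) moves to slot 1
const nonCapturing = "ababc".match(/(?:ab)+(c)/);
// nonCapturing[1] === "c", and there is no slot 2
```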