Day 5: Regular Expressions (re)

The re module provides regular expression operations in Python.

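Characters: . matches any character except a newline

\w matches a letter, digit, or underscore (with Python 3 str patterns, also other Unicode word characters such as Chinese characters)

\s matches any whitespace character

\d matches a digit

\b matches the start or end of a word

^ matches the start of the string

$ matches the end of the string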
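As a quick illustration of the character classes and anchors above, here is a minimal sketch. The sample string and variable names are made up for this example.

```python
import re

# Hypothetical sample text, chosen only to exercise each class.
sample = "user_42 paid 19 dollars\n"

# \w+ grabs runs of word characters (letters, digits, underscore).
words = re.findall(r"\w+", sample)

# \d+ grabs runs of digits, including the "42" inside "user_42".
numbers = re.findall(r"\d+", sample)

# \b requires a word boundary on both sides, so "42" is skipped
# because "_" and "4" are both word characters.
standalone = re.findall(r"\b\d+\b", sample)

# ^ anchors at the start, and "." stops at the newline.
first_line = re.match(r"^.*", sample).group()

print(words)
print(numbers)
print(standalone)
print(first_line)
```

Note how `\b` changes the result: the digits inside `user_42` are not bounded by a word boundary on the left, so only `19` survives.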

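Quantifiers: * repeats zero or more times

+ repeats one or more times

? repeats zero or one time

{n} repeats exactly n times

{n,} repeats n or more times

{n,m} repeats between n and m times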
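The quantifiers can be compared side by side with a small sketch. The test strings below are invented for illustration; `\b` is added so that each match has to cover a whole word.

```python
import re

text = "a aa aaa aaaa"

# * allows zero repetitions of "b"; + requires at least one.
star = re.findall(r"ab*", "a ab abb")
plus = re.findall(r"ab+", "a ab abb")

# ? makes the trailing "s" optional.
optional_s = re.findall(r"\bcats?\b", "cat cats catss")

# {n}, {n,} and {n,m} bound the repetition count.
exactly_two = re.findall(r"\ba{2}\b", text)
two_or_more = re.findall(r"\ba{2,}\b", text)
two_to_three = re.findall(r"\ba{2,3}\b", text)

print(star, plus, optional_s)
print(exactly_two, two_or_more, two_to_three)
```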

IP address:
^(25[0-5]|2[0-4]\d|[0-1]?\d?\d)(\.(25[0-5]|2[0-4]\d|[0-1]?\d?\d)){3}$
Mobile phone number (11 digits):
^1[3458][0-9]\d{8}$
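A minimal sketch of how such patterns might be used for validation. The helper names `is_ipv4` and `is_mobile` are hypothetical, and the phone rule (an 11-digit number starting with 1 followed by 3, 4, 5, or 8) is an assumption read off the pattern above, not an authoritative numbering rule.

```python
import re

# One octet: 250-255, 200-249, or 0-199 (leading zeros tolerated).
OCTET = r"(?:25[0-5]|2[0-4]\d|[01]?\d?\d)"

# Four octets separated by dots, anchored at both ends.
IP_RE = re.compile(r"^" + OCTET + r"(?:\." + OCTET + r"){3}$")

# Assumed rule: "1", then one of 3/4/5/8, then nine more digits.
PHONE_RE = re.compile(r"^1[3458]\d{9}$")


def is_ipv4(text):
    """Return True if text looks like a dotted-quad IPv4 address."""
    return IP_RE.match(text) is not None


def is_mobile(text):
    """Return True if text matches the assumed 11-digit mobile format."""
    return PHONE_RE.match(text) is not None


print(is_ipv4("192.168.1.1"), is_ipv4("256.1.1.1"))
print(is_mobile("13812345678"), is_mobile("12812345678"))
```

Inside a character class, `|` is a literal character, so `[3458]` is used here rather than `[3|4|5|8]`.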

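1. match(pattern, string, flags=0)

Tries to match the pattern at the start of the string only, and returns a single match (or None if the start of the string does not match).

    import re

    obj = re.match(r'\d+', '123uuasf')
    if obj:
        print(obj.group())  # 123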

    # flags (excerpt from the re module source; available as re.I, re.M, and so on)
    I = IGNORECASE = sre_compile.SRE_FLAG_IGNORECASE  # ignore case
    L = LOCALE = sre_compile.SRE_FLAG_LOCALE  # assume current 8-bit locale
    U = UNICODE = sre_compile.SRE_FLAG_UNICODE  # assume unicode locale
    M = MULTILINE = sre_compile.SRE_FLAG_MULTILINE  # make anchors look for newline
    S = DOTALL = sre_compile.SRE_FLAG_DOTALL  # make dot match newline
    X = VERBOSE = sre_compile.SRE_FLAG_VERBOSE  # ignore whitespace and comments
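To show what the most common flags change in practice, here is a small sketch. The sample text and the phone-like pattern are invented for illustration.

```python
import re

text = "Hello\nworld"

# re.I: case-insensitive matching.
ci = re.findall(r"hello", text, re.I)

# re.M: ^ also matches right after each newline.
line_starts = re.findall(r"^\w+", text, re.M)

# Without re.S the dot stops at the newline; with re.S it crosses it.
no_dotall = re.match(r".+", text).group()
dotall = re.match(r".+", text, re.S).group()

# re.X: whitespace and comments inside the pattern are ignored.
verbose = re.match(r"""
    (\d{3})   # first block of digits
    -
    (\d{4})   # second block of digits
""", "555-1234", re.X).groups()

print(ci, line_starts)
print(repr(no_dotall), repr(dotall))
print(verbose)
```

Flags can be combined with `|`, for example `re.I | re.M`.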

2. search(pattern, string, flags=0)

Scans the whole string for the first location where the pattern matches, and returns a single match.

    import re

    obj = re.search(r'\d+', 'u123uu888asf')
    if obj:
        print(obj.group())  # 123

3. group and groups

group() or group(0) returns the whole match, group(n) returns the n-th captured group, and groups() returns all captured groups as a tuple.

    import re

    a = "123abc456"
    m = re.search(r"([0-9]*)([a-z]*)([0-9]*)", a)
    print(m.group())   # 123abc456
    print(m.group(0))  # 123abc456
    print(m.group(1))  # 123
    print(m.group(2))  # abc
    print(m.groups())  # ('123', 'abc', '456')

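4. findall(pattern, string, flags=0)

The two functions above return a single match, that is, only one matching part of the string. To get every part of the string that matches the pattern, use findall.

    import re

    obj = re.findall(r'\d+', 'fa123uu888asf')
    print(obj)  # ['123', '888']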
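One detail worth knowing is that the return value of findall changes when the pattern contains capture groups. The following sketch uses a made-up `key=value` string to show the three cases.

```python
import re

text = "x=1, y=22, z=333"

# No groups: each item is the whole match.
whole = re.findall(r"\w=\d+", text)

# One group: each item is just that group's text.
values = re.findall(r"\w=(\d+)", text)

# Several groups: each item is a tuple of the groups.
pairs = re.findall(r"(\w)=(\d+)", text)

print(whole)
print(values)
print(pairs)
```

If the whole match is needed while still grouping for alternation or repetition, a non-capturing group `(?:...)` avoids this change in shape.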

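5. sub(pattern, repl, string, count=0, flags=0)

Replaces the parts of the string that match the pattern.

    import re

    content = "123abc456"
    new_content = re.sub(r'\d+', 'sb', content)
    # new_content = re.sub(r'\d+', 'sb', content, 1)  # replace only the first match
    print(new_content)  # sbabcsb

This is more powerful than str.replace, because the text to replace is described by a pattern rather than a fixed substring.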
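Two things that str.replace cannot do are reusing captured groups in the replacement and computing the replacement with a function. A minimal sketch, with invented sample strings and a hypothetical helper named `double`:

```python
import re

# Backreferences: reorder the captured parts of a date.
date = "2024-01-31"
swapped = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", date)


def double(match):
    """Return the matched number multiplied by two, as text."""
    return str(int(match.group()) * 2)


# Function replacement: called once per match.
doubled = re.sub(r"\d+", double, "3 apples and 10 pears")

# count limits how many matches are replaced.
limited = re.sub(r"\d+", "sb", "123abc456", count=1)

print(swapped)
print(doubled)
print(limited)
```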

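6. split(pattern, string, maxsplit=0, flags=0)

Splits the string at each match of the pattern.

    import re

    # Split on every literal "*".
    content = "'1 - 2 * ((60-30+1*(9-2*5/3+7/3*99/4*2998+10*568/14))-(-4*3)/(16-3*2) )'"
    new_content = re.split(r'\*', content)
    # new_content = re.split(r'\*', content, 1)  # split only at the first "*"
    print(new_content)

    # Split on any run of arithmetic operators.
    content = "'1 - 2 * ((60-30+1*(9-2*5/3+7/3*99/4*2998+10*568/14))-(-4*3)/(16-3*2) )'"
    new_content = re.split(r'[\+\-\*\/]+', content)
    # new_content = re.split(r'\*', content, 1)
    print(new_content)

    # Remove whitespace, then split once around the first innermost
    # parenthesized expression of the form (a op b). The capture group
    # keeps the contents of the parentheses in the result.
    inpp = '1-2*((60-30 +(-40-5)*(9-2*5/3 + 7 /3*99/4*2998 +10 * 568/14 )) - (-4*3)/ (16-3*2))'
    inpp = re.sub(r'\s+', '', inpp)
    new_content = re.split(r'\(([\+\-\*\/]?\d+[\+\-\*\/]?\d+)\)', inpp, 1)
    print(new_content)

This is more powerful than str.split, because the delimiter is described by a pattern rather than a fixed string.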
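The difference from str.split is easiest to see on a short expression. In this sketch (the expression is made up), one pattern covers several delimiters, and wrapping it in a capture group keeps the delimiters in the result.

```python
import re

expr = "10+2*3-4"

# One pattern covers several different delimiters.
operands = re.split(r"[+\-*/]", expr)

# A capture group keeps the delimiters in the output list.
tokens = re.split(r"([+\-*/])", expr)

# maxsplit stops after the first split.
first_only = re.split(r"[+\-*/]", expr, maxsplit=1)

print(operands)
print(tokens)
print(first_only)
```

Keeping the operators, as in `tokens`, is a common first step when writing a small calculator over such expressions.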

7. Matching with groups

    import re

    aaa = "111,222,333"
    bbb = re.search(r'(\d+,)(\d+),(\d+)', aaa)
    print(bbb.group(1))  # 111,
    print(bbb.group(2))  # 222
    print(bbb.group(3))  # 333

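Matching an IP address

    import re

    addr = "192.168.1.1"
    # One to three digits with an optional dot, repeated four times.
    m = re.match(r"([0-9]{1,3}\.?){4}", addr).group()
    print(m)  # 192.168.1.1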

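re.match only matches at the start of the string: if the start of the string does not fit the regular expression, the match fails and the function returns None. re.search scans the whole string until it finds a match.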
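The difference can be checked directly on one string. The sketch below also includes `re.fullmatch`, which is not mentioned above but is a standard function of the re module that requires the entire string to match.

```python
import re

text = "abc123"

# match: the digits are not at the start, so there is no match.
at_start = re.match(r"\d+", text)

# search: finds the first match anywhere in the string.
anywhere = re.search(r"\d+", text)

# fullmatch: the pattern must cover the whole string.
whole_ok = re.fullmatch(r"\w+", text)
whole_fail = re.fullmatch(r"[a-z]+", text)

print(at_start)
print(anywhere.group())
print(whole_ok.group(), whole_fail)
```

Because match and search return None on failure, the result should be checked before calling `.group()` on it.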

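Another example: matching everything except certain characters

    import re

    aaa = "www.m.biyao.com"
    # Matches "www.": one or more characters other than ".", followed by a "."
    bbb = re.search(r"[^\.]+\.", aaa).group()
    print(bbb)  # www.
    # Matches "www.m.": one or more characters other than "b"
    bbb = re.search(r"[^b]+", aaa).group()
    print(bbb)  # www.m.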
