Python Regular Expressions in Practice


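Key concepts (the re module):

The re module provides a number of important functions, plus a match object and a pattern object.

1. Important functions

re.compile: builds a pattern object

re.match: tests whether the string matches, starting at its beginning

re.search: finds the first matching substring

re.split: splits the string at each matched substring and returns a list

re.findall: scans the string and returns the matched substrings as a list

re.sub: replaces every matched substring and returns the resulting string

Apart from compile, each of these takes either a regular expression string or a compiled pattern object as its first argument. A pattern object has a corresponding pattern.*** method for each function, whereas passing a regular expression string means there is no pattern object of your own to reuse.

match and search return a match object (or None when nothing matches), which is why the parentheses in a regular expression matter: they define the groups the match object exposes.

split and findall give complementary results: findall returns the parts that match, and split returns the parts in between. sub replaces exactly what findall finds. Parentheses () make things a little special: findall then returns the group contents instead of the whole match, and split adds the group contents to its result.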
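How findall, split and sub relate can be sketched as follows. The sample string and the patterns here are made up for illustration and are not part of the examples later in this post.

```python
import re

text = 'one1two22three'   # hypothetical sample input
digits = r'\d+'

matched = re.findall(digits, text)     # the parts that match
between = re.split(digits, text)       # the parts between the matches
replaced = re.sub(digits, '-', text)   # the matches swapped for '-'

print(matched)    # ['1', '22']
print(between)    # ['one', 'two', 'three']
print(replaced)   # one-two-three

# With a capturing group, findall returns the group text instead of the
# whole match, and split also includes the group text in its result.
grouped = r'(\d)\d*'
print(re.findall(grouped, text))   # ['1', '2']
print(re.split(grouped, text))     # ['one', '1', 'two', '2', 'three']
```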

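2. The match object

This is the object returned by match and search. It carries several important attributes and methods. The group method returns the text matched by the parenthesized groups of the regular expression: group(0) is the entire match, and groups() returns a tuple of all groups, starting from group 1.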
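A small sketch of these accessors, using a hypothetical date-like string chosen only for illustration:

```python
import re

# Hypothetical example: pull the parts out of a date-like string.
m = re.search(r'(\d{4})-(\d{2})', 'released 2020-05, updated later')

if m is not None:        # match/search return None when nothing matches
    print(m.group(0))    # 2020-05  (the entire match)
    print(m.group(1))    # 2020     (first parenthesized group)
    print(m.group(2))    # 05       (second parenthesized group)
    print(m.groups())    # ('2020', '05')
    print(m.span())      # (9, 16)  start and end index of the match
```

Checking for None before calling group avoids an AttributeError when the pattern does not match.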

3. The pattern object

A compiled regular expression. It offers methods that mirror the functions of the re module.
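The correspondence between module functions and pattern methods might look like this; the pattern and the input string are assumptions made up for the sketch.

```python
import re

word = re.compile(r'[a-z]+')   # compile once, reuse many times
text = 'ab12cd'                # hypothetical sample input

# Each module-level function has a method of the same name on the pattern.
assert word.findall(text) == re.findall(r'[a-z]+', text) == ['ab', 'cd']
assert word.sub('_', text) == re.sub(r'[a-z]+', '_', text) == '_12_'
assert word.split(text) == re.split(r'[a-z]+', text) == ['', '12', '']

print(word.pattern)                 # [a-z]+  (the original expression string)
print(word.search(text).group())    # ab
```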

The rest of this post demonstrates these with concrete code:

1. Testing whether a string matches

import re

string = 'aabbxyz'
pattern = re.compile(r'a(.*)b')
# re.match only tries to match at the beginning of the string
m = re.match(pattern, string)
if m:
    print(m.group())
else:
    print('not match')

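This prints aabb. With string = 'xyzaabbxyz' the result is not match instead. So match must succeed at the very start of the string, but it is satisfied as soon as the regular expression itself is exhausted; the rest of the string is ignored. To require the match to extend to the end of the string, append $ to the pattern.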
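The anchoring behavior can be checked directly. In this sketch re.fullmatch (available since Python 3.4) is shown as an alternative to appending $; it is not used in the original example.

```python
import re

pattern = r'a(.*)b'

print(re.match(pattern, 'aabbxyz').group())      # aabb: trailing 'xyz' is ignored
print(re.match(pattern, 'xyzaabbxyz'))           # None: must match at the start
print(re.match(pattern + '$', 'aabbxyz'))        # None: '$' requires the end
print(re.match(pattern + '$', 'aabb').group())   # aabb
print(re.fullmatch(pattern, 'aabb').group())     # aabb: whole string must match
```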

2. Finding a substring

import re

string = 'xyzaaabxyzabbbxyz'
pattern = re.compile(r'a(.*)b')
m = re.search(pattern, string)
print(m.group())   # the whole match
print(m.groups())  # tuple of the captured groups

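This prints aaabxyzabbb and ('aabxyzabb',). The result is not just the first aaab because the (.*) inside the parentheses is greedy: it consumes as much as it can and stops at the last b. Use the non-greedy form (.*?) to match only aaab.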
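A side-by-side sketch of the greedy and non-greedy forms on the same string:

```python
import re

string = 'xyzaaabxyzabbbxyz'

greedy = re.search(r'a(.*)b', string)    # .* grabs as much as possible
lazy = re.search(r'a(.*?)b', string)     # .*? grabs as little as possible

print(greedy.group(), greedy.groups())   # aaabxyzabbb ('aabxyzabb',)
print(lazy.group(), lazy.groups())       # aaab ('aa',)
```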

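3. Finding all substrings

import re

string = 'xyzaaabxyzabbbxyz'
# no group: findall returns each whole match
pattern = re.compile(r'a.*b')
result = re.findall(pattern, string)
print(result)
# one group: findall returns the text of that group
pattern = re.compile(r'a(.*)b')
result = re.findall(pattern, string)
print(result)
# several groups: findall returns one tuple of groups per match
pattern = re.compile(r'a(.*)b.*a(.*)b')
result = re.findall(pattern, string)
print(result)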

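This prints ['aaabxyzabbb'], ['aabxyzabb'] and [('aa', 'bb')]. So findall does not find every conceivable match. Like search, it scans from left to right, and the position where the next search starts only ever moves forward, so the reported matches never overlap.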
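The left-to-right, non-overlapping scan can be seen in a short sketch. The 'aaaa' input is an extra illustration, and re.finditer is used here only to expose the match positions.

```python
import re

string = 'xyzaaabxyzabbbxyz'

# Non-greedy: the scan resumes right after each match, so two results.
print(re.findall(r'a.*?b', string))   # ['aaab', 'ab']

# Matches never overlap: 'aaaa' holds three overlapping 'aa', but only
# two are reported because the scan never moves backwards.
print(re.findall(r'aa', 'aaaa'))      # ['aa', 'aa']

# finditer yields match objects, so positions are available too.
spans = [m.span() for m in re.finditer(r'a.*?b', string)]
print(spans)                          # [(3, 7), (10, 12)]
```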

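4. Splitting on matches

import re

string = 'xyzaaabxyzabbbxyz'
# no group: only the pieces between matches are returned
pattern = re.compile(r'a.*b')
result = re.split(pattern, string)
print(result)
# one group: the group text is inserted between the pieces
pattern = re.compile(r'a(.*)b')
result = re.split(pattern, string)
print(result)
# two groups: both group texts are inserted
pattern = re.compile(r'a(.*)b.*a(.*)b')
result = re.split(pattern, string)
print(result)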

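This prints ['xyz', 'xyz'], ['xyz', 'aabxyzabb', 'xyz'] and ['xyz', 'aa', 'bb', 'xyz']. Making the first pattern non-greedy ('a.*?b') splits at both runs but gives ['xyz', 'xyz', 'bbxyz'], because the second match stops at the first b. A pattern that consumes the whole run of b characters, such as 'a+b+', is needed to get ['xyz', 'xyz', 'xyz'].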
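The effect of the non-greedy form on split can be verified with a sketch like the one below. The 'a+b+' pattern and the maxsplit argument are additions for illustration, not part of the example above.

```python
import re

string = 'xyzaaabxyzabbbxyz'

# Non-greedy: the second match is just 'ab', so 'bb' stays in the last piece.
print(re.split(r'a.*?b', string))             # ['xyz', 'xyz', 'bbxyz']

# Consuming the whole run of b's gives three clean pieces.
print(re.split(r'a+b+', string))              # ['xyz', 'xyz', 'xyz']

# maxsplit limits how many splits are made.
print(re.split(r'a+b+', string, maxsplit=1))  # ['xyz', 'xyzabbbxyz']
```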

5. Replacing substrings

import re

string = 'xyzaaabxyzabbbxyz'
pattern = re.compile(r'a.*b')
result = re.sub(pattern, 'cc', string)
print(result)
# groups do not change which span gets replaced
pattern = re.compile(r'a(.*)b')
result = re.sub(pattern, 'cc', string)
print(result)
pattern = re.compile(r'a(.*)b.*a(.*)b')
result = re.sub(pattern, 'cc', string)
print(result)

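All three print xyzccxyz: each pattern matches the same span, from the first a to the last b, and sub replaces the whole match regardless of the groups inside it.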
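For comparison, here is a sketch of sub with patterns that match each run separately. The 'a+b+' pattern, the count argument and the \1 backreference are illustrative additions, not taken from the example above.

```python
import re

string = 'xyzaaabxyzabbbxyz'

# One replacement per non-overlapping match.
print(re.sub(r'a+b+', 'cc', string))            # xyzccxyzccxyz

# count limits how many matches are replaced.
print(re.sub(r'a+b+', 'cc', string, count=1))   # xyzccxyzabbbxyz

# \1 in the replacement refers back to the first group of each match.
print(re.sub(r'a(.*?)b', r'[\1]', string))      # xyz[aa]xyz[]bbxyz
```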
