敏感詞過濾及字串編碼問題

# -*-coding:utf-8-*-
importsys
classsenseword():
def__init__(self
,filename):
""":type
filename: 敏感詞文件
"""self.list = 
printsys.stdin.encoding
withopen(filename,
'r')asf:
forlineinf.readlines():
deffilter_0011(self
,word):
"""判斷輸入詞是否為敏感詞
:param
word: 輸入
:return
:"""
word = word.decode(sys.stdin.encoding).encode('utf-8')
foriteminself.list:
item = item.decode('utf-8').encode('utf-8')
ifitem== word:
print'freedom'
returnitem
print'human rights'
returnnone
deffilter_0012(self
,string):
"""若輸入詞是敏感詞則替換
:param
string:
:return
:"""
res = self.filter_0011(string)
ifres != none:
string = string.replace(res,
"*")
printstring
if__name__ == "__main__":
check = senseword("filtered_words.txt")
inputstr = raw_input("請輸入： ").strip()
check.filter_0012(inputstr)

result：

utf-8

請輸入：北京

*freedom

首先要搞清楚，字串在python內部的表示是unicode編碼.

因此，在做編碼轉換時，通常需要以unicode作為中間編碼，即先將其他編碼的字串解碼（decode）成unicode，再從unicode編碼（encode）成另一種編碼。

decode的作用是將其他編碼的字串轉換成unicode編碼，

如str1.decode('gb2312')，表示將gb2312編碼的字串轉換成unicode編碼。

encode的作用是將unicode編碼轉換成其他編碼的字串，

如str2.encode('gb2312')，表示將unicode編碼的字串轉換成gb2312編碼。即：

decode用法：str -> decode('the_coding_of_str') -> unicode

encode用法：unicode -> encode('the_coding_you_want') -> str

判斷字串是什麼格式的編碼，可以用isinstance進行判斷：

isinstance(s, unicode) #用來判斷是否為unicode

終端的輸入編碼：sys.stdin.encoding

終端的輸出編碼：sys.stdout.encoding

php敏感字串過濾 PHP實現敏感詞過濾

1 敏感詞過濾方法 todo 敏感詞過濾，返回結果 param array list 定義敏感詞一維陣列 param string string 要過濾的內容 return string log 處理結果 function sensitive list,string if count 0 else ...

php敏感字串過濾 PHP實現的敏感詞過濾方法

php實現的敏感詞過濾方法,以下是乙份過濾敏感詞的編碼。有需要可以參考參考。todo 敏感詞過濾，返回結果 param array list 定義敏感詞一維陣列 param string string 要過濾的內容 return string log 處理結果 function sensitive ...

關於防SQL注入敏感詞過濾問題

關於對字元的過濾問題sql查詢條件過濾掉單引號是否就安全了呢？在文章最後一段管理員做了敏感字元的過濾，管理員過濾掉了空格，而攻擊者通過來代替空格繞過了過濾字元。感覺很有成就感，呵呵呵呵。真實環境中管理員發現了漏洞不太可能只過濾掉空格的。你不是注入嗎？我把單引號等於號空格一起都過濾掉，看你怎麼...

敏感詞過濾及字串編碼問題

php敏感字串過濾 PHP實現敏感詞過濾

php敏感字串過濾 PHP實現的敏感詞過濾方法

關於防SQL注入敏感詞過濾問題

相關推薦