Automatically generating article summaries (Chinese) with Python

2021-08-25 16:53:44

Enough talk, straight to the code...

# -*- coding: utf-8 -*-
import jieba, copy, re, codecs
from collections import Counter

# summary = pyhanlp.HanLP.extractSummary(text, 3)
# print(summary)

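class summary:
    # Only the closing lines of the helper methods survive in the source.
    # The signatures below are inferred from the calls in main(); the bodies are elided.

    def getkeywords(self, title, sentences, n):
        ...
        print(' '.join(words_topn))
        return words_topn

    def gettopnsentences(self, sentences, keywords, n):
        ...
        return sents_topn

    def sents_sort(self, sents_topn, sentences):
        ...
        return keysents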

    def post_processing(self, keysents):
        # Punctuation in the patterns is ASCII as printed in the source; switch to
        # full-width marks if your corpus uses them.
        # Markers that introduce an enumeration; everything from the marker on is dropped.
        detail_tags = [',一是', ':一是', ',第一,', ':第一,', ',首先,', ';首先,']
        # Leading numbering such as "一、", "3、" or "3 " (the duplicated "三、" alternative is dropped).
        numbering = re.compile(r'^(?:[一二三四五六七八九十]、|\d、|\d )')
        # Note-like sentences such as "注:" or "注1:".
        note = re.compile(r'^注\d*:')
        # Bracketed text at the start of a sentence; non-greedy so only the first pair is removed.
        leading_bracket = re.compile(r'^\[.*?\]')
        # One character followed by a space at the start of a sentence
        # (source comment: "delete ** (the part before the space)", with the word masked).
        leading_token = re.compile(r'^. ')
        # A trailing lead-in clause ending with a colon, e.g. "銀行間債市小幅**,見下圖:".
        trailing_lead_in = re.compile(r',[^,]+:$')

        cleaned = []
        for sent in keysents:
            # Cut an incomplete sentence at the first enumeration marker.
            for tag in detail_tags:
                index = sent.find(tag)
                if index != -1:
                    sent = sent[:index]
            sent = numbering.sub('', sent)
            if note.search(sent):
                # Skip the note instead of calling remove() on the list being iterated.
                continue
            sent = leading_bracket.sub('', sent)
            sent = leading_token.sub('', sent)
            sent = trailing_lead_in.sub('', sent)
            cleaned.append(sent)
        # Update the caller's list in place, as the original code did.
        keysents[:] = cleaned
        return keysents

    def main(self, title, text):
        # Split into sentences, pick 8 keywords, select the top 3 sentences, then sort them.
        sentences = self.cutsentence(text)
        keywords = self.getkeywords(title, sentences, n=8)
        sents_topn = self.gettopnsentences(sentences, keywords, n=3)
        keysents = self.sents_sort(sents_topn, sentences)
        print(keysents)
        return keysents

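if __name__ == '__main__':
    # title and text are not defined in the source listing; assign the
    # article title and body here before running.
    summarizer = summary()
    summarizer.main(title, text)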
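
The listing above calls cutsentence, getkeywords, gettopnsentences and sents_sort, but their bodies are not shown. Below is a minimal sketch of how such a keyword-frequency pipeline might look. It is not the author's implementation. All function names here (simple_tokenize, cut_sentences, get_keywords, get_top_n_sentences, sort_by_position, summarize) and the tokenize parameter are hypothetical. The default tokenizer is a crude stand-in so the example runs with the standard library alone; for real Chinese text you would probably pass jieba.lcut instead and filter stopwords, which this sketch omits.

```python
import re
from collections import Counter


def simple_tokenize(text):
    """Crude stand-in tokenizer: Latin/digit runs and single CJK characters."""
    return re.findall(r'[A-Za-z0-9]+|[\u4e00-\u9fff]', text)


def cut_sentences(text):
    """Split text into sentences, keeping the closing punctuation mark."""
    parts = re.findall(r'[^。!?!?]+[。!?!?]?', text)
    return [p.strip() for p in parts if p.strip()]


def get_keywords(title, sentences, n=8, tokenize=simple_tokenize):
    """Return the n most frequent tokens across the title and the sentences."""
    counts = Counter(tokenize(title))
    for sent in sentences:
        counts.update(tokenize(sent))
    return [word for word, _ in counts.most_common(n)]


def get_top_n_sentences(sentences, keywords, n=3, tokenize=simple_tokenize):
    """Score each sentence by the number of keyword occurrences it contains."""
    keyset = set(keywords)
    scored = []
    for pos, sent in enumerate(sentences):
        score = sum(1 for tok in tokenize(sent) if tok in keyset)
        scored.append((score, pos, sent))
    # Highest score first; ties go to the earlier sentence.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [sent for _, _, sent in scored[:n]]


def sort_by_position(top_sentences, sentences):
    """Put the selected sentences back into their original document order."""
    chosen = set(top_sentences)
    ordered = []
    for sent in sentences:
        if sent in chosen:
            ordered.append(sent)
            chosen.discard(sent)  # avoid emitting a repeated sentence twice
    return ordered


def summarize(title, text, n_keywords=8, n_sentences=3, tokenize=simple_tokenize):
    """Run the four steps in the same order as main() in the listing."""
    sentences = cut_sentences(text)
    keywords = get_keywords(title, sentences, n_keywords, tokenize)
    top = get_top_n_sentences(sentences, keywords, n_sentences, tokenize)
    return sort_by_position(top, sentences)
```

Calling summarize(title, text, tokenize=jieba.lcut) would swap in proper word segmentation, assuming jieba is installed. Returning the selected sentences in document order rather than score order tends to make the summary read more naturally.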
