Parsing HTML Tags with ABAP Regular Expressions

The requirement: I use an ABAP function to read a string from the database, and the content of that string is a web page.

The form in the page contains many hidden input fields. My task is to extract the value of the input field whose name is svyvalueguid: fa163eef573d1ed89e89c7fe5e7c4715

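The simplest, brute-force approach: use the ABAP statement FIND FIRST OCCURRENCE to locate the offset of svyvalueguid, then, starting from that offset, find the offset of the first >. This reduces the problem to reading the value out of the substring type="hidden" value="fa163eef573d1ed89e89c7fe5e7c4715", which is much simpler. But this approach is clumsy, and the resulting code is verbose.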
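The offset-based steps above could be sketched roughly as follows. This is only an illustration: the report name zdemo_find_offset, the variable names, and the HTML fragment are assumptions made up for the example, and the sketch assumes a release that supports inline declarations (7.40 or later).

```abap
REPORT zdemo_find_offset.

" Hypothetical fragment standing in for the page read from the database.
DATA(lv_html) = `<input name="svyvalueguid" type="hidden" ` &&
                `value="fa163eef573d1ed89e89c7fe5e7c4715">`.

" Step 1: offset of the field name.
FIND FIRST OCCURRENCE OF 'svyvalueguid' IN lv_html MATCH OFFSET DATA(lv_name_off).
IF sy-subrc <> 0.
  WRITE / 'name not found'.
  RETURN.
ENDIF.

" Step 2: from that offset, find the first > that closes the tag.
DATA(lv_rest) = lv_html+lv_name_off.
FIND FIRST OCCURRENCE OF '>' IN lv_rest MATCH OFFSET DATA(lv_end_off).
IF sy-subrc <> 0.
  WRITE / 'tag end not found'.
  RETURN.
ENDIF.
DATA(lv_attrs) = lv_rest(lv_end_off).

" Step 3: find value=" and skip its 7 characters.
FIND FIRST OCCURRENCE OF 'value="' IN lv_attrs MATCH OFFSET DATA(lv_val_off).
IF sy-subrc <> 0.
  WRITE / 'value attribute not found'.
  RETURN.
ENDIF.
lv_val_off = lv_val_off + 7.
DATA(lv_tail) = lv_attrs+lv_val_off.

" Step 4: the value ends at the next double quote.
FIND FIRST OCCURRENCE OF '"' IN lv_tail MATCH OFFSET DATA(lv_quote_off).
IF sy-subrc <> 0.
  WRITE / 'closing quote not found'.
  RETURN.
ENDIF.
DATA(lv_guid) = lv_tail(lv_quote_off).

WRITE: / 'result: ', lv_guid.
```

Four FIND statements, each with its own error check, are needed for a single attribute, which is the verbosity the text refers to.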

Is there a quicker way? Yes: use an ABAP regular expression.

Consider the following test code:

REPORT ztest_interface.

DATA: lv_input    TYPE string,
      reg_pattern TYPE string.

" Page content as read from the database (the markup inside the literals is not shown here).
lv_input = `` &&
           `jerry's programming skill survey` &&
           ``.

" (?:.*) is a non-capturing group, so (.*) is submatch 1: the value attribute.
reg_pattern = '.*svyvalueguid(?:.*)value="(.*)">.*surveyid.*'.

TRY.
    DATA(lo_regex) = NEW cl_abap_regex( pattern = reg_pattern ).
    DATA(lo_matcher) = lo_regex->create_matcher( EXPORTING text = lv_input ).

    " The whole input must match the pattern.
    IF lo_matcher->match( ) <> abap_true.
      WRITE:/ 'fail in input scan!'.
      RETURN.
    ENDIF.

    " Take the first match and, within it, the first submatch.
    DATA(lt_reg_match_result) = lo_matcher->find_all( ).
    READ TABLE lt_reg_match_result ASSIGNING FIELD-SYMBOL(<match>) INDEX 1.
    READ TABLE <match>-submatches ASSIGNING FIELD-SYMBOL(<sub>) INDEX 1.

    " Cut the captured text out of the input by offset and length.
    DATA(lv_sub) = lv_input+<sub>-offset(<sub>-length).
    WRITE:/ 'result: ', lv_sub.
  CATCH cx_root INTO DATA(cx_root).
    WRITE:/ cx_root->get_text( ).
    RETURN.
ENDTRY.

Execution result: when run against the page, the report outputs the GUID value.

The core of the solution is this regular expression: .*svyvalueguid(?:.*)value="(.*)">.*surveyid.*

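The capturing group operator, a pair of parentheses, captures the 32-character GUID value. Because (?:.*) is a non-capturing group, the value is the first and only submatch. This solution needs less code than the FIND FIRST OCCURRENCE approach.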
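For comparison, the same capturing-group idea can also be written with the FIND statement and its REGEX and SUBMATCHES additions, without creating matcher objects. This is a minimal sketch and not the code from the post: the report name zdemo_find_regex, the HTML fragment, and the tighter pattern are assumptions for illustration, and it presumes a release where FIND ... REGEX is available.

```abap
REPORT zdemo_find_regex.

" Hypothetical fragment; the real page is read from the database.
DATA(lv_html) = `<form><input name="svyvalueguid" type="hidden" ` &&
                `value="fa163eef573d1ed89e89c7fe5e7c4715"></form>`.

DATA lv_guid TYPE string.

" [^>]* stays inside the current tag; ([^"]*) captures the attribute value.
FIND FIRST OCCURRENCE OF REGEX 'name="svyvalueguid"[^>]*value="([^"]*)"'
     IN lv_html SUBMATCHES lv_guid.

IF sy-subrc = 0.
  WRITE: / 'result: ', lv_guid.
ELSE.
  WRITE / 'fail in input scan!'.
ENDIF.
```

The negated character classes [^>]* and [^"]* are used here instead of .* so that the match cannot run past the end of the tag or the closing quote when the page contains several input fields.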
