ABAP DEVELOPER NETWORK: REGEX SAMPLE


REPORT ZRR_SAMPLE.

TYPE-POOLS: abap.

DATA: lt_html       TYPE TABLE OF string,
      lo_pattern    TYPE REF   TO cl_abap_regex,
      lo_matcher    TYPE REF   TO cl_abap_matcher,
      ls_match      TYPE match_result,
      lv_header     TYPE string,
      lv_header_txt TYPE string.

FIELD-SYMBOLS: <lfs_html> TYPE string,
               <lfs_sub>  TYPE submatch_result.

START-OF-SELECTION.

* Build the HTML document sample:
  APPEND '<html><head></head><body>'       TO lt_html.
  APPEND '<H1>String Processing Techniques</H1>' TO lt_html.
  APPEND '<h2>ABAP Character Types</h2>'         TO lt_html.
  APPEND '<H2>Developing a String Library</h2>'  TO lt_html.
  APPEND '<h3>Designing the API</h3>'            TO lt_html.
  APPEND '<h3>...</h3>'                          TO lt_html.
  APPEND '</body></html>'                        TO lt_html.

* Extract a table of contents from the HTML document:
  TRY.
*   Parse the regex pattern:
      CREATE OBJECT lo_pattern
        EXPORTING
          pattern     = '<([h][1-6]).*>(.*)</\1>'
          ignore_case = abap_true.

*   Create a matcher to search the example HTML document:
      lo_matcher = lo_pattern->create_matcher( table = lt_html ).

*   Add each match to the table of contents:
      WHILE lo_matcher->find_next( ) EQ abap_true.
        
*     Retrieve the next match found in the HTML document:
        ls_match = lo_matcher->get_match( ).
        
        READ TABLE lt_html INDEX ls_match-line ASSIGNING <lfs_html>.

*     The text captured by each parenthesized group is stored in
*     the submatch results (1 = tag name, 2 = heading text):
        LOOP AT ls_match-submatches ASSIGNING <lfs_sub>.
          
          IF sy-tabix EQ 1.
            
            lv_header = <lfs_html>+<lfs_sub>-offset(<lfs_sub>-length).
          
          ELSEIF sy-tabix EQ 2.
            
            lv_header_txt = <lfs_html>+<lfs_sub>-offset(<lfs_sub>-length).
          
          ENDIF.
        
        ENDLOOP.

*     Output the table of contents record:
        CASE lv_header.
          
          WHEN 'H1' OR 'h1'.
            WRITE: / lv_header_txt.
          WHEN 'H2' OR 'h2'.
            WRITE: / '##', lv_header_txt.
          WHEN 'H3' OR 'h3'.
            WRITE: / '####', lv_header_txt.
        
        ENDCASE.
      
      ENDWHILE.
    
    CATCH cx_sy_regex.
      "Invalid regular expression pattern...
    CATCH cx_sy_matcher.
      "Problem generating matcher instance...
  
  ENDTRY.
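
* Alternative sketch (not part of the original sample): the FIND
* statement can run the same search without explicit regex and
* matcher objects. The names lt_results and ls_result are
* hypothetical; the pattern is repeated from above on the
* assumption that both searches should behave alike.
  DATA: lt_results TYPE match_result_tab,
        ls_result  TYPE match_result.

  FIND ALL OCCURRENCES OF REGEX '<([h][1-6]).*>(.*)</\1>'
       IN TABLE lt_html IGNORING CASE
       RESULTS lt_results.

* Each result entry holds the table line and the submatch offsets
* in the same match_result layout that get_match( ) returns:
  LOOP AT lt_results INTO ls_result.
    READ TABLE lt_html INDEX ls_result-line ASSIGNING <lfs_html>.
    READ TABLE ls_result-submatches INDEX 2 ASSIGNING <lfs_sub>.
    IF sy-subrc EQ 0.
*     Submatch 2 is the heading text between the tags:
      lv_header_txt = <lfs_html>+<lfs_sub>-offset(<lfs_sub>-length).
      WRITE: / lv_header_txt.
    ENDIF.
  ENDLOOP.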

Source: SAP

16/09/2010