2010/7/21

[Programming] Searching with Regular Expressions in UltraEdit

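I suspect many programmers have UltraEdit installed, because it is still a handy and convenient tool for editing text files. Its drawback is that it is paid software, so some people install free alternatives instead, such as Notepad++.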

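I'm not sure whether it started with version 9.0 or 10.0 (or even earlier), but at some point UltraEdit began supporting Regular Expressions for search and replace. Anyone who writes in a scripting language knows how useful Regular Expressions are, so a while ago, when I was reading a debug log in UltraEdit, I tried to use them. For some reason, though, they didn't seem to work: I could never find the keywords I was looking for.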

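After some Googling, it turned out that the Regular Expressions UltraEdit defines use symbols that differ slightly from the ones we normally use when writing code. I've turned the web page I consulted into the table below, which compares UltraEdit's symbols with those of the Regular Expressions we commonly use.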

UltraEdit Symbols | UNIX Symbols | Description of Regular Expression Symbols
% | ^ | Matches/anchors the beginning of line.
$ | $ | Matches/anchors the end of line.
? | . | Matches any single character except a newline character. Does not match repeated newlines.
* | (not listed) | Matches any number of occurrences of any character except newline.
+ | + | Matches one or more of the preceding character/expression. At least one occurrence of the character must be found. Does not match repeated newlines.
++ | * | Matches the preceding character/expression zero or more times. Does not match repeated newlines.
^ | \ | Indicates the next character has a special meaning. "n" on its own matches the character "n". "^n" (UE expressions) or "\n" (UNIX expressions) matches a linefeed or newline character.
[] | [] | Matches any single character or range in the brackets.
[~xyz] | [^xyz] | A negative character set. Matches any characters NOT between brackets.
^b | \f | Matches a page break/form feed character.
^p | \p | Matches a newline (CR/LF) (paragraph) (DOS Files).
^r | \r | Matches a newline (CR Only) (paragraph) (MAC Files).
^n | \n | Matches a newline (LF Only) (paragraph) (UNIX Files).
^t | \t | Matches a tab character.
[0-9] | \d | Matches a digit character.
[~0-9] | \D | Matches a non-digit character.
[ ^t^b] | \s | Matches any white space including space, tab, form feed, etc., but not newline.
[~ ^t^b] | \S | Matches any non-white space character but not newline.
(not listed) | \v | Matches a vertical tab character.
[0-9a-z_] | \w | Matches any alphanumeric character including underscore.
[~0-9a-z_] | \W | Matches any character except alphanumeric characters and underscore.
^{A^}^{B^} | (A|B) | Matches expression A OR B.
^ | \ | Overrides the following regular expression character.
^(...^) | (...) | Brackets or tags an expression to use in the replace command. A regular expression may have up to 9 tagged expressions, numbered according to their order in the regular expression.
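To make the mapping in the table concrete, here is a minimal sketch that rewrites a small subset of UltraEdit-style patterns into the syntax of Python's `re` module, which I'm using as a stand-in for the "UNIX" column. The helper name `ue_to_python` is my own invention and is not part of UltraEdit or Python. Two choices are assumptions on my part: `*` is translated to `.*` (the table leaves the UNIX cell empty, but that is what the description amounts to), and `^p` becomes `\r\n` because Python's `re` has no `\p`. The `^{A^}^{B^}` alternation is not handled.

```python
import re

# Two-character UltraEdit escapes and their equivalents in Python's re syntax.
UE_ESCAPES = {
    "^t": r"\t",    # tab
    "^n": r"\n",    # LF
    "^r": r"\r",    # CR
    "^p": r"\r\n",  # CR/LF (assumption: Python's re has no \p)
    "^b": r"\f",    # form feed / page break
    "^(": "(",      # start of a tagged expression
    "^)": ")",      # end of a tagged expression
}


def ue_to_python(pattern: str) -> str:
    """Translate a subset of UltraEdit-style regex syntax (hypothetical helper)."""
    out = []
    i = 0
    in_class = False  # True while we are inside [...]
    while i < len(pattern):
        ch = pattern[i]
        two = pattern[i:i + 2]
        if two in UE_ESCAPES:
            out.append(UE_ESCAPES[two])
            i += 2
        elif ch == "^" and i + 1 < len(pattern):
            # "^x" overrides the special meaning of x: emit x as a literal.
            out.append(re.escape(pattern[i + 1]))
            i += 2
        elif in_class:
            # Inside brackets everything else is copied as-is.
            if ch == "]":
                in_class = False
            out.append("\\\\" if ch == "\\" else ch)
            i += 1
        elif two == "[~":
            out.append("[^")  # negated character set
            in_class = True
            i += 2
        elif ch == "[":
            out.append("[")
            in_class = True
            i += 1
        elif two == "++":
            out.append("*")   # zero or more of the preceding item
            i += 2
        elif ch == "%":
            out.append("^")   # beginning of line
            i += 1
        elif ch == "?":
            out.append(".")   # any single character
            i += 1
        elif ch == "*":
            out.append(".*")  # any run of characters (assumption, see above)
            i += 1
        elif ch in "+$":
            out.append(ch)    # same meaning in both syntaxes
            i += 1
        else:
            # Characters such as "." or "|" are literals in UltraEdit syntax.
            out.append(re.escape(ch))
            i += 1
    return "".join(out)


# A made-up debug log, searched with an UltraEdit-style pattern.
log = "INFO boot ok\nERROR 42 disk full\nERROR none\n"
print(re.findall(ue_to_python("%ERROR [0-9]+"), log, flags=re.MULTILINE))
# prints ['ERROR 42']
```

With this sketch, the UltraEdit pattern `^(a++^)b` comes out as `(a*)b`, which shows how both the tagged-expression brackets and the `++` quantifier line up with their UNIX counterparts. It only illustrates the table; it does not reproduce UltraEdit's actual matching engine.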

References:
Regular expressions with UltraEdit
