## Java regular expressions

A regular expression is a string that describes
or matches a set of strings according to certain syntax rules. Regular
expressions are usually used to give a concise description of a set without
having to list all of its elements. The simplest form of a regular
expression is a literal string. The syntax used for the regular
expressions is the Java regular expression syntax (see
http://java.sun.com/docs/books/tutorial/essential/regex/index.html).
Some of the most important syntax rules are listed below. They are
also shown in the help pop-up when you press Shift + F1:

- *[A-Z]* matches the characters *A* through *Z* (Range). You can also put single characters between the brackets: the expression *[AGT]* matches the characters *A*, *G* or *T*.
- *[A-D[M-P]]* matches the characters *A* through *D* and *M* through *P* (Union). You can also put single characters between the brackets: the expression *[AG[M-P]]* matches the characters *A*, *G* and *M* through *P*.
- *[A-M&&[H-P]]* matches the characters between *A* and *M* that also lie between *H* and *P* (Intersection). You can also put single characters between the brackets: the expression *[A-M&&[HGTDA]]* matches the characters *A* through *M* that are also *H*, *G*, *T*, *D* or *A* (that is, *A*, *D*, *G* or *H*, since *T* lies outside *A* through *M*).
- *[^A-M]* matches any character except those between *A* and *M* (Excluding). You can also put single characters between the brackets: the expression *[^AG]* matches any character except *A* and *G*.
- *[A-Z&&[^M-P]]* matches any character *A* through *Z* except those between *M* and *P* (Subtraction). You can also put single characters between the brackets: the expression *[A-P&&[^CG]]* matches any character between *A* and *P* except *C* and *G*.
- The symbol *.* matches any character.
- *X{n}* matches exactly *n* repetitions of the element *X*. The number of repetitions is given between the curly brackets directly after the element. For example, *ACG{2}* matches the string *ACGG* and *(ACG){2}* matches *ACGACG*.
- *X{n,m}* matches between *n* and *m* repetitions of the element *X*. The first number is a lower limit on the number of repetitions and the second number is an upper limit. For example, *ACT{1,3}* matches *ACT*, *ACTT* and *ACTTT*.
- *X{n,}* matches at least *n* repetitions of the element *X*. For example, *(AC){2,}* matches all strings *ACAC*, *ACACAC*, *ACACACAC*, ...
- The symbol *^* restricts the search to the beginning of your sequence. For example, if you search through a sequence with the regular expression *^AC*, the algorithm will find a match if *AC* occurs at the beginning of the sequence.
- The symbol *$* restricts the search to the end of your sequence. For example, if you search through a sequence with the regular expression *GT$*, the algorithm will find a match if *GT* occurs at the end of the sequence.
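The rules above can be tried out directly with the standard `java.util.regex` package. The following is a minimal sketch, not the search implementation used by the application: the class name `RegexSyntaxDemo` and the helpers `matchesFully` and `foundIn` are hypothetical names chosen for illustration, and the application's own search may apply additional settings that are not modelled here.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class RegexSyntaxDemo {

    // True if the WHOLE input matches the expression.
    static boolean matchesFully(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // True if the expression matches somewhere in the input (search semantics).
    static boolean foundIn(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Character classes
        System.out.println(matchesFully("[A-D[M-P]]", "N"));    // union: true
        System.out.println(matchesFully("[A-M&&[H-P]]", "K"));  // intersection: true
        System.out.println(matchesFully("[A-Z&&[^M-P]]", "N")); // subtraction excludes N: false

        // Repetitions
        System.out.println(matchesFully("ACG{2}", "ACGG"));     // only G is repeated: true
        System.out.println(matchesFully("(ACG){2}", "ACGACG")); // the group is repeated: true
        System.out.println(matchesFully("ACT{1,3}", "ACTTTT")); // four T exceeds the limit: false

        // Anchors
        System.out.println(foundIn("^AC", "ACGT"));             // AC at the beginning: true
        System.out.println(foundIn("^AC", "GACT"));             // AC not at the beginning: false
        System.out.println(foundIn("GT$", "ACGT"));             // GT at the end: true
    }
}
```

The two helpers separate two questions that are easy to confuse: whether an entire string fits the expression, and whether the expression occurs somewhere inside a longer sequence. The anchors *^* and *$* only make a difference in the second case.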

**Examples**

The expression *[ACG][^AC]G{2}* matches all strings of
length *4* where the first character is *A*, *C* or *G*, the second
is any character except *A* and *C*, and the third and fourth characters are
*G*. The expression *G.[^A]$* matches all strings of
length *3* at the end of your sequence where the first character is
*G*, the second is any character and the third is any character except
*A*.
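To see where these two expressions match in a concrete sequence, they can be run through `java.util.regex.Matcher`. This is a sketch under stated assumptions: the sequence `TTAGGGCATGCT` is made up for illustration, and the class `RegexExamplesDemo` and its helper `findAll` are hypothetical names, not part of the application.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names and the sequence are illustrative only.
class RegexExamplesDemo {

    // Returns every match as "start:text", where start is the 0-based position.
    static List<String> findAll(String regex, String sequence) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(sequence);
        while (m.find()) {
            hits.add(m.start() + ":" + m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String sequence = "TTAGGGCATGCT"; // made-up example sequence

        // A, C or G, then anything except A or C, then GG
        System.out.println(findAll("[ACG][^AC]G{2}", sequence)); // [2:AGGG]

        // G, any character, anything except A, at the end of the sequence
        System.out.println(findAll("G.[^A]$", sequence));        // [9:GCT]
    }
}
```

The second expression finds only the final *GCT*, even though *G* occurs earlier in the sequence, because *$* requires the match to end where the sequence ends.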