Syntax highlighting is the coloring of words that have a special meaning in the language you are writing. The patterns are different for every language: "<title>", for example, means "start of title" in HTML, and "function" means "start of function" in PHP.
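The highlight patterns are built from Perl-compatible regular expressions. A pattern has options for the color and style of the text it matches. Within a match, other patterns can be used to color parts of that match. There are three types of patterns: type 1 patterns have both a start and an end pattern, type 2 patterns have only a start pattern, and type 3 patterns match a subpattern of their parent.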
A single pattern can also be used within several different parent patterns. The parent-match option is a regular expression that defines all parents for a certain pattern. If it is empty, it defaults to ^top$, so the pattern is used on the top level.
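The parent-match rule can be illustrated with a few lines of Python. This is only a sketch: Python's re module stands in for the Perl-compatible engine, and the function name allowed_under and the parent names other than "top" are made up for the illustration.

```python
import re

def allowed_under(parent_match: str, parent_name: str) -> bool:
    """Return True if a pattern with this parent-match option may be
    used inside the parent called parent_name (hypothetical helper)."""
    # An empty option falls back to ^top$, i.e. the top level only.
    regex = parent_match or "^top$"
    return re.search(regex, parent_name) is not None

# A hypothetical parent-match option that accepts two different parents.
option = "^(php|javascript)$"

print(allowed_under(option, "php"))         # accepted: listed as a parent
print(allowed_under(option, "javascript"))  # accepted: listed as a parent
print(allowed_under(option, "html"))        # rejected: not listed
print(allowed_under("", "top"))             # accepted: empty means ^top$
```

Because the option is a regular expression and not a plain name, one pattern definition can be shared by any number of parents.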
So how does it work? Let's take a look at a little example text, a piece of PHP code within some HTML code:
<p align="center">
<?php // this is a comment ?>
?>
The first thing the highlighting engine does is find the pattern with the lowest match, that is, the match that starts at the lowest position in the text. Using the default patterns for PHP, the pattern named html has a match at position 0:
<p align="center">
Now the highlighting engine searches for the lowest match among all subpatterns of html, within the region matched by the type 2 html pattern. Again, the lowest match wins. The pattern named html-tag has a match at position 1. This is a type 3 pattern, so it matches a subpattern of the parent:
p
The match from the subpattern html-tag ends at position 2, and html-tag does not have any child patterns, so the highlighting engine continues at position 2 with all subpatterns of html. A type 2 pattern named html-attrib has the lowest match:
align="center"
This pattern does have a child pattern, again a type 3 pattern, called html-attrib-sub2, which matches:
"center"
The pattern html-attrib-sub2 does not have any child patterns, the subpatterns of html-attrib have no more matches, and the subpatterns of html have no more matches either. So we are back on the main level, and the remaining code to highlight is:
<?php // this is a comment ?>
?>
Now a pattern named php has the lowest match. This is a type 1 pattern (it has both a start and an end pattern), so the highlighting engine continues with all the remaining code. It will not only search for the lowest match among the child patterns of php, but also for the end pattern of php. The lowest match in this example is a pattern named php-comment-C++. As you can see, the ?> within the comment does not end the php pattern, because it lies within a subpattern of php:
// this is a comment ?>
The pattern php-comment-C++ does not have any child patterns, so the remaining code for the php subpatterns is:
?>
Now the lowest match is the end pattern of the php pattern, so we are back on the main level, and all of the code has been matched!
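The walk-through above can be reproduced with a small Python sketch of a lowest-match engine. Everything in it is an assumption made for illustration: the class Pattern, the functions find_lowest and highlight, and the regular expressions are simplified stand-ins and not the actual default patterns. A type 3 pattern is modelled here as a capture group of its parent's match, and the sample text is written with the line breaks that the // comment relies on.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Pattern:
    name: str
    kind: int      # 1 = start & end, 2 = only start, 3 = subpattern of parent
    start: str = ""  # start regex (kinds 1 and 2)
    end: str = ""    # end regex (kind 1 only)
    group: int = 0   # capture group of the parent's match (kind 3 only)
    children: list = field(default_factory=list)

def find_lowest(patterns, text, pos, limit, parent_match):
    """Return (start, end, pattern, match) for the lowest match, or None."""
    best = None
    for p in patterns:
        if p.kind == 3:
            if parent_match is None:
                continue
            m = parent_match
            s, e = m.span(p.group)
        else:
            m = re.compile(p.start).search(text, pos, limit)
            if m is None:
                continue
            s, e = m.span()
        if s < pos or e <= s:
            continue  # already passed, unmatched group, or empty match
        if best is None or s < best[0]:
            best = (s, e, p, m)
    return best

def highlight(patterns, text, pos, limit, out, parent_match=None, end_regex=None):
    """Scan text[pos:limit] and return the position where scanning stopped."""
    while pos < limit:
        hit = find_lowest(patterns, text, pos, limit, parent_match)
        end = re.compile(end_regex).search(text, pos, limit) if end_regex else None
        if end and (hit is None or end.start() <= hit[0]):
            return end.end()  # the end pattern is the lowest match
        if hit is None:
            return limit
        s, e, p, m = hit
        if p.kind == 1:
            # Children and the end pattern compete over all remaining text.
            inner = []
            e = highlight(p.children, text, e, limit, inner, None, p.end)
            out.append((p.name, text[s:e]))
            out.extend(inner)
        else:
            out.append((p.name, text[s:e]))
            # Children are only searched inside the region just matched.
            highlight(p.children, text, s, e, out, m if p.kind == 2 else None)
        pos = e
    return limit

# Simplified stand-in patterns, named after the ones in the walk-through.
attrib = Pattern("html-attrib", 2, start=r'([a-z]+)=("[^"]*")',
                 children=[Pattern("html-attrib-sub2", 3, group=2)])
html = Pattern("html", 2, start=r"<([a-zA-Z][a-zA-Z0-9]*)[^>]*>",
               children=[Pattern("html-tag", 3, group=1), attrib])
php = Pattern("php", 1, start=r"<\?php", end=r"\?>",
              children=[Pattern("php-comment-C++", 2, start=r"//[^\n]*")])
TOP = [html, php]

SAMPLE = '<p align="center">\n<?php // this is a comment ?>\n?>'

result = []
highlight(TOP, SAMPLE, 0, len(SAMPLE), result)
for name, matched in result:
    print(name, repr(matched))
```

The sketch records each match as a (pattern name, matched text) pair in the order the engine reaches it. The first ?> falls inside the comment match, so only the second one is found by the search for the end pattern of php.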
The config file for highlighting is a colon-separated array with the following fields:
mode: patternname: case_sensitive(0-on/1-off): start reg-ex: end reg-ex: start & end pattern(1), only start(2), subpattern(3): parent-match: foreground-color: background-color: don't change weight(0), non-bold(1), bold(2): don't change style(0), non-italic(1), italic(2):
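To make the field order concrete, here is a minimal Python sketch that splits one such entry into named fields. The function parse_pattern_line, the field names, and the sample entry are hypothetical. The sketch also assumes that regular expressions in the file contain no literal colon and that whitespace around a field is not significant; neither is confirmed by the format description above.

```python
FIELDS = [
    "mode", "patternname", "case_sensitive", "start_regex", "end_regex",
    "type", "parent_match", "foreground", "background", "weight", "style",
]
INT_FIELDS = ("case_sensitive", "type", "weight", "style")

def parse_pattern_line(line: str) -> dict:
    """Split one colon-separated entry into a dict (hypothetical helper)."""
    parts = line.rstrip("\n").split(":")
    if parts and parts[-1].strip() == "":
        parts.pop()  # the entry ends with a trailing colon
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    # Assumption: surrounding whitespace is not part of a field.
    record = dict(zip(FIELDS, (part.strip() for part in parts)))
    for key in INT_FIELDS:
        record[key] = int(record[key])
    # An empty parent-match defaults to the top level.
    if not record["parent_match"]:
        record["parent_match"] = "^top$"
    return record

# A made-up entry in the documented field order, not taken from a real config.
sample = r"php:php-comment-C++:0://[^\n]*::2:^php$:#999999::0:2:"
record = parse_pattern_line(sample)
for key in FIELDS:
    print(f"{key:15} {record[key]!r}")
```

In the made-up entry the end reg-ex and the background color are left empty, the type is 2 (only a start pattern), and the style value 2 selects italic.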