Regex lexer for pygments

I am currently writing my thesis on malicious detection systems and have started using Pygments indirectly through LaTeX. I am pretty happy with it, since it supports all kinds of text and code highlighting.

That is why I was a bit surprised to find nothing for plain old regular expressions. Normally I would open a pull request for something like this, but I am short on time at the moment and the lexer is far from perfect. Still, I want to share it with the world, so here we go:

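from pygments.lexer import RegexLexer, bygroups
from pygments.token import Comment, Keyword, Name, Number, Operator, Text, Token

__all__ = ['regexLexer']

class regexLexer(RegexLexer):
    """Rough highlighter for plain regular expressions."""
    name = 'regex'
    aliases = ['regex']
    filenames = []

    # Rules are tried top to bottom; the first one that matches wins.
    tokens = {
        'root': [
            # Digits must come before \w+, which would otherwise swallow them.
            (r'\d+', Number),
            (r'\w+', Name),
            (r'[\s\,\:\-\"\']+', Text),
            # Anchors
            (r'[\$\^]', Token),
            # Quantifiers and the wildcard
            (r'[\+\*\.\?]', Operator),
            # Special groups such as (?:...), (?=...), (?<!...)
            (r'(\()([\?\<\>\!\=\:]{2,3}.+?)(\))', bygroups(Keyword.Namespace, Name.Function, Keyword.Namespace)),
            # Inline comments: (?#...)
            (r'(\()(\?\#.+?)(\))', bygroups(Comment, Comment, Comment)),
            (r'[\(\)]', Keyword.Namespace),
            (r'[\[\]]', Name.Class),
            # Escapes such as \d or \w
            (r'\\\w', Keyword),
            (r'[\{\}]', Operator),
        ],
    }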
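
A note on rule order: a RegexLexer tries the rules of a state from top to bottom and uses the first one that matches at the current position. Since \w also matches digits, a \d+ rule listed after \w+ never fires. The following is a minimal sketch of that behavior using only the re module; the tokenize helper and the two rule tables are hypothetical stand-ins, not Pygments code.

```python
import re

# Hypothetical, simplified rule tables: (pattern, token name), tried top to bottom.
RULES_WORD_FIRST = [(r'\w+', 'Name'), (r'\d+', 'Number')]
RULES_DIGIT_FIRST = [(r'\d+', 'Number'), (r'\w+', 'Name')]


def tokenize(text, rules):
    """Return (token name, matched text) pairs; the first matching rule wins."""
    compiled = [(re.compile(pattern), name) for pattern, name in rules]
    pos = 0
    out = []
    while pos < len(text):
        for regex, name in compiled:
            match = regex.match(text, pos)
            if match:
                out.append((name, match.group()))
                pos = match.end()
                break
        else:
            # No rule matched: emit one character as an error and move on.
            out.append(('Error', text[pos]))
            pos += 1
    return out


word_first = tokenize('a{2}', RULES_WORD_FIRST)
digit_first = tokenize('a{2}', RULES_DIGIT_FIRST)
print(word_first)
print(digit_first)
```

With \w+ listed first, the 2 in a{2} comes out as Name; with \d+ first, it comes out as Number.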

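If you have Pygments installed, you can activate the custom lexer by saving the snippet as a .py file in /usr/lib/pythonX.X/site-packages/pygments/lexers and running sudo python _mapping.py within the lexers folder, which rebuilds the lexer mapping in Pygments releases that ship this script as an executable.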
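
If you would rather not write into site-packages, here is a sketch of an alternative, assuming Pygments 2.2 or newer, where pygments.lexers.load_lexer_from_file is available. The file name regexlexer.py and the shortened rule list are placeholders for illustration; a real file would contain the full lexer from above.

```python
import os
import tempfile

from pygments import highlight
from pygments.formatters import LatexFormatter
from pygments.lexers import load_lexer_from_file

# Shortened stand-in for the lexer source (assumption: the real file holds
# the complete rule list).
LEXER_SOURCE = r'''
from pygments.lexer import RegexLexer
from pygments.token import Keyword, Name, Number, Operator

class regexLexer(RegexLexer):
    name = 'regex'
    aliases = ['regex']
    tokens = {
        'root': [
            (r'\d+', Number),
            (r'\w+', Name),
            (r'[+*.?{}]', Operator),
            (r'[()]', Keyword.Namespace),
        ],
    }
'''

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'regexlexer.py')  # hypothetical file name
    with open(path, 'w') as fh:
        fh.write(LEXER_SOURCE)
    # Executes the file and returns an instance of the named class.
    lexer = load_lexer_from_file(path, lexername='regexLexer')

# Inspect the raw token stream, then render the same input as LaTeX.
tokens = [(str(ttype), value) for ttype, value in lexer.get_tokens('(ab)+')]
latex = highlight('(ab)+', lexer, LatexFormatter())
print(tokens)
print(latex)
```

Loading from a file keeps the Pygments installation untouched, which is handy while the lexer is still changing.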

The final result could look something like this:
I used the minted package within LaTeX.