regex - Information Sources on Token Parsing Patterns
To make a long story short, it looks as if I'm going to be responsible for rewriting the text parsing engine at work. So, imagine: a block of text comes in, and there are custom tags in the text: simple one-off replaces, blocks of content, nesting, etc. The tags have argument/value pairs, etc.

While I have been coding for years, I'm only a mid-level regex user; I'll be the first to admit that hardcore text parsing is not my forte. Also, this needs to be fast, so optimization is a concern.

I'm looking for information sources on patterns and commentary for this kind of parsing. I'm willing to read up on whatever is on offer. I need to educate myself before I begin contemplating how to tackle this. Thanks very much, in advance.

If it gets a little more complex than a simple state machine that one person can understand, I suggest using a tool to generate tokenizers: flex / jflex / etc. You can create a hand-crafted top-down parser if speed is a big concern, or you can use a parser generator (ANTLR, for example, or the like). A hand-crafted parser is faster, but it has the potential to create nasty corner cases :). You need a good set of test cases for it.

I recommend you start here: parsing on wiki...
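To make the "tokenizer plus hand-crafted top-down parser" suggestion a bit more concrete, here is a minimal sketch in Python. The tag syntax is purely an assumption for illustration, since the question does not say what the real tags look like: `[name key=value]...[/name]` for blocks and `[name key=value /]` for one-off tags. The names `tokenize`, `parse` and `Node` are hypothetical too. Treat it as a starting point for experiments, not as a tuned implementation.

```python
import re
from dataclasses import dataclass, field

# Assumed (hypothetical) tag syntax, not taken from the question:
#   [name key=value ...]   opens a block, [/name] closes it
#   [name key=value /]     is a one-off (self-closing) tag
# Limitation of this sketch: values may not contain whitespace, ']' or '/'.
TOKEN_RE = re.compile(
    r"\[/(?P<close>\w+)\]"
    r"|\[(?P<open>\w+)(?P<args>(?:\s+\w+=[^\s\]/]+)*)\s*(?P<selfclose>/)?\]"
)
ARG_RE = re.compile(r"(\w+)=([^\s\]/]+)")


@dataclass
class Node:
    name: str
    args: dict = field(default_factory=dict)
    children: list = field(default_factory=list)  # items are str or Node


def tokenize(text):
    """Yield (kind, value, args); kind is 'text', 'open', 'self' or 'close'."""
    pos = 0
    for m in TOKEN_RE.finditer(text):
        if m.start() > pos:
            # Plain text between two tags.
            yield ("text", text[pos:m.start()], {})
        if m.group("close"):
            yield ("close", m.group("close"), {})
        else:
            args = dict(ARG_RE.findall(m.group("args")))
            kind = "self" if m.group("selfclose") else "open"
            yield (kind, m.group("open"), args)
        pos = m.end()
    if pos < len(text):
        yield ("text", text[pos:], {})


def _parse_children(tokens, pos, open_name):
    """Recursive descent: collect children until [/open_name] or end of input."""
    children = []
    while pos < len(tokens):
        kind, value, args = tokens[pos]
        pos += 1
        if kind == "text":
            children.append(value)
        elif kind == "self":
            children.append(Node(value, args))
        elif kind == "open":
            # Nested block: recurse, then continue after its closing tag.
            inner, pos = _parse_children(tokens, pos, value)
            children.append(Node(value, args, inner))
        else:  # kind == "close"
            if value != open_name:
                raise ValueError(f"unexpected closing tag [/{value}]")
            return children, pos
    if open_name is not None:
        raise ValueError(f"unclosed tag [{open_name}]")
    return children, pos


def parse(text):
    """Parse text into a tree rooted at a synthetic 'root' node."""
    children, _ = _parse_children(list(tokenize(text)), 0, None)
    return Node("root", {}, children)
```

Calling `parse("Hi [b]bold [img src=a.png /][/b]!")` gives a root node whose children are the string `"Hi "`, a `b` node containing `"bold "` and an `img` node with `{"src": "a.png"}`, and the string `"!"`. Mismatched or unclosed tags raise `ValueError`, which is exactly the kind of corner case the test suite should cover.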