How to use Perl Regex to detect <p> inside another <p>

September 15, 2012

I am trying to parse some "wrong HTML" in order to fix it using a Perl regex. The wrong HTML is the following: <p>foo<p>bar</p>foo</p>

I would like the Perl regex to return: <p>foo<p>

I tried something like '|(<p\b[^>]*>(?!</p>)*?<p[^>]*>)|' with no success, because I cannot repeat (?!</p>)*?

Is there a way in a Perl regex to say "any character except the following sequence" (in this case </p>)?

Try something like:

<p>(?:(?!</?p>).)*</p>(?!(?:(?!</?p>).)*(<p>|$))

A quick breakdown:

<p>(?:(?!</?p>).)*</p>

matches a <p> ... </p> pair that does not contain either <p> or </p>. And the part:

(?!(?:(?!</?p>).)*(<p>|$))

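is "true" when, looking ahead ((?! ... )), there is no <p> or end of input ((<p>|$)) reachable without any <p> or </p> in between ((?:(?!</?p>).)*). In other words, the next tag after the match must be a closing </p>, which means the matched paragraph sits inside another one.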
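To make the breakdown easier to follow, here is the same pattern spelled out with the /x flag so each piece can carry a comment. This is only a sketch: the names $nested_p and nested_paragraphs are made up for illustration, and the trailing group is written as non-capturing so that a list-context match returns just the nested paragraphs.

```perl
use strict;
use warnings;

# The pattern from the answer, laid out with /x (whitespace and comments ignored).
my $nested_p = qr{
    (                       # group 1: the innermost paragraph
        <p>
        (?: (?!</?p>) . )*  # any character that does not start a p tag
        </p>
    )
    (?!                     # reject the match if what follows ...
        (?: (?!</?p>) . )*  # ... is tag-free text ...
        (?:<p>|$)           # ... leading to a new opening tag or end of input
    )
}x;

# Hypothetical helper: return every paragraph that is nested in another one.
sub nested_paragraphs {
    my ($html) = @_;
    my @found = $html =~ /$nested_p/g;
    return @found;
}

print "found: $_\n" for nested_paragraphs("<p>foo <p>bar</p> foo</p>");
# prints: found: <p>bar</p>
```

Note that . does not match a newline here, so input that spans several lines would need the /s flag as well.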

A demo:

my $txt = "<p>aaa aa a</p> <p>foo <p>bar</p> foo</p> <p> bb <p>x</p> bb</p>";
# print every <p>...</p> that is nested inside another paragraph
while ($txt =~ m/(<p>(?:(?!<\/?p>).)*<\/p>)(?!(?:(?!<\/?p>).)*(<p>|$))/g) {
  print "found: $1\n";
}

Prints:

found: <p>bar</p>
found: <p>x</p>
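The pattern above only recognises a bare <p>, while the attempt in the question allowed attributes on the tag (<p\b[^>]*>). A possible adaptation, as a sketch only: the variable names and the sample input are assumptions, and the /s flag is added so that . also crosses newlines.

```perl
use strict;
use warnings;

# Assumption: opening tags may carry attributes, as in the question's <p\b[^>]*>.
my $open  = qr{<p\b[^>]*>};
my $any_p = qr{</?p\b[^>]*>};   # any opening or closing p tag

# Same structure as the answer's regex, with the tag patterns swapped in.
my $nested_with_attrs =
    qr{($open(?:(?!$any_p).)*</p>)(?!(?:(?!$any_p).)*(?:$open|$))}s;

# Hypothetical input with attributes and a line break.
my $html  = qq{<p class="a">foo <p id="b">bar</p>\nfoo</p>};
my @found = $html =~ /$nested_with_attrs/g;

print "found: $_\n" for @found;
# prints: found: <p id="b">bar</p>
```

The \b after p keeps tags such as <pre> from being mistaken for a paragraph.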

Note that this regex trickery does not work when a second nested paragraph such as <p>baz</p> follows in the string:

<p>foo <p>bar</p> <p>baz</p> foo</p>

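Here <p>bar</p> is not matched, because the next tag after it is an opening <p>. Only after replacing <p>baz</p> can you do a second run on the input, in which case <p>bar</p> is matched.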
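The repeated runs can be automated by applying the substitution until a pass changes nothing. A minimal sketch, assuming that "replacing" means unwrapping the nested paragraph (keeping its text and dropping its tags); the sub name unwrap_nested_paragraphs is hypothetical and the right repair for real input may differ:

```perl
use strict;
use warnings;

# Hypothetical repair step: unwrap each nested paragraph and repeat
# until a pass finds nothing left to replace.
sub unwrap_nested_paragraphs {
    my ($html) = @_;
    my $passes = 0;
    # s///g returns the number of replacements, which is false when there are none
    while ($html =~ s{<p>((?:(?!</?p>).)*)</p>(?!(?:(?!</?p>).)*(?:<p>|$))}{$1}g) {
        $passes++;
    }
    return ($html, $passes);
}

my ($fixed, $passes) =
    unwrap_nested_paragraphs("<p>foo <p>bar</p> <p>baz</p> foo</p>");
print "$fixed (after $passes passes)\n";
# prints: <p>foo bar baz foo</p> (after 2 passes)
```

The first pass unwraps baz only; the second pass then sees bar followed directly by a closing tag and unwraps it too. The loop stops because every successful pass removes at least one pair of tags.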
