regex - Replace sequential repeating tags with one of that tag in Ruby
I'm trying to replace multiple sequential <br> tags with one <br> tag using Ruby.
For instance:
    hello <br><br/><br> world!
would become
    hello <br> world!
You could do this with a regular expression, like:
    "hello\n<br><br/><br>\nworld".gsub(/(?im)(<br\s*\/?>\s*)+/, '<br>')
To explain that: the (?im) part sets options indicating that the match should be case-insensitive and that . should match newlines (the pattern contains no ., so the m option has no effect here). The grouped expression (<br\s*\/?>\s*) matches <br> (optionally with whitespace and a trailing /), possibly followed by whitespace, and the + says to match one or more of that group.
However, I should point out that in general it's not a good idea to use regular expressions for manipulating HTML - you should use a proper parser instead. For example, here's a better way of doing it using Nokogiri:
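    require 'nokogiri'

    document = Nokogiri::HTML.parse("hello <br><br/><br> world!")
    # Remove every <br> that is immediately followed by another <br>,
    # so only the last tag of each run survives. Whitespace between two
    # tags is a text node, so such a pair is not treated as a run here.
    document.search('//br').each do |node|
      following = node.next
      # The last node in its parent has no next sibling, so guard against nil.
      node.remove if following && following.name == 'br'
    end
    puts document
That will produce output like:
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
    <html><body><p>hello <br> world!</p></body></html>
(The parser turns the input into a well-formed document, which is why you have the DOCTYPE and enclosing <html><body><p> tags.)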