regex - Do I need to change anything in my regular expression? -


I want to convert a fixed-length text file to XML, and I use a regex to do it. At position 36 of each line of the text file, the "user's initial" field requires 1 alphanumeric character, but sometimes it is blank. I use the regex pattern [a-zA-Z\s]{1}, which works well: it matches either 1 alphanumeric character or a blank. However, when I validate the output against the schema, the schema says the value doesn't match \p{L}{1}, which means it can only be a letter. What should I do about my regex? Or do I have to change either the text file pattern or the schema? Here is the code example:

    Dim linePattern2 As New Regex("^(?<type_code>\d{3})(?<snm>[a-zA-Z0-9\s.\']{20})(?<gvn_nm>[a-zA-Z0-9\s.\']{12})(?<init>[\p{L} ]{1})(?<sin>\d{9})(?<rcpnt_bn>[a-zA-Z0-9\s.\']{15})(?<l1_nm>[a-zA-Z0-9\s.\']{30})(?<l2_nm>[a-zA-Z0-9\s.\']{30})")
    Dim settings As New XmlWriterSettings()
    settings.Indent = True
    Using writer As XmlWriter = XmlWriter.Create(xmlOutput, settings)
        writer.WriteStartDocument()
        writer.WriteStartElement("submission")
        writer.WriteAttributeString("xmlns", "xsi", Nothing, "http://www.w3.org/2001/XMLSchema-instance")
        writer.WriteAttributeString("xsi", "noNamespaceSchemaLocation", Nothing, "c:\schema\layout-topologie.xsd")

        writer.WriteStartElement("return")
        writer.WriteStartElement("t4a")
        Using reader As New StreamReader(textInput)
            While Not reader.EndOfStream
                Dim line As String = reader.ReadLine()
                Dim match2 As Match = linePattern2.Match(line)
                If match2.Success Then
                    writer.WriteStartElement("t4aslip")

                    writer.WriteStartElement("rcpnt_nm")
                    writer.WriteElementString("snm", match2.Groups("snm").Value)
                    writer.WriteElementString("gvn_nm", match2.Groups("gvn_nm").Value)
                    writer.WriteElementString("init", match2.Groups("init").Value)
                    writer.WriteEndElement() ' closes rcpnt_nm

                    writer.WriteElementString("sin", match2.Groups("sin").Value)
                    writer.WriteElementString("rcpnt_bn", match2.Groups("rcpnt_bn").Value)
                    writer.WriteEndElement() ' closes t4aslip so slips do not nest
                End If
            End While
        End Using
        writer.WriteEndElement() ' closes t4a
        writer.WriteEndElement() ' closes return
        writer.WriteEndElement() ' closes submission
        writer.WriteEndDocument()
    End Using

Here is part of the text file:

100aaserude            russell alan 663345678000000000000000 

The schema validation error is:

'init': the value ' ' does not match the regular expression facet '\p{L}{1}'

Thanks in advance!

I think this is the regex you want:

[\p{L} ]
  • \p{L} matches any letter, not just the ASCII letters ([a-zA-Z]). That includes accented letters such as Ä and ñ, plus "letters" from other scripts and writing systems like Greek, Cyrillic, Arabic, Chinese... every letter known to Unicode.

  • Since your text format is fixed-length, I assume a missing initial is represented by a space, not by an empty string as one might expect. I used a literal space, but you can switch to \s if you want to allow a tab, linefeed, or other whitespace characters.

  • The {1} in your regexes serves no purpose. If you want to make sure only one character is allowed, you usually add anchors, like this: ^[\p{L} ]$. That isn't necessary in XML Schema, where regexes are implicitly anchored at both ends.
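
The difference between the two patterns can be checked with a small console sketch. It is only an illustration: the module name, the two helper functions, and the sample values are made up for this example and are not part of the original code. The anchors are written out by hand because .NET's Regex class, unlike an XML Schema pattern facet, does not anchor implicitly.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module InitialPatternDemo
    ' Hypothetical helper: the pattern suggested above (one letter or one space).
    Public Function MatchesLetterOrSpace(value As String) As Boolean
        Return Regex.IsMatch(value, "^[\p{L} ]$")
    End Function

    ' Hypothetical helper: the facet quoted in the validation error,
    ' anchored by hand so it behaves like it would inside the schema.
    Public Function MatchesLetterOnly(value As String) As Boolean
        Return Regex.IsMatch(value, "^\p{L}{1}$")
    End Function

    Sub Main()
        ' Sample values chosen for illustration only.
        For Each sample As String In {"A", "ñ", " ", "7", "AB"}
            Console.WriteLine("[{0}] letter-or-space: {1}  letter-only: {2}",
                              sample,
                              MatchesLetterOrSpace(sample),
                              MatchesLetterOnly(sample))
        Next
    End Sub
End Module
```

Of these samples, only the single space gives different results: [\p{L} ] accepts it, while \p{L}{1} rejects it. So as long as the schema keeps the \p{L}{1} facet, a blank initial will fail validation regardless of what the VB regex allows.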

