regex - Matching a specific Unicode char in a Haskell regexp


This is a Mac/OS X related problem!

I have the following 3-character-long Haskell string:

"a\160b" 

I want to match and replace the middle character.

Several approaches, like

ghci> :m +Text.Regex
ghci> subRegex (mkRegex "\160") "a\160b" "x"
"*** Exception: user error (Text.Regex.Posix.String died: (ReturnCode 17,"illegal byte sequence"))
ghci> subRegex (mkRegex "\\160") "a\160b" "x"
"a\160b"

did not yield the desired result.

How do I have to modify the regexp or my environment to replace '\160' with 'x'?

The problem seems to have its root in the locale/encoding of the input.

bash> locale
LANG=
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=

I modified my .bashrc to export the following env vars:

bash> locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"

But this did not change the behavior at all.

I was able to reproduce the problem by setting my locale to 'en_US.UTF-8'. (I am using Mac OS X.)

bash> export LANG=en_US.UTF-8
bash> ghci
GHCi, version 6.12.1: http://www.haskell.org/ghc/  :? for help
Prelude> :m +Text.Regex
Prelude Text.Regex> subRegex (mkRegex "\160") "a\160b" "x"
"*** Exception: user error (Text.Regex.Posix.String died: (ReturnCode 17,"illegal byte sequence"))

Setting your locale to 'C' should fix the problem:

bash> export LANG=C
bash> ghci
GHCi, version 6.12.1: http://www.haskell.org/ghc/  :? for help
Prelude> :m +Text.Regex
Prelude Text.Regex> subRegex (mkRegex "\160") "a\160b" "x"
"axb"

Unfortunately, I don't have an explanation for why the locale is causing this problem.
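
If only a single known character needs replacing, it may be possible to avoid the regex engine, and with it the locale dependence, altogether. The following is a minimal sketch under that assumption; it is not the approach used above, and the names replaceChar and example are hypothetical helpers introduced only for illustration:

```haskell
-- Hypothetical helper (not part of Text.Regex): replace every occurrence
-- of one character with another using a plain map, so no POSIX regex
-- call and no locale-dependent byte decoding is involved.
replaceChar :: Char -> Char -> String -> String
replaceChar old new = map (\c -> if c == old then new else c)

-- The string from the question, with the middle character '\160'
-- (the non-breaking space, U+00A0) replaced by 'x'.
example :: String
example = replaceChar '\160' 'x' "a\160b"
```

Loaded into GHCi, evaluating example gives "axb". This only covers literal single-character replacement; anything that needs real pattern matching would still have to go through a regex library.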

