asp.net - Reading characters from an unknown character encoding
I have a string that came from an old database with an unknown character encoding. I am having trouble decoding/filtering the string so that it shows the correct text.
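What the data looks like in the database: marronniÃ¨re Ã  quatre pans
What I need the string to show as: marronnière à quatre pans
Specifically, I am having trouble parsing the string so that it can display the character à (which comes through as Ã followed by a non-breaking space).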
This is an ASP.NET 2.0 site written in VB using a SQL Server 2005 database. I'm not sure if it matters, but the data comes from a column with the collation SQL_Latin1_General_CP1_CI_AS.
I've tried encoding the string with various encodings in code, to no avail. I've also passed the string (encoded in different ways) to a byte array to look for a unique byte pattern for the bad characters, without success.
Any ideas or leads are appreciated, thanks.
It sounds like the collation in the SQL Server database doesn't match the character encoding that was actually used :( It's a common mistake made by careless developers.
That's why the SQL Server administration tools are showing weird characters rather than the strings you are expecting.
Possibly UTF-8? In UTF-8, è is represented by the bytes 0xC3 0xA8, which are interpreted under the Windows "Latin-1" code page (Windows-1252) as Ã¨. I know nothing about SQL Server collations, but it seems SQL_Latin1_General_CP1_CI_AS is similar to Windows "Latin-1".
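To check whether this guess fits the data, the corruption can be reproduced on purpose. The snippet below is only a sketch in Python (not the VB code from the site), and it assumes that Python's cp1252 codec is a fair stand-in for the code page behind the column's collation:

```python
# Simulate how UTF-8 bytes end up displayed as mojibake under Windows-1252.
original = "marronnière à quatre pans"

# Encode the correct text the way the old application presumably stored it.
utf8_bytes = original.encode("utf-8")

# "è" (U+00E8) becomes the two bytes 0xC3 0xA8 in UTF-8.
e_grave_bytes = "è".encode("utf-8")
print(e_grave_bytes.hex())  # c3a8

# Reading the same bytes as Windows-1252 yields one character per byte,
# so every accented letter turns into a two-character sequence.
misread = utf8_bytes.decode("cp1252")
print(misread)
```

If the printed text matches what the database tools show, the stored bytes are most likely UTF-8 that is being read as Windows-1252.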
You either need to:
- Fix the encoding when reading from the database. This is ugly and confusing for the next poor victim who has to deal with the database and the code.
- Or, better, correct the data in the database so that it matches the collation. You might change the collation to UTF-8 or UTF-16; you would need to change the data as well, though.
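The first option (repairing the text on read) could look roughly like the sketch below. It is written in Python for illustration, and `repair_mojibake` is a hypothetical helper name, not something from the original site. In .NET the same round trip would presumably be `Encoding.GetEncoding(1252).GetBytes(...)` followed by `Encoding.UTF8.GetString(...)`. It assumes the damage is exactly "UTF-8 bytes decoded as Windows-1252"; text that does not fit that pattern is returned unchanged.

```python
def repair_mojibake(text: str) -> str:
    """Undo 'UTF-8 read as Windows-1252' damage.

    Returns the text unchanged if it does not fit that pattern.
    """
    try:
        # Recover the original bytes, then decode them as the UTF-8 they really are.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in cp1252, or the bytes are not valid UTF-8:
        # the string was probably not damaged this way, so leave it alone.
        return text


# The damaged value, written with escapes so the non-breaking space is visible.
damaged = "marronni\u00c3\u00a8re \u00c3\u00a0 quatre pans"
print(repair_mojibake(damaged))  # marronnière à quatre pans
```

Falling back to the input on errors means already-correct strings pass through untouched, which matters if only part of the column is damaged. One caveat: Python's cp1252 codec leaves a few byte values (such as 0x81 and 0x8D) undefined, so some damaged characters cannot be round-tripped this way and will fall through unchanged.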