Question:
I have a text file that contains a mixture of utf-8 character data and ANSI character data. How can I convert just the utf-8 bytes to ANSI? (or instead convert just the ANSI bytes to utf-8?)
Answer:
There is no perfect solution. The best you can do is scan the bytes one by one and pick out the sequences that are most likely to be utf-8. If the utf-8 text is typically Western European characters with diacritics (i.e. accent marks), the best 2-byte sequence to look for is a 0xC2 or 0xC3 lead byte followed by a continuation byte in the range 0x80 to 0xBF. Those pairs encode U+0080 through U+00FF, which covers the accented Latin-1 letters. As an example:
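A minimal Python sketch of this scan, assuming "ANSI" here means Windows-1252 (the question does not say which code page is in use). The function name `decode_mixed` and the `ansi` parameter are made up for illustration:

```python
def decode_mixed(data: bytes, ansi: str = "cp1252") -> str:
    """Heuristically decode bytes that mix utf-8 and single-byte ANSI text.

    Assumption: the utf-8 parts only contain characters in U+0080..U+00FF,
    i.e. a 0xC2 or 0xC3 lead byte followed by one continuation byte.
    """
    out = []
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        # A 0xC2/0xC3 lead byte followed by 0x80..0xBF is treated as utf-8.
        if b in (0xC2, 0xC3) and i + 1 < n and 0x80 <= data[i + 1] <= 0xBF:
            out.append(data[i:i + 2].decode("utf-8"))
            i += 2
        else:
            # Everything else (including plain ASCII) is decoded as ANSI.
            # Bytes that are undefined in the code page become U+FFFD.
            out.append(data[i:i + 1].decode(ansi, errors="replace"))
            i += 1
    return "".join(out)


# "cafe" with a utf-8 e-acute, then "naive" with an ANSI i-diaeresis.
mixed = b"caf\xc3\xa9 na\xefve"
text = decode_mixed(mixed)

all_utf8 = text.encode("utf-8")                      # everything as utf-8
all_ansi = text.encode("cp1252", errors="replace")   # everything as ANSI
```

Once the text is decoded to a single string, either direction of the conversion is just a matter of which encoding you write it out in. The heuristic can still guess wrong: ANSI text that really does contain "Ã" followed by a character such as "©" has the same bytes as a utf-8 "é" and will be read as utf-8.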