UTF-8 is a multibyte encoding of Unicode

Question:

As long as I have your eyeballs, I do have one other question:

I am using this method in a class constructor:

m_zip.put_Utf8(TRUE);
 
then trying to call OpenZip:
 
CString sFilePath = _T("some file name");
if (m_zip.OpenZip(sFilePath.GetBuffer()))
...
 
and getting a compiler error:
 
Error 1 error C2664: 'CkZip::OpenZip' : cannot convert parameter 1 from 'wchar_t *' to 'const char *'
 
I am compiling with UNICODE and _UNICODE in the preprocessor definitions.

It was my understanding that calling put_Utf8(TRUE) would let the Chilkat code handle Unicode filenames.

Any ideas what I am doing wrong???

Answer:
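UTF-8 is a multibyte encoding of Unicode: each character takes one to four bytes. It is not the 2-byte-per-character Unicode (UTF-16) that a wchar_t string holds on Windows. From a C++ perspective, a UTF-8 string is still a null-terminated "char *". If you call put_Utf8(true), all "char *" inputs are expected to be UTF-8 rather than ANSI. The call does not change any method signatures, so OpenZip still takes a "const char *", and passing it a "wchar_t *" produces the C2664 error you are seeing.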
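
To make the difference between wide characters and UTF-8 bytes concrete, here is a minimal sketch of the conversion written in plain C++ with no Chilkat dependency. The helper name wideToUtf8 is hypothetical and is not part of the Chilkat API. It assumes the input is well-formed and does not validate unpaired surrogates.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a Chilkat function): encode a wide string as UTF-8.
// Works whether wchar_t is 16 bits (UTF-16, as on Windows) or 32 bits.
// Assumption: input is well-formed; unpaired surrogates are not rejected.
std::string wideToUtf8(const std::wstring &w)
{
    std::string out;
    for (std::size_t i = 0; i < w.size(); ++i) {
        std::uint32_t cp = static_cast<std::uint32_t>(w[i]);

        // Combine a UTF-16 surrogate pair into a single code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < w.size()) {
            std::uint32_t lo = static_cast<std::uint32_t>(w[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;  // the low surrogate has been consumed
            }
        }

        // Emit one to four bytes depending on the code point's magnitude.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

For example, wideToUtf8(L"r\u00E9sum\u00E9.zip") turns 10 wide characters into 12 bytes, because each accented letter needs two bytes. Calling c_str() on the result gives the kind of null-terminated "const char *" that a UTF-8-aware method expects.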

You can convert a wide-character string to UTF-8 with CkString like this:

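// Wide-character input, e.g. the path from the question.
// In a UNICODE build a CString converts implicitly to "const wchar_t *".
const wchar_t *myString = L"some file name";

CkString s;
// Append the whole null-terminated wide string:
s.appendU(myString);
// or, instead, append only the first numChars wide characters:
//   int numChars = 10;
//   s.appendNU(myString, numChars);

// Then pass the UTF-8 "const char *" to OpenZip:
m_zip.OpenZip(s.getUtf8());

// Note: getUtf8() is different from get_Utf8(). getUtf8() returns the string as UTF-8 encoded characters,
// whereas get_Utf8() returns true/false to indicate whether UTF-8 strings are expected.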