converting to UNICODE, _TCHAR and TCHAR, writing text files

Hi,

I'm converting an MFC application to support Unicode. I have some (probably
noob) questions:

- What is the difference between the types _TCHAR and TCHAR ? I see some
UNICODE applications use _TCHAR and others TCHAR.

- I have to convert the way text files are written. These text files are sent
to machines over a parallel port, so they have to stay the same as before
(when my application had no _UNICODE preprocessor definition). The functions
that are used to write these files are fputs, fopen, etc. When I change them
to support wide characters, are the text files still the same?

example (_UNICODE has been defined):

 FILE *unicodeTextFile = _tfopen(_T("uni_test.txt"), _T("w+")); // used to be fopen
 _TCHAR test[] = _T("Is this in UNICODE ?? abcdefghijklmnopqrstuvwxyz0123456789"); // used to be char
 _fputts(test, unicodeTextFile); // used to be fputs
 fclose(unicodeTextFile);

Above example will create a file uni_test.txt, but all characters are single
byte in this file. When will this text file support wide characters? I was
thinking that wide characters would be written into this file.

- My final question. How do I define a Chinese/Japanese string in the code?
When I cut and paste from Chinese files, all I get is question marks in the code.

-----
Thanks,
Ben.


6/29/2004 3:14:01 PM

> - What is the difference between the types _TCHAR and TCHAR ? I see some
> UNICODE applications use _TCHAR and others TCHAR.
No difference. I always use TCHAR and never mix them (just to avoid confusion).
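
A rough sketch of how the generic-text mapping works, for anyone wondering why the two names are interchangeable. This is not the real tchar.h: every SKETCH_/Sketch name below is made up for illustration. As far as I know, _TCHAR comes from the CRT header and follows _UNICODE, while the Windows headers also declare TCHAR and follow UNICODE, so the two agree as long as both macros are defined together (which an MFC Unicode build normally does).

```cpp
#include <cassert>
#include <cstring>
#include <cwchar>

// Hypothetical stand-in for the _UNICODE switch; set to 0 for the narrow build.
#define SKETCH_UNICODE 1

#if SKETCH_UNICODE
typedef wchar_t SketchTChar;               // plays the role of TCHAR / _TCHAR
#define SKETCH_T(x) L##x                   // plays the role of _T("...")
inline std::size_t SketchTcsLen(const SketchTChar* s) { return std::wcslen(s); }
#else
typedef char SketchTChar;
#define SKETCH_T(x) x
inline std::size_t SketchTcsLen(const SketchTChar* s) { return std::strlen(s); }
#endif

// The same source line compiles in both builds; only the element size changes.
inline std::size_t GreetingLength()
{
    const SketchTChar greeting[] = SKETCH_T("hello");
    assert(sizeof(greeting) == 6 * sizeof(SketchTChar));   // 5 characters + terminator
    return SketchTcsLen(greeting);
}
```

Flipping SKETCH_UNICODE to 0 changes the character type and the function behind SketchTcsLen without touching GreetingLength, which is the whole point of the _T / TCHAR family.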

> - I have to convert the way textfiles are written.
....
> When will this text file support wide characters? I was
> thinking that wide characters would be written into this file.
The main advantage of a Unicode application is the ability to process
international data. You don't want to lose this.
Best option: convert to UTF-8 before writing to the file, and convert from UTF-8 after
you read from the file.
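
A minimal, portable sketch of the "convert to UTF-8 before writing" step. On Windows you would more likely call WideCharToMultiByte; the hand-written encoder below (the name Utf16ToUtf8 is my own) is only there to show what the conversion does. Note that plain ASCII text comes out byte-for-byte unchanged, which matters if the files sent to the machines only ever contain ASCII.

```cpp
#include <cassert>
#include <string>

// Encode UTF-16 text as UTF-8. Unpaired surrogates are replaced by U+FFFD.
inline std::string Utf16ToUtf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {                       // high surrogate
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;                                              // consumed the low half
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {                // stray low surrogate
            cp = 0xFFFD;
        }
        if (cp < 0x80) {                                          // 1 byte: same as ASCII
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {                                  // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {                                // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                                                  // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Usage: ASCII is unchanged, a CJK character becomes three bytes.
inline void Utf16ToUtf8Demo()
{
    assert(Utf16ToUtf8(u"abc123") == "abc123");
    assert(Utf16ToUtf8(u"\u8D7E") == "\xE8\xB5\xBE");
}
```

The resulting std::string can be written with the same fputs/fwrite calls the application used before the Unicode conversion.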

> - My final question. How do i define a chinese/japanese string in the code?
> When i cut/paste from chinese files all i get is question marks in the
> code.
You don't do that in the code. You do it in resources (for example, a string table).

-- 
Mihai
-------------------------
Replace _year_ with _ to get the real email
6/30/2004 7:32:29 AM
See below...
On Tue, 29 Jun 2004 17:14:01 +0200, "JoeKowalski" <JoeKowalski@usa.com> wrote:

>Hi,
>
>i'm converting a MFC application to support unicode. I have some (probably
>noob) questions:
>
>- What is the difference between the types _TCHAR and TCHAR ? I see some
>UNICODE applications use _TCHAR and others TCHAR.
****
_TCHAR is a synonym for TCHAR
****
>
>- I have to convert the way textfiles are written. These text-files are send
>to machines using parallel port, so they have to stay the same as before
>(when my application had no _UNICODE preprocessor definition). The commands
>that are used to write these files are fputs, fopen, etc. When changing it
>to support wide characters are the textfiles still the same?
>
>example (_UNICODE has been defined):
>
> FILE *unicodeTextFile = _tfopen(_T("uni_test.txt"), _T("w+")); // used to be fopen
> _TCHAR test[] = _T("Is this in UNICODE ?? abcdefghijklmnopqrstuvwxyz0123456789"); // used to be char
> _fputts(test, unicodeTextFile); // used to be fputs
> fclose(unicodeTextFile);
>
>Above example will create a file uni_test.txt, but all characters are single
>byte in this file. When will this text file support wide characters? I was
>thinking that wide characters would be written into this file.
*****
I've not used FILE * for this, but ordinary WriteFile (or CFile::Write) has worked just
fine for me.
*****
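
A hedged sketch of why the file above came out single-byte and how to get wide characters into it. As I understand the Microsoft CRT documentation, the wide stream functions convert to multibyte characters when the stream is open in text mode (the default), and only write the wide characters unchanged when the stream is open in binary mode. So the "w+" file stays in the old 8-bit format, which is probably what the machines on the parallel port need. If a true UTF-16 file is wanted, opening in binary mode and writing the bytes yourself is one way; the function names below are my own, not from any library.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Write text as UTF-16LE with a byte-order mark, one byte at a time, so the
// result depends neither on text-mode translation nor on host endianness.
inline bool WriteUtf16LeFile(const char* path, const std::u16string& text)
{
    std::FILE* f = std::fopen(path, "wb");            // binary mode: no conversion
    if (!f) return false;
    bool ok = std::fputc(0xFF, f) != EOF && std::fputc(0xFE, f) != EOF;   // BOM
    for (std::size_t i = 0; ok && i < text.size(); ++i) {
        ok = std::fputc(text[i] & 0xFF, f) != EOF &&          // low byte first
             std::fputc((text[i] >> 8) & 0xFF, f) != EOF;     // then high byte
    }
    return std::fclose(f) == 0 && ok;
}

// Read a whole file back as raw bytes (helper for checking the result).
inline std::string ReadAllBytes(const char* path)
{
    std::string bytes;
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return bytes;
    int c;
    while ((c = std::fgetc(f)) != EOF) bytes += static_cast<char>(c);
    std::fclose(f);
    return bytes;
}
```

Writing the two-character text "AB" this way gives a six-byte file: the two BOM bytes followed by two bytes per character.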
>
>- My final question. How do i define a chinese/japanese string in the code?
>When i cut/paste from chinese files all i get is question marks in the code.
*****
There is a C convention of ?-escaped sequences (called "trigraph sequences"), but as far as I
recall ISO Standard C defines only nine of them (such as ??= for #), and they stand in for
punctuation missing from some keyboards, not for wide characters. It has been many years
(about 15) since I last dealt with this, but I doubt trigraphs are what you are seeing.

More likely it is caused by a Unicode-to-ANSI conversion during the paste, where the ?
characters indicate 16-bit characters that have no 8-bit equivalents. At this point,
things get clumsy.

Here's how I'd tackle it. This may not be optimal, and someone may have a better idea.
Certainly I would consider writing a little program to do all this, for example, by
pasting all the lines into a Unicode file, then writing a 20-line C program that read each
line and emitted a piece of 8-bit C code which I would then copy-and-paste into my
program.

Create an ordinary text file with something like "test" in it, using NotePad.
Save this file in Unicode format.
Paste the string into this text window, replacing this line. You should see Unicode text.
Save the file.
Open the file in Visual Studio, in binary mode.
Retype the bytes you see into your string using \x formatting.

If I had to do this for more than two strings, or more than about 20 characters, I'd write
the program described above. You can probably write the program faster than you can retype
the lines.

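void ConvertFileTo8Bit(const CString & filename)
   {
    CFile f;
    if(!f.Open(filename, CFile::modeRead))
       { /* deal with error */
        ... 
        return;
       } /* deal with error */

    DWORD len = (DWORD)f.GetLength();
    BYTE * buffer = new BYTE[len];
    len = f.Read(buffer, len);            // actually read the file contents
    f.Close();

    LPCWSTR uni = (LPCWSTR)buffer;
    bool open = false;                    // inside an L"..." literal?
    bool hex = false;                     // was the last thing emitted a \x escape?
    for(DWORD i = 0; i < len / sizeof(WCHAR); i++)
       {
        WCHAR ch = uni[i];
        if(ch == 0xFEFF || ch == L'\r')   // skip the byte-order mark and CRs
           continue;
        if(ch == L'\n')
           { /* eol */
            if(open)
               printf("\"\n");
            open = hex = false;
            continue;
           } /* eol */
        if(!open)
           printf("L\"");
        else if(hex && iswxdigit(ch))
           printf("\" L\"");              // keep the digit out of the \x escape
        open = true;
        hex = false;
        if(ch == L'"' || ch == L'\\')
           printf("\\%c", (char)ch);      // escape quotes and backslashes
        else if(ch >= L' ' && ch <= L'~')
           printf("%c", (char)ch);
        else
           {
            printf("\\x%04x", (unsigned)ch);
            hex = true;
           }
       }
    if(open)
       printf("\"\n");                    // close the final literal
    delete [ ] buffer;
   }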

This will emit in stdout a series of strings of the form
L"ABC\x8d7E test"

(I have no idea what character 8d7E actually means; I flipped my Unicode book open to a
page on JIS characters and chose one randomly).

Note that you would not want to use _T around such strings; they really are always wide
strings and would have to be L" always, even in a non-Unicode app. 
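
A small sketch of what such an emitted string looks like once it is pasted into a program (the names kSample and kSplit are only for illustration). It also shows one trap with \x escapes: a hex escape keeps consuming hex digits, so a character such as 'a' or '7' directly after the escape has to go into a separate adjacent literal.

```cpp
#include <cassert>
#include <cwchar>

// Always a wide literal, whatever _UNICODE says: L"..." rather than _T("...").
static const wchar_t kSample[] = L"ABC\x8d7E test";

// "\x8d7E" directly followed by "ab" would be read as one long escape.
// Ending the literal stops the escape; the compiler joins adjacent literals.
static const wchar_t kSplit[] = L"\x8d7E" L"ab";

// Number of characters in each sample, not counting the terminator.
inline std::size_t SampleLength() { return std::wcslen(kSample); }
inline std::size_t SplitLength()  { return std::wcslen(kSplit); }
```

Here kSample holds nine characters with the escaped one at index 3, and kSplit holds three.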
			joe


>
>-----
>Thanks,
>Ben.
>

Joseph M. Newcomer [MVP]
email: newcomer@flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
6/30/2004 10:13:24 PM
Thanks Joe, I was looking for something like that.

I also noticed from your email that you are the writer of the CString
Management page on www.flounder.com, which was
a great help to us here.
7/1/2004 10:32:21 AM
Follow up question-
How about support for multi-byte code pages in non-unicode MFC apps?

I hope never to start such a project, but I'm working on a "legacy" one right
now.

My question is, does one need to use TCHAR instead of char to get this kind
of app to work on, say a Chinese Win98 machine?

Richard
7/1/2004 11:24:46 PM
WideCharToMultiByte/MultiByteToWideChar with the "code page" designated as CP_UTF8
will convert between Unicode and UTF-8. There are some other
multibyte character sets supported, but the only one I've ever needed is UTF-8. In this
case, you would use 8-bit (multibyte) strings for the external representation and write all
the internals of the program using Unicode.
					joe

On Thu, 1 Jul 2004 16:24:46 -0700, "Richard Otter" <sorry@yahoo.com> wrote:

>Follow up question-
>How about support for multi-byte code pages in non-unicode MFC apps?
>
>I hope never to start such a project, but I'm working on a "legacy"one right
>now.
>
>My question is, does one need to use TCHAR instead of char to get this kind
>of app to work on, say a Chinese Win98 machine?
>
>Richard

Joseph M. Newcomer [MVP]
email: newcomer@flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
7/2/2004 1:10:19 AM
> Follow up question-
> How about support for multi-byte code pages in non-unicode MFC apps?
> 
> I hope never to start such a project, but I'm working on a "legacy"one
> right now.
> 
> My question is, does one need to use TCHAR instead of char to get this
> kind of app to work on, say a Chinese Win98 machine?
It is not necessary, but it may be useful in the long term (if at some point you
want to switch to Unicode). What IS necessary is using the generic
functions (the _t forms). Here is why: to compare two strings in a
case-insensitive way you have to use _stricmp for single-byte languages,
_mbsicmp for double-byte languages and _wcsicmp for Unicode applications.
Isn't it easier to use _tcsicmp?

Also you have to be careful when iterating over strings. Example:

char *s = string;
while( *s != '\\' )
    	s++;

This is wrong!
Some Japanese characters have 0x5C as their second byte (the same value as a backslash).
Use CharNext, CharNextExA, CharPrev, CharPrevExA for string iteration.
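
A portable sketch of the problem with the loop above, assuming Shift-JIS (code page 932) data. In a real Windows program you would let CharNext or IsDBCSLeadByte decide what a lead byte is; the hard-coded ranges and the function names here are only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Shift-JIS lead-byte ranges (assumption: the data is in code page 932).
inline bool IsSjisLeadByte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Byte-wise search: what the loop above effectively does (plus an end check).
inline std::ptrdiff_t NaiveFindBackslash(const char* s)
{
    for (const char* p = s; *p; ++p)
        if (*p == '\\') return p - s;
    return -1;
}

// Character-wise search: step over the trail byte of each double-byte character.
inline std::ptrdiff_t DbcsFindBackslash(const char* s)
{
    for (const char* p = s; *p; ++p) {
        if (IsSjisLeadByte(static_cast<unsigned char>(*p))) {
            if (p[1] == '\0') break;      // truncated character: stop scanning
            ++p;                          // skip the trail byte, whatever its value
            continue;
        }
        if (*p == '\\') return p - s;
    }
    return -1;
}
```

For the bytes 0x83 0x5C (one katakana character in Shift-JIS) followed by a real backslash, the byte-wise search stops at offset 1, in the middle of the character, while the character-wise search finds the real separator at offset 2.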

There are many other things, but the general answer is: generic data
types are not necessary for DBCS, but they do help. They are not an MFC
mechanism, so there is no reason to avoid them in non-MFC applications.

-- 
Mihai
-------------------------
Replace _year_ with _ to get the real email
7/2/2004 6:00:05 AM