Want Input boxes to accept unicode strings on Standard Windows XP

 
   I have an MFC application that is currently built in MBCS mode. If I run 
the program on a Chinese OS (Windows XP), the input boxes (Edit Controls) 
can accept Chinese characters and display them correctly. If I run it on a 
standard English XP, the input boxes won't accept Chinese characters (they 
display as "????") -- please note that I have already installed CJK support 
on the system, and IE and Outlook can display Chinese correctly. 

   Is this just because of different MFC libraries used for the application? 
Can I force the application running on standard XP to use the Unicode 
libraries, so as to force the Edit Controls to accept Chinese character input? 
Currently, completely rewriting the code to use Unicode is not an option for me.

   Any help will be appreciated.

  Paul


-- 
Developer
0
PaulWu (6)
7/24/2007 8:46:07 PM

Have you tried building with Unicode?  I switched my applications over to 
this and I got away from a ton of code page problems.  If you build with 
Unicode it will use the Unicode versions of the MFC as well.

Tom

"Paul Wu" <PaulWu@discussions.microsoft.com> wrote in message 
news:98CBAF48-324F-4167-85F0-6F4E72588838@microsoft.com...
>
>   I have a MFC application that is currently built with MBCS mode. If I 
> run
> the program on a Chinese OS (Windows XP), the input boxes  (Edit Controls)
> can accept  Chinese chars and display correctly. If I run it on a standard
> English XP, the input boxes won't accept Chinese chars (display as 
> "????") -- 
> please note that I have already installed CKJ on the system and IE and
> Outlook can display Chinese correctly.
>
>   Is this just because of different MFC libraries used for the 
> application?
> Can I force the application running on Standard XP to use the unicode
> libraries so to force Edit Controls accept Chinese chars input? Currently
> completely rewriting the code to use unicode is not option for me.
>
>   Any help will be appreciated.
>
>  Paul
>
>
> -- 
> Developer 

0
tom.nospam (3240)
7/24/2007 9:02:22 PM
Thanks for replying. As I said, building it with Unicode takes huge effort -- 
not feasible at the current stage. We just want the application to be able to 
process some Unicode text now (on standard Windows XP). 

I looked at the application -- it was built with static MFC libraries 
(Visual Studio 2003).  So the MFC libraries may not be the problem -- I just 
don't understand why, when it runs on Chinese Windows XP, the Edit Controls 
can accept Chinese text.

Paul

-- 
Developer


"Tom Serface" wrote:

> Have you tried building with Unicode?  I switched my applications over to 
> this and I got away from a ton of code page problems.  If you build with 
> Unicode it will use the Unicode versions of the MFC as well.
> 
> Tom
> 
> "Paul Wu" <PaulWu@discussions.microsoft.com> wrote in message 
> news:98CBAF48-324F-4167-85F0-6F4E72588838@microsoft.com...
> >
> >   I have a MFC application that is currently built with MBCS mode. If I 
> > run
> > the program on a Chinese OS (Windows XP), the input boxes  (Edit Controls)
> > can accept  Chinese chars and display correctly. If I run it on a standard
> > English XP, the input boxes won't accept Chinese chars (display as 
> > "????") -- 
> > please note that I have already installed CKJ on the system and IE and
> > Outlook can display Chinese correctly.
> >
> >   Is this just because of different MFC libraries used for the 
> > application?
> > Can I force the application running on Standard XP to use the unicode
> > libraries so to force Edit Controls accept Chinese chars input? Currently
> > completely rewriting the code to use unicode is not option for me.
> >
> >   Any help will be appreciated.
> >
> >  Paul
> >
> >
> > -- 
> > Developer 
> 
0
PaulWu (6)
7/24/2007 9:14:01 PM
Paul Wu schrieb:

> Thanks for replying. As I said, building it with Unicode takes huge effort -- 
> not feasible at current stage. We just want the applications to be albe to 
> process some unicode texts now (on Standard Windows XP). 
> 
> I looked at the application -- it was built with static MFC libraries 
> (Visual Studio 2003).  So the MFC libraries may not the problem -- I just 
> don't understand why when it runs on Chinese Windows XP, the Edit Controls 
> can accept Chinese texts.

Go into Control Panel -- Regional Settings -- Languages 
and select the support for East Asian languages. This installs the required 
files to display and use Chinese text on any Windows XP.

Norbert
0
nunterberg (207)
7/24/2007 10:29:45 PM
"Paul Wu" <PaulWu@discussions.microsoft.com> wrote in message 
news:11226219-DE3D-4722-BDCE-7FDB8C49C766@microsoft.com...
> Thanks for replying. As I said, building it with Unicode takes huge 
> effort -- 
> not feasible at current stage. We just want the applications to be albe to
> process some unicode texts now (on Standard Windows XP).
>
> I looked at the application -- it was built with static MFC libraries
> (Visual Studio 2003).  So the MFC libraries may not the problem -- I just
> don't understand why when it runs on Chinese Windows XP, the Edit Controls
> can accept Chinese texts.
>

There is a Regional Control Panel that lets you specify the default code 
page for non-Unicode apps.  If you set that to Chinese, then restart your 
app, does it work?

If this works, I think you can call SetThreadLocale() in your 
CWinApp-derived class's InitInstance() method to accomplish the same thing 
without worrying about the Control Panel setting.
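
Something along these lines (a rough sketch only; CMyApp and the Simplified 
Chinese locale are just examples -- use whatever CWinApp-derived class and 
LANGID you actually need):

    BOOL CMyApp::InitInstance()
    {
        // Ask for a Simplified Chinese thread locale before any UI is created.
        SetThreadLocale(MAKELCID(MAKELANGID(LANG_CHINESE, SUBLANG_CHINESE_SIMPLIFIED),
                                 SORT_DEFAULT));

        CWinApp::InitInstance();

        // ... create the main window / dialog as usual ...
        return TRUE;
    }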

-- David 


0
dc2983 (3206)
7/24/2007 10:33:56 PM
Are you sure about that "huge effort"?  I've had non-Unicode apps cross my desk with
instructions to convert them to Unicode, and I can usually do it with about three days of
editing and testing, and get it nearly perfect...and the remaining bugs are found within a
couple days of testing.  I'm talking source here on the order of 60K SLOC, but the effort
is about the same for 20K SLOC because it is usually just a long set of very similar
substitution patterns most of which are automated pattern search-and-replace with my text
editor.
					joe
On Tue, 24 Jul 2007 14:14:01 -0700, Paul Wu <PaulWu@discussions.microsoft.com> wrote:

>Thanks for replying. As I said, building it with Unicode takes huge effort -- 
>not feasible at current stage. We just want the applications to be albe to 
>process some unicode texts now (on Standard Windows XP). 
>
>I looked at the application -- it was built with static MFC libraries 
>(Visual Studio 2003).  So the MFC libraries may not the problem -- I just 
>don't understand why when it runs on Chinese Windows XP, the Edit Controls 
>can accept Chinese texts.
>
>Paul
Joseph M. Newcomer [MVP]
email: newcomer@flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
0
newcomer (15974)
7/25/2007 12:30:35 AM
Hi David,

This works most of the time, but I've found with Asian languages there are 
always some problems and the MFC libraries will still display in English. 
There is always the problem of ensuring that the code page on the user's 
computer is correct as well (not just the developer's).  If the user saves a 
file in Chinese (even if the code page is correct) and then opens it on an 
English system, the file will appear "corrupted".  There will also be problems 
with translated strings in things like XML files as well.

I worked through a lot of these issues then decided it was easier to just go 
to Unicode for any application where I actually need multiple byte 
characters (like Asian languages).

I guess you could make it work so long as you always know the exact code 
page for the strings, but this is always making an assumption.

Tom

"David Ching" <dc@remove-this.dcsoft.com> wrote in message 
news:o7vpi.53032$5j1.34602@newssvr21.news.prodigy.net...
> "Paul Wu" <PaulWu@discussions.microsoft.com> wrote in message 
> news:11226219-DE3D-4722-BDCE-7FDB8C49C766@microsoft.com...
>> Thanks for replying. As I said, building it with Unicode takes huge 
>> effort -- 
>> not feasible at current stage. We just want the applications to be albe 
>> to
>> process some unicode texts now (on Standard Windows XP).
>>
>> I looked at the application -- it was built with static MFC libraries
>> (Visual Studio 2003).  So the MFC libraries may not the problem -- I just
>> don't understand why when it runs on Chinese Windows XP, the Edit 
>> Controls
>> can accept Chinese texts.
>>
>
> There is a Regional Control Panel that lets you specify the default code 
> page for non-Unicode apps.  If you set that to Chinese, then restart your 
> app, does it work?
>
> If this works, I think you can call SetThreadLocale() in your app's 
> CWinApp-derived::OnInitInstance() method to accomplish the same thing 
> without worrying about the Control Panel setting.
>
> -- David
> 

0
tom.nospam (3240)
7/25/2007 12:33:26 AM
Hi Joe,

I think the effort to do a really big application can be pretty "huge", but 
you have to weigh it against the effort of trying to get it to work other 
ways.  A lot of it depends on how the strings are implemented as well.  If 
they are mostly in files or the .rc file then it is easy to convert them 
(you can even use NotePad).  The 2005 version of VS will even use Unicode 
.RC files if they are converted first, which is very handy.  Users will also 
have to use the _T() and TCHAR macros which is why we harp on it so much 
even when just starting with MBCS.  Still, you're right, once you go through 
the process a couple of times the conversion thing is pretty academic.  The 
compiler will gripe about strings that don't have the _T() macro around them 
so they are easy to find for the most part.  The hardest things are things 
like:

CString cs = "My String";

Since CString will try to convert the string to MBCS...
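
The fix, of course, is just to wrap the literal so it is a native string in 
either build:

CString cs = _T("My String");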

I think Giovanni wrote some code to update un-macro'd strings to the _T() 
versions.  Did you post that on your site?

Tom

"Joseph M. Newcomer" <newcomer@flounder.com> wrote in message 
news:766da3dava3rt8jimcqb10jkm8bcn8m4hg@4ax.com...
> Are you sure about that "huge effort"?  I've had non-Unicode apps cross my 
> desk with
> instructions to convert them to Unicode, and I can usually do it with 
> about three days of
> editing and testing, and get it nearly perfect...and the remaining bugs 
> are found within a
> couple days of testing.  I'm talking source here on the order of 60K SLOC, 
> but the effort
> is about the same for 20K SLOC because it is usually just a long set of 
> very similar
> substitution patterns most of which are automated pattern 
> search-and-replace with my text
> editor.
> joe
> On Tue, 24 Jul 2007 14:14:01 -0700, Paul Wu 
> <PaulWu@discussions.microsoft.com> wrote:
>
>>Thanks for replying. As I said, building it with Unicode takes huge 
>>effort -- 
>>not feasible at current stage. We just want the applications to be albe to
>>process some unicode texts now (on Standard Windows XP).
>>
>>I looked at the application -- it was built with static MFC libraries
>>(Visual Studio 2003).  So the MFC libraries may not the problem -- I just
>>don't understand why when it runs on Chinese Windows XP, the Edit Controls
>>can accept Chinese texts.
>>
>>Paul
> Joseph M. Newcomer [MVP]
> email: newcomer@flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm 

0
tom.nospam (3240)
7/25/2007 12:39:54 AM
Thank you very much for this tip.  I still have tons of work to do -- but 
hopefully this can save me half of the time.

-- 
Developer


"David Ching" wrote:

> "Paul Wu" <PaulWu@discussions.microsoft.com> wrote in message 
> news:11226219-DE3D-4722-BDCE-7FDB8C49C766@microsoft.com...
> > Thanks for replying. As I said, building it with Unicode takes huge 
> > effort -- 
> > not feasible at current stage. We just want the applications to be albe to
> > process some unicode texts now (on Standard Windows XP).
> >
> > I looked at the application -- it was built with static MFC libraries
> > (Visual Studio 2003).  So the MFC libraries may not the problem -- I just
> > don't understand why when it runs on Chinese Windows XP, the Edit Controls
> > can accept Chinese texts.
> >
> 
> There is a Regional Control Panel that lets you specify the default code 
> page for non-Unicode apps.  If you set that to Chinese, then restart your 
> app, does it work?
> 
> If this works, I think you can call SetThreadLocale() in your app's 
> CWinApp-derived::OnInitInstance() method to accomplish the same thing 
> without worrying about the Control Panel setting.
> 
> -- David 
> 
> 
> 
0
PaulWu (6)
7/25/2007 1:16:01 AM
"Tom Serface" <tom.nospam@camaswood.com> wrote in message 
news:291E3653-F927-48C8-AA3A-3A42E0BAED0F@microsoft.com...
> Hi David,
>
> This works most of the time, but I've found with Asian languages there are 
> always some problems and the MFC libraries will still display in English.

But wouldn't the MFC libraries still display in English even in the UNICODE 
build?  Building in UNICODE doesn't fix that....  Actually, we've statically 
linked to the MFC English version for years, and have never had an issue (at 
least none have been reported), probably because no MFC UI is normally 
displayed.


> There is always the problem of ensuring that the code page on the user's 
> computer is correct as well (not just the developer's).

By this do you mean by setting the Regional Control Panel or 
SetThreadLocale() appropriately?  I did some tests the other day and saw 
that MultiByteToWideChar(CP_ACP, ...) converted an MBCS string to Unicode 
differently based on the Regional Control Panel setting.  My take was that 
setting the Regional Control Panel altered the code page behind CP_ACP.  I presume 
SetThreadLocale() does the same thing, albeit only for the calling thread 
and not on a system global basis.
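
Something like the following is the kind of test I mean (a rough sketch; the 
byte values are just an illustration of non-ASCII ANSI text):

    // These bytes are (I believe) "Nihon" in code page 932 (Shift-JIS); under a
    // different ANSI code page the same bytes mean something else entirely.
    const char mbcs[] = "\x93\xFA\x96\x7B";
    wchar_t wide[16] = {0};
    MultiByteToWideChar(CP_ACP, 0, mbcs, -1, wide, 16);
    // The Unicode result depends on which code page CP_ACP maps to, i.e. the
    // "language for non-Unicode programs" chosen in the Regional Control Panel.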


> If the user saves a file in Chinese (even if the code page is correct) 
> then accesses it in English the file will get "corrupted".

Yes, these were all very well known (and grudgingly accepted) problems in 
the Win9x world where Unicode was not very well supported.


> There will also be problems with translating strings like XML and other 
> things as well.
>

For XML, even if you have an Ansi (non-Unicode) XML file, if the first line 
has at least

    <?xml version="1.0" encoding="<insert encoding>"?>

then IE displays the XML file correctly.  (IE has become our default XML 
viewer.)  So the "encoding" attribute means a lot here.  I still don't know 
if saving the XML file in Unicode (with the 0xFEFF BOM) causes the text to 
be displayed correctly regardless of the "encoding" attribute.  Our little 
XML parser does not read Unicode XML files, nor does it honor the "encoding" 
attribute.  Therefore, even though it returns LPWSTR strings, they have been 
converted to Unicode strings based on the CP_ACP codepage, and that (seems 
to) require the Regional Control Panel to be set to the language that was 
used to create the XML file.  Do you know if MSXML or some of the "big boys" 
or FirstObject parsers read Unicode files?


> I worked through a lot of these issues then decided it was easier to just 
> go to Unicode for any application where I actually need multiple byte 
> characters (like Asian languages).
>

UNICODE builds make it easier to display Asian text, but our problem is how 
to construct reliable LPWSTR from things like XML files.

In some cases it was not straightforward to port from Ansi to Unicode due to 
the fact that the code relies on single-byte character strings to do its 
work.  Things from driver-land wouldn't know what to do with a 
UNICODE string even if we could train device driver writers about UNICODE! 
;)


> I guess you could make it work so long as you always know the exact code 
> page for the strings, but this is always making an assumption.
>

Yes, and I'm not happy with that, but our scheme seems to have been 
acceptable so far.  Perhaps the results aren't so great, it's just that the 
poor people affected by this are so used to it, they don't complain.


-- David


0
dc2983 (3206)
7/25/2007 1:31:09 AM
Great, Paul!  :-)

To clarify, did both the Regional Control Panel and SetThreadLocale() work, 
or just one of them?

Thanks,
David



"Paul Wu" <PaulWu@discussions.microsoft.com> wrote in message 
news:EFE95A13-E586-4554-88D6-EC7B47DC8356@microsoft.com...
> Thank you very much for this tip.  I still have tons of work to do -- but
> hopefully this can save me half of the time.
>
> -- 
> Developer
>
>
> "David Ching" wrote:
>
>> "Paul Wu" <PaulWu@discussions.microsoft.com> wrote in message
>> news:11226219-DE3D-4722-BDCE-7FDB8C49C766@microsoft.com...
>> > Thanks for replying. As I said, building it with Unicode takes huge
>> > effort -- 
>> > not feasible at current stage. We just want the applications to be albe 
>> > to
>> > process some unicode texts now (on Standard Windows XP).
>> >
>> > I looked at the application -- it was built with static MFC libraries
>> > (Visual Studio 2003).  So the MFC libraries may not the problem -- I 
>> > just
>> > don't understand why when it runs on Chinese Windows XP, the Edit 
>> > Controls
>> > can accept Chinese texts.
>> >
>>
>> There is a Regional Control Panel that lets you specify the default code
>> page for non-Unicode apps.  If you set that to Chinese, then restart your
>> app, does it work?
>>
>> If this works, I think you can call SetThreadLocale() in your app's
>> CWinApp-derived::OnInitInstance() method to accomplish the same thing
>> without worrying about the Control Panel setting.
>>
>> -- David
>>
>>
>> 


0
dc2983 (3206)
7/25/2007 1:33:58 AM
"David Ching" <dc@remove-this.dcsoft.com> wrote in message 
news:xJxpi.27927$2v1.1892@newssvr14.news.prodigy.net...

> But wouldn't the MFC libraries still display in English even in the 
> UNICODE build?  Building in UNICODE doesn't fix that....  Actually, we've 
> statically linked to the MFC English version for years, and have never had 
> an issue (at least none have been reported), probably because no MFC UI is 
> normally displayed.

Yes, you're right about that.  That happens based on the Windows 
installation, as far as I can tell.

> By this do you mean by setting the Regional Control Panel or 
> SetThreadLocale() appropriately?  I did some tests the other day and saw 
> that MultiByteToWideChar(CP_ANSI, ...) converted a MBCS string to Unicode 
> differently based on the Regional Control Panel setting.  My take was that 
> setting the Regional Control Panel altered CP_ANSI.  I presume 
> SetThreadLocale() does the same thing, albeit only for the calling thread 
> and not on a system global basis.

The problem, for me, has been that I don't know what language will 
eventually be used.  We even tried embedding the code page number in our 
text file, but still had problems reading some files under different code 
pages.  It was a lot less hassle with Unicode.

> Yes, these were all very well known (and grudgingly accepted) problems in 
> the Win9x world where Unicode was not very well supported.

Yeah, but we're caring less about that all the time ;o)
>
> For XML, even if you have an Ansi (non-Unicode) XML file, if the first 
> line has at least
>
>    <?xml encoding="<insert encoding">
>
> then IE displays the XML file correctly.  (IE has become our default XML 
> viewer.)  So the "encoding" attribute means a lot here.  I still don't 
> know if saving the XML file in Unicode (with the 0xFFEE BOM) causes the 
> text to be displayed correctly regardless of the "encoding" attribute. 
> Our little XML parser does not read Unicode XML files, nor does it honor 
> the "encoding" attribute.  Therefore, even though it returns LPWSTR 
> strings, they have been converted to Unicode strings based on the CP_ANSI 
> codepage, and that (seems to) require the Regional Control Panel to be set 
> to the language that was used to create the XML file.  Do you know if 
> MSXML or some of the "big boys" or FirstObject parsers read Unicode files?

I think CMarkUp handles Unicode, but I haven't tried MSXML.  I know Xerces 
handles it as well.  I tried using the encoding= thing, but it has the same 
problems when a file saved under one language's code page is used under 
another.  I could be wrong, but I found the whole business of juggling code 
pages more trouble than I thought it was worth.

> UNICODE builds make it easier to display Asian text, but our problem is 
> how to construct reliable LPWSTR from things like XML files.

I use the Xerces parser and I've never had a problem with reading or saving 
Unicode files.  Actually I store my XML in UTF-8 to compact them a bit. 
Seems to work OK.

> In some cases it was not straightforward to port from Ansi to Unicode due 
> to the fact that code relies on single-byte character strings to perform 
> their functions.  Things from driver-land which wouldn't know what to do 
> with a UNICODE string if we could even train device driver writers about 
> UNICODE! ;)

Can't argue with that and I understand your point.  Of course, if you are 
relying on single byte strings you're going to have trouble with MBCS in 
Asian languages as well :o)

Tom

0
tom.nospam (3240)
7/25/2007 5:47:24 AM
If something is SBCS/MBCS, it will only handle text in the current system 
code page. So no Chinese on an English system, and so on.

It is possible to do some patching work and have an application that uses
some Unicode controls, and to do some Unicode text handling, without changing 
the whole thing to Unicode.
But in the end it is more painful than a complete migration.
You will discover bug after bug, and you will waste a lot of time tracking
down the problems.
Some examples: once I can input Unicode text, I want to move it around in my
application as Unicode. Then I want to put it on the Clipboard, and save it
to a file as Unicode. And Unicode search-replace. Then print and print 
preview.
Then I want to support Unicode file names, which means the document history 
should also be stored as Unicode, meaning Unicode registry access.
And so on.

Plan for it, bite the bullet, and give up MBCS.
It is less painful in the long run.


-- 
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
0
7/25/2007 7:09:40 AM
> But wouldn't the MFC libraries still display in English even in the UNICODE 
> build?  Building in UNICODE doesn't fix that....  Actually, we've
> statically linked to the MFC English version for years, and have never had
> an issue (at least none have been reported), probably because no MFC UI is
> normally displayed.

Unicode build, system locale, Windows UI, and MFC UI have very little to do 
with each other.
You can get MFC messages that match the language of your UI if you do
the localization and deployment right.

This is how you redistribute MFC localization:
   http://msdn2.microsoft.com/en-us/library/ms235264(vs.80).aspx
This is how you can localize MFC in languages that are not provided by MS:
   http://support.microsoft.com/kb/q208983/
And this is what you do about your application:
   http://www.mihai-nita.net/article.php?artID=20070503a

And then there is the story of the OS dialogs/common controls (the 
Open/Save, Print, Find, and Color/Date picker dialogs, the context menu
of the standard edit control, the Yes/No/OK/Cancel in the MessageBox, etc.)

This has nothing to do with MFC, but with the OS UI locale.


-- 
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
0
7/25/2007 7:17:08 AM
> For XML, even if you have an Ansi (non-Unicode) XML file, if the first line 
> has at least
> 
>     <?xml encoding="<insert encoding">
If the encoding is not specified, then the encoding is assumed to be UTF-8 
(this is what the standard says).


> I still don't know 
> if saving the XML file in Unicode (with the 0xFFEE BOM) causes the text to 
> be displayed correctly regardless of the "encoding" attribute.
That would be wrong according to the standard. And it is not supported by IE.


> Our little 
> XML parser does not read Unicode XML files, nor does it honor the
> "encoding" attribute.
The standard (http://www.w3.org/TR/2006/REC-xml11-20060816/#charencoding)
"All XML processors MUST be able to read entities in both the UTF-8 and UTF-
16 encodings."
So an XML parser without UTF-16 and UTF-8 support is not an XML parser; it is a hack.


> Do you know if MSXML or some of the "big boys" 
> or FirstObject parsers read Unicode files?
All of them do, if they claim to be XML parsers. If not, they are toys.
MSXML, Xerces, Expat, all handle UTF-8, UTF-16, and support encoding.


> Yes, and I'm not happy with that, but our scheme seems to have been 
> acceptable so far.  Perhaps the results aren't so great, it's just that the 
> poor people affected by this are so used to it, they don't complain.
Or they stopped buying your product and moved to something better.


-- 
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
0
7/25/2007 7:26:57 AM
David Ching wrote:

> For XML, even if you have an Ansi (non-Unicode) XML file, if the first line 
> has at least
> 
>     <?xml encoding="<insert encoding">
> 
> then IE displays the XML file correctly.  (IE has become our default XML 
> viewer.)  So the "encoding" attribute means a lot here.  I still don't know 
> if saving the XML file in Unicode (with the 0xFFEE BOM) causes the text to 
> be displayed correctly regardless of the "encoding" attribute.  Our little 
> XML parser does not read Unicode XML files, nor does it honor the "encoding" 
> attribute.  Therefore, even though it returns LPWSTR strings, they have been 
> converted to Unicode strings based on the CP_ANSI codepage, and that (seems 
> to) require the Regional Control Panel to be set to the language that was 
> used to create the XML file.  Do you know if MSXML or some of the "big boys" 
> or FirstObject parsers read Unicode files?

David:

Why don't you use UTF-8 for your XML files? That's what I do. In my 
development version, I only do Unicode builds, but the GUI is UTF-16 and 
the back-end (which handles the XML serialization and application 
configuration file) uses UTF-8.

[This is why I am so irritated by the implicit conversion features of 
CString, which AFAIK always uses the local code page to convert between 
8-bit and 16-bit strings. If you could set the "code page" of CString 
separately from anything else, then you could set it to UTF-8 and the 
implicit conversions might actually be useful (though I would still 
rather do without them). As it is, these features are tied to a concept 
(the ANSI code page) which is rapidly becoming history (thank God).]

-- 
David Wilkinson
Visual C++ MVP
0
no-reply8010 (1791)
7/25/2007 9:22:16 AM
"Mihai N." <nmihai_year_2000@yahoo.com> wrote in message 
news:Xns99784934397AMihaiN@207.46.248.16...
>>     <?xml encoding="<insert encoding">
> If ther encoding is not specified, then the encoding is assumed to be 
> utf-8
> (this is what the standard says)
>

Ah, UTF-8.  I know you discussed this at length several months ago here, but 
to be honest, this is my understanding of it:  it is an 8-bit encoding 
scheme no different than Ansi (that's how it fits in 8 bits).  Since it is 
8 bits, it cannot specify everything an LPWSTR can.  Yet it is somehow 
supposed to be better than Ansi, not reliant on any codepage.  But if it's 
only 8 bits, how is that?

And UTF-8 begs the question about UTF-16.  Is UTF-16 the same as what 
Windows Notepad (in the Save As dialog) calls "Unicode"?  Or is Windows 
concept of Unicode and LPWSTR different than UTF-16?


>
>> I still don't know
>> if saving the XML file in Unicode (with the 0xFFEE BOM) causes the text 
>> to
>> be displayed correctly regardless of the "encoding" attribute.
> That would be wrong according to the standard. And it is not supported by 
> IE.
>

OK, thanks.  From what you, Tom, and David W. say, UTF-8 is the way to go 
when producing XML files.


> The standard (http://www.w3.org/TR/2006/REC-xml11-20060816/#charencoding)
> "All XML processors MUST be able to read entities in both the UTF-8 and 
> UTF-
> 16 encodings."
> So and XML parser without UTF-16 and UTF-8 is not an XML parser, is a 
> hack.
>

OK, now I'm confused.  How can you save a UTF-16 file (which you say is 
standard) without the 0xFEFF BOM (which you say is not standard)?


>
>> Do you know if MSXML or some of the "big boys"
>> or FirstObject parsers read Unicode files?
> All of them do, if they claim to be XML parsers. If not, they are toys.
> MSXML, Xerces, Expat, all handle UTF-8, UTF-16, and support encoding.
>

Well, I guess I would just say that XML wouldn't be the standard that it is 
if it required these kinds of XML parsers to be universal.  These types of 
parsers have severe redist issues (some are 5 MB big) or calling conventions 
(e.g. MSXML uses COM) that prevented them from being attractive alternatives 
for us.

Our parser does not have all this support, but the art is in finding one 
that holds true to our KISS (keep it simple, stupid) goals yet still 
preserves Asian languages.  I would hope a parser need not be 5 MB large to 
support Asian languages.


>
>> Yes, and I'm not happy with that, but our scheme seems to have been
>> acceptable so far.  Perhaps the results aren't so great, it's just that 
>> the
>> poor people affected by this are so used to it, they don't complain.
> Or they stopped buying your product and moved to something better.
>

When we released our product, it was an Ansi product because Win9x needed to 
be supported (and we couldn't redist the MS Unicode for Win9x, which we'd 
heard had problems anyway).  Since it was Ansi, it used the Ansi codepage. 
And therefore we didn't care if our XML files were Ansi either.

Someone else ported the product to Unicode, but apparently is still refining 
the XML part.  I got bit when I returned to this product and stumbled on 
these XML issues.

This product has deployed millions of copies worldwide, and if it is 
possible, is even more conscious about localization and global acceptance 
than your current company.

-- David


0
dc2983 (3206)
7/25/2007 10:53:12 AM
"David Wilkinson" <no-reply@effisols.com> wrote in message 
news:epFCK1pzHHA.2312@TK2MSFTNGP05.phx.gbl...
> David:
>
> Why don't you use UTF-8 for your XML files? That's what I do. In my 
> development version, I only do Unicode builds, but the GUI is UTF-16 and 
> the back-end (which handles the XML serialization and application 
> configuration file) uses UTF-8.
>

Hmm, perhaps this is just the ticket.  I've asked Mihai about this elsewhere 
in this thread.


> [This is why I am so irritated by the implicit conversion features of 
> CString, which AFAIK always uses the local code page to convert between 
> 8-bit and 16-bit strings. If you could set the "code page" of CString 
> separately from anything else, then you could set it to UTF-8 and the 
> implicit conversions might actually be useful (though I would still rather 
> do without them). As it is, these features are tied to a concept (the ANSI 
> coded page) which is rapidly becoming history (thank God).]
>

The only data stored in UTF-8 is XML, so it is really only needed when 
reading/writing XML.  Since the rest of Windows still uses Ansi, continuing 
CString's use of Ansi makes sense.  I agree it would make it nice for 
CString to accept a code-page parameter (so you could specify UTF-8 
conversions), but I don't think they should be default since Windows does 
not, nor will I thinik ever, natively support UTF-8.

-- David 


0
dc2983 (3206)
7/25/2007 10:59:25 AM
David Ching wrote:

> Ah, UTF-8.  I know you discussed this at length several months ago here, but 
> to be honest, this is my understanding of it:  it is an 8-bit encoding 
> scheme no different than Ansi (that's how it fits in 8 bits).  Since it is 
> 8-bits, it cannot specify everything a LPWSTR can.  Yet it is somehow is 
> supposed to be better than Ansi, not reliant on any codepage.  But if it's 
> only 8 bits, how is that?
> 
> And UTF-8 begs the question about UTF-16.  Is UTF-16 the same as what 
> Windows Notepad (in the Save As dialog) calls "Unicode"?  Or is Windows 
> concept of Unicode and LPWSTR different than UTF-16?

David:

Both UTF-8 and UTF-16 are complete encodings of Unicode. UTF-8 uses up 
to four 8-bit code units per character, and UTF-16 uses up to two 16-bit code units. 
When "Windows Unicode" first started out, all code points could be 
represented by one 16-bit code unit, but no longer. Modern Windows 
Unicode *is* UTF-16. The Windows ANSI code pages are (I think) all DBCS, 
so UTF-8 cannot be used as a code page (at any rate, it is not the ANSI 
code page for any language).

Some say, and I agree, that now there are surrogate pairs in UTF-16, it 
holds no advantage over UTF-8. Many Linux systems use UTF-8 as their 
native encoding, but this will never happen in Windows.

This does not mean that a Windows program cannot use UTF-8 internally. 
In fact the whole back end of my application uses UTF-8. XML 
serialization is just one of the things this back end does.
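
A concrete example (a sketch for a Unicode build; the character chosen is 
arbitrary):

    // U+4E2D (a common CJK ideograph) is one 16-bit code unit in UTF-16,
    // but three bytes (0xE4 0xB8 0xAD) in UTF-8.
    const wchar_t wide[] = L"\x4E2D";
    char utf8[8] = {0};
    WideCharToMultiByte(CP_UTF8, 0, wide, -1, utf8, sizeof(utf8), NULL, NULL);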

-- 
David Wilkinson
Visual C++ MVP
0
no-reply8010 (1791)
7/25/2007 11:43:50 AM
"David Wilkinson" <no-reply@effisols.com> wrote in message 
news:eJWXQErzHHA.4712@TK2MSFTNGP04.phx.gbl...
> David Ching wrote:
>
>> Ah, UTF-8.  I know you discussed this at length several months ago here, 
>> but to be honest, this is my understanding of it:  it is an 8-bit 
>> encoding scheme no different than Ansi (that's how it fits in 8 bits). 
>> Since it is 8-bits, it cannot specify everything a LPWSTR can.  Yet it is 
>> somehow is supposed to be better than Ansi, not reliant on any codepage. 
>> But if it's only 8 bits, how is that?
>>
>> And UTF-8 begs the question about UTF-16.  Is UTF-16 the same as what 
>> Windows Notepad (in the Save As dialog) calls "Unicode"?  Or is Windows 
>> concept of Unicode and LPWSTR different than UTF-16?
>
> David:
>
> Both UTF-8 and UTF-16 are complete encodings of Unicode. UTF-8 uses up to 
> four 8-bit characters, and UTF-16 uses up to two 16-bit characters.

Yes, thanks.  For some reason I had thought UTF-8 was SBCS (since it was 8 
bits) and not MBCS.  Even Ansi codepage is MBCS, so UTF-8 and Ansi are 
really different schemes for the same idea.  Makes sense now!  :-)


> When "Windows Unicode" first started out, all code points could be 
> represented by one 16-bit code unit, but no longer. Modern Windows Unicode 
> *is* UTF-16. The Windows ANSI code pages are (I think) all DBCS, so UTF-8 
> cannot be used as a code page (at any rate, it is not the ANSI code page 
> for any language).
>
> Some say, and I agree, that now there are surrogate pairs in UTF-16, it 
> holds no advantage over UTF-8.

Not to offend anyone, but I recently developed a small product in 30 
languages.  The languages were selected to match the ones where Windows had 
a native SKU.  UTF-16 was fine for this, we never worried about surrogate 
pairs.  I had understood surrogate pairs were only used for a few Han 
dialects in Chinese, and perhaps a couple other languages, but they weren't 
mainstream by any means.  How long before UTF-16 *really* does not work for 
all practical purposes?


> Many Linux systems use UTF-8 as their native encoding, but this will never 
> happen in Windows.
>

The way you've explained UTF-8, it has all the disadvantages of MBCS (in 
fact it is a MBCS) and is thus very hard to parse.  I'm not sure why any 
modern OS would want to be built internally on it.


> This does not mean that a Windows program cannot use UTF-8 internally. In 
> fact the whole back end of my application uses UTF-8. XML serialization is 
> just one of the things this back end does.
>

I take it STL string is UTF-8 friendly?  ;)  Seriously, what library should one use 
to represent UTF-8 in memory?  I understood STL string (often typedef'd to 
be tstring) is just a UTF-16 string like CStringW.  I did not see any UTF-8 
capable string that is widespread.  What are you using?

Thanks,
David 


0
dc2983 (3206)
7/25/2007 2:11:33 PM
I'm usually starting with something written in terms of 'char' and no sign of _T()
anywhere.   I usually manage to avoid the example below simply by making sure there is no
undecorated quoted string anywhere, so what I usually do is just replace all quoted
strings with _T("...") (one regular expression handles this, typically).  Then I'm left
with the few strings that had \" in them (I usually don't bother with a more complex
pattern) but these blow up immediately.  GetProcAddress then complains, so these are easy
to find.

The *sizeof(TCHAR) and /sizeof(TCHAR) are the hardest to find, but these are usually found
quickly by looking for str() functions, ReadFile, WriteFile, and their aliases.  It
usually takes about two days of raw editing before it all compiles.  But as I said, the
effort is just about constant, and the reason they send the code to me is that they see it
as a "massive effort" or, in one case, "we need a complete rewrite in Unicode and can't
afford it" (about 150SLOC, five days, and it was only my second conversion, so it was a
bit slower than I can do today, because I was discovering the patterns I needed to worry
about).   So what I've learned over the last eight years or so of doing this is that
perception of the complexity is often much higher than the actual complexity.
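
A typical before/after looks something like this (illustrative lines only, not from
any particular project):

    // Before (ANSI-only):
    //     CString msg = "File not found";
    //     DWORD cb = (strlen(psz) + 1) * sizeof(char);
    // After (builds cleanly as either MBCS or Unicode):
    CString msg = _T("File not found");
    DWORD cb = (msg.GetLength() + 1) * sizeof(TCHAR);   // byte count, e.g. for WriteFile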

I've probably done a dozen of these by now, and the only serious glitches are at the
interfaces of reading and writing files and network packets, and there we have to make
decisions about saving files as Unicode or UTF-8, reading Unicode or UTF-8, and ditto for
network transfers.  

No, I never got any code to post.
					joe
On Tue, 24 Jul 2007 17:39:54 -0700, "Tom Serface" <tom.nospam@camaswood.com> wrote:

>Hi Joe,
>
>I think the effort to do a really big application can be pretty "huge", but 
>you have to weigh it against the effort of trying to get it to work other 
>ways.  A lot of it depends on how the strings are implemented as well.  If 
>they are mostly in files or the .rc file then it is easy to convert them 
>(you can even use NotePad).  The 2005 version of VS will even use Unicode 
>.RC files if they are converted first which is very handy.  Users will also 
>have to use the _T() and TCHAR macros which is why we harp on it so much 
>even when just starting with MBCS.  Still, you're right, once you go through 
>the process a couple of times the conversion thing is pretty academic.  The 
>compiler will gripe about strings that don't have the _T() macro around them 
>so they are easy to find for the most part.  The hardest things are things 
>like:
>
>CString cs = "My String";
>
>Since CString will try convert the string to MBCS...
>
>I think Giovanni wrote some code to update un-macro'd strings to the _T() 
>versions.  Did you post that on your site?
>
>Tom
>
>"Joseph M. Newcomer" <newcomer@flounder.com> wrote in message 
>news:766da3dava3rt8jimcqb10jkm8bcn8m4hg@4ax.com...
>> Are you sure about that "huge effort"?  I've had non-Unicode apps cross my 
>> desk with
>> instructions to convert them to Unicode, and I can usually do it with 
>> about three days of
>> editing and testing, and get it nearly perfect...and the remaining bugs 
>> are found within a
>> couple days of testing.  I'm talking source here on the order of 60K SLOC, 
>> but the effort
>> is about the same for 20K SLOC because it is usually just a long set of 
>> very similar
>> substitution patterns most of which are automated pattern 
>> search-and-replace with my text
>> editor.
>> joe
>> On Tue, 24 Jul 2007 14:14:01 -0700, Paul Wu 
>> <PaulWu@discussions.microsoft.com> wrote:
>>
>>>Thanks for replying. As I said, building it with Unicode takes huge 
>>>effort -- 
>>>not feasible at current stage. We just want the applications to be albe to
>>>process some unicode texts now (on Standard Windows XP).
>>>
>>>I looked at the application -- it was built with static MFC libraries
>>>(Visual Studio 2003).  So the MFC libraries may not the problem -- I just
>>>don't understand why when it runs on Chinese Windows XP, the Edit Controls
>>>can accept Chinese texts.
>>>
>>>Paul
>> Joseph M. Newcomer [MVP]
>> email: newcomer@flounder.com
>> Web: http://www.flounder.com
>> MVP Tips: http://www.flounder.com/mvp_tips.htm 
Joseph M. Newcomer [MVP]
email: newcomer@flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
0
newcomer (15974)
7/25/2007 2:28:37 PM
Hi David,

One nice thing about UTF-8 is that if you do most of your files in English 
(as I do), but need to support other languages you still get the benefit of 
having smaller files for English or other languages where more than one byte 
is not needed.  The bad thing is that Windows doesn't support UTF-8 very 
well so you always have to do a lot of converting to and from.  Fortunately, 
the conversions are supported quite well.

Tom

"David Ching" <dc@remove-this.dcsoft.com> wrote in message 
news:pSIpi.27972$2v1.25586@newssvr14.news.prodigy.net...

> The way you've explained UTF-8, it has all the disadvantages of MBCS (in 
> fact it is a MBCS) and is thus very hard to parse.  I'm not sure why any 
> modern OS would want to be built internally on it.

0
tom.nospam (3240)
7/25/2007 2:40:37 PM
"Tom Serface" <tom.nospam@camaswood.com> wrote in message 
news:6C70E294-D33A-4E7A-83EF-EB415367C0F3@microsoft.com...
> Hi David,
>
> One nice thing about UTF-8 is that if you do most of your files in English 
> (as I do), but need to support other languages you still get the benefit 
> of having smaller files for English or other languages where more than one 
> byte is not needed.  The bad thing is that Windows doesn't support UTF-8 
> very well so you always have to do a lot of converting to and from. 
> Fortunately, the conversions are supported quite well.
>

Well, it seems the best thing to do for my case is to request XML files 
containing the localizations to be delivered with UTF-8 encoding, then read 
the contents and store them using CStringA, then convert the CStringA to 
CStringW using MultiByteToWideChar(CP_UTF8, ...), after which my UNICODE 
build should display the text just fine.  Then it should not matter what the 
selected Ansi codepage is in the Control Panel.
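
Roughly like this (a sketch; utf8Text stands for the raw bytes already read from 
the XML file, and error handling is omitted):

    CStringA utf8Text;   // assumed to hold the UTF-8 bytes of the localization file
    // ... file reading omitted ...
    int cch = MultiByteToWideChar(CP_UTF8, 0, utf8Text, -1, NULL, 0);
    CStringW wideText;
    MultiByteToWideChar(CP_UTF8, 0, utf8Text, -1, wideText.GetBuffer(cch), cch);
    wideText.ReleaseBuffer();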

If the XML files were in UTF-16 (with or without the 0xFEFF BOM), I could 
just read the contents directly into a CStringW, which seems to be easier, 
but oh well.  I think you're right that the size of the XML file is a 
concern, so UTF-8 should help with that.

Thanks,
David



0
dc2983 (3206)
7/25/2007 2:53:29 PM
David Ching wrote:

> The way you've explained UTF-8, it has all the disadvantages of MBCS (in 
> fact it is a MBCS) and is thus very hard to parse.  I'm not sure why any 
> modern OS would want to be built internally on it.
> 
> I take it STL string is UTF-8 friendly?  ;)  Seriously,what library to use 
> to represent UTF-8 in memory?  I understood STL string (often typedef'd to 
> be tstring) is just a UTF-16 string like CStringW.  I did not see any UTF-8 
> capable string that is widespread.  What are you using?

David:

I agree that there is a high probability that a UTF-16 user program that 
ignores surrogate pairs will work correctly, but if the OS uses an 
encoding as its native encoding then it has to always work.

I just use std::string for UTF-8. No, it is not UTF-8 aware, but neither 
is CString (for the most part) aware of MBCS or 16-bit surrogate pairs. 
Just as you can use CharNext() and CharPrev() in a MBCS or Unicode 
application, you can use CharNextA() and CharPrevA() to parse UTF-8 
strings. But, actually, my app does not do string manipulation, so I do 
not worry about this.

-- 
David Wilkinson
Visual C++ MVP
0
no-reply8010 (1791)
7/25/2007 3:28:48 PM
"David Wilkinson" <no-reply@effisols.com> wrote in message 
news:%23QrU%23BtzHHA.5160@TK2MSFTNGP05.phx.gbl...

> David:
>
> I agree that there is a high probability that a UTF-16 user program that 
> ignores surrogate pairs will work correctly, but if the OS uses an 
> encoding as its native encoding then it has to always work.
>
> I just use std::string for UTF-8. No, it is not UTF-8 aware, but neither 
> is CString (for the most part) aware of MBCS or 16-bit surrogate pairs. 
> Just as you can use CharNext() and CharPrev() in a MBCS or Unicode 
> application, you can use CharNextA() and CharPrevA() to parse UTF-8 
> strings. But, actually, my app does not do string manipulation, so I do 
> not worry about this.
>

Got it, it took me a while to realize that any 8-bit string would be capable 
of handling UTF-8, or any MBCS string for that matter.

Cheers,
David 


0
dc2983 (3206)
7/25/2007 3:48:33 PM
Mihai,

   I think you are absolutely right -- we'll do this formally when the project 
gets started; I'm still at the investigation stage. The code base is huge and it 
almost exclusively uses non-Unicode style, and I just cannot change that in a 
short time to show that something can be done.

-- 
Developer


"Paul Wu" wrote:

>  
>    I have a MFC application that is currently built with MBCS mode. If I run 
> the program on a Chinese OS (Windows XP), the input boxes  (Edit Controls) 
> can accept  Chinese chars and display correctly. If I run it on a standard 
> English XP, the input boxes won't accept Chinese chars (display as "????") -- 
> please note that I have already installed CKJ on the system and IE and 
> Outlook can display Chinese correctly. 
> 
>    Is this just because of different MFC libraries used for the application? 
> Can I force the application running on Standard XP to use the unicode 
> libraries so to force Edit Controls accept Chinese chars input? Currently 
> completely rewriting the code to use unicode is not option for me.
> 
>    Any help will be appreciated.
> 
>   Paul
> 
> 
> -- 
> Developer
0
PaulWu (6)
7/25/2007 5:30:01 PM
 David,

   It seems it only works with the change in the Control Panel. It does not 
work with the SetThreadLocale() call -- I checked the document 
http://msdn2.microsoft.com/en-us/library/ms776285.aspx on SetThreadLocale -- 
it says "Windows 2000/XP: Do not use SetThreadLocale to select a user 
interface language" and suggests something with resource files.  This is 
inconvenient -- I'll keep trying, and your further help will be much appreciated 
as before.

Paul
 
  


-- 
Developer


"David Ching" wrote:

> Great, Paul!  :-)
> 
> To clarify, did both the Regional Control Panel and SetThreadLocale() work, 
> or just one of them?
> 
> Thanks,
> David
> 
> 
> 
> "Paul Wu" <PaulWu@discussions.microsoft.com> wrote in message 
> news:EFE95A13-E586-4554-88D6-EC7B47DC8356@microsoft.com...
> > Thank you very much for this tip.  I still have tons of work to do -- but
> > hopefully this can save me half of the time.
> >
> > -- 
> > Developer
> >
> >
> > "David Ching" wrote:
> >
> >> "Paul Wu" <PaulWu@discussions.microsoft.com> wrote in message
> >> news:11226219-DE3D-4722-BDCE-7FDB8C49C766@microsoft.com...
> >> > Thanks for replying. As I said, building it with Unicode takes huge
> >> > effort -- 
> >> > not feasible at current stage. We just want the applications to be albe 
> >> > to
> >> > process some unicode texts now (on Standard Windows XP).
> >> >
> >> > I looked at the application -- it was built with static MFC libraries
> >> > (Visual Studio 2003).  So the MFC libraries may not the problem -- I 
> >> > just
> >> > don't understand why when it runs on Chinese Windows XP, the Edit 
> >> > Controls
> >> > can accept Chinese texts.
> >> >
> >>
> >> There is a Regional Control Panel that lets you specify the default code
> >> page for non-Unicode apps.  If you set that to Chinese, then restart your
> >> app, does it work?
> >>
> >> If this works, I think you can call SetThreadLocale() in your app's
> >> CWinApp-derived::OnInitInstance() method to accomplish the same thing
> >> without worrying about the Control Panel setting.
> >>
> >> -- David
> >>
> >>
> >> 
> 
> 
> 
0
PaulWu (6)
7/25/2007 6:38:05 PM
Tom Serface wrote:
> Hi Joe,
> 
> I think the effort to do a really big application can be pretty "huge", 
> but you have to weigh it against the effort of trying to get it to work 
> other ways.  A lot of it depends on how the strings are implemented as 
> well.  If they are mostly in files or the .rc file then it is easy to 
> convert them (you can even use NotePad).  The 2005 version of VS will 
> even use Unicode .RC files if they are converted first which is very 
> handy.  Users will also have to use the _T() and TCHAR macros which is 
> why we harp on it so much even when just starting with MBCS.  Still, 
> you're right, once you go through the process a couple of times the 
> conversion thing is pretty academic.  The compiler will gripe about 
> strings that don't have the _T() macro around them so they are easy to 
> find for the most part.  The hardest things are things like:
> 
> CString cs = "My String";
> 
> Since CString will try convert the string to MBCS...

Tom:

I am confused by what you write here. In the code above, in a Unicode 
build, the CString constructor will convert the ANSI (MBCS) string to 
Unicode, which will fortuitously always give the right result if the 
string contains only ASCII characters.

In fact, I would suggest that 90% or more of the uses of this conversion 
constructor involve cases where the programmer simply forgot to put the 
_T() macro around a string literal. The deliberate use of this 
conversion constructor is quite rare I think. CString would be much 
better off without it, IMHO.

-- 
David Wilkinson
Visual C++ MVP
0
no-reply8010 (1791)
7/25/2007 9:33:11 PM
You are correct if the string is all ASCII (like English).  My example was 
bad.  The original discussion included Chinese translation and that might 
not yield the intended result after the "conversion" if the correct code 
page wasn't used.  Frankly, I find the CString auto conversion stuff to be a 
little annoying.

Of course, you can do something like:

CStringA cs = "This is a test";

if you want it to be ANSI; this string will not be Unicode, and CString will keep 
it as MBCS (CStringA).  But I agree with you that we would be better off without 
the implicit conversion, and I typically use the ATL conversion macros or 
functions like MultiByteToWideChar() and WideCharToMultiByte() to do the 
conversions explicitly.
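
For example (the ATL string conversion classes take an optional code page, so 
the conversion is always explicit):

    #include <atlconv.h>

    CStringA ansi("This is a test");
    CStringW wide(CA2W(ansi));            // ANSI -> UTF-16, using the ANSI code page
    CStringA utf8(CW2A(wide, CP_UTF8));   // UTF-16 -> UTF-8, explicitly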

Sorry for the confusion.

Tom


> I am confused by what you write here. In the code above, in a Unicode 
> build, the CString constructor will convert the ANSI (MBCS) string to 
> Unicode, which will fortuitously always give the right result if the 
> string contains only ASCII characters.
>
> In fact, I would suggest that 90% or more of the uses of this conversion 
> constructor involve cases where the programmer simply forgot to put the 
> _T() macro around a string literal. The deliberate use of this conversion 
> constructor is quite rare I think. CString would be much better off 
> without it, IMHO.
>
> -- 
> David Wilkinson
> Visual C++ MVP 

0
tom.nospam (3240)
7/26/2007 6:12:57 AM
> Ah, UTF-8.  I know you discussed this at length several months ago here, 
> but 
> to be honest, this is my understanding of it:  it is an 8-bit encoding 
> scheme no different than Ansi (that's how it fits in 8 bits).
Oh, it is very different from Ansi!
I would recommend reading this:
   http://www.mihai-nita.net/article.php?artID=20060806a


> Since it is 
> 8-bits, it cannot specify everything a LPWSTR can.
It can, because it is multi-byte (up to 4 bytes per character).
UTF-32, UTF-8, and UTF-16 are all capable of representing the full Unicode range.
Think of it this way: you can imagine a UTF-4 (useless), and you can still represent
any numeric value, if you have enough code units.


> Yet it is somehow is 
> supposed to be better than Ansi, not reliant on any codepage.  But if it's 
> only 8 bits, how is that?
I think I have to write the planned continuation to my article, covering 
UTF-8, UTF-16, (UTF-16BE & UTF-16LE), UTF-32 (both BE and LE), UCS2, UCS4
:-)

> And UTF-8 begs the question about UTF-16.  Is UTF-16 the same as what 
> Windows Notepad (in the Save As dialog) calls "Unicode"?  Or is Windows 
> concept of Unicode and LPWSTR different than UTF-16?
Short answer: for all Windows APIs and applications (including Notepad),
"Unicode" means "Unicode encoded as UTF-16LE".


> OK, now I'm confused.  How can you save a UTF-16 file (which you say is 
> standard) without the 0xFFEE BOM (which you say is not standard)?
It is not standard to save as UTF-16 and still have a different encoding
declared in the XML. It is OK to save as UTF-16 with a BOM if the encoding
declaration says the same thing.


> Well, I guess I would just say that XML wouldn't be the standard that it is 
> if it required these kinds of XML parsers to be universal.
> These types of 
> parsers have severe redist issues (some are 5 MB big) or calling
> conventions (e.g. MSXML uses COM) that prevented them from being
> attractive alternatives for us.

XML does not require big parsers, it requires Unicode support.
If your own parser recognizes UTF-16 and UTF-8, you are compliant.
Parsers are not required to understand encodings other than the two UTF forms
mentioned. But they are required to understand the encoding directive.
If you see <?xml version="1.0" encoding="cp932"?> and you interpret
the XML as UTF-8 or ANSI, you are not compliant. But if you don't
know about cp932, you fail, and you are compliant.


> Our parser does not have all this support, but the art is in finding one 
> that holds true to our KISS (keep it simple, stupid) goals yet still 
> preserves Asian languages.  I would hope a parser need not be 5 MB large to 
> support Asian languages.
UTF (8 & 16) is the key. You support everything, and should KISS.


> This product has deployed millions of copies worldwide, and if it is 
> possible, is even more conscious about localization and global acceptance 
> than your current company.
I don't have a company :-D
But I know what you mean. And this is what keeps it interesting.
Imagine all our products were perfect from an i18n/l10n perspective, and
all the developers were experts in i18n/l10n. What would my role be there?
I would look like an idiot between all the experts, and there would be
nothing for me to fix.
On the other hand, you would have to admit that some of the products
were able to handle CJK on an English OS many years ago (and some of
that kind of work bites back now :-)


-- 
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
0
7/26/2007 7:31:47 AM
> Not to offend anyone, but I recently developed a small product in 30 
> languages.  The languages were selected to match the ones where Windows had 
> a native SKU.  UTF-16 was fine for this, we never worried about surrogate 
> pairs.  I had understood surrogate pairs were only used for a few Han 
> dialects in Chinese, and perhaps a couple other languages, but they weren't 
> mainstream by any means.  How long before UTF-16 *really* does not work for 
> all practical purposes?
UTF-16 knows about surrogates, and it works for all practical purposes.
An application using UTF-16, but not surrogate aware, is like an
ANSI application working with 932 (Japanese) and not being DBCS aware.
As long as you don't do fancy text processing you are safe.


> The way you've explained UTF-8, it has all the disadvantages of MBCS (in 
> fact it is a MBCS) and is thus very hard to parse.
In some ways it is the same, but it has some advantages.
For instance, you can always tell by looking at a byte whether you are in the 
middle of a character or not (this is not the case with most DBCS/MBCS).
And you can algorithmically convert to UTF-16 and UTF-32, while MBCS
needs conversion tables. And it can represent all the Unicode characters,
while an MBCS can only represent a subset.
(OK, not to mess things up, but to prevent nitpicking: GB-18030 is an MBCS that
can represent the full Unicode range, and you know when you are "in a 
character".)

> I'm not sure why any 
> modern OS would want to be built internally on it.
UNIX is not modern. UTF-8 works well for legacy code, because you can move
it around through old char *, and nothing breaks
(OK, at least at first look :-)


> I take it STL string is UTF-8 friendly?  ;)
No. But it should be. Unfortunately the C++ standard is quite bad
regarding i18n issues.

> Seriously,what library to use 
> to represent UTF-8 in memory?  I understood STL string (often typedef'd to 
> be tstring) is just a UTF-16 string like CStringW.  I did not see any UTF-8 
> capable string that is widespread.  What are you using?
The std::string is basic_string instantiated with char, and
std::wstring is basic_string instantiated with wchar_t.
On most Unixes wchar_t is 32 bits, so std::wstring is UTF-32.
But wchar_t is 16 bits on Windows, so in this case std::wstring is UTF-16
(but unaware of that, so you can have problems with surrogates :-)
The "typedef to tstring" is a Windows trick (inspired by TCHAR), and you
will not see it very often in the UNIX world.
Even worse, wchar_t is not guaranteed to be Unicode, and there are UNIX
implementations using wchar_t to store characters in a Chinese or
Japanese encoding.
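
To make the "typedef to tstring" trick concrete (this mirrors TCHAR and is a
Windows-only convention, not something the standard defines):

#include <string>

// Picks the narrow or the wide string depending on the build. Note that
// nothing here says "Unicode": on Windows wchar_t happens to hold UTF-16
// code units, on most Unixes it is 32 bits, and on some systems it is not
// Unicode at all.
#ifdef _UNICODE
typedef std::wstring tstring;
#else
typedef std::string  tstring;
#endif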


-- 
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
0
7/26/2007 7:46:06 AM
> One nice thing about UTF-8 is that if you do most of your files in English 
> (as I do), but need to support other languages you still get the benefit of 
> having smaller files for English or other languages where more than one
> byte is not needed.

Sometimes this does not matter that much.
I had a lot of opposition to using UTF-16 for all the string resources we use
in one of our applications (not standard Windows resources, which are UTF-16),
and the reason was size. Until I showed them that all the strings together,
encoded as UTF-16, were smaller than the splash screen :-)


-- 
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
0
7/26/2007 7:49:27 AM
> Well, it seems the best thing to do for my case is to request XML files 
> containing the localizations to be delivered with UTF-8 encoding,
Yes.

> then read 
> the contents and store them using CStringA, then convert the CStringA to 
Why CStringA? Just use a BYTE array.

> I think you're right that the size of the XML file is a concern,
> so UTF-8 should help with that.
Sometimes size does not matter :-)
But yes, as a rule of thumb: transfer with UTF-8, process with UTF-16/UTF-32
(whatever matches the API of the OS/libraries you are using)
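
A minimal sketch of that rule of thumb on Windows, converting "at the edge"
with the Win32 APIs (function names are just examples, error handling omitted):

#include <windows.h>
#include <string>

// Transfer/storage form (UTF-8 in std::string) to processing form
// (UTF-16 in std::wstring), and back.
std::wstring Utf8ToUtf16(const std::string& utf8)
{
    if (utf8.empty()) return std::wstring();
    int len = MultiByteToWideChar(CP_UTF8, 0, utf8.c_str(), (int)utf8.size(), NULL, 0);
    std::wstring utf16(len, L'\0');
    MultiByteToWideChar(CP_UTF8, 0, utf8.c_str(), (int)utf8.size(), &utf16[0], len);
    return utf16;
}

std::string Utf16ToUtf8(const std::wstring& utf16)
{
    if (utf16.empty()) return std::string();
    int len = WideCharToMultiByte(CP_UTF8, 0, utf16.c_str(), (int)utf16.size(),
                                  NULL, 0, NULL, NULL);
    std::string utf8(len, '\0');
    WideCharToMultiByte(CP_UTF8, 0, utf16.c_str(), (int)utf16.size(),
                        &utf8[0], len, NULL, NULL);
    return utf8;
}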


-- 
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
0
7/26/2007 7:51:51 AM
> Just as you can use CharNext() and CharPrev() in a MBCS or Unicode 
> application you can use CharNextA() and CharPrevA() to parse UTF-8 
> strings.
Nope.
CharNext/CharPrev are aware of lead/trailing bytes in MBCS, and
are aware of surrogates in UTF-16.
But they are *totally* unaware of UTF-8!
In fact, using CharNext on UTF-8 when the ANSI code page is an MBCS
will surely lead to a mess!
If you store UTF-8 in std::string or CStringA, then stay away
from the MBCS-aware APIs.
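
If you do need to walk UTF-8 stored in std::string or CStringA, a hand-rolled
step function is enough; something along these lines (assumes the input is
valid UTF-8):

// Sketch of a UTF-8-aware replacement for CharNextA: move to the start of
// the next character by skipping continuation bytes (10xxxxxx).
const char* Utf8CharNext(const char* p)
{
    if (*p == '\0')
        return p;
    ++p;
    while ((static_cast<unsigned char>(*p) & 0xC0) == 0x80)
        ++p;
    return p;
}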


-- 
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
0
7/26/2007 7:55:26 AM
Overall I agree with you (with some trouble if
you have a public SDK that uses char and you have
to support older plugins using the old SDK).

For those who have never done a conversion it sounds scary,
but this is a good read to show how easy it can be:
   http://blogs.msdn.com/michkap/archive/2007/01/05/1413001.aspx
(this is the last part, and has links to the previous ones).


-- 
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
0
7/26/2007 8:00:51 AM
Mihai N. wrote:
>> Just as you can use CharNext() and CharPrev() in a MBCS or Unicode 
>> application you can use CharNextA() and CharPrevA() to parse UTF-8 
>> strings.
> Nope.

Hi Mihai:

Ah yes, I see you are right (as always). CharNextExA() and CharPrevExA() 
take a code page as the first parameter, but CP_UTF8 is not one of the 
allowed values. It should be :).

As I said, I do not actually do string manipulation in my application, 
so I just store my UTF-8 strings in std::string and convert them to and 
from UTF-16 as needed.

-- 
David Wilkinson
Visual C++ MVP
0
no-reply8010 (1791)
7/26/2007 9:44:48 AM
Mihai N. wrote:

>> I take it STL string is UTF-8 friendly?  ;)
> No. But it should. Unfortunatelly the C++ standard is quite bad
> regarding i18n issues.

Hi Mihai:

But do you not agree that the belief that CString *does* understand MBCS 
(and surrogate pairs in UTF-16) is (for the most part) a myth?

Right up there with the myth that Windows Unicode is great because it 
needs only one code unit per code point.

At least the C++ standard does not pretend that std::string and 
std::wstring are encoding-aware; they are just containers.

-- 
David Wilkinson
Visual C++ MVP
0
no-reply8010 (1791)
7/26/2007 9:54:20 AM
Mihai N. wrote:

> XML does not require big parsers, it requires Unicode support.
> If your own parser recognizes UTF-16 and UTF-8, you are compliant.
> Parsers are not required to understand encodings other than the two UTF forms
> mentioned.

Hi Mihai:

What does this mean exactly? My own parser, which seems to work fine for 
my purposes (serialization), simply writes std::string to and from XML. 
Why does it need to understand UTF-8?

-- 
David Wilkinson
Visual C++ MVP
0
no-reply8010 (1791)
7/26/2007 10:11:04 AM
David,

Yes.  CString is not as smart as all that.

Tom

"David Wilkinson" <no-reply@effisols.com> wrote in message 
news:O%23tOur2zHHA.3788@TK2MSFTNGP02.phx.gbl...

> But do you not agree that the belief that CString *does* understand MBCS 
> (and surrogate pairs in UTF-16) is (for the most part) a myth?
>
> Right up there with the myth that Windows Unicode is great because it 
> needs only one code unit per code point.
>
> At least the C++ standard does not pretend that std::string and 
> std::wstring are encoding-aware; they are just containers.
>
> -- 
> David Wilkinson
> Visual C++ MVP 

0
tom.nospam (3240)
7/26/2007 2:02:17 PM
I agree for internal strings, but when you are writing to files, and there 
could be lots of them, the difference can add up.  I agree though.  We have 
way more memory to work with these days.  Of course we have more hard drive 
space too.  Could be I'm working too hard.

Tom

"Mihai N." <nmihai_year_2000@yahoo.com> wrote in message 
news:Xns99798647F3FCMihaiN@207.46.248.16...
>> One nice thing about UTF-8 is that if you do most of your files in 
>> English
>> (as I do), but need to support other languages you still get the benefit 
>> of
>> having smaller files for English or other languages where more than one
>> byte is not needed.
>
> Somethimes this does not matter that much.
> I had a lot of oposition to using UTF-16 for all the string resources we 
> use
> in one of our application (not standard Windows resources, which are 
> UTF-16)
> And the reason was size. Until I showed them that all the strings 
> together,
> encoded UTF-16, where smaller than the splash-screen :-)
>
>
> -- 
> Mihai Nita [Microsoft MVP, Windows - SDK]
> http://www.mihai-nita.net
> ------------------------------------------
> Replace _year_ with _ to get the real email 

0
tom.nospam (3240)
7/26/2007 2:03:47 PM
Hi David,

I think UTF-8 is used pretty widely in the Unix/Linux world, so it probably 
depends on where you get your data.  I know, in my case, the records that 
are streamed to my program from a Java-based JMS are UTF-8.  Probably this 
was done to cut down on network traffic.

Tom

"David Wilkinson" <no-reply@effisols.com> wrote in message 
news:e5l3E12zHHA.1168@TK2MSFTNGP02.phx.gbl...
> Mihai N. wrote:
>
>> XML does not require big parsers, it requires Unicode support.
>> If your own parser recognizes UTF-16 and UTF-8, you are compliant.
>> Parsers are not required to understand encodings other than the two UTF 
>> forms
>> mentioned.
>
> Hi Mihai:
>
> What does this mean exactly? My own parser, which seems to work fine for 
> my purposes (serialization), simply writes std::string to and from XML. 
> Why does it need to understand UTF-8?
>
> -- 
> David Wilkinson
> Visual C++ MVP 

0
tom.nospam (3240)
7/26/2007 2:09:54 PM
> But do you not agree that the belief that CString *does* understand MBCS 
> (and surrogate pairs in UTF-16) is (for the most part) a myth?
I would say that some members of CString are MBCS aware
(when the ANSI code page is the MBCS we are talking about)
and some members of CString are surrogate aware.
Most of these members are not directly handling the strings,
but pass them to "smarter" Windows APIs (stuff like Collate, CollateNoCase,
Compare, CompareNoCase, MakeLower, MakeUpper).
Even stuff like Remove/Replace/Find etc. is MBCS aware (I did not try
the surrogate case). On a Japanese system, searching for '\' will not find
the Japanese characters whose second byte is '\' (which is correct), so
CString is MBCS aware. Not so sure about surrogates.
But again, searching for half of a surrogate is a mistake; searching
for '\' in Japanese is not.
You cannot ask each function to validate the Unicode strings it takes
as parameters. You make sure you pass valid Unicode input, and
you are guaranteed that the function will give you back something valid.
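
One way to see the '\' example without CString at all is with the CRT's MBCS
string functions. This is only a hypothetical sketch; it assumes an MBCS build
with the Japanese code page, where the katakana "so" is the byte pair
0x83 0x5C and 0x5C also happens to be '\':

#include <mbctype.h>
#include <mbstring.h>
#include <string.h>

void Demo()
{
    _setmbcp(932);                           // Japanese code page for the MBCS CRT

    const char text[] = "\x83\x5C" "abc";    // one Japanese character, then "abc"

    // A byte-oriented search finds the 0x5C trail byte: a bogus hit.
    const char* naive = strchr(text, '\\');  // points at text + 1

    // The MBCS-aware search knows this 0x5C is a trail byte, not '\'.
    const unsigned char* aware =
        _mbschr(reinterpret_cast<const unsigned char*>(text), '\\');   // NULL

    (void)naive;
    (void)aware;
}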


> Right up there with the myth that Windows Unicode is great because it 
> needs only one code unit per code point.
UTF-16 does not use one code unit per code point, so neither does Windows.
Only UTF-32 does that. But that only solves the problem of surrogates;
many other things are still problematic (think combining characters,
variation selectors).


> At least the C++ standard does not pretend that std::string and 
> std::wstring are encoding-aware; they are just containers.
Well, by calling it a string and recommending it for text storage,
don't they do a bit more than "just containers"?
If I want a dumb container, I use std::vector.
Providing a different container for text, without giving it
proper support for text, is plain wrong.

But it seems they have started to see "the error of their ways"
and work is being done to fix this. I can hardly wait, and I would
like to see this finished and implemented in the major compilers.
Yesterday, if possible :-)



-- 
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
0
7/27/2007 4:16:39 AM
> What does this mean exactly? My own parser, which seems to work fine for 
> my purposes (serialization), simply writes std::string to and from XML. 
> Why does it need to understand UTF-8?


Well, it was some kind of answer to your post complaining 
that XML (as a standard) "required these kinds of XML parsers"
and complaining about size.
So my answer was that a parser does not need to be big, or to support
all the encodings in the world.

If your parser uses std::string, there is no problem.
Just make sure the std::string contains UTF-8, and that you set the
encoding directive to UTF-8. And if you read an XML file and the encoding
directive is not UTF-8, you fail, and you are still compliant.
You can also read/write ANSI using std::string, as long as you
set the encoding directive to that ANSI code page, and you check it at
load time.

If you don't read UTF-16 and UTF-8 XML files, your parser is not
standard compliant, but you might not care if you don't have
to interact with applications that are compliant.

But if your parser writes ANSI on a Japanese system and
reads it back as ANSI on a Russian one, ignoring the encoding directive,
then you have to care, because you corrupt your own data.



-- 
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
0
7/27/2007 4:24:26 AM
> I would say that some members of CString are MBCS aware
....
> and some members of CString are surrogate aware.

Actually, some research on how aware some members are,
both for CString and std::string/std::wstring would
make an interesting article :-)


-- 
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
0
7/27/2007 4:29:06 AM
Mihai N. wrote:
> I would say that some members of CString are MBCS aware
> (when the ANSI code page is the MBCS we are talking about)
> and some members of CString are surrogate aware.
> Most of these members are not directly handling the strings,
> but pass it to "smarter" Windows API (stuff like Collate, CollateNoCase,
> Compare, CompareNoCase, MakeLower, MakeUpper)
> Even stuff like Remove/Replace/Find etc are MBCS aware (I did not try 
> surrogate aware). On a Japanese system searching for '\' will not find
> the Japanese characters where the second byte is '\' (correct), so
> CString is MBCS aware. Not so sure about surrogates.
> But again, searching for half of surrogate is a mistake, searching
> for '\' in Japanese is not.
> You cannot ask each function to validate the Unicode strings it takes
> as parameters. You make sure you pass valid Unicode input, and
> you are guaranteed that the function will give you back something valid.

> UTF-16 does not need one code unit per code point, so neither does Windows.
> Only UTF-32 does that. But that only solves the problems of surrogates,
> many other things are still problematic (think combining characters,
> variant selectors)
> 
> 
>> At least the C++ standard does not pretend that std::string and 
>> std::wstring are encoding-aware; they are just containers.
> Well, by calling that a string and recomending it for text storage
> they do a bit more than "just containers"?
> I want a stupid container, I use std::vector.
> By providing a different container for text, without giving the
> proper support for text, is plain wrong.
> 
> But it seems that they started to see "the errors of their ways"
> and work is done to fix this. I can hardly wait, and I would
> like to see this finished and implemented in the major compilers.
> Yesterday if possible :-)

Hi Mihai:

I did not realize that CString::Find(), etc, were MBCS aware as you 
describe. It is not mentioned in the documentation. So perhaps CString 
is smarter than I thought. But for me it is useless, because it is tied 
to the ANSI code page, and I am only interested in UTF-8.

Yes, UTF-16 needs surrogate pairs, but this fact is rarely mentioned in 
Microsoft documentation, and many people believe that its great 
advantage is that it always has one code unit per code point.

Yes, it would be great if the C++ library could be "encoding-aware".

-- 
David Wilkinson
Visual C++ MVP
0
no-reply8010 (1791)
7/27/2007 10:03:23 AM
Mihai N. wrote:
>> I would say that some members of CString are MBCS aware
> ...
>> and some members of CString are surrogate aware.
> 
> Actually, some research on how aware some members are,
> both for CString and std::string/std::wstring would
> make an interesting article :-)

Mihai:

Yes indeed. That would be very informative.

-- 
David Wilkinson
Visual C++ MVP
0
no-reply8010 (1791)
7/27/2007 10:06:32 AM
You guys are amazing, but also "making" my work harder and harder.
-- 
Developer


"David Wilkinson" wrote:

> Mihai N. wrote:
> > I would say that some members of CString are MBCS aware
> > (when the ANSI code page is the MBCS we are talking about)
> > and some members of CString are surrogate aware.
> > Most of these members are not directly handling the strings,
> > but pass it to "smarter" Windows API (stuff like Collate, CollateNoCase,
> > Compare, CompareNoCase, MakeLower, MakeUpper)
> > Even stuff like Remove/Replace/Find etc are MBCS aware (I did not try 
> > surrogate aware). On a Japanese system searching for '\' will not find
> > the Japanese characters where the second byte is '\' (correct), so
> > CString is MBCS aware. Not so sure about surrogates.
> > But again, searching for half of surrogate is a mistake, searching
> > for '\' in Japanese is not.
> > You cannot ask each function to validate the Unicode strings it takes
> > as parameters. You make sure you pass valid Unicode input, and
> > you are guaranteed that the function will give you back something valid.
> 
> > UTF-16 does not need one code unit per code point, so neither does Windows.
> > Only UTF-32 does that. But that only solves the problems of surrogates,
> > many other things are still problematic (think combining characters,
> > variant selectors)
> > 
> > 
> >> At least the C++ standard does not pretend that std::string and 
> >> std::wstring are encoding-aware; they are just containers.
> > Well, by calling that a string and recomending it for text storage
> > they do a bit more than "just containers"?
> > I want a stupid container, I use std::vector.
> > By providing a different container for text, without giving the
> > proper support for text, is plain wrong.
> > 
> > But it seems that they started to see "the errors of their ways"
> > and work is done to fix this. I can hardly wait, and I would
> > like to see this finished and implemented in the major compilers.
> > Yesterday if possible :-)
> 
> Hi Mihai:
> 
> I did not realize that CString::Find(), etc, were MBCS aware as you 
> describe. It is not mentioned in the documentation. So perhaps CString 
> is smarter than I thought. But for me it is useless, because it is tied 
> to the ANSI code page, and I am only interested in UTF-8.
> 
> Yes, UTF-16 needs surrogate pairs, but this fact is rarely mentioned in 
> Microsoft documentation, and many people believe that its great 
> advantage is that it always has one code unit per code point.
> 
> Yes, it would be great if the C++ library could be "encoding-aware".
> 
> -- 
> David Wilkinson
> Visual C++ MVP
> 
0
PaulWu (6)
7/27/2007 3:00:01 PM
> You guys are amazing, but also "making" my work harder and harder.

Don't let yourself get distracted by all this noise :-)

Rules of thumb:
 - Unicode is the best thing, but it is no silver bullet
 - transfer and storage: UTF-8
 - text processing: UTF-16
 - conversion "at the edge"

This will be the right thing in 95% of the cases, without thinking :-)
Forget the rest for now :-)

-- 
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
0
7/28/2007 8:55:45 AM