Convert ANSI to UTF-8?

Hi,

I have some legacy code that saves a file to ANSI, (using FILE* fp; fopen( 
&fp, ..., "w"); fwrite(...))
I am writing 'char*' to the file, (not wchar).

The file is saved as ANSI and a third-party application uses the file, but one 
of their new requirements is that the file be UTF-8.

Is there a way of saving my file so it is UTF-8?

My application uses _UNICODE apart from that file saving class.
I don't mind changing all the chars to wchars but I am not sure that it will 
change the format of the file, (not the way I understand UTF-8 anyway).

Many thanks.

Regards,

Simon 

10/7/2008 3:57:20 AM
vc.mfc


Simon wrote:
> Hi,
> 
> I have some legacy code that saves a file to ANSI, (using FILE* fp; 
> fopen( &fp, ..., "w"); fwrite(...))
> I am writing 'char*' to the file, (not wchar).
> 
> The file is saved as ANSI, a third party application uses the file, but 
> one of their new requirement is that the file be UTF-8.
> 
> Is there a way of saving my file so it is UTF-8?
> 
> My application uses _UNICODE apart from that file saving class.
> I don't mind changing all the chars to wchars but I am not sure that it 
> will change the format of the file, (not the way I understand UTF-8 
> anyway).
> 

If your app otherwise uses TCHAR/wchar_t then somewhere this data has 
to be converted to char to be given to fwrite.
Look at the WideCharToMultiByte API (with CP_UTF8 as the code page) to convert your 
internal Unicode representation to UTF-8.
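
The conversion described above could look roughly like the following. This is only a minimal sketch, assuming a Windows build and that the text is held in a std::wstring; the helper name WideToUtf8 is made up for illustration. It uses the usual two-call pattern: the first call asks for the required buffer size, the second does the conversion.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>

// Hypothetical helper: convert UTF-16 text (wchar_t on Windows) to UTF-8 bytes.
// Returns an empty string for empty input or if the conversion fails.
std::string WideToUtf8(const std::wstring& wide)
{
    if (wide.empty())
        return std::string();

    const int wideLen = static_cast<int>(wide.size());

    // First call: ask how many bytes the UTF-8 form needs.
    const int bytes = ::WideCharToMultiByte(CP_UTF8, 0, wide.data(), wideLen,
                                            NULL, 0, NULL, NULL);
    if (bytes <= 0)
        return std::string();

    // Second call: convert into a buffer of exactly that size.
    // For CP_UTF8 the last two arguments must be NULL.
    std::string utf8(static_cast<std::string::size_type>(bytes), '\0');
    const int written = ::WideCharToMultiByte(CP_UTF8, 0, wide.data(), wideLen,
                                              &utf8[0], bytes, NULL, NULL);
    if (written != bytes)
        return std::string();

    return utf8;
}
#endif // _WIN32
```

The resulting std::string can then be handed to the existing fwrite call unchanged, for example fwrite(utf8.data(), 1, utf8.size(), fp).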

br,
Martin
10/7/2008 6:29:48 AM
> I have some legacy code that saves a file to ANSI, (using FILE* fp; fopen( 
> &fp, ..., "w"); fwrite(...))
> I am writing 'char*' to the file, (not wchar).
> 
> The file is saved as ANSI, a third party application uses the file, but one 
> of their new requirement is that the file be UTF-8.
> 
> Is there a way of saving my file so it is UTF-8?

If you use a newer version of VS (2005 or later) it should be easy:
fopen accepts a ccs=ENCODING flag in its mode string
http://msdn.microsoft.com/en-us/library/yeby3zcb(VS.80).aspx
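
A minimal sketch of the ccs approach, assuming the Microsoft CRT (Visual Studio 2005 or later) and that the text is already available as wchar_t. The helper name SaveLineAsUtf8 is hypothetical. As far as I can tell from the documentation, a stream opened with ccs=UTF-8 is in Unicode mode, so it should be written with the wide-character functions, and a file newly created this way also gets a UTF-8 BOM written automatically.

```cpp
#include <stdio.h>
#include <wchar.h>

#ifdef _MSC_VER
// Hypothetical helper: write one line of text to `path` as UTF-8.
// Returns true if the file was opened, written and closed without error.
bool SaveLineAsUtf8(const wchar_t* path, const wchar_t* text)
{
    FILE* fp = NULL;

    // "ccs=UTF-8" puts the stream in Unicode mode: we pass wchar_t (UTF-16)
    // and the CRT encodes it as UTF-8 on the way to disk.
    if (_wfopen_s(&fp, path, L"w, ccs=UTF-8") != 0 || fp == NULL)
        return false;

    fputws(text, fp);
    fputws(L"\n", fp);

    // Check the stream's error flag rather than the individual return values.
    const bool ok = (ferror(fp) == 0);
    return (fclose(fp) == 0) && ok;
}
#endif // _MSC_VER
```

Narrow char data written with plain fwrite does not fit this mode, so legacy code that still holds char* text would need to be converted to wchar_t first.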



-- 
Mihai Nita [Microsoft MVP, Visual C++]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
10/7/2008 6:35:14 AM
Mihai N. wrote:
>> I have some legacy code that saves a file to ANSI, (using FILE* fp; fopen( 
>> &fp, ..., "w"); fwrite(...))
>> I am writing 'char*' to the file, (not wchar).
>>
>> The file is saved as ANSI, a third party application uses the file, but one 
>> of their new requirement is that the file be UTF-8.
>>
>> Is there a way of saving my file so it is UTF-8?
> 
> If you use a newer version of VS it should be easy:
> fopen takes a ccs=ENCODING parameter
> http://msdn.microsoft.com/en-us/library/yeby3zcb(VS.80).aspx
> 
> 
Nice info.
However, in the typical MSDN way the docs fail to mention which 
character set / code page is used for converting to/from. I would assume 
the one set for the current locale?

cheers,
Martin
10/7/2008 7:30:01 AM
"Simon" <spambucket@example.com> wrote in message 
news:6l056rFa1jreU1@mid.individual.net...

> I have some legacy code that saves a file to ANSI, (using FILE* fp; 
> fopen( &fp, ..., "w"); fwrite(...))
> I am writing 'char*' to the file, (not wchar).
>
> The file is saved as ANSI, a third party application uses the file, but 
> one of their new requirement is that the file be UTF-8.
>
> Is there a way of saving my file so it is UTF-8?
>
> My application uses _UNICODE apart from that file saving class.
> I don't mind changing all the chars to wchars but I am not sure that it 
> will change the format of the file, (not the way I understand UTF-8 
> anyway).

Hi Simon,

to add to Mihai's correct answer, you may want to try a library I recently 
shared on Code Gallery:

  http://code.msdn.microsoft.com/UTF8Helpers

In the downloads section there is a simple console-based app to show basic 
use of it, there is a more complex MFC-based GUI app, and there is a 
PowerPoint presentation to show basic class usage.

There are a couple of classes (UTF8TextFileReader and UTF8TextFileWriter) that 
embed the logic of Unicode UTF-16 (i.e. standard Windows Unicode: wchar_t*) 
to Unicode UTF-8 (char*) conversion.

Writing to a file in UTF-8 with those classes is very easy, for example:

  UTF8TextFileWriter outFile( filename );
  ...
  outFile.WriteLine( /* your Unicode CString here */ );

The library also exports two static methods to convert strings to and from 
UTF-16/UTF-8:

 - UTF8Convert::ToUTF16() : converts UTF-8 --> UTF-16
 - UTF8Convert::FromUTF16() : converts UTF-16 --> UTF-8

So you could still use these static methods for Unicode conversion and 
manage file input/output yourself, if you don't want to use 
UTF8TextFileWriter/Reader classes.
Note that an overload of the UTF8TextFileWriter constructor also lets you 
write a BOM (i.e. a sequence of bytes at the beginning of the file, 
marking the content of the file as UTF-8).

For more details on Unicode, you may want to read the FAQ at Unicode.org:

 http://www.unicode.org/faq/

and I believe that for very detailed questions/answers, you will benefit from 
Mihai's help here.

Giovanni


10/7/2008 8:14:28 AM
>> I have some legacy code that saves a file to ANSI, (using FILE* fp; 
>> fopen(
>> &fp, ..., "w"); fwrite(...))
>> I am writing 'char*' to the file, (not wchar).
>>
>> The file is saved as ANSI, a third party application uses the file, but 
>> one
>> of their new requirement is that the file be UTF-8.
>>
>> Is there a way of saving my file so it is UTF-8?
>
> If you use a newer version of VS it should be easy:
> fopen takes a ccs=ENCODING parameter
> http://msdn.microsoft.com/en-us/library/yeby3zcb(VS.80).aspx
>

That did the trick, thanks

All I had to do was to subclass the legacy code and simply change the 'save' 
function

Thanks again.

Simon 

10/7/2008 5:39:32 PM
>
>> I have some legacy code that saves a file to ANSI, (using FILE* fp; 
>> fopen( &fp, ..., "w"); fwrite(...))
>> I am writing 'char*' to the file, (not wchar).
>>
>> The file is saved as ANSI, a third party application uses the file, but 
>> one of their new requirement is that the file be UTF-8.
>>
>> Is there a way of saving my file so it is UTF-8?
>>
>> My application uses _UNICODE apart from that file saving class.
>> I don't mind changing all the chars to wchars but I am not sure that it 
>> will change the format of the file, (not the way I understand UTF-8 
>> anyway).
>
> Hi Simon,
>
> to add to Mihai's correct answer, you may want to try a library I recently 
> shared on Code Gallery:
>
>  http://code.msdn.microsoft.com/UTF8Helpers
>
> In the downloads section there is a simple console-based app to show basic 
> use of it, there is a more complex MFC-based GUI app, and there is a 
> PowerPoint presentation to show basic class usage.
>
> There are a couple of classes (UTF8TextFileReader and UTF8TextFileWriter) that 
> embed the logic of Unicode UTF-16 (i.e. standard Windows Unicode: 
> wchar_t*) to Unicode UTF-8 (char*) conversion.
>
> Writing to a file in UTF-8 with those classes is very easy, for example:
>
>  UTF8TextFileWriter outFile( filename );
>  ...
>  outFile.WriteLine( /* your Unicode CString here */ );
>
> The library also exports two static methods to convert strings to and from 
> UTF-16/UTF-8:
>
> - UTF8Convert::ToUTF16() : converts UTF-8 --> UTF-16
> - UTF8Convert::FromUTF16() : converts UTF-16 --> UTF-8
>
> So you could still use these static methods for Unicode conversion and 
> manage file input/output yourself, if you don't want to use 
> UTF8TextFileWriter/Reader classes.
> Note that an overload of the UTF8TextFileWriter constructor also lets you 
> write a BOM (i.e. a sequence of bytes at the beginning of the 
> file, marking the content of the file as UTF-8).
>
> For more details on Unicode, you may want to read the FAQ at Unicode.org:
>
> http://www.unicode.org/faq/
>
> and I believe that for very detailed questions/answers, you will benefit 
> from Mihai's help here.
>
> Giovanni
>
>

This looks very interesting, I will certainly have a look at it.

Thanks for sharing,

Simon 

10/7/2008 5:40:32 PM
"Simon" <spambucket@example.com> wrote in message 
news:6l1lhmFa3c24U2@mid.individual.net...

> Thanks for sharing,

Thanks for your interest in it.

Giovanni



10/7/2008 7:00:02 PM
> Nice info.
> However, in the typical MSDN way the docs fail to mention which 
> character set / code page is used for converting to/from. I would assume 
> the one set for the current locale?

If the application calls fopen + ccs=ENCODING,
then the conversion will be between ENCODING and the system code page.
If the application calls _wfopen + ccs=ENCODING,
then the conversion will be between ENCODING and UTF-16.

There are some hints in the doc:
   "(as if by a call to the mbtowc function)
   ...
   (as if by a call to the wctomb function)"

And the "common knowledge" that the only code pages directly
handled by Windows APIs are UTF-16, ANSI, and OEM
(except for the very few code page conversion APIs that take code
pages as parameters).

But it is not very clear, indeed.


-- 
Mihai Nita [Microsoft MVP, Visual C++]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
10/8/2008 5:47:41 AM
If your app uses 8-bit characters but only in the range 0..127 (plain ASCII), the file is
already valid UTF-8.  Otherwise, you would have to run the text into and out of UTF-16: the
first conversion (MultiByteToWideChar) converts from CP_ACP to wide characters, and the
second (WideCharToMultiByte) converts from Unicode to CP_UTF8.  Then write out the result.
Note there is also an official UTF-8 "byte order mark", 0xEF 0xBB 0xBF, that they may expect
as the first 3 bytes of the file.
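
The steps above can be sketched as follows. This is a rough illustration rather than a definitive implementation: the helper names IsPureAscii, WithUtf8Bom and AnsiToUtf8 are invented for the example, and the conversion part assumes a Windows build. Any byte in 0..127 encodes identically in UTF-8, which is why pure-ASCII text is passed through untouched.

```cpp
#include <cstddef>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// True if every byte is in 0..127; such text is already valid UTF-8.
bool IsPureAscii(const std::string& text)
{
    for (std::size_t i = 0; i < text.size(); ++i)
        if (static_cast<unsigned char>(text[i]) > 127)
            return false;
    return true;
}

// Prepend the 3-byte UTF-8 byte order mark (EF BB BF).
std::string WithUtf8Bom(const std::string& utf8)
{
    return std::string("\xEF\xBB\xBF") + utf8;
}

#ifdef _WIN32
// ANSI (CP_ACP) -> UTF-16 -> UTF-8. Returns an empty string on failure.
std::string AnsiToUtf8(const std::string& ansi)
{
    if (ansi.empty() || IsPureAscii(ansi))
        return ansi;  // nothing to convert

    const int ansiLen = static_cast<int>(ansi.size());

    // Step 1: ANSI -> UTF-16 (size query, then conversion).
    const int wideLen = ::MultiByteToWideChar(CP_ACP, 0, ansi.data(), ansiLen,
                                              NULL, 0);
    if (wideLen <= 0)
        return std::string();
    std::wstring wide(static_cast<std::size_t>(wideLen), L'\0');
    ::MultiByteToWideChar(CP_ACP, 0, ansi.data(), ansiLen, &wide[0], wideLen);

    // Step 2: UTF-16 -> UTF-8 (size query, then conversion).
    const int utf8Len = ::WideCharToMultiByte(CP_UTF8, 0, wide.data(), wideLen,
                                              NULL, 0, NULL, NULL);
    if (utf8Len <= 0)
        return std::string();
    std::string utf8(static_cast<std::size_t>(utf8Len), '\0');
    ::WideCharToMultiByte(CP_UTF8, 0, wide.data(), wideLen,
                          &utf8[0], utf8Len, NULL, NULL);
    return utf8;
}
#endif // _WIN32
```

Whether to call WithUtf8Bom before writing depends on what the third-party application expects, so it is kept as a separate step here.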
					joe

On Tue, 7 Oct 2008 05:57:20 +0200, "Simon" <spambucket@example.com> wrote:

>Hi,
>
>I have some legacy code that saves a file to ANSI, (using FILE* fp; fopen( 
>&fp, ..., "w"); fwrite(...))
>I am writing 'char*' to the file, (not wchar).
>
>The file is saved as ANSI, a third party application uses the file, but one 
>of their new requirement is that the file be UTF-8.
>
>Is there a way of saving my file so it is UTF-8?
>
>My application uses _UNICODE apart from that file saving class.
>I don't mind changing all the chars to wchars but I am not sure that it will 
>change the format of the file, (not the way I understand UTF-8 anyway).
>
>Many thanks.
>
>Regards,
>
>Simon 
Joseph M. Newcomer [MVP]
email: newcomer@flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
10/15/2008 8:57:07 PM