Unicode text Find & Replace
I am developing a Unicode-to-general-text program in VSTO (VB.NET). I am
working with Gurmukhi/Punjabi Unicode text, i.e. Raavi. Many characters in it
are formed by combining two or more different characters, like ਕ + ਿ = ਕਿ or ਕ
+ ਿ + ਂ = ਕਿਂ. Earlier I was not able to find any single instance of ਿ or ਂ,
but later I tried the wildcard option along with [ਿ]. The character is now
found perfectly, but replacing it with the general character "i" replaces the
whole "ਕਿ" or "ਕਿਂ" with the single character "i". Can anyone help? How can I
search for single characters and replace them with other characters, to
normalize the Unicode-to-general-font conversion?
Utf
12/31/2009 8:09:02 AM
"Chand" <Chand@discussions.microsoft.com> wrote:
> I am developing a Unicode-to-general-text program in VSTO (VB.NET).
> I am working with Gurmukhi/Punjabi Unicode text, i.e. Raavi.
> Many characters in it are formed by combining two or more
> different characters, like ਕ + ਿ = ਕਿ or ਕ + ਿ + ਂ = ਕਿਂ.
> Earlier I was not able to find any single instance of ਿ or ਂ,
> but later I tried the wildcard option along with [ਿ].
That's a problem I know from diacritics in other languages. The diacritic
alone, or the letter it's combined with alone, isn't found.
The work-around I use is the same as yours.
I filed a bug report years ago, but never heard back on whether it's going
to be fixed.
> The character is now found perfectly, but replacing it with the general
> character "i" replaces the whole "ਕਿ" or "ਕਿਂ" with the single character "i".
> Can anyone help? How can I search for single characters and replace
> them with other characters, to normalize the Unicode-to-general-font
> conversion?
I'm not sure from your description what you want to replace with what.
Since you're doing a wildcard replacement, you can re-use anything matched
if you put parentheses around it in "Find what", and then use the
appropriate placeholder in "Replace with" (\1 for the first expression in
parentheses, \2 for the second...).
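As a rough illustration of the placeholder idea outside Word, here is a
minimal Python sketch. It is not Word's wildcard engine, and the function
name replace_sign_keep_base is made up for this example. It assumes the text
is stored as separate code points (ਕ is U+0A15, ਿ is U+0A3F, ਂ is U+0A02),
captures the character in front of the sign, and writes it back next to the
replacement, so only the sign itself is swapped for "i".

```python
import re

KA = "\u0A15"      # GURMUKHI LETTER KA
SIGN_I = "\u0A3F"  # GURMUKHI VOWEL SIGN I
BINDI = "\u0A02"   # GURMUKHI SIGN BINDI


def replace_sign_keep_base(text, sign, replacement):
    """Replace `sign` but keep the character in front of it.

    Hypothetical helper for illustration only. Group 1 plays the role
    of the parenthesized expression in "Find what"; writing it back
    plays the role of \\1 in "Replace with".
    """
    pattern = re.compile("(.)" + re.escape(sign))
    # A function is used as the replacement so that `replacement`
    # is inserted literally, without backslash interpretation.
    return pattern.sub(lambda m: m.group(1) + replacement, text)


word = KA + SIGN_I + BINDI  # displayed as one cluster, stored as 3 code points
result = replace_sign_keep_base(word, SIGN_I, "i")
```

Here result holds ਕ, "i" and ਂ in that order, so the base letter and the
bindi survive. Whether Word's own Find treats the combined characters the
same way is a separate question that this sketch does not answer.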
I have no idea about Gurmukhi/Punjabi Unicode. In other scripts that use
ligatures and diacritics, such as Arabic, the ligatures form automatically
if a well-designed font is used. The glyphs for ligatures may be in Unicode
only for compatibility reasons: because old fonts don't form the ligatures,
or because old files used the ligature characters directly, since fonts back
then didn't form them automatically.
So maybe ask in a group with knowledgeable people (say
microsoft.public.word.international.features) if the replacements you are
trying to make are sensible, or if instead you can use a font that handles
the ligatures automatically?
Regards,
Klaus
Klaus
1/5/2010 5:06:32 PM
I'm unsure if this addresses the problem, but the code that I wrote to
replace both ANSI and Unicode character strings with ligatures required:
1. Specify match case (prevents character variants from matching)
2. Exclude small caps
3. Enable format matching (to detect caps and bold/italic properly)
Cheers
Utf
1/12/2010 4:15:01 AM