Whole Tomato Software Forums
[1812] non-ASCII character in source file error

ex3
New Member

5 Posts

Posted - Feb 25 2010 : 02:18:47 AM
env:
win7 64bit Chinese
vs2010 rc
va1812


Any Chinese text (string constant and/or comment) in source code causes VAX to match () {} [] incorrectly. It seems that only the (){}[] after the Chinese text are matched wrongly; anything before the Chinese text is fine.

All highlighting after any Chinese text is wrong too, with the same problem.

ex3
New Member

5 Posts

Posted - Feb 25 2010 : 02:21:46 AM
By the way, my codepage = 936, but I don't think it matters; I tried other codepages and it still does not work!

feline
Whole Tomato Software

United Kingdom
19025 Posts

Posted - Mar 02 2010 : 3:55:47 PM
So far I cannot reproduce this problem here. Can you please send me a sample code file that shows the problem, as a zip file. Sending it as a zip file makes sure it will have the correct encoding when I get it.

Please submit the files via the form:

http://www.wholetomato.com/support/contact.asp

including this thread ID or URL in the description, so we can match it up.

zen is the art of being at one with the two'ness

ex3
New Member

5 Posts

Posted - Mar 04 2010 : 10:04:28 AM
quote:
Originally posted by feline

So far I cannot reproduce this problem here. Can you please send me a sample code file that shows the problem, as a zip file. Sending it as a zip file makes sure it will have the correct encoding when I get it.

Please submit the files via the form:

http://www.wholetomato.com/support/contact.asp

including this thread ID or URL in the description, so we can match it up.



Done!

feline
Whole Tomato Software

United Kingdom
19025 Posts

Posted - Mar 05 2010 : 4:50:12 PM
I have the files, thank you for these:

case=40494

However I cannot see any Chinese letters when I open the cpp file. I have tried several different methods, but Windows never opens the file as Chinese.

Can you please save out the file with a Chinese codepage, zip it up, and send it to me? I suspect the problem is that I am using an English OS, but I do not read Chinese, so installing a Chinese OS is going to be very difficult.

I have tried typing Chinese letters into the test cpp file, but so far I cannot reproduce the problem.


ex3
New Member

5 Posts

Posted - Mar 06 2010 : 01:59:21 AM
The cpp file I gave you uses codepage 936 (gb2312), which is the codepage Visual Studio uses by default for Chinese on the Chinese version of Windows.
As for not seeing any Chinese in the cpp file, I think you need to install the Chinese language pack from Windows Update; the language packs are optional updates.
http://support.microsoft.com/?id=972813 (may need the Ultimate edition of Win7 or Vista? not sure!)

From what I observe, the problem is caused by multibyte text encoding: VA counts each Chinese char as two ASCII chars, so length=2, but VS counts each Chinese char as one char, so length=1.
After I put 6 Chinese chars in the comment, the highlighting after the comment is offset by 6.

If you still need the cpp file in another codepage, tell me which one (or more than one) you want.
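
To illustrate the mismatch, here is a minimal sketch in Python. It is not what VA or VS actually run; the sample line and the variable names are made up for illustration. It only shows that six two-byte characters in codepage 936 make the byte count six larger than the character count.

```python
# Hypothetical sample line: a C++ comment containing six Chinese characters,
# followed by code whose brackets would be matched and highlighted.
line = "/* 中文注释测试 */ foo(bar);"

# The bytes as they would be stored on disk with codepage 936 (GBK).
encoded = line.encode("gbk")

char_count = len(line)      # one unit per character
byte_count = len(encoded)   # one unit per byte

# Each of the six Chinese characters takes two bytes in GBK, so any
# byte-based position after the comment is shifted by six.
offset_drift = byte_count - char_count
print(char_count, byte_count, offset_drift)
```

A tool that counts bytes and an editor that counts characters agree on every position before the comment and disagree by `offset_drift` on every position after it.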

feline
Whole Tomato Software

United Kingdom
19025 Posts

Posted - Mar 08 2010 : 3:17:04 PM
When I open the solution and file in VS2010, and then look at:

IDE File menu -> Advanced Save Options... -> Encoding

the default code page is 1252, "Western European (Windows)", and this is what I am seeing:



I have the Chinese keyboard installed on this test machine, and I have deleted the invalid Chinese characters and typed the following text, and this is what I am seeing:



and I am not seeing any sign of a problem. Should I be seeing a problem with this comment?


ex3
New Member

5 Posts

Posted - Mar 09 2010 : 04:46:02 AM
Now I am sure that the problem is caused by multibyte char encoding.
It must be like this:
codepage = 936 (gb2312)
va1814 strlen("a") = 1, strlen("<one Chinese char>") = 2
vs2010 strlen("a") = 1, strlen("<one Chinese char>") = 1

codepage = 1252 (Western European) or other ascii encoding
va1814 strlen("a") = 1, strlen("<one Chinese char>") = 2
vs2010 strlen("a") = 1, strlen("<one Chinese char>") = 2

So in an ASCII environment (yours) everything is fine, but in a non-ASCII environment (mine) it is incorrect.
To reproduce this, you need to install a (any) Unicode language pack (see my last post) and then switch to that language to set up the environment.
Otherwise, no matter what encoding the file you receive has, your environment will treat it as an ASCII file.
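
The two cases above can be reproduced outside the IDE with a small Python sketch. The character used is an arbitrary example chosen for illustration, and `cp936` / `cp1252` are Python's codec names for the two codepages, not anything VA or VS exposes.

```python
# One Chinese character, stored as bytes in codepage 936.
raw = "中".encode("gbk")

# The same two bytes interpreted under each codepage.
as_936 = raw.decode("cp936")    # read as one double-byte character
as_1252 = raw.decode("cp1252")  # read as two unrelated Western characters

print(len(raw), len(as_936), len(as_1252))
```

Under codepage 1252 the byte count and the character count agree, which would explain why the problem does not show up in a Western European environment.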

To highlight an object, VA calculates the position in the file first and then tells VS to highlight that position, right? So either VA uses MultiByteToWideChar() to convert to the VS way of counting, or VA has to somehow force VS to read the file the VA way.
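
The first option can be sketched roughly like this in Python. It only stands in for a MultiByteToWideChar()-style conversion and is a guess at the idea, not VA's actual code; `byte_to_char_offset` and the sample text are hypothetical.

```python
def byte_to_char_offset(data: bytes, byte_offset: int, encoding: str = "gbk") -> int:
    """Translate a byte offset into a character offset by decoding the prefix.

    Raises UnicodeDecodeError if byte_offset falls in the middle of a
    multibyte character.
    """
    return len(data[:byte_offset].decode(encoding))


# Hypothetical file contents: a comment with two Chinese characters,
# then a line of code with a bracket to highlight.
data = "// 中文\nfoo(bar);".encode("gbk")

byte_pos = data.index(b"(")                    # position found by counting bytes
char_pos = byte_to_char_offset(data, byte_pos)  # position an editor counting characters expects

print(byte_pos, char_pos)
```

The two positions differ by two here, one for each double-byte character before the bracket.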

feline
Whole Tomato Software

United Kingdom
19025 Posts

Posted - Mar 09 2010 : 4:50:42 PM
I have tested this in a Unicode file, and so far I cannot reproduce the problem. I have opened the cpp file in a hex editor and confirmed that all of the characters are using two bytes.

I am going to ask internally and see if anyone has any ideas.


feline
Whole Tomato Software

United Kingdom
19025 Posts

Posted - Mar 30 2010 : 6:59:48 PM
Many apologies for the delay, but I have finally found out what is going on here. The problem makes sense now that I finally understand it. The file you sent me is only partly Unicode. Most characters in the file take one byte, but the Chinese characters in the comment take two bytes.

I also need to configure the OS in a specific, non-obvious manner before the file will open and work correctly.

Since VA thinks the file is not Unicode, and treats each byte in the file as one character on the screen, the highlighting gets offset.

case=41798


jrynd
New Member

USA
8 Posts

Posted - Apr 19 2010 : 1:07:58 PM
It's not Unicode (at least, it's not Microsoft's usual UTF-16/UCS-2 Unicode). It's code page 936 aka "GBK", which is a variable length encoding. Single bytes 0-127 are equivalent between 936 and ASCII, but a byte above 128 indicates that it's part of a multiple-byte character, as follows:
http://en.wikipedia.org/wiki/GBK

Edited by - jrynd on Apr 19 2010 1:15:27 PM

support
Whole Tomato Software

5566 Posts

Posted - Apr 23 2010 : 03:19:40 AM
case=41798 is fixed in build 1822

Whole Tomato Software, Inc.
© 2023 Whole Tomato Software, LLC