loading jpn-eng / eng-jpn bgl

Posted: Mon Jul 20, 2009 1:01 pm
by makemeunsee
GoldenDict exits when attempting to load the Japanese-English and English-Japanese dictionaries.
These dictionaries can be found, packed, here (the first two):
http://www.babylon.com/category/14/Japanese.html
It's due to the Shift-JIS charset ID in bgl_babylon.hh not being recognized by libiconv; using "SHIFT_JIS" instead of "SJIS-WIN" makes it work.
I cannot use svn for now but I'll be glad to contribute directly in the future...
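
A charset name that one runtime accepts may be unknown to another, which is the failure described above. The sketch below probes a list of candidate names and returns the first one that is recognized. It uses Python's built-in codec registry as a stand-in for libiconv, so the set of accepted names will not match libiconv's; `first_supported` is a hypothetical helper, not part of GoldenDict.

```python
import codecs


def first_supported(names):
    """Return the first charset name this runtime recognizes, or None."""
    for name in names:
        try:
            # codecs.lookup raises LookupError for unknown encoding names
            codecs.lookup(name)
        except LookupError:
            continue
        return name
    return None


# Candidate names taken from this thread, in order of preference.
print(first_supported(["SJIS-WIN", "CP932", "SHIFT_JIS"]))
```

With libiconv, the equivalent check would be whether iconv_open succeeds for the given name.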

Re: loading jpn-eng / eng-jpn bgl

Posted: Thu Jul 23, 2009 6:28 pm
by ikm
Does this seem to be an issue with the supplied Windows version of libiconv? There were no problems with those dictionaries under Linux.

Re: loading jpn-eng / eng-jpn bgl

Posted: Wed Jul 29, 2009 8:35 am
by makemeunsee
Yes, with the supplied Windows libiconv.
I'm going to test the workaround I mentioned above with the Linux version this weekend; I'll update this post with the results I get.

EDIT:
On Linux, it works with either constant name (SHIFT_JIS or SJIS-WIN).
I've also tested with the most recent version of libiconv, which I compiled for Windows (1.13 instead of the old 1.9 provided in the package), and the bug still happens.

Re: loading jpn-eng / eng-jpn bgl

Posted: Wed Jul 29, 2009 12:42 pm
by ikm
Do you know if there is any difference between SHIFT_JIS and SJIS-WIN? During indexing, some Japanese bgls give a lot of charset decoding errors, and I had a *feeling* that SJIS-WIN gave slightly fewer of them. But I don't know for sure. I could change the value to SHIFT_JIS, but I'm afraid of possible implications (the whole S-JIS thing is a bit of a mess).

Re: loading jpn-eng / eng-jpn bgl

Posted: Thu Jul 30, 2009 12:49 pm
by makemeunsee
I just found this (check part '(6) Differences') and this.
So there are some significant differences... I'll investigate and try to compare the decoding errors for both charsets.

Re: loading jpn-eng / eng-jpn bgl

Posted: Fri Jul 31, 2009 11:46 am
by ikm
If you could sort out the issues with the SJIS encodings in Babylon files, that would be great. Originally, the Babylon parser that GoldenDict adopted used plain SHIFT_JIS. But then I noticed a lot of encoding conversion errors in some dictionaries and tried changing it to SJIS-WIN instead. I noticed no drawbacks, and I *think* it helped with those errors *a bit*, though they'd still show up here and there. I don't know Japanese well enough to spot those errors while browsing the dictionaries in the program: they look okay to me, but that doesn't mean they really are. (It was the same with Hebrew, for example: everything looked fine to me, but then people who could actually read it noticed that sentences were missing their last letters.)

Re: loading jpn-eng / eng-jpn bgl

Posted: Tue Aug 04, 2009 8:32 am
by makemeunsee
I'm not good enough at Japanese either to detect all the errors; I can only read basic stuff, and so far the bgl imports seem alright to me. I'll try to think of some test cases that use the Babylon output directly.
Indeed, there are more iconv errors when using SHIFT_JIS, so I agree it's not the proper solution. Actually, I've found out that SJIS-WIN is called CP932 in MS Windows, and that name also works on my Ubuntu. I compared the iconv errors for SHIFT_JIS, SJIS-WIN and CP932 (plus other charsets, with no success): CP932 and SJIS-WIN really are the same, while SHIFT_JIS generates a few more errors.
So replacing "SJIS-WIN" with "CP932" seems to be the right solution.
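
The kind of comparison described above can be sketched roughly as follows. This is a minimal illustration using Python's built-in `shift_jis` and `cp932` codecs rather than libiconv, so the exact error counts may differ from what iconv reports. `count_decode_errors` is a hypothetical helper and the sample bytes are made up for the example, not taken from a Babylon file.

```python
def count_decode_errors(data: bytes, charset: str) -> int:
    """Count U+FFFD replacement characters produced when decoding data."""
    # errors="replace" substitutes U+FFFD for input the codec cannot decode
    text = data.decode(charset, errors="replace")
    return text.count("\ufffd")


# Plain JIS X 0208 text followed by 0x8740, a code that CP932 assigns
# to a circled digit but that plain Shift_JIS leaves undefined.
sample = "日本語".encode("cp932") + b"\x87\x40"

for charset in ("shift_jis", "cp932"):
    print(charset, count_decode_errors(sample, charset))
```

Running the same count over the raw text of a real dictionary, once per candidate charset, would give a rough measure of which charset fits the data best.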

Re: loading jpn-eng / eng-jpn bgl

Posted: Sun Nov 22, 2009 7:47 am
by hinomiya
Here is a description of the differences between sjis and cp932. ;)

http://dev.mysql.com/doc/refman/5.4/en/ ... -sets.html