GD abbreviation functionality (Abbrev.dsl)

General discussion

Re: GD abbreviation functionality (Abbrev.dsl)

Postby det » Sun Jul 03, 2011 12:00 pm

So this must be a Unicode-related bug. I tried the following:
- Added a Latin-script-only abbreviation entry: it worked perfectly, showing the full form on mouse-over.
- Added a Latin-script abbreviation with a mixed-script full form: the mouse-over box appears, but the Bengali Unicode text is rendered as garbled Latin characters.

The top of my modified _abrv.dsl file, with the Latin-script entry:
Code:
inference
   not abbreviated (অনুমান)
প্র.
   প্রবাদ, প্রবচন (proverb)
বর্জি.
   বর্জিত বা বর্জনযোগ্য (rejected)

When I hold the mouse over the word "inference", I get the following. The Latin-script part is fine; the Bengali-script part is garbled:
not abbreviated (a¦a¦Ëa§ŕa¦®a¦sa¦Ë)
det
 
Posts: 37
Joined: Fri Jul 23, 2010 7:22 am

Postby Tvangeste » Sun Jul 03, 2011 12:03 pm

Can you post a small sample so that we could try it out?
Tvangeste
 
Posts: 893
Joined: Thu Jun 02, 2011 11:42 am

Postby det » Sun Jul 03, 2011 12:19 pm

I've attached a mini version of the dictionary. In this case, 'degree' is the English word added to the abbreviation list; it is also indexed for that entry. Thanks for looking into it.
Attachments
bangla-small.zip
Contains both dictionary and abrv file
(2.28 KiB) Downloaded 645 times
det
 

Postby Tvangeste » Sun Jul 03, 2011 12:34 pm

det wrote: I've attached a mini version of the dictionary. In this case, 'degree' is the English word added to the abbreviation list; it is also indexed for that entry. Thanks for looking into it.

Heh, that was easy. :) Your main dictionary is in UTF-16LE, but your abbreviation file is in UTF-8 without a BOM (some editors would treat such a file as ASCII).
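For anyone who wants to check this on their own files, here is a rough sketch in Python that reports which byte order mark, if any, a file starts with. The function name `detect_bom` and the file name in the usage note are made up for illustration; this is not how GoldenDict itself detects encodings, and UTF-32 is deliberately ignored.

```python
import codecs


def detect_bom(data: bytes) -> str:
    """Return a label for the BOM at the start of `data`, or "none".

    Only UTF-8 and UTF-16 BOMs are checked. UTF-32 is ignored in this
    sketch, so a UTF-32LE file would be reported as "utf-16-le".
    """
    if data.startswith(codecs.BOM_UTF8):      # bytes EF BB BF
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "utf-16-be"
    return "none"
```

To use it, read the first few bytes of the file in binary mode, e.g. `detect_bom(open("bangla_abrv.dsl", "rb").read(4))` (hypothetical file name). A result of "none" on a file that contains Bengali text is the situation described above.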

GoldenDict *does* work with UTF-8 dictionaries out of the box, but you need to put a UTF-8 BOM at the beginning of the file so that GoldenDict doesn't have to guess the encoding. Most editors, e.g. Notepad++, can save files in a particular encoding, with or without a BOM.

Alternatively, you could convert your abbr file to UTF-16LE, to be in sync with the main dictionary format.
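If you would rather script either fix than use an editor, something like the following Python sketch should do it. It assumes the abbreviation file is valid UTF-8 to begin with; the function names and the file names in the usage note are hypothetical, not part of GoldenDict.

```python
import codecs
from pathlib import Path


def add_utf8_bom(path: Path) -> None:
    """Prepend a UTF-8 BOM to the file in place, unless it already has one."""
    data = path.read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        return  # already marked, nothing to do
    # Sanity check: raises UnicodeDecodeError if the file is not valid UTF-8.
    data.decode("utf-8")
    path.write_bytes(codecs.BOM_UTF8 + data)


def convert_to_utf16le(src: Path, dst: Path) -> None:
    """Read a UTF-8 file (with or without BOM) and write it as UTF-16LE with BOM."""
    # "utf-8-sig" strips a leading UTF-8 BOM if one is present.
    text = src.read_bytes().decode("utf-8-sig")
    dst.write_bytes(codecs.BOM_UTF16_LE + text.encode("utf-16-le"))
```

For example, `add_utf8_bom(Path("bangla_abrv.dsl"))` rewrites the file in place, while `convert_to_utf16le(Path("bangla_abrv.dsl"), Path("bangla_abrv_utf16.dsl"))` writes a converted copy. Keep a backup of the original either way.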

Once you correct the abbr file encoding issue, everything works just fine, and abbreviations are shown correctly.
Tvangeste
 

Postby det » Mon Jul 04, 2011 3:35 am

...right, I should have thought to check that.

Well, problem solved. Thank you!
det
 
