Can a new format be created to make the dict smaller?

Posted: Wed Oct 12, 2011 4:39 am
by bnu05071009
Can a new format be created to make the dicts smaller? Then we could read Wikipedia offline in GoldenDict.
Or can GoldenDict read from a .rar or .zip archive?

Re: Can a new format be created to make the dict smaller?

Posted: Wed Oct 12, 2011 6:21 am
by Tvangeste
GoldenDict can read dictzipped dictionaries (certainly in the Stardict and DSL formats, at least).

Here is what I do to reduce the dictionary size: convert the DSL file to UTF-8 with Unix EOLs (a significant saving compared with UTF-16 and Windows EOLs), then dictzip the file, so that: dictionary.dsl --> dictionary.dsl.dz

The resulting compressed file is quite small.
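
The steps above could be sketched in Python roughly as follows. This is only an illustration, not necessarily the tooling used here: the function names and the chunk size are made up for the example, the input is assumed to be UTF-16 with a BOM, and the dictzip command is assumed to be installed and on the PATH. The output is written without a BOM; whether GoldenDict wants one is not covered in this thread, and using "utf-8-sig" as the output encoding would add it.

```python
import shutil
import subprocess


def convert_dsl_to_utf8(src_path, dst_path, chunk_chars=1 << 20):
    """Re-encode a UTF-16 DSL file as UTF-8 with Unix (LF) line endings."""
    # encoding="utf-16" picks the byte order from the BOM and drops the BOM.
    # newline=None (universal newlines) turns \r\n and \r into \n on read.
    # newline="\n" on the output side writes \n unchanged.
    with open(src_path, "r", encoding="utf-16", newline=None) as src, \
         open(dst_path, "w", encoding="utf-8", newline="\n") as dst:
        # Copy in chunks so the whole file is never held in memory.
        shutil.copyfileobj(src, dst, chunk_chars)


def dictzip(path):
    """Run the dictzip command (assumed to be on the PATH) on one file."""
    # By default dictzip writes path + ".dz" in place of the original.
    subprocess.run(["dictzip", path], check=True)
```

Typical use would be convert_dsl_to_utf8("dictionary_utf16.dsl", "dictionary.dsl") followed by dictzip("dictionary.dsl"); both file names are placeholders.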

Re: Can a new format be created to make the dict smaller?

Posted: Wed Oct 12, 2011 6:50 am
by bnu05071009
Thank you!

Re: Can a new format be created to make the dict smaller?

Posted: Wed May 21, 2014 2:51 am
by the_cla5h
Thanks for the explanation! But how can I convert the DSL file to UTF-8 and Unix EOLs?

Re: Can a new format be created to make the dict smaller?

Posted: Wed May 21, 2014 9:12 am
by wargus
A DSL file is a simple text file (with tags), so you can convert it to UTF-8 with any text editor (I prefer Notepad++ or EmEditor).
For Unix EOLs, have a look here: http://ubuntugenius.wordpress.com/2010/10/26/how-to-convert-windowsdos-text-files-to-linuxunix-format/ Maybe it can help you...
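
Before converting anything, it can help to check what the file already is. Here is a small Python sketch for that; the function name and the sample size are invented for the example. It only looks for a UTF-8 or UTF-16 BOM and inspects the first chunk of the file, so a UTF-16 file without a BOM will not be recognised and its line endings may be misreported.

```python
import codecs


def sniff_text_file(path, sample_bytes=65536):
    """Guess the encoding from the BOM and report the line-ending style."""
    with open(path, "rb") as f:
        head = f.read(sample_bytes)

    if head.startswith(codecs.BOM_UTF8):
        encoding = "utf-8-sig"
    elif head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        encoding = "utf-16"
    else:
        encoding = None  # no BOM found: this check alone cannot tell

    # An incremental decoder tolerates a sample cut in mid-character.
    # latin-1 never fails, which is enough to spot CR/LF in ASCII-compatible
    # encodings when no BOM is present.
    decoder = codecs.getincrementaldecoder(encoding or "latin-1")()
    text = decoder.decode(head)

    if "\r\n" in text:
        eol = "CRLF"  # Windows line endings
    elif "\n" in text:
        eol = "LF"    # Unix line endings
    else:
        eol = "unknown"
    return encoding, eol
```

A result of ("utf-16", "CRLF") would mean both conversions are still needed, while "LF" means the EOL step can be skipped.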

Re: Can a new format be created to make the dict smaller?

Posted: Wed May 21, 2014 3:33 pm
by the_cla5h
Thanks a lot for your link, wargus, it was very useful!
I didn't need those EOL tools, though, because my file turned out to be in Unix format already. Converting it from UTF-16 to UTF-8 was harder: it was a 2.4 GB file, so gedit and other text editors would crash while opening or saving it. Fortunately, I found a really light and fast text editor named AkelPad (it's for Windows but works on Linux under Wine), which was able to open the file without crashing and save it as a new file in UTF-8. It shrank from 2.4 GB to 1.2 GB (50%!). Then I was able to compress it with dictzip to a 338 MB file. :)
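
The 50% figure is what you would expect for mostly ASCII text: UTF-16 spends two bytes on every such character and UTF-8 only one. A quick Python check, using a made-up sample entry (not taken from any real dictionary) and the sizes quoted above, with 1 GB taken as 1000 MB:

```python
def encoded_sizes(text):
    """Return (UTF-16 size, UTF-8 size) in bytes, both without a BOM."""
    return len(text.encode("utf-16-le")), len(text.encode("utf-8"))


# Hypothetical ASCII-only DSL entry, for illustration only.
ascii_entry = "apple\n\t[m1]a round fruit[/m]\n"
utf16_size, utf8_size = encoded_sizes(ascii_entry)
print(utf16_size, utf8_size)        # UTF-16 is exactly twice UTF-8 here

# Cyrillic letters take two bytes in both encodings, so no saving there.
print(encoded_sizes("слово"))       # (10, 10)

# Ratios for the sizes reported in this post.
utf16_mb, utf8_mb, dz_mb = 2400, 1200, 338
print(round(100 * dz_mb / utf8_mb))   # 28 (% of the UTF-8 file)
print(round(100 * dz_mb / utf16_mb))  # 14 (% of the original file)
```

So a dictionary whose text is mostly non-Latin will gain less from the re-encoding step, although dictzip should still help.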