Can a new format be created to make the dict smaller?

Postby bnu05071009 » Wed Oct 12, 2011 4:39 am

Can a new format be created to make the dicts smaller? Then we could read Wikipedia in GoldenDict offline.
Or can GoldenDict read from a .rar or .zip archive?
bnu05071009
 
Posts: 2
Joined: Wed Oct 12, 2011 4:18 am

Re: Can a new format be created to make the dict smaller?

Postby Tvangeste » Wed Oct 12, 2011 6:21 am

GoldenDict can read dictzipped dictionaries (for the StarDict and DSL formats, at least).

Here is what I do to reduce the dictionary size: convert the DSL file to UTF-8 with Unix EOLs (compared to UTF-16 with Windows EOLs, that gives quite significant savings), then dictzip the file, so that: dictionary.dsl --> dictionary.dsl.dz

The resulting compressed file is quite small.
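
For anyone who prefers a script to a text editor, the conversion step described above could be sketched in Python roughly as follows. This is only a minimal sketch under some assumptions: the function name `convert_dsl_to_utf8` is made up for illustration, the source file is assumed to be UTF-16 with a BOM, and the output is written without a UTF-8 BOM (whether your GoldenDict build wants one is something to check; switching the output encoding to `utf-8-sig` would add it). The file is processed line by line, so it does not have to fit in memory.

```python
import sys


def convert_dsl_to_utf8(src_path, dst_path):
    """Rewrite a UTF-16 text file as UTF-8 with Unix (LF) line endings.

    Hypothetical helper, not part of GoldenDict or dictzip.
    """
    # Text mode with the default newline handling turns "\r\n" into "\n"
    # while reading; newline="\n" on the output writes "\n" unchanged.
    with open(src_path, "r", encoding="utf-16") as src, \
         open(dst_path, "w", encoding="utf-8", newline="\n") as dst:
        for line in src:  # streamed, one line at a time
            dst.write(line)


if __name__ == "__main__":
    if len(sys.argv) == 3:
        # Example: python convert.py dictionary_utf16.dsl dictionary.dsl
        convert_dsl_to_utf8(sys.argv[1], sys.argv[2])
    else:
        print("usage: convert.py SOURCE.dsl TARGET.dsl")
```

After that, running dictzip on the converted file should give dictionary.dsl.dz as described above.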
Tvangeste
 
Posts: 893
Joined: Thu Jun 02, 2011 11:42 am

Re: Can a new format be created to make the dict smaller?

Postby bnu05071009 » Wed Oct 12, 2011 6:50 am

Thank you!
bnu05071009
 
Posts: 2
Joined: Wed Oct 12, 2011 4:18 am

Re: Can a new format be created to make the dict smaller?

Postby the_cla5h » Wed May 21, 2014 2:51 am

Thanks for the explanation! But how can I convert the DSL file to UTF-8 and Unix EOLs? :roll:
the_cla5h
 
Posts: 15
Joined: Sun Mar 09, 2014 1:38 am

Re: Can a new format be created to make the dict smaller?

Postby wargus » Wed May 21, 2014 9:12 am

A DSL file is a simple text file (with tags), so you can convert it to UTF-8 with any text editor that lets you choose the encoding when saving (I prefer Notepad++ or EmEditor).
For Unix EOLs, see here: http://ubuntugenius.wordpress.com/2010/10/26/how-to-convert-windowsdos-text-files-to-linuxunix-format/ Maybe it can help you...
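
Before converting anything, it may be worth checking what the file actually contains, since it might already use Unix EOLs. Below is a rough Python sketch of such a check. The function name `sniff_text_file` is hypothetical, and the no-BOM fallback to UTF-8 is only an assumption (a UTF-16 file saved without a BOM would be misjudged or fail to decode).

```python
import codecs


def sniff_text_file(path, sample_chars=65536):
    """Return (encoding_guess, has_windows_eols) for a text file.

    Hypothetical helper; it only looks at the BOM and a leading sample.
    """
    with open(path, "rb") as f:
        head = f.read(4)

    if head.startswith(codecs.BOM_UTF8):
        encoding = "utf-8-sig"
    elif head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        encoding = "utf-16"
    else:
        # Assumption: no BOM means plain UTF-8.
        encoding = "utf-8"

    # newline="" keeps the original line endings so "\r\n" stays visible.
    with open(path, "r", encoding=encoding, newline="") as f:
        sample = f.read(sample_chars)

    return encoding, "\r\n" in sample
```

If the second value comes back False for the sampled part, the EOL tools from the link are probably not needed and only the encoding has to change.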
wargus
 
Posts: 14
Joined: Wed Feb 08, 2012 3:03 pm

Re: Can a new format be created to make the dict smaller?

Postby the_cla5h » Wed May 21, 2014 3:33 pm

Thanks a lot for your link, wargus, it was very useful!
I didn't need to use those EOL tools though, because I discovered that my file was already in Unix format. I had some problems converting it from UTF-16 to UTF-8 because it was a 2.4 GB file, so gedit and other text editors would crash while opening it or while trying to save it. Fortunately, I found a really light and fast text editor named AkelPad (it's for Windows but works in Linux with Wine), which was able to open it without crashing and save it as a new file in UTF-8 format. From 2.4 GB it shrank to 1.2 GB (50%!). Then I was able to compress it with dictzip to a 338 MB file. :)
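
A halving like that is what one would expect for text made up mostly of ASCII characters: UTF-16 spends two bytes on each of them, UTF-8 only one. The small Python sketch below illustrates the arithmetic on two made-up sample strings (the helper name `encoded_sizes` is hypothetical). For Cyrillic text the gain is much smaller, because those letters take two bytes in both encodings, so the real saving depends on what the dictionary contains.

```python
def encoded_sizes(text):
    """Return (bytes as UTF-16 with CRLF, bytes as UTF-8 with LF).

    `text` is expected to use "\n" line endings; BOMs are not counted.
    """
    utf16_crlf = len(text.replace("\n", "\r\n").encode("utf-16-le"))
    utf8_lf = len(text.encode("utf-8"))
    return utf16_crlf, utf8_lf


print(encoded_sizes("word\n"))   # (12, 5): ASCII shrinks by more than half
print(encoded_sizes("слово\n"))  # (14, 11): Cyrillic gains only on the EOL
```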
the_cla5h
 
Posts: 15
Joined: Sun Mar 09, 2014 1:38 am

