DeDRM_tools/Topaz_Tools/lib/topaz-readme.txt

Contributors:
cmbtc - removal of drm which made all of this possible
clarknova - for all of the svg and glyph generation and many other bug fixes and improvements
skindle - for figuring out the general case for the mode loops
some updates - for conversion to xml, basic html
DiapDealer - for extensive testing and feedback, and standalone linux/macosx version of cmbtc_dump
stewball - for extensive testing and feedback
and many others for posting, feedback and testing

This is experimental and will probably not work for you, but...

ALSO: Please do not use any of this to steal. Theft is wrong.
This is meant to allow conversion of Topaz books for other book readers you own.

Here are the steps:

1. Unzip the topazscripts.zip file to get the full set of Python scripts.
The files you should have after unzipping are:

cmbtc_dump.py - (author: cmbtc) decrypts and dumps sections into separate files for Kindle for PC
cmbtc_dump_nonK4PC.py - (author: DiapDealer) for use with standalone Kindle and iPod/iPhone Topaz books
decode_meta.py - converts metadata0000.dat to make it available
convert2xml.py - converts page*.dat, other*.dat, and glyphs*.dat files to pseudo xml descriptions
flatxml2html.py - converts a "flattened" xml description to html using the ocrtext
stylexml2css.py - converts stylesheet "flattened" xml into css (as best it can)
getpagedim.py - reads page0000.dat to get the book height and width parameters
genxml.py - main program to convert everything to xml
genhtml.py - main program to generate "book.html"
gensvg.py - (author: clarknova) main program to create an xhtml page with embedded svg graphics

Please note, these scripts all import code from each other, so please
keep all of these Python scripts together in the same place.
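
Because the scripts import from each other, it can help to confirm that
the whole set landed in one directory before going further. The sketch
below is not part of topazscripts.zip; the function name missing_scripts
is made up for illustration, and the file names are simply copied from
the list above.

```python
import os

# Script names copied from the list in step 1 of this readme.
EXPECTED_SCRIPTS = [
    "cmbtc_dump.py",
    "cmbtc_dump_nonK4PC.py",
    "decode_meta.py",
    "convert2xml.py",
    "flatxml2html.py",
    "stylexml2css.py",
    "getpagedim.py",
    "genxml.py",
    "genhtml.py",
    "gensvg.py",
]


def missing_scripts(script_dir):
    """Return the expected script names that are not files in script_dir.

    Hypothetical helper, not shipped with the scripts.
    """
    return [
        name
        for name in EXPECTED_SCRIPTS
        if not os.path.isfile(os.path.join(script_dir, name))
    ]
```

Calling missing_scripts(".") from the directory where you unzipped the
archive should return an empty list if nothing is missing.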

2. Remove the DRM from the Topaz book and build a directory
of its contents as files

All thanks go to CMBTC, who broke the DRM for Topaz - without it nothing else
would be possible.

If you purchased the book for Kindle for PC, you must do the following:

cmbtc_dump.py -d -o TARGETDIR [-p pid] YOURTOPAZBOOKNAMEHERE

However, if you purchased the book for a standalone Kindle or iPod/iPhone
and you know your PID (at least the first 8 characters), then you should
instead do the following:

cmbtc_dump_nonK4PC.py -d -o TARGETDIR -p 12345678 YOURTOPAZBOOKNAMEHERE

where 12345678 should be replaced by the first 8 characters of your PID.

This should create a directory called "TARGETDIR" in your current directory.
It should have the following files in it:
metadata0000.dat - metadata info
other0000.dat - information used to create a style sheet
dict0000.dat - dictionary of words used to build page descriptions
page - directory filled with page*.dat files
glyphs - directory filled with glyphs*.dat files
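
If the later steps fail, a quick look at whether TARGETDIR has the layout
listed above can narrow things down. This is only a sketch: the helper
name check_target_dir is hypothetical, and it checks nothing beyond the
names and patterns given in this readme (it does not look inside the
files).

```python
import glob
import os

# Names and patterns taken from the listing in step 2 of this readme.
EXPECTED_FILES = ["metadata0000.dat", "other0000.dat", "dict0000.dat"]
EXPECTED_DIRS = {"page": "page*.dat", "glyphs": "glyphs*.dat"}


def check_target_dir(target_dir):
    """Return a list of human-readable problems; empty means the layout matches.

    Hypothetical helper, not shipped with the scripts.
    """
    problems = []
    for name in EXPECTED_FILES:
        if not os.path.isfile(os.path.join(target_dir, name)):
            problems.append("missing file: " + name)
    for subdir in sorted(EXPECTED_DIRS):
        pattern = EXPECTED_DIRS[subdir]
        path = os.path.join(target_dir, subdir)
        if not os.path.isdir(path):
            problems.append("missing directory: " + subdir)
        elif not glob.glob(os.path.join(path, pattern)):
            problems.append("no " + pattern + " files in " + subdir)
    return problems
```

An empty list from check_target_dir("TARGETDIR") suggests the dump step
produced what the following steps expect.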

3. REQUIRED: Create xhtml page descriptions with embedded svg
that show the exact representation of each page as an image
with proper glyphs and positioning.

This step must NOW be done BEFORE attempting conversion to html.

gensvg.py TARGETDIR

When complete, use a web browser to open the page*.xhtml files
in TARGETDIR/svg/ to see what the book really looks like.

If you would prefer pure svg pages, then use the -r option
as follows:

gensvg.py -r TARGETDIR

All thanks go to CLARKNOVA for this program. This program is
needed to actually see the true image of each page and so that
the next step can properly create images from glyphs for
monograms, dropcaps and tables.

4. Create "book.html", which can be found in "TARGETDIR" after
completion.

genhtml.py TARGETDIR

***IMPORTANT NOTE*** This html conversion cannot fully capture
all of the layouts and styles actually used in the book,
and the resulting html will need to be edited by hand to
properly set bold and/or italics, handle font size changes,
and fix the sometimes horrific mistakes in the ocrText
used to create the html.

If there are critical pages that need fixed layout in your book,
you might want to consider forcing these fixed regions to
become svg images by using this command instead:

genhtml.py --fixed-image TARGETDIR

This will convert all fixed regions into svg images at the
expense of increased book size, slower loading speed, and
a loss of the ability to search for words in those regions.

FYI: Sigil is a wonderful, free, cross-platform program that
can be used to edit the html and create an epub if you so desire.

5. Optional Step: Convert the files in "TARGETDIR" to their
xml descriptions, which can be found in TARGETDIR/xml/
upon completion.

genxml.py TARGETDIR

These conversions are important for allowing future (and better)
conversions to come later.
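
Since steps 3 to 5 have to run in a fixed order (gensvg.py before
genhtml.py, with genxml.py optional), the ordering can be written down
as a small wrapper. This is a sketch only, not something shipped with
the scripts: build_commands and run_all are hypothetical names, and
launching the scripts through an interpreter called "python" is an
assumption about your setup. The script names and the --fixed-image
option come from the steps above.

```python
import os
import subprocess


def build_commands(script_dir, target_dir, fixed_image=False,
                   with_xml=False, python="python"):
    """Return the command lines for steps 3 to 5, in the required order.

    script_dir is where the scripts were unzipped; python is the
    interpreter name to launch them with (an assumption, adjust as needed).
    """
    def script(name):
        return os.path.join(script_dir, name)

    # Step 3 must come first: genhtml.py relies on its output.
    commands = [[python, script("gensvg.py"), target_dir]]

    # Step 4, optionally forcing fixed regions to become svg images.
    html_cmd = [python, script("genhtml.py")]
    if fixed_image:
        html_cmd.append("--fixed-image")
    html_cmd.append(target_dir)
    commands.append(html_cmd)

    # Step 5 is optional.
    if with_xml:
        commands.append([python, script("genxml.py"), target_dir])
    return commands


def run_all(commands):
    """Run each command in turn, stopping at the first failure."""
    for cmd in commands:
        subprocess.check_call(cmd)
```

For example, run_all(build_commands(".", "TARGETDIR", with_xml=True))
would run all three steps from the directory holding the scripts.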