
Calibre - Ebook creation/conversion/editing - Tutorial

FC1, I just did an experiment.
<snip>

Sounds like the downloaded information got put into the epub.

I'd manually check the content.opf or metadata.opf (same file, different name; container.xml points to it), and if the data is there then I'd consider that a success. Though if the transferred data travels with the book to a completely different system, that suggests it's in the OPF file.
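
For anyone who would rather script that check than unzip the book by hand, here is a minimal Python sketch. It assumes a standard EPUB layout where META-INF/container.xml names the OPF file; the function names are made up for illustration and this is not how calibre itself does it.

```python
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
DC_PREFIX = "{http://purl.org/dc/elements/1.1/}"


def find_opf_path(epub):
    """Return the OPF path that META-INF/container.xml points to."""
    root = ET.fromstring(epub.read("META-INF/container.xml"))
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    return rootfile.get("full-path")


def read_dc_metadata(epub_file):
    """List (tag, text) pairs for every Dublin Core element in the OPF."""
    with zipfile.ZipFile(epub_file) as epub:
        opf = ET.fromstring(epub.read(find_opf_path(epub)))
    return [
        (el.tag[len(DC_PREFIX):], (el.text or "").strip())
        for el in opf.iter()
        if el.tag.startswith(DC_PREFIX)
    ]
```

Calling `read_dc_metadata("book.epub")` (the file name is a placeholder) should list the title, creator, description and so on, if they made it into the OPF.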

One question. Did you manually copy the information into Calibre, or was it using a built-in tool/option to download that information based on book title/author information?
 
I don't recall for the book I spoke about earlier. It was from an author who sells her stuff at Amazon, so I likely downloaded the info.

However, I just did another test on a Word document I had in calibre where all the metadata was manually input. I had made this Word document from an anthology epub because I wanted this story as a standalone.

I verified that those fields in the Word document (File/Info tab) were blank.

I made an epub, copied it, then added the copy to calibre as I did previously.

Every single field in the calibre metadata editor window was populated with the exact same info as was originally present in the Word document version, including the custom cover I made.

Here's the epub I made from the Word document.
 


I usually just cut and paste the data from Goodreads, because downloading metadata is slow, and inconsistent, and I usually prefer their presentation.
Series *
Title
Author
Ratings info *
summary
Genres *
Page Count *

I just wish their page count was with the rating summary (on top), and...
The things that matter more to me are starred.
I prefer the ratings in the write-up because I use Calibre's Ratings field both to see if I've read a book and to record my own rating, so I can quickly see my opinion of an author's writing.
 
When you say slow and inconsistent, do you download a batch of books or individually?

I've noted that the first choice calibre presents isn't always the correct book, so I've stopped doing batch downloads.

I don't find that it takes too long for an individual book, but I've got a high speed connection.

On a slightly separate note, I think it was you that said calibre developed a very slow response time once the library got too big.

I noticed something the other day that I hadn't paid any attention to before. There's an "FT" icon to the left of the gear icon in the search bar. If you click on it, it shows the settings for indexing the library. Indexing lets the search look inside the books themselves rather than just at the metadata.

The settings warn that fast indexing will slow down calibre's response, but perhaps even the slow indexing is affecting your performance.

I typically only use the search function for authors or tags, so I don't really need an indexing of my library.
 
The more important things to me starred
If we find what the other tags are named in the OPF file, I can start taking advantage of those for metadata addition in my work, and update the epub post. I can't make heads or tails of the technical specs, so I stopped trying. I dislike how hard it has become to find the information I want.
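
One way to find out what the tags are called is to dump them from a book calibre has already written. The sketch below is only that, a sketch: the sample OPF is invented, and the `calibre:series` / `calibre:series_index` names are what I'd expect to see in OPF 2 style files written by calibre, so check them against one of your own books.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample; a real OPF also carries a manifest and a spine.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Example Title</dc:title>
    <dc:creator>Example Author</dc:creator>
    <dc:subject>Fantasy</dc:subject>
    <dc:subject>Novella</dc:subject>
    <dc:description>Example summary.</dc:description>
    <meta name="calibre:series" content="Example Series"/>
    <meta name="calibre:series_index" content="1"/>
  </metadata>
</package>"""


def summarize(opf_text):
    """Return genre tags (dc:subject) and name/content <meta> pairs."""
    metadata = ET.fromstring(opf_text).find("opf:metadata", NS)
    return {
        "tags": [el.text for el in metadata.findall("dc:subject", NS)],
        "meta": {
            el.get("name"): el.get("content")
            for el in metadata.findall("opf:meta", NS)
            if el.get("name")  # skip <meta> entries without a name attribute
        },
    }
```

Genres land in repeated `dc:subject` elements, while anything that has no Dublin Core home (series, series index) shows up in the `meta` dictionary.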
 
I was the one who mentioned the large-library slowdown. I did not know of the FT icon and have never used it, so it's not that.
As far as downloading metadata goes, I've never tried the mass update because of the poor results on single books, but I'm glad you mentioned it because I was about to give it a try 'when I found it again'.
Now I won't.
Mostly I guess the slowness came from attempting to find book covers, but I was so disgusted with its inconsistencies that I gave up on it long ago.
 
Merging and Splitting files

In a number of instances I see splits that seem to be in the wrong place (giving a title/chapter name its own separate page), or maybe the file is just too big. Calibre complains if a file exceeds 200k. Yeah, a 200k chapter is a little long. But chapters (or books) can be split by Calibre: right-click within the text of whatever HTML file you want to split.

split1.png


Splitting a file is fairly easy, though the wizard is less than great at explaining how it determines the rules, so when you select the header type you might be splitting only on specific header levels, or on all of them. For simplicity, split at the headers, specifically //h:h1 or //h:hN where N is the level you want to split at (probably 1-3). In my OCRs I usually use h1 and h3, and rarely h2 unless there are multiple books.
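
To preview what an expression like //h:h1 would match before letting the wizard loose, a small Python sketch along these lines can help. The sample XHTML is made up, `h` is bound to the XHTML namespace, and because Python's ElementTree only supports a subset of XPath the expression is written as `.//h:hN` here.

```python
import xml.etree.ElementTree as ET

NS = {"h": "http://www.w3.org/1999/xhtml"}

# Made-up sample standing in for one HTML file inside the book.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Sample</title></head>
<body>
  <h1>Book One</h1>
  <p>Intro text.</p>
  <h3>Chapter 1</h3>
  <p>Story text.</p>
  <h1>Book Two</h1>
  <h3>Chapter 1</h3>
  <p>More story text.</p>
</body>
</html>"""


def headings(xhtml_text, level):
    """Return the text of every heading at the given level, in document order."""
    root = ET.fromstring(xhtml_text)
    return [el.text for el in root.findall(f".//h:h{level}", NS)]
```

With this sample, `headings(SAMPLE_XHTML, 1)` finds the two book titles and `headings(SAMPLE_XHTML, 2)` finds nothing, which is the situation where splitting on h2 would do nothing at all.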

split_at.png


The result is as you see. (When splitting, I had 3 books/chapters but only 2 showed...)

split_result.png




Merging is a pretty simple affair: select the files you want to merge, then select the 'head', which is probably the first one in the group. The order defaults to the file order in the 'file browser' (which determines the order of contents in the spine/TOC).


merge_1.png


merge_2.png


This will result in a merged file. Some minor head or tail cleanup may be needed; just remove or adjust as needed. Calibre is pretty good at keeping bookmarks and links consistent, so if a filename changed, a link will point to the id it last had, but in the other file. The same goes if you rename the files, perhaps to specify chapters or books or stories.

Splitting files has the good result of separating the chapters, forcing new pages at the splits rather than endless rolling text.
 
As long as we're talking about calibre conversion anomalies, one thing I see from time to time is that some of the letters that end up in the converted Word document are not recognized by spell check as being those letters.

They look perfectly fine, but just about every word in the document has the dotted red line under it, indicating a spelling error. It's not the country (language) selection in the Word document; it's just that the letters don't show up in the spell check tool as their actual value.

Typically, it's the vowels that are problematic, along with t, l, m, p.

I simply copy one of those letters, paste it in the search box, type the same letter in the replace box, and do a global search and replace. Annoying, but typically not a long process to fix.

If I see it in one book by an author, I will likely see it in most or all of their works. Which makes me think it might have to do with how they generated the original source document (what software they used) and if they used any typesetting software to prep it for publication.

It's not Reedsy, which is a company I've seen mentioned by various authors, since their books don't have this issue.

If I can recall a book that had this problem, I'll let you know.
OK, here's an example of an epub that has the spelling anomaly.

This one was just the letter "o" that I had to do a global replace to fix.

1737741733903.png
 


Mhmm


ASCII: Hodges -> 48 6F 64 67 65 73
Book:  Hοdges -> 48 CE BF 64 67 65 73

raw (CE BF): 1100 1110 1011 1111
UTF-8 marker bits to strip: (110) 01110 (10) 111111
Real code point: 011 1011 1111 -> 3BF

Code     Glyph   Decimal   Description                  #
U+03BF   ο       959       Greek Small Letter Omicron   0431
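
The same bit-stripping can be double-checked in a few lines of Python. This is just a worked version of the arithmetic above for the two-byte case; `decode_two_byte` is a name made up for the example.

```python
import unicodedata


def decode_two_byte(raw):
    """Strip the 110xxxxx / 10xxxxxx marker bits from a 2-byte UTF-8 sequence."""
    lead, cont = raw
    if lead >> 5 != 0b110 or cont >> 6 != 0b10:
        raise ValueError("not a two-byte UTF-8 sequence")
    # Keep the low 5 bits of the lead byte and the low 6 bits of the second byte.
    return ((lead & 0b00011111) << 6) | (cont & 0b00111111)


code_point = decode_two_byte(bytes.fromhex("CEBF"))
print(hex(code_point), unicodedata.name(chr(code_point)))
```

Python's own decoder agrees: `bytes.fromhex("CEBF").decode("utf-8")` gives the same character.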

Heh, fun fun... Yeah, that could be annoying. Agreed, bulk replace is the answer.

Probably make a list of characters with this conversion issue and I'll incorporate it in my scripts, or make a script to handle it.
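
A rough sketch of what such a script could look like in Python. The mapping is a small illustrative starter list, not a complete one, and the only entry confirmed in this thread is the Greek omicron; the others are common look-alikes I'm assuming might turn up. It would also mangle a book that legitimately contains Greek or Cyrillic text, so run the report first.

```python
import unicodedata
from collections import Counter

# Illustrative, incomplete mapping of look-alike letters to plain ASCII.
LOOKALIKES = {
    "\u03bf": "o",  # Greek small letter omicron (the one found above)
    "\u0430": "a",  # Cyrillic small letter a (assumed candidate)
    "\u0435": "e",  # Cyrillic small letter ie (assumed candidate)
    "\u043e": "o",  # Cyrillic small letter o (assumed candidate)
    "\u0440": "p",  # Cyrillic small letter er (assumed candidate)
}


def find_suspects(text):
    """Report every non-ASCII letter with its Unicode name and count."""
    counts = Counter(ch for ch in text if ord(ch) > 127 and ch.isalpha())
    return {ch: (unicodedata.name(ch, "UNKNOWN"), n) for ch, n in counts.items()}


def fix_lookalikes(text, table=LOOKALIKES):
    """Replace each mapped look-alike with its ASCII twin."""
    return text.translate(str.maketrans(table))
```

`find_suspects` builds the list of offenders for a given book, and anything it reports that really is a disguised Latin letter can be added to the mapping before running `fix_lookalikes`.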
 
WINRAR 7.01
I think this is the closest thing we have to a tech discussion thread - so I'm posting my question here
If anyone has a better location for it please let me know.

On occasion I'm having problems adding files to my individual authors' *.rar/*.zip files (which is where/how I store books I haven't added to my library).
The problem is that every once in a while (maybe 1 in 500) when I add files, the archive gets ?destroyed?
All the files disappear and a generic-looking archive replaces it.
Has anyone seen this? If so, do you know of a way to recover from it?

Currently, as an author issues a take-down request, I'm adding all that author's books to my library so I don't lose them, but being anal retentive, this involves a bit of work: ensuring genres and descriptions are included with the book.
 
Merging and Splitting files

In a number of instances I see splits that seem to be in the wrong place (giving a title/chapter name its own separate page), or maybe the file is just too big.
~snip~

Not sure if this is why that happens, but in Word, there is a formatting command in the paragraph section to "page break before".

This can be a manual formatting command, or included in a Style.

I've seen some tutorials on how to prep a book for Amazon that suggest using this formatting command to generate page breaks for your chapters.

I've also seen ebooks that I've converted to Word, using calibre, have the page break command included in the styles that get assigned by calibre.
 
I did a quick google on "file add to rar archive bug" and the AI Overview made one statement that may be germane.

Solid Archive:
Disable "Solid Archive" option: If you want to add files to an existing archive, make sure the "Solid Archive" option is disabled when creating the archive.
 
I used to convert documents into .epub with Calibre, but I was always annoyed by some quirks, such as all the classes the software put everywhere, and some weird spans that would pop up. I discovered Pandoc through a user from the 8muse forum (named Venner).

So I figured I'd suggest you try writing your docs in the markdown format (.md), which is basically an enhanced .txt format, and using pandoc to convert them into .epub.
I just recently figured out how to get the tags recognized automatically as such: you need to use a YAML Frontmatter, and use the template:

YAML:
subject:
- "tag"

YAML:
---
title: Mastery
creator:
- role: author
  text: Erenisch
- role: illustrator
  text: Erenisch
- role: editor
  text: Cunegonda
- role: editor
  text: Venner
- role: editor
  text: Wivers
- role: publication-place
  text: https://subscribestar.adult/erenisch
date: 2024-03-28
cover-image: Mastery_Illustrations/_CoverMastery.jpg
belongs-to-collection: Erenischverse
group-position: 01
description: "A young man buys himself a sex-slave, and get closer to his female neighbor"
subject:
  - "Novella"
  - "Bdsm"
  - "Maledom"
  - "Femsub"
  - "Slave"
  - "Illustrated"
...

I'm attaching the CSS file I made in a .zip (from the basic Pandoc-made file)
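
If you want to script the conversion itself, here is a minimal Python wrapper around the pandoc command line. It assumes pandoc is installed and on your PATH and that the YAML block sits at the top of the .md file; the function names and the file names in the usage note are placeholders, not the actual files from this post.

```python
import subprocess


def pandoc_epub_command(markdown_path, epub_path, css_path=None):
    """Build the pandoc argument list for a Markdown to EPUB conversion."""
    command = ["pandoc", markdown_path, "-o", epub_path]
    if css_path:
        # --css attaches a custom stylesheet to the generated EPUB.
        command.append(f"--css={css_path}")
    return command


def convert(markdown_path, epub_path, css_path=None):
    """Run pandoc and raise CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_epub_command(markdown_path, epub_path, css_path), check=True)
```

For example, `convert("Mastery.md", "Mastery.epub", "epub.css")` would run `pandoc Mastery.md -o Mastery.epub --css=epub.css`.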
 

Attachments

  • PandocW.zip