
[Mags Inc/Reluctant Press] Sissy / Feminization

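Upside-down illustrations are a quirk of how PDF stores and draws images. One of the image compression methods it uses (DCT) is the same one JPEG files use, so the stored image data is effectively a regular JPEG, but PDF positions every image with its own transformation matrix, and some PDF generators store the picture flipped and let that matrix turn it right side up on the page. If you extract the image without reencoding, for maximum possible quality, the extractor hands you the data exactly as stored, and the result is an upside-down image.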

There are other quirks in the way PDFs encode images. Transparency is not included in the image itself, but in a separate "transparency map" (PDF calls it a soft mask), which is extracted as a separate grayscale PNG image. Also, PDFs often store images as CMYK JPEGs, which many viewers and editors misread, so they come out in weird colors, vaguely similar to a color photo negative, when extracted without reencoding.

Blame Adobe. I certainly do.
 
I appreciate you sharing that, as I couldn't figure out how to convert without it happening. I tried multiple ways in different converting tools, to no avail. The only way to ensure it didn't happen was to screenshot every page and copy-paste into Word, adding the images/illustrations manually, but for hundreds or more conversions that was too much. So I learned to put up with the upside-down images; I wasn't that bothered as long as the text/story came out correctly. I didn't consider that some are more discerning than I am when I thought to share what I have, and now I wish I had kept the original PDF files for those who prefer them, but I have over 5000 ebooks in my folders, so keeping originals wasn't part of my plan. Hopefully the old archives at the start of these threads are still up, and anyone who wants a certain story in the original format can download and go through them. Personally I found it horrid trying to read them in the old image form of PDF, hence the converting to EPUB. I never understood why the illustrations did that till now, so thank you.
 
There are ways to work around the issue, but they involve a little bit of work. My current workflow to deal with upside-down images from PDFs goes as follows:
1. Extract the images using the pdfimages command-line tool, parameter "-all" (this demands the Poppler version of the pdfimages tool)
2. If the image is too small and/or has noticeable compression artifacts (aka defects), I run it through Upscayl to clean up the defects and/or upscale it to a more desired size. Upscayl will save images as PNG by default, so that gives a nice, clean image for further processing.
3. Basic processing (resizing/flipping, sometimes cropping, and file compression to a more compact format such as JPEG) can be done either with batch tools (I mostly use XnviewMP, but ImageMagick is quite useful too, although it requires you to learn command-line syntax), or in more complicated cases by hand in a full image editor (I use GIMP, but there are plenty of good free editors around, so there's no need to cough up an Adobe subscription).
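
For anyone who would rather script it, steps 1 and 3 above could be sketched roughly like this in Python. This is only a sketch under assumptions: the function names and file names are made up for illustration, it assumes Poppler's pdfimages and ImageMagick 7's magick command are installed and on the PATH (older ImageMagick 6 installs use convert instead), and the helpers only build the command lines until you pass one to run().

```python
import shutil
import subprocess

def extract_cmd(pdf_path, out_prefix):
    """Command line for Poppler's pdfimages; -all keeps each image in its stored format."""
    return ["pdfimages", "-all", pdf_path, out_prefix]

def flip_cmd(src, dst):
    """Command line for ImageMagick; -flip mirrors the image top to bottom."""
    return ["magick", src, "-flip", dst]

def run(cmd):
    """Run a command only if the tool is installed; return True if it ran."""
    if shutil.which(cmd[0]) is None:
        print("not installed:", cmd[0])
        return False
    subprocess.run(cmd, check=True)
    return True

if __name__ == "__main__":
    # Hypothetical file names, purely for illustration.
    print(" ".join(extract_cmd("book.pdf", "img")))
    print(" ".join(flip_cmd("img-000.jpg", "img-000-fixed.jpg")))
```

Note that writing the flipped image back out as JPEG does reencode it, so if quality matters it may be worth saving to PNG at this stage and compressing only once at the very end.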

Transparency is a different matter. As I mentioned, PDFs store transparency as a separate image. You *can* reapply it by hand using GIMP or similar, but it's fiddly and demands experimenting with your tool to achieve good results. Instead, I use ImageMagick to merge the transparency map from the command line.
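
To make "merging the transparency map" a bit more concrete, here is a small Python sketch. The merge_cmd() helper builds one commonly used ImageMagick recipe (CopyOpacity compositing); that is an assumption about a reasonable command, not necessarily the exact one used in the workflow above. The apply_mask() helper shows the same idea on plain pixel lists: every grayscale value in the mask becomes the alpha value of the matching pixel. All names are hypothetical.

```python
def merge_cmd(image, mask, out):
    """ImageMagick command line that uses a grayscale mask as the image's alpha channel."""
    return ["magick", image, mask, "-alpha", "off",
            "-compose", "CopyOpacity", "-composite", out]

def apply_mask(rgb_pixels, mask_pixels):
    """Attach each mask value (0 = fully transparent, 255 = opaque) as the alpha of a pixel."""
    if len(rgb_pixels) != len(mask_pixels):
        raise ValueError("image and mask must have the same number of pixels")
    return [(r, g, b, a) for (r, g, b), a in zip(rgb_pixels, mask_pixels)]

if __name__ == "__main__":
    # Two pixels: a red one made transparent, a green one kept opaque.
    print(apply_mask([(255, 0, 0), (0, 255, 0)], [0, 255]))
    print(" ".join(merge_cmd("img-000.png", "img-001.png", "merged.png")))
```

The output has to be a format that supports transparency, such as PNG; saving the merged result as JPEG would throw the alpha channel away again.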

Oh... another thing ImageMagick is good for, when dealing with images extracted from PDFs: in some cases the software that generated the PDF will slice images and store them as two or more separate images. Instead of joining them by hand in GIMP, I use the -append command in ImageMagick to do the job. Much quicker.
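
A rough sketch of the slice-joining step, again in Python with made-up names. In ImageMagick, -append stacks images top to bottom and +append places them side by side; which one you need depends on how the PDF generator sliced the picture. The stack_rows() helper just illustrates what a vertical join means on raw rows of pixels.

```python
def append_cmd(parts, out, vertical=True):
    """ImageMagick command line joining image slices in the order given."""
    op = "-append" if vertical else "+append"
    return ["magick", *parts, op, out]

def stack_rows(*slices):
    """Join slices top to bottom; each slice is a list of pixel rows."""
    joined = []
    for rows in slices:
        joined.extend(rows)
    return joined

if __name__ == "__main__":
    # Hypothetical slice names as pdfimages might number them.
    print(" ".join(append_cmd(["img-004.png", "img-005.png"], "page-art.png")))
    print(stack_rows([[1, 1]], [[2, 2], [3, 3]]))
```

If the slices come out in the wrong order, swapping the file names on the command line is usually all it takes.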

(Sometimes it seems too much trouble to work with the images in their original format, like in very old PDFs, which use obsolete internal formats originally created for fax machines, or in the weird-colored JPEGs I mentioned before. In those cases, I extract the images using the -png parameter instead of -all. Those weird obsolete formats will result in an upside-down, negative black-and-white image, but that's easy to fix with XnviewMP.)
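
What "fixing" such an image amounts to can be shown with a tiny Python sketch on a grayscale image held as rows of numbers. The function name is hypothetical and real tools do this on actual files; it is only meant to show that the repair is two simple operations, reversing the row order and inverting each pixel.

```python
def fix_fax_image(rows, max_value=255):
    """Undo an upside-down negative: reverse the row order and invert every pixel."""
    return [[max_value - value for value in row] for row in reversed(rows)]

if __name__ == "__main__":
    broken = [
        [0, 255],    # stored first, but really the bottom row of the picture
        [255, 255],  # stored last, but really the top row
    ]
    print(fix_fax_image(broken))
```

With ImageMagick the equivalent would be combining the -flip and -negate options in one command.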
 
it might not be to everyone's taste due to occasional errors

That's not me.jpg
 
I will be honest and tell you a lot of that went over my head (by a long shot!), but I appreciate anyone willing to share something knowledgeable whether I understand it or not 🫶 I find it strange that they protected those old PDF files to such an extent, though. Or perhaps it was an unintended side effect of the way they created them? The way you understand the makeup of those PDF files makes me wonder if you know why some punctuation marks come out as weird symbols in conversion. Is that DMCA or just a glitch?
 
One of my favorite free apps these days is PDFGear. Not only is it a good reader app, it also allows me to flip images/edit pages without restriction and save the edited file as I like it. I also find PDF24 to be another great free app, but I use it primarily for image extraction. I have never had great success converting my PDFs to EPUB though...
 
Probably not any form of DRM. It's just that... well... PDF was never intended to be an *editable* file format, it was intended to be a *final*, *immutable* format. Sort of an electronic version of a printed page. You edit your document in Word, InDesign, whatever. When you are happy that is *exactly* what you want the reader to see, you "print" it as a PDF.

Originally, PDF simply did not take *any* steps to make it easy to edit afterwards. Its main concern was to make sure that each letter, dot and picture looks *exactly* how the author intended, in *exactly* the place he intended. It didn't even embrace concepts such as "words" and "paragraphs." It put letters in precise positions on the page. It didn't have to use whitespace characters such as "space" and "tabs" -- it could just move the letters a bit to the right. Also, there is no rule that says that the letters have to be listed inside the PDF in the same order the reader will interpret them. Some software makes a right mess of that when generating PDFs. Is it intentional, as a sort of primitive DRM to make it hard to extract the text? Maybe, I don't know.
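
Here is a toy Python sketch of what a converter has to do with letters that are merely positioned on the page. The data layout (x position, width, character) and the gap threshold are assumptions made up for illustration; real extractors work with font metrics and are a lot more careful. The idea is simply to sort the letters by position and guess a space wherever the gap is noticeably wide.

```python
def glyphs_to_text(glyphs, gap_ratio=0.3):
    """Rebuild one line of text from (x, width, char) tuples given in any order."""
    ordered = sorted(glyphs, key=lambda g: g[0])  # reading order = left to right
    out = []
    prev_end = None
    for x, width, ch in ordered:
        # A gap wider than a fraction of the letter width is guessed to be a space.
        if prev_end is not None and x - prev_end > gap_ratio * width:
            out.append(" ")
        out.append(ch)
        prev_end = x + width
    return "".join(out)

if __name__ == "__main__":
    # Letters deliberately listed out of order, with no space characters at all.
    scrambled = [(14, 5, "y"), (0, 5, "H"), (19, 5, "o"), (5, 5, "i")]
    print(glyphs_to_text(scrambled))
```

Pick the threshold too small and words get split in the middle; too large and words run together, which is the kind of glitch that shows up in converted files.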

More recent versions of the PDF standard include support for things such as reflowable text -- but that does not necessarily mean that any particular PDF file *uses* those features. So, converting a PDF to an editable format such as Word involves a lot of educated guessing on the part of the software, to figure out which line breaks are also paragraph breaks and such. It's not really different from what OCR software does to convert an image into text -- it just skips the first step of the process, that is, identifying the individual characters. All the rest -- figuring out words, lines, paragraphs etc. -- is essentially the same.
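
As an illustration of that educated guessing, here is one very simple heuristic in Python: treat a line that stops well short of the full line width as the end of a paragraph. The function name and the 0.8 ratio are assumptions for the sake of the example, and real converters combine many more clues (indentation, punctuation, vertical spacing).

```python
def join_lines(lines, short_ratio=0.8):
    """Group extracted lines into paragraphs; a clearly short line ends a paragraph."""
    if not lines:
        return []
    width = max(len(line) for line in lines)
    paragraphs, current = [], []
    for line in lines:
        current.append(line.strip())
        if len(line) < short_ratio * width:
            paragraphs.append(" ".join(current))
            current = []
    if current:  # text that never hit a short line still counts as a paragraph
        paragraphs.append(" ".join(current))
    return paragraphs

if __name__ == "__main__":
    page = [
        "The quick brown fox jumps",
        "over the lazy dog.",
        "A second paragraph starts",
        "here and keeps going on",
        "to the end.",
    ]
    for paragraph in join_lines(page):
        print(paragraph)
```

A heuristic like this breaks down on short lines of dialogue or on paragraphs whose last line happens to be full width, which is why converted text so often has paragraph breaks in odd places.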

Regarding punctuation marks... PDFs can also contain a copy of the font you want the reader to see. Nowadays most of those fonts follow the Unicode standard, so the character number for, say, a right-double-angle-quote in a font will be the same as for thousands of other fonts.
But that didn't use to be true. Back in the Dark Ages of the Nineties, there were all sorts of different encodings for characters beyond the (very) basic ASCII set. If you generated your PDF on a Mac with a Mac font, it would look right on a PC because Acrobat Reader would look up the characters in the embedded font... but if you extracted a raw copy of the text, it would look similar to an old Mac txt file opened in Windows Notepad.
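
The Mac-text-in-Notepad effect is easy to reproduce in Python, which ships with both the old Mac and the Windows encodings. The function names below are made up; the codecs ("mac_roman" and "cp1252") are standard ones. The sketch writes text using the Mac byte values and then reads those same bytes back the way Windows would.

```python
def as_seen_on_windows(text):
    """Encode text the old Mac way, then misread the bytes as Windows-1252."""
    return text.encode("mac_roman").decode("cp1252")

def repair(garbled):
    """Reverse the mistake: recover the bytes, then read them as Mac Roman."""
    return garbled.encode("cp1252").decode("mac_roman")

if __name__ == "__main__":
    original = "\u201cDon\u2019t\u201d"  # curly double quotes around Don't with a curly apostrophe
    garbled = as_seen_on_windows(original)
    print(garbled)
    print(repair(garbled) == original)
```

The curly quotes turn into accented capital letters, which is a typical symptom in converted files. The repair only works when you know (or correctly guess) which two encodings were mixed up.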

An additional complication (which I have seen mentioned) is ligatures. Some fonts have special letterforms for things like ff, fi, fl and such, to make the text look better. In old files, again, those were not well standardized. The software pointed to the right positions in the embedded font and it looked right, but the conversion software may have a hard time figuring out which characters correspond to that particular code point.
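
A small Python sketch of how ligatures can be expanded back into plain letters after extraction. The table covers the standard Unicode ligature characters; the extra_map argument is a hypothetical hook for the font-specific codes found in old files, which you would have to work out by eye for each book.

```python
# Standard Unicode ligature characters and the letters they stand for.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
}

def expand_ligatures(text, extra_map=None):
    """Replace ligature characters with plain letters; extra_map adds per-book codes."""
    table = dict(LIGATURES)
    table.update(extra_map or {})
    return text.translate(str.maketrans(table))

if __name__ == "__main__":
    print(expand_ligatures("\ufb01nd the o\ufb03ce"))
    # Hypothetical old file where the fi ligature came out as control character 0x1F.
    print(expand_ligatures("\x1fsh", {"\x1f": "fi"}))
```

The keys in extra_map must be single characters, since the replacement is done character by character.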

To sum it up: all the conversion problems stem from the basic fact that PDFs were not designed to be modified after being finalized. It's not deliberate DRM, it's more like... well... the innards of your TV. It is not deliberately designed to make it easy to separate the individual components and assemble a different product. It's just that offering that possibility would make the circuit board bulkier, more expensive and more prone to issues. People want a TV, not a box of electronic Legos.
 
That makes sense really. As the technology has advanced, formats have developed to allow editing and multiple viewing options. It's just frustrating, as some of those authors were really good writers and too many tales are being lost to time, I guess. Hopefully not all will be forgotten, and others will share either the originals or converted (hopefully more skillfully than mine) versions. It's one of the reasons I love this forum, as there are so many willing to share and enjoy what others share. I also get to learn more at the same time, as I had no idea about converting PDF files till I started interacting on here.
 
That's not the only technical limitation on those old PDFs. Low-numbered Reluctant Press PDF titles (below #300 or so) have really crummy illustrations. The thing is, sometimes a first print of those stories will show up, and the images were actually quite nice. In fact, there was a user in the old forums who had a bunch of those really old books and a semi-pro scanner rig, and who contributed a bunch of rescanned images. Here's what we figured out happened:

1. Reluctant Press originally published those stories back in the late Seventies / early Eighties. That was before affordable desktop publishing existed, before affordable laser/inkjet printers existed in fact. They were made the old-fashioned way, fanzine style, with typed text glued onto a board, and the pictures also glued there, before being either xeroxed (for low print runs) or made into cheap offset printing plates by a photographic process (for larger print runs). Those original boards were eventually discarded or lost, as well as the original artwork.

2. When selling through BBSs and later the Internet became a viable business model, those stories were scanned to be turned into PDFs. The thing is, back then (early Nineties) good flatbed scanners were still very, very expensive. Most people could only afford cheap handheld scanners, sometimes limited to monochrome (pure black and white, no grey tones). Those were enough to scan text and feed it through OCR software (expensive, but there was a lot of piracy going around, so maybe not). They were NOT good enough to capture all the lines in Brian Dukehart's (the most prolific RP illustrator) or Puyal's work.

A few books have better-looking images, however. My guess is that those were scanned at a later date, with better hardware. Maybe RP decided to redo them, or maybe they were skipped originally.

Anyway, that poor quality of the original scans is likely the main driving reason for Mags/RP republishing those old books with new illustrations.
 
I have seen a few of those with almost monochrome illustrations, like no colour at all, so now I know why! I thought they were just copies of a black and white drawing that came out bad in the electronic scan, but what you said explains it. To be honest, I gave up going through the archives I first got from here, partly because they were so large, but also because so many were badly scanned and almost unreadable even after screenshot and OCR. I tried to convert some, and tried to recreate others manually (through Word and copy/paste), but it was so time consuming that I gave up, as there is a huge variety available in modern formats. So if nobody wants them enough they will eventually be lost, which is a shame, as I have read many of those old ones and really enjoyed them even with all the older types of language. They were the pioneers really, and all we have now came from what was then 🤷‍♂️
 
So if nobody wants them enough they will eventually be lost, which is a shame, as I have read many of those old ones and really enjoyed them even with all the older types of language. They were the pioneers really, and all we have now came from what was then 🤷‍♂️
And if SOMEbody wants them ... :unsure:
 
I have some of Bea, Stella and Tiffany which have been converted to EPUB, but they are all files I got from here, I'm sure, and one or two of the converted files have glitched letters or something, as the EPUB versions show weird signs instead of certain letters or punctuation marks. I am still able to read and follow the stories, and most converted fine, so I can post what I have if you like?

I have included an image/screenshot of the error I mean, but it only happened in a couple of difficult PDF to EPUB conversions. There is also occasionally an odd spacing issue where a few sentences show only one word per line; the words are still in the correct order, and it happens rarely and doesn't affect the reading. I have 26 EPUBs, a mix of the 3 authors.

These types of errors can be corrected in Calibre's EPUB editor. Typically they are caused by "smart quotes" that don't convert well.
In the editor, highlight one of the stray ? characters (or whatever symbol shows up) in a given position, opening or closing, copy it into the Find box, put the appropriate non-smart punctuation in the Replace box, and use Replace All. So the opening ? becomes a plain ", and then you do the same for the closing one.
In your example, for the first instance, highlight the ?, put it in Find, put ' in Replace, and Replace All will fix them throughout that portion. Rinse and repeat for each portion.
Yes - very tedious!
Usually I only do this with Reluctant Press type books from the authors I really like, after I convert them to EPUB, since I am also removing the headers and footers from every page so the story isn't repeatedly interrupted at the end of every page with the BS title/author/threat/infomercial, whatever.
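
If you end up doing this for many books, the same find-and-replace can be scripted. Below is a minimal Python sketch working on plain text; the function name and the example garbage character are hypothetical, and you would still have to look at each book to see which odd symbol stands for which punctuation mark.

```python
# Smart punctuation mapped to its plain equivalent.
SMART_TO_PLAIN = {
    "\u201c": '"',  # opening double quote
    "\u201d": '"',  # closing double quote
    "\u2018": "'",  # opening single quote
    "\u2019": "'",  # closing single quote / apostrophe
}

def straighten(text, extra_map=None):
    """Replace smart quotes, plus any per-book garbage characters, in one pass."""
    table = dict(SMART_TO_PLAIN)
    table.update(extra_map or {})
    return text.translate(str.maketrans(table))

if __name__ == "__main__":
    print(straighten("\u201cDon\u2019t,\u201d she said."))
    # Hypothetical book where the apostrophe came out as the character U+00D5.
    print(straighten("Don\u00d5t", {"\u00d5": "'"}))
```

As with the manual method, a symbol that was used for more than one punctuation mark still has to be fixed by hand, because a blanket replace cannot tell the cases apart.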
 
Now I wish I had kept the original PDF files, or even the Word documents I used to copy-paste into, but at least I know for future attempts with similar issues, so thanks for that! It seems so simple once someone else points out a solution, as Word has the same find and replace you mean, but I have Calibre on my laptop if necessary 🫶
 
Since a search for Sasha Scott turns up only one other thread mentioning the author, I figured I'd link my full 170-book collection of Sasha Scott. Have a good day :)

 
Does anyone have "Auntie's Sissy Cross Dressed Maid part 1" by Janice Wildflower Gemini?
Do you mean "She Made Me Auntie's Sissy Cross Dressed Maid"? There seems to be just the one part.
 

Attachments

  • She Made Me Auntie's Sissy Cross Dressed Maid  by Janice Wildflower Gemini.pdf
    736.7 KB · Views: 74

Attachments

  • M502e – Auntie’s Sissy Cross Dressed Maid – Series – Story One ≡ Part One – She Made Me Auntie...PDF
    701.5 KB · Views: 67
  • M509e – Auntie’s Sissy Cross Dressed Maid – Series – Story Two ≡ Auntie’s Sissy Cross Dressed ...PDF
    621.9 KB · Views: 59