OT: Scanning pages to PDF file

Please forgive the off-topic post -- I just know that the people here in comp.cad.solidworks are more responsive than in most other Usenet newsgroups, and that someone here probably has a good answer to my question.

I have a fairly large (approx 70 pages or so) and unique document pertaining to WWII historical records that I'd like to scan into an Adobe Acrobat file to put on my Web site. I normally use CorelDraw for purposes such as that, and I can do so this time, but I note that scanned pages usually take up an awful lot of file space, even when formatted as PDF. Does anyone have suggestions as to settings, or software alternatives to CorelDraw, that will give better compression? The scanned pages will initially be in JPEG format, and I've tried a bunch of different options to minimize the file size output by CorelDraw. I've also tried using Adobe Distiller from a PostScript output (.prn) file, but the size seems to be much larger that way.

TIA Mark 'Sporky' Stapleton Watermark Design, LLC

Reply to
Sporkman

how large is it presently? a lot of folk have broadband nowadays so maybe it doesn't matter provided you give the file size with your link... also you could give some sample pages for people to look at and decide if they want to download the whole thing.

Reply to
neil

can your scanned JPG files be reduced down to grayscale colors instead of millions of colors? or does your document have to be in full color?

Hope that helps Steve T.

Reply to
SteveT

Reply to
Sporkman

what if the pictures were reduced in size after scanning - could save a lot of space and look better?

Reply to
neil

Hi Sporkman,

For my 2 cents - don't worry about filesize. Whenever I scan documents to preserve them for posterity I go hi-res all the way. PDF compression can be turned on and turned up (under the advanced options of Acrobat), but it tends to mangle fine details. Remember that in the very near future the filesize of your PDF scans will certainly be considered quaint, even if they are large by today's standards.

Also, my Canon all-in-one fax gizmo has a sheet-fed scanner which will scan to a multi-page PDF directly - really nice feature. You can just load all 70 pages into the hopper and hit go. Works great.

Zander

Reply to
Zander

Umm, maybe, but everything will end up as bitmaps anyway. The text was from a typewriter and I'm sure it was mimeographed. The possibility of getting decent text recognition is low, and it would require a LOT of cleanup, but that would also mean that the document as presented would not be an unaltered copy of the original. Part of the purpose of publishing the thing on the Web is to counter revisionist claims that the whole concentration camp thing was a hoax, or at least greatly exaggerated out of proportion. Visit alt.revisionism to see some of the outrageous denial. There's hardly any single sufficient word for it in English. In German, one would say "Blödsinn" (nonsense or idiocy; here, basically wanton and evil lies). I visited Dachau camp myself 30-some-odd years ago, and the memories are still very vivid today.

'Sporky'

Reply to
Sporkman

I guess you better publish a high quality version then so there is no question it is authentic... perhaps you could break it down into chapters or sections of 10 pages with a simple index for researchers.

Reply to
neil

Reply to
Sporkman

is this group responsive or what? less than an hour and we cracked it.... everyone should have this group in place of an encyclopaedia

Reply to
neil

I was there in 1976, and as you state, no matter how much people have tried to change your mind, once you look into the ovens and the gas chamber, and see the collection of bones that's there, you can not deny that it happened.

Several years ago we had the wonderful opportunity to have a lady come speak at our church - she was a Polish survivor of the camps. My kids didn't necessarily want to go hear some old lady speak, but I told them that they most certainly were going to go, and I expected them to listen well as some day someone would try to make them believe that it never happened. But having met her and talked to her, they knew she was real. And the things she knew could not have been made up.

WT

Reply to
WT

My Canon scanner software does a pretty good job of converting to PDF size-wise, but I think you have to assemble the resulting pages yourself. Of course if you have Acrobat Capture, you'd use that.

I suggest you try bitmap (black/white) GIF for the scan format. JPGs are good for photographic images, but really crummy for things with sharp edges like text or line art. They tend to have blurry artifacts around the edges unless you go really high with the resolution. PDFs can contain either.

Best regards, Spehro Pefhany

Reply to
Spehro Pefhany
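
A rough illustration of the black/white suggestion above, as a minimal Python sketch using only the standard library. It thresholds 8-bit grayscale pixel values to pure black or white and packs them eight to a byte. The names `to_bilevel` and `pack_bits`, the threshold of 128, and the sample pixel values are all assumptions made up for the example; none of this is what the scanner software, PhotoPaint, or CorelDraw actually does internally.

```python
def to_bilevel(gray_row, threshold=128):
    """Map 8-bit grayscale values (0-255) to 1 (white) or 0 (black)."""
    return [1 if value >= threshold else 0 for value in gray_row]


def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, most significant bit first."""
    packed = bytearray()
    for start in range(0, len(bits), 8):
        chunk = bits[start:start + 8]
        chunk += [0] * (8 - len(chunk))  # pad the final byte with zeros
        byte = 0
        for bit in chunk:
            byte = (byte << 1) | bit
        packed.append(byte)
    return bytes(packed)


# Hypothetical scan line: light paper with one dark typewriter stroke.
gray_row = [250, 245, 240, 60, 30, 25, 70, 235,
            248, 251, 249, 252, 250, 247, 246, 250]
bits = to_bilevel(gray_row)
packed = pack_bits(bits)

print(bits)
print(len(gray_row), "bytes as grayscale ->", len(packed), "bytes as bilevel")
```

Even before any compression, one bit per pixel is an eighth of the grayscale data, and the edges of the stroke stay sharp instead of picking up JPEG-style blur.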

Sporky

Go into Adobe Acrobat's help and search for "compression". The first item on the list that follows, "Methods of compression", details the various methods that Adobe uses. "Run Length" lossless compression may be the best method for your purpose. However, reading this particular help section will possibly influence your decision.

John Layne

Reply to
John Layne
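
To show why run-length compression suits black and white text pages, here is a minimal Python sketch of the general idea. It is a toy encoder written for illustration, not Acrobat's actual implementation, and the function names and the sample scan line are assumptions.

```python
def run_length_encode(pixels):
    """Collapse a sequence of pixel values into [value, run_length] pairs."""
    runs = []
    for pixel in pixels:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1          # same value as before: extend the run
        else:
            runs.append([pixel, 1])   # value changed: start a new run
    return runs


def run_length_decode(runs):
    """Expand [value, run_length] pairs back into the original pixels."""
    pixels = []
    for value, count in runs:
        pixels.extend([value] * count)
    return pixels


# Hypothetical bilevel scan line: 1 = white paper, 0 = black ink.
row = [1] * 40 + [0] * 3 + [1] * 12 + [0] * 2 + [1] * 43
runs = run_length_encode(row)

print(runs)
print(len(row), "pixels ->", len(runs), "runs")

# Lossless: decoding gives back exactly the original line.
assert run_length_decode(runs) == row
```

A typed page is mostly long stretches of white, so each scan line collapses to a handful of runs. Grayscale or JPEG-noisy pixels rarely repeat exactly, which is why this approach pays off mainly after converting to pure black and white.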

Really? GIF images, huh? Well, my printer/fax/copier/scanner only does JPEGs, but would it help to convert those to black & white GIF (using Corel PhotoPaint or something similar) before inserting into CorelDraw to create the PDF? Well, anyway, I'll try it and see what results I get. Thanks, Spehro.

'Sporky'

Reply to
Sporkman

Actually I would do it twice. Make one hi-res scan at 600 dpi or better; if it is B/W, a TIF file would be perfect. Burn it on CD just for safekeeping. As far as posting text documents on a website, PDF is the way to go: scan at 400 dpi in text mode (which is black and white, not grayscale) and, depending on your scanner, save to TIF or PDF. If your hardware can't save PDF, open Acrobat and drop the TIFs on top of it; Acrobat will convert them to separate pages in one PDF.

Now here is the trick: once you have created the PDF (you may want to save an extra copy for the usual reasons), go to Document > Paper Capture > Start Capture inside Acrobat. Make sure the primary OCR language is set to German and the output style to Searchable Image (Exact); the rest is up to you. (This is for Acrobat 6.0; you can do this in 5.0, but I think you need a plugin.) This makes Acrobat run text recognition and put the text behind the scanned picture, so you can search it or copy and paste. Depending on your web host, there is a service in Windows Server 2000 and up that can be set to search PDFs, so if someone is looking for a word or name it will bring up the file and page where the text is located.

For pictures I would use JPG; just be careful with the resolution (and keep the original TIFs).

Reply to
mr.T
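
For a feel of what the 400 dpi and 600 dpi settings above mean before any compression, here is a back-of-the-envelope Python sketch. The page size (US Letter, 8.5 x 11 in) is an assumption, since the thread never states it; the 70-page count comes from the original question. The helper name `raw_page_bytes` is made up for the example, and per-row padding and file headers are ignored.

```python
def raw_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of one scanned page (no headers or padding)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


PAGE_W, PAGE_H = 8.5, 11.0   # assumed US Letter, in inches
PAGES = 70                   # "approx 70 pages" from the original post

modes = {
    "24-bit color": 24,
    "8-bit grayscale": 8,
    "1-bit black and white": 1,
}

for dpi in (400, 600):
    for name, bits in modes.items():
        total = raw_page_bytes(PAGE_W, PAGE_H, dpi, bits) * PAGES
        print(f"{dpi} dpi, {name}: {total / 1_000_000:.0f} MB uncompressed")
```

The raw numbers are only a starting point, since TIF and PDF both compress the data, but they show how much the scan mode matters: black and white carries 1/24 of the raw data of full color at the same resolution.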
