
Adobe Acrobat Capture

Adobe Acrobat Capture: as a Document Management Solution

Product: Adobe Acrobat Capture 3.0

For: Windows 95, Windows NT 3.51 and 4.0

Price: US$895 and up

Product: Adobe Acrobat Capture 2.01

For: Windows 95, Windows NT 3.51 and 4.0

Price: US$577 and up (the basic price provides a dongle that allows 20,000 pages to be processed; additional payment is required to process more pages). Upgrades from US$159.

Acrobat Capture 3, announced Feb. 1, 2000, in both US$699+ Personal and high-end (US$7000+) "Cluster" editions, runs only on Windows NT. Its most notable features are improved OCR accuracy and provisions for editing scanned pages. Adobe says Capture 2.0 will continue to be made available as an entry-level solution for Windows 95/98 users. Upgrades from US$199.

We had mixed feelings about Capture 2.0, the last Windows 95/98 compatible version of Adobe's estimable document management package. On one hand, Capture's architecture makes it very simple to set up a system where paper-based documents can automatically be transformed into electronic documents with all fonts, formatting and graphics intact; these documents can then be deposited directly into a directory that has been designated as a web site. Voilà -- a nearly ideal solution for corporate intranet access to paper-based documents. (For more information about the Adobe "Paper-to-Web" solution, visit Adobe's Web site at http://www.adobe.com/prodindex/capture.)

Such documents can then be indexed, searched, and/or annotated with Acrobat's mature set of tools for PCs or Macs. Better still, Adobe provides free Acrobat Reader software for Windows, Macintosh and Unix, and there is an Acrobat Reader available for download for OS/2 as well.

Unfortunately, there were a few flaws in the grand design of Capture 2.0. Most notably, during our tests, we found that the Acrobat document viewing software was, for one reason or another, broken on nearly every machine we regularly use here. We walked around our office and tried to open an Acrobat document on several machines -- both Mac and Windows-based -- that we knew had the software installed. On one of the Macs, Adobe Type Manager complained that it needed "authorization" -- presumably, the control panel had been updated or moved. On another Mac, the Symbol font had been configured as TrueType instead of PostScript, which prevented Acrobat from initializing correctly. A third Mac was apparently missing one of the "Multiple Master" fonts Acrobat needs to function -- it, too, refused to open the file. On one of the PCs we tried, Acrobat seemed to open, but the file wouldn't display; apparently, one of our many web browser upgrades had broken the Acrobat plug-in. On another PC, our previously installed Acrobat Exchange had been accidentally "updated" to the less capable Acrobat Reader, which is installed along with a growing number of software applications that include documentation on CD-ROM in Acrobat PDF format.

Obviously, all of these problems are fairly easy to fix by simply reinstalling the Acrobat software, and it is likely that a corporate office with a strict regimen forbidding user changes to machines will have fewer problems than we (notorious fiddlers that we are) have seen here. However, our experience does raise the concern that Acrobat-based solutions have the potential to add to IS administration costs and user frustrations.

So, when it is working, how is Acrobat? In general, we find Acrobat files to be rather large, although Acrobat Distiller version 3.0 (not included with Capture, by the way), offered new options to reduce file sizes. For example, we saved a full-page ad -- an electronic file in CorelDraw format used by our ad production dept. -- in several formats:

  • CDR (native CorelDraw format; viewable on web pages using a freely available tool from www.corel.com)
  • EPS (PostScript) file, viewable using GhostScript, CorelDraw, Adobe Illustrator 7.0, Adobe Photoshop 4.0 or other PostScript viewing/editing tools
  • Acrobat file (created with Acrobat Distiller 3.0 from the printed-to-disk PostScript output of the above-mentioned CDR file), viewable using Acrobat Reader, Acrobat Exchange or Acrobat plug-ins for Netscape or Internet Explorer
  • Corel Barista (Java) file, viewable -- maybe -- using any system with a Java interpreter
  • Text file, viewable (without advanced formatting, of course) with any text editor
  • GIF-plus-text (created using the "Save as image map" function in Adobe Illustrator or CorelDraw 7); viewable with any web browser

The original CDR file was approximately 5MB in size. The EPS ballooned to almost 8MB; the Barista file, which incidentally took absolutely forever to display, was about 4MB. The Acrobat file was a relatively svelte 3.2MB, but this was nothing compared to the GIF image map, which weighed in at only 80K. The GIF was accompanied by the text of the original CDR file, saved as an HTML page that included a link to the GIF. Thus, when you perform a search using a web search engine, the words in the text file are searchable, and when the link is clicked, up pops the GIF. While this is perhaps not as elegant as Acrobat's all-in-one, resolution-independent graphics and text file format, the smaller files make it an attractive alternative.
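The GIF-plus-text arrangement described above can be sketched in a few lines of Python. This is only an illustration of the idea (an HTML page that carries the searchable text and links to the page image); it is not the output of the "Save as image map" function in Illustrator or CorelDraw, and the function name `build_text_page`, the sample text and the file name `ad.gif` are all hypothetical.

```python
from html import escape


def build_text_page(title, body_text, gif_name):
    """Return an HTML page holding the searchable text and a link to the GIF.

    Hypothetical helper for illustration only; paragraphs in body_text
    are assumed to be separated by blank lines.
    """
    paragraphs = [p.strip() for p in body_text.split("\n\n") if p.strip()]
    html_paragraphs = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return (
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        f"{html_paragraphs}\n"
        # The link is what a reader clicks after a search engine finds the text.
        f'<p><a href="{escape(gif_name, quote=True)}">View the page image</a></p>\n'
        "</body>\n"
        "</html>\n"
    )


# Sample values are made up; in practice the text would come from the ad itself.
page = build_text_page(
    "Full-page ad",
    "Headline text.\n\nBody copy & fine print.",
    "ad.gif",
)
print(page)
```

A search engine indexes the words in the HTML page, while the reader sees the ad as an image only after following the link, which is why the combined download stays small.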

We also tested the program using a variety of other documents. The following examples show results typical of the program's output.

  1. Original Document, saved as a PDF file with hidden text. (83K)
  2. The same page saved as HTML (about 90K including graphics -- note that the HTML export keeps the signature and letterhead graphics.)
  3. Screen capture showing the program's Reviewer component, which allows the user to edit the text embedded before saving as PDF, HTML or in one of several popular word processor formats.

In other cases, particularly when starting with paper-based originals, Acrobat Capture is clearly a better solution. Version 2.0 of the program adds the ability to save in popular word-processing formats (again, without all the formatting that the Acrobat format maintains), and the program's highly graphical interface makes it easy to set up a less skilled worker to process documents for intranet perusal. A batch-processing option can convert up to 20,000 documents without intervention. Those considering Acrobat Capture 2.0, which, unlike the viewing tools, is available only for Windows 95 and NT, should be aware that Adobe has instituted a new policy under which the program's original US$895 purchase price allows you to create only 20,000 pages. A parallel-port dongle bundled with the package counts down with each use; when you hit zero, you'll have to shell out more money for an additional license. (An additional 20,000 pages costs US$595; US$4995 buys you 200,000 pages.)
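To put the page-based licensing in perspective, the prices quoted above can be turned into a per-page cost. The short Python sketch below uses only the three price points given in this review; the names `LICENSES` and `cents_per_page` are ours, chosen for illustration.

```python
# Page-license prices quoted in the review: (price in US dollars, pages allowed).
LICENSES = {
    "initial purchase": (895, 20_000),
    "additional 20,000 pages": (595, 20_000),
    "additional 200,000 pages": (4995, 200_000),
}


def cents_per_page(price_usd, pages):
    """Return the license cost in US cents per processed page."""
    return price_usd * 100 / pages


for name, (price, pages) in LICENSES.items():
    print(f"{name}: about {cents_per_page(price, pages):.1f} cents per page")
```

By this arithmetic, the initial purchase works out to roughly 4.5 cents per page, the 20,000-page refill to about 3 cents, and the 200,000-page license to about 2.5 cents.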

Visit the Capture web page at http://www.adobe.com/prodindex/capture/main.html for more details.

On June 19, 2000, Adobe announced the immediate availability of Adobe Acrobat Distiller Server, a software package for the Linux, Solaris, and Windows NT platforms (a list that, notably, does not include the Mac). The software, says the company, gives business and institutional intranets a centrally administered, server-based mechanism for creating Adobe Portable Document Format (PDF) files. Details at www.adobe.com.

For Further Reading

  • More information on Acrobat is available here.
  • More info on scanners is at http://www.pcbuyersguide.com/hardware/peripherals/index.html
  • Ghostscript is a freely available PostScript alternative.
  • PC Week Online has details on Adobe Acrobat Capture 3.0

From a May 9, 2002 Slashdot posting by Dan Kaminsky of DoxPara Research (Thread): Run, don’t walk, to http://djvu.research.att.com/home.html. DJVu is a image-based competitor to PDF that is a feat of beautiful engineering — 300DPI scans break down to about 10-30K a page, the viewer is about an order of magnitude faster than PDF, the format cleanly supports separate encoding of page texture/graphics vs. page text, there’s significant amounts of open source for it, and more. It’s truly a brilliant format.
