PageZephyr Sees What Spotlight Can't
Proprietary file types from applications like Quark and InDesign have a knack for hiding from Mac OS X's built-in search app, Spotlight. For users with vast stores of these types of files, search can be an awkward process of trial and error. Markzware's new PageZephyr knows how to look inside these proprietary files.
General-purpose search applications like Spotlight do an excellent job of giving byte-slingers swift access to the content in a panoply of common file types. When it comes to proprietary formats like those created by QuarkXPress and Adobe InDesign, though, those file ferrets can be blind as olms. That poses a big problem for publishers of every stripe who have countless documents in those formats and need to find something in them. Now there's a solution to that problem. It's called "PageZephyr."
Released earlier this month, PageZephyr, made by Markzware, is built on technology from the company's other professional publishing products, which include applications for converting InDesign files to Quark format and Microsoft Publisher files to InDesign format.
Markzware calls the technology the "Common Reader Architecture." It decodes the file formats of certain desktop publishing applications to determine what's in the files. "When a regular search engine looks at one of these files, all it sees is ASCII-armored binary hash," Markzware's Sales Director Robert C. Claborne told MacNewsWorld. "It can't make any sense out of it, so it passes on by."
PageZephyr can decode those publishing files, so it knows what text is in them and the formatting attributes of that text -- font, styles, etc. The application uses that information to create search results and extract text from the files.
"To my knowledge, this technology is unique in the industry," Claborne maintained. "We are the only ones that can do this."
Corpus of Bones
Considering the length of time programs like Quark and InDesign have been in the market and the key role search plays in almost every facet of computer use, one might think a program like PageZephyr would have been introduced long ago, but it wasn't. "Archeology always seems to come last," Claborne opined. "You have to have a corpus of old bones out there before you can go digging.
"This is close to our 15th year in this particular business, and we've been hearing more and more from customers over the last five years or so about the difficulty of finding content in these proprietary file types," he said.
"They can't be searched by common means, by Spotlight or other search methods," he continued. "The only way to find what you're looking for is to open up an actual document in its native application and hope you get lucky."
"If you're a big firm and have an extensive archive of files and you're looking to reuse content from times past for one reason or another, there's literally no way you can find it," he added.
Dual Zephyrs
Markzware has introduced two versions of PageZephyr -- PageZephyr Search (US$129) and PageZephyr Search & Extract ($299).
Both versions will index all Quark and InDesign files stored on attached drives or in mounted volumes, including network volumes. They will also display search results based on file size, name, type, and modification and creation dates, as well as by keyword. Boolean AND/OR and wildcard searches are supported in keyword mode.
PageZephyr S&E has additional editing and exporting features. For instance, text discovered in documents can be edited in a Storyboard Window. Text fonts can be altered and their color changed. In addition, text can be copied and pasted across applications, as well as integrated with the System Services from third-party application providers.
The S&E version also allows stories discovered in a search to be exported in plain text or RTF format, either as one or more stories from a single document or as multiple stories from multiple documents.
Tantalizing Search Engine Makers
On a Mac, PageZephyr works in conjunction with OS X's native search engine, Spotlight. PageZephyr creates an index of what's inside the Quark and InDesign documents on the machine, and Spotlight uses that index to display search results. "PageZephyr acts as a translation window for Spotlight, which allows it to peer into and find content in encoded file types," Claborne explained.
The PageZephyr engine, though, need not be restricted to use with Spotlight. It could be licensed as an OEM product to others. "PageZephyr has the ability to interface to essentially any search engine," he observed.
"If a search engine company," he added, "wanted to extend their searching capability into the billions of desktop publishing documents that exist out in the world, that could happen with the PageZephyr [OEM] product."

