Directory Publishing

Directory publishing covers the creation of online listings, B2B directories, databases, city guides, mailing lists, and other long documents. These usually have complex formatting rules so that publishers can control the appearance of the final layout. Footnotes, complex tables that run over several pages, tables of contents, indices, cross-references, and dynamic running headers and footers are just some of the elements a publisher must manage.
Publishing is the process of preparing a publication for distribution to readers. This can be as simple as exposing XHTML on the web for immediate display, or generating a print-ready format such as PDF for electronic distribution. If the final format is printed, the process of assembling and formatting content culminates in the generation of a file containing all of the information required to produce the final printed output.
As digital technology and e-commerce become more prevalent, the directory publishing industry is moving from traditional paper media to electronic delivery, which allows directory information to be published online, complete with text, images, and even video. To accomplish this, more and more publishers are opting for an XML-based content management system (CMS) because it allows content to be converted to print or digital versions without purchasing traditional publishing software.
The eSided team has expertise and experience with XML-based content management systems, e-business integration systems, custom data connections to line-of-business systems, and other projects leveraging Web technologies, XML, Java, and databases. Our solutions have transformed the directory publishing process for our clients, giving them efficiencies that save time and money while also improving the accuracy of published content.
What is XML?
XML (Extensible Markup Language) was developed to retain the flexibility and power of SGML while removing most of its complexity. XML preserves the order of content and keeps documents readable by humans. Most new document formats are XML-based: OpenDocument, used by OpenOffice, is already XML, and Microsoft's binary .doc format has been succeeded by the XML-based .docx format. Because XML has a text representation, it can be opened in any text editor, and it has browser support, including the ability to render an image on the fly (for example, SVG).
The Extensible Markup Language (XML) originally grew out of the technical publication community. People had been using the larger, more complex SGML standard to publish technical documentation, manuals, financial reports, books and much more, but they needed something that would fit in better with the World Wide Web, and that would have a lower total cost. Source: http://www.w3.org/standards/xml/
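To make the text representation concrete, here is a minimal sketch in Python of a small directory listing stored as XML and read back in document order. The element names (directory, listing, name, city, phone) and the sample entries are invented for illustration and are not taken from any particular product or schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical directory data; the element and attribute names are assumptions.
LISTING_XML = """<directory>
  <listing id="a1">
    <name>Acme Plumbing</name>
    <city>Springfield</city>
    <phone>555-0100</phone>
  </listing>
  <listing id="b2">
    <name>Birch Bakery</name>
    <city>Shelbyville</city>
    <phone>555-0199</phone>
  </listing>
</directory>"""


def read_listings(xml_text):
    """Return (name, city) pairs in the order they appear in the document."""
    root = ET.fromstring(xml_text)
    return [
        (listing.findtext("name"), listing.findtext("city"))
        for listing in root.findall("listing")
    ]


for name, city in read_listings(LISTING_XML):
    print(f"{name} ({city})")
```

The same text could be opened and edited in any text editor, which is the property the section above describes.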
How is XML used in Publishing?
Using an XML platform on top of an existing content base enables the creation of new products and the reuse of existing content. XML is an open standard, so many diverse tools are available. In the past, repurposing documents that were created with traditional publishing platforms was difficult, and users were often locked into a single vendor for document processing tools.
XML CMS
An important tool is an XML-oriented content management system (CMS). An XML CMS offers publishers numerous options for authoring and publishing documents so they can realize the full value of their content.
Assemble Content
An XML content management system (CMS) enables content assembly. This includes creating new content; editing existing content; team authoring, where content creation and management are performed collaboratively; creating navigation links; component management; content styling, manipulation, and reuse; searchable content; and permissions-based sharing of content.
For complicated print publishing workflows, XML supports content assembly using a map, or aggregation across multiple XML files, into a print-aware XML format such as formatting objects, as defined in XSL-FO, to generate printable files.
XSL-FO (XSL Formatting Objects) is a language that addresses the layout and formatting of content onto pages. It is used to publish large or complex XML documents to PDF or other paginated formats where automated publishing is needed, because it offers more sophisticated page formatting than HTML and CSS.
Using an XSLT processor, the stored XML is transformed into a published form, for example XHTML or HTML for display in any browser, or, by way of XSL-FO, PDF.
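Map-based assembly can be sketched as follows. This is an illustration only: the map format (a map element containing ref elements with an href attribute), the component names, and the output element book are all hypothetical, and the components are held in a dictionary here where a real workflow would read separate XML files.

```python
import xml.etree.ElementTree as ET

# Hypothetical component store; in practice each entry would be its own XML file.
COMPONENTS = {
    "restaurants.xml": (
        "<section title='Restaurants'><listing>Birch Bakery</listing></section>"
    ),
    "services.xml": (
        "<section title='Services'><listing>Acme Plumbing</listing></section>"
    ),
}

# Hypothetical map: each <ref> names one component, in publication order.
MAP_XML = (
    "<map title='City Guide'>"
    "<ref href='services.xml'/>"
    "<ref href='restaurants.xml'/>"
    "</map>"
)


def assemble(map_text, components):
    """Build one aggregated tree by following the map's references in order."""
    map_root = ET.fromstring(map_text)
    book = ET.Element("book", {"title": map_root.get("title", "")})
    for ref in map_root.findall("ref"):
        book.append(ET.fromstring(components[ref.get("href")]))
    return book


book = assemble(MAP_XML, COMPONENTS)
print(ET.tostring(book, encoding="unicode"))
```

Reordering the ref elements in the map changes the order of the assembled publication without touching the components themselves, which is what makes the same content reusable across products.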
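The transformation step can be illustrated with a short sketch. Note that this is plain Python standing in for an XSLT processor, since the Python standard library does not include one; a production workflow would express the same mapping as an XSLT stylesheet. The source element names and sample data are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical stored XML; the element names are invented for illustration.
SOURCE_XML = """<directory>
  <listing><name>Acme Plumbing</name><phone>555-0100</phone></listing>
  <listing><name>Birch Bakery</name><phone>555-0199</phone></listing>
</directory>"""


def to_xhtml(xml_text):
    """Map each <listing> to an XHTML list item and return the page as text."""
    source = ET.fromstring(xml_text)
    html = ET.Element("html", {"xmlns": "http://www.w3.org/1999/xhtml"})
    body = ET.SubElement(html, "body")
    ul = ET.SubElement(body, "ul")
    for listing in source.findall("listing"):
        li = ET.SubElement(ul, "li")
        li.text = f"{listing.findtext('name')}: {listing.findtext('phone')}"
    return ET.tostring(html, encoding="unicode")


print(to_xhtml(SOURCE_XML))
```

The stored XML is left unchanged; only the published form is generated, so the same source could feed a different transformation for print.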
Content Metadata
XML supports metadata, which makes searching for and categorizing documents easier by annotating them with data such as date, author, and publisher.
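As a rough sketch of how such annotations support searching, the Python below filters a small set of documents by a metadata field. The meta element and its children (title, author, date, publisher), along with the sample values, are hypothetical and chosen only to mirror the fields named above.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents; the <meta> structure is an assumption for this sketch.
DOCS = [
    "<document><meta><title>City Guide</title><author>J. Smith</author>"
    "<date>2011-03-01</date><publisher>Example Press</publisher></meta>"
    "<body/></document>",
    "<document><meta><title>Trade Directory</title><author>A. Jones</author>"
    "<date>2011-06-15</date><publisher>Example Press</publisher></meta>"
    "<body/></document>",
    "<document><meta><title>Mailing List</title><author>J. Smith</author>"
    "<date>2012-01-10</date><publisher>Other House</publisher></meta>"
    "<body/></document>",
]


def titles_where(docs, field, value):
    """Return the titles of documents whose metadata field equals value."""
    titles = []
    for text in docs:
        root = ET.fromstring(text)
        if root.findtext(f"meta/{field}") == value:
            titles.append(root.findtext("meta/title"))
    return titles


print(titles_where(DOCS, "author", "J. Smith"))
print(titles_where(DOCS, "publisher", "Example Press"))
```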
XML DBMS for Data Storage
An XML database management system (DBMS) stores and retrieves XML documents as fully accessible information trees, which differs from the more limited access provided by traditional relational database management systems (RDBMS). Using XQuery, an XML DBMS can generate content such as a table of contents, lists of titles, lists of abstracts, or other manipulations of the content, and can query any item within a document or combine items from multiple documents. Like other database systems, an XML DBMS also provides security, scheduled backups, transactions, recovery, and storage functions.
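The idea of generating a table of contents by querying across documents can be sketched as below. This uses Python's limited path support as a stand-in for XQuery and a plain list as a stand-in for a database collection; the guide, chapter, and title element names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical collection; an XML DBMS would hold these as stored documents.
COLLECTION = [
    "<guide><chapter><title>Hotels</title></chapter>"
    "<chapter><title>Museums</title></chapter></guide>",
    "<guide><chapter><title>Parks</title></chapter></guide>",
]


def table_of_contents(collection):
    """Collect every chapter title across all documents, in document order."""
    titles = []
    for text in collection:
        root = ET.fromstring(text)
        # ".//chapter/title" selects titles of chapters at any depth.
        titles.extend(t.text for t in root.findall(".//chapter/title"))
    return titles


for number, title in enumerate(table_of_contents(COLLECTION), start=1):
    print(f"{number}. {title}")
```

In an XML DBMS the loop over documents would be replaced by a single query over the stored collection, so the table of contents always reflects the current content.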