Content as Services

Content is information that is intended for human consumption, as opposed to “data,” which are information intended for machine or system use. At times, we use other words such as knowledge, semantics, and intellectual assets to describe content. What differentiates human-oriented content from machine-oriented data is that people must create, manage, publish, and distribute content so that it can be represented in a variety of different ways, all the while maintaining the same overall meaning.

Content represents information such as news, facts, fiction, charts, illustrations, photos, opinions — anything that communicates something to someone. Of course, information without structure is meaningless; a random assortment of facts doesn’t do anyone any good. Information must be organized and structured in a way that makes sense. This need for organizing and managing the creation and flow of content represents the core of all organizations’ content-based processes.

Even though the uses of content and data are quite different, the parallels between organizing and managing disparate content in the enterprise and managing distributed application functionality are remarkable. Enterprises typically disperse content in unmanaged, isolated “islands” of information, while application functionality is frequently locked in proprietary systems, requiring integration technology to extricate it. Line-of-business content users require content aggregated from multiple content sources in the enterprise. Similarly, line-of-business application users require high-level business functionality aggregated from multiple enterprise application sources. Portals provide universal access to content. Portals also provide universal access to application functionality. Distribution of content has serious security and rights management issues… and so does distributed computing. However, what is not analogous between the worlds of human-oriented content and machine-oriented application functionality is the extent to which companies have automated the processes for working with these forms of information.

Content process applications are currently where distributed computing applications were in the mid-1980s. Enterprises have a long way to go before they can realize the full promise of agile, reusable content. Content today is frequently out of context, unstructured, and hard to locate. However, there is hope: everything we have learned about how to componentize application functionality and abstract it to the level where we can access it anywhere on the network can be applied to content. All that’s required is a shift in the way we architect, implement, and manage content processes.

Key Business Driver: Content Reuse
The process of creating content is almost always effort-intensive. People must spend time organizing information prior to creation, constructing the content, and laying out the information so that people can easily read it. Content must be accurate, timely, and relevant. People must then review, lay out, and publish that content. With so much time, cost, and effort invested in content, it makes sense to reduce costs by reusing content as much as possible. Furthermore, for organizations that get their lifeblood from the production and sale of content, reusing pre-existing content is as business critical as servicing customers.

A necessary first step to gaining the benefits of content reuse is componentizing or “chunking” content into discrete blocks that contain metadata describing their contents. However, just chunking content is not sufficient to solve the enterprise’s content agility needs. Companies also need a way to describe content chunks in an abstract manner so that they can be discovered on the network and aggregated on demand for a particular use — the same problems that Service-Oriented Architectures (SOAs) built with Web Services hope to solve.
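As a rough illustration of what "chunking" with metadata might look like, the following Python sketch splits a document into paragraph-level chunks and attaches descriptive metadata to each. The `ContentChunk` class, the `chunk_document` function, and the metadata keys are hypothetical names chosen for this example; splitting on blank lines is only one assumed chunking rule among many.

```python
from dataclasses import dataclass, field


@dataclass
class ContentChunk:
    """A discrete block of content plus metadata describing it (hypothetical model)."""
    chunk_id: str
    body: str
    metadata: dict = field(default_factory=dict)


def chunk_document(doc_id: str, text: str, topic: str) -> list[ContentChunk]:
    """Split a document on blank lines, producing one chunk per paragraph."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return [
        ContentChunk(
            chunk_id=f"{doc_id}#{i}",
            body=para,
            # Assumed metadata fields; a real scheme would be richer.
            metadata={"source": doc_id, "position": i, "topic": topic},
        )
        for i, para in enumerate(paragraphs)
    ]


chunks = chunk_document(
    "annual-report", "Revenue grew.\n\nCosts fell.", topic="finance"
)
```

Each chunk now carries enough description to be found and reused independently of the document it came from, which is the precondition for the discovery and aggregation steps discussed next.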

Service-Oriented Content
Most of the content management challenges facing businesses today stem from inflexible content. To provide the missing flexibility, companies must add a layer of abstraction on top of enterprise content that isolates the content consumer from the content producer. This isolation gives consumers the flexibility they require to locate content and gives producers the agility they need to change it as needed.

ZapThink calls this layer of abstraction Service-Oriented Content. The main goal of Service-Oriented Content (SOC) is to allow arbitrary applications, systems, and data stores to access content no matter how it is produced or where it is stored, removing the content management bottleneck that exists in the enterprise. Using a standards-based SOA approach leveraging Web Services, companies can produce content that is exposed as services on the network that any Web Services-compliant application can access. Content becomes accessible on the network as if it were any other information asset.

However, in order to build such content “services,” the content must be encapsulated into discrete chunks and composed into usable information. Encapsulation is important because it breaks up large documents into content chunks that can be assigned to different content creators. At its most basic, the rearchitecture process enterprises must undertake to offer SOC involves encapsulating content components with Web Services interfaces and then composing (virtualizing) these fine-grained content components into coarse-grained business-level documents.
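The encapsulate-then-compose pattern described above can be sketched in a few lines of Python. This is a simplified stand-in, not a Web Services implementation: `ContentService` is a hypothetical in-process class playing the role of the service interface, and the chunk identifiers and text are invented for the example.

```python
class ContentService:
    """Stand-in for a service interface in front of a content store (hypothetical)."""

    def __init__(self, store: dict[str, str]):
        # Consumers never touch the store directly, only the interface.
        self._store = store

    def get_chunk(self, chunk_id: str) -> str:
        return self._store[chunk_id]


def compose_document(service: ContentService, chunk_ids: list[str]) -> str:
    """Assemble a coarse-grained document from fine-grained chunks, in order."""
    return "\n\n".join(service.get_chunk(cid) for cid in chunk_ids)


service = ContentService({
    "intro": "Our product line expanded this year.",
    "pricing": "Prices are unchanged.",
    "legal": "Terms apply.",
})

# The same fine-grained chunk ("pricing") is reused in two business documents.
brochure = compose_document(service, ["intro", "pricing"])
contract = compose_document(service, ["pricing", "legal"])
```

Because consumers only see `get_chunk`, the producer is free to change where and how the chunks are stored without breaking either composed document.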

Once enterprises have properly chunked their content, they can make use of SOA technologies to allow content consumers to dynamically locate and access this content. Each content chunk has a content metadata wrapper that describes it in much the same way that a service description describes application functionality. In fact, we can even make use of the Web Services Description Language (WSDL) to describe and define content chunks, and Universal Description, Discovery, and Integration (UDDI) compliant registries to store those descriptions, providing a way for content consumers to find and access the information.
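To make the publish-and-discover idea concrete, here is a minimal Python sketch of a registry of content descriptions. It is an in-memory analogy only: `ChunkDescription` loosely plays the role that a WSDL description would, and `ContentRegistry` the role of a UDDI-compliant registry, but neither uses the actual WSDL or UDDI formats or APIs. All names and the example endpoint URLs are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkDescription:
    """Metadata wrapper describing one content chunk (hypothetical structure)."""
    chunk_id: str
    endpoint: str          # where the chunk can be fetched (made-up URL)
    keywords: frozenset    # terms consumers can search on


class ContentRegistry:
    """In-memory stand-in for a registry of content descriptions."""

    def __init__(self):
        self._entries = {}

    def publish(self, description: ChunkDescription) -> None:
        # Producers register a description, not the content itself.
        self._entries[description.chunk_id] = description

    def find(self, keyword: str) -> list:
        # Consumers discover chunks by keyword, sorted for stable results.
        matches = [d for d in self._entries.values() if keyword in d.keywords]
        return sorted(matches, key=lambda d: d.chunk_id)


registry = ContentRegistry()
registry.publish(ChunkDescription(
    "pricing-2004", "http://content.example.com/pricing-2004",
    frozenset({"pricing", "sales"})))
registry.publish(ChunkDescription(
    "warranty", "http://content.example.com/warranty",
    frozenset({"legal", "sales"})))

sales_chunks = registry.find("sales")
```

The consumer learns where the content lives only at lookup time, so a producer can relocate a chunk by republishing its description with a new endpoint.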

Principles of Service-Oriented Content
ZapThink believes that there are six principles of Service-Oriented Content that can help companies realize the benefits of business agility through content services:

  • Content should be created just once — Users won’t realize any efficiencies of scale if enterprises allow redundant content creation processes to persist. Smart content creation systems will first check to see if duplicate content exists before creating new versions.
  • Use metadata to add context and meaning to content — Clearly, if enterprises are only allowed to create content once, that content will require metadata that describes the information and its uses.
  • Content must exist separate from presentation — When content exists separate from its specific use, it should not contain any specific presentation requirements. The more a content service assumes about the content consumer, the less reusable it is.
  • Allow users to access content at different levels of granularity — Content, just like application functionality, exists at different levels of granularity. Sometimes users are interested in whole documents and sometimes they are interested in a single paragraph. Content creation, management, storage, publishing, distribution, syndication, and protection schemes must work with content at various levels of granularity.
  • Content must have capabilities — When thinking of content as reusable services, enterprises should also extend the notion of rights, policies, and other attributes to content services. They can then grant each piece of content certain privileges such as its ability to be used in certain scenarios.
  • Content lifecycle systems must support the notion of extensible content components — Content lifecycle systems should allow companies to easily evolve their content as time passes and new ideas and needs arise. They may want to add new chunks, change the behavior of existing chunks, change hierarchy relationships, order the content a new way, reassign attribute values, etc.
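Three of these principles (create once, keep content separate from presentation, and give content capabilities) lend themselves to a short Python sketch. Everything here is an assumption made for illustration: the `ContentStore` class, the `render` function, the `allowed_uses` field, and the choice to detect duplicates by hashing whitespace- and case-normalized text. A production system would likely need a more forgiving notion of "duplicate."

```python
import hashlib
from html import escape


class ContentStore:
    """Creates each piece of content only once (hypothetical sketch)."""

    def __init__(self):
        self._by_digest = {}

    def create(self, body: str, allowed_uses) -> dict:
        # Normalize whitespace and case so trivial variants count as duplicates.
        normalized = " ".join(body.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in self._by_digest:
            self._by_digest[digest] = {
                "id": digest[:12],
                "body": body,                        # no presentation markup stored
                "allowed_uses": set(allowed_uses),   # the chunk's capabilities
            }
        # If a duplicate exists, the original chunk is returned unchanged.
        return self._by_digest[digest]


def render(chunk: dict, use: str, fmt: str) -> str:
    """Apply presentation at delivery time, after checking the chunk's policy."""
    if use not in chunk["allowed_uses"]:
        raise PermissionError(f"chunk {chunk['id']} may not be used for {use!r}")
    if fmt == "html":
        return f"<p>{escape(chunk['body'])}</p>"
    return chunk["body"]


store = ContentStore()
first = store.create("Prices are unchanged.", ["web"])
second = store.create("prices   are unchanged.", ["print"])  # duplicate, not re-created
```

Here `second` is the same object as `first`, and `render(first, "web", "html")` wraps the stored text in markup only at the point of use, while `render(first, "print", "text")` raises `PermissionError` because the chunk was never granted that use.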

Content never stands still. Content created from reusable sources is in constant flux, assembled in real time and always reflecting what is of interest to the reader. By adopting cohesive content lifecycle processes, together with approaches to chunking content so that it can be packaged, discovered, and reused, companies can improve the overall value of content to their business.