Data Mashups: Enabling Ad Hoc Composite, Headless, Information Services
Organizations that have been implementing Service-Oriented Architecture (SOA) are increasingly focusing on how to empower users by giving them greater control over the flexible composite applications that SOA enables. Business empowerment is the most under-appreciated benefit of SOA. SOA empowers business users by offering Services that abstract capabilities and information and enable the consumers of Services to create value outside the capacity, budget, and planning of the central IT organization. Indeed, SOA “democratizes” IT application development by separating the roles of Service creation and management (the role of IT) from Service consumption and composition (the role of Business).
But SOA is not the only area of IT to introduce this democratization of technology capabilities. The emergence of Web 2.0 introduced us to the ideas of user-generated content, collaborative information creation, crowdsourcing, and mashups. In the realm of the web experience, technologies such as YouTube are just vessels for delivering the real value, which is created by end-users rather than technologists. Enterprises crave the same value proposition for their IT assets. Why would any business organization want to restrict the future potential of its technology investments to an IT organization that keeps the business from realizing its full potential? If I can get instant access to information online without having to get IT involved when I’m at home, why do I have such trouble when I’m at work? Increasingly, business wants to see IT as a provider of capabilities to be leveraged by parts of the organization outside of IT, and as a result, the largely separate worlds of Service-Oriented Architecture (SOA) and the collection of collaborative Web applications known as Web 2.0 are converging.
As we’ve defined many times before, enterprise mashups are governed compositions of loosely-coupled Services within a Service-oriented environment that facilitate dynamic, ad hoc, and consumer-centric styles of application development. Yet in typical IT fashion, just as the industry has settled on a definition of the term “mashup”, things have changed. In much the same way that the term “cloud” has been diversified to cover a range of different concepts (data clouds, compute clouds, application clouds), the term “mashup” is likewise diversifying.
A year or two ago, it would have been reasonable to assume that a mashup was a web browser-based, static, user interface composition of web-based functionality. But in the enterprise context, none of those assumptions necessarily holds: we might want non-Web access to mashed-up applications, we might want to change them regularly, and we might want to mash up information that exists below the user interface abstraction. For sure, Web mashups might embody the ideals of the original mashup concept, but we now have the desire to mash up a wide variety of IT resources, from applications to infrastructure to data, that might be exposed with a wide range of interfaces, or without any at all. And it’s the desire to mash up information freed from the application that diversifies the mashup term to include the concept of the data mashup.
What Data Mashups Add to the Picture
To understand the concept of the data mashup, we should first understand the fundamental ideas that all mashups share. First and foremost, a mashup should be a composition of discrete, and often heterogeneous, subparts. More importantly, those subparts should be unrelated to each other and most likely used in different contexts. After all, we shouldn’t confuse the concept of mashing up with ordinary programming, which also consists of combining subparts. So, imagine the corporate IT environment is in an IT version of Junkyard Wars (or Scrapheap Challenge for you Brits), and the IT landscape is our big junkyard. The organization is tasked to meet new, and potentially unforeseen, requirements and to build something with what is at hand. The parts of the IT junkyard (legacy, if you will) that facilitate the best composition in an environment of governance will allow us to meet the needs as quickly and reliably as possible. In this manner, the mashup facilitates ad hoc styles of composition in ways that the traditional mechanisms for building things don’t.
This brings us to the second concept that all mashups share: the consumer is in control of the application. The end-user provides the value, since they are the ones that know best how to combine the subparts, and that end-user had nothing to do with the creation of those subparts in the first place. Which brings us to the last thing that enterprise mashups share in common: they must be governed. Just because you can combine subparts together doesn’t mean you should or that you can do so reliably.
From this perspective, data mashups enable users to compose disparate, discrete sources of information into a new, and potentially more valuable, composition hosted and managed by a party independent of the original data providers. First, it’s important to note that data mashups might not have a user interface at all, and certainly have no need for a web-based one. From a SOA perspective, a user interface is just a rendering of a Service composition, so a data mashup simply provides a contracted interface that web-based, desktop-based, mainframe-based, or even automated, “headless” processes can access.
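As a minimal sketch of this idea, a headless data mashup can be nothing more than a contracted function that composes two independently managed sources; the Service names, payloads, and identifiers below are hypothetical stand-ins, not any specific product’s API.

```python
# A sketch of a "headless" data mashup: two stand-in Data Services are
# composed behind a plain contracted interface that any consumer (web
# app, batch job, automated process) can call. No UI is assumed.
from dataclasses import dataclass

# Stand-ins for two independently managed, unrelated Data Services.
def customer_service(customer_id: str) -> dict:
    return {"id": customer_id, "name": "Acme Corp", "region": "EMEA"}

def order_service(customer_id: str) -> list[dict]:
    return [{"order": "A-100", "total": 250.0}, {"order": "A-101", "total": 75.0}]

@dataclass
class CustomerSummary:
    """The mashup's contract, independent of how any consumer renders it."""
    name: str
    region: str
    order_count: int
    lifetime_value: float

def customer_summary(customer_id: str) -> CustomerSummary:
    """Compose the two sources into a new, more valuable whole."""
    customer = customer_service(customer_id)
    orders = order_service(customer_id)
    return CustomerSummary(
        name=customer["name"],
        region=customer["region"],
        order_count=len(orders),
        lifetime_value=sum(o["total"] for o in orders),
    )

print(customer_summary("C-42").lifetime_value)  # 325.0
```

The point of the sketch is that `customer_summary` is the mashup: a contracted composition any process can invoke, with rendering (if any) left entirely to the consumer.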
Second, we should understand that a data mashup is not simply another form of data integration. For certain, the act of composition achieves the goal of integration, and the consumer-centric mashup is just an aspect of composition. However, data mashups intend to solve different problems than traditional data integration approaches, and in many ways, do not replace the need for or value of traditional data integration solutions.
There are many scenarios for composing data, but some are better suited to a static, tightly-coupled, IT-driven, non-Service-oriented form. In fact, 80% of the value that businesses derive from data comes from the 20% of fixed, highly optimized data integration approaches implemented over decades. In this realm, traditional data integration approaches retain high value. However, it’s the other 80% of data integration requirements, most of which come from the need to meet short-term, often ad hoc, integration requests, that cause 80% of the problems. Anyone who has lived long enough in the enterprise IT space knows that business-driven requests for reporting, forecasting, analysis, or other interpretations of data can present significant complications and cost to the IT organization. The reason for this is that the IT organization is set up to meet the recurring needs of the business, not “situational” needs for information.
But this is where SOA and the data mashup come to the rescue. Just as “situational applications” (one-time, ad hoc applications developed to meet short-term needs) are quite easily implemented through Service-oriented composition, so too can the need for “situational data” be met.
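To make the “situational data” idea concrete, the sketch below answers a hypothetical one-time business question (“revenue by region”) by composing two existing Services ad hoc, rather than commissioning a permanent integration project; all Service names and payloads are invented for illustration.

```python
# A sketch of "situational data": a one-off question answered by
# composing existing Services, with no permanent integration built.
from collections import defaultdict

def sales_service() -> list[dict]:
    """Stand-in for an existing, IT-managed sales Data Service."""
    return [
        {"customer": "C-1", "amount": 120.0},
        {"customer": "C-2", "amount": 300.0},
        {"customer": "C-3", "amount": 80.0},
    ]

def crm_service() -> dict:
    """Stand-in for an existing CRM Service mapping customers to regions."""
    return {"C-1": "EMEA", "C-2": "APAC", "C-3": "EMEA"}

def revenue_by_region() -> dict[str, float]:
    """The entire 'integration': a few composable lines, disposable
    once the situational question has been answered."""
    regions = crm_service()
    totals: defaultdict[str, float] = defaultdict(float)
    for sale in sales_service():
        totals[regions[sale["customer"]]] += sale["amount"]
    return dict(totals)

print(revenue_by_region())  # {'EMEA': 200.0, 'APAC': 300.0}
```

Because the composition lives entirely in the consumer’s hands, it can be discarded or reshaped the moment the situational need changes, which is exactly what traditional integration projects struggle to do.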
Data Mashups and the Data Service Layer
However, to make this vision of ad hoc, composable data that empowers the business a reality, the IT organization must take a few major steps. First, the component data sources that can be composed must be exposed as Service assets in a governed, managed SOA environment. Second, the IT organization must give Service consumers the tools and methods they need to successfully compose those Services with low cost and risk. Finally, the IT organization needs to shift itself from having sole responsibility for data integration to simply being a provider of Data Services to be mashed.
In order to facilitate this vision, the IT organization should consider its various data assets as Services exposed in a Data Service Layer (DSL). As defined in our Service-Oriented Data Access White Paper, a Data Services layer provides an abstracted set of interfaces that expose reusable Data Services for reading and writing data independent of the underlying data sources. One of the important benefits of a Data Services layer is that it enables loose coupling between the applications using the Data Services and the underlying data source providers. Loose coupling enables data architects to modify, combine, relocate, or even remove underlying data sources from the Data Services layer without requiring changes to the interfaces that the Data Services expose. As a result, IT can retain control over the structure of data while providing relevant information to the applications that need it. Over time, this increased flexibility eases the maintenance of enterprise applications.
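The loose coupling that a Data Services layer provides can be sketched as an abstract interface with swappable backends; the class and method names below are illustrative only, not a specific product’s API.

```python
# A sketch of loose coupling through a Data Services layer: consumers
# depend only on the abstract contract, so architects can modify,
# relocate, or replace the underlying source without touching them.
from abc import ABC, abstractmethod

class CustomerDataService(ABC):
    """The contract the DSL exposes; stable across backend changes."""
    @abstractmethod
    def read(self, customer_id: str) -> dict: ...

class LegacyDbBackend(CustomerDataService):
    def read(self, customer_id: str) -> dict:
        # Imagine a SQL query against the original database here.
        return {"id": customer_id, "name": "Acme Corp", "source": "legacy"}

class ReplatformedBackend(CustomerDataService):
    def read(self, customer_id: str) -> dict:
        # Same contract, different provider; consumers are unaffected.
        return {"id": customer_id, "name": "Acme Corp", "source": "new"}

def consumer_report(service: CustomerDataService) -> str:
    """A consumer written once, against the interface alone."""
    return service.read("C-42")["name"]

# Swapping the backend requires no change to consumer_report:
assert consumer_report(LegacyDbBackend()) == consumer_report(ReplatformedBackend())
```

The final assertion is the whole argument in miniature: the data source moved, and the consumer neither knew nor cared.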
With this perspective in mind, the DSL provides the SOA infrastructure needed to enable data mashups as well as more static forms of data composition, including the use of SOA composition approaches and technologies such as Business Process Execution Language (BPEL)-enabled process runtime engines and Service-oriented Business Process Management (BPM) environments. Indeed, the DSL actually broadens the choice and availability of technology platforms that can derive value from discrete data sources.
The ZapThink Take
While exposing and consuming Data Services through data mashups doesn’t eliminate the need for traditional data integration, it does add to the diversity of mechanisms by which the business can extract greater value from its existing information assets. In this light, it is surprising that many companies have not added DSL and data mashup capabilities to their environments. One would think that doing so would simply provide more means for the business to get value from its existing information while at the same time minimizing the burden on IT to meet those needs. ZapThink believes it is just a matter of time before organizations of all sorts, both businesses and software vendors, realize the value of exposing and consuming Data Services and the potential of the data mashup. This is your opportunity to pioneer, innovate, and provide value before the mainstream market catches on.