1 Introduction

During the last decade, organizations across the globe have struggled to comprehend and adapt to the changes brought on by the ubiquitous growth of Information Technology and the Internet (Goethals 2009), (Casadesus-Masanell and Ricart 2011). One of these changes is the emergence of open data, which resulted from the opening up and sharing of non-sensitive information in a machine-readable format with organizations and the general public (Davies et al. 2013), (Ren and Glissmann 2012). Drivers for opening up data include ensuring accountability, delivering quality services, reducing operating costs, and stimulating innovation (Vickery 2011), (Zuiderwijk and Van Den Braak 2012), (Zuiderwijk and Janssen 2013), (Manyika et al. 2013). Data underpins organizations and the economy, and is thus key to providing new insights into consumer needs and enabling new products and services to be developed (HM Government 2013).

Recently, the attention of major stakeholders in the open data community, including policymakers, has shifted to the economic value of the rapidly growing open data asset. For instance, the European Commission estimates that the direct economic gains from opening up Public Sector Information (PSI) or government data could amount to €40 billion a year. Similar beliefs across the world have spurred a growing number of open data small and medium enterprises seeking to tap into the shared resources or commons. As new entrants flood the marketplace, open data-driven organizations are seeking to position themselves uniquely through specialization in the landscape to create and capture value for their stakeholders (IBM Business Consulting Services 2005).

To create and capture value for open data stakeholders, open data-driven organizations must employ an emerging set of capabilities to catalyse positive change in the organization (Broek et al. 2012). Capabilities enable organizations to achieve a specific purpose or output (Ulrich and Rosen 2011), (Carrara et al. 2015).

While scholars acknowledge the need to address the questions of what open data capabilities are and how to identify and develop them, efforts in this direction have so far been very limited, carried out in silos, and confined within the boundaries of general business capabilities (Janssen and Zuiderwijk 2014), (Klievink and Janssen 2009), (The World Bank 2015), (HEDIIP 2016), (HM Government 2013). This research addresses these questions by constructing an open data value capability architecture, which serves as a blueprint to help open data-driven organizations identify, map, plan, and develop their open data value capabilities. To demonstrate and validate the developed architecture, we implemented it in two open data-driven organizations located in Ireland.

We argue that this paper has both theoretical and practical significance. Our intended contribution to theory is a grounded architecture that supports future empirical studies on the impact of open data value capabilities in organizations. For practitioners, the goal is to offer managers of both public and private open data-driven organizations tools to identify, map, plan, and develop open data value capabilities in order to generate both organizational and economic value from open data.

The following section provides the literature background on existing value chain models, general types of organizational capabilities, and data capabilities. Section 3 presents the approach taken to carry out the research. In Section 4, we analyze and categorize the identified open data capabilities and present a synthesis of the open data value capability architecture. In Section 5, we attempt to validate the architecture by presenting its application in two open data-driven organizations. In Section 6, we present a qualitative evaluation of the research against a set of qualitative criteria. Section 7 presents the discussion, and Section 8 concludes the paper by outlining a series of potential opportunities for future research in this area.

2 Literature background

This section develops the literature background for the open data capability architecture in organizations. It examines open data-driven organizations in Section 2.1, value chain models in Section 2.2, a taxonomy of organizational capabilities in Section 2.3, and data capability frameworks in Section 2.4.

2.1 Open data-driven organizations

Organizations have much to gain from open data, from maximizing financial gain to lowering risks (McKinsey Global Institute 2015). Open data promotes innovation and growth by empowering organizations and allowing them to unlock customer value from massive data volumes. Many organizations at the forefront of the data economy have put open data at the centre of their operations, which serves as a testament to the long-term and stable benefits that being open data-driven can bring. According to (Ahmadi Zeleti and Ojo 2014; Ahmadi Zeleti et al. 2014), in open data-driven organizations, data plays a critical role and is considered to be the main resource in making new products and services. By utilizing open data, organizations can find a market niche, identify and capitalize on business opportunities, create new value for themselves, their customers, and their partners, and develop new business models (Ahmadi Zeleti et al. 2016).

In this research, we define open data-driven organizations as non-profit or for-profit organizations that use, produce, or otherwise invest in open data as a key aspect of their operation for generating customer value and achieving the organization’s mission goals. These organizations could be data publishers providing open data for others to use, or they could be users of open data involved in creating data-intensive products and services. In addition, these organizations could be enablers providing training or advisory services to other organizations on how to work effectively with open data.

In open data-driven organizations, having the right data and tools in place does not guarantee value generation from the open data. Generation of value from open data is contingent on fully exploiting available open data value capabilities in place (Ahmadi Zeleti and Ojo 2014).

2.2 Value chain

For a better understanding of the activities through which an organization creates and develops value for shareholders, it is useful to separate organization systems into a series of value-generating activities known as the value chain (Brits et al. 2007). A value chain consists of the stages of the process of creating value for stakeholders (Rayport and Sviokla 1995). As described in (Rayport and Sviokla 1995), the value chain is a model that describes a series of value-adding activities and processes connecting an organization’s supply side to its demand side. Moreover, the value chain offers organizations a means by which they can evaluate both existing and new strategic opportunities to create customer value (Walters and Rainbird 2007).

Given the maturity of research work in the area, three well-known value chain frameworks are highlighted: 1) Porter’s Value Chain, which relies on the concept of the physical value chain of the organization (Bhatt and Emdad 2001), (Porter 1985); 2) Rayport and Sviokla’s Value Chain, which relies on the concept of the virtual value chain of the organization (Rayport and Sviokla 1995); and 3) the Open Government Data Value Chain, which is based on the concept of PSI (Ubaldi 2013), (Carrara et al. 2015).

Below we present the first two value chain models and defer the elaboration of the Open Government Data or PSI Value Chain to Section 4.

Porter’s conceptualization of value chain

Porter’s value chain presented in Fig. 1 is associated with the concept of the physical value chain of the firm (University of Cambridge n.d.) where values of the organization activities are mostly concerned with the physical flow of material (Bhatt and Emdad 2001). Porter’s value chain consists of two sets of activities: primary and secondary activities.

Fig. 1 Porter’s value chain (Pant and Hsu 1996)

Primary activities

These include Inbound Logistics or Input (receiving, storing, and disseminating inputs to the product), Process or Operation (transforming inputs into the final product), Outbound Logistics or Output (collecting, storing, and physically distributing the product to buyers), Marketing and Sales or Share (providing a means by which buyers can purchase the product and inducing them to do so), and Service or Maintain (providing service to enhance or maintain the value of the product) (Finne 1997).

Secondary/support activities

These include Procurement (the function of purchasing inputs used in the organization value chain), Human Resources Management (the recruiting, hiring, training, development, and compensation of all types of personnel), Technology Development (know-how, procedures, or technology embodied in process equipment), and Infrastructure (general management, planning, finance, accounting, legal, government affairs, and quality management, which support the entire chain and not individual activities) (Porter 1985), (Julien 2012), (University of Cambridge n.d.), (W3C Brazil 2012).

Rayport and Sviokla’s conceptualization of value chain

Rayport and Sviokla’s value chain relies on the concept of the virtual value chain of a firm, in which ‘information’ plays a key role. The virtual value chain is about utilizing information to enhance the value chain. Therefore, in the virtual value chain, strategic decisions and activities are built around information (Rayport and Sviokla 1995), (Bhatt and Emdad 2001).

According to Rayport and Sviokla (1995), a virtual value chain consists of five stages: Gathering, Organizing, Selecting, Synthesizing, and Distributing.

Integration of physical and virtual value chains

Porter’s and Rayport and Sviokla’s value chains can also be integrated when an organization aims to adopt both virtual and physical activities for offering customized products and services. The virtual value chain makes a large part of the transactions transparent by providing the organization with customer, supplier, and manufacturer information, while the physical value chain allows the organization to fulfill customer orders and assemble final products and services (Bhatt and Emdad 2001).

In the physical value chain, information performs a support function, but in the virtual value chain, information plays a critical and strategic role.

2.3 Organizational capability types – A general view

Many organizations today wonder what exactly organizational capability means and why it is so important (Brits et al. 2007). While there are different definitions and conceptualizations of organizational capability in the research literature, extensive experience from practice clearly indicates that the concept represents “organization capacity to successfully perform a unique organization activity over a period of time”.

Along this perspective, Brits (Brits 2006) defines capability as a “special type of a resource whose function improves the productivity of other resources”. This implies that resources can represent a cluster of elements that constitute a capability. Also, (Townsend and Cairns 2003) argue that there is a considerable difference between competency and capability. Competency, as it is more regularly defined and theorized, is a term that covers current observable skills based on current knowledge, while capability goes beyond competency. Capability is a more “holistic, broad-based concept that includes the additional elements of values and self-efficacy as core components and it describes how an individual or organization applies their ability in a confident manner to problems in new and unfamiliar circumstances as well as in familiar situations” (Townsend and Cairns 2003). Townsend and Cairns (2003) identified three fundamental organization capability attributes: 1) ability - the current organization competence, 2) self-efficacy - belief in one’s ‘capability’ to perform satisfactorily, and 3) shared appropriate values - sharing values across the organization such as trust and valuing diversity.

In the studies by (Bhatt and Emdad 2001) and (Ahmadi Zeleti and Ojo 2014), three types of organization capabilities were described based on the well-known tenets of the Resource-Based View and Dynamic Resource-Based Theory (Helfat and Peteraf 2003). The Resource-Based View naturally evolved into studying how intangible resources, such as intellectual assets, could be leveraged to accelerate organizational learning and competitive advantage (Oliveira et al. 2002), (Bharadwaj 2000). Dynamic Resource-Based Theory addresses the evolution of these capabilities over time (Helfat and Peteraf 2003). The three capability types are 1) Value capabilities, 2) Competitive capabilities, and 3) Dynamic capabilities (Bhatt and Emdad 2001).

  • Value capability - this includes capabilities that are characterized by value, heterogeneity, and imperfect mobility. Value capabilities include all capabilities that assist an organization in delivering value to its customers. While value capabilities are not the source of competitive advantage, they are necessary to produce customer value. For example, IT infrastructure falls into this type of capability. IT infrastructure has been described as an important organization capability that can be an effective source of value (Bhatt and Emdad 2001), (Bharadwaj 2000), (Ahmadi Zeleti and Ojo 2014). According to Table 1, value capabilities comprise Individual competences, Business process, Organization, IT infrastructure, Technological infrastructure, and Management/Governance. We elaborate on value capabilities in Section 4.3.

  • Dynamic capabilities - these include capabilities required by the dynamic nature of the competitive environment. The concept of dynamic capability reflects the ability of the organization to renew capabilities (integrate, build, and reconfigure internal and external competences (Helfat and Peteraf 2003)) to address rapidly changing environments. Dynamic capabilities capture the ability to search, explore, acquire, assimilate, and apply knowledge about resources, opportunities, and how resources can be configured to exploit opportunities (Bhatt and Emdad 2001). Dynamic capability can also facilitate the branching of other capabilities as it changes the organization’s capabilities. According to (Brits 2006), the differential performance of organisations over time is due to their capacity for the 1) accumulation, 2) deployment, 3) renewal, and 4) reconfiguration of resources in response to changes in the internal and external environment. The dynamic capability view also 5) attempts to explain the process of how capabilities are created, 6) emphasises the strategic value of higher-order resources because of their dynamic nature, and 7) addresses the renewal of core competencies and competitive advantage. For example, Research and Development capability falls into this type of capability (Helfat and Peteraf 2003). Table 1 identifies four specific types of dynamic capabilities. Process innovation is required to improve the processes for the production of a new product or output (Verworn and Herstatt 2002). Knowledge management encompasses identifying and mapping intellectual assets within the organization, generating new knowledge for competitive advantage, making vast amounts of corporate information accessible, sharing of best practices, and technology (Tanriverdi 2005), (Easterby-Smith and Prieto 2008). Manufacturing performance is characterized by the set of practices in use for the manufacturing system (Hallgren 2007). Supply chain integration enables firms to share information with their network of supply partners to create information-based approaches for superior demand planning, for the staging and movement of physical products, and for streamlining voluminous and complex financial work processes (Rai et al. 2006).

  • Competitive capability - this includes capabilities that foster the organization’s competitive advantage and allow organizations to stay competitive. These capabilities also affect future competitive capabilities because of their dynamic and long-term effect (Bhatt and Grover 2005), (Oliveira et al. 2002). For example, IT strategic choices fall into this type of capability. IT strategic choices are a source of competitive advantage because they develop through years of experience and learning by doing (Oliveira et al. 2002). IT experience gives the organization the ability to integrate IT strategy and organization strategy, develop reliable and cost-effective systems for the organization, and anticipate organization needs sooner than competitors (Bhatt and Emdad 2001). Table 1 outlines four types of competitive capabilities. IT strategic choices are the main forces for competitive advantage. IT strategies increase competitive pressure in the marketplace (Xia and King 2002). Manufacturing strategy is a set of strategies that organizations define for the improvement of manufacturing processes and performance (Hallgren 2007). Business operational is the capability of the whole system (organization) to operate locally or globally (Cepeda and Vera 2007).

Table 1 General capability types

In general, capabilities have lifecycles, which indicate the stages for developing a capability over time. According to (Helfat and Peteraf 2003), the capability life cycle has three stages: 1) Founding (capability is identified and starts functioning), 2) Development (capability is developed gradually over time), and 3) Maturity (capability meets its highest level of functionality and impact). Not all capabilities may reach the maturity stage due to poor development. Furthermore, all aspects of capabilities are found in both the Resource-Based View and Dynamic Resource-Based Theory.

2.4 Data capability frameworks

An analytical report by the European Data Portal (Carrara et al. 2015) identifies four areas of capabilities for dealing with data: 1) technical capabilities: the technical skills needed to work with data, such as data processing, data analysis, and data structuring; 2) statistical capabilities: the statistical skills required to work with statistical tools to analyze data, such as cleaning and analyzing the data using different statistical functions; 3) analytical capabilities and personality: the analytical skills needed to obtain valuable insights from data, which is only possible if one can understand the problem and provide an approach to solving it with the use of data; and 4) business insight and domain knowledge: the capability and skill required to understand the market in which the business operates. This allows people working with data to understand the market better, develop good business strategies, and therefore focus on marketing and promoting the products and services they generate by using the data. Moreover, HM Government (HM Government 2013) highlights three overarching aspects of data capability: 1) human capital: a skilled workforce that enables the organization to manage, analyze, interpret, and communicate data to extract insight and value for the organization; 2) tools and infrastructure: as the volumes of and demands for data increase and become more sophisticated, it is vital that organizations develop agile data infrastructure, tools, techniques, and applications to be responsive to changing trends in, and the behavior of, the data industry; data infrastructures are used to process, store, and analyze data; and 3) data re-use: handling, protecting, guaranteeing, and validating data are necessary to best access and use data, to share data across sectors and disciplines for re-use, and to link datasets from various sources. All three overarching aspects are necessary for data capability.

Along this line, Crowston and Qin (Crowston and Qin 2010) present a capability model for data which contains process areas, best practices, and common features. Reliable data organizations have the capability to execute these practices reliably, that is, to perform them in a consistent and predictable fashion. An example of a data capability model is shown in Fig. 2.

Fig. 2 Operationalization of data capability model through goals, key process areas, and key practices

In the data capability model, each process area is described in terms of common features. Common features include (Crowston and Qin 2010):

  • Commitment to perform: the organization has policies regarding the processes and how to perform these processes;

  • Ability to perform: the organization has the capability to perform the processes;

  • Activities performed: the process is performed in practice;

  • Measurement and analysis: the execution of the process is measured and performance is analyzed; and

  • Verifying implementation: before implementation, the quality of processes and activities is assured.
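
As an illustration only, the five common features listed above can be treated as a checklist that is filled in per process area. The following Python sketch is not part of the model in (Crowston and Qin 2010); the class name, field names, and the function unmet_features are assumptions introduced here to show how such a checklist might be recorded and queried.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class CommonFeatures:
    """Checklist of the five common features for one process area.

    Field names are illustrative labels for the features listed in the text.
    """
    commitment_to_perform: bool
    ability_to_perform: bool
    activities_performed: bool
    measurement_and_analysis: bool
    verifying_implementation: bool


def unmet_features(assessment: CommonFeatures) -> List[str]:
    """Return the names of the common features that are not yet satisfied."""
    return [f.name for f in fields(assessment) if not getattr(assessment, f.name)]


# Hypothetical assessment of a single process area.
example = CommonFeatures(
    commitment_to_perform=True,
    ability_to_perform=True,
    activities_performed=True,
    measurement_and_analysis=False,
    verifying_implementation=False,
)
print(unmet_features(example))
# prints ['measurement_and_analysis', 'verifying_implementation']
```

A boolean per feature is the simplest possible encoding; an organization could equally record a maturity level or supporting evidence for each feature.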

A large number of key practices were identified in (Crowston and Qin 2010). Key practices are clustered into four process areas based on the high-level goal the practice helped achieve. These practices are shown below in Table 2.

Table 2 Data capability process areas and practices

Practices presented in Table 2 can also be mapped to the data lifecycle (Higgins 2008) to provide a framework for management of data and help plan activities at more granular levels.

3 Research method and approach

This section presents the research approach, research framework, and research design process guided by the Design Science Research paradigm.

3.1 Research approach

The primary approach employed in developing the open data value capability architecture follows the design science research guidelines and process elaborated in (A. Hevner and Chatterjee 2010). Design science research is fundamentally a problem-solving paradigm. In general, it develops new artifacts and improves the effectiveness and efficiency of existing artifacts in the context of solving real-world organization problems. We selected a design science approach for our research methodology because the aim of this study is to create a new artifact in the form of a design tool that assists both public and private organizations in identifying, mapping, developing, and planning the open data capabilities required to create value from open data. We followed the six main design science activities presented by (Peffers et al. 2007) and (Sonnenberg and Brocke 2012). The implementations of these activities in our work are presented in subsequent sections of this paper. Figure 3 shows the activities.

Fig. 3 Build and evaluate activities of this research (Sonnenberg and Brocke 2012)

Problem identification (motivation)

We specified the problem of the lack of an open data value capability architecture to support and guide open data-driven organizations, justified the value of a capability architecture for open data (the solution), and proposed an approach for developing such an architecture in Section 1.

Define the objectives of a new solution (developing)

We studied open data-driven organizations and open data value capabilities by means of literature review to determine the different aspects of the capability architecture necessary for open data-driven organizations to identify, map, plan and develop value capabilities for generating value from the open data (Section 2).

Design and development

We designed an open data value capability architecture that articulates distinctive elements to guide organizations in mapping, planning and developing open data specific capabilities (Section 4).

Demonstration

For demonstration of the constructed architecture, we implemented the architecture in two open data-driven organizations (Section 5).

Evaluation

To evaluate the architecture, we employed common techniques for evaluation in design science research (A. R. Hevner et al. 2004). We evaluated the architecture based on its perceived utility to the pilot organizations for mapping (identifying existing capabilities in organizations), planning (identifying missing capabilities and priorities), and developing open data capabilities for increased value generation (Section 6).

Communication

To communicate with the research community, the results of the research are to be published in relevant scholarly channels and communities. In addition, this research offers guidance for future actions and research in the domain of open data capabilities (Section 8).
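
The mapping and planning steps named under Evaluation can be read as a comparison between the capabilities an organization requires and those it already has. The Python sketch below is a minimal illustration under that reading; the function name plan_gaps and the numeric priority scheme are assumptions and do not describe the procedure used with the pilot organizations.

```python
from typing import Dict, Iterable, List


def plan_gaps(required: Iterable[str],
              existing: Iterable[str],
              priority: Dict[str, int]) -> List[str]:
    """Return missing capabilities, ordered by priority (lower number first).

    Capabilities without an assigned priority are listed last, alphabetically.
    """
    missing = set(required) - set(existing)
    return sorted(missing, key=lambda c: (priority.get(c, float("inf")), c))


# Capability names are taken from the examples in Section 3.3.2;
# which ones are required or existing is hypothetical.
required = ["sophisticated querying", "data visualization"]
existing = ["data visualization"]          # result of the mapping step
print(plan_gaps(required, existing, priority={}))
# prints ['sophisticated querying']
```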

3.2 Research framework

The research framework employed in this study is based on the design science research framework presented in (A. Hevner and Chatterjee 2010). The framework consists of three blocks: environment, design science research, and knowledge base.

As shown in Fig. 4, the contextual environment for this study is the open data value capability architecture for both public and private organizations. We build the open data value capability architecture based on the analysis of the existing knowledge obtained from literature on open data-driven organizations, PSI value chain, and general business capability.

Fig. 4 The research framework design based on (A. Hevner and Chatterjee 2010)

3.3 Design process

Guided by the research framework elaborated in Fig. 4, the design process proceeded in the following major steps: 1) Literature review (Section 2), 2) Analysis of open data capability areas (Section 4), 3) Synthesis, design and development of the open data value capability architecture (Section 4), and 4) Architecture demonstration and validation: demonstration of the use of the open data value capability architecture in two open data-driven organizations in Ireland (Sections 5 and 6).

3.3.1 Literature review

Our first step in understanding the concepts in the domain and the topic under study was a review of the existing literature. We used keywords such as “open data”, “open data value chain”, “capability building”, “organizational capabilities”, “open data organization”, and “open data capabilities” to search for relevant articles, reports, and other written materials in the domain. Relevant literature on value chain models, capability types, and data capability frameworks was also reviewed. To complement the information gathered during the research, website content on open data capabilities was also considered useful, given the paucity of scholarly work on open data-driven organizations and related organizational capabilities. This facilitates a better description of the capability architecture as a whole and ensures that the architecture provides enough information and is understandable to organizations willing to utilize it.

3.3.2 Analysis

To analyze open data value capability areas and the specific capabilities associated with each area, we first mapped a list of capabilities we considered necessary for utilizing and working with data and then carefully selected and collected these capabilities from the literature. Relying on our domain knowledge, we categorized the collected capabilities based on what each capability aims to achieve. For example, the ‘sophisticated querying’ capability would be considered under the ‘data retrieval’ category. Another example is the ‘data visualization’ capability, which is an essential capability for ‘data processing’. Overall, we identified 45 open data value capabilities for generating value from open data. The 45 capabilities were then categorized into eight open data capability areas.
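
The categorization step can be sketched as a lookup from each capability to the area it serves, followed by grouping. The Python fragment below is illustrative only: it contains just the two capability-to-area pairs named in the text, and the function name group_by_area is an assumption. The remaining capabilities and areas would be entered in the same way.

```python
from collections import defaultdict
from typing import Dict, List

# Only these two pairs are taken from the text; the full list of 45
# capabilities and eight areas is not reproduced here.
CAPABILITY_TO_AREA: Dict[str, str] = {
    "sophisticated querying": "data retrieval",
    "data visualization": "data processing",
}


def group_by_area(capability_to_area: Dict[str, str]) -> Dict[str, List[str]]:
    """Group capabilities under the capability area each one is assigned to."""
    areas = defaultdict(list)
    for capability, area in capability_to_area.items():
        areas[area].append(capability)
    # Sort capability names so the output is stable regardless of input order.
    return {area: sorted(caps) for area, caps in areas.items()}


print(group_by_area(CAPABILITY_TO_AREA))
```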

3.3.3 Synthesis

To synthesize the open data value capability architecture, we combine the six types of generic organizational value capabilities and open government data (PSI) value chain found in the related literature with the eight open data capability areas identified, analyzed, and categorized by the authors (as domain experts). The synthesis and the construction of the architecture are done in a way that the architecture can help to identify and capture different open data value capabilities an organization requires in every stage of the value chain. For example, we would like the architecture to help capture the ‘individual competencies’ at every stage of the value chain. In this example, in the first stage of the value chain, the ‘data generation’, an organization can identify ‘individual competencies’ required for ‘generating data.’
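
One way to picture the synthesized architecture is as a grid in which every value chain stage has a slot for each of the six generic value capability types. The Python sketch below is a hedged illustration of that idea, not the authors' artifact. The six type names follow Section 2.3; the stage list is passed in as a parameter because only the first stage, ‘data generation’, is named at this point in the paper, and the placeholder stage and the recorded capability are hypothetical.

```python
from typing import Dict, List

# The six generic value capability types named in Section 2.3.
VALUE_CAPABILITY_TYPES = [
    "individual competences",
    "business process",
    "organization",
    "IT infrastructure",
    "technological infrastructure",
    "management/governance",
]

Architecture = Dict[str, Dict[str, List[str]]]


def empty_architecture(stages: List[str]) -> Architecture:
    """Create an empty grid: one cell per (value chain stage, capability type)."""
    return {stage: {ctype: [] for ctype in VALUE_CAPABILITY_TYPES} for stage in stages}


def record(architecture: Architecture, stage: str, ctype: str, capability: str) -> None:
    """Record a capability in a cell; raises KeyError for an unknown stage or type."""
    architecture[stage][ctype].append(capability)


# "data generation" is the first stage named in the text; the second
# entry is a placeholder, not an actual stage name from the value chain.
arch = empty_architecture(["data generation", "<later stage>"])
record(arch, "data generation", "individual competences",
       "hypothetical: skills for generating data")
print(arch["data generation"]["individual competences"])
```

Keeping the stage list as an argument means the same structure can be reused with whichever value chain an organization adopts.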

3.3.4 Architecture evaluation

To evaluate the developed architecture qualitatively, this study used qualitative criteria to evaluate the architecture after its use in two open data-driven organizations. The selection of the organizations is based on the following three criteria:

  • The organization is an open data-driven organization. This means that the organization primarily relies on open data as a key resource to achieve its mission goals.

  • The organization is already generating revenue (not an early-stage start-up) and has an identifiable business model.

  • The organization understands the need for developing open data capabilities and appreciates the role that architectures play in developing such capabilities.

While our initial goal was to select more than two organizations for validating the architecture, only two of the organizations that responded met the above criteria. However, we believe that having two relatively mature and profitable organizations for validation is a good first step towards demonstrating the value of the developed architecture. According to (Saunders et al. 2009), the reason for multiple implementations is to establish whether the findings can occur in other organizations and thus provide a basis for generalization. However, our aim in the current study is not to generalise but to better understand how the developed architecture could be exploited in different organizational contexts and to obtain feedback on its perceived utility.

Two organizations, both based in Ireland, agreed to implement and evaluate the architecture: Xpreso, a private organization, and The Marine Institute, a public organization. We organized two separate virtual meetings (GoToMeeting) with the two organizations. Xpreso’s team consists of three core employees working closely with data: the Chief Technology Officer (CTO), a data specialist, and a senior developer. The Marine Institute team comprises two experienced core employees, both at the managerial level, with over 10 years of experience in working with data.

During the GoToMeeting calls, we presented the architecture and described each of its elements to the teams. In addition, we provided the teams with a separate written document describing the architecture in more detail. We used a Google online collaboration tool to allow each team to work with the architecture in a collaborative environment. This also allowed us to view the progress of both teams in real time. All document versions were collected and stored in our local repository so that the information could later be checked for correctness. In addition, we asked both teams to provide as much information as possible about their value capabilities. This allowed the material to be reviewed on demand to ensure accuracy.

4 Synthesis of an open data value capability architecture

This section presents a comprehensive study of the constructed open data value capability architecture. The three areas of focus for building the architecture are: 1) the open government data (or PSI) value chain, 2) open data value capability areas, and 3) types of generic value capabilities (Fig. 5). Below, we briefly describe how these three aspects are related and how they have been constructed to serve and address the needs of open data-driven organizations to identify, map, develop and plan open data value capabilities.

Fig. 5 The three aspects of the open data value capability architecture

As the focus of this study is on the value capabilities of open data-driven organizations, the open government data value chain is used to specify the stages for generating value from open data in these organizations. Each stage of the value chain contains a number of interrelated activities. In this study, these activities are the open data value capability areas, which serve the needs of that particular stage and provide input to the following stages. In other words, the stages of the value chain and the related open data value capability areas provide a step-by-step guide for open data-driven organizations to define the value-generating capabilities required to achieve the goal of each stage as well as that of the whole value chain. To determine what types of capabilities should be identified for each open data value capability area, we use generic value capability types such as individual, business process, and IT infrastructure capabilities. The value chain stages and the open data value capability areas are specific to open data-driven organizations, while the types of value capabilities are common to any kind of organization.

In the following sections, we describe each of the three aspects in more detail. In section 4.1, we describe the open government data value chain reported in open data and open government data literature. In section 4.2, we present the open data value capability areas and capabilities associated with each area. In section 4.3, the six types of general organizational value capabilities reported in the business literature are presented and described. In section 4.4, the synthesis of these three aspects into the constructed open data value capability architecture is presented.

4.1 Open government data (PSI) value chain

An understanding of the value chain is essential to grasp the vital elements of the various activities related to open data (Ubaldi 2013). In an open data context, value is not only financial in nature; it can also be economic, social or political (Guidoin n.d.). By utilizing a value chain, organizations can identify internal and external activities or processes that create value and improve efficiency and effectiveness (Rayport and Sviokla 1995), (Vries 2012).

The value chain model we employed to construct the open data value capability architecture identifies four main phases: Data Generation; Data Collection, Aggregation and Processing; Data Distribution and Delivery; and Final Data Use (Ubaldi 2013). These phases are described in (Ubaldi 2013) as follows:

Phase 1: Data generation

The data generation phase covers all capabilities required for generating data. This phase requires capabilities related to ‘generating data,’ for example, technologies to collect a substantial amount of data.

Phase 2: Data collection, aggregation, and processing

Raw data may not have enough quality and meaning to be used. Thus, data often need to be aggregated, linked, and/or manipulated to add value before being opened and freely distributed. This phase requires capabilities related to ‘data processing’ and ‘data storage and computing facilities’, such as data cleansing, mashing-up, analysis, deletion of invalid or duplicate data, and standardization. Moreover, data storage and computing facilities need to be pooled together for efficient data aggregation and processing.

Phase 3: Data distribution and delivery

Processed data need to be distributed to enable access and re-use. Public sector entities and other organizations are obliged to define precise publishing solutions, provide access to data and APIs, and ultimately release data. This phase requires capabilities related to ‘publishing solution,’ ‘providing access to data and APIs’ and ‘data release,’ for example, publishing as linked data, data exposure via APIs and proactively releasing data.

Phase 4: Final data use

Previously distributed data need to be re-used by different users to sustain public value creation. This phase requires capabilities related to ‘data retrieval’ and ‘data usage,’ for example, guidelines on how to use data and supporting intermediaries.
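
The phase descriptions above can be read as a mapping from each value chain phase to the capability areas it requires. The following is a minimal Python sketch of that mapping, offered only as an illustration. The names `VALUE_CHAIN`, `phase_of` and `downstream_phases` are assumptions introduced here and do not appear in (Ubaldi 2013).

```python
# Value chain phases and the capability areas each phase requires,
# transcribed from the phase descriptions above (Ubaldi 2013).
# The constant and function names are illustrative assumptions.
VALUE_CHAIN = {
    "Data generation": ["data generation"],
    "Data collection, aggregation, and processing": [
        "data processing",
        "data storage and computing facilities",
    ],
    "Data distribution and delivery": [
        "publishing solution",
        "providing access to data and APIs",
        "data release",
    ],
    "Final data use": ["data retrieval", "data usage"],
}


def phase_of(area: str) -> str:
    """Return the value chain phase that a capability area belongs to."""
    for phase, areas in VALUE_CHAIN.items():
        if area in areas:
            return phase
    raise KeyError(f"unknown capability area: {area}")


def downstream_phases(phase: str) -> list[str]:
    """Return the phases that follow the given phase, in chain order."""
    phases = list(VALUE_CHAIN)  # dicts preserve insertion order
    return phases[phases.index(phase) + 1:]
```

For instance, `phase_of("data release")` returns the distribution and delivery phase, and `downstream_phases` lists the later phases that receive its output.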

4.2 Open data value capability areas

To be competitive and generate robust and thriving revenue streams, open data-driven organizations tend to increase efficiency and effectiveness with respect to value-adding processes related to generating, processing and re-using data. To increase efficiency and effectiveness, organizations are required to identify a set of capabilities. A number of open data value capability areas have been identified and mapped from the open data literature: ‘data generation’, ‘data processing’, ‘data storage and computing facilities’, ‘data release’, ‘providing access to data and APIs’, ‘publishing solution’, ‘data retrieval’ and ‘data usage’. These are explained below:

Data generation

This capability is associated with generating new sets of data from existing information, text and other raw data or from any device or software collecting data. This can include data generation from sensors or smart grids (Ferro and Osella 2013), (Carrara et al. 2015), (Zuiderwijk et al. 2015).

Data processing

This capability is associated with processing the generated or the original data to meet its potential purpose of use. Examples include using processing software to mash up original data with other sources of information, harmonizing data for a specific application, and cataloging data to suit the expected need so that the data can be used fruitfully to enhance the organization (Ferro and Osella 2013).

Data storage and computing facilities

This capability is associated with data storage and back-ups (such as storage capacity) and with computing facilities (such as computer hardware or software, computer networks and communications systems, and all networking and communications provision, including connections to external computers). It is essential for an organization to estimate data storage and computing capacity appropriately to ensure data quality.

Data release

This capability is associated with the release of processed data to its users to enable data reuse. It includes capabilities such as data structuring, classification, and regular updating. Data is considered a public good and thus should be released in accordance with the data release regulations of a particular organization (HM Government 2013).

Providing access to data and APIs

This capability is associated with the availability and accessibility of APIs to external users such as developers. It includes capabilities such as API development, data exposure via graphical user interfaces and APIs, and testing and bug fixing. Much remains to be done to make more data and APIs accessible (HM Government 2013).

Publishing solution

This capability is associated with publishing data in compelling formats, which requires appropriate methods and mechanisms. For example, publishing as Linked Data is one publishing solution.

Data retrieval

This capability is associated with data query. This includes extracting the requested data from data storage or datasets. This process requires sophisticated querying and appropriate planning for data retrieval.

Data usage

This capability is associated with enabling and supporting data users such as data intermediaries and developers to be able to use data. Data re-use will enrich the value of data.

Open data value capability areas and capabilities associated with each area are presented in Table 3.

Table 3 Open data value capability areas

4.3 The six types of value capabilities

Value and perceived value have received considerable attention in the organizational science literature. However, the actual definition and assessment of value is seen as a complex task because of the difficulty of identifying and measuring both monetary and nonmonetary benefits and sacrifices. Accordingly, while some researchers define value primarily in monetary terms, others define it in nonmonetary terms to include competitive gains, competencies, social relationships, knowledge, managerial time spent, etc. (Moller and Torronen 2003). In this research, we consider ‘value’ to be both monetary and nonmonetary. The value created is nevertheless bound to vary between organizations due to differences in organizational culture, customers, customer relationships and the value proposition generated (Moller and Torronen 2003).

In this research, as a value chain is involved, we consider general types of value capabilities to build the capability architecture. The general value capabilities are described below.

Individual/competences

In (Jaques and Stamp 1995), the authors define the concept as the extent and complexity of the context within which an individual can operate. An example is the specific employee skills required to perform a specific task.

Business process

A business process is a collection of related, structured activities or tasks that produce a specific service or product for a particular customer or customers. Examples include standardization and harmonization processes and validation and visualization processes (Steiner et al. 1997), (Symphony Technologies Pvt. Ltd. n.d.).

Organization

This refers to the way systems and people in the organization work together to get things done. Examples include collaboration mechanisms, organization-specific competencies (marketing, finance, etc.), employee motivation, efforts towards organizational goals, adaptability and flexibility, and creativity and innovation (Ambrosini and Bowman 2009).

IT infrastructure

IT infrastructure provides an organization with the ability to share information across the organization (Bhatt and Grover 2005). In other words, IT infrastructure is the technological foundation of equipment, computers, communications, data and basic systems used in common across an organization. It includes software (ERP), internal and external network resources (servers and switches) and services (software setup, help desk and computer administration) (Bhatt and Grover 2005), (Mithas et al. 2009), (Gheysari et al. 2012), (Bharadwaj 2000), (Xia and King 2002).

Technological infrastructure

Technology is knowledge, embedded in products and processes, about doing practical things, especially producing things or data. It includes any sensor-based devices, sensing/sensor phones and smart grids (Arnold and Thuriaux 1997), (Gheysari et al. 2012), (Brazilian National Council for Scientific and Technological Development (BNCSTD) 2011).

Management/governance

This is about controlling things (people and resources) and the action of governing the organization. Different management skills or actions might be required for different stages of the value chain. For example, staffing, training programs, compensation, a quick response accepting additional data for advanced features, technical management expertise and managing risks (Ambrosini and Bowman 2009).

Each of the six types of general value capabilities described above is valuable and necessary for an organization. These capabilities collectively provide a more holistic approach to building and managing value chains. The section below presents the synthesis of the open data value capability architecture based on: 1) open government data/PSI value chain phases, 2) open data capability areas and 3) general types of value capabilities.

4.4 Open data value capability architecture

The deconstruction of open data value capabilities is an architecture of strategic management and value chain analysis for open data-driven organizations, which aims at helping open data-driven organizations to identify, map, develop, and plan value capabilities that are required – from generating data to final use and re-use of data – for creating and capturing value from open data.

The architecture is a strategic tool that open data-driven organizations of any scale can apply and exploit. By relying on the open data value capability architecture, organizations are better informed to identify, map, develop, and plan capabilities. As the architecture provides all the essential top-level aspects of the capabilities necessary to generate value from open data, it supports the practical and timely design and identification of capabilities by organizations and allows existing capabilities to be linked to organizational goals. Furthermore, the architecture helps organizations to understand how they are currently functioning and to diagnose strengths and weaknesses in terms of the capabilities available inside the organization and the capabilities that need to be further enhanced and developed. Organizations can therefore draw conclusions for action planning and intervention to address capability shortages within the organizational boundaries.

The open data value capability architecture is constructed based on: 1) open government data or PSI value chain stages, 2) open data value capability areas, and 3) the general value capability types. Figure 6 presents the open data value capability architecture.

Fig. 6 Open data value capability architecture

In utilizing open data value capability architecture, organizations need to identify specific individual, process, organization, IT infrastructure, technological infrastructure, and management capabilities for all value chain phases. The open data-driven organizations can use this as a capability development tool to identify what capabilities are required for the organization. The process is initiated by identifying what capabilities are required for each capability area correlated with the first stage of the value chain. For example, the first value chain stage is Data Generation, and this includes a set of capabilities required for generating data. Managers should identify what individual, process, organization, IT infrastructure, technological infrastructure, and management capabilities are required for generating data. Open data organization management/governance is necessary throughout the value chain to ensure the quality of the process.
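
The identification process described above can be pictured as filling in a worksheet with one cell per combination of value chain phase, capability area and capability type. The following Python sketch is only an illustration under that assumption. The names `PHASE_AREAS`, `CAPABILITY_TYPES`, `blank_worksheet`, `record` and `gaps` are hypothetical, and the placement of the example capability is an assumption made for demonstration rather than an entry taken from either organization's instance.

```python
# Hypothetical worksheet for the architecture: one cell per
# (phase, capability area, capability type). All names are illustrative.
PHASE_AREAS = {
    "Data generation": ["data generation"],
    "Data collection, aggregation, and processing": [
        "data processing",
        "data storage and computing facilities",
    ],
    "Data distribution and delivery": [
        "publishing solution",
        "providing access to data and APIs",
        "data release",
    ],
    "Final data use": ["data retrieval", "data usage"],
}

# The six general value capability types from section 4.3.
CAPABILITY_TYPES = [
    "individual",
    "process",
    "organization",
    "IT infrastructure",
    "technological infrastructure",
    "management",
]


def blank_worksheet() -> dict:
    """Create an empty cell for every phase/area/type combination."""
    return {
        (phase, area, ctype): []
        for phase, areas in PHASE_AREAS.items()
        for area in areas
        for ctype in CAPABILITY_TYPES
    }


def record(sheet: dict, area: str, ctype: str, capability: str) -> None:
    """Add a named capability to the cell for the given area and type."""
    for key in sheet:
        if key[1] == area and key[2] == ctype:
            sheet[key].append(capability)
            return
    raise KeyError(f"no cell for area={area!r}, type={ctype!r}")


def gaps(sheet: dict, phase: str) -> list:
    """List the (area, type) cells of a phase that are still empty."""
    return [
        (area, ctype)
        for (p, area, ctype), caps in sheet.items()
        if p == phase and not caps
    ]


sheet = blank_worksheet()
# Assumed example entry: sensors as technology for generating data.
record(sheet, "data generation", "technological infrastructure",
       "sensor-based devices")
print(len(sheet))                           # 8 areas x 6 types = 48 cells
print(len(gaps(sheet, "Data generation")))  # 5 of 6 cells still empty
```

Listing the empty cells per phase mirrors the diagnostic use of the architecture: cells that remain empty point to capabilities that have yet to be identified or developed.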

5 Application of the open data value capability architecture

We are interested in investigating the application of the open data value capability architecture in practice and in the context of mature organizations. Thus, the architecture was implemented in two open data-driven organizations. The first is a private organization based in Dublin, Ireland, and the second is a public organization based in Galway, Ireland. We begin with each organization’s background, followed by the organization’s open data value capability architecture and the feedback received.

5.1 Background of the organizations

5.2 Xpreso

Founded in 2013, Xpreso is an open data-driven private organization based in Dublin, Ireland, with a focus on a communication platform that connects courier drivers with parcel recipients in real time. The founders of Xpreso consider the business both a data consumer and a data producer. Xpreso employs open data in supporting its primary business. This corresponds to the “support primary business model” described in (Ahmadi Zeleti and Ojo 2014).

Xpreso’s mission is to develop a communication platform that links courier drivers with parcel recipients in real time. The recipient can see the driver’s position on a map, and can also see exactly when their parcel is due to arrive. If they are not home to receive the parcel, they can redirect the parcel to a more convenient location. Xpreso’s vision is both to increase the ease with which people can shop online, and to increase the ease with which drivers can complete their deliveries.

Xpreso provides the following functions for courier delivery drivers:

  • The Xpreso app tracks the driver’s GPS trace as he completes his deliveries

  • The driver can see the position of all of his deliveries on a map

  • The driver’s route is optimized to allow for more efficient delivery schedules

  • Pickup-Requests can be transmitted to the driver ‘on-the-fly’ through the app

The capability architecture was employed and worked on by the CTO and one employee from the research and development department. The outcome of the use of the architecture in Xpreso is presented in later sections.

5.3 The Marine Institute

The Marine Institute with over 10 years of experience in data cataloguing is a State agency responsible for marine research, technology development and innovation in Ireland. The Marine Institute carries out environmental, fisheries, and aquaculture surveys and monitoring programmes to meet Ireland’s national and international legal requirements. The Marine Institute provides scientific and technical advice to Government to help inform policy and to support the sustainable development of Ireland’s marine resource. The Marine Institute aims to safeguard Ireland’s unique marine heritage through research and environmental monitoring. The Marine Institute’s research, strategic funding programmes, and national marine research platforms support the development of Ireland’s maritime economy.

The Marine Institute Act states that the Institute will have the following general functions:

“to undertake, to co-ordinate, to promote and to assist in marine research and development and to provide such services related to marine research and development that in the opinion of the Institute will promote economic development and create employment and protect the environment”.

The Marine Institute provides the following data products and services:

  • ERDDAP - a data server that gives users a simple, consistent way to download subsets of scientific datasets in common file formats and make graphs and maps

  • Irish Spatial Data Exchange (Bootstrap)

  • GeoNetwork (CSW, ISO 19139)

  • Esri REST Services

  • Ireland’s Marine Atlas

  • Marine Data Online (Bootstrap)

The Marine Institute’s unique capabilities are its Geographic Information Systems knowledge and capabilities, as over 90 % of marine data has a spatial dimension and can thus be displayed on a map. The combination of geographic and scientific data creates capabilities in working visually with open data through map-based visualisation.

5.4 Implementations of the open data value capability architectures

We started by providing both organizations with working sheets specifying the purpose of the architecture, the architecture, the metadata of the architecture, the feedback questions, and the organization’s background. Both organizations attempted to understand the architecture, its elements, and the goal it aims to achieve. While there are obvious differences between value chain stages, general value capability types, and open data value capability areas, the metadata helped both organizations to understand the capability architecture as a whole and each of its components specifically. In the metadata, we extensively described each component of the three broad areas (the value chain stages, open data value capability areas, and general value capability types) and their relations. Figure 7 presents a snapshot of the architecture sheet sent to both organizations.

Fig. 7 Snapshot of the architecture sheet

To evaluate the developed architecture, we collected feedback from both organizations. Figure 8 presents a snapshot of the feedback sheet submitted to both organizations.

Fig. 8 Snapshot of the feedback questions sheet

Below, the open data value capability architectures of Xpreso and The Marine Institute are presented.

5.5 Open data value capability architecture - Xpreso

Xpreso made four attempts at mapping its open data value capability architecture using the provided instrument. The following was expressed by Xpreso’s team and observed by us. With regard to the value chain stages presented in the architecture and described in the metadata, Xpreso stated:

“The value chain stages are easy to understand and did fully and logically cover the business steps involved in producing and publishing data.”

We neither observed nor were informed by the Xpreso team of any objections to the value chain stages. Moreover, regarding the open data value capability areas of the architecture, the Xpreso team further stated:

“The areas are clear to understand and did fully cover the data capabilities. There was a clear division between areas”

However, despite the availability of the metadata of the architecture’s components, we observed that Xpreso found it difficult to fully understand the general types of value capabilities. With regard to the technical aspects of the general types of value capabilities, Xpreso added:

“We found it somewhat difficult to differentiate between IT Infrastructure and Technological Infrastructure when filling out the database.”

In addition to the technical aspects of the general value capability types, Xpreso reported that they had difficulty in distinguishing the organizational and management capabilities.

“We found it difficult at first to fill out details for Organisational competences and Management/Governance competences for certain value chain stages. Also, there is a clear division between Organisational competences and Management/Governance competences, but, we found it quite difficult at first to make a decision between the two areas.”

It is equally important to note that Xpreso confirmed the significance of using the capability architecture in open data-driven organizations, especially start-ups, to unleash, generate, and capture the real value from open data. In this regard, the Xpreso team added:

“The architecture allowed us to, for the first time, fully examine the processes and capabilities required to produce and publish the datasets which we are considering, which helped to greatly clarify the requirements of producing such data. It also allowed us to examine our organization from regarding the key business areas described in the architecture, which we found to be a very useful tool in its right.”

Given that Xpreso had not carried out any capability audit in the past, the team found the capability architecture a very useful tool to identify, create, develop and manage open data capabilities.

Table 4 presents Xpreso’s architecture instance, showing the open data value capabilities that allow Xpreso to generate and transfer value to its customers.

Table 4 Xpreso’s open data value capability architecture

5.6 Open data value capability architecture - the Marine Institute

The Marine Institute successfully completed the open data value capability architecture in one iteration. The following was expressed by The Marine Institute’s team and observed by us. With regard to the three broad areas - the value chain stages, open data value capability areas, and general value capability types - used to construct the architecture, The Marine Institute stated:

“The structure of the path in the data publication lifecycle and all the features to include has pleased us about this architecture.”

The Marine Institute team did not raise any issue about the three core dimensions of the architecture - value chain stages, open data value capability areas, and general value capability types - and their relations. Regarding the usefulness and applicability of the architecture, The Marine Institute stated:

“The open data value capability architecture could prove useful to supporting knowledge on organisational open data value capabilities and it is useful in an assessment and understanding of the stages in the open data evaluation cycle”

Similar to Xpreso, The Marine Institute had never carried out any capability audit before engaging in this research, and it found the capability architecture a complete tool to identify, create, develop and manage open data capabilities. In addition, the organization found the elements of the architecture sufficiently complete. However, in terms of increasing the understandability of the concepts used in the architecture, The Marine Institute suggested:

“A description of business processes and some material on open data value chain for government organisations will prove useful”

Table 5 presents the organization’s architecture instance, showing the open data value capabilities that allow The Marine Institute to generate and transfer value to its customers.

Table 5 The Marine Institute’s open data value capability architecture

6 Qualitative evaluation

We used qualitative evaluation criteria to evaluate the constructed architecture. Specifically, we adopted a qualitative model for examining the various aspects of the architecture. This model focuses on research findings, process and pragmatics. We adopted this model with a particular focus on the methods used most extensively for assessing quality in qualitative research (Lin et al. 2007). The three quality criteria selected for our evaluation are presented in Table 6. Given that the research produced a new artefact, the open data value capability architecture, the ‘Finding’ criterion was selected to evaluate the architecture. The ‘Process’ criterion was selected to allow us to evaluate the steps taken to develop the architecture. As we implemented the developed architecture in two open data-driven organizations, the ‘Pragmatic’ criterion was selected to evaluate the practical aspects of this implementation.

Table 6 Quality criteria for evaluation

Grounded in the literature on qualitative research methods and evaluation, we have identified relevant sub-assessment criteria for the above quality criteria. Table 7 presents the qualitative model used to evaluate our open data value capability architecture.

Table 7 Qualitative evaluation of the Open Data Value Capability Architecture

7 Discussion

Past studies on the open data publishing process show that open data publishing requires capabilities for collecting and generating data, processing, securing privacy, and developing standards for publishing and using data. Transformation requires the development of these capabilities, and developing capabilities requires the organization to understand them.

To date, no open data value capability architecture or closely related framework exists in either the open data or the organizational science literature. Thus, a comparison of the developed open data value capability architecture with a similar architecture or framework is not possible. However, a number of organizations claimed to have developed frameworks supporting organizational data capabilities. For example, (Public Bodies Working Group on Open Data 2015) proposed a technical framework that supports the ongoing implementation of the open data initiative and ensures that publication of datasets on the Irish open data portal is done in a consistent, persistent and truly open way. The framework highlights five key components: 1) open data licence, 2) recommended formats, 3) metadata schema, 4) recommended standards, and 5) recommended unique resource identifiers. Moreover, (PWC 2014) developed a data analytics framework consisting of seven key building blocks to identify and release insight from data. The seven building blocks are: 1) discover, 2) assure, 3) describe, 4) predict, 5) empower, 6) embed, and 7) optimise. (PWC 2014) claims that data analytics capability is a critical business discipline that creates the insight necessary to help solve complex business challenges. In addition, (Big Data Partnership 2016) proposed a general data capability framework allowing organizations to execute a big data strategy tailored to maximise value against business objectives. As it is important to have a regular mechanism to validate that the big data strategy is aligned with the business goals, the framework assesses three key themes: 1) technology (does the current technology stack allow the big data strategy to be delivered?), 2) organisation (is the right set of skills in place to support those technologies?), and 3) governance (what capability exists to ensure the programme is successfully delivered, and what processes are in place to maintain the solution, govern the data and manage usage across the organisation?). (Eckartz et al. 2016) also proposed an open data innovation capability framework, which looks at how successful open data re-users create value out of the available data sources. The framework includes three main components: 1) IT capabilities, 2) organizational capabilities, and 3) skills. The study reveals that skills are valued the highest, closely followed by organizational capabilities.

None of the above frameworks captures the essential open data value capability areas. However, in (Ahmadi Zeleti and Ojo 2014), the authors highlighted the importance of studying detailed specification of the architecture elements for the pragmatic or tooling purpose. Therefore, we have sought to analyse this convergence as a form of alignment in which we expect the open data value chain to directly support the open data value capability architecture and consequently shape the value capabilities of open data-driven organizations. The implementation of the developed open data value capability architecture in the two open data-driven organizations shows that it has the potential to significantly impact the identification, mapping, development and planning of open data value capabilities.

Although we have provided an initial list of elements to characterise the architecture, its use by concrete open data-driven organizations would invariably lead to the identification of additional open data value capabilities. In fact, in our study, the implementations in the two organizations produced additional open data value capabilities. The discovered capabilities are associated with individual open data value capability areas. As shown in Table 8, eight new open data value capabilities have been discovered from the two open data-driven organizations: 1) Knowledge of data standards and Data on the Web Best Practices, 2) Knowledge of data value, 3) Data Strategy, 4) Aggregation process (GPS), 5) Database architecture, 6) Knowledge of graph data models, 7) Verifying data integrity, and 8) Web-based front-end.

Table 8 Discovered open data value capabilities

Moreover, based on the feedback collected and the authors’ observations during implementation of the architecture in both open data-driven organizations, we found that the smaller organization had limited knowledge about the general value capability types and had difficulty understanding and distinguishing between Organizational and Management capabilities. This could be attributable to the fact that, in the current open data ecosystem with its increasing number of start-ups and small organizations, management expertise and organizational capabilities are yet to mature, and the required management and organization capabilities are yet to be well defined in these entities. However, bigger organizations such as The Marine Institute have well-established management and organization practices and capabilities at all levels in the organization and therefore have a clear understanding of the architecture developed in this study.

The presented general value capability types are fundamental capability areas in any organization and provide underlying knowledge for management to identify, map, develop and plan the open data value capabilities needed for creating value for customers. These six basic value capabilities are necessary and valuable for any management team. They can be better exploited with strong management experience and organizational knowledge. Developing organizational capabilities in open data-driven organizations should be an ongoing process tuned to leveraging opportunities and addressing challenges in the data-driven production environment.

From the two organizations’ open data value capability architecture instances, we observed that both organizations put more emphasis on ‘generating data,’ ‘data processing’ and ‘publishing solution’ capabilities. This shows that the open data-driven organizations are more eager to develop capabilities which result in generating data, processing data and publishing data. Other open data capability areas are likely to receive less attention. This may be due to a number of reasons, including:

  • Limited open data products and services are being produced, and the real value of open data is yet to be unlocked;

  • Capabilities in the other areas are not fully developed or are yet to be identified by the organizations;

  • Capabilities in the other areas are identified, but domain expertise needs to be strengthened to realize and utilize the capabilities;

  • Capabilities in the other areas are identified but poorly presented, resulting in low-quality open data products and services that receive little or no support from the open data stakeholders;

  • The open data ecosystem is not well understood by the stakeholders, and resource and capability sharing in this ecosystem is fragile and weak;

  • There is no true, common understanding of the ‘value’ of open data or of the ‘value proposition’ of open data products and services, which results in uncertainty in defining appropriate capabilities and in their misuse.

As open data is still in its infancy and its real value is yet to be fully harnessed, the comprehensive study of open data value capabilities and the capability architecture developed in this study arguably advance the domain by supporting open data-driven organizations and start-ups in identifying, mapping, developing, and planning their capabilities.

Finally, we would like to highlight our experience in using the design science research approach to develop the open data value capability architecture. Our experience shows that the method not only enabled a clear, rigorous process for building the architecture but also allowed detailed attention to the two organizations’ needs in performing an open data value capability audit. Moreover, what started as a mission to implement the architecture in organizations also became a learning experience, as the implementing organizations were a rich source of information for identifying other potential value capabilities that enriched and refined the architecture.

8 Conclusion

This study contributes to a better understanding of open data value capabilities and of a value capability architecture based on the integration of the open data value chain, open data value capability areas, and general value capability types. Relying on existing literature, we have framed this convergence as a form of alignment in which the open data value chain directly shapes new open data value capabilities, which in turn deliver open data value to the users of data products and services.

Our study has revealed eight open data value capability areas and capabilities associated with each area. Also, we have developed an open data value capability architecture contributing to both research and practice. Furthermore, our study has identified eight new open data value capabilities: 1) Knowledge of data standards and Data on the Web Best Practices, 2) Knowledge of data value, 3) Data Strategy, 4) Aggregation process (GPS), 5) Database architecture, 6) Knowledge of graph data models, 7) Verifying data integrity, and 8) Web-based front-end.

However, the study suggests that less attention is being paid to building capabilities around ‘data storage and computing facilities’, ‘data release’, ‘providing access to data and APIs’, ‘data retrieval’ and ‘data usage’. More scholarly effort is required to address this gap by identifying the obstacles organizations face and proposing appropriate solutions to help them overcome these obstacles. To generate substantial value from open data, organizations need to fulfil the required capabilities at each stage of the value chain.

Furthermore, our study raises the possibility that start-ups and small organizations may lack sufficient management knowledge of the general value capability types, specifically the management and organizational capabilities they have. Consequently, we identified a need for organizations to strengthen their management and organizational know-how by applying management best practices.

There are at least two limitations to our study: 1) it was not possible to compare the developed architecture with similar frameworks, as there has been no scholarly work hitherto addressing open data value capabilities, and 2) difficulty in engaging both private and public organizations limited the number of implementations available for demonstration and evaluation purposes.

This research suggests future work on refining the open data value capability architecture. Best practices are required for organizations that opt to enhance their internal value capabilities, as well as for developing strategies to expand those capabilities. Future work should also include empirical studies of the architecture.

In addition to serving as an open data value capability planning tool, the developed architecture could also serve as a tool for benchmarking or measuring open data capabilities in organizations.