
3.1 Automatic Mechanism Generation

Historically, many chemical kinetic mechanisms have been the result of extensive and careful development work by teams of experts in particular fields. The manual generation of mechanisms begins with the selection of important species, which usually include not just reactants and products but also important intermediates that are necessary in order to predict the production rates of the key products or other key quantities, for example, ignition behaviour or dynamic features such as oscillations. The types of reactions that can occur between these coupled groups of species must then be specified along with appropriate thermochemical data. Over time, the development of expertise has meant that protocols can be specified for different types of application which indicate the reaction classes that each category of important species can undergo. Typically, even at the mechanism construction stage, certain reaction classes are ignored if their rates are very slow compared to the overall timescales of interest, if they are too endothermic or if they are too complex [e.g. too many bonds are broken or too many products are produced (Yoneda 1979; Németh et al. 2002)]. Pathways to minor products are also often ignored (Saunders et al. 2003a). There are many examples of such protocols.

In atmospheric chemistry, one case relates to the development of the Master Chemical Mechanism (MCM) describing the tropospheric degradation of a wide range of volatile organic compounds (VOCs). Around 135 VOCs are included in the mechanism, and it follows that each may undergo similar degradation pathways, with rate coefficients for each step depending on the structure of the specific chemical species involved (Saunders et al. 2003a; Kerdouci et al. 2014). The protocol begins with the initial reaction of each VOC with the OH radical, NO3, O3 or photolytic initiation. The reaction then continues through a range of intermediates and competitive pathways to final products including CO2. The chemistry along a given degradation pathway is developed until the VOC is broken down into CO2, CO or an organic product which is treated independently elsewhere in the mechanism. A schematic diagram illustrating the main reaction classes is shown in Fig. 3.1. It is easy to imagine that even when considering the oxidation of a single VOC and all its products, the scheme will expand very quickly. As an example, even for butane, the full degradation scheme consists of 510 reactions and 186 species.

Fig. 3.1 A schematic of the mechanism generation protocol employed in the Master Chemical Mechanism development for tropospheric VOC degradation. Reproduced from Saunders et al. (2003a) under the “Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License”

Other examples of such protocols exist in pyrolysis and combustion, where again a whole range of gas-phase organic mechanisms may be present depending on the starting fuel. An example of the likely reaction classes for alkane pyrolysis is given in Table 3.1 and for alkane oxidation in Fig. 3.2. A more detailed discussion of the reaction classes used to describe the oxidation of a range of fuel types is given in Chaps. 2 and 3 of Battin-Leclerc et al. (2013).

Table 3.1 Primary, secondary and tertiary reactions of methane pyrolysis determined by the expert system of Chinnick et al. (1988)

Fig. 3.2 Simplified scheme for the primary mechanism of oxidation of alkanes (broken lines represent metatheses with the initial alkane RH). Reprinted from Warth et al. (2000) with permission from Elsevier

It becomes immediately clear that as the reactants become more numerous (e.g. in the case of tropospheric chemistry) or the starting fuels more complex (in the case of pyrolysis and combustion), the manual construction of mechanisms becomes a daunting task, even where protocols describing key reaction classes exist. For this reason, attempts have been made by different research groups to utilise the expert knowledge available from such protocols within computer codes for the construction of reaction mechanisms. This is still a challenging problem since for a reaction generator to produce a viable set of elementary reactions, it should consider reactions between all combinations of species but never produce the same reaction twice. Such methods must also avoid the possible combinatorial explosion that may exist if all reaction possibilities are considered. The protocols describing viable reaction classes introduced above have a role to play here, since unlikely reaction classes must be excluded at this stage to avoid the mechanism becoming uncontrollably large. It may also be possible to lump together similar species types that undergo the same reaction pathways at the mechanism generation stage in order to limit the size of the final mechanism (Bounaceur et al. 1996; Ranzi et al. 1995, 2001). Species lumping will be described in more detail in Sect. 7.7. The application of the quasi-steady-state approximation (QSSA) has also been tested within the mechanism generation context by the RMG code developed at MIT (Van Geem et al. 2006; Green et al. 2013). The RMG code (Green et al. 2001) was successfully applied for the generation of detailed mechanisms for the combustion of several butanol isomers (Van Geem et al. 2010; Hansen et al. 2011, 2013; Harper et al. 2011) and other chemical systems (Matheu et al. 2003; Jalan et al. 2013).
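The two requirements named above (visit every combination of species, but never store the same reaction twice) can be illustrated with a minimal Python sketch. It is not the algorithm of RMG or of any other code cited here. The helper names `table_rule`, `canonical` and `generate`, the string representation of species and the three toy reaction classes are all assumptions made for illustration; real generators work on molecular graphs and apply far more elaborate rules.

```python
from itertools import combinations_with_replacement


def table_rule(table):
    """Build a toy reaction class from a lookup table (hypothetical helper)."""
    def rule(reactants):
        return table.get(tuple(sorted(reactants)), [])
    return rule


def canonical(reactants, products):
    """Order-independent key: A + B -> C and B + A -> C map to the same entry."""
    return (tuple(sorted(reactants)), tuple(sorted(products)))


def generate(seed_species, rules, max_species=50, max_rounds=10):
    """Apply every rule to every species combination until nothing new appears."""
    species = set(seed_species)
    reactions = set()
    for _ in range(max_rounds):
        pool = sorted(species)
        # Each unimolecular candidate and each unordered pair is visited once.
        candidates = [(s,) for s in pool]
        candidates += combinations_with_replacement(pool, 2)
        new_species = set()
        for reactants in candidates:
            for rule in rules:
                for products in rule(reactants):
                    # A set silently ignores a reaction that is already stored.
                    reactions.add(canonical(reactants, products))
                    new_species.update(set(products) - species)
        species |= new_species
        # Stop at a fixed point, or when the size cap is reached.
        if not new_species or len(species) >= max_species:
            break
    return species, reactions


# Toy reaction classes for methane pyrolysis (illustrative only).
bond_fission = table_rule({("CH4",): [("CH3", "H")]})
h_abstraction = table_rule({("CH4", "H"): [("CH3", "H2")]})
recombination = table_rule({("CH3", "CH3"): [("C2H6",)]})

species, reactions = generate({"CH4"}, [bond_fission, h_abstraction, recombination])
print(sorted(species))
print(len(reactions))
```

Starting from methane alone, the sketch reaches five species and three reactions and then stops because a further round produces nothing new. The `max_species` cap is a crude stand-in for the protocol-based exclusion of unlikely reaction classes that keeps a real mechanism from growing uncontrollably.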

Examples of such expert systems for kinetic reaction mechanism generation are available in many fields of application of kinetic modelling. In pyrolysis and combustion, an early example was developed by Chinnick et al. (1988) based on logical programming. The program was used to develop detailed schemes for the pyrolysis of C1–C4 hydrocarbons, and in general, the schemes compared well with those proposed by human experts. A similar approach was also undertaken by Chevalier et al. (1990) in Stuttgart for the oxidation of higher hydrocarbons. This program also incorporated rate coefficient data for known reactions from kinetic data evaluations with extensions to unknown reactions based on reaction type and species structure, using simple rules such as those described by Atkinson (Atkinson 1986, 1987; Kwok and Atkinson 1995) for rate coefficients in tropospheric chemistry.

An extension to these types of methodologies was developed in Milan as part of the MAMOX code (Ranzi et al. 1995, 2005). Here automatic simplification of the mechanism is incorporated into the generation procedure by considering species isomers with similar kinetic behaviour as a single lumped species (Ranzi et al. 2001). By also lumping parallel reaction pathways for these similar isomers and fitting lumped reaction rates to predictions from the full scheme, large reductions in the size of the generated mechanism can be achieved. The obvious advantage here is in the lower computational requirements of the generated mechanism.
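As a rough illustration of isomer lumping, the Python sketch below merges species that share a lumping label and assigns each lumped species a concentration-weighted rate coefficient. This weighting is only one simple choice and is an assumption here; it is not the fitting procedure used in MAMOX, which fits lumped rates to predictions of the full scheme. The function name `lump`, the species names and all numerical values are hypothetical.

```python
from collections import defaultdict


def lump(species_to_group, concentrations, rate_coeffs):
    """Return {group: (total concentration, weighted rate coefficient)}."""
    totals = defaultdict(float)
    weighted = defaultdict(float)
    for sp, group in species_to_group.items():
        c = concentrations[sp]
        totals[group] += c
        weighted[group] += rate_coeffs[sp] * c
    # Assumes every group has a non-zero total concentration.
    return {g: (totals[g], weighted[g] / totals[g]) for g in totals}


# Two butyl radical isomers treated as one lumped species (made-up numbers).
groups = {"1-C4H9": "C4H9", "2-C4H9": "C4H9"}
conc = {"1-C4H9": 1.0, "2-C4H9": 3.0}
k = {"1-C4H9": 2.0, "2-C4H9": 4.0}

lumped = lump(groups, conc, k)
print(lumped)
```

With these inputs the lumped species has a total concentration of 4.0 and a lumped rate coefficient of 3.5, so one species and one reaction replace two of each.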

Other examples in combustion include REACTION developed by Blurock (Blurock 1995, 2004a, b, c; Moreac et al. 2006; Mersin et al. 2014) and EXGAS, first introduced by Côme et al. and continuously further developed by subsequent researchers in Nancy (Warth et al. 2000; Glaude et al. 2000; Battin-Leclerc et al. 2000, 2008). The EXGAS system is in fact part of a comprehensive modelling system which also includes a kinetic database and programs for the estimation of thermochemical parameters (Muller et al. 1995). Over time these types of programs have become more sophisticated and are now able to deal with wider classes of fuels than simple alkanes. Recent applications, for example, have extended to heavy alkanes (Ranzi et al. 2004, 2005; Buda et al. 2005; Biet et al. 2008), oxygenated species (Glaude et al. 2010; Hakka et al. 2010), biomass fuels (Rangarajan et al. 2010) and aqueous phase oxidation (Li and Crittenden 2009). A review of the principles involved and the features of the different systems is provided in Pierucci and Ranzi (2008). Liu et al. (2012) developed an n-decane combustion mechanism using the automatic generation program ReaxGen and generated various skeletal mechanisms using the Directed Relation Graph (DRG) method (see Sect. 7.5).

Matheu and Grenda (2005a, b) applied the mechanism generation tool XMG-PDep (Grenda et al. 2003; Matheu et al. 2003) to high-conversion, pyrocarbon-depositing ethane pyrolysis. The code generated the reaction pathways governing the observed minor products acetylene, propylene, 1,3-butadiene and benzene. They also investigated the effects of large groups of radical disproportionation reactions, omitted reaction families, and the possibility that pressure changes in the reactor could alter the distribution of the deposition precursors.

The “reaction classification using automated reaction mapping” (RCARM) code (Kouri et al. 2013) allows the classification of a specific reaction step (taken from either a manually or automatically generated reaction mechanism) into a particular reaction class, such as hydrogen abstraction or beta scission. The authors developed 29 simple classification rules, 20 complex (well-skipping) classification rules, and four second-stage classification rules. The subdivision into classes allows the kineticist to check the completeness of the reaction steps within a mechanism and the consistency of rate coefficient assignments. Inspection of the members of a particular class might also help to identify a missing reaction. A detailed discussion of the automatic generation of reaction mechanisms in combustion is given in Blurock et al. (2013).

In atmospheric chemistry the protocols developed for the generation of the MCM have also been incorporated into an expert system by Saunders et al. (2003b). This approach also uses simplification rules to avoid the explosion of species and reaction numbers. Here, peroxy radical species are lumped and the possible reaction classes are restricted. The MCM, however, avoids the lumping of primary VOCs and for the most part remains an explicit, detailed mechanism. The approach taken by Fish (2000) was to incorporate primary species lumping into the mechanism generation procedure for a gas-phase tropospheric scheme. Lumping based on functional groups was used, following an approach developed for atmospheric mechanisms by Gery et al. (1989) and also used in the CHEMATA mechanism generation code (Kirchner 2005). In this approach, each carbon atom is given a type depending on the number of carbon atoms to which it is bonded and a status depending on its functional group. The program then uses structure–activity relationships to generate rate coefficients for the lumped groups but tracks the fraction of the original VOCs within the lumped quantities. The intended use of the mechanism should determine which approach is the most suitable. For the detailed calculation of chemical products and intermediates, an explicit mechanism like the MCM may be more suitable, but for use in computationally expensive reactive transport codes for tropospheric pollution, the generation of an already lumped mechanism could be necessary in order to restrict simulation times to a manageable size.
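The carbon typing step described above can be sketched in a few lines of Python. The sketch assumes that a molecule is supplied as a list of carbon–carbon bonds plus a functional-group label per carbon atom. The function name `classify_carbons`, the type names and the status labels are illustrative assumptions and do not reproduce the data structures of CHEMATA or of the scheme of Gery et al. (1989).

```python
from collections import Counter

# Type is set by the number of carbon neighbours (names are illustrative).
TYPE_NAMES = {0: "isolated", 1: "primary", 2: "secondary",
              3: "tertiary", 4: "quaternary"}


def classify_carbons(carbon_bonds, status):
    """Map each carbon atom to a (type, status) pair."""
    neighbours = Counter()
    for a, b in carbon_bonds:
        neighbours[a] += 1
        neighbours[b] += 1
    # Counter returns 0 for a carbon atom with no carbon neighbours.
    return {atom: (TYPE_NAMES[neighbours[atom]], label)
            for atom, label in status.items()}


# 2-propanol: C1-C2-C3 with the OH group on C2 (hypothetical status labels).
bonds = [("C1", "C2"), ("C2", "C3")]
status = {"C1": "alkyl", "C2": "alcohol", "C3": "alkyl"}

typed = classify_carbons(bonds, status)
group_counts = Counter(typed.values())
print(typed)
print(group_counts)
```

For 2-propanol this gives two primary alkyl carbons and one secondary alcohol carbon. Tallies of this kind, summed over all VOCs in a mixture, are the sort of lumped quantity for which a generation code could then estimate rate coefficients.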

The heuristics-aided quantum chemistry (HAQC) methodology of Rappoport et al. (2014) shows many similarities to those previously mentioned, and it has been used for the generation of detailed reaction mechanisms of organic chemistry transformations.

In the field of bioinformatics, the automatic generation of mechanisms describing, for example, metabolic or signalling pathways is also a rapidly growing field. The level of complexity here may even outweigh that discussed above for tropospheric or complex fuel combustion mechanisms since the number of nodes in a human molecular network may be of the order of thousands if all genes, RNAs, proteins, etc. are taken into account (Rzhetsky et al. 2004). The review of Maria (2004) provides a useful discussion of model formulation issues for chemical and biochemical systems. The issue of how to formalise knowledge and develop a consensus view on the dominant reaction types in molecular networks in such a rapidly developing field seems to be critical. A novel approach taken by Yuryev et al. (2006) and Rzhetsky et al. (2004) is the development of methodologies to extract and formalise knowledge about molecular interaction networks using a network database extracted from scientific literature and to use the knowledge for the generation of reaction pathway models. For example, the GeneWays system (Rzhetsky et al. 2004) attempts to extract information on relationships between substances or processes with application to signal transduction pathways and represent them as directed relation graphs (more discussion on the use of reaction pathways and directed relation graphs for model reduction can be found in Chap. 4 and Sect. 7.5, respectively). This type of method represents a stochastic approach rather than a set of protocols and data developed by careful experts (such as in data evaluations). Inconsistencies between data are not handled in the same way as they would be within formal evaluation approaches. Rather, the GeneWays platform aims to use multiple sources of information from the open literature and also to allow researchers to query, review and critique the information, thereby aiming to develop a consensus view over time.

Similar systems are also developing in the bioinformatics area as reviewed in de Jong (2002). An example is KEGG (Kyoto Encyclopedia of Genes and Genomes) which provides an integrated database including metabolic pathway maps, drug components, complete and draft genomes, chemical compounds, chemical pathways and reaction classes (Kanehisa and Goto 2000). KEGG is a computational representation of a biological system based on graph theory, with each node of the graph representing an object from molecular to higher levels. Examples of objects include enzymes, compounds, genomes, etc. The edges of the graph represent biological relationships at many levels but may include, for example, metabolic or transcription pathways. The aim is to link a specific set of genes with “a network of interacting molecules in the cell, such as a pathway or a complex, representing a higher order biological function” (Kanehisa and Goto 2000) and therefore to simulate several levels of the timescale hierarchy as was also attempted in the E-CELL software environment (Tomita et al. 1999). Part of the aim of the KEGG project is to develop the equivalent of the mechanism construction protocols we saw earlier for purely chemical mechanisms, by incorporating and developing reference pathways (Karp et al. 2000) or similarities between the pathways of similar groups of organisms. A final goal could be the analysis of network–disease and gene–disease associations, and the exploration of the interactions with available drugs (Kanehisa et al. 2010). However, in common with other complex modelling systems found in combustion, pyrolysis and atmospheric chemical kinetics, the uncertainties present in pathway descriptions of biological systems as well as the kinetic parameters used will be large (Wiechert 2002). According to Wiechert, even a consistent and complete data set for the central metabolic pathways of E. coli K12 is a significant challenge.

In all application areas, software tools for mechanism/model construction have already proved to be extremely useful, but there are some potential penalties associated with the resulting ability to increase model complexity. If our ability to accurately specify data for the huge number of pathways involved does not keep pace with the growth in model complexity, then the number of uncertainties contained within the models may grow. It will not therefore be guaranteed that the resulting model is robust enough to use, for example, within an engineering design or atmospheric policy assessment context.

The use of reaction classes can, to a certain extent, help to reduce the burden of quantifying parameters within large mechanisms, by allowing the estimation of rate constants using general physical and chemical principles (Olm et al. 2014). For example, detailed experimental data may be available which quantify the rate coefficients of some reactions within a reaction class. Data for other reactions within the class can then be estimated based on the fact that the species involved in the reaction will contain the same functional groups as those for which detailed information is available (Atkinson 1986, 1987; Kwok and Atkinson 1995). New experimental data may not therefore be needed in order to make reasonable estimates of large numbers of reaction rates within automatically generated mechanisms. In addition, sensitivity analysis methods can provide an essential tool in helping to establish which assumptions have the largest influence on predicted model targets, thus allowing model improvement efforts to focus on a smaller number of parameters within the mechanism as discussed in Chap. 5.
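The idea of estimating a rate coefficient from functional groups can be sketched as a sum of group contributions, each scaled by factors for the neighbouring substituents. The Python sketch below follows the general shape of such structure–reactivity estimates. The group rate coefficients and substituent factors are round placeholder numbers chosen for illustration; they are not the values recommended in the cited evaluations, and the names `GROUP_K`, `SUBSTITUENT_F` and `estimate_k` are assumptions.

```python
from math import prod

# Placeholder group rate coefficients (arbitrary illustrative values,
# nominally in units of 1e-12 cm3 molecule-1 s-1).
GROUP_K = {"primary": 0.1, "secondary": 1.0, "tertiary": 2.0}

# Placeholder factors for the substituents attached to each group.
SUBSTITUENT_F = {"CH3": 1.0, "CH2": 1.2, "CH": 1.2, "C": 1.2}


def estimate_k(groups):
    """Sum group rate coefficients, each scaled by its substituent factors."""
    return sum(GROUP_K[kind] * prod(SUBSTITUENT_F[s] for s in subs)
               for kind, subs in groups)


# n-butane, CH3-CH2-CH2-CH3: two primary and two secondary groups.
n_butane = [
    ("primary", ["CH2"]),
    ("secondary", ["CH3", "CH2"]),
    ("secondary", ["CH2", "CH3"]),
    ("primary", ["CH2"]),
]

k_butane = estimate_k(n_butane)
print(round(k_butane, 2))
```

With the placeholder values the estimate is 2 × 0.1 × 1.2 + 2 × 1.0 × 1.0 × 1.2 = 2.64 in the nominal units. The point of the sketch is the bookkeeping: once the group values for a reaction class are known, any species built from the same groups can be assigned an estimate without new measurements.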

3.2 Data Sources

In order to construct a chemical mechanism composed of its elementary reactions, it is of course necessary to provide thermodynamics and reaction kinetics parameters for the component species and reaction steps, respectively. A huge part of chemical kinetics is the determination of such parameters via a variety of methods such as functional fitting to fundamental experiments, theoretical calculations based on quantum chemistry, reaction rate or transition state theory (Pilling and Seakins 1995; Miller et al. 2005; Pilling 2009), estimations using thermochemical rules (Benson 1976), the use of functional group trees (Green 2007) and the use of the structure–reactivity approach. The latter was proposed by Atkinson (Atkinson 1986, 1987; Kwok and Atkinson 1995) for the calculation of rate coefficients for the gas-phase reactions of the OH radical with organic compounds.

Historically, the use of the law of mass action was first attempted to give a representation of the rate of a global reaction, that is, when the primary reactants are assumed to immediately form the final products. However, this was followed by the subsequent realisation that the behaviour of a reactive system was controlled by a number of reaction steps with reaction intermediates playing a key role as discussed in Sect. 2.1. Experimental and theoretical studies were then performed to determine the rate coefficients for individual reaction steps motivated by a number of different application fields.

In gas-phase combustion kinetics, the development of chemical mechanisms was driven by the need to understand the behaviour of automotive engines and other combustion devices such as gas turbines. Initially, mechanisms were developed for relatively simple chemical processes such as the oxidation of hydrogen and of small hydrocarbons such as methane. The push now is towards complex kinetic mechanisms which mimic the behaviour of larger hydrocarbons (Battin-Leclerc 2008) and real fuels such as diesel (Westbrook et al. 2006), kerosene (Dagaut and Cathonnet 2006; Dagaut and Gail 2007; Honnet et al. 2009) and biofuels (Westbrook et al. 2011; Ramirez et al. 2011). Consequently, the size of available mechanisms has grown, as exemplified by a recent mechanism describing the oxidation of the biodiesel surrogate methyl decanoate involving 3,012 species and 8,820 reactions (Herbinet et al. 2008). Similar developments have taken place within atmospheric chemistry with the Master Chemical Mechanism describing the gas-phase chemistry of the troposphere including around 5,900 species and 13,500 reactions (Saunders et al. 2003a).

An important question arises, which is how the parametric data contained in such complex mechanisms are obtained. It is not the purpose of this text to cover the fundamental methods of chemical kinetics since there are many excellent existing reviews of this topic (Pilling and Seakins 1995; Miller et al. 2005; Pilling 2009). However, we summarise here some useful resources which may be employed in the development and parameterisation of chemical mechanisms.

Currently many of the elementary reaction steps and corresponding reaction rate parameters included in kinetic mechanisms can be found in online chemical kinetic databases such as that available from NIST (Manion et al. 2013). In many cases, published rate data has been critically evaluated by a panel of experts using available information regarding each elementary step [see e.g. Baulch et al. (1992, 1994, 2005); Atkinson et al. (2004, 2006, 2007, 2008); IUPAC 2014]. Such evaluations not only provide recommended expressions for the temperature and pressure dependence of rate coefficients, but also often give some quantification of the degree of confidence that can be placed in the predicted values over a given temperature range. These are perhaps a better source where available, although such evaluations may not always contain the most recent data. The advantage of evaluations where they do exist is that in many cases there are enough separate studies to allow quality assigned error limits to be defined for the reactions considered. This provides a useful starting point for overall model uncertainty evaluations which will be discussed further in Chap. 5.

For more recent and complex mechanisms, the fact is that despite the best efforts of experimental and theoretical kineticists, a large proportion of the elementary steps will have never been studied individually and are likely to be deduced from similar reactions or by kinetic methods such as those proposed by Atkinson (Atkinson 1986, 1987; Kwok and Atkinson 1995). Such approximation methods are unlikely to achieve the same degree of accuracy as fundamental theoretical or experimental studies. However, we will see later in Chap. 5 that the methods of uncertainty and sensitivity analysis can aid the process of important parameter identification, so that strongly influential parameters from this estimated group can be targeted by further kinetic studies.

As well as rate coefficient information, thermodynamic data are required for the description of many chemical systems. A number of software packages are available to calculate thermodynamic data such as THERM (Ritter and Bozzelli 1991) or THERGAS (Muller et al. 1995). NASA polynomials are often used as a starting point for the calculation of thermodynamic properties (see Sect. 2.2.3) and have been made available for many years via the database of Alexander Burcat (Burcat 1984; Burcat and Ruscic 2005; Burcat) as well as in recent evaluations (Ruscic et al. 2003).
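As an illustration of how NASA polynomials are used, the Python sketch below evaluates the molar heat capacity, enthalpy and entropy from one set of coefficients. It assumes the common seven-coefficient form over a single temperature range; tabulated data normally supply separate coefficient sets for a low and a high temperature range. The function name `nasa7` and the example coefficients (a hypothetical monatomic ideal gas with an arbitrary entropy constant) are assumptions and are not taken from any database.

```python
import math

R = 8.314462618  # molar gas constant, J mol-1 K-1


def nasa7(a, T):
    """Return (cp, H, S) in J mol-1 K-1, J mol-1 and J mol-1 K-1."""
    cp_over_R = a[0] + a[1]*T + a[2]*T**2 + a[3]*T**3 + a[4]*T**4
    h_over_RT = (a[0] + a[1]*T/2 + a[2]*T**2/3 + a[3]*T**3/4
                 + a[4]*T**4/5 + a[5]/T)
    s_over_R = (a[0]*math.log(T) + a[1]*T + a[2]*T**2/2
                + a[3]*T**3/3 + a[4]*T**4/4 + a[6])
    return cp_over_R * R, h_over_RT * R * T, s_over_R * R


# Hypothetical monatomic ideal gas: cp = 2.5 R at all temperatures.
# a[5] is chosen so that H(298.15 K) = 0; a[6] is an arbitrary placeholder.
coeffs = [2.5, 0.0, 0.0, 0.0, 0.0, -745.375, 4.0]

cp, h, s = nasa7(coeffs, 298.15)
cp_hot, h_hot, s_hot = nasa7(coeffs, 1000.0)
print(cp, h, s)
```

For this constant heat capacity case, the enthalpy rise between 298.15 K and 1000 K equals 2.5 R × (1000 − 298.15), which gives a quick consistency check on the integration constants.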

Some interesting issues emerge in reviewing the field of mechanism construction. For a given application, several mechanisms may exist which may or may not share common reaction steps and may or may not share common data. Whilst evaluated data exists for some reactions/pathways for well-established applications, for newly emerging fields such as alternative fuel combustion or bioinformatics, differences between data parameterisations within mechanisms constructed to describe the same chemical processes may still be present. In fact, several models with quite different parameterisations could be capable of making very similar predictions of key target outputs (see Chap. 8 for further discussion of this point). Over time, and as more detailed kinetic data becomes available, different mechanisms formulated to describe the same chemical processes should start to converge towards similar parameterisations. Collaborative working may assist this process.

Opportunities for collaborative working clearly exist and have recently been explored within the web-based PrIMe (Process Informatics Model) informatics system within the field of combustion (Frenklach et al. 2004). PrIMe aims to offer a system which not only collects and stores data but also includes a platform to assist in the validation of the data as well as the quantification of data uncertainties (Seiler et al. 2006). This approach is called “data collaboration”. The system can then be used to compile predictive models from the data for specific applications and to quantify predictive uncertainties (Feeley et al. 2006) (see Chap. 5 for a full discussion of uncertainty analysis). The aim is to use all available data, including evaluated consensus values, as well as data which differs from the agreed consensus. For the system to be successful, it relies upon engagement from the community in terms of supplying data, and model construction and evaluation tools. At the moment it is probably fair to say that within combustion, many groups are still working with individually developed mechanisms which they may update periodically. The advantages that could be gained from better collaborative working have perhaps not been fully exploited.