Introduction

Not only is cellulose arranged in small crystals, but there are several different crystal forms (polymorphs) that depend on the history of the sample. Diffraction studies, which depend on more or less periodic arrays of atoms, are one of the most common analyses conducted on cellulose. These studies are used for many purposes, ranging from refinement of atomic positions and delineation of the hydrogen bonding systems to routine determination of the polymorph or degree of crystallinity.

The detailed refinements of atomic positions based on more than a hundred diffraction intensity values taken from fiber diffraction patterns (Nishiyama et al. 2002, 2003; Langan et al. 2001; Wada et al. 2004, 2009) have been of great benefit to the knowledge of cellulose. By replicating the unit cells of these crystals, three-dimensional models of the crystals can be readily constructed and subjected to calculations of various sorts, allowing comparisons of observed experimental properties with those calculated on the basis of the model cellulose structures. For example, vibrational and NMR spectra can be calculated (Kubicki et al. 2013), perhaps for validation or attempts at resolution of the structure of remaining non-crystalline material. Another use of such models would be for calculation of various mechanical properties such as deformability that allow determination of ideal values (Wohlert et al. 2012). Such calculations, however, can be regarded as topics for specialists who are expected to be competent users of the various tools for manipulating data from crystal structure studies.

More often, perhaps thousands of times per year, powder diffraction data are collected from cellulose for determination of the sample’s specific polymorph or its degree of crystallinity. The resulting information provides important characterization and information on the consequences of various applied treatments. These analyses are conducted with varying levels of effort and understanding. One task that often accompanies these analyses is the assignment of Miller indices to the various peaks. Assignment is complicated to a degree because various conventions have been used over the years to designate the unit cell dimensions. Early work (Meyer and Misch 1937) proposed a unit cell for cellulose I with a = 8.35 Å, b (fiber axis) = 10.3 Å, c = 7.9 Å and a monoclinic angle of β = 84°. However, perhaps to facilitate comparative discussions of polymers that have several space groups, the molecular axis for cellulose is now considered to be c. In both small molecule and polymer crystallography, the convention now is to use an obtuse monoclinic angle (γ for polymers), but early cellulose structures were instead based on an acute β angle. Even within the convention with an obtuse monoclinic angle and c as the fiber axis, Gardner and Blackwell (1974) defined their unit cell with a = 8.17 Å and b = 7.86 Å, whereas the work of Woodcock and Sarko (1980) used 7.78 Å for a and 8.20 Å for b. These differing conventions result in varying Miller indices being assigned to most peaks on the diffraction pattern. From an editor’s perspective, both authors and reviewers have been caught up in these varied conventions, to the extreme point that reviewers have demanded rejection of a manuscript because the authors had used a convention different from the one that the reviewers had seen somewhere along the way. Some current submissions to Cellulose and other journals regarding polymorph identification and crystallinity measurements continue to base their Miller indices for cellulose I on the conventions used by Meyer and Misch, while some others use the also obsolete one of Gardner and Blackwell.

It would seem to be an improvement if there were adoption of a single convention by both the fiber and powder diffraction communities, with the understanding that the selection of convention is a choice and that other choices were made in the past. Perhaps a more important consequence from standardizing on one convention that is used in both routine powder patterns and advanced fiber crystallography is that it expedites conversations between those two spheres of scientific endeavor. Zugenmaier (2008) has seconded our earlier proposal (French et al. 1987; French and Howley 1989) that shows the unit cell with the c-axis vertical, the a-axis directed towards the viewer and the b-axis towards the right, with a being shorter than b. In turn that was grounded in statements by Klug and Alexander (1974) and used by Woodcock and Sarko (1980).

In the past few years, we (Nishiyama et al. 2012) have taken advantage of the availability of the x, y, and z atomic coordinates from the crystal structures to calculate both powder and fiber diffraction patterns for cellulose. These calculations were based on different sizes of crystals, either from the coordinates of one asymmetric unit in the unit cell, using the Mercury program (Macrae et al. 2008) or from crystal models that had various shapes, sizes and amounts of water or deviation from a perfect lattice that resulted from molecular dynamics studies. The latter calculations were carried out with either the Debyer software for powder patterns (Wojdyr 2011), or custom software by Dr. Nishiyama for fiber patterns (Nishiyama et al. 2012). These efforts are beginning to provide an atomistic visualization of cellulosic materials.

The present work is based only on the Mercury 3.0 program, which is available in both free download and full-capability versions (Macrae et al. 2008). This program requires the unit cell dimensions and the fractional atomic coordinates of the asymmetric unit to instantly produce a powder pattern. The primary goal here is simple, namely to present the calculated diffraction patterns from the different polymorphs with the recommended indexing. With an input peak width at half maximum height (pwhm) of 1.5° 2θ, a pattern can be calculated that resembles an experimental pattern from a fairly crystalline sample of practical interest. By also using Mercury’s default pwhm value of 0.1° 2θ, a pattern is calculated that is much sharper than will be attained with any cellulose sample, but it resolves the peaks to show them mostly without overlap. Powder diffraction patterns are sometimes subjected to deconvolution during analysis, with arbitrary choices of the contributing peaks. A secondary goal of the present work is to show, based on the calculated patterns having narrow peak widths, which peaks should be considered during deconvolution.

Another facility of the Mercury program (paid licenses only) is to account for preferred orientation of the sample. Because of the aspect ratio of cellulose fiber fragments that are used as samples for powder diffraction, it is difficult to avoid some degree of preferred orientation. Therefore, patterns were calculated both with and without preferred orientation along the c-axis. Not only is preferred orientation difficult to avoid in experiments on cellulose, but it was originally recommended when calculating the Segal crystallinity index analysis (Segal et al. 1959; French and Santiago Cintrón 2013). In the latter paper, the pattern from the oriented cellulose Iβ sample was indicated to have a slightly higher Segal Crystallinity Index than for the random pattern. None of the information herein is particularly novel or unique, but the idea here is to present a practical visual indication of the Miller indices of the often overlapped peaks and their approximate proportions.

Input information

Crystal information files (.cif) for cellulose Iα and Iβ were obtained from the Supplementary Information accompanying the original reports (Nishiyama et al. 2002, 2003). Those files contain both X-ray and neutron structures. (These structures are also found in the Cambridge Crystallographic Database with Refcodes PADTUL and PADTUL01 (cellulose Iα) and JINROO01 and JINROO02 (cellulose Iβ).) The X-ray coordinates have no hydroxyl hydrogen atoms present and the neutron diffraction coordinates have deuterium atoms in the positions of the hydroxyl hydrogen atoms. The Supplementary Information includes modified .cif files for cellulose Iα and Iβ that eliminate the coordinates from the X-ray study and other information not needed for the calculation of powder patterns. The deuterium atoms of the neutron studies are renamed as hydrogen atoms. That was done to avoid any effect on the pattern from the presence of deuterium. Because the original .cif for Iα was reported with a as the fiber axis (the main original article reported c as the fiber axis), the revised .cif file in Supplementary Information has coordinates that were transformed so that c is the fiber axis. If the .cif file were to be used for constructing models to be studied with energy calculations, only the A or B scheme hydrogen atoms should be used. For calculations of the X-ray pattern, both the A and B atoms should be used. Small differences can be observed in the calculated patterns when they are not included.

A .cif file for the cellulose II structure (Langan et al. 2001) was kindly sent by Dr. Langan and a shortened version is supplied in Supplementary Information. The .cif files for cellulose III_I and III_II were created manually from the published coordinates (Wada et al. 2004, 2009, respectively) and the minimum of information needed for a functional .cif. Those files are also provided as Supplementary Information.

The default calculated powder patterns were customized with a CuKα wavelength of 1.5418 Å, typically used in powder diffractometers. Pwhm values of 0.1 and 1.5° 2θ were used. The intensity vs. 2θ data were saved and re-plotted with plotting software to make combined plots. The cropped bitmap image of the expected peak positions from the Mercury powder plots was copied and pasted onto the plots in a drawing program. Miller indices were obtained from a list of reflections produced by Mercury and manually added to the expected peak positions. Preferred orientation was also part of the customization process in Mercury, with a March-Dollase factor of 1.8 (Dollase 1986) applied to the (001) plane for the monoclinic structures. In the case of triclinic Iα, specification of the (001) plane resulted in increases in many of the intensities relative to the main (110) peak. Earlier versions of this paper attempted to explain this behavior that was opposite to the results for the monoclinic structures. Ultimately it was realized that the problem was that the (001) plane was not normal to the molecular axis for the triclinic structure, unlike the situation for the monoclinic structures. Specifying the (001) plane in the Mercury customization data window did not therefore result in preferred orientation of the fiber axes as intended. Instead, the plane to specify was (11–4), which has an interplanar spacing of 2.596 Å, very close to c/4. Thus, the only preferred orientation modeled was for fibrous samples where the fiber axes were in a plane parallel to the sample surface, and the crystallites were randomly oriented about the fiber axes. Samples of bacterial cellulose, for example, can have additional orientation about the fiber axis, causing the (100) peak to be nearly absent from Iα patterns.
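The effect of the March-Dollase factor can be illustrated with a minimal sketch. It assumes the standard March-Dollase weighting (Dollase 1986), P(α) = (r² cos²α + sin²α / r)^(−3/2), where α is the angle between the normal of a reflecting plane and the normal of the specified preferred-orientation plane. Whether Mercury applies exactly this form, and how it normalizes the result, is an assumption here; the function name is hypothetical.

```python
import math


def march_dollase(alpha_deg, r):
    """March-Dollase weight for a reflection whose plane normal makes an
    angle alpha (degrees) with the preferred-orientation direction.

    Assumed form: P = (r^2 cos^2(alpha) + sin^2(alpha) / r) ** (-3/2).
    """
    alpha = math.radians(alpha_deg)
    return (r ** 2 * math.cos(alpha) ** 2 + math.sin(alpha) ** 2 / r) ** -1.5


# r = 1.8 is the factor quoted in the text; the angles are illustrative.
R = 1.8
for alpha in (0, 30, 60, 90):
    print(f"alpha = {alpha:2d} deg  weight = {march_dollase(alpha, R):.3f}")
```

Under this form and with r greater than 1, reflections whose normals lie along the specified direction (α = 0°, the meridional reflections) are weakened, and those at α = 90° (the equatorial reflections) are strengthened, which is the qualitative behavior described for the fibrous samples.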

Results and discussion

Figure 1a, b show the calculated patterns from cellulose Iα, based on the modified .cif file in the Supplementary Information. Unit cell dimensions were: a = 6.717 Å, b = 5.962 Å, c = 10.400 Å, α = 118.08°, β = 114.80°, and γ = 80.37° (Nishiyama et al. 2003). The modifications involved interchange of the unit cell dimensions to match a convention with the c-axis parallel to the molecular axis. Although the unit cell dimensions listed in their published report (Nishiyama et al. 2003) are the same as used herein, their structure determination was carried out with the a-axis parallel to the molecular axis, and the coordinates in their supplementary .cif file are based on that convention. The three main peaks for the Iα one-chain triclinic unit cell have Miller indices of (100), (010) and (110), which are the counterparts to the (1–10), (110) and (200) peaks of the cellulose Iβ pattern.

Fig. 1 a Simulated cellulose Iα powder pattern for randomly oriented crystallites with 0.1° and 1.5° peak widths at half maximum intensity. Magenta lines indicate the positions of the calculated peaks, and the black vertical lines on the scale correspond to the 5° intervals. The image was prepared with the aid of the Mercury program (see text) which output files with the intensities and the Miller indices; the magenta lines were taken from the saved image of the powder pattern from the Mercury program. b Cellulose Iα with crystallites having preferred orientation along the fiber axis

The simulated patterns from the randomly oriented and preferred-orientation samples are noticeably different and, once the orientation is simulated with the (11–4) plane (see last paragraph of Input information, above), they set the pattern for the effects of preferred orientation for the other samples. The intensities of the non-equatorial reflections, mostly weak in the random pattern, nearly disappear. Note also that the weak (001), (002), and (003) peaks on the Iα pattern at 10.48°, 21.05°, and 31.8° 2θ have d-spacings of 8.44, 4.22, and 2.81 Å, respectively. They do not correspond to divisions of the 10.40 Å c-axis dimension by 1, 2, and 3, as would be the case for a monoclinic structure. The (004) reflection is beyond the 40° 2θ cut-off. Finally, the present unit cell admittedly does not conform to conventions that call for all triclinic cell angles to be greater than 90°. Here, priority was given to having the c-axis match the molecular dimension.
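The d-spacings quoted for the triclinic Iα cell can be checked with a short calculation. The sketch below uses the standard reciprocal-cell relations for a general triclinic lattice together with Bragg's law, the Iα cell dimensions given above, and the CuKα wavelength of 1.5418 Å from Input information. The function names are hypothetical, and the code is only an independent check; it is not how Mercury performs the calculation.

```python
import math


def reciprocal_cell(a, b, c, alpha, beta, gamma):
    """Reciprocal lengths (1/Angstrom) and cosines of the reciprocal angles
    for a general (triclinic) cell; angles are given in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sa, sb, sg = (math.sin(math.radians(x)) for x in (alpha, beta, gamma))
    volume = a * b * c * math.sqrt(
        1 - ca ** 2 - cb ** 2 - cg ** 2 + 2 * ca * cb * cg
    )
    a_star = b * c * sa / volume
    b_star = a * c * sb / volume
    c_star = a * b * sg / volume
    cos_a_star = (cb * cg - ca) / (sb * sg)
    cos_b_star = (ca * cg - cb) / (sa * sg)
    cos_g_star = (ca * cb - cg) / (sa * sb)
    return a_star, b_star, c_star, cos_a_star, cos_b_star, cos_g_star


def d_spacing(hkl, cell):
    """Interplanar spacing (Angstrom) of the (hkl) planes."""
    h, k, l = hkl
    a_s, b_s, c_s, cas, cbs, cgs = reciprocal_cell(*cell)
    inv_d_squared = (
        h * h * a_s ** 2 + k * k * b_s ** 2 + l * l * c_s ** 2
        + 2 * h * k * a_s * b_s * cgs
        + 2 * k * l * b_s * c_s * cas
        + 2 * h * l * a_s * c_s * cbs
    )
    return 1 / math.sqrt(inv_d_squared)


def two_theta(d, wavelength=1.5418):
    """Bragg angle 2-theta (degrees) for spacing d at the given wavelength."""
    return 2 * math.degrees(math.asin(wavelength / (2 * d)))


# Cellulose I-alpha cell as used in this work: a, b, c (Angstrom) and
# alpha, beta, gamma (degrees).
IALPHA_CELL = (6.717, 5.962, 10.400, 118.08, 114.80, 80.37)

for hkl in [(0, 0, 1), (0, 0, 2), (0, 0, 3), (1, 1, -4)]:
    d = d_spacing(hkl, IALPHA_CELL)
    print(hkl, f"d = {d:.3f} A", f"2theta = {two_theta(d):.2f} deg")
```

The (001) spacing from this calculation should come out near 8.44 Å instead of 10.40 Å, and the (11–4) spacing near 2.596 Å, in line with the values given in the text.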

Figure 2a shows the calculated diffraction patterns for randomly oriented powder samples of cellulose Iβ, and Fig. 2b presents the patterns for Iβ samples with preferred orientation along the c-axis. Unit cell dimensions were: a = 7.784 Å, b = 8.201 Å, c = 10.380 Å, and γ = 96.55° (Nishiyama et al. 2002). As previously stated (French and Santiago Cintrón 2013), both simulated diffraction curves correspond to perfect crystals. The difference in the simulated intensity profiles is strictly the result of different crystallite sizes, and the “background” level around 18° 2θ for the curves with 1.5° 2θ pwhm results from overlap of the diffraction peaks. On these simulated patterns, there is no modeling of amorphous scattering. Therefore, in various calculations of cellulose crystallinity, it is generally not appropriate to position a background intensity curve (attributed to amorphous scattering) as high as the minimum intensity at 18° 2θ. The shoulder at about 20.5° 2θ for the (012) and (102) reflections on the random pattern is not obvious on the pattern with preferred orientation. The positions of the absent, odd-order [(001) and (003)] meridional reflections are indicated by green lines above their green Miller indices instead of the purple lines for the other reflections. The main contributors of intensity to the three main peaks have Miller indices of (1–10), (110) and (200). The moderate peak on the random 1.5° pwhm curve near 34.5° is seen to be a composite of several reflections and (004) is not the dominant contributor.
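The point that peak overlap alone raises the intensity near 18° 2θ can be illustrated with a toy calculation. The sketch below assumes a Gaussian peak shape (the profile function actually used by Mercury is not specified here), takes the (1–10) and (110) positions of 14.88° and 16.68° 2θ from this work, places (200) near 23.0° 2θ as an approximate value from Bragg's law for the stated cell, and uses purely hypothetical relative peak heights. Only three reflections are included, so the numbers are illustrative and are not a reproduction of Fig. 2.

```python
import math


def gaussian_pattern(two_theta, peaks, fwhm):
    """Intensity at one 2-theta value from a sum of Gaussian peaks.

    peaks is a list of (position, height) pairs; fwhm is the peak width
    at half maximum height, in degrees 2-theta.
    """
    sigma = fwhm / (2 * math.sqrt(2 * math.log(2)))
    return sum(
        height * math.exp(-0.5 * ((two_theta - position) / sigma) ** 2)
        for position, height in peaks
    )


# (position in degrees 2-theta, hypothetical relative height)
PEAKS = [(14.88, 30.0), (16.68, 25.0), (23.0, 100.0)]

for fwhm in (0.1, 1.5):
    valley = gaussian_pattern(18.0, PEAKS, fwhm)
    print(f"pwhm = {fwhm} deg: intensity at 18 deg 2theta = {valley:.4f}")
```

With the 0.1° width the intensity at 18° 2θ is essentially zero, whereas with the 1.5° width the tail of the neighboring peak gives a non-zero value even though no amorphous scattering is modeled.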

Fig. 2 a Cellulose Iβ with random orientation of the crystallites. b Cellulose Iβ with preferred orientation of the crystallites along the fiber axis

These Miller indices [(1–10) and (110)] for the peaks at 14.88° and 16.68° 2θ are the same as those used by Gardner and Blackwell (1974) despite their assignment of the a-axis to the 8.20 Å repeat and the b-axis to the 7.88 Å dimension. However, their assignments do interchange the indexing for the major peak, with Miller indices of (020) instead of the (200) values promoted herein. Because of this difference, it was not immediately obvious that the chain packing in their unit cell was different from that in the monoclinic (MM subcell) unit cell from Syracuse (Sarko and Muggli 1974). The packing in the Gardner and Blackwell cell corresponds to a “down” instead of “up” orientation of the molecules (French and Howley 1989) for the unit cell used herein, whereas the packing is parallel up in Sarko and Muggli’s structure and in the Nishiyama et al. (2002) structure.

Figure 3a, b show the patterns for cellulose II samples with random and preferred orientation. Unit cell dimensions were: a = 8.10 Å, b = 9.03 Å, c = 10.31 Å, and γ = 117.10° (Langan et al. 2001). Both the 0.1° and 1.5° patterns from the sample with preferred orientation are considerably simplified, compared to the random model, by the near absence of reflection intensities from upper layer lines. The three main peaks have Miller indices of (1–10), (110) and (020). The peak for the latter reflection, at about 22.1° 2θ, has contributions from two adjacent reflections. To ascertain which reflection is responsible for the majority of the intensity it was helpful to plot the diffraction pattern for only a small range, say from 19 to 23° 2θ, along with a step size of 0.01° 2θ. The window with the calculated diffraction patterns in the Mercury program can be stretched to the full width of the monitor as well. Note that this reflection (020) is often mistakenly labeled as (200), e.g., by Yue et al. (2012). The .hkl file that is output from Mercury also lists the intensities of the various reflections. For example, it gives the calculated intensity for the (1–10) reflection as 2480.7, and a value of 1.1 for the (100) peak, easily resolving that (1–10) is the dominant contributor to the peak at about 12.2° 2θ.

Fig. 3 a Simulated powder diffraction patterns for cellulose II crystallites having random orientation. b Simulated powder diffraction pattern for cellulose II with preferred orientation of the crystallites along the fiber axis

The simulated patterns for cellulose III_I are shown in Fig. 4, based on unit cell dimensions of a = 4.45 Å, b = 7.85 Å, c = 10.33 Å, and γ = 105.1° (Wada et al. 2004). Because the cellulose III_I structure has only a one-chain unit cell with twofold screw-axis symmetry, the number of reflections is limited, with only ten peaks possible before 25° 2θ (cellulose II has 19 and Iα has 14). The difference between the patterns for random (Fig. 4a) and oriented samples (Fig. 4b) is dramatic because of the strong presence of the (002) reflection on the random pattern, and its near absence on the pattern from the sample with preferred orientation. Because the peak at about 21° 2θ comprises the (100), (012) and (1–10) reflections with very strong contributions from (100) and (1–10), it is not well-suited for line profile analyses for crystallite size determinations.

Fig. 4 a Simulated powder diffraction pattern for cellulose III_I with random orientation of the crystallites. b Simulated powder pattern for cellulose III_I with preferred orientation for the crystallites along the fiber axis

Figure 5 shows the results from the study of cellulose III_II by Wada et al. (2009). This one-chain unit cell with a = 4.45 Å, b = 7.64 Å, c = 10.36 Å, and γ = 106.96° has fractional occupancy, with either an “up” or “down” chain in any given unit cell. The calculated patterns show very low intensity except for the peak at 12.1° 2θ and the composite of (1–10) and (100) peaks at 20.863° and 20.869° 2θ, respectively.

Fig. 5 a Simulated powder diffraction pattern for cellulose III_II with random orientation of the crystallites. b Simulated diffraction pattern for cellulose III_II with preferred orientation of the crystallites about the fiber axis

These results underscore the need to report the details of the sample preparation that would affect the orientation, such as whether the sample was a pressed pellet, or sprinkled onto sticky tape. Also, the presentation of the sample to the incident beam (transmission or reflection) should be stated as that can also affect the relative intensities of the various peaks.

The intensities and spacings on these calculated patterns will not totally agree with experimentally observed patterns for several reasons. The peak positions will not agree exactly, possibly because of crystallite size variations resulting in different long-range compressive forces on the crystals and unit cells (Nishiyama et al. 2012). Knowing that the crystallographic discrepancy index (the R factor) values are about 20 % in the original crystallographic studies, the observed and calculated intensities are expected to differ. (A better indicator of expected differences in this case would be the discrepancy index wR2 based on the structure factors squared and weighted, values of which are about 45 %.) This expected deviation is why the title “idealized powder diffraction patterns” was chosen. The patterns calculated by Mercury are isotropic; there is no way to input crystallite shape information. That can affect the relative peak heights and widths. If the crystal structures are re-determined in the future, the appropriate figures herein should be replaced.

From the experimental side, it is almost impossible to obtain powder pattern samples with completely random orientation or complete orientation in a given direction. That could be compensated for by calculating patterns with varying values of the March-Dollase orientation parameter (Dollase 1986). Another factor is the sample itself. It may not be pure, it may have minor amounts of another polymorph, and, especially when sample sizes are marginal, signals from the sample presentation system, such as adhesive tape, may also occur. The latter problem should be easily corrected once identified. Some diffuse scattering from non-crystalline material is also going to be present that is not taken into account by these calculations.

Of course, many samples of interest are likely to contain more than one crystal form, such as partially mercerized samples that would be mixtures of cellulose I and II. It is important to not try to interpret such patterns with an assumption that only one form is present. However, given the ability to calculate intensity versus 2θ data for the various polymorphs with a range of peak width values, it is simple to add together the intensity data (perhaps with a spreadsheet program) to obtain theoretical patterns for mixtures of either cellulose of different polymorphs or cellulose with other polysaccharides such as xylan (Nieduszynski and Marchessault 1972) in composites.
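The addition of patterns described above can also be done in a few lines of code instead of a spreadsheet. The sketch below assumes that both patterns were saved on the same 2θ grid and are on comparable intensity scales, so that a simple weighted sum is meaningful. The function name and all intensity values are hypothetical placeholders; they are not output from Mercury.

```python
def mix_patterns(pattern_a, pattern_b, fraction_a):
    """Weighted sum of two intensity lists sampled on the same 2-theta grid.

    fraction_a is the weight of pattern_a; pattern_b gets 1 - fraction_a.
    """
    if len(pattern_a) != len(pattern_b):
        raise ValueError("patterns must share the same 2-theta grid")
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must lie between 0 and 1")
    return [
        fraction_a * ia + (1.0 - fraction_a) * ib
        for ia, ib in zip(pattern_a, pattern_b)
    ]


# Hypothetical intensity values on a coarse, shared 2-theta grid.
two_theta_grid = [12.0, 15.0, 20.0, 22.0, 23.0]
cellulose_i = [2.0, 30.0, 10.0, 60.0, 100.0]
cellulose_ii = [40.0, 5.0, 90.0, 100.0, 20.0]

# Example: a partially mercerized sample taken as 70 % cellulose I.
mixed = mix_patterns(cellulose_i, cellulose_ii, 0.7)
for angle, intensity in zip(two_theta_grid, mixed):
    print(f"{angle:5.1f}  {intensity:7.2f}")
```

Repeating the sum over a range of fractions gives a family of theoretical patterns against which an experimental pattern from a mixture can be compared.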

Conclusions

Ideal diffraction patterns were readily calculated with the Mercury program using as input the crystal information files (.cif) that are provided in the Supplementary Information. The patterns were presented for both very narrow, well-resolved peaks and for the broader peaks that are found for the most crystalline practical higher plant cellulose such as cotton. This graphical representation shows which reflections are contributing to a given peak. Indexing for these peaks conforms to the modern nomenclature, with c as the fiber axis, an obtuse monoclinic angle, and the a-axis shorter than the b-axis. The Mercury program and the .cif files in Supplementary Information make it easy for even the novice to create calculated patterns for various purposes. For example, the calculated intensity versus 2θ data from various structures can be added in varying proportions to simulate diffraction patterns from partially mercerized samples or composite materials.

Supplementary information

Crystal information files (.cif) are provided for the five cellulose polymorphs described in this work. These files were either modifications of published or unpublished .cif files or created from the published unit cell parameters and coordinates by putting them into the .cif format. Readers are advised to copy the entire contents, starting with the line “#Modified Crystal Information File for Iα cellulose” into a text editor such as Microsoft Notepad and saving the file as e.g., “all_cellulose.cif”. The Mercury program will then allow any of the structures to be selected.
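Before loading a combined file, it can be useful to confirm which structures it contains. In the CIF format each structure is held in a data block introduced by a line beginning with "data_". The minimal sketch below lists those block names; the sample text and its block names are hypothetical stand-ins for the supplementary file, and the simple line scan assumes that no multi-line text field happens to contain a line starting with "data_".

```python
def list_data_blocks(cif_text):
    """Return the names of the data blocks found in CIF-formatted text."""
    names = []
    for line in cif_text.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith("data_"):
            names.append(stripped[len("data_"):])
    return names


# Hypothetical, heavily shortened stand-in for a combined .cif file.
SAMPLE_CIF = """\
#Modified Crystal Information File for Iα cellulose
data_cellulose_Ialpha
_cell_length_a 6.717
data_cellulose_Ibeta
_cell_length_a 7.784
"""

print(list_data_blocks(SAMPLE_CIF))
```

To check a saved file, the same function can be applied to the text read from it, for example from "all_cellulose.cif".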