Introduction

Understanding patterns of habitat use is important for characterising a species’ niche (Jonsen et al. 2003), but it is the rules or trade-offs governing how habitats and resources are used that are key to predicting behaviour. Within the broader context of current global environmental change, such an understanding is particularly relevant for predicting how resource use may vary with habitat alteration or climate change (Jenouvrier et al. 2009). Habitat and resource use are governed by animal decisions (Manly et al. 2002), which are influenced by the behavioural state of the animal at the time the decision is taken as well as by its previous experience (Burns 2005; Parejo et al. 2007; Stamps and Swaisgood 2007; Wolf et al. 2009). For example, whether or not a cheetah Acinonyx jubatus decides to chase a prey animal may depend on local environmental conditions such as terrain (Bisset and Bernard 2007) and prey density (Cooper et al. 2007), its reproductive state (Cooper et al. 2007), or the time since its last meal (Caro 1994).

Unveiling the processes that shape patterns of habitat use can be challenging when direct observational data are difficult to collect. Data from activity loggers (or bio-logging tags) have improved our understanding of animal migration (Croxall et al. 2005; Guilford et al. 2009), navigation (Biro et al. 2007), foraging (Takahashi et al. 2004; Bost et al. 2007; Trathan et al. 2008), and physiology (Green et al. 2002, 2003) in an increasing number of taxa (Ropert-Coudert and Wilson 2005; Wikelski et al. 2007; Sims et al. 2008). Marine diving animals provide an excellent example of where direct observations are difficult. Indirect data such as those from tags require calibration to understand how patterns of movement or activity relate to the behaviour and the biology of the animal (Jonsen et al. 2006; Patterson et al. 2008). When classes of behaviour cannot be directly calibrated by observation, the data themselves are used to define behaviours (Guilford et al. 2004; Jonsen et al. 2005; Patterson et al. 2008). Methods include objective mathematical description of the shape and motifs present in individual dives (Wilson 1995; Halsey et al. 2007), cluster analysis (Safi et al. 2006), and machine learning (Roberts et al. 2004; Guilford et al. 2009). The nature and persistence of these behaviours are then described over time or between groups of interest (Guilford et al. 2004; Biro et al. 2007).

Dives form natural, discrete units of behaviour, as air-breathing marine animals on breath-holding dives must balance their access to oxygen at the surface with food at depth (Kramer 1988; Houston and Carbone 1992; Soto et al. 2008). Multiple dives may be needed to assess or deplete a patch (Takahashi et al. 2004; Naito 2007), so aggregations of dives have been regarded as a single behaviour (Boyd et al. 1994; sensu Dawkins 2007). Temporal clusters such as bouts have been used to test hypotheses of patch and resource use (Boyd et al. 1994; Luque and Guinet 2007). Defining bouts of movements in other animals such as caribou (Rangifer tarandus) has recently proven controversial, with arguments about whether the scale of measurement may influence the patterns generated (Nams 2006; Johnson et al. 2006). Similarly, Hart et al. (2010) suggested limitations to this approach in penguins. In diving animals, a bout is defined either by the rate of dives (Boyd et al. 1994) or by separate start and end criteria (Naito et al. 1990).

We argue that the successful identification and characterisation of behaviours in diving endotherms requires (1) an objective way to identify the number of distinct behavioural states defined by the data, (2) a method of clustering similar dives that are considered part of the same behaviour at any point in time, and (3) analyses of behaviour based on transition points between these behaviours. A variety of models have been used to identify change points in time series data, including Kalman filters (Kalman and Bucy 1961), Fourier processes (Safi et al. 2006), wavelet analysis (Cazelles et al. 2008), k-means clustering (Rendell and Whitehead 2005; Arnold and Zuberbühler 2006; de Craen et al. 2006), and hidden Markov models (HMM) (Fanke et al. 2004; Roberts et al. 2004; Guilford et al. 2004; Macdonald and Raubenheimer 2007). Hidden Markov models are well understood for temporal processes and have recently been applied to behavioural data (Fanke et al. 2004; Roberts et al. 2004; Guilford et al. 2004; Macdonald and Raubenheimer 2007) (Fig. 1).

Fig. 1

Schematic of a hidden Markov model as used for diving analysis. This representation shows a period of four dives (numbered 1–4). At each instance, the bird occupies one or other behavioural state (of searching or foraging) which is hidden (above dotted line). Each behavioural state is associated with a different ‘emission’ probability. As emissions from dive 1 to dive 4 are observable (below dotted line) and contain information about the hidden state, we can use the observed data to determine the likelihood of being in each hidden state. We assume that search dives will also include travelling dives as these will be short and shallow

In this study, we use data collected on adult macaroni penguins Eudyptes chrysolophus and HMM to determine change points, as our data may be regarded as time series with non-independence between sequential dives (Tremblay and Cherel 2003; Hart et al. 2009) with k hidden states. We propose that dive depth and duration are related to a hidden behavioural state which may be feeding or not feeding. We wish to identify these states and describe transitions between them. When applied in this context, hidden Markov models can be thought of as a more sensitive version of the iterative t-test approach of Boyd et al. (1994). The strength of that method is that it attempts to introduce ‘state persistence’, which is the core of the HMM approach and accounts for the potential non-independence of dives. We ask our model to find two states for three reasons: (1) this is consistent with previous studies of bouts (Boyd et al. 1994; Luque and Guinet 2007), whereby dives are either above a certain density and within a bout or below that density and not in a bout; (2) the distributions of dive depth and duration, which we used to objectively identify the number of clusters for which the model should search, revealed strong bimodality in each variable as well as correlation between the two; and (3) running the model with more than two states revealed that only two states were significantly occupied. The following predictions were then tested:

  • (H1) There are distinct states of behaviour defined by depth and timing of dives.

  • (H2) The amount of time spent in each state is related to observed weight gain.

  • (H3) There are different strategies for penguins in different reproductive stages, sexes, and years.

Methods

Data collection

We here use a data set based on 103 breeding adult macaroni penguins for which dive and mass data have been collected before and after each foraging trip. Monitored birds can be found in four different breeding stages, namely incubation (from egg-laying to egg-hatching), broodguard (where the male parent guards the young chick and the female forages), crèche (where both parents forage and the chicks form clusters in the colony guarded by a few adults), and premoult (where the parents leave the chicks for a long trip to increase their own mass prior to moulting). A trip is here defined as the period of time (in hours) away from the egg or chick (one continuous absence).

Macaroni penguins were tagged with Wildlife Computers™ Mark 7 Time Depth Recorders (TDRs) at the Fairy Point study colony (54°00′30′′S, 38°04′21′′W) on Bird Island during the 1998–2005 austral summers. Fairy Point is a small circular colony approximately 20 m in diameter. Individuals were tagged between November and March at various stages during the breeding cycle before they went to sea to forage. All procedures conformed to the Scientific Committee on Antarctic Research (SCAR) Code of Conduct for Use of Animals for Scientific Purposes in Antarctica (2006).

Devices were glued with 2-part epoxy resin onto the tips of feathers along the spine between and below the scapulae, following the methods of Wilson et al. (1997) with some modifications. The TDRs were 95 × 15 × 15 mm in section and weighed 50 g in air, which corresponded to <0.5% of the bird’s cross-sectional area and 1.0–1.5% of body mass, within guidelines for maximum device loads that these birds can potentially carry (Wilson et al. 2005; Wilson and McMahon 2006). However, any device placed on free-ranging animals has the potential to affect their welfare and alter their behaviour or reproductive success (Murray and Fuller 2000; Wilson and McMahon 2006), as has been found with flipper tags (Froget et al. 1998; Gauthier-Clerc et al. 2004). Other variables recorded were the mass of each bird on deployment and recovery, and the sex of each bird, determined using bill measurements (Williams and Croxall 1991, but for caveats see Hart et al. 2009) and observations of marked pairs exhibiting sexually dimorphic behaviour on the nest (Williams and Croxall 1991). Mass was measured using a balance, with the bird immobilised in a bag.

Birds were captured using a crook from the edge of the colony to avoid excessive colony disturbance. Birds had implanted TIRIS™ tags (Texas Instruments Radio Identification System) to avoid the possibility of recapturing birds over successive years. Birds were recaptured on their return to land and the TDRs removed before the bird entered the colony. Birds were weighed before they had fed chicks. Replicates were obtained by recycling the TDRs over the season, so each TDR tag was deployed up to four times on different birds during each breeding season.

Data processing and statistical analysis

Data in this study were treated in three steps: (1) processing of raw time series data into individual dives, (2) using a hidden Markov model (HMM) to identify two states of diving, and (3) performing statistical analysis using general linear models (GLM) or generalised linear mixed models (GLMM) to investigate rates of mass gain, and generalised additive mixed models (GAMM) to determine what influences diving behaviour throughout a day.

Raw data were summarised into dives using our own scripts in MATLAB® (The MathWorks™, www.themathworks.com), which followed conventions of dive identification described in Tremblay and Cherel (2000). This processing included zero offset correction (ZOC) at the surface, which generally affected depth records in the upper 3 m of each data set. ZOC identifies the surface as the running mean of the top of a series of dives over a minimum of 200 s, long enough for the penguin to have surfaced at least twice. If the tops of these dives were too varied or showed large trends, the tag was rejected on the assumption that the calibration of the pressure sensor had failed. The scripts then identified dives as the period from leaving the surface until the time of returning to the surface. For each dive, we recorded the time at the start and end of the dive, the maximum depth, dive duration, and the interval between the end of a dive and the start of the next dive. Where presented, the times of sunrise and sunset were calculated using the methods described by Montenbruck and Pfleger (2005).
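The dive-identification step can be illustrated with a minimal Python sketch. This is not the MATLAB code used in the study: the function names, the 1 Hz sampling assumption, the use of a running minimum (rather than a running mean of dive tops) for the surface estimate, and the 3 m dive threshold are all hypothetical choices made for illustration only.

```python
import numpy as np

def zero_offset_correct(depth, window=200):
    """Subtract a running estimate of the surface level from a depth trace.

    Assumption: the surface is approximated by the running minimum over
    `window` samples (200 samples = 200 s at an assumed 1 Hz). The study
    used a running mean of dive tops; this is a simpler stand-in.
    """
    depth = np.asarray(depth, dtype=float)
    half = window // 2
    surface = np.array([
        depth[max(0, i - half): i + half + 1].min()
        for i in range(len(depth))
    ])
    return depth - surface

def find_dives(depth, dt=1.0, threshold=3.0):
    """Split a corrected depth trace into dives.

    A dive is taken to be a run of samples deeper than `threshold`
    (a hypothetical value). Returns one dict per dive with start and end
    time, maximum depth, duration and post-dive surface interval.
    """
    depth = np.asarray(depth, dtype=float)
    submerged = depth > threshold
    # Pad with False so dives touching either end of the record are closed.
    padded = np.concatenate(([False], submerged, [False]))
    change = np.diff(padded.astype(int))
    starts = np.flatnonzero(change == 1)   # first submerged sample
    ends = np.flatnonzero(change == -1)    # first sample back at the surface
    dives = []
    for k, (s, e) in enumerate(zip(starts, ends)):
        nxt = starts[k + 1] if k + 1 < len(starts) else None
        dives.append({
            "start": s * dt,
            "end": e * dt,
            "max_depth": depth[s:e].max(),
            "duration": (e - s) * dt,
            # Undefined (None) for the last dive in the record.
            "surface_interval": (nxt - e) * dt if nxt is not None else None,
        })
    return dives
```

In use, a raw trace would first be passed through `zero_offset_correct` and the result through `find_dives`; the per-dive records then supply the depth, duration and surface-interval variables analysed in the rest of the paper.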

The hidden Markov algorithm was applied in MATLAB® using code previously developed to estimate the hidden states of navigation in homing pigeons (Roberts et al. 2004; Guilford et al. 2004). Applied to the data for each individual, the HMM identified the state of each dive and a transition matrix giving the probabilities of moving between states for each penguin. An HMM treats the time series as a series of states and seeks to identify the states and the transition points between them (Markov 1971). The code used is a mathematical implementation of Occam’s razor, whereby models need to balance noise in the data with explanatory power and simplicity. Models could be simple (not many states) but have large error around them, or complicated, up to the extreme case of one state per observation and no error. Under the method of variational learning used in this study, models containing parameters with large variation around them are down-weighted, and the number of parameters is also penalised, such that the final model chosen is the one most likely to predict the next time step. For details of this approach, see Roberts et al. (2004).
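To make the idea of inferring a hidden state from observed dives concrete, the following Python sketch computes the probability of each state for each dive using the standard scaled forward-backward recursions. It is an illustration only and differs from the study's method in several labelled ways: parameters are fixed by hand rather than learned by variational Bayes, a single observed variable (for example log depth) is used instead of depth and duration together, and emissions are assumed Gaussian. All names are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution (assumed emission model)."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def state_posteriors(obs, means, sds, trans, init):
    """Return P(state at dive t = k | all dives) for a Gaussian HMM.

    obs   : one observation per dive (e.g. log maximum depth)
    means : emission mean per state; sds: emission s.d. per state
    trans : trans[i, j] = P(next state j | current state i)
    init  : initial state probabilities
    """
    obs = np.asarray(obs, dtype=float)
    means = np.asarray(means, dtype=float)
    sds = np.asarray(sds, dtype=float)
    trans = np.asarray(trans, dtype=float)
    init = np.asarray(init, dtype=float)
    n, k = len(obs), len(means)

    # Emission likelihood of every dive under every state, shape (n, k).
    emis = gaussian_pdf(obs[:, None], means[None, :], sds[None, :])

    alpha = np.zeros((n, k))   # scaled forward probabilities
    beta = np.ones((n, k))     # scaled backward probabilities
    scale = np.zeros(n)

    alpha[0] = init * emis[0]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, n):
        alpha[t] = (alpha[t - 1] @ trans) * emis[t]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]

    for t in range(n - 2, -1, -1):
        beta[t] = trans @ (emis[t + 1] * beta[t + 1]) / scale[t + 1]

    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)
```

With two well-separated states and a transition matrix that favours staying in the current state, most dives receive a posterior close to 0 or 1, and only dives lying between the two clusters receive intermediate probabilities.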

Histograms of the log depth (Fig. 2a), duration (Fig. 2b), and the two variables together (Fig. 2c) indicated that there were two strategies, a ‘short, shallow’ strategy and a ‘long, deep’ strategy. Plots of surface interval commonly showed one distribution with a long tail, so the natural log of the surface interval was used to correct this (Fig. 2d). The HMM based on depth and duration was therefore programmed to find two clusters in the dive duration and the log of dive depth and to classify each of the dives into one of these states, along with an associated probability. Long, deep dives are termed d, and short, shallow dives s. The probability of a dive being d was recorded for each dive as well as the global transition matrix between types of dive (Table 1). The transition matrix shows the probabilities of transition between successive dives. As we are concerned with patterns of foraging, we use the p(dd) (Table 1) for comparison between individuals and groups, where p(dd) reflects the relative continuity of long-deep dives.
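As a rough illustration of what p(dd) measures, the sketch below estimates a transition matrix by counting transitions in a sequence of dives already classified as s or d. In the study the transition matrix is estimated within the HMM itself, so this counting approach is only a simplified stand-in, and the function name is hypothetical.

```python
import numpy as np

def transition_matrix(states, labels=("s", "d")):
    """Estimate transition probabilities by counting successive dive pairs.

    states : sequence of state labels, one per dive, in time order
    Returns a matrix whose entry [i, j] is the proportion of dives in
    state labels[i] that were followed by a dive in state labels[j].
    """
    index = {lab: i for i, lab in enumerate(labels)}
    counts = np.zeros((len(labels), len(labels)))
    for current, following in zip(states[:-1], states[1:]):
        counts[index[current], index[following]] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observed transitions are left as zeros.
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)

# Example: p(dd) is the bottom-right entry for labels ("s", "d").
matrix = transition_matrix(list("ssdddsd"))
p_dd = matrix[1, 1]
```

A value of p(dd) near 1 indicates long uninterrupted runs of long, deep dives, whereas a lower value indicates that deep dives are more often followed by short, shallow dives.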

Fig. 2

Figures showing the justification for selecting two states of diving based on the variables included in the hidden Markov model. Each graph shows dives from one individual: z30. a Histogram of the log of dive depth. b Histogram of dive duration. c Scatter plot of dive depth versus duration, showing the relationship between the two, and the existence of two clusters, s and d. d Distribution of surface intervals. The log of surface interval is used throughout this study because histograms of the surface interval commonly show a very long tail

Table 1 The probability transition matrix from the hidden Markov models

Specific a priori hypotheses listed in the introduction were tested using generalised linear mixed models (GLMM) in R 2.7.2 (www.r-project.com). Each model was checked with standard diagnostic plots. To determine what influenced diving behaviour over a 24-h period, generalised additive mixed models (GAMM) were fitted to HMM probability output summarised by hour, using the mgcv (1.4–1) library.

Results

Tag recovery

Of 156 deployed tags, all were recovered, and data were successfully downloaded from 129 of these. The remainder were rejected due to hardware or software failure, where the tag had not been set up correctly; these were primarily in the 1999/2000 season, for which no usable data exist. Of the remaining data sets, 103 had records for pre- and post-trip weights. Data from six tags were rejected after zero offset correction; the resulting number of usable data sets retrieved after ZOC and tag failure is given in Table 2.

Table 2 The sample size of tags used in this study by year and stage

Identification and interpretation of HMM states

Using the HMM algorithm, we identified two clear behaviours. An example is given in Fig. 3, showing how these states relate to the depth and time, and the probability of being in each state. The justification for hidden Markov analysis is that sequential dives are non-independent and exist in two distinct states, which represent two different behaviours. Figure 2 shows bimodality in both the depth and duration plots (which could easily indicate a four-state system), but Fig. 3b shows that most dives fit very neatly into one of two states identified by HMM, with very few dives showing intermediate probability. Repeating HMM analysis with more than two states showed that only two states were readily occupied. In the light of our prior hypothesis of two behavioural states, it seems likely that the states identified by HMM represent two behaviours.

Fig. 3

Output of the hidden Markov model for penguin z30 for a 24-h period. a The depth of dives over time, s (grey) or d (black) as identified by the HMM. b The probability of being in type 1 in relation to the dives in a. This shows that there are two clear clusters, with some intermediate dives

The effect of behaviour and life history on weight gain

There was considerable variation in the rate of weight gain (min = −1.45 × 10⁻³, max = 2.85 × 10⁻⁴) and p(dd) (min = 0.558, max = 0.992). The probability of continued deep dives [p(dd)] was correlated with the rate of weight gain (Pearson r = 0.260, P = 0.008), but a GLM of stage on p(dd) revealed that this is because p(dd) is confounded with stage (F(1,3) = 3.01, P = 0.033). The GLMM to investigate factors that predict the rate of weight gain showed that p(dd) was not significant (F(1,88) = 0.18, P = 0.676). Year, stage, and sex were all significant predictors of weight gain (year F(4,88) = 3.05, P = 0.021; stage F(3,88) = 13.46, P < 0.001; sex(stage) F(4,88) = 6.43, P < 0.001, Fig. 4). The full model explained approximately one-third of the variance (R² (adj) = 36.65%).

Fig. 4

a The mean state probability by hour for each of the four reproductive stages. Points in black show the mean probability and standard error for females. Grey points and error bars show the mean and standard error for males. b Represents the rate of mass gain for the same stages by sex. Sample sizes for females are as follows: incubation = 13, broodguard = 39, crèche = 21 and premoult = 4. Sample sizes for males are as follows: incubation = 4, broodguard = 0, crèche = 20 and premoult = 1. c The probability of transition between state d for subsequent dives p(dd), showing the mean and standard error of the means. Sample sizes are: 1998/1999 = 3, 2000/2001 = 25, 2001/2002 = 42, 2003/2004 = 9 and 2004/2005 = 24. d Represents the rate of mass gain for the same penguins

Diurnal strategies between stages

Fitting a GAMM to the hourly probability of being in d shows that the hour of day was highly significant (P < 0.001) once the individual, sex, stage, and year were taken into account. Plotting the fit lines for each stage (Fig. 5) shows where the differences between stages lie. Incubation and premoult (Fig. 5a, d) showed much lower mean probability of d dives than broodguard and crèche (Fig. 5b, c) overnight. All stages showed similar probabilities of d dives during daylight hours.

Fig. 5

The state probability by hour for each of the four reproductive stages (a incubation, b broodguard, c crèche and d premoult). The solid line represents the mean transition probability (deep-deep dives) for each hour, and dotted lines represent the 95% confidence interval around the line. All stages have similar mean transition probabilities for deep diving p(dd) during the day, but differing probabilities during the night

Linking surface interval to state

To test whether the post-dive surface interval was linked to the diving state of the previous dives, we took the mean of the previous five dives to show short-term average and plotted this against surface interval (this is shown for one individual in Fig. 6a). The variance and range of results may be linked to the number of data points over different mean probabilities (Fig. 6b). To overcome any sample bias, the data frame of mean p (the mean probability of being in state d for the five preceding dives) was sub-sampled for each individual. Sub-sampling stabilised the variance over the range of mean p (Fig. 6c), and a quadratic regression showed that there was a significant relationship between mean p and log surface interval (F(2,1141) = 8.80, P < 0.001), but that this explained very little of the variance (R² (adj) = 1.3%). This shows that dives that are strongly within either state s or d are more likely to be followed by a longer surface interval, so runs of dives that include dives not clearly of s or d are less likely to be at the end of a long run of dives. Figure 6d shows where these dives with intermediate mean p lie on the depth/duration curve.
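The running mean and quadratic regression described above can be sketched in Python as follows. This is an illustrative reconstruction, not the analysis code used in the study: the function names are hypothetical, the five-dive window is assumed to end with the dive that immediately precedes the surface interval, and the per-individual sub-sampling step is omitted.

```python
import numpy as np

def trailing_mean(p_d, window=5):
    """Mean probability of state d over the last `window` dives.

    Assumption: the window ends with (and includes) the current dive.
    The first window - 1 entries are NaN because fewer than `window`
    dives are available.
    """
    p_d = np.asarray(p_d, dtype=float)
    out = np.full(len(p_d), np.nan)
    csum = np.concatenate(([0.0], np.cumsum(p_d)))
    out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out

def quadratic_fit(mean_p, log_surface_interval):
    """Least-squares fit of log surface interval on mean p and mean p squared.

    Returns the coefficients [a, b, c] of a*x**2 + b*x + c, ignoring
    dives where either variable is missing.
    """
    x = np.asarray(mean_p, dtype=float)
    y = np.asarray(log_surface_interval, dtype=float)
    ok = ~np.isnan(x) & ~np.isnan(y)
    return np.polyfit(x[ok], y[ok], deg=2)
```

A positive quadratic coefficient corresponds to the pattern reported here, in which surface intervals are longer when the preceding dives are strongly assigned to either state than when mean p is intermediate.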

Fig. 6

The effect of mean p on the surface interval of dives. a The log surface interval plotted on the mean probability of the previous 5 dives being d for a single penguin, z30. This shows that there is much greater variation in the surface interval for those dives that are strongly assigned to one state or the other. b The histogram of dives binned by the mean probability of the previous 5 dives. c The sub-sampled data of surface interval and mean p, while d shows where dives of different probability lie on the depth/duration curve. Light grey represents dives with p(dd) 0–0.05, black = p(dd) 0.05–0.95, and dark grey = p(dd) 0.95–1

Discussion

The use of hidden Markov models in foraging ecology

The use of hidden Markov models to partition observed diving into behavioural states, and then to analyse the transition rates between behaviours, is novel. Other studies have placed an emphasis on spatial analysis in the lateral plane (Roberts et al. 2004; Jonsen et al. 2006), but HMMs can equally be applied to time series of any behavioural data such as diving. The behaviours identified are easy to visualise as use of the environment against time (Fig. 6d), which aids interpretation. Recent advances in bio-logging have enabled more direct measures of energy consumption (Shepard et al. 2009) and prey capture (Takahashi et al. 2004) in single dives. Our approach can be used alongside these advances to better describe where behavioural changes occur and how these influence animals’ energy budgets. Most current definitions of bouts are unlikely to be applicable to penguins, because runs of dives are non-independent. In contrast, HMMs produce an estimate of likelihood, and they can be used to determine where there are notable changes in behaviour, or transition rates between behaviours.

One of the interesting features of this analysis is that there are a small number of dives that do not fall into clear states. When a dive falls between the clusters of normal behaviours, this could be because it was exploratory, accidental, or abortive. There may also be interactions with predators or other occasional external influences that alter local behaviour, after which disturbance the animal returns to the previous behavioural state. There is therefore a trade-off between sensitivity and robustness. Clustering dives into bouts is quite sensitive, but loses robustness. Alternatively, using the probability of dives being within a certain state or the transition probability loses some sensitivity, but does preserve a measure of decision-making while reducing the influence of outlier dives.

The hidden Markov models used in this study revealed that there are two different behaviours, which would be masked by analysis of the unprocessed dive depths for the same data set. While we have focussed on depth and duration (to include both a spatial and temporal element), there are numerous attributes of individual dives that could be included in an HMM, such as short-term changes in depth within a dive (Halsey et al. 2007). We now address the remaining hypotheses of whether the behaviour links to weight gain, whether there are differences in strategies between stages, and whether there is inter-annual variation in foraging behaviour.

Do deep dives correlate with the amount of observed weight gain?

We tested the hypothesis that the amount of time spent in one state would relate to the observed weight gain in a foraging trip. We find no evidence for this, primarily because dive state use is strongly influenced by reproductive stage. Within each stage, there was no link between the amount of time spent in one state and the amount of weight gain, but this could be because there was relatively little variation in weight gain within stage compared to variation between stages.

Given that Antarctic krill (Euphausia superba; the principal prey of macaroni penguins at South Georgia) show diel vertical migration (Everson 2000) and are found relatively deep during the day, the working hypothesis was that the deeper dives reflected foraging. A number of theoretical (Cresswell et al. 2008) and empirical (Croxall et al. 1999; Barlow et al. 2002; Hennicke and Culik 2005) studies have linked foraging activity to trip length and weight gain in foraging penguins. However, few have commented on attributes within a trip that link to weight gain. The analysis in this paper shows that continuity of long-deep dives (dd) is not directly linked to the weight gain over a trip, but that this is due to being confounded with stage. Therefore, many of the differences in weight gain and behaviour in foraging macaroni penguins appear to be due to the difference between stages, whereby incubation and premoult individuals may be more flexible to remain at sea overnight and forage further from the nest, because they do not need to return to feed their chick.

While it seems likely that this method has successfully discriminated between foraging and searching or travelling dives, it cannot discriminate between successful and unsuccessful dives, as within each stage, the p(d) does not link to observed weight gain. It is possible that prey are caught in relatively few dives, although other work based on bill opening and undulations in the dive profiles suggested that prey were caught in most dives (Takahashi et al. 2004). If macaroni penguins forage until they are full, we would not expect a relationship between the rate of weight gain and p(d). Variation in the rate of weight gain predicted by stage suggests that they cannot be returning to the colony full.

Differences in behaviour between stages and year

Broodguard and crèche individuals showed lower rates of dd (Fig. 4a) and of weight gain (Fig. 4b) than incubation and premoult birds, and there was strong variation between years (Fig. 4c). We offer two explanations as to why broodguard females and crèche parents dive more at night than other individuals (Fig. 5). First, the timing of reproductive stages is confounded. Broodguard and crèche represent the brightest period of summer, and as such the period of dusk is much longer, and the ability to search for prey at night may be greatly enhanced. Second, this should be the period of greatest foraging demand on the parents. Broodguard represents the greatest rate of chick growth with only one parent foraging, so females in this stage should be under the greatest stress. Foraging females typically lose 12% of their body weight during broodguard (Cresswell et al. 2007). This period also represents the greatest abundance of krill (Saunders et al. 2007), so the lower, more consistent rate of diving may indicate a greater proportional success per dive. In an evolutionary ecology context, the observed pattern is interesting, as it suggests that chicks hatch and grow at their fastest rate when krill are most abundant. It is also interesting that the chicks fledge while krill are still sufficiently plentiful to allow the fledglings to forage at first independence and put on weight for winter, during which their metabolic needs will increase (Green et al. 2002, 2005).

The loss of body mass during this time is likely to reflect a shift from optimal diving to maximise the rate of net energy gain, to a strategy that optimises the gross energy gain that can be passed on to chicks. Parents may well return to feed the chick during the day, the rate of which could be confirmed by direct observation or sensing (Berrow and Croxall 2001). It seems very likely that the female should return to land to feed the chick during the day. Our data cannot answer this question, and it would be worth using GPS tracks to determine the frequency of feeding events. Dry points in the TDR data were insufficient to determine whether feeding had occurred, and therefore positional data are necessary to accurately quantify this.

Daily patterns in foraging

The patterns between night and day shown in Fig. 5 are clear and make sense in light of penguins being visual predators (Wilson et al. 1993; Williams 1995). However, caution should be taken when interpreting the patterns linking diving behaviour to day/night cycles. Sunrise and sunset times reported are based on those observed for Bird Island. Incubation and premoult individuals are likely to forage further from the colony, as they are not constrained by the need to return at night to feed a chick (Barlow and Croxall 2002). Foraging further from Bird Island will make these sunrise and sunset calculations less accurate. The relationship between daylight and the type and duration of bouts should be investigated with GPS tags as well as TDRs, so that positional information can be used to accurately reflect day length (Bost et al. 2009).

How do states of previous dives link to post-dive surface interval?

As discussed in the introduction, we are reluctant to convert state probability into runs of dives because the burden of proof for the cut-off is unresolved. We therefore used a running mean to determine how p links to surface interval. A running mean of five dives was chosen as this represented a short-term average, which is shorter than the bout structure reported previously (Green et al. 2003) and therefore more sensitive to small changes. Regression analysis showed that the post-dive surface interval was linked to the mean p of the previous five dives, and that being strongly associated with either state s or d slightly increased the post-final dive surface interval. We hesitate to over-interpret this for three reasons: first, the use of five dives for the running mean was a subjective choice; second, long and short dive intervals may have a different mechanism; and third, long dive intervals are likely to be rare and therefore under-represented in this study.

It is interesting that observed surface intervals increase with the certainty of being in one state or the other (Fig. 6c). It is still possible that the long tail in Fig. 2d indicates two strategies, with longer surface intervals showing higher variation. Such variation is reduced in depth and duration because of the need to resurface. There have been many commentaries on dual or scale-dependent strategies of foraging seabirds (Weimerskirch 2007) both with (Sims et al. 2008) and without Lévy flight (Elliot et al. 2009), which has been shown to be an artefact under certain conditions (Edwards et al. 2007). There are still many scale-dependent mechanisms that could be important to foraging penguins, and using HMM has highlighted these. It is interesting that there are two distinct strategies in depth, duration and surface interval. It is possible that birds need distinct behaviours for accessing prey at different depths, but to determine this, we need more direct indicators of prey capture.

Conclusions

How units of activity cluster into behaviours is a ubiquitous problem in studies of animal behaviour, but as behavioural studies become more important in conservation (Sepulveda et al. 2007; Stamps and Swaisgood 2007), this type of problem only becomes more pressing. In this paper, we have determined a new way of disentangling the observed activity into a process that represents the underlying behaviour, or internal state of diving animals. We have also highlighted the problems of finding a method that is robust to occasional activities that do not fit into the normal suite of behaviours.

Of the questions we highlighted in the introduction, this study has made progress towards answering what influences the timing and nature of decision-making. We have characterised the behavioural plasticity of foraging penguins in more detail than previous studies. This study could be enhanced with a more direct estimation of prey abundance, and the technique may be useful in future to link reproductive success to foraging activity. Increasing the number of simultaneous sensors and data descriptors has increased the discriminatory power of behaviours in other studies, such as the “daily diary” approach (Wilson et al. 2008), and it is likely that our approach would benefit from recording additional variables such as the geographical location of each dive, in particular to determine what goes on in a sequence of dives that are not clearly foraging.