Introduction

Deep water penaeoid shrimps are commercially valuable and constituted more than 40% of the total deep water shrimp landings during 2015 (CMFRI 2016). They belong to the families Aristeidae, Solenoceridae and Penaeidae, and are distributed at depths of 100–3200 m on the continental shelf and slope of the Indian coast (George 1979; Suseelan 1989; Suseelan et al. 1989; Gueguen 1998; Dineshbabu and Manissery 2009). Among these, Aristeus alcocki, Metapenaeopsis andamanensis, M. coniger, Solenocera hextii and Penaeopsis jerryi are the species targeted by the trawl fishery along the southwest and southeast coasts of India (Radhakrishnan et al. 2012; James 2014). In recent years the stock of deep water penaeoid shrimps has declined (CMFRI reports: 2014, 2015, 2016) owing to increased fishing effort. It is therefore pertinent to review all the species of Indian deep waters. Moreover, a few names that appear in the checklist (Radhakrishnan et al. 2012) have not been recorded in the regular fishery.

Identification at the species and larval level is a prerequisite for the proper conservation and management of the declining deep water shrimp resource of the country. DNA barcoding has been used successfully for species identification and the discovery of new species, utilizing a 650 bp fragment of the mitochondrial gene cytochrome oxidase subunit I (COI) (Hebert et al. 2003; Smith et al. 2006; Bucklin and Frost 2009; Asgharian et al. 2011; Baldwin et al. 2011; Zhang and Hanner 2011; Silva et al. 2013). COI has been used effectively to discriminate closely related species (Hebert et al. 2003), to detect cryptic species (Ni et al. 2012) and to identify fish products (Carvalho et al. 2011, 2014). Mitochondrial DNA (Mt-DNA) sequence information has been used as an accurate and automated species identification tool in a wide range of animal taxa because of the significant amount of information it carries (Hebert et al. 2003). The phylogenetic relationships of selected penaeoids from Indo-West Pacific and Atlantic waters have been studied in detail using partial Mt-DNA: 16 species by Vázquez-Bader et al. (2004), 11 species by Quan et al. (2004), 30 species by Chan et al. (2008) and 54 species by Voloch et al. (2009). Additionally, the genera Parapenaeopsis from Chinese waters (Li et al. 2014), Metapenaeopsis from the Atlantic Ocean (Cheng et al. 2015) and Parapenaeus from the Indo-West Pacific and Atlantic (Yang et al. 2015) have been studied thoroughly. Moreover, combined molecular and morphological datasets have been used effectively to infer the phylogenetic relationships, origin, diversification and biogeographic distribution of Decapoda (Vereshchaka et al. 2015; Yang et al. 2015). The present study aims to demonstrate the identification and phylogenetic relationships of deep water penaeoid shrimps from the Indian coast using morphological characters and Mt-DNA sequence data. It also aims to identify the morphological characters that are important for differentiating the clades, and the dispersal pattern of these commercially important shrimps.

Table 1 Sampling location, voucher number, collection depth and GenBank numbers of deep water penaeoid shrimps along Indian Coast.

Materials and methods

Specimens of deep-sea penaeoid shrimps were collected (2013–2016) from commercial trawl landings along the Bay of Bengal and southeastern Arabian Sea. A total of 14 species were collected (Aristeidae: genus 1, species 1; Solenoceridae: genus 3, species 9; Penaeidae: genus 3, species 4) (table 1), preserved in 90–95% ethanol for molecular studies and deposited at Crustacean Fisheries Division, Central Marine Fisheries Research Institute, Cochin, India.

Molecular analysis

Total genomic DNA was extracted from the pleopod of individual specimens using a DNeasy Blood & Tissue kit (Qiagen) according to the manufacturer’s protocol. The cells were lysed by incubating at \(56^{\circ }\hbox {C}\) for 2 h and all other steps followed the protocol. Partial sequences of three mitochondrial genes, COI, 16S rRNA (16S) and cytochrome b (Cytb), were amplified using universal primers (Folmer et al. 1994; Palumbi 1996; Merritt et al. 1998). The reactions were performed in \(25\, \mu \hbox {L}\) reaction cocktails containing genomic DNA (\(0.5\, \mu \hbox {g}/\mu \hbox {L}\)), Taq DNA polymerase (\(0.05\, \hbox {U}/\mu \hbox {L}\)), \(1\times \) buffer, \(\hbox {MgCl}_{2}\) (3 mM), \(10\, \hbox {pM}/\mu \hbox {L}\) of each primer and dNTPs (\(200\, \mu \hbox {M}\)). The polymerase chain reaction (PCR) thermal profile was \(94^{\circ }\hbox {C}\) for 5 min for initial denaturation, followed by 35 cycles of \(94^{\circ }\hbox {C}\) for 1 min, \(52^{\circ }\hbox {C}\) for 1 min and \(72^{\circ }\hbox {C}\) for 1.5 min, and a final extension at \(72^{\circ }\hbox {C}\) for 5 min. Amplification of PCR products was confirmed by electrophoresis on a 1.5% agarose gel containing ethidium bromide and visualized under a UV transilluminator (Lark, India). Amplified PCR products were purified with the XcelGen DNA Gel/PCR Purification Mini kit (Xcelris Labs Limited, India) according to the manufacturer’s protocol. The eluted PCR products were sequenced bidirectionally by the dideoxy chain termination method using the Big-Dye Ready-Reaction kit v 3.1 (Applied Biosystems) on an ABI prism 3770 automated sequencer at AgriGenome Labs, Scigenom, Cochin, India.
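
As a quick check on the thermal profile described above, the programmed hold time can be tallied in a few lines of Python. This is only an illustrative sketch: the function name and data layout are our own assumptions, and ramp times between temperatures are ignored.

```python
# Thermal profile as stated in the text; durations in minutes.
INITIAL_DENATURATION = 5.0      # 94 °C
CYCLE_STEPS = [1.0, 1.0, 1.5]   # 94 °C, 52 °C, 72 °C
N_CYCLES = 35
FINAL_EXTENSION = 5.0           # 72 °C


def total_hold_time(initial, cycle_steps, n_cycles, final):
    """Sum of programmed hold times, ignoring ramping between steps."""
    return initial + n_cycles * sum(cycle_steps) + final


minutes = total_hold_time(INITIAL_DENATURATION, CYCLE_STEPS, N_CYCLES, FINAL_EXTENSION)
print(f"Programmed hold time: {minutes} min")
```

Under these assumptions a run occupies the thermal cycler for a little over two hours before ramping is taken into account.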

Data analysis

Molecular sequences were checked and confirmed using ABI SeqEditor v.1.0. Protein-coding gene sequences (COI and Cytb) were translated into amino acids using Transeq (EMBOSS online tool) to avoid the inclusion of pseudogenes (Tsang et al. 2008). All sequences were compared against GenBank records using BLAST to check for potential contamination, and the nucleotide sequences were aligned using the Clustal W algorithm (Thompson et al. 1994). The aligned data were edited using BioEdit v.7.0.5.2 (Hall 1999); gaps in sequences were treated as missing data. All sequences were submitted to GenBank (table 1). Pairwise genetic distances were calculated using MEGA 6.0 (Tamura et al. 2013).
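
The translation check described above can be illustrated with a minimal Python sketch. It is not the Transeq tool used in this study; it simply scans an ungapped sequence for stop codons under the invertebrate mitochondrial genetic code (NCBI translation table 5), in which TAA and TAG are stops and TGA codes for tryptophan. The function names are hypothetical.

```python
# Stop codons under NCBI translation table 5 (invertebrate mitochondrial).
# TGA codes for tryptophan in this table, unlike in the standard code.
MITO_STOPS = {"TAA", "TAG"}


def stop_positions(seq, frame=0):
    """Return 0-based nucleotide positions of stop codons in the given frame.

    `seq` is assumed to be an ungapped nucleotide sequence. Incomplete
    trailing codons are ignored.
    """
    seq = seq.upper()
    hits = []
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in MITO_STOPS:
            hits.append(i)
    return hits


def clean_frames(seq):
    """Forward-strand frames (0, 1, 2) that contain no stop codon."""
    return [f for f in range(3) if not stop_positions(seq, f)]
```

A partial coding fragment that has no stop-free frame would be a candidate pseudogene or sequencing error and would merit closer inspection.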

For phylogenetic analysis, the maximum-likelihood (ML) method was applied to individual gene sequences to compare tree topologies, and MEGA 6.0 was used to select the best-fit model for the individual and combined data. The general time-reversible model with a gamma distribution and invariable sites (GTR+G+I) was selected for COI and Cytb, and the Tamura–Nei model with a gamma distribution and invariable sites (TrN+G+I) for 16S; these models were used to generate ML gene trees with 1000 bootstrap replicates (Nei and Kumar 2000; Tamura et al. 2013).
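
Bootstrap support of the kind mentioned above rests on resampling alignment columns with replacement and rebuilding a tree from each pseudo-replicate. The sketch below shows only the resampling step on toy data; the taxon labels and the function name are hypothetical, and the tree-building step performed by MEGA is not reproduced.

```python
import random


def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement (one pseudo-replicate).

    `alignment` maps a taxon label to its aligned sequence; all sequences
    must have the same length.
    """
    lengths = {len(s) for s in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to equal length")
    n_sites = lengths.pop()
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in picks) for taxon, seq in alignment.items()}


# Toy alignment, not data from this study.
toy = {"taxon_A": "ACGTAC", "taxon_B": "ACGTTC", "taxon_C": "ATGTTC"}
rng = random.Random(1)
replicates = [bootstrap_columns(toy, rng) for _ in range(1000)]
```

The support for a node is then the proportion of replicate trees in which that node is recovered.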

Two methods were used to construct the phylogenetic tree from the concatenated data. Maximum parsimony (Mp) analysis was conducted using PAUP v.4.0 (Swofford 2002) with all characters weighted equally, and branch support was assessed using 1000 bootstrap replicates. Bayesian inference (BI) was conducted with MrBayes v. 3.2.1 (Ronquist and Huelsenbeck 2003); Markov chain Monte Carlo algorithms were run for 500,000 generations, sampling one tree every 100 generations. All parameter estimates were checked, and the likelihood (L) scores were inspected to determine the burn-in and the stable distribution of the data. A 50% majority rule consensus tree was constructed from the remaining saved trees and visualized in FigTree v. 1.4.3 (Rambaut 2016) with all relevant support values.
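
The step from sampled trees to a 50% majority rule consensus can be sketched as follows. This is an illustration under stated assumptions, not the MrBayes procedure itself: the burn-in fraction of 0.25 is a placeholder (in this study the burn-in was determined from the likelihood scores), trees are reduced to sets of clades, and all names are hypothetical.

```python
from collections import Counter

GENERATIONS = 500_000
SAMPLE_EVERY = 100
BURNIN_FRACTION = 0.25  # assumption for illustration only


def majority_rule_clades(sampled_trees, burnin_fraction=BURNIN_FRACTION, threshold=0.5):
    """Clades present in more than `threshold` of the post-burn-in trees.

    Each tree is a set of clades and each clade a frozenset of taxon labels.
    Returns {clade: frequency among retained trees}.
    """
    start = int(len(sampled_trees) * burnin_fraction)
    kept = sampled_trees[start:]
    counts = Counter(clade for tree in kept for clade in tree)
    return {c: n / len(kept) for c, n in counts.items() if n / len(kept) > threshold}


# Number of trees saved after the starting tree: 500,000 / 100.
n_samples = GENERATIONS // SAMPLE_EVERY

# Toy sample of eight trees over taxa A-D.
ab, cd, ac = frozenset("AB"), frozenset("CD"), frozenset("AC")
trees = [{ac}] * 2 + [{ab, cd}] * 5 + [{ab}]
support = majority_rule_clades(trees)
```

In this toy case the first two trees are discarded as burn-in, and the clade frequencies among the remaining six play the role of posterior probabilities on the consensus tree.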

Morphological character evolution

Ancestral state reconstruction was used to evaluate character evolution (Pagel 1999). Fifty-two morphological characters (24 binary, 27 multistate and one noninformative) were chosen for phylogenetic analysis based on the original taxonomic work of Ramadan (1938), Crosnier (1978, 1985), Pérez Farfante and Kensley (1997) and Dall (1999). All these major characters were carefully reexamined and are listed in table 2. The data matrix in table 3 was analysed with Mp using a combination of programs: Mesquite v.3.01 (Maddison and Maddison 2015) and PAUP v.4.0 (Swofford 2002). All characters were treated as unordered and equally weighted, with states coded as 0, 1, 2, 3 and 4. Branch support was assessed using 1000 bootstrap replicates without any out-groups.
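
For unordered, equally weighted characters, the parsimony length of a single character on a given tree can be counted with Fitch's algorithm. The sketch below is a minimal illustration on a toy four-taxon tree and is not a substitute for the PAUP and Mesquite analyses; the tree, taxon labels and function name are hypothetical, and the tree is assumed to be strictly bifurcating.

```python
def fitch_steps(tree, states):
    """Minimum number of state changes for one unordered character.

    `tree` is a nested tuple of taxon labels, e.g. (("A", "B"), ("C", "D"));
    `states` maps each taxon label to its coded state (0, 1, 2, ...).
    Returns (candidate state set at the root, number of steps).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change needed.
        return common, left_steps + right_steps
    # The children disagree: take the union and count one change.
    return left_set | right_set, left_steps + right_steps + 1


toy_tree = (("A", "B"), ("C", "D"))
root_states, steps = fitch_steps(toy_tree, {"A": 0, "B": 0, "C": 1, "D": 2})
```

Summing the step counts over all characters gives the tree length that a parsimony search seeks to minimize.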

Results

The molecular data used in the present analysis comprise 27 individuals belonging to nine species from three genera of Solenoceridae, 27 individuals belonging to four species from three genera of Penaeidae and 13 individuals of Aristeus alcocki of Aristeidae. The out-group for these analyses comprised individuals of Oplophorus gracilirostris and Palinustus waguensis. Among them, two sequences of A. alcocki, three sequences of O. gracilirostris and two sequences of P. waguensis from our earlier studies were used for analysis (table 1). No insertions, deletions or stop codons were observed, and missing sequences were denoted with ‘–’ in the final alignment. A total of 63 COI sequences (665 bp including gaps), 55 16S sequences (540 bp including gaps) and 29 Cytb sequences (341 bp including gaps) were obtained from deep water penaeoid shrimps. We followed the taxonomic identification keys of penaeoid shrimps by Crosnier (1978, 1985, 1987), Pérez Farfante and Kensley (1997) and Dall (1999).
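
Because not every specimen has all three genes, building a combined matrix requires padding absent genes with a missing-data symbol. The sketch below illustrates one way to do this using the aligned lengths reported above; the function name, the gene order and the use of an ASCII hyphen as the missing symbol are assumptions, not details taken from the study.

```python
# Aligned lengths (bp, including gaps) as reported in the text.
GENES = {"COI": 665, "16S": 540, "Cytb": 341}


def concatenate(per_gene, genes=GENES, missing="-"):
    """Join per-gene aligned sequences for one specimen in a fixed gene order.

    `per_gene` maps gene name to aligned sequence. Genes absent for the
    specimen are filled with `missing` so all specimens share one length.
    """
    parts = []
    for gene, length in genes.items():
        seq = per_gene.get(gene)
        if seq is None:
            seq = missing * length
        elif len(seq) != length:
            raise ValueError(f"{gene}: expected {length} sites, got {len(seq)}")
        parts.append(seq)
    return "".join(parts)


# Toy specimen with COI and 16S but no Cytb sequence.
combined = concatenate({"COI": "A" * 665, "16S": "C" * 540})
```

If the three alignments were joined at the stated lengths, each row of the combined matrix would span 665 + 540 + 341 = 1546 sites.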

Table 2 List of morphological characters and their states.
Table 3 The data matrix of morphological characters of deep water penaeoid shrimps along the Indian coast.
Fig. 1 Phylogenetic tree recovered by Bayesian analysis from the COI gene data. Nodal support values represent posterior probabilities: (a) Aristeidae, (b) Penaeidae, (c) Solenoceridae, and (d) out-group.

Replicates of all taxa formed monophyletic sister clades in the COI Bayesian tree (figure 1). The mean K2P distances recorded for all taxa (table 4) indicated 0.0–3.0% divergence between individuals and 16.5–20.5% between genera in the family Penaeidae, while divergence was slightly higher (19.1–24.5%) between genera of the family Solenoceridae. However, A. alcocki (family Aristeidae) formed a few sister clades with <2.0% divergence (range: 0.0–1.7%). M. andamanensis and M. coniger were close to each other (3.3% divergence) and appeared to exhibit a significant relationship with the genus Solenocera and a sister clade of the genera Parapenaeus and Penaeopsis. Genetic distances within the genus Solenocera ranged from 7.1 to 21.8%, showing three major clades.
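
For readers unfamiliar with the K2P measure, the distance between two aligned sequences is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. The sketch below is a minimal pairwise implementation for illustration; the values in table 4 were computed in MEGA 6.0, and the treatment of gaps here (pairwise skipping) is an assumption that may differ from the settings used there.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences.

    Sites with a gap or ambiguous base in either sequence are skipped.
    Raises ValueError if no sites can be compared; math.log raises
    ValueError if the sequences are too divergent for the correction.
    """
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Multiplying the returned value by 100 gives the percentage divergence in the form quoted in the text.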

Phylogenetic relationships

The tree topologies obtained with the Mp and BI approaches were similar, with strong support at most nodes. The three families, Aristeidae (1.0, 100), Penaeidae (1.0, 63) and Solenoceridae (0.79), were found to be monophyletic, forming the superfamily Penaeoidea with strong support (0.98, 100). In the family Penaeidae, the genus Metapenaeopsis was separated with high support (1.0, 68) from the two genera Parapenaeus and Penaeopsis (0.68, 68). In the family Solenoceridae, the genera Hymenopenaeus and Hadropenaeus were found to be distantly related to the genus Solenocera (0.79). S. hextii showed early divergence from this group (1.0, 100), while the remaining species clustered into two subgroups with strong support (0.80, 81). The first subgroup included S. rathbuni, S. pectinata and S. annectens, while the second subgroup comprised S. melantho, S. choprai and S. crassicornis (figure 2). In addition, COI gene sequences were retrieved from NCBI GenBank for each genus separately, except for Penaeopsis, and combined with our sequences to understand the phylogenetic position of our species (figures 1–5 in electronic supplementary material at http://www.ias.ac.in/jgenet/).

Morphological character evolution

Fifty-two morphological characters representing the carapace (20), thoracic appendages (10), abdominal segments (7) and reproductive organs (15) were used to derive the character matrix (table 2). The characters on the carapace were coded as present (0) or absent (1), except for the 18th and 20th characters. The remaining characters, representing the thoracic appendages, abdominal segments and reproductive organs, were coded as multistate. The reproductive characters were strongly taxon-specific: the thelycum showed various shapes in the anterior region (square-like hollow, narrow vertical, rounded, triangular, subtriangular, shallow transverse and T shaped) and the posterior region (broad: bilobed, trapezoid, hollow, number of rounded bosses). The petasma was characterized by its symmetrical or asymmetrical structure (petaloid, broad, subrectangular, triangular or coiled) and the number of spines or setae.

Table 4 Average K2P distances of COI sequences between taxa.
Fig. 2 Phylogenetic tree recovered by Bayesian analysis based on the combined sequences of COI, Cytb and 16S genes. Nodal support values represent BI/Mp bootstrap. Csp, continental slope; Csf I, continental shelf I; Csf II, continental shelf II; (a) Aristeidae, (b) Penaeidae, (c) Solenoceridae, and (d) out-group.

Fig. 3 Strict consensus tree recovered by parsimony analysis using morphological characters: (a) Aristeidae, (b) Penaeidae, (c) Solenoceridae.

Based on parsimony analysis, 10 characters were noninformative, 41 (78%) were informative and one character was constant. The strict consensus tree (consistency index = 0.67, retention index = 0.57, rescaled consistency index = 0.37) is shown in figure 3.

Discussion

The standardized use of mitochondrial COI gene sequences as DNA barcodes has emerged as an accurate tool for the rapid identification of animal species, providing high species resolution (e.g. Costa et al. 2007; Burns et al. 2008), and is increasingly used for crustacean identification (Goldstein and DeSalle 2011; Hultgren et al. 2014). In our study, 14 taxa from seven genera of the penaeoid group were included in the barcoding gene (COI) analysis. The large (>3.0%) COI genetic distance between taxa supported species identifications in accordance with the morphological classification schemes of Crosnier (1978, 1984, 1985) and Pérez Farfante and Kensley (1997). The intraspecific distance of all taxa was in agreement with the hypothesis of Hebert et al. (2003) except for A. alcocki (>3.0%); however, this is in concordance with Chan et al. (2017), where intraspecific distances were 3.8%, indicating a very conservative genetic divergence.

Cheng et al. (2015) reported a larger genetic distance (15.4%) between species of the genus Metapenaeopsis that are mostly distributed in shallow water, and only 0.3% between M. provocatoria longirostris and M. velutina, which inhabit deeper waters. In comparison, the present study found a small genetic distance (3.3%) between M. andamanensis and M. coniger, which differ mainly in the thelycum (Crosnier 1987) and are distributed in deeper waters (>200 m). Taken together, the results of Cheng et al. (2015) and the present analysis suggest a negative correlation between depth and genetic distance.

Yang et al. (2015) worked on the genus Parapenaeus and reported a low intraspecific distance (0.7%) in species that inhabit shallow water and subsequently migrate to deeper water. Similar results (0.1% genetic distance) were observed in our study for P. investigatoris. This species is widely distributed throughout the Indian Ocean and is fairly abundant off the southwestern coast of India at depths of 160–300 m, where upwelling is characterized by vertical mixing and cascading flows of denser upper layers that enrich the deeper waters with organic nutrients (phytoplankton and zooplankton swarms), which would help crustacean development (Madhupratap and Haridas 1990). The study by Chan et al. (2008) on the genus Penaeopsis revealed a 1.9% genetic distance at the species level. Similarly, in this study, Penaeopsis jerryi showed 0.0% distance between individuals of the same species. This low rate of divergence might be due to stabilizing selection on morphological or ecological characters.

Quan et al. (2004) observed a high COI genetic distance within the genus Solenocera (22.8%), which is smaller than the largest distance between genera in Penaeidae (25.39%), indicating a large barcode gap between the members of these families. Similarly, the present study revealed high genetic distances within and between the genera of Solenoceridae (15.1–24.5%), except between S. melantho and S. crassicornis (7.1%), and slightly lower distances in Penaeidae (16.5–20.5%).

Phylogenetic relationship

The present analysis of mitochondrial genes (COI, Cytb and 16S) using Mp, BI and distance methods revealed two major clades with an out-group, in agreement with Crosnier (1978) and Burkenroad (1983), who used taxonomic characters, namely gill formulae, prosartema, postorbital spines and antennular segments. Clade A consists of Aristeidae, clade B of Penaeidae and clade C of Solenoceridae. Each family was monophyletic, with strong support and large genetic distances. Compared with Aristeidae, Penaeidae and Solenoceridae showed a close evolutionary relationship, which can be compared with other published work using mitochondrial markers (Quan et al. 2004; Voloch et al. 2005; Cheng et al. 2015) and nuclear protein-coding genes (Ma et al. 2009).

In the present study, clade C included three genera (Hadropenaeus, Hymenopenaeus and Solenocera) of the family Solenoceridae with high support. These three genera are characterized by the presence or absence of antennular flagella forming a respiratory tube and of spines on the external ramus of the uropod. These species occupy benthic regions and mostly prefer soft substrates, where they bury deeply in the sediment and keep their respiratory tubes pointing upwards (Dineshbabu and Manissery 2009). They are divided into three subclades, each with strong support. Dall (1999) described the species of the genus Solenocera as inhabitants of the continental shelf and slope, from about 15 m down to several hundred metres, distributed in the Indo-West Pacific. Based on its depth-wise distribution in Indian waters, this genus can be classified into three groups: upper continental slope (>250 m) (Csp), continental shelf I (150–250 m) (Csf I) and continental shelf II (<150 m) (Csf II). The continental slope group included only one species (S. hextii), which differed from the continental shelf species by the presence of a distinct, inverted ‘L’ shaped branchiocardiac crest (FAO 1983). Continental shelf I included three species (S. pectinata, S. rathbuni and S. annectens), which could be differentiated from continental shelf II (S. melantho, S. crassicornis and S. choprai) by the postrostral carina: it extends only to or a little beyond the cervical sulcus in the former, but reaches or almost reaches the end of the carapace in S. melantho, S. crassicornis and S. choprai. The molecular results of the present study were in accordance with the classification of Crosnier (1978).
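
The depth-based grouping described above can be written as a small lookup. This is a sketch only: the function name is hypothetical, and the assignment of the boundary depths (exactly 150 m and 250 m) to Csf I is our assumption, since the text gives the limits as >250 m, 150–250 m and <150 m.

```python
def depth_group(depth_m):
    """Assign a collection depth (m) to the three groups defined in the text.

    Boundary handling is an assumption: 150 m and 250 m are placed in Csf I.
    """
    if depth_m > 250:
        return "Csp"     # upper continental slope
    if depth_m >= 150:
        return "Csf I"   # continental shelf I
    return "Csf II"      # continental shelf II


# Hypothetical depths, not records from table 1.
groups = [depth_group(d) for d in (90, 200, 320)]
```

Applied to the collection depths of individual specimens, such a function makes the grouping explicit and easy to compare against the clades recovered in figure 2.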

Clade B included three genera (Metapenaeopsis, Penaeopsis and Parapenaeus) of the family Penaeidae, which was monophyletic (1.0, 68) as proposed by Burkenroad (1983). Penaeopsis and Parapenaeus align as a sister clade to the genus Metapenaeopsis. Penaeopsis and Parapenaeus possess a symmetrical petasma and a maxilliped III without a basal spine, whereas Metapenaeopsis has an asymmetrical petasma and a maxilliped III with a basal spine, as described by Burukovsky (1983). Similarly, Parapenaeini aligns as a sister clade to Penaeini, as it was confirmed to be older from an evolutionary point of view than the other penaeid groups (Chan et al. 2008; Voloch et al. 2009).

Quan et al. (2004) examined 11 species of Penaeidae and S. crassicornis using combined sequences (COI and 16S) and found that S. crassicornis clustered within the Parapenaeini with a large genetic distance. A similar observation was made by Voloch et al. (2005) in the phylogenetic classification of 39 species of Penaeidae and S. koelbeli. However, the present study revealed that S. crassicornis clustered within Solenoceridae with strong support (1.0, 82) instead of within Parapenaeini, indicating a monophyletic rather than a paraphyletic relationship. These results need to be studied in more detail in future.

A. alcocki specimens belonging to the family Aristeidae clustered in clade A, which appeared to be the earliest divergence in the penaeid group, with strong support. This family differs from other penaeoids in lacking a prosartema, and our result is fairly similar to the classification of Burkenroad (1983). However, Aristeidae is closely related to Penaeidae and Solenoceridae in analyses using nuclear protein-coding genes (Ma et al. 2009; Fernández et al. 2013).

Morphological character evolution

Parsimony analysis of the morphological characters revealed that most synapomorphies are in the carapace and reproductive organs, and are frequently shared with other taxa in the family. The presence of three rostral teeth (03) and the absence of prosartema (06) and antennular flagella (22) are synapomorphic for Aristeidae, showing that A. alcocki diverged early in the phylogeny. Similarly, petasma and thelycum morphology separates Metapenaeopsis and Parapenaeus at a deep node (Crosnier 1985, 1987, 1991); in particular, process a of the petasma, the presence or absence of an anterolateral protuberance in the thelycum, the length of the rostrum and the branchiostegal spine with or without carina are important synapomorphies for Parapenaeus (Yang et al. 2015). In the present study, characters 13, 15, 21, 42, 48 and 49 for Parapenaeus and 43, 44, 45, 46 and 52 for Metapenaeopsis were observed to be synapomorphic. The members of the family Solenoceridae diverge from the others with respect to the postorbital spine (11), cervical sulcus (8) and rostral length (2), as observed by Burukovsky (1983). Character 22 (states 1, 2), corresponding to antennular flagella modified into a tube-like structure that aids respiration, together with the ocular angle (9) and pereopod (27), was suggested to be synapomorphic for the genus Solenocera (Dall 1999). The species within this genus diverge in two synapomorphic characters: the postrostral carina extending or not extending to the end of the carapace (7) and the thickness of pereopod IV (29). Thus, the development of the postrostral carina on the dorsal side of the carapace appears to be of evolutionary importance in the genus Solenocera. Character 18 (state 1) corresponds to the recurved pterygostomian region in S. pectinata and S. rathbuni, whereas it is distinctly formed (state 2) in S. annectens. In a few other species, namely S. melantho, S. crassicornis, S. choprai and S. hextii, thelycum characters were found to be important, but in the molecular analysis S. hextii diverged from this group. Thelycal characters therefore cannot be considered synapomorphic for this species.

In conclusion, in the present study we identified nine species from three genera of Solenoceridae, four species from three genera of Penaeidae and one species of the family Aristeidae, with high molecular divergence (COI: 3.3–33.0%), collected along the Indian coast. Further, we generated a DNA barcode database for these species, which can help in further investigations of the detailed evolution and biogeography of these valuable crustacean resources. Results derived from the integration of molecular and morphological characters can contribute to the elaboration of phylogenetic hypotheses. Both datasets helped to clarify species circumscriptions within this group and clearly showed that all species from these families are monophyletic. Comparing these data is preferable to using only taxonomic characters or a single DNA region when generating a robust phylogenetic hypothesis. Moreover, the concatenation of sequences from three genes (COI, Cytb, 16S) yielded a phylogenetic hypothesis that strongly supports the monophyly of Penaeidae, Aristeidae and Solenoceridae. However, a few authors have found Solenoceridae nested within Penaeidae, making the latter family paraphyletic. Nevertheless, large and accurate species data collections from Indian waters are a prerequisite for understanding and explaining the evolutionary relationships in these families.