1 Introduction

Coronavirus disease 2019 (COVID-19) is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), a member of the genus Betacoronavirus (family Coronaviridae), and has been an urgent public health concern worldwide [1,2,3]. As of March 4, 2022, more than 107 countries had reported infections due to Omicron variants since the first case was reported on November 29, 2021 [4]. India's first Omicron cases were reported from the state of Karnataka on December 1, 2021 [5], and Delhi later reported a case in a traveller returning from Tanzania [6]. In this study, we sought to sequence all COVID-19 samples, including Omicron variants, that were reported at our tertiary care centre to gain further insights into the mutations occurring in this SARS-CoV-2 variant.

2 Methods

Nasopharyngeal swab samples were collected from 75 patients with a history of travel to Africa or the Middle East. Here, we analysed samples from 10 randomly selected, representative patients who presented with mild symptoms (fever, cold, cough, sore throat and mild weakness) within 3 days of onset of infection and prior to hospitalization. The samples were used as input for the ARTIC network “Midnight” protocol (Fig. 15.1) for PCR tiling of SARS-CoV-2, followed by Oxford Nanopore Technologies (ONT) long-read whole-genome sequencing (Rapid Barcoding Kit 96, SQK-RBK110.96) [7, 8].

Fig. 15.1
A vertical flow diagram with six steps marked along a downward-pointing arrow: cDNA synthesis, library preparation (pools A and B), combining pools A and B, cleanup and quantification, loading the library into the SpotON port, and data analysis.

Midnight workflow for preparation of SARS-CoV-2 whole-genome sequencing. This method was similar to the ARTIC amplicon sequencing protocol for MinION for SARS-CoV-2 v3 (LoCost) by Josh Quick and the method used in Freed et al. [8]

3 Results and Discussion

ONT sequencing yielded an average of 25 million reads from all 10 samples, spanning 96.28% of the SARS-CoV-2 genome (20× coverage depth) (Table 15.1). To examine whether the number of mutations in the receptor-binding domain (RBD) of the spike glycoprotein could be associated with transmissibility, we compared the 44 common mutations from our samples with the recently emerging mutations of Omicron. Our preliminary analysis indicated that the Omicron variant formed a subclade with the dominant Delta variant and may have evolved rapidly through the accumulation of multiple mutations (Tables 15.2a, 15.2b, 15.3 and 15.4). A neighbour-joining tree was constructed using Clustal Omega with the sequences sorted vertically and drawn as a circular, unrooted tree (Fig. 15.2a) [9]. We observed that the Indian Omicron variants clustered together, with a root emerging from OL815455, the variant that was first detected in Botswana. The iTOL tree containing the 75 sequenced samples and the Wuhan reference yielded distinct clades in both the unrooted and the rooted circular trees (data not shown), and the separate clustering of four samples suggested that these were among the first suspected Omicron cases in India (Fig. 15.2a) [10,11,12].

We identified p.Thr614Ile, p.Thr1822Ile, p.Thr6098Ile and p.Asp155Tyr in LNHD9, p.Ala701Val and p.Val1887Ile in LNHD8, and p.Gly667Ser in LNHD1. However, our preliminary observations indicated that none of these is known to confer detrimental properties on the spike (e.g. changes in transmissibility, severity or immune evasion). Mutations in the spike proteins (Fig. 15.2b(i–iii)) of SARS-CoV-2 variants of concern were also compared with the parental SARS-CoV-2 isolate B.1; the comparison suggests that the same positions are already altered in these variants, but with distinct amino acid substitutions (Supplementary Tables 15.1 and 15.2).
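The coverage figure quoted at the start of this section (96.28% of the genome at 20× depth) is a breadth-of-coverage statistic: the share of reference positions covered by at least a given number of reads. The following is a minimal sketch of that calculation, not the pipeline used in this study. The function name `coverage_breadth` and the toy depth values are hypothetical; in practice the per-position depths might be exported from an alignment with a tool such as `samtools depth -a`.

```python
from typing import Sequence


def coverage_breadth(depths: Sequence[int], min_depth: int = 20) -> float:
    """Return the percentage of positions with read depth >= min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    # Count positions that meet or exceed the depth threshold.
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


# Hypothetical per-position depths for a 10-base toy region
# (illustrative values only, not data from this study).
toy_depths = [0, 5, 19, 20, 21, 150, 300, 45, 20, 8]

breadth = coverage_breadth(toy_depths, min_depth=20)
print(f"{breadth:.1f}% of positions covered at >= 20x")
```

Applied to a full genome, `depths` would hold one value per reference position, so the result is directly comparable to the percentage reported in Table 15.1.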

Table 15.1 List of the 10 samples with coverage, CT values, clinical symptoms and age/sex
Table 15.2a Genome coverage and mutations in the cohort
Table 15.2b Multiple mutations identified in the study cohort
Table 15.3 Amino acid substitutions in the spike region observed in the study cohort
Table 15.4 Common mutations (n = 44) seen across the Indian cohort
Fig. 15.2
Two panels. (a) A circular phylogenetic tree with 75 samples. (b) Three illustrations: two folded structures of the spike glycoprotein, without and with a complex with the host cell, and a Venn diagram with three circles for samples LNHD1, LNHD8 and LNHD9.

(a) Circular phylogenetic tree of all 75 samples from India clustered with the Wuhan reference genome. The unrooted tree shows a clear separation of the Wuhan reference from the other lineages. All LNHD accessions are labelled. In the Indian sub-population, 35 spike mutations were seen; where a mutated position lies in a loop/terminal region, the nearest residue is given in parentheses (A67V, V70I(69), T95I, G142V, Y145H(143), N211I, L212I, G339D, R346K, S371L, S373P, S375F, K417N, N440K, G446S, S477N, T478K, E484A, Q493R, G496S, Q498R, N501Y, Y505H, T547K, D614G, H655Y, N679K(674), P681H(674), A701V, N764K, D796Y, N856K, Q954H, N969K and L981F). (b) (i) Spike glycoprotein (PDB: 6acc, EM, 3.6 Å) with the RBD in the down conformation. (ii) Multi-Venn diagram of the three samples LNHD1, LNHD8 and LNHD9 showing the mutations unique to each sample and those common to all of the LNHD series. (iii) Spike glycoprotein (PDB: 6acj, EM, 4.2 Å) in complex with the host cell receptor ACE2 (green ribbon). (Also see links to Supplementary Tables 15.1 and 15.2)
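The Venn comparison in panel (b)(ii) amounts to set operations on the per-sample mutation calls. The sketch below shows one way such a partition could be computed; it is an illustration under stated assumptions rather than the method used to draw the figure. The function name `partition_mutations` is hypothetical. The sample-specific entries are the substitutions reported in the text, written in one-letter form, while `SHARED_PLACEHOLDER` is an invented stand-in for the mutations common to all three samples.

```python
from typing import Dict, Set, Tuple


def partition_mutations(
    calls: Dict[str, Set[str]],
) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Split per-sample mutation sets into (shared by all, unique per sample)."""
    if not calls:
        raise ValueError("calls must contain at least one sample")
    # Mutations present in every sample (centre of the Venn diagram).
    shared = set.intersection(*calls.values())
    unique: Dict[str, Set[str]] = {}
    for name, muts in calls.items():
        # Union of the mutations seen in all other samples.
        others = set().union(*(m for n, m in calls.items() if n != name))
        unique[name] = muts - others
    return shared, unique


# SHARED_PLACEHOLDER is a hypothetical stand-in, not a call from this study;
# the remaining entries are the sample-specific substitutions named in the text.
calls = {
    "LNHD1": {"SHARED_PLACEHOLDER", "G667S"},
    "LNHD8": {"SHARED_PLACEHOLDER", "A701V", "V1887I"},
    "LNHD9": {"SHARED_PLACEHOLDER", "T614I", "T1822I", "T6098I", "D155Y"},
}

shared, unique = partition_mutations(calls)
for sample in sorted(unique):
    print(sample, sorted(unique[sample]))
```

Mutations shared by exactly two samples fall into neither output here; a full three-set Venn diagram would also need those pairwise overlaps.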

A limitation of our study is that, although the adopted ARTIC sequencing protocol allowed the confirmation of SARS-CoV-2 infections, we did not analyse the probable structural impact of the mutations on the binding of antibodies elicited by existing vaccines or previous SARS-CoV-2 infections, as described by Kannan et al. [13].

In conclusion, our study has demonstrated the utility of nanopore sequencing of SARS-CoV-2 genomes from clinical specimens. We hope that prompt diagnosis and rapid whole-genome analysis will allow a decisive response to SARS-CoV-2 outbreaks and support disease control and prevention efforts.