Main

We hypothesized that the genes responsible for cytoplasmic incompatibility (CI) (Extended Data Fig. 1a) are present in all CI-inducing Wolbachia strains and absent or divergent in non-CI strains; we also predicted that these genes are expressed in the gonads of infected insects. To elucidate CI candidates, we determined the core genome shared by the CI-inducing Wolbachia strains wMel (from D. melanogaster), wRi (from Drosophila simulans), wPip (Pel strain from Culex pipiens), and wRec (from Drosophila recens), while excluding the pan-genome of the mutualistic strain wBm (from Brugia malayi). This yielded 113 gene families representing 161 unique wMel genes (Fig. 1a and Supplementary Table 1). We further narrowed this list by comparing it with (1) homologues of genes previously determined by comparative genomic hybridization to be absent or divergent in the non-CI strain wAu (ref. 9), (2) homologues of genes highly expressed at the RNA level in wVitA-infected Nasonia vitripennis ovaries, and (3) homologues detected at the protein level in wPip (Buckeye)-infected C. pipiens ovaries. We included ovarian data with the reasoning that CI genes might be generally expressed in infected reproductive tissues, or that the CI induction and rescue genes might be the same. Remarkably, only two genes, wMel locus tags WD0631 and WD0632, were shared among all four gene subsets (Fig. 1b and Supplementary Tables 2–4). Notably, the homologue of WD0631 in the Wolbachia strain wPip was found at the protein level in the fertilized spermathecae of infected mosquitoes, lending support to its role in reproductive manipulation (ref. 10).
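
The narrowing step described above amounts to intersecting four gene subsets. A minimal sketch of that logic follows; the set contents are hypothetical placeholders (only the locus tags WD0631 and WD0632 come from the text), and the function name is an assumption made for illustration.

```python
# Hypothetical gene subsets. Only WD0631 and WD0632 are taken from the text;
# GENE_A, GENE_B and GENE_C are placeholders, not real locus tags.
core_ci_genome = {"WD0631", "WD0632", "GENE_A", "GENE_B", "GENE_C"}
absent_in_wau = {"WD0631", "WD0632", "GENE_A"}
ovary_rna = {"WD0631", "WD0632", "GENE_B"}
ovary_protein = {"WD0631", "WD0632", "GENE_C"}


def shared_candidates(*gene_sets):
    """Return the genes present in every supplied subset, sorted by name."""
    if not gene_sets:
        return []
    return sorted(set.intersection(*map(set, gene_sets)))


candidates = shared_candidates(
    core_ci_genome, absent_in_wau, ovary_rna, ovary_protein
)
print(candidates)
```

With real data, each set would hold the wMel homologues that satisfy one criterion, and the intersection would be the candidate list.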

Figure 1: Comparative analyses reveal WD0631 and WD0632 in the eukaryotic association module of prophage WO as candidate CI genes.

a, Venn diagram illustrating the number of unique and shared gene families from four CI-inducing Wolbachia strains. b, Venn diagram illustrating the number of unique and shared wMel genes matching each criteria combination. c, e, Bayesian phylogenies of (c) WD0631 and (e) WD0632 and their homologues, on the basis of a core 256-amino-acid (aa) alignment of WD0631 reciprocal BLASTp hits and a core 462-aa alignment of WD0632 reciprocal BLASTp hits. When multiple similar copies exist in the same strain, only one copy is shown. Consensus support values are shown at the nodes. Both trees are based on the JTT+G model of evolution and are unrooted. d, CI patterns correlate with WD0631/WD0632 sequence homology. wRi rescues wMel and both share a similar set of homologues (*). The inability of wMel to rescue wRi correlates with a type (†) that is present in wRi but absent in wMel. Likewise, bidirectional incompatibility of all other crosses correlates to divergent homologues. This diagram was adapted from ref. 30. f, Protein architecture of the WD0631 and WD0632 types is conserved for each clade and is classified according to the WD0632-like domain. TM, transmembrane. Dotted shading represents the region of shared homology used to construct phylogenetic trees. For c and e, the WO-prefix indicates a specific phage WO haplotype and the w-prefix refers to a ‘WO-like island’, a small subset of conserved phage genes, within that specific Wolbachia strain.


We analysed the evolution and predicted protein domains of these two genes and found that their homologues are always paired within the eukaryotic association module of prophage WO (ref. 8), and they co-diverged into three distinct phylogenetic groups that we designate types I, II, and III (Fig. 1c, e and Supplementary Table 5). These relationships are not evident in the phylogeny of the Wolbachia cell division gene ftsZ, which exhibits the typical bifurcation of supergroup A and B Wolbachia (Extended Data Fig. 1b), or in the phylogeny of prophage WO baseplate assembly gene gpW (Extended Data Fig. 1c). This suggests that homologues of WD0631 and WD0632 evolve under different evolutionary pressures than genes in the core Wolbachia genome or in a structural module of phage WO.

Type I variants are the most prevalent among ten sequenced Wolbachia strains, and are always associated with large prophage WO regions that often lack tail genes (Extended Data Fig. 2); it is unclear whether these WO regions forge fully intact or defective interfering particles. The functions of type I WD0631 homologues are unknown, although type I WD0632 homologues contain weak homology to a putative Peptidase_C48 domain (wMel, National Center for Biotechnology Information (NCBI) conserved domain E = 6.69 × 10−4, Fig. 1f), a key feature of Ulp1 (ubiquitin-like-specific protease) proteases10. Type II variants are located within more complete phage haplotypes (Extended Data Fig. 2), but the WD0632 homologues are truncated and lack recognized protein domains (Fig. 1f). Notably, all Wolbachia strains that contain type II variants contain at least one pair of intact type I variants. Type III variants possess WD0631 homologues with a weakly predicted cytochrome C552 domain involved in nitrate reduction (wNo, NCBI conserved domain E = 3.79 × 10−3), while type III WD0632 homologues contain weak homology to the PD-(D/E)XK nuclease superfamily (wNo, NCBI conserved domain E = 1.15 × 10−3) and to a transmembrane domain predicted by the transmembrane hidden Markov model11 (Fig. 1f). Finally, a putative type IV variant encoding a carboxy (C)-terminal PD-(D/E)XK nuclease superfamily (NCBI conserved domain E = 3.69 × 10−3) was identified in Wolbachia strains wPip and wAlbB, but not included in phylogenetic analyses because the WD0632 homologues are highly divergent (28% identity across 17% of the protein) and do not appear in reciprocal BLASTp analyses. The predicted functions of type III and IV protein domains are not well understood, but a homologue of the putative nuclease domain was previously found in a selfish genetic element that mediates embryonic lethality in Tribolium beetles12. 
Uncertain annotations and substantial unknown sequence across all of the phylogenetic types necessitate caution in extrapolating definitive gene functions. Importantly, the region of shared homology among the WD0632 homologues (Fig. 1f) is outside the putative C-terminal Peptidase_C48 domain, suggesting that the unannotated regions represent an ancestral CI sequence core that warrants closer inspection.

Consistent with a role in CI, the degree of relatedness and presence/absence of homologues of WD0631 and WD0632 between Wolbachia strains correlates with known patterns of bidirectional incompatibility (Fig. 1d). Among the strains wRi, wHa, and wNo, only wRi rescues wMel-induced CI in same-species crosses13,14. We postulate that this is due to wRi and wMel sharing highly related type I homologues (99% amino-acid identity), and thus probably sharing a rescue factor, while wRi also has a type II homologue that may explain its ability to induce CI against wMel. Meanwhile, bidirectionally incompatible pairs are highly divergent, with only 29–68% amino-acid identity (Extended Data Fig. 3a). Additionally, variation in CI strength between strains appears to correlate with the number of copies of the WD0631/WD0632 pair (Extended Data Fig. 3b). Strains with only one copy, such as wMel, have a comparatively weak CI phenotype, while those with two or three copies, such as wRi and wHa, cause strong CI14.

Given the various lines of evidence that associate these two genes with CI, we next examined the functional role of WD0631 and WD0632 in CI. For comparison, the following control genes were also used: WD0034, which has a predicted PAZ (Piwi, Argonaute, and Zwille) domain (NCBI conserved domain E = 1.85 × 10−18); WD0508, a prophage gene annotated as a putative transcriptional regulator with two helix–turn–helix domains (NCBI conserved domain E = 9.29 × 10−12) in the Octomom region; and WD0625, a prophage gene annotated as a DUF2466 with a JAB1/MPN/Mov34 metalloenzyme (JAMM) domain (NCBI conserved domain E = 1.60 × 10−41). We first examined the expression of these genes in the testes of wMel-infected, 1-day-old and 7-day-old D. melanogaster males. Since CI strength decreases significantly in aged males15, we predicted that a CI factor would be expressed at a lower level in 7-day-old males versus 1-day-old males that both emerged on day 1 of the cross. Indeed, WD0631 and WD0632 showed a significantly lower transcription level in aged males (Fig. 2). Moreover, WD0631 exhibited 18.6- and 83.0-fold higher expression than WD0632 for young and aged males, respectively (Fig. 2). Coupled with RNA-seq expression data16 and operon predictor algorithms, evidence suggests that these genes are not generally acting as an operon in wMel. Both prophage-associated control genes, WD0508 and WD0625, also exhibited this age-dependent expression pattern, but the non-prophage gene WD0034 did not (Fig. 2). WD0640, which encodes prophage WO structural protein GpW, was also reduced in older males, suggesting that prophage genes in general are relatively downregulated in 7-day-old testes (Fig. 2). The phenomenon of decreased CI in older males was not due to decreases in Wolbachia titre over time, as the copy number of Wolbachia groEL relative to D. melanogaster rp49 increased as males aged, and there was no significant difference in absolute Wolbachia gene copies between 1-day-old and 7-day-old males (Extended Data Fig. 4a, b). Since CI expression is also correlated with male development time, we examined gene expression in early emerging ‘older brothers’ (emerged on day 1) and later emerging ‘younger brothers’ (emerged on day 5). Expression was statistically equivalent for WD0631 (Extended Data Fig. 4c), and slightly reduced in younger brothers for WD0632 (Extended Data Fig. 4d). These results are consistent with a small younger brother effect17, although we did not observe a statistically significant effect on CI penetrance (Extended Data Fig. 4e).

Figure 2: Relative expression of CI candidate and prophage WO genes decreases as males age.

RNA expression in 1-day-old versus 7-day-old testes, normalized to expression of groEL in wMel-infected D. melanogaster testes from the fastest-developing males. Values denote 2^−ΔCt. n = 6 independent pools of 20 testes for each group. Bars, mean ± s.d. *P < 0.05, **P < 0.01 by Mann–Whitney U-test. This experiment was performed once. Exact P values are provided in Supplementary Table 7.


To directly test involvement of these genes in CI, we generated transgenic D. melanogaster expressing genes using an upstream activating sequence (UAS), since Wolbachia itself cannot be genetically transformed. We used a nanos-Gal4 driver line for tissue-specific expression in the male and female germline18,19. We assessed CI by measuring the percentage of embryos that hatched into larvae. While wild-type (WT) CI between infected males and uninfected females led to significantly reduced hatch rates, expressing each of four candidate transgenes in uninfected (fastest-developing, 1 day old) males did not affect hatch rates when crossed to uninfected females (Fig. 3a and Extended Data Fig. 5a). These transgenes also had no effect on sex ratios (Extended Data Figs 5b and 6a). There were no phenotypic effects despite confirmed expression of each transgene in the testes (Extended Data Fig. 7a–d).

Figure 3: Dual expression of WD0631 (cifA) and WD0632 (cifB) is necessary to induce CI-like defects.

a–c, Hatch rate assays used the fastest-developing males that were aged either (a, b) 1 day or (c) 2–4 days in parental crosses; older males express incomplete CI. Parental infection status is designated with filled symbols for wMel-infected parents or open symbols for uninfected parents. Transgenic flies are labelled with their transgene to the right of their male/female symbol. Unlabelled symbols represent WT flies. Data points are coloured according to the type of cross: blue, no CI; red, CI crosses; purple, rescue crosses with wMel-infected females. n = 24–54 for each group. Bars, mean ± s.d. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001 by analysis of variance (ANOVA) with Kruskal–Wallis test and Dunn’s multiple test correction. Statistical comparisons are between all groups (a, b); or between CI crosses only (c). All experiments were performed at least twice, except for the increase of WT CI by WD0631, which was done once. Exact P values are provided in Supplementary Table 7.


As WD0631 and WD0632 are adjacent, coevolving genes, we reasoned that dual expression of WD0631 and WD0632 might be required to induce CI. Indeed, expression of both transgenes in the same males significantly reduced hatch rates by 68% compared with uninfected WT crosses (Fig. 3b), with no effect on sex ratios (Extended Data Fig. 6b). Roughly half of the crosses with transgenic males yielded hatch rates within the range observed in WT CI (3.8 ± 5.6% hatch rate). Interestingly, there was a strong positive correlation between hatch rate and clutch size when both transgenes were expressed (rs = 0.7; P = 0.0003), but not in WT CI, suggesting that dilution of transgene products across larger clutches may explain variation in transgene-induced CI. It is also possible that full transgene induction of CI requires other factors, or that transgenes are not expressed at the ideal time or place for complete CI, although transgene expression in adult testes was confirmed (Extended Data Fig. 7c, d).
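
The rank correlation between hatch rate and clutch size reported above can be illustrated with a small, dependency-free sketch of Spearman's coefficient (Pearson correlation of average ranks). The function names and the example numbers are assumptions for illustration, not the study's data, and no P value is computed here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical clutch sizes and hatch rates (%), for illustration only.
clutch_sizes = [30, 45, 60, 80, 120]
hatch_rates = [5.0, 12.0, 10.0, 40.0, 65.0]
print(round(spearman_rs(clutch_sizes, hatch_rates), 2))
```

A coefficient near +1 means larger clutches tend to have higher hatch rates, which is the pattern the dilution hypothesis would predict.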

Importantly, transgene-induced lethality is fully rescued in embryos of wMel-infected females (Fig. 3b), indicating that these genes produce probable CI factors rather than artefacts that reduce hatch rates through off-target effects that would not be rescued. We therefore name and hereafter refer to these genes as cytoplasmic incompatibility factor A (cifA) and B (cifB) for WD0631 and WD0632, respectively. Type II, III, and IV homologues are designated cif-like until experimental evidence demonstrates that they recapitulate CI.

To test whether cifA (WD0631) and cifB (WD0632) transgenes act additively with Wolbachia to enhance WT CI levels, wMel-infected male flies expressing either transgene were aged 2–4 days to lower WT CI penetrance before crossing with uninfected females. In support of transgene-induced enhancement of CI, hatch rates in these aged males decreased significantly compared with WT CI crosses of the same age (Fig. 3c), with no effect on sex ratios (Extended Data Fig. 6c). In this context, wherein aged flies cause a weaker level of WT CI, the transgenes appear to add to the quantity of CI effectors in wMel-infected tissues, causing stronger CI overall. This effect was not observed when control transgenes WD0508 or WD0625 were expressed individually in wMel-infected males (Extended Data Fig. 8a, b). Moreover, dual expression of cifA and cifB in wMel-infected flies reduced hatch rates further than either gene alone, yet was still fully rescued in embryos of wMel-infected females (Fig. 3c). Adding WD0625 to cifB in aged wMel-infected males did not increase CI beyond cifB alone (Extended Data Fig. 8b), and had no effect on embryonic hatch rates from crosses with 1-day-old uninfected males (Extended Data Fig. 8c). Finally, none of these gene combinations affected offspring sex ratios (Extended Data Fig. 9). Taken together, these findings support the central conclusion that cifA and cifB are both necessary to induce the CI phenotype, and they do not represent an artefact of the transgenic system.

To rule out the possibility that transgene-induced enhancement of CI in infected lines is due to increased Wolbachia titres, we quantitated amplicons of single-copy genes from Wolbachia and D. melanogaster. Although there were some differences in Wolbachia titres between infected transgenic lines (Extended Data Fig. 10a–c), the variation did not correlate with induction or magnitude of CI, signifying that decreased offspring viability was due to the direct effect of transgene products rather than changes in Wolbachia density. Most notably, densities significantly increased in infected flies expressing the control Octomom transgene WD0508 (Extended Data Fig. 10a) but did not enhance CI (Extended Data Fig. 8a).

Next, we tested whether transgene-induced CI associates with canonical cytological defects observed in Wolbachia-induced CI. Although CI is typically thought to cause failure of the first mitotic division20,21, nearly half of the embryonic arrest in incompatible crosses occurs during advanced developmental stages in D. simulans22,23, Aedes polynesiensis24, and C. pipiens25. We examined embryos from control and experimental crosses after 1–2 h of development and binned their cytology into one of six phenotypes. While a few embryos in each cross were unfertilized (Fig. 4a), most embryos in compatible crosses were either in normal late-stage preblastoderm (Fig. 4b) or syncytial blastoderm stages26 (Fig. 4c). In WT CI, significantly more embryos exhibited three defects: arrest of cellular division after two to three mitotic divisions (Fig. 4d), later stage arrest associated with chromatin bridging, as is classically associated with strong CI in D. simulans21 (Fig. 4e), or arrest associated with regional failure of division in one embryo region (Fig. 4f). After blindly scoring embryo cytology, we determined that aberrant phenotypes (a, d, e, and f) were significantly more common in the offspring of dual cifA/cifB transgenic males mated to uninfected females, but these abnormalities were rescued in embryos from wMel-infected females (Fig. 4g). These effects were not seen with control transgene WD0508 or with singular expression of cifA or cifB (Fig. 4h). These data again validate that transgene-induced CI, as measured through cytological defects, recapitulates WT CI. Most of the embryos arrest after two to three mitotic divisions.

Figure 4: Dual expression of WD0631 (cifA) and WD0632 (cifB) recapitulates CI-associated embryonic defects.

a–f, Representative embryo cytology is shown for (a) unfertilized embryos, (b) normal multi-nucleated embryos at 1 h of development, (c) normal embryos near 2 h of development in which nuclei begin to migrate to the periphery of the cytoplasm, and three different mitotic abnormalities: (d) failure of nuclear division after two to three mitoses, (e) chromatin bridging, and (f) regional mitotic failure. g, The number of embryos with each cytological phenotype resulting from the indicated crosses is shown. Infection status is designated with filled symbols for wMel-infected parents or open symbols for uninfected parents. Transgenic flies are labelled with their transgene to the right of their male/female symbol. Unlabelled symbols represent WT flies. Black lines on each graph indicate mean hatch rate for the cross. ***P < 0.001, ****P < 0.0001 by two-tailed Fisher’s exact test comparing normal (phenotypes b and c) with abnormal (phenotypes a, d–f) for each cross. h, Quantitation of cytological defects in control crosses. Cytology for g was performed twice and for h once. Exact P values are provided in Supplementary Table 7.
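
The normal-versus-abnormal comparison in this legend is a two-tailed Fisher's exact test on a 2 × 2 table. The sketch below shows one common way to compute it, summing the hypergeometric probabilities of all tables that are no more likely than the observed one. That two-tailed convention, the function name, and the example counts are assumptions for illustration; the software used for the published P values is not specified here.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows could be two crosses and columns normal/abnormal embryo counts.
    Returns the summed probability of every table with the same margins
    whose probability does not exceed that of the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def table_probability(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts: (normal, abnormal) embryos in two crosses.
print(fisher_exact_two_tailed(3, 1, 1, 3))
```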


This study identifies, for the first time, two differentially transcribed and codiverging prophage WO genes that recapitulate and enhance CI. These rapidly evolving genes are not chromosomal Wolbachia genes per se, but rather occur widely in the eukaryotic association module of prophage WO (ref. 8). This module notably contains genes with amino-acid sequences homologous to eukaryotes or annotated to interact with animal cells, although cifA and cifB do not appear to have eukaryotic homology. CI can therefore be categorized as a prophage WO-induced phenotype rather than a Wolbachia-induced phenotype. We name the genes and close homologues cytoplasmic incompatibility factors A and B for WD0631 and WD0632, respectively. The cif name is conservatively grounded in phenotype and makes no assumptions about mechanism, which is notable because there are unannotated core regions throughout the cif genes that may have as much bearing on mechanism as the annotated domains.

The discovery of cifA and cifB genes that functionally recapitulate and enhance CI is the first inroad in solving the genetic basis of reproductive parasitism, a phenomenon induced worldwide in potentially millions of arthropod species27. These prophage WO genes have implications for microbe-assisted speciation, because they can underlie CI-induced hybrid lethality observed between closely related species of Nasonia and Drosophila28,29. Finally, cifA and cifB are important for arthropod pest and vector control strategies, as they could be an alternative or adjunct to current Wolbachia-based efforts aimed at controlling agricultural pests or curbing arthropod-borne transmission of infectious diseases4,5,6,7.

Methods

Comparative genomics and transcriptomics

MicroScope31 was used to select the set of genes comprising the core genomes of CI-inducing Wolbachia strains wMel (NC_002978.6)32, wRi (NC_012416.1)33, wPip (Pel) (NC_010981.1)34, and the recently sequenced wRec (GCA_000742435.1)35, while excluding the pan-genome of the mutualistic strain wBm (NC_006833.1)36, using cutoffs of 50% amino-acid identity and 80% alignment coverage. For the ‘absent in wAu’ criterion, wAu microarray data were obtained from the original authors9 and genes that were present in CI-inducing strains wRi and wSim but absent or divergent in the non-CI strain wAu were selected.

For ovarian transcriptomics, 1-day-old females from wVitA-infected N. vitripennis 12.1 were hosted as virgins on Sarcophaga bullata pupae37 for 48 h to stimulate feeding and oogenesis. Females were then dissected in RNase-free 1× PBS buffer, and their ovaries were immediately transferred to RNase-free microcentrifuge tubes in liquid nitrogen. Fifty ovaries were pooled for each of three biological replicates. Ovaries were manually homogenized with RNase-free pestles, and their RNA was extracted using the RNeasy Mini Kit (Qiagen) according to the manufacturer’s protocol for purification of total RNA from animal tissues. After RNA purification, samples were treated with RQ1 RNase-free DNase (Promega), and ethanol precipitation was performed. PCR of RNA samples with Nasonia primers NvS6KQTF4 and NVS6KQTR4 (ref. 38) confirmed that all samples were free of DNA contamination. RNA concentrations were measured with a Qubit 2.0 Fluorometer (Life Technologies) using the RNA HS Assay kit (Life Technologies), and approximately 5 μg of total RNA from each sample was used as input for the MICROBEnrich Kit (Ambion) to enrich for Wolbachia RNA in the samples. Bacterial-enriched RNA was then ethanol-precipitated, and rRNA was depleted from the samples using the Ribo-Zero Magnetic kit (Illumina) according to the manufacturer’s protocol. Approximately 1.5 μg of enriched, rRNA-depleted RNA for each replicate was shipped to the University of Rochester Genomics Research Center for sequencing. Library preparation was performed using the Illumina ScriptSeq version 2 RNA-Seq Library Preparation kit, and all samples were run multiplexed on a single lane of the Illumina HiSeq2500 (single-end, 100 base pair reads). Raw sequence reads were trimmed and mapped to the wVitA genome (PRJNA213627) in CLC Genomics Workbench 8.5.1 using a minimum length fraction of 0.9, a minimum similarity fraction of 0.8, and allowing one gene hit per read. 
With all three replicates combined, a total of 364,765 reads out of 41,894,651 (0.87%) mapped to the wVitA genome, with the remaining reads mapping to the N. vitripennis host genome (GCF_000002325.3). All Wolbachia genes with at least five RNA-seq reads, with the exception of the 16S and 23S RNA genes, were selected. For non-wMel data sets, the closest homologues in wMel were found using BLASTp in Geneious Pro version 5.5.6 (ref. 39).
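
As a check on the mapping figures and the read-count threshold described above, the following sketch recomputes the mapped fraction and applies the "at least five reads, excluding 16S and 23S" rule. The gene labels and counts in the example dictionary are hypothetical, as are the function names.

```python
def percent_mapped(mapped_reads, total_reads):
    """Percentage of all reads that mapped to the reference genome."""
    return 100.0 * mapped_reads / total_reads


# Labels for the excluded rRNA genes are placeholders, not real annotations.
EXCLUDED_GENES = frozenset({"16S_rRNA", "23S_rRNA"})


def expressed_genes(read_counts, min_reads=5, excluded=EXCLUDED_GENES):
    """Genes with at least `min_reads` reads, excluding the rRNA genes."""
    return sorted(
        gene
        for gene, count in read_counts.items()
        if count >= min_reads and gene not in excluded
    )


# Read totals quoted in the text.
print(round(percent_mapped(364_765, 41_894_651), 2))

# Hypothetical per-gene counts, for illustration only.
example_counts = {"16S_rRNA": 90_000, "gene_1": 4, "gene_2": 5, "gene_3": 812}
print(expressed_genes(example_counts))
```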

Protein extraction and mass spectrometry

Protein was extracted from C. pipiens tissues as described previously10. Ovaries from 30 wPip (Buckeye)-infected mosquitoes were dissected in 100% ethanol and collected in a 1.5 ml tube filled with 100% ethanol. Pooled tissues were sonicated at 40 mA for 10 s in a Kontes GE 70.1 ultrasonic processor, and trichloroacetic acid was added to a final concentration of 10% (v/v). After centrifugation at 13,000 r.p.m. in a microcentrifuge, pellets were washed with acetone:water (9:1), dried, and stored at −20 °C. Samples were directly submitted to the University of Minnesota’s Center for Mass Spectrometry and Proteomics for iTRAQ (isobaric tagging for relative and absolute quantification) analysis. Proteins were sorted according to their relative abundance as determined by the number of spectra from the single most abundant peptide. Because proteins can often produce varying amounts of detectable tryptic peptides, depending upon protein size and lysine/arginine content, we counted only the single most abundant peptide for each protein. This quantification was justified by a previous report40 showing that the two most abundant proteins are the Wolbachia surface protein (WSP; WP_007302328.1) and another putative membrane protein (WP0576; WP_012481859.1). Only proteins with at least three unique peptides (95% confidence) detected were reported; using this criterion the false discovery rate was zero.

Evolutionary analyses

WD0631 (NCBI accession number AAS14330.1) and WD0632 (AAS14331.1) from wMel were used as queries to perform a BLASTp search of NCBI’s nonredundant (nr) protein sequence database with algorithm parameters based on a word size of six and BLOSUM62 scoring matrix41. Homologues were selected on the basis of the satisfaction of three criteria: (1) E ≤ 10−20, (2) query coverage greater than 60%, and (3) presence in fully sequenced Wolbachia and/or phage WO genomes. FtsZ and gpW proteins were identified for all representative Wolbachia and phage WO genomes, respectively. Protein alignments were performed using the MUSCLE plugin42 in Geneious Pro version 8.1.7 (ref. 39); the best models of evolution, according to corrected Akaike information criteria43, were estimated using the ProtTest server44; and phylogenetic trees were built using the MrBayes plugin45 in Geneious. Putative functional domains were identified using NCBI’s BLASTp, Wellcome Trust Sanger Institute’s PFAM database46, a transmembrane hidden Markov model11, and EMBL’s Simple Modular Architecture Research Tool (SMART)47. WD0631/WD0632 protein homology (percentage amino-acid identity) was based on a 1:1 BLASTp analysis for each pair. Prophage/WO-like island association for each pair of genes was based on prophage regions identified in a previous study8.
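
The three homologue-selection criteria can be expressed as a simple filter over BLASTp hits. This is a sketch only: the `BlastHit` record, its field names, and the example hits are hypothetical, and only the thresholds (E value of at most 1e-20, query coverage above 60%, fully sequenced genome) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlastHit:
    """Hypothetical summary of one BLASTp hit (field names are assumptions)."""
    subject: str
    e_value: float
    query_coverage: float  # percent of the query covered by the alignment
    in_complete_genome: bool  # hit lies in a fully sequenced genome


def is_homologue(hit, max_e_value=1e-20, min_coverage=60.0):
    """Apply the three selection criteria described in the text."""
    return (
        hit.e_value <= max_e_value
        and hit.query_coverage > min_coverage
        and hit.in_complete_genome
    )


# Made-up hits, for illustration only.
hits = [
    BlastHit("hit_strong", 1e-150, 98.0, True),
    BlastHit("hit_weak_e", 1e-5, 95.0, True),
    BlastHit("hit_short", 1e-40, 45.0, True),
    BlastHit("hit_draft", 1e-80, 90.0, False),
]
selected = [hit.subject for hit in hits if is_homologue(hit)]
print(selected)
```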

Gene expression assays and Wolbachia titres

For the male age effect, native expression of CI candidates was tested with RT–qPCR on replicate pools of 20 pairs of testes from the fastest-developing virgin males that were aged 1 day or 7 days. RNA was extracted with a Qiagen RNeasy mini kit, DNase treated with TURBO DNase (Life Technologies), and cDNA generated with Superscript III Reverse Transcriptase (Invitrogen). Primer sequences are listed in Supplementary Table 6. Quantitative PCR was performed on a Bio-Rad CFX-96 Real-Time System using iTaq Universal SYBR Green Supermix (Bio-Rad). Thirty cycles of PCR were performed against positive controls (extracted DNA), negative controls (water), RNA, and cDNA with the following conditions: 95 °C 2 min, 30× (95 °C 15 s, 56 °C 30 s, 72 °C 30 s), 72 °C 5 min. Values of 2^−ΔCt between the target gene and housekeeping gene groEL were used to determine relative gene expression. These experiments were performed once with multiple replicates for each condition.
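
The relative expression measure used here is 2 raised to the power −ΔCt, where ΔCt is the target gene's threshold cycle minus that of the reference gene (groEL). A minimal sketch with hypothetical Ct values follows; the function name is an assumption, and the calculation presumes roughly 100% amplification efficiency, as the 2^−ΔCt form implies.

```python
def relative_expression(ct_target, ct_reference):
    """Return 2^-(Ct_target - Ct_reference).

    A target that crosses threshold two cycles later than the reference
    is estimated to be expressed at one quarter of the reference level.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values, for illustration only.
young = relative_expression(ct_target=22.0, ct_reference=20.0)
aged = relative_expression(ct_target=24.0, ct_reference=20.0)
print(young, aged, young / aged)
```

The ratio of two such values gives the fold difference between conditions, which is how a statement such as "lower expression in aged males" would be quantified.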

For experiments on the younger brother effect, replicate pools of 20 pairs of testes were collected from the fastest-developing virgin males that emerged on the first day (older brothers) or fifth day (younger brothers). Male siblings for the younger brother effect analysis were also collected concurrently for hatch rates as described for hatch rate assays by crossing the wMel-infected males to 3- to 5-day-old wMel-infected or uninfected females. RNA was extracted using the Direct-zol RNA MiniPrep Kit (Zymo), DNase treated with DNA-free (Ambion, Life Technologies), cDNA was generated with SuperScript VILO (Invitrogen), and RT–qPCR was run using iTaq Universal SYBR Green (Bio-Rad). Primers, PCR conditions, and analysis were the same as for the male age effect above. These experiments were performed once with multiple replicates for each condition.

For gene expression in Extended Data Fig. 7, six pools of six pairs of testes were dissected from parents used in hatch rate assays from a repeat of Fig. 3a and Extended Data Fig. 5. In samples designated ‘High CI’ and ‘No CI’, the males correspond to crosses that had lower or normal hatch rates, respectively. For all other samples, the flies used were chosen at random. RNA was extracted using the same method as the younger brother experiment above. Thirty cycles of PCR were performed against positive controls (extracted DNA), negative controls (water), RNA, and cDNA with PCR conditions described above. Gel image size and brightness were adjusted in some cases for clarity. These experiments were performed once.

For the Wolbachia titres, pools of testes were dissected from 15 males in ice-cold PBS. For Extended Data Fig. 10a–c, brothers of those used in the corresponding hatch rates were used. DNA was extracted using a Gentra Puregene Tissue kit (Qiagen). qPCR was done as described above. Absolute quantification was achieved by comparing all experimental samples with a standard curve generated on the same plate. Primers are listed in Supplementary Table 6. qPCR conditions were as follows: 50 °C 10 min, 95 °C 5 min, 40× (95 °C 10 s, 55 °C 30 s), 95 °C 30 s. To obtain a more accurate Wolbachia:host cell ratio, it was assumed that each host cell had two copies of rp49 and each Wolbachia cell had one copy of groEL. These experiments were performed once but with a sample size of eight for each condition.
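
The Wolbachia:host cell ratio follows directly from the copy-number assumptions stated above (two copies of rp49 per host cell, one copy of groEL per Wolbachia cell). The sketch below makes the arithmetic explicit; the function name and the example copy numbers are hypothetical.

```python
def wolbachia_per_host_cell(
    groel_copies,
    rp49_copies,
    groel_per_wolbachia=1,
    rp49_per_host_cell=2,
):
    """Estimate Wolbachia cells per host cell from absolute qPCR copies."""
    wolbachia_cells = groel_copies / groel_per_wolbachia
    host_cells = rp49_copies / rp49_per_host_cell
    return wolbachia_cells / host_cells


# Hypothetical absolute copy numbers read off a standard curve.
print(wolbachia_per_host_cell(groel_copies=1000, rp49_copies=500))
```

In this example 500 rp49 copies correspond to 250 host cells, so 1,000 groEL copies give four Wolbachia cells per host cell rather than the two that a raw copy ratio would suggest.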

Fly rearing

D. melanogaster were reared on a standard cornmeal- and molasses-based media. Stocks were maintained at 25 °C while virgin flies were stored at room temperature. During virgin collections, stocks were kept at 18 °C overnight and 25 °C during the day. All flies were kept on a 12-h light/dark cycle. Wolbachia-uninfected lines were generated through tetracycline treatment for three generations. Briefly, tetracycline was dissolved in ethanol and then diluted in water to a final concentration of 1 mg/ml. One millilitre of this solution was added to 50 ml of media (final concentration of 20 μg/ml). Freshly treated media was used for each generation. Infection status was confirmed with PCR using Wolb_F and Wolb_R3 primers48, and flies were reared on untreated media for at least three additional generations to allow for mitochondrial recovery before being used49.
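
The stated final tetracycline concentration can be checked with a short dilution calculation. The sketch below assumes the stated 20 μg/ml ignores the 1 ml of added solution; including it gives a slightly lower value. The function and parameter names are hypothetical.

```python
def final_concentration_ug_per_ml(
    stock_mg_per_ml,
    stock_volume_ml,
    media_volume_ml,
    include_added_volume=False,
):
    """Final drug concentration (ug/ml) after adding stock to media."""
    total_ug = stock_mg_per_ml * 1000.0 * stock_volume_ml
    total_volume_ml = media_volume_ml
    if include_added_volume:
        total_volume_ml += stock_volume_ml
    return total_ug / total_volume_ml


# 1 ml of 1 mg/ml tetracycline added to 50 ml of media.
print(final_concentration_ug_per_ml(1.0, 1.0, 50.0))
print(round(final_concentration_ug_per_ml(1.0, 1.0, 50.0, True), 1))
```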

Transgenic flies

Each CI candidate gene was cloned into the pTIGER plasmid for transformation and expression in D. melanogaster50. pTIGER, a pUASp-based vector designed for germline expression, exhibits targeted integration into the D. melanogaster genome using PhiC31 integrase51 and tissue-specific, inducible expression through the Gal4–UAS system52. Cloning was performed using standard molecular biology techniques and plasmids were purified and Sanger-sequenced for confirmation before injection. At least 200 D. melanogaster embryos were injected per gene by Best Gene (Chino Hills, California), and transformants were selected on the basis of w+ eye colour. All transgenic lines were made in the yw D. melanogaster background, and each was an isofemale line derived from the offspring of a single transformant. Homozygous lines were maintained when possible, or heterozygous flies were maintained when homozygous transgenics were inviable (WD0625/CyO). WD0508 and WD0631 insertion was performed with the y1 M{vas-int.Dm}ZH-2A w*; P{CaryP}attP40 line. WD0625 was inserted into BSC9723 with the genotype y1 M{vas-int.Dm}ZH-2A w*; PBac{y+-attP-3B}VK00002. WD0632 insertion was done using BSC8622 with the genotype y1 w67c23; P{CaryP}attP2.

Hatch rate and sex ratio assays

Parental females were either wMel-infected or uninfected y1w* flies and were aged for 2–6 days before crossing. Uninfected y1w* flies were generated as described for transgenic lines. Parental males were created by crossing nanos-Gal4 virgin females (wMel-infected or uninfected) with either y1w* or UAS-candidate gene-transgenic, uninfected males. Only the first males emerging between 0 and 30 h from these crosses were used in CI assays to control for the younger-brother effect associated with CI17. To test whether CI can be increased by transgenes, day-1 virgin males were aged for 2–4 days before crossing to reduce the level of WT CI. Within experiments, care was taken to match the age of males between experimental and control crosses. Between 32 and 64 individual crosses were set up for each crossing condition. The flies used were chosen at random from the desired group on the basis of age, sex, and genotype. These sample sizes were based on previous studies of CI in D. melanogaster that detected significant differences between treatment groups53.

To perform the hatch rate assays, a male and female pair was placed in an 8-ounce, round-bottom, polypropylene Drosophila stock bottle. A grape-juice–agar plate with a small amount of yeast mix smeared on top was placed in the bottle opening and affixed with tape. To create grape-juice–agar plates, 12.5 g of agar was mixed in 350 ml of de-ionized water and autoclaved. In a separate flask, 10 ml of ethanol was used to dissolve 0.25 g tegosept (methyl 4-hydroxybenzoate). Welch’s grape juice (150 ml) was added to the tegosept mix, combined with the agar, and poured into lids from 35 × 10-mm culture dishes (CytoOne).

Hatch rate bottles were placed in an incubator at 25 °C overnight (~16 h). After this initial incubation, the grape plates were discarded and replaced with freshly yeasted plates. After an additional 24 h, the adult flies were removed and frozen for expression analysis, and the embryos on each plate were counted. The counting was not blinded. These plates were then incubated at 25 °C for 36 h before the number of unhatched embryos was counted. Larvae from each pair of flies were moved from these plates with a probe and placed in vials of standard fly media, one vial per grape plate, to be assayed for sex ratios at adulthood. A total of 10–20 vials were used for each cross type. Any crosses with fewer than 25 embryos laid were discarded from the hatching analysis, while vials with fewer than ten adults emerging were discarded from the sex ratio analysis. Statistical analysis and outlier removal, using the ROUT method, were performed using GraphPad Prism version 6 software.
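The counting and exclusion rules above can be sketched as follows. This is a minimal illustration, not the authors' analysis pipeline: the function names are hypothetical, the hatch rate is assumed to be the percentage of counted embryos that were not scored as unhatched, and the sex ratio is assumed to be expressed as the fraction of emerging adults that are male (the text does not specify how it was expressed). ROUT outlier removal, which was performed in GraphPad Prism, is not reproduced here.

```python
# Exclusion thresholds taken from the text.
MIN_EMBRYOS = 25  # crosses with fewer embryos laid are dropped from hatch analysis
MIN_ADULTS = 10   # vials with fewer emerging adults are dropped from sex ratio analysis


def hatch_rate(total_embryos, unhatched):
    """Percentage of embryos hatched, or None if the cross is excluded."""
    if total_embryos < MIN_EMBRYOS:
        return None
    return 100.0 * (total_embryos - unhatched) / total_embryos


def sex_ratio(males, females):
    """Fraction of adults that are male (assumed convention), or None if excluded."""
    adults = males + females
    if adults < MIN_ADULTS:
        return None
    return males / adults


# Hypothetical counts, for illustration only: (embryos laid, unhatched embryos).
crosses = [(80, 60), (20, 5), (40, 2)]
rates = [hatch_rate(total, unhatched) for total, unhatched in crosses]
kept = [r for r in rates if r is not None]
print(kept)
```

Returning `None` for excluded crosses, instead of silently dropping them, keeps the number of discarded pairs easy to report alongside the retained hatch rates.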

Embryo cytology

Embryos were collected in a fashion similar to hatch rate assays, except that bottles contained 60–80 females and 15–20 males. All flies used were brothers and sisters of those used in the corresponding hatch rate assays. Embryo collections and hatch rate assays were performed side-by-side. After initial mating overnight, fresh grape plates with yeast were provided and removed after 60 min. The embryo-covered plates were then placed in the incubator at 25 °C for a further 60 min to ensure each embryo was at least 1–2 h old. Embryos were then moved to a small mesh basket and dechorionated in 50% bleach for 1–3 min. These were then washed in embryo wash solution (0.7% NaCl, 0.05% Triton X-100) and moved to a small vial containing ~2 ml heptane. An equal amount of methanol was added to the vial, which was then vigorously shaken for 15 s. After the embryos settled, the upper heptane layer and as much methanol as possible were removed, and the embryos were moved into ~500 μl fresh methanol in a 1.5 ml microcentrifuge tube. Embryos were stored overnight at 4 °C. The old methanol was then removed and replaced with 250 μl of fresh methanol along with 750 μl of PBTA (1× PBS, 1% BSA, 0.05% Triton X-100, 0.02% sodium azide). After inverting the tube several times, the solution was removed and replaced with 500 μl PBTA. Embryos were then rehydrated for 15 min on a rotator at room temperature. After rehydrating, the PBTA was replaced with 100 μl of a 10 mg/ml solution of RNase A (Clontech Labs) and incubated at 37 °C for 2 h. The RNase was then removed and embryos were washed several times with PBS followed by a final wash with PBS–azide (1× PBS, 0.02% sodium azide). After removing the PBS–azide, embryos were mounted on glass slides with ProLong Diamond Antifade (Life Technologies) spiked with propidium iodide (Sigma-Aldrich) to a final concentration of 1 μg/ml.
Imaging was performed at the Vanderbilt University Medical Center Cell Imaging Shared Resource using a Zeiss LSM 510 META inverted confocal microscope. All scores were performed blind (researcher was not aware of which slide represented which cross) and image analysis was done using ImageJ software54. Matched scoring, where embryos were derived from a side-by-side hatch rate, was performed once for conditions shown in Fig. 4h and twice for Fig. 4g.

Statistical analyses

No statistical methods were used to predetermine sample size. The experiments were not randomized. The investigators were not blinded to allocation during experiments and outcome assessment, except for the scoring of cytology (Fig. 4), which was performed blind.

All statistical analyses used GraphPad Prism software (either Prism 6 or online tools). When comparing gene expression levels or Wolbachia titres between two sets of data, we used a two-tailed, non-parametric Mann–Whitney U-test since it does not require a normal distribution of the data. For comparisons between more than two data sets, we used a non-parametric Kruskal–Wallis one-way analysis of variance test that, if significant, was followed by Dunn’s test of multiple comparisons. This allowed robust testing between all data groups while controlling for multiple-comparison bias. For the cytology studies, embryos were classified as either ‘normal’ or ‘CI-like’ in a 2 × 2 contingency table, and statistical differences between the groups were calculated using Fisher’s exact test.
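To illustrate the contingency-table comparison used for the cytology data, the sketch below computes a two-sided Fisher’s exact P value for a 2 × 2 table using only the Python standard library. It is not the GraphPad Prism implementation: the function name is hypothetical, and the two-sided P value is assumed to be the sum of the probabilities of all tables (with the same margins) that are no more likely than the observed one, which is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Rows could be two crosses; columns could be 'normal' and 'CI-like' embryos.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum all tables that are no more probable than the observed table;
    # the small tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-7))


# Hypothetical embryo counts, for illustration only.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"P = {p_value:.4f}")
```

An exact test suits this design because the counts in individual cells of a 2 × 2 table can be small, where a chi-squared approximation would be unreliable.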

Data availability

wVitA transcriptome data have been deposited in the Sequence Read Archive with BioProject PRJNA319204 and BioSample SAMN04881412. wPip-infected ovarian proteome data have been deposited in the ProteomeXchange Consortium via the PRIDE55 partner repository with the data set identifier PXD004047. All other source data are available as Supplementary Information with this publication.