Genetic diversity in cultured and wild marine cyanomyophage reveals phosphorus stress as a strong selective agent 

Libusha Kelly, Huiming Ding, Katherine H. Huang, Marcia S. Osburne, and Sallie W. Chisholm


Viruses that infect marine cyanobacteria (cyanophage) often carry genes that have orthologs in their cyanobacterial hosts, and the frequency of these genes can vary with habitat. To explore habitat-influenced genomic diversity more deeply, we used the genomes of 28 cultured cyanomyoviruses, including 11 new genomes, as references to identify phage genes in three different ocean habitats. Only about 6-11% of genes were consistently observed in the wild, revealing high gene content variability in these populations. Numerous shared phage/host genes differed in relative frequency in different environments, including genes related to phosphorus acquisition, photorespiration, photosynthesis, and the pentose phosphate pathway, possibly reflecting environmental selection for these genes in phage genomes. The strongest emergent signal was related to phosphorus: a higher fraction of phage genomes from relatively low-phosphorus environments (the Sargasso and Mediterranean Seas) contained host-like phosphorus assimilation genes compared with those from the N. Pacific Gyre. These genes were previously observed to be upregulated when the host is phosphorus-starved, a response mediated by pho box motifs in phage genomes that bind a host regulatory protein. Eleven phage genomes have predicted pho boxes upstream of the phosphate-acquisition genes pstS and phoA, and eight of these have a conserved, hypothetical, cyanophage-specific gene (PhCOG173) between the pho box and pstS. PhCOG173 is also found upstream of other shared phage/host genes, suggesting a unique regulatory role. Pho boxes are also found upstream of high light-inducible protein (hli) genes in phage, suggesting that this motif may play a broader role than simply regulating phosphate-stress responses in infected hosts, or that these hlis may play a role in phosphate stress.


This version contains the following 68 genomes (24 host, 44 phage) and 55622 genes, of which 49559 host genes and 6063 phage genes are annotated in various COGs. This version does not apply the hmm scan.

Genome Type
P-RSP2 cpp
P-SSM2 cpm
P-SSM4 cpm
S-PM2 cpm
Syn9 cpm
P-HM1 cpm
P-HM2 cpm
P-RSM4 cpm
P-SSM7 cpm
Syn19 cpm
Syn33 cpm
Syn1 cpm
S-ShM2 cpm
S-SM2 cpm
S-SSM7 cpm
S-SSM5 cpm
S-SM1 cpm
S-RSM4 cpm
S-SSM4 cpm
P-SSM3 cpm
P-SSM5 cpm
MED4-213 cpm
P-RSM1 cpm
P-RSM3 cpm
P-RSM6 cpm
S-SSM2 cpm
Syn10 cpm
Syn2 cpm
Syn30 cpm
MED4-117 cpm
P-SSP7 cpp
P60 cpp
Syn5 cpp
P-HP1 cpp
P-RSP5 cpp
P-SSP2 cpp
P-SSP9 cpp
P-SSP5 cpp
MED4-184 cpp
P-SSP10 cpp
P-SSP6 cpp
P-GSP1 cpp
PSS2 cps
P-SSS3 cps
MED4 p
MIT9313 p
MIT9303 p
AS9601 p
MIT9515 p
MIT9215 p
MIT9211 p
MIT9312 p
SS120 p
MIT9301 p
MIT9202 p
Syn_CC9311 s
Syn_CC9605 s
Syn_CC9902 s
Syn_WH8102 s
Syn_WH7803 s
Syn_RCC307 s
Syn_WH7805 s
Syn_BL107 s
Syn_RS9917 s
Syn_RS9916 s
Syn-WH5701 s

p - Prochlorococcus; s - Synechococcus; cpm - myo phage; cpp - podo phage; cps - sipho phage.
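
The genome list above is a two-column, whitespace-separated table (genome name, type code), so it can be tallied by type with a few lines of Python. The following is a minimal sketch, not part of the original data release: the function name `tally_by_type` and the `sample` rows are illustrative assumptions, and only a short excerpt of the table is included. The type codes and their meanings are taken from the legend above.

```python
from collections import Counter

# Type codes as given in the legend of the genome table.
LEGEND = {
    "p": "Prochlorococcus",
    "s": "Synechococcus",
    "cpm": "myo phage",
    "cpp": "podo phage",
    "cps": "sipho phage",
}
HOST_CODES = {"p", "s"}  # the remaining codes denote phage


def tally_by_type(lines):
    """Count genomes per type code; skip the header and malformed rows."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        # A valid row is exactly "<genome> <code>" with a known code.
        if len(parts) != 2 or parts[1] not in LEGEND:
            continue
        counts[parts[1]] += 1
    return counts


# Illustrative excerpt only; the full table has many more rows.
sample = [
    "Genome Type",
    "P-RSP2 cpp",
    "P-SSM2 cpm",
    "P-SSM4 cpm",
    "PSS2 cps",
    "MED4 p",
    "Syn_WH8102 s",
]

counts = tally_by_type(sample)
n_host = sum(n for code, n in counts.items() if code in HOST_CODES)
n_phage = sum(n for code, n in counts.items() if code not in HOST_CODES)
for code, n in sorted(counts.items()):
    print(f"{code} ({LEGEND[code]}): {n}")
print(f"host genomes: {n_host}, phage genomes: {n_phage}")
```

Running the same function over the full table gives host and phage totals that can be checked against the counts stated in the version note.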