Recent Gene Duplications in Streptococcus mutans and Streptococcus sanguinis

Ching-Hung Tseng and Gary Xie

Introduction

In this project, we looked for recently duplicated genes in Streptococcus mutans and Streptococcus sanguinis. A gene duplication event is considered recent if the duplication happened after the speciation event (see Figure 1). In this case, searching all proteins of an organism against the proteins in the non-redundant (NR) database using BLAST will identify the recently duplicated genes. The top hit will be the query protein itself, but if the second-best hit comes from the same organism, we can consider the gene to be recently duplicated, except in cases of gene loss or non-orthologous gene displacement [1]. The results shown here are therefore preliminary and may need further analysis to confirm that each gene found was indeed recently duplicated.
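The second-best-hit test described above can be sketched in Python. This is a minimal illustration, not the pipeline actually used for this project: the function name `second_best_same_organism`, the tuple layout `(query, subject, subject organism, bit score)`, and the toy identifiers and scores are all assumptions made for the example, and real BLAST output would first have to be parsed into this form.

```python
from collections import defaultdict


def second_best_same_organism(hits, organism):
    """Return {query: runner_up_subject} for queries whose best non-self hit
    comes from `organism`.

    hits: iterable of (query_id, subject_id, subject_organism, bit_score)
    tuples (a hypothetical, already-parsed form of BLAST results).
    """
    by_query = defaultdict(list)
    for query, subject, subj_org, score in hits:
        by_query[query].append((score, subject, subj_org))

    candidates = {}
    for query, rows in by_query.items():
        # Drop the self-hit, which is expected to rank first.
        others = [row for row in rows if row[1] != query]
        if not others:
            continue
        # The highest-scoring remaining hit plays the role of "second-best hit".
        _, subject, subj_org = max(others, key=lambda row: row[0])
        if subj_org == organism:
            candidates[query] = subject
    return candidates


# Toy data: identifiers and scores are invented for illustration only.
hits = [
    ("geneA", "geneA", "S. mutans", 500.0),
    ("geneA", "geneB", "S. mutans", 450.0),
    ("geneA", "otherX", "S. pneumoniae", 300.0),
    ("geneC", "geneC", "S. mutans", 400.0),
    ("geneC", "otherY", "S. pneumoniae", 350.0),
    ("geneC", "geneD", "S. mutans", 100.0),
]
result = second_best_same_organism(hits, "S. mutans")
print(result)
```

In the toy data, geneA is flagged because its runner-up hit is another S. mutans protein, whereas geneC is not flagged because a protein from a different species outranks its within-genome match.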

The criteria for identifying recently duplicated genes in Streptococcus mutans and Streptococcus sanguinis are as follows:

From this search, S. mutans and S. sanguinis were found to have 16 and 29 putative duplicated genes in their genomes, respectively. If the genes in a potentially duplicated pair have similar functional descriptions, we can be more confident that they result from a gene duplication event. Bidirectional second-best hit pairs for S. mutans and S. sanguinis are shown in Tables 1 and 2 below, respectively.
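As a rough sketch of how bidirectional second-best hit pairs could be picked out, the Python snippet below takes a mapping from each gene to its second-best hit and keeps only the reciprocal pairs. The function name `bidirectional_pairs` and the dictionary layout are assumptions for illustration; the sample entries are locus tags taken from Table 1.

```python
def bidirectional_pairs(second_best):
    """Return sorted, de-duplicated gene pairs that are each other's second-best hit.

    second_best: dict mapping a locus tag to the locus tag of its second-best hit
    (a hypothetical summary of the search results).
    """
    pairs = set()
    for gene, hit in second_best.items():
        # A pair is bidirectional when the hit points back to the gene.
        if gene != hit and second_best.get(hit) == gene:
            pairs.add(tuple(sorted((gene, hit))))
    return sorted(pairs)


# Locus tags from Table 1; SMU.767 hits SMU.565c, but not the other way round.
second_best = {
    "SMU.875c": "SMU.1370c",
    "SMU.1370c": "SMU.875c",
    "SMU.436c": "SMU.565c",
    "SMU.565c": "SMU.436c",
    "SMU.767": "SMU.565c",
}
pairs = bidirectional_pairs(second_best)
print(pairs)
```

One-directional hits such as SMU.767 are dropped by this filter, so whether to report them separately (as Table 1 does) is a choice left to the analysis.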

Figure 1. Concept of recent gene duplication

Table 1: Recently duplicated genes in Streptococcus mutans

| Locus Tag | Description | Hit Locus Tag | Hit Description | Hit Coverage | Hit E-value |
|---|---|---|---|---|---|
| SMU.875c | putative transposase, IS150-like | SMU.1370c | putative transposase, IS150-like | 0.980 (201/205) | 1.00E-115 |
| SMU.1370c | putative transposase, IS150-like | SMU.875c | putative transposase, IS150-like | 0.980 (201/205) | 1.00E-115 |
| SMU.436c | putative transposase, ISSmu1 | SMU.565c | putative transposase, ISSmu1 | 0.996 (277/278) | 1.00E-157 |
| SMU.565c | putative transposase, ISSmu1 | SMU.436c | putative transposase, ISSmu1 | 0.996 (277/278) | 1.00E-157 |
| SMU.767 | putative transposase, ISSmu1 | SMU.565c | putative transposase, ISSmu1 | 0.996 (277/278) | 1.00E-157 |
| SMU.286 | putative ABC transporter, ATP-binding protein ComA | SMU.1881c | putative ABC transporter, ATP-binding protein | 0.811 (619/763) | 0 |
| SMU.1881c | putative ABC transporter, ATP-binding protein | SMU.286 | putative ABC transporter, ATP-binding protein ComA | 0.834 (634/760) | 0 |
| SMU.1347c | conserved hypothetical protein; possible permease | SMU.1347c. Best hit is SMU.1365c | conserved hypothetical protein; possible permease | 0.985 (768/780) | 0 |
| SMU.1365c | hypothetical protein; possible permease | SMU.1347c | conserved hypothetical protein; possible permease | 0.985 (768/780) | 0 |
| SMU.1021 | putative citrate lyase, alfa subunit | not found in smut gbk file, x-ray data | Putative Alfa Subunit Of Citrate Lyase | 0.960 (498/519) | 0 |
| SMU.112c | putative transcriptional regulator | SMU.112c. Best hit gi is 93277277, a RpiR-like transcription factor. | putative transcriptional regulator | 0.884 (220/249) | 1.00E-115 |
| SMU.1407c | putative transposase, ISSmu1 | SMU.565c | putative transposase, ISSmu1 | 0.996 (277/278) | 1.00E-157 |
| SMU.1836 | 3-deoxy-7-phosphoheptulonate synthase | SMU.1837 | 3-deoxy-7-phosphoheptulonate synthase | 0.875 (300/343) | 1.00E-156 |
| SMU.1893c | putative transposase, ISSmu1 | SMU.565c | putative transposase, ISSmu1 | 0.996 (277/278) | 1.00E-157 |
| SMU.1954 | chaperonin GroEL | not found in smut gbk file, OME175 strain | GroEL | 0.930 (489/526) | 0 |
| SMU.590c | putative transposase, fragment | SMU.1024c | putative transposase fragment | 0.943 (50/53) | 7.00E-20 |


Table 2: Recently duplicated genes in Streptococcus sanguinis

| Locus Tag | Description | Hit Locus Tag | Hit Description | Hit Coverage | Hit E-value |
|---|---|---|---|---|---|
| SSA_0134 | Membrane carboxypeptidase (penicillin-binding protein), putative | SSA_0175 | Penicillin-binding protein 1B, putative | 0.935 (701/750) | 0 |
| SSA_0175 | Penicillin-binding protein 1B, putative | SSA_0134 | Membrane carboxypeptidase (penicillin-binding protein), putative | 0.872 (701/804) | 0 |
| SSA_1362 | ORFB, transposon ISSsa2 | SSA_0266 | ORFB, transposon ISSsa1 | 0.995 (201/202) | 1.00E-111 |
| SSA_0266 | ORFB, transposon ISSsa1 | SSA_1362 | ORFB, transposon ISSsa2 | 0.995 (201/202) | 1.00E-111 |
| SSA_1305 | hypothetical protein | SSA_1110 | hypothetical protein SSA_1110 | 0.870 (160/184) | 8.00E-79 |
| SSA_1110 | hypothetical protein | SSA_1305 | hypothetical protein SSA_1305 | 0.870 (160/184) | 8.00E-79 |
| SSA_1285 | hypothetical protein | SSA_1289 | hypothetical protein SSA_1289 | 0.975 (154/158) | 5.00E-82 |
| SSA_1289 | hypothetical protein | SSA_1285 | hypothetical protein SSA_1285 | 0.975 (154/158) | 5.00E-82 |
| SSA_1681 | ABC-type bacitracin resistance protein A, ATPase component, putative | SSA_1660 | ABC-type antimicrobial peptide transport system, ATPase component, putative | 0.992 (251/253) | 1.00E-135 |
| SSA_1660 | ABC-type antimicrobial peptide transport system, ATPase component, putative | SSA_1681 | ABC-type bacitracin resistance protein A, ATPase component, putative | 0.969 (251/259) | 1.00E-135 |
| SSA_1757 | hypothetical protein | SSA_1758 | hypothetical protein SSA_1758 | 0.842 (170/202) | 4.00E-80 |
| SSA_1758 | hypothetical protein | SSA_1757 | hypothetical protein SSA_1757 | 0.842 (170/202) | 4.00E-80 |
| SSA_0148 | Sugar ABC transporter, ATP-binding protein, putative | SSA_2040 | ABC transporter ATP-binding protein-multiple sugar transport, putative | 0.960 (361/376) | 0 |
| SSA_0394 | hypothetical protein | SSA_0599 | hypothetical protein SSA_0599 | 0.849 (73/86) | 8.00E-32 |
| SSA_0407 | ABC-type multidrug transport system (3-component subtilin immunity exporter), ATPase component, putative | SSA_0412 | ABC-type multidrug transport system (3-component subtilin immunity exporter), ATPase component, putative | 0.828 (250/302) | 1.00E-112 |
| SSA_0432 | Formate--tetrahydrofolate ligase, putative | not found in ssan gbk file. From NCBI, this gi is SSA_0432. | FTHS1_STRSV Formate--tetrahydrofolate ligase 1 (Formyltetrahydrofolate synthetase 1) (FHS 1) (FTHFS 1) | 0.966 (538/557) | 0 |
| SSA_0524 | Microcompartment protein, putative | SSA_0525 | Microcompartment protein, putative | 0.802 (73/91) | 1.00E-29 |
| SSA_0555 | Conserved hypothetical cytosolic protein | SSA_0557 | hypothetical protein SSA_0557 | 0.942 (129/137) | 5.00E-66 |
| SSA_0556 | hypothetical protein | SSA_0562 | hypothetical protein SSA_0562 | 0.932 (96/103) | 9.00E-50 |
| SSA_0684 | Fibril-like structure subunit FibA, putative | SSA_1635 | hypothetical protein SSA_1635 | 0.958 (660/689) | 1.00E-94 |
| SSA_0904 | CshA-like fibrillar surface protein A | SSA_0906 | CshA-like fibrillar surface protein C | 0.850 (2268/2669) | 0 |
| SSA_0947 | hypothetical protein | SSA_0948 | hypothetical protein SSA_0948 | 0.832 (164/197) | 9.00E-80 |
| SSA_0969 | hypothetical protein | SSA_0968 | hypothetical protein SSA_0968 | 0.858 (103/120) | 5.00E-47 |
| SSA_1007 | ABC transporter ATP-binding protein-multiple sugar transport, putative | SSA_0148 | Sugar ABC transporter, ATP-binding protein, putative | 0.923 (347/376) | 0 |
| SSA_1340 | Zn/Mn ABC-type porter lipoprotein, putative | SSA_1990 | Zn-porter lipoprotein, putative | 0.820 (250/305) | 1.00E-121 |
| SSA_1369 | FmtA-like protein, putative | SSA_1366 | Alkaline D-stereospecific endopeptidase precursor, putative | 0.875 (398/455) | 0 |
| SSA_1663 | Collagen-binding protein A | SSA_1666 | Collagen-binding surface protein, putative | 1.479 (954/645) | 1.00E-170 |
| SSA_2016 | Phosphoglycerate mutase, putative | SSA_2015 | Phosphoglycerate mutase, putative | 0.828 (192/232) | 8.00E-90 |
| SSA_2156 | Conserved uncharacterized protein | SSA_2155 | hypothetical protein SSA_2155 | 0.818 (189/231) | 2.00E-87 |

Reference

1. Graur and Li. 2000. Genome Evolution, p. 372. In Fundamentals of Molecular Evolution. Sinauer Associates, Sunderland, MA.