Transcription in all cellular organisms is performed by multi-subunit, DNA-dependent RNA polymerases that synthesize RNA from DNA templates. In order to overcome technical difficulties inherent to the large subunit sequences, including large sequence length, small and large lineage-specific insertions, split subunits, and fused proteins, we created an automated and customizable sequence retrieval and processing system. In addition, we used our alignments to create a more expansive set of shared sequence regions and bacterial lineage-specific domain insertions. We also analyzed the intergenic gap between the bacterial β and β′ genes.

… subunit and its eRNAP II and III homologs, along with the eRNAP II homologs from mouse and … confirmed that both were SBHM domain repeats involved in important protein-protein and/or protein-nucleic acid interactions 23.

In this paper we present a large-scale sequence analysis of the multi-subunit RNAP large subunits. We created comprehensive multiple sequence alignments (MSAs) for the two large subunits from the following multi-subunit RNAPs: bRNAP, pRNAP, aRNAP, eRNAPs I, II, and III, as well as vRNAP. To aid in the creation of the alignments we also developed a sequence retrieval and processing system termed BlaFA (BLAST to FASTA File to Alignment). We used the alignments to better define the shared sequence regions common to all multi-subunit RNAPs. We also analyzed the intergenic gap between the bacterial rpoB and rpoC genes (encoding the β and β′ subunits, respectively), uncovering examples of both gene overlap and extreme spacing. In addition, we located and analyzed the bacterial lineage-specific insertions, identifying new inserts as well as additional species and domain organizations for some of the previously identified insertions.
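To make the terms "gene overlap" and "extreme spacing" concrete, the intergenic gap between two adjacent same-strand genes can be computed from their annotated coordinates. The following is only a minimal sketch and not the procedure used in this study: the `Gene` class, the function names, the 500-base `wide` threshold, and the coordinates in the usage note are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """A gene on the forward strand; coordinates are 1-based and inclusive."""
    name: str
    start: int
    end: int


def intergenic_gap(upstream: Gene, downstream: Gene) -> int:
    """Bases lying strictly between the two genes; a negative value means overlap."""
    return downstream.start - upstream.end - 1


def classify_gap(gap: int, wide: int = 500) -> str:
    """Label a gap. The `wide` threshold is an arbitrary illustrative cutoff."""
    if gap < 0:
        return "overlap"
    if gap >= wide:
        return "extreme spacing"
    return "typical"
```

With made-up coordinates, `intergenic_gap(Gene("rpoB", 1, 4029), Gene("rpoC", 4026, 8000))` gives a negative value, which `classify_gap` labels as an overlap.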
Results

BLAST to FASTA File to Alignment (BlaFA)

Due to the inherent complexities associated with aligning the multi-subunit RNAP large subunits, the process of sequence selection required many steps and special considerations. For example, some sequences needed to be joined, since some RNAPs harbor split large subunits that are encoded by two gene products (chloroplast and cyanobacterial β′, aRNAP subunits A and B). Some sequences needed to be split since a small number of bacteria, such as … K12 and … as representative sequences. This was followed by sequence selection, where the downloaded sequences were processed to: i) join split gene products, ii) split fused gene products, and iii) remove incorrect and partial sequences. Sequences were initially aligned using the program PCMA 26, followed by manual adjustments using PFAAT 27 to fix alignment errors and remove the lineage-specific insertions. We used BlaFA plus manual alignment adjustments to create MSAs for the bacterial β and β′ subunits, as well as for all identifiable β/β′ homologs (Table 2).

Table 2 Number of sequences in multi-subunit RNAP MSAs

Phylogenetic Analysis of the All RNAP Large Subunit MSA

A phylogenetic tree for the All RNAP Large Subunit MSA (with more than 1000 large subunit sequences; Table 2) shows that each class of RNAP was clearly segregated (Fig. 2A), indicating that RNAP class assignments were accurate. As expected, the analysis showed that, although the aRNAPs clearly belong to the RNAP II class, they represent an intermediate between the eukaryotic and bacterial RNAPs. The vRNAPs from the NCLDVs are related to eukaryotic RNAPs. To our knowledge it has not been appreciated that the Iridoviridae, Phycodnaviridae, and Mimivirus families seem to have acquired an eRNAP II-like RNAP, while the Poxviridae family seems to have acquired an eRNAP I-like RNAP (Fig. 2B).
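Returning briefly to the sequence-selection step described for BlaFA above, its three operations (join split gene products, split fused gene products, remove partial sequences) can be sketched as follows. This is an illustrative sketch only and not the BlaFA implementation: the function names, the caller-supplied fusion boundary, and the length filter based on a fraction of the median sequence length are all assumptions introduced here.

```python
from statistics import median
from typing import Dict, List, Tuple


def join_split_products(parts: List[str]) -> str:
    """Concatenate the ordered pieces of a split subunit into one sequence."""
    return "".join(parts)


def split_fused_product(seq: str, boundary: int) -> Tuple[str, str]:
    """Cut a fused protein at a residue index supplied by the caller."""
    if not 0 < boundary < len(seq):
        raise ValueError("boundary must fall inside the sequence")
    return seq[:boundary], seq[boundary:]


def remove_partial(seqs: Dict[str, str], min_fraction: float = 0.5) -> Dict[str, str]:
    """Keep sequences whose length is at least min_fraction of the median length.

    The median-based cutoff is a hypothetical stand-in for whatever criteria
    were actually used to identify incorrect and partial sequences.
    """
    cutoff = min_fraction * median(len(s) for s in seqs.values())
    return {name: s for name, s in seqs.items() if len(s) >= cutoff}
```

In practice the fusion boundary would have to come from an alignment against unfused homologs; here it is simply passed in as an argument.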
It should be noted that another member of the NCLDVs, the Asfarviridae, was not included in this analysis because its RNAP sequences are highly divergent. Close examination of the bRNAP branch showed that the pattern of segregation correlated with established bacterial taxonomy, demonstrating that our alignment contained sequences from a large and diverse set of bacterial species (Fig. 2C). Furthermore, it also highlighted the previously established close relationship between the cyanobacterial RNAPs and pRNAPs.

Fig. 2 Phylogenetic analysis of the All RNAP Large Subunits MSA. The two All RNAP Large Subunit alignments were combined by species and the residue positions pruned to keep only the regions shared among all the sequences. The phylogenetic trees were calculated using …

Bacterial Large Subunit Fusions

The naturally occurring fusion of … and … in the … species 24; 25 has been implicated in the fitness for bacterial infection as well as in the …