J. Cent. South Univ. Technol. (2007)05-0612-06
DOI: 10.1007/s11771-007-0117-x
Homology modeling and evolutionary trace analysis of superoxide dismutase from extremophile Acidithiobacillus ferrooxidans
LIU Yuan-dong(刘元东), WANG Hai-dong(王海东), QIU Guan-zhou(邱冠周)
(School of Resources Processing and Bioengineering, Central South University, Changsha 410083, China)
Abstract: The gene sod in Acidithiobacillus ferrooxidans may play a crucial role in its tolerance to the extremely acidic, toxic and oxidative environment of bioleaching. To gain insight into the anti-toxic mechanism of the bacterium, a three-dimensional (3D) molecular structure of the protein encoded by this gene was built by homology modeling, refined by molecular dynamics simulations and assessed with the PROFILE-3D and PROSTAT programs, and its key residues were then detected by evolutionary trace analysis. Through these procedures, a set of trace residues was identified and spatially clustered. Among them, Asn38, Gly103 and Glu161 are scattered throughout the mapped structure, whereas all the other residues are distinctly clustered in one subgroup near the Fe atom. From these results, the gene can be confirmed at the 3D level to encode the Fe-dependent superoxide dismutase and hence to play an anti-toxic role. Furthermore, the key residues detected around the Fe binding site can be conjectured to be directly responsible for Fe binding and catalytic function.
Key words: bioleaching; superoxide dismutase; Acidithiobacillus ferrooxidans; homology modeling; evolutionary trace; molecular dynamics
1 Introduction
Acidithiobacillus ferrooxidans, a Gram-negative chemolithotrophic bacterium, is one of the most widely applied bacteria in industrial mineral processing, where it is used to extract metals such as copper, lead, zinc, uranium, gold and nickel from their insoluble sulfide minerals[1]. Its habitats are extremely acidic and contain a wide range of toxic or oxidative heavy cations and oxyanions at high concentrations[2]. One may wonder how life is possible in such extreme environments. In fact, A. ferrooxidans not only survives in these poisonous surroundings[3-4], but also exploits them to gain energy for growth by oxidizing ferrous ion and reduced sulfur compounds, converting insoluble metal sulfides to soluble metal sulfates in the process[5]. It is this capability that is utilized to extract metals from minerals. Therefore, understanding the anti-toxicity mechanism of A. ferrooxidans is of vital importance from economic, scientific and environmental points of view[1].
Superoxide dismutase (SOD, EC 1.15.1.1) catalyses the conversion of superoxide radicals to molecular oxygen and hydrogen peroxide[6-8]. Its function is to destroy radicals that are toxic to biological systems, which makes it a key enzyme for protecting organisms against the toxic radicals produced during oxidative processes. Recent studies suggest that SOD is also important for resistance to arsenite[9] and heavy metals[10].
It was reported that the genome sequence of A. ferrooxidans ATCC 23270 includes a sod gene that may encode SOD, so the gene may play a crucial role in the tolerance of the bacterium to the extremely acidic, toxic and oxidative environment of bioleaching and acid mine drainage (AMD). However, no theoretical study of the protein encoded by this gene has been reported so far, and its three-dimensional (3D) structure remains to be elucidated. Knowledge of the detailed 3D structure, and of the key residues identified from it, is essential for understanding its catalytic mechanism and function in the unique physiology of A. ferrooxidans.
In this study, with homology modeling techniques and molecular dynamics simulations, an integral 3D molecular structure of the protein encoded by gene sod from A. ferrooxidans (afSOD) was built, refined and assessed. The obtained structure was further used to perform evolutionary trace analysis to identify the key residues.
2 Theory and methods
The primary amino acid sequence of afSOD was retrieved from the A. ferrooxidans ATCC 23270 genome at the Institute for Genomic Research. All simulations were performed on a Dell Precision 470 workstation running Red Hat Linux, using the INSIGHT II software package developed by Accelrys Software Inc.
2.1 3D model building
Homology modeling is usually the method of choice when a clear homology relationship is found between the sequence of the target protein and at least one known structure[11]. The HOMOLOGY module in INSIGHT II was used to build the initial model of afSOD. Firstly, a sequence similarity search with the BLAST program was carried out for the target sequence against proteins whose solved structures had been deposited in the Protein Data Bank (PDB), in order to find related proteins as templates[12]. Then the MODELER program was run to build the 3D structure of afSOD. MODELER is an implementation of an automated approach to comparative modeling by satisfaction of spatial restraints[13]. For the remaining side chains, library values of ROTAMERS were adopted[14]. Through this procedure, an initial molecular structure model was completed.
The initial model was further improved through energy minimization (EM). After 800 steps of conjugate gradient (CG) minimization, a molecular dynamics (MD) simulation of 70 ps was carried out at a constant temperature of 298 K to check the stability of the model. An explicit solvent model (TIP3P water)[15] was used, and the solvated homology model was constructed with a 2 nm water cap centered on the center of mass of afSOD. Finally, a conjugate gradient energy minimization of the full protein was performed until the root mean square (RMS) energy gradient was lower than 42 GJ/(mol·m). All calculations mentioned above were carried out with the DISCOVER_3 module in INSIGHT II. The consistent-valence forcefield (CVFF) was used for the EM and MD simulations. This step improved the quality of the initial model.
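The convergence threshold above is given in SI units. As a quick arithmetic check, the sketch below converts a gradient expressed in kcal/(mol·Å), the unit commonly used by force-field packages, into GJ/(mol·m). That the authors' setting was 0.001 kcal/(mol·Å) is an inference from the arithmetic, not something the text states, and the function name is chosen here for illustration only.

```python
# Unit conversion sketch: kcal/(mol*Angstrom) -> GJ/(mol*m).
# The 0.001 kcal/(mol*Angstrom) input is an assumed original setting.
KCAL_TO_J = 4184.0       # 1 kcal = 4184 J (thermochemical calorie)
ANGSTROM_TO_M = 1e-10    # 1 Angstrom = 1e-10 m
J_TO_GJ = 1e-9

def kcal_per_mol_angstrom_to_gj_per_mol_m(value):
    """Convert an energy gradient from kcal/(mol*A) to GJ/(mol*m)."""
    joule_per_mol_m = value * KCAL_TO_J / ANGSTROM_TO_M
    return joule_per_mol_m * J_TO_GJ

threshold = kcal_per_mol_angstrom_to_gj_per_mol_m(0.001)
print(round(threshold, 2))  # close to the 42 GJ/(mol*m) quoted in the text
```

The same factors give 20 Å for the 2 nm water cap.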
After the optimization procedure, the structure was assessed using PROFILE-3D and PROSTAT programs of INSIGHT II. The PROFILE-3D program was used to measure the compatibility of an amino acid sequence with a known 3D protein structure[16]. The PROSTAT program identified and listed the number of instances where structural features differ significantly from the average values calculated from known proteins[17].
2.2 Evolutionary trace analysis
Evolutionary trace (ET) analysis is a method for extracting mutational information from the sequences homologous to a protein of interest, mapping that information onto the 3D structure of the protein, and thereby inferring which residues are likely to be important to particular functions[18].
The EVOLUTIONARY TRACE program in INSIGHT II was used in this study. Firstly, the family of sequences homologous to the protein was aligned using the multiple sequence alignment (MSA) program ALIGN123. Based on the alignment, the sequences were clustered using a hierarchical clustering method [19] according to the average percentage sequence identity. The sequence cluster was a representation of the evolutionary or functional relationship of the sequences in the family. Based on the cluster, the sequences in a family could be divided into subfamilies at a given percent identity cutoff (PIC). At different cutoffs, the subfamilies represented different levels of functional resolution. Residues conserved across the whole family of protein sequences were identified as conserved residues and were assumed to be essential for maintaining protein functions. Residues that were conserved within subgroups but different between subgroups were identified as class-specific residues and were assumed to be responsible for the functional specificity that distinguishes the subgroups during evolution. The conserved residues and class-specific residues were together called trace residues.
To define the functional interfaces, trace residues were mapped onto the structure of one of the proteins in the sequence family and clustered in 3D space using a single-linkage hierarchical clustering algorithm. The residue clusters were then represented as a dendrogram for visual analysis.
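The distinction between conserved and class-specific positions can be illustrated with a minimal sketch. It assumes the alignment and the subfamily partition are already available. The toy sequences, their names and the function `classify_trace_residues` are hypothetical and are not taken from the EVOLUTIONARY TRACE program.

```python
def classify_trace_residues(alignment, subfamilies):
    """Classify alignment columns as conserved or class-specific.

    alignment:   dict mapping sequence name -> aligned sequence (equal lengths)
    subfamilies: list of lists of sequence names (the partition at a given PIC)
    Returns two lists of 0-based column indices.
    """
    length = len(next(iter(alignment.values())))
    conserved, class_specific = [], []
    for col in range(length):
        consensus = []
        for members in subfamilies:
            residues = {alignment[name][col] for name in members}
            # A column that varies (or has a gap) inside any subfamily
            # is not a trace position.
            if len(residues) != 1 or "-" in residues:
                consensus = None
                break
            consensus.append(residues.pop())
        if consensus is None:
            continue
        if len(set(consensus)) == 1:
            conserved.append(col)        # same residue in every subfamily
        else:
            class_specific.append(col)   # invariant within, different between
    return conserved, class_specific

# Hypothetical four-sequence alignment split into two subfamilies.
toy_alignment = {"seqA": "HYGA", "seqB": "HYGS", "seqC": "HFGT", "seqD": "HFGA"}
toy_subfamilies = [["seqA", "seqB"], ["seqC", "seqD"]]
conserved, class_specific = classify_trace_residues(toy_alignment, toy_subfamilies)
print(conserved, class_specific)
```

In this toy case columns 0 and 2 are conserved, column 1 is class-specific, and column 3 is discarded because it varies within a subfamily.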
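Cutting a single-linkage dendrogram at a distance threshold is equivalent to grouping residues that are connected by a chain of pairwise distances below that threshold. The sketch below shows this with a small union-find structure. The coordinates are made up, and representing each residue by a single point is an assumption, since the text does not say how inter-residue distance is measured.

```python
import math

def single_linkage_clusters(coords, cutoff):
    """Group points whose chained pairwise distances are <= cutoff.

    coords: dict mapping residue label -> (x, y, z) in nm
    Returns a list of clusters (lists of labels), largest first.
    """
    names = list(coords)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the cluster root, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(coords[a], coords[b]) <= cutoff:
                parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(clusters.values(), key=len, reverse=True)

# Hypothetical coordinates in nm (not taken from the afSOD model).
toy_coords = {
    "res1": (0.0, 0.0, 0.0),
    "res2": (0.5, 0.0, 0.0),
    "res3": (1.0, 0.0, 0.0),
    "res4": (3.0, 0.0, 0.0),
    "res5": (0.0, 3.0, 0.0),
}
toy_clusters = single_linkage_clusters(toy_coords, cutoff=0.65)
print(toy_clusters)
```

Here res1 and res3 are 1.0 nm apart but still share a cluster because res2 links them, which is the characteristic chaining behavior of single linkage.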
3 Results and discussion
3.1 Modeling structure of SOD from A. ferrooxidans
The amino acid sequence of the enzyme encoded by the sod gene from A. ferrooxidans ATCC 23270 was compared with the sequences of the known proteins in the PDB using the FASTA program. The results showed that the crystal structure of an Fe-superoxide dismutase from the hyperthermophile Aquifex pyrophilus (PDB code 1COJ) [20] had the highest sequence identity (36.10%) with it, so 1COJ was used as the template to model the 3D structure of the protein. The sequence alignment between the target protein and the template protein is shown in Fig.1.
![](/web/fileinfo/upload/magazine/135/4992/image004.jpg)
Fig.1 Sequence alignment of afSOD with 1COJ
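A sequence identity such as the one quoted for the target and the template is computed from a pairwise alignment. The sketch below shows one common convention, counting identical residues over the columns where neither sequence has a gap. The denominator convention varies between programs, so the choice here is an assumption, and the two short sequences are invented for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns with no gap in either sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)

# Hypothetical aligned fragments (not the afSOD/1COJ alignment).
identity = percent_identity("MKV-LHA", "MRVALH-")
print(identity)
```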
Based on the sequence alignment, automated homology model building was performed using the program MODELER. All the side chains of the modeled protein were set by ROTAMERS. With these procedures, the initial molecular model was completed.
Then energy minimization of 800 iterations and a dynamics simulation of 70 ps were performed. The variation of potential energy with time during the 70 ps of molecular dynamics simulation is shown in Fig.2. In the figure, the potential energy falls rapidly in the first 30 ps, then decreases only slightly between successive steps, and the simulation approaches equilibrium at 45 ps. Thus, the conformation at 70 ps was chosen as the final 3D structure for further study. This model was further refined by EM optimization of 4 000 iterations, and the final stable structure of afSOD was obtained once the root mean square (RMS) energy gradient was lower than 42 GJ/(mol·m).
![](/web/fileinfo/upload/magazine/135/4992/image006.jpg)
Fig.2 Variation of total potential energy during 70 ps of MD on afSOD (Total potential energy is averaged over 0.1 ps intervals)
The final structure was further checked with the PROFILE-3D and PROSTAT programs. The PROSTAT program was used to calculate the percentage of backbone (Φ–Ψ) angles within the allowed Ramachandran region. The result is that 85.4% of the Φ–Ψ angles in the afSOD model lie in the core region of the Ramachandran plot. For the X-ray structure of 1COJ, the corresponding value is 83.7%. The PROSTAT program was also used to identify and list the number of instances where structural features differ significantly from the average values in known proteins. The cutoff used for a significant difference was six standard deviations from the reference values. The analysis shows that, over all residues, there are no significant differences between the calculated values of the structural features in the modeled protein and the average values in known proteins.
When checked by PROFILE-3D, the self-compatibility score for this protein is 76.67, which is higher than the low score (41.85) and close to the top score (93.00). This means that the structure model of afSOD is reasonable at the present level of theory. The detailed PROFILE-3D results are presented in Fig.3. Compatibility scores above zero correspond to an acceptable side chain environment, so Fig.3 indicates that all of the residues are reasonable. The results from both PROFILE-3D and PROSTAT indicate that the modeled structure is reliable.
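The Φ and Ψ values behind such a Ramachandran analysis are dihedral angles over four consecutive backbone atoms: C(i-1), N, CA, C for Φ, and N, CA, C, N(i+1) for Ψ. A minimal sketch of the dihedral calculation is given below. It is not PROSTAT's implementation, the helper names are illustrative, and the test points are arbitrary rather than taken from the model.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)          # unit vector along the central bond
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

# Arbitrary test geometry with a right-angle twist about the z axis.
angle = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(angle)
```

Applying this to every residue and counting the (Φ, Ψ) pairs that fall inside a chosen core-region map would give a percentage comparable to the figures quoted above. The map itself is not reproduced here.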
![](/web/fileinfo/upload/magazine/135/4992/image008.jpg)
Fig.3 Verified results of afSOD model (Residues with positive compatibility score are reasonably folded)
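The three PROFILE-3D scores quoted above can be related by simple arithmetic. The sketch below expresses the low score and the model score as fractions of the top score. Reading the low score as 0.45 times the top score follows from the quoted numbers and matches the usual 3D-profile convention, but it is an interpretation rather than a statement from the text, and the function name is illustrative.

```python
def score_fractions(model_score, low_score, top_score):
    """Return (low/top, model/top) for a 3D-profile verification."""
    return low_score / top_score, model_score / top_score

# Values quoted in the text for the afSOD model.
ratio_low, ratio_model = score_fractions(76.67, 41.85, 93.00)
print(round(ratio_low, 2), round(ratio_model, 3))
```

On this reading the model reaches roughly 82% of the expected top score, well above the 45% threshold.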
The modeled afSOD monomer consists of 205 amino acids and has a calculated relative molecular mass of 23.02×10³. It can be subdivided into two domains, an α N-terminal domain and an α/β C-terminal domain, connected by a loop (Fig.4(a)). The N-terminal domain consists of three helices in an anti-parallel hairpin with a left-handed twist. The C-terminal domain is of the α/β type and consists of a three-stranded anti-parallel β-sheet along with four helices.
There is an Fe binding site in the structure of the enzyme, in which iron is liganded by His26, His75, Asp158, His162 and a solvent water molecule in a distorted trigonal bipyramidal geometry (Fig.4(b)). Three atoms, NE2 of His75, OD2 of Asp158 and NE2 of His162, form the trigonal basal plane, and NE2 of His26 and the solvent molecule occupy the two axial positions of the trigonal bipyramid. To date, three evolutionarily distinct families of SODs are known, of which the Fe-binding family is one[8]. These geometric features of afSOD are consistent with those of Fe-dependent SODs in general. Combining the information on the overall structure and the active center, the sod gene in A. ferrooxidans can be confirmed at the 3D level to encode the Fe-SOD and hence to play an anti-toxic role in the bacterium.
3.2 Evolutionary trace analysis
In the evolutionary trace (ET) analysis, the modeled structure of afSOD described above was used as the mapped protein, and distinct SOD protein sequences from 31 species deposited in the SWISSPROT and TrEMBL databases were selected as the homologous sequences. The protein sequences were aligned, and a dendrogram was then generated to illustrate the relationships among individual subfamily members (Fig.5). In the context of the known structure, ET identified patterns of sequence variation that correlate systematically with functional change during evolution.
![](/web/fileinfo/upload/magazine/135/4992/image010.jpg)
Fig.4 Final 3D-structure of SOD from A. ferrooxidans
(a) Overall structure; (b) Fe binding center
![](/web/fileinfo/upload/magazine/135/4992/image012.jpg)
Fig.5 Sequence identity dendrogram of SODs from various sources (The leaf nodes on the right represent sequences from the various species; the vertical bar on the left, whose x coordinate represents the 38% PIC threshold, divides these sequences into five subfamilies)
A critical step in this method is to distinguish functional variations so as to divide a family into functional subgroups. In the absence of detailed biochemical insight into functional variations, the method relies instead on sequence identity clustering to partition the family into functional subgroups. The PIC parameter defines the extent of sequence similarity within each group, and by varying the PIC the functional resolution of the evolutionary trace can be controlled. In practice, however, if a sequence is not similar to any other sequence and forms a subgroup by itself at the selected PIC, it is ignored; it is included only once the PIC is set to a value at which it becomes a member of a subgroup. Here, among all the sequences in this protein family, the Fe-SOD from A. ferrooxidans was distant from the others, which limits the highest functional resolution attainable in this analysis. The PIC value was therefore set to 38%, which divided the family into five main distinct subgroups (Fig.5). Among them, the sequence from A. ferrooxidans was clustered together with those from S. aurantiaca, A. dehalogenans, G. sulfurreducens, T. carboxydivorans, G. bemidjiensis and D. ethenogenes.
A set of trace residues was thus obtained. The conserved residues were His26, Asn38, His75, Gly103, Trp123, Asp158, Glu161 and His162. The class-specific residues were Tyr30, Tyr33 and Phe79. The trace residues were mapped onto the modeled 3D structure of afSOD (Fig.6). In Fig.6, all of the class-specific residues and most of the conserved residues are distributed around the Fe-binding site, while several conserved residues are scattered in other parts of the structure. To examine this distribution more precisely, the trace residues were clustered in the 3D space of the mapped protein. With a cutoff of 0.65 nm, the residues were divided into four sub-clusters (Fig.7). Notably, Asn38, Gly103 and Glu161 each form a subgroup by themselves, whereas all of the remaining trace residues (His26, Tyr30, Tyr33, His75, Phe79, Trp123, Asp158 and His162) fall into one subgroup associated with the Fe-binding area.
Although these trace residue positions may be conserved or class-specific by coincidence, they are more likely to be functionally important. The positions of Asn38, Gly103 and Glu161 are scattered throughout the protein structure; they may be important to structural stability or to other functions, but are unlikely to be directly involved in catalytic function, so no clear inference can be made about them. The residues in the subgroup around the Fe binding area are spatially clustered and form a site characterized by an unusually low rate of mutation, and the mutations that do occur there all coincide with functional divergence. Such a spatial cluster of conserved and class-specific residues can be taken to identify an evolutionarily privileged functional site. Such sites represent ancestral functional regions that remain common to all protein descendants. Hence, the cluster is an excellent candidate for the active site directly responsible for catalytic function and Fe binding in afSOD.
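The effect of the PIC on the subfamily partition, including the handling of sequences that remain alone, can be sketched with a small average-linkage procedure. This is an illustrative reimplementation under stated assumptions, not the algorithm of the EVOLUTIONARY TRACE program. The identity values and sequence names are invented, and `subfamilies_at_pic` is a hypothetical helper.

```python
def subfamilies_at_pic(names, identity, pic):
    """Average-linkage agglomeration stopped at a percent identity cutoff.

    identity: dict mapping frozenset({a, b}) -> pairwise percent identity
    Returns (groups, singletons); singletons would be ignored by the trace.
    """
    clusters = [[n] for n in names]

    def mean_identity(c1, c2):
        values = [identity[frozenset((a, b))] for a in c1 for b in c2]
        return sum(values) / len(values)

    while len(clusters) > 1:
        # Find the pair of clusters with the highest mean pairwise identity.
        score, i, j = max(
            (mean_identity(c1, c2), i, j)
            for i, c1 in enumerate(clusters)
            for j, c2 in enumerate(clusters) if i < j
        )
        if score < pic:
            break  # no remaining pair is similar enough to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    groups = [c for c in clusters if len(c) > 1]
    singletons = [c[0] for c in clusters if len(c) == 1]
    return groups, singletons

# Hypothetical pairwise identities (percent) for five sequences.
toy_names = ["s1", "s2", "s3", "s4", "s5"]
toy_values = {
    ("s1", "s2"): 80, ("s1", "s3"): 50, ("s2", "s3"): 54,
    ("s1", "s4"): 30, ("s2", "s4"): 30, ("s3", "s4"): 30,
    ("s1", "s5"): 25, ("s2", "s5"): 25, ("s3", "s5"): 25,
    ("s4", "s5"): 35,
}
toy_identity = {frozenset(pair): value for pair, value in toy_values.items()}

groups_38, singles_38 = subfamilies_at_pic(toy_names, toy_identity, pic=38)
groups_35, singles_35 = subfamilies_at_pic(toy_names, toy_identity, pic=35)
print(groups_38, singles_38)
print(groups_35, singles_35)
```

In this toy data, s4 and s5 are singletons at a cutoff of 38 and would be dropped, while at a cutoff of 35 they pair up and enter the analysis as a second subfamily.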
![](/web/FileInfo/upload/magazine/135/4992/2010-7-14 13-26-22.jpg)
Fig.6 Evolutionary trace residues of SODs mapped onto modeled structure of Fe-SOD from A. ferrooxidans (The mapped overall structure is represented by a thin ribbon; the class-specific residues Tyr30, Tyr33 and Phe79 are represented by lines; all of the conserved residues, the Fe atom and OH- are represented by balls and sticks; the trace residue cluster around the Fe binding site is circled by a dashed line)
![](/web/fileinfo/upload/magazine/135/4992/image016.jpg)
Fig.7 Dendrogram of trace residues clustered using single linkage method (The leaf nodes on the right represent the trace residues; the vertical bar on the left, whose x coordinate represents the distance threshold of 0.65 nm among trace residues, divides the trace residues into four subclusters)
Previous experimental results have demonstrated that the iron atom is very important for both the catalytic activity and the stability of Fe-SOD in other species, and that it is liganded by highly conserved residues (three histidines and one aspartic acid) and one water molecule[20-21]. The residues His26, His75, His162 and Asp158 found in our ET analysis are in good agreement with those facts. However, Trp123, which may also be essential for the common catalytic function, and Tyr30, Tyr33 and Phe79, which may be responsible for the functional specificity that distinguishes subgroups during evolution, have not been identified before. The results are helpful for guiding site-directed mutagenesis studies and for understanding the structure–function relationships of the enzyme.
4 Conclusions
1) The integral 3D molecular model of the enzyme encoded by the sod gene from the extremophile A. ferrooxidans is built and refined. Evaluation with the PROFILE-3D and PROSTAT programs indicates that this model is reliable. From the modeled structure, this gene can be confirmed at the 3D level to encode the Fe-SOD and hence to play an anti-toxic role in the bacterium.
2) Through ET analysis, a set of trace residues in this enzyme is identified: the conserved residues are His26, Asn38, His75, Gly103, Trp123, Asp158, Glu161 and His162, and the class-specific residues are Tyr30, Tyr33 and Phe79. Among these residues, Asn38, Gly103 and Glu161 may be important to structural stability or to other functions, but are unlikely to be directly involved in catalytic function; the remaining residues are likely to be directly responsible for Fe binding and catalytic function in the SOD from A. ferrooxidans.
3) The detailed 3D structure and the key residues identified are helpful for guiding site-directed mutagenesis studies and for understanding the structure–function relationships of the enzyme, and subsequently for gaining insight into the anti-toxicity of A. ferrooxidans, so as ultimately to serve industrial bioleaching.
References
[1] RAWLINGS D E, KUSANO T. Molecular genetics of Thiobacillus ferrooxidans[J]. Microbiological Reviews, 1994, 58(1): 39-55.
[2] BAKER B J, BANFIELD J F. Microbial communities in acid mine drainage[J]. FEMS Microbiology Ecology, 2003, 44(2): 139-152.
[3] TUOVINEN O H, NIEMELA S I, GYLLENBERG H G. Tolerance of Thiobacillus ferrooxidans to some metals[J]. Antonie Van Leeuwenhoek (International Journal of General and Molecular Microbiology), 1971, 37(4): 489-496.
[4] BANERJEE P C. Genetics of metal resistance in acidophilic prokaryotes of acidic mine environments[J]. Indian Journal of Experimental Biology, 2004, 42(1): 9-25.
[5] QUATRINI R, APPIA-AYME C, DENIS Y, et al. Insights into the iron and sulfur energetic metabolism of Acidithiobacillus ferrooxidans by microarray transcriptome profiling[J]. Hydrometallurgy, 2006, 83(1/4): 263-272.
[6] ZELKO I N, MARIANI T J, FOLZ R J. Superoxide dismutase multigene family: a comparison of the Cu/Zn-SOD (SOD1), Mn-SOD (SOD2), and EC-SOD (SOD3) gene structures, evolution, and expression[J]. Free Radical Biology and Medicine, 2002, 33(3): 337-349.
[7] BANNISTER J V, BANNISTER W H, ROTILIO G. Aspects of the structure, function, and applications of superoxide dismutase[J]. CRC Critical Reviews in Biochemistry, 1987, 22(2): 111-180.
[8] PARKER M W, BLAKE C C. Iron- and manganese-containing superoxide dismutases can be distinguished by analysis of their primary structures[J]. FEBS Letters, 1988, 229(2): 377-382.
[9] PARVATIYAR K, ALSABBAGH E M, OCHSNER U A, et al. Global analysis of cellular factors and responses involved in Pseudomonas aeruginosa resistance to arsenite[J]. Journal of Bacteriology, 2005, 187(14): 4853-4864.
[10] GESLIN C, LLANOS J, PRIEUR D, et al. The manganese and iron superoxide dismutases protect Escherichia coli from heavy metal toxicity[J]. Research in Microbiology, 2001, 152(10): 901-905.
[11] BLUNDELL T L, SIBANDA B L, STERNBERG M J E, et al. Knowledge-based prediction of protein structures and the design of novel molecules[J]. Nature, 1987, 326: 347-352.
[12] LIPMAN D J, PEARSON W R. Rapid and sensitive protein similarity searches[J]. Science, 1985, 227: 1435-1441.
[13] SALI A, POTTERTON L, YUAN F, et al. Evaluation of comparative protein modeling by MODELLER[J]. Proteins, 1995, 23(3): 318-326.
[14] PONDER J W, RICHARDS F M. Tertiary templates for proteins: Use of packing criteria in the enumeration of allowed sequences for different structural classes[J]. Journal of Molecular Biology, 1987, 193(4): 775-791.
[15] JORGENSEN W L, CHANDRASEKHAR J, MADURA J D, et al. Comparison of simple potential functions for simulating liquid water[J]. Journal of Chemical Physics, 1983, 79(2): 926-935.
[16] LUTHY R, BOWIE J U, EISENBERG D. Assessment of protein models with three-dimensional profiles[J]. Nature, 1992, 356: 83-85.
[17] MORRIS A L, MACARTHUR M W, HUTCHINSON E G, et al. Stereochemical quality of protein structure coordinates[J]. Proteins, 1992, 12(4): 345-364.
[18] LICHTARGE O, BOURNE H R, COHEN F E. An evolutionary trace method defines binding surfaces common to protein families[J]. Journal of Molecular Biology, 1996, 257(2): 342-358.
[19] HARTIGAN J A. Clustering Algorithms[M]. New York: Wiley Press, 1975.
[20] LIM J H, YU Y G, HAN Y S, et al. The crystal structure of an Fe-superoxide dismutase from the hyperthermophile Aquifex pyrophilus at 1.9 Å resolution: Structural basis for thermostability[J]. Journal of Molecular Biology, 1997, 270(2): 259-274.
[21] HATCHIKIAN E C, HENRY Y A. An iron-containing superoxide dismutase from the strict anaerobe Desulfovibrio desulfuricans[J]. Biochimie, 1977, 59(2): 153-161.
Foundation item: Project(2004CB619201) supported by the National Basic Research Program of China; Project (50321402) supported by the National Natural Science Foundation of China
Received date: 2007-04-20; Accepted date: 2007-05-23
Corresponding author: QIU Guan-zhou, Professor, PhD; Tel: +86-731-8879815; E-mail: lydcsu@yahoo.com.cn
(Edited by CHEN Wei-ping)