Amino acid residue substitution function
[Overview]
Amino acid residues of proteins in the body can be mutated by various influences. Even a protein in which only a few residues are mutated can show dramatic changes in function and activity. To study such phenomena by simulation, you start from a pdb file of the protein, which contains its amino acid sequence and can be obtained from the Protein Data Bank (PDB), and apply amino acid substitutions to it to build the mutated protein.
When you use a pdb file provided by the PDB, you may need to correct the file because data on amino acid side chains are missing. CONFLEX allows you to replace or supplement the missing side chains in the pdb file before starting a structure optimization or other calculation. Furthermore, CONFLEX reports the missing data in the pdb file, whether in side chains or in the backbone.
[Substitution of side chain]
We use a pdb file that has missing atoms, correct the data, and perform a calculation.
First, search for “1OHR” on the PDB web site, and download the 1ohr.pdb file.
This structure, obtained by X-ray crystal structure analysis, is HIV-1 protease containing an inhibitor; the protein is a dimer in which each chain has 99 amino acid residues. In the 1ohr.pdb file, the lines starting with “REMARK 470” contain information on the missing atoms.
Lines starting with “REMARK 470” in the 1ohr.pdb file
REMARK 470
REMARK 470 MISSING ATOM
REMARK 470 THE FOLLOWING RESIDUES HAVE MISSING ATOMS (M=MODEL NUMBER;
REMARK 470 RES=RESIDUE NAME; C=CHAIN IDENTIFIER; SSEQ=SEQUENCE NUMBER;
REMARK 470 I=INSERTION CODE):
REMARK 470 M RES CSSEQI ATOMS
REMARK 470 GLN A 7 CD OE1 NE2
REMARK 470 LYS A 14 CE NZ
REMARK 470 GLU A 34 CD OE1 OE2
REMARK 470 GLU A 35 CD OE1 OE2
REMARK 470 ARG A 41 CG CD NE CZ NH1 NH2
REMARK 470 LYS A 43 CG CD CE NZ
REMARK 470 LYS A 45 CG CD CE NZ
REMARK 470 LYS A 55 CD CE NZ
REMARK 470 GLN A 61 CG CD OE1 NE2
REMARK 470 LYS A 70 CE NZ
REMARK 470 GLN B 7 CD OE1 NE2
REMARK 470 LYS B 14 CG CD CE NZ
REMARK 470 ARG B 41 CG CD NE CZ NH1 NH2
REMARK 470 LYS B 55 CE NZ
REMARK 470 GLN B 61 CD OE1 NE2
[REMARK 470 GLN A 7 CD OE1 NE2] means that the Cδ, Oε1, and Nε2 atoms of the glutamine at residue 7 of chain A are missing. CONFLEX reads the structure data from the lines starting with “ATOM”. Therefore, modifying the “REMARK 470” lines has no effect on the calculation.
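As a rough illustration of how these records can be read, the following Python sketch collects the missing atoms per residue. The function name `parse_remark_470` is hypothetical and not part of CONFLEX. It splits each record on whitespace instead of using the fixed PDB columns, so it assumes a blank model number, a non-blank chain identifier, a non-negative residue number, and no insertion code.

```python
def parse_remark_470(lines):
    """Return a list of (residue, chain, seq_number, atom_names) tuples.

    Hypothetical helper for illustration only; not a CONFLEX function.
    """
    missing = []
    for line in lines:
        if not line.startswith("REMARK 470"):
            continue
        fields = line[10:].split()
        # Data records look like: GLN A 7 CD OE1 NE2
        # Header and explanation lines fail at least one of these checks.
        if (
            len(fields) >= 4
            and len(fields[0]) == 3
            and len(fields[1]) == 1
            and fields[2].isdigit()
        ):
            missing.append((fields[0], fields[1], int(fields[2]), fields[3:]))
    return missing


# Three records copied from the REMARK 470 block of 1ohr.pdb.
sample = [
    "REMARK 470   M RES CSSEQI  ATOMS",
    "REMARK 470     GLN A   7    CD   OE1  NE2",
    "REMARK 470     LYS A  14    CE   NZ",
]

for residue, chain, seq, atoms in parse_remark_470(sample):
    print(f"{residue} {chain}{seq}: missing {', '.join(atoms)}")
```

To scan a whole file, the same function can be given an open file object, since it only iterates over lines.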
Execute CONFLEX using the 1ohr.pdb file.
[Execution by Interface]
Open the 1ohr.pdb file in CONFLEX Interface.
Select [CONFLEX] in the Calculation menu, and click the button that opens the detail setting dialog in the calculation setting dialog that appears. Next, click the button at the bottom right of the detail setting dialog. A dialog listing the keywords for the calculation settings is then displayed.
Delete all “PDB_CONECT=” keywords that were automatically created by Interface, and add the double bonds of the inhibitor with “PDB_CONECT=” keywords. In “PDB_CONECT=(i,j,n)”, i and j are the serial numbers of the atoms as given in the pdb file, and n is the bond order. When you have completed the modifications, click the button that starts the calculation. The calculation will start.
[Execution by command line]
The calculation settings are defined by writing keywords in the 1ohr.ini file.
1ohr.ini file
MMFF94S
PDB_CONECT=(1768,1774,2)
PDB_CONECT=(1781,1782,2)
PDB_CONECT=(1783,1784,2)
PDB_CONECT=(1785,1786,2)
PDB_CONECT=(1787,1788,2)
PDB_CONECT=(1792,1793,2)
PDB_CONECT=(1794,1795,2)
PDB_CONECT=(1796,1797,2)
The [MMFF94S] keyword selects the MMFF94s force field.
The [PDB_CONECT=] keywords define the double bonds in the inhibitor. In “PDB_CONECT=(i,j,n)”, i and j are the serial numbers of the atoms as given in the pdb file, and n is the bond order.
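When the list of bonds is long, the ini text can also be generated instead of typed by hand. The sketch below is only an illustration: `conect_keyword` and `build_ini` are hypothetical helper names, and the one-keyword-per-line layout is assumed from the 1ohr.ini listings in this tutorial.

```python
def conect_keyword(i, j, order):
    """Format one PDB_CONECT keyword for atoms i and j with the given bond order."""
    if i < 1 or j < 1 or i == j:
        raise ValueError("atom serial numbers must be positive and distinct")
    return f"PDB_CONECT=({i},{j},{order})"


def build_ini(force_field, bonds):
    """Return ini text: the force field keyword, then one PDB_CONECT per bond."""
    lines = [force_field]
    lines += [conect_keyword(i, j, order) for i, j, order in bonds]
    return "\n".join(lines) + "\n"


# Double bonds of the inhibitor in 1ohr.pdb, as (serial i, serial j, bond order).
double_bonds = [
    (1768, 1774, 2),
    (1781, 1782, 2),
    (1783, 1784, 2),
    (1785, 1786, 2),
    (1787, 1788, 2),
    (1792, 1793, 2),
    (1794, 1795, 2),
    (1796, 1797, 2),
]

ini_text = build_ini("MMFF94S", double_bonds)
print(ini_text, end="")
```

Writing `ini_text` to a file named 1ohr.ini next to 1ohr.pdb would then give the same settings as the hand-written file.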
Store the 1ohr.pdb and 1ohr.ini files in the same folder, and execute the command below. The calculation will start.
C:\CONFLEX\bin\flex9a_win_x64.exe -par C:\CONFLEX\par 1ohr
The above command is for Windows. For other operating systems, please refer to [How to execute CONFLEX].
Calculation results
After the calculation has finished, check the output bso file; it contains the following messages.
The calculation ended with errors. Confirm that these errors correspond to the missing atoms listed above.
Error messages in the bso file
PDB_EXT/CHECK_AND_BUILD: ERROR -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 7
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(GLN,A,7)
PDB_EXT/CHECK_AND_BUILD: ERROR -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 14
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(LYS,A,14)
PDB_EXT/CHECK_AND_BUILD: ERROR -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 34
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(GLU,A,34)
PDB_EXT/CHECK_AND_BUILD: ERROR -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 35
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(GLU,A,35)
PDB_EXT/CHECK_AND_BUILD: ERROR -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 41
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(ARG,A,41)
PDB_EXT/CHECK_AND_BUILD: ERROR -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 43
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(LYS,A,43)
PDB_EXT/CHECK_AND_BUILD: ERROR -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 45
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(LYS,A,45)
PDB_EXT/CHECK_AND_BUILD: ERROR -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 55
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(LYS,A,55)
PDB_EXT/CHECK_AND_BUILD: ERROR -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 61
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(GLN,A,61)
PDB_EXT/CHECK_AND_BUILD: ERROR -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 70
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(LYS,A,70)
PDB_EXT/CHECK_AND_BUILD: ERROR -- INCLUDE UNKNOWN RESIDUE IN CHAIN B 7
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(GLN,B,7)
PDB_EXT/CHECK_AND_BUILD: ERROR -- INCLUDE UNKNOWN RESIDUE IN CHAIN B 14
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(LYS,B,14)
PDB_EXT/CHECK_AND_BUILD: ERROR -- INCLUDE UNKNOWN RESIDUE IN CHAIN B 41
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(ARG,B,41)
PDB_EXT/CHECK_AND_BUILD: ERROR -- INCLUDE UNKNOWN RESIDUE IN CHAIN B 55
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(LYS,B,55)
PDB_EXT/CHECK_AND_BUILD: ERROR -- INCLUDE UNKNOWN RESIDUE IN CHAIN B 61
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(GLN,B,61)
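Because each message already names the keyword that CONFLEX suggests, the suggestions can be collected with a short script rather than copied one by one. This is a minimal sketch: `suggested_mutate_keywords` is a hypothetical name, the pattern is inferred only from the messages shown here, and the sample text is a two-message excerpt of them.

```python
import re

# Matches e.g. PDB_MUTATE=(GLN,A,7); the pattern is inferred from the messages above.
MUTATE_RE = re.compile(r"PDB_MUTATE=\(([A-Za-z_]+),([A-Za-z0-9]),(-?\d+)\)")


def suggested_mutate_keywords(bso_text):
    """Return the suggested PDB_MUTATE keywords in order of appearance, without duplicates."""
    keywords = []
    for name, chain, seq in MUTATE_RE.findall(bso_text):
        keyword = f"PDB_MUTATE=({name},{chain},{seq})"
        if keyword not in keywords:
            keywords.append(keyword)
    return keywords


# Excerpt of the messages in the bso file.
bso_excerpt = """\
PDB_EXT/CHECK_AND_BUILD: ERROR -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 7
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(GLN,A,7)
PDB_EXT/CHECK_AND_BUILD: ERROR -- INCLUDE UNKNOWN RESIDUE IN CHAIN B 41
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(ARG,B,41)
"""

for keyword in suggested_mutate_keywords(bso_excerpt):
    print(keyword)
```

The function searches the whole text at once, so it does not depend on how the messages are broken into lines.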
There are two ways to avoid these errors and perform the calculation.
Execution using original coordinates
One way is to execute the calculation using the original coordinates.
In this case, add the [PDB_NOMUTATE] keyword to the calculation settings. [PDB_NOMUTATE] tells CONFLEX to use the original data as they are, with the atoms missing from the amino acid residues.
[Execution by Interface]
Add the [PDB_NOMUTATE] keyword to the keyword dialog, opened in the same way as before. When you have completed the modification, click the button that starts the calculation. The calculation will start.
[Execution by command line]
Add the [PDB_NOMUTATE] keyword to the 1ohr.ini file.
1ohr.ini file
MMFF94S
PDB_NOMUTATE
PDB_CONECT=(1768,1774,2)
PDB_CONECT=(1781,1782,2)
PDB_CONECT=(1783,1784,2)
PDB_CONECT=(1785,1786,2)
PDB_CONECT=(1787,1788,2)
PDB_CONECT=(1792,1793,2)
PDB_CONECT=(1794,1795,2)
PDB_CONECT=(1796,1797,2)
Store the 1ohr.pdb and 1ohr.ini files in the same folder, and execute the command below. The calculation will start.
C:\CONFLEX\bin\flex9a_win_x64.exe -par C:\CONFLEX\par 1ohr
The above command is for Windows. For other operating systems, please refer to [How to execute CONFLEX].
Calculation results
After the calculation has finished, check the output bso file; it contains the following messages. The error messages have changed to warnings.
* Note that the calculation takes a very long time to finish because the molecule is large.
Warning messages in the bso file
PDB_EXT/CHECK_AND_BUILD: WARNING -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 7
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(GLN,A,7)
PDB_EXT/CHECK_AND_BUILD: WARNING -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 14
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(LYS,A,14)
PDB_EXT/CHECK_AND_BUILD: WARNING -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 34
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(GLU,A,34)
PDB_EXT/CHECK_AND_BUILD: WARNING -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 35
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(GLU,A,35)
PDB_EXT/CHECK_AND_BUILD: WARNING -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 41
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(ARG,A,41)
PDB_EXT/CHECK_AND_BUILD: WARNING -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 43
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(LYS,A,43)
PDB_EXT/CHECK_AND_BUILD: WARNING -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 45
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(LYS,A,45)
PDB_EXT/CHECK_AND_BUILD: WARNING -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 55
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(LYS,A,55)
PDB_EXT/CHECK_AND_BUILD: WARNING -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 61
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(GLN,A,61)
PDB_EXT/CHECK_AND_BUILD: WARNING -- INCLUDE UNKNOWN RESIDUE IN CHAIN A 70
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(LYS,A,70)
PDB_EXT/CHECK_AND_BUILD: WARNING -- INCLUDE UNKNOWN RESIDUE IN CHAIN B 7
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(GLN,B,7)
PDB_EXT/CHECK_AND_BUILD: WARNING -- INCLUDE UNKNOWN RESIDUE IN CHAIN B 14
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(LYS,B,14)
PDB_EXT/CHECK_AND_BUILD: WARNING -- INCLUDE UNKNOWN RESIDUE IN CHAIN B 41
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(ARG,B,41)
PDB_EXT/CHECK_AND_BUILD: WARNING -- INCLUDE UNKNOWN RESIDUE IN CHAIN B 55
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(LYS,B,55)
PDB_EXT/CHECK_AND_BUILD: WARNING -- INCLUDE UNKNOWN RESIDUE IN CHAIN B 61
*PLEASE SET KEYWORD(S) :PDB_MUTATE=(GLN,B,61)
Execution by substitution of side chain
The other way is to supply the missing atoms and then perform the calculation.
The warning messages above contain “PDB_MUTATE=” keywords. Delete the “PDB_NOMUTATE” keyword, and add the “PDB_MUTATE=” keywords shown there.
“PDB_MUTATE=” is a keyword for replacing or supplementing a side chain. The first element in the parentheses is the name of the amino acid, the second is the ID of the protein chain, and the third is the residue number. The amino acid name can be given in 1-character, 3-character, or full-name notation (see the table below). “PDB_MUTATE=(GLN,A,7)” means that residue 7 of chain A is replaced with glutamine.
Name of amino acid | 1-Character | 3-Characters | Full name |
---|---|---|---|
Alanine | A | ALA | ALANINE |
Cysteine | C | CYS | CYSTEINE |
Aspartic acid | D | ASP | ASPARTIC_ACID |
Glutamic acid | E | GLU | GLUTAMIC_ACID |
Phenylalanine | F | PHE | PHENYLALANINE |
Glycine | G | GLY | GLYCINE |
Histidine | H | HIS | HISTIDINE |
Isoleucine | I | ILE | ISOLEUCINE |
Lysine | K | LYS | LYSINE |
Leucine | L | LEU | LEUCINE |
Methionine | M | MET | METHIONINE |
Asparagine | N | ASN | ASPARAGINE |
Proline | P | PRO | PROLINE |
Glutamine | Q | GLN | GLUTAMINE |
Arginine | R | ARG | ARGININE |
Serine | S | SER | SERINE |
Threonine | T | THR | THREONINE |
Valine | V | VAL | VALINE |
Tryptophan | W | TRP | TRYPTOPHAN |
Tyrosine | Y | TYR | TYROSINE |
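The table can be turned into a small lookup so that keywords are always written in one notation. The following sketch is an illustration under stated assumptions: `to_three_letter` and `mutate_keyword` are hypothetical helpers, not CONFLEX functions, and the upper-casing of the input is a convenience of this script rather than documented CONFLEX behavior.

```python
# (1-character, 3-character, full name) for each amino acid, copied from the table above.
AMINO_ACIDS = [
    ("A", "ALA", "ALANINE"), ("C", "CYS", "CYSTEINE"),
    ("D", "ASP", "ASPARTIC_ACID"), ("E", "GLU", "GLUTAMIC_ACID"),
    ("F", "PHE", "PHENYLALANINE"), ("G", "GLY", "GLYCINE"),
    ("H", "HIS", "HISTIDINE"), ("I", "ILE", "ISOLEUCINE"),
    ("K", "LYS", "LYSINE"), ("L", "LEU", "LEUCINE"),
    ("M", "MET", "METHIONINE"), ("N", "ASN", "ASPARAGINE"),
    ("P", "PRO", "PROLINE"), ("Q", "GLN", "GLUTAMINE"),
    ("R", "ARG", "ARGININE"), ("S", "SER", "SERINE"),
    ("T", "THR", "THREONINE"), ("V", "VAL", "VALINE"),
    ("W", "TRP", "TRYPTOPHAN"), ("Y", "TYR", "TYROSINE"),
]

# Map every accepted notation to the 3-character code.
_LOOKUP = {}
for one, three, full in AMINO_ACIDS:
    for key in (one, three, full):
        _LOOKUP[key] = three


def to_three_letter(name):
    """Convert a 1-character, 3-character, or full name to the 3-character code."""
    key = name.strip().upper().replace(" ", "_")
    if key not in _LOOKUP:
        raise ValueError(f"unknown amino acid name: {name!r}")
    return _LOOKUP[key]


def mutate_keyword(name, chain, seq):
    """Format a PDB_MUTATE keyword using the 3-character amino acid code."""
    return f"PDB_MUTATE=({to_three_letter(name)},{chain},{seq})"


print(mutate_keyword("Q", "A", 7))
print(mutate_keyword("GLUTAMINE", "A", 7))
```

Both calls produce the keyword in the 3-character form used in the bso messages, which makes it easier to compare a hand-written list against the suggested one.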
[Execution by Interface]
Add the [PDB_MUTATE] keywords to the keyword dialog, opened in the same way as before. When you have completed the modification, click the button that starts the calculation. The calculation will start.
[Execution by command line]
Add the [PDB_MUTATE] keywords to the 1ohr.ini file.
1ohr.ini file
MMFF94S
PDB_MUTATE=(GLN,A,7)
PDB_MUTATE=(LYS,A,14)
PDB_MUTATE=(GLU,A,34)
PDB_MUTATE=(GLU,A,35)
PDB_MUTATE=(ARG,A,41)
PDB_MUTATE=(LYS,A,43)
PDB_MUTATE=(LYS,A,45)
PDB_MUTATE=(LYS,A,55)
PDB_MUTATE=(GLN,A,61)
PDB_MUTATE=(LYS,A,70)
PDB_MUTATE=(GLN,B,7)
PDB_MUTATE=(LYS,B,14)
PDB_MUTATE=(ARG,B,41)
PDB_MUTATE=(LYS,B,55)
PDB_MUTATE=(GLN,B,61)
PDB_CONECT=(1768,1774,2)
PDB_CONECT=(1781,1782,2)
PDB_CONECT=(1783,1784,2)
PDB_CONECT=(1785,1786,2)
PDB_CONECT=(1787,1788,2)
PDB_CONECT=(1792,1793,2)
PDB_CONECT=(1794,1795,2)
PDB_CONECT=(1796,1797,2)
Store the 1ohr.pdb and 1ohr.ini files in the same folder, and execute the command below. The calculation will start.
C:\CONFLEX\bin\flex9a_win_x64.exe -par C:\CONFLEX\par 1ohr
The above command is for Windows. For other operating systems, please refer to [How to execute CONFLEX].
Calculation results
After the calculation has finished, check the output bso file; it contains the following messages.
Information on "MUTATE RESIDUE" in the bso file
PDB_EXT: MUTATE RESIDUE FROM 7GLN TO GLN
PDB_EXT: MUTATE RESIDUE FROM 14LYS TO LYS
PDB_EXT: MUTATE RESIDUE FROM 34GLU TO GLU
PDB_EXT: MUTATE RESIDUE FROM 35GLU TO GLU
PDB_EXT: MUTATE RESIDUE FROM 41ARG TO ARG
PDB_EXT: MUTATE RESIDUE FROM 43LYS TO LYS
PDB_EXT: MUTATE RESIDUE FROM 45LYS TO LYS
PDB_EXT: MUTATE RESIDUE FROM 55LYS TO LYS
PDB_EXT: MUTATE RESIDUE FROM 61GLN TO GLN
PDB_EXT: MUTATE RESIDUE FROM 70LYS TO LYS
PDB_EXT: MUTATE RESIDUE FROM 7GLN TO GLN
PDB_EXT: MUTATE RESIDUE FROM 14LYS TO LYS
PDB_EXT: MUTATE RESIDUE FROM 41ARG TO ARG
PDB_EXT: MUTATE RESIDUE FROM 55LYS TO LYS
PDB_EXT: MUTATE RESIDUE FROM 61GLN TO GLN
To supply missing atoms, specify the name of the original amino acid in the PDB_MUTATE keyword. To change the amino acid sequence of the protein, specify an amino acid different from the original one.
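The distinction between the two uses can be checked against the "MUTATE RESIDUE" lines of the bso file. The sketch below is a hedged illustration: `classify_mutations` is a hypothetical name, the pattern is inferred only from the lines shown in this tutorial, and it assumes that a real substitution is reported in the same format with a different name after TO. The log lines do not carry a chain ID, so entries are reported in file order.

```python
import re

# Matches e.g. "MUTATE RESIDUE FROM 7GLN TO GLN"; inferred from the bso lines above.
LOG_RE = re.compile(r"MUTATE RESIDUE FROM\s+(\d+)\s*([A-Z]{3})\s+TO\s+([A-Z]{3})")


def classify_mutations(bso_text):
    """Return (residue_number, old_name, new_name, kind) for each log line.

    kind is "repair" when the amino acid is unchanged (missing atoms supplied)
    and "substitution" when the amino acid differs from the original.
    """
    results = []
    for seq, old, new in LOG_RE.findall(bso_text):
        kind = "repair" if old == new else "substitution"
        results.append((int(seq), old, new, kind))
    return results


# Excerpt of the "MUTATE RESIDUE" lines in the bso file.
log_excerpt = """\
PDB_EXT: MUTATE RESIDUE FROM 7GLN TO GLN
PDB_EXT: MUTATE RESIDUE FROM 14LYS TO LYS
PDB_EXT: MUTATE RESIDUE FROM 41ARG TO ARG
"""

for seq, old, new, kind in classify_mutations(log_excerpt):
    print(f"residue {seq}: {old} -> {new} ({kind})")
```

In this tutorial every entry is a repair, which is a quick confirmation that the sequence of the protein was left unchanged.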