Generation of solvent molecules
At the final stages of refinement with sufficiently high-resolution data, solvent molecules can be identified and added to a macromolecular structure. When properly inserted, they improve the density and lower the R-value. In many cases, however, adding solvent molecules amounts to fitting noise. The number of solvent molecules you can reasonably insert into a macromolecular structure is therefore usually judged by its ratio to the number of residues in the macromolecule. In general, higher resolution allows more solvent.
A click on "GEN_SOLV" on the "REFINE" page invokes a macro that does the whole thing. Before invoking the macro, you need to generate the two difference maps ("MAP_2FOFC" and "MAP_FOFC").
Existing solvent molecules are checked for their B values, their consistency with the 2Fobs-Fcalc map and symmetry overlap. The segment name of already existing (old) solvent molecules must be "WAT". The newly created ones get the segment name "WAT2". At the beginning of each trial all atoms with segment name "WAT2" are deleted, so you should rename the WAT2 segment to WAT once a cycle of solvent generation is completed.
The generation of new solvent molecules is based on a difference map peak search. The number of peaks is then reduced on the basis of their consistency with the 2Fobs-Fcalc map density, their distance from macromolecular atoms (overlap and vicinity of possible hydrogen donors or acceptors) and symmetry overlap.
Kicking atoms before calculating the maps may help you to distinguish noise peaks from real solvent molecules.
In the final stages of macromolecular structure refinement, well-ordered solvent molecules are placed around the structure. Positioning them and later verifying their positions often takes a substantial amount of work. The procedure described below was developed to avoid most of it.
The procedure
The simplest way to create and update your "gen_solvent.com" is through the "create_config" tools, using MAIN_CONF:create_gen_solv.pl.
create_gen_solv.pl
-h|--help) prints this message with available options and current status
-b|--b_value) B-value cut for previous rounds in WAT [70.0]
-p|--peak) peak threshold in diff density map (fo-fc) [2.3]
-d|--density) density map (2fo-fc) cutoff [1.1]
-c|--close) the closest density pick existing atom distance [2.3]
-f|--far) the farthest density pick distance to an existing atom [4.5]
-m|--merge) remove symmetry related peaks if they get closer than [2.3]
-l|--list) key "check_list" contains residues closer than [2.5] to peaks
-i|--ion) key "ion_list" conatins WAT atoms with fo-fc dens higher than [1.2]
-w|--what) what to do: [CHECK/NEW/BOTH] [BOTH]
--doit) create the macro
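As an illustration of how the options above fit together, the following Python sketch assembles an argument list that overrides only the values differing from the listed defaults. The helper name build_args is hypothetical, and the sketch assumes that each option takes its value as the next argument (for example "--peak 2.5"), which the help text does not state explicitly. How the script is located through MAIN_CONF is not covered.

```python
# Defaults copied from the help text above; keys are the documented long options.
DEFAULTS = {
    "b_value": 70.0, "peak": 2.3, "density": 1.1, "close": 2.3, "far": 4.5,
    "merge": 2.3, "list": 2.5, "ion": 1.2, "what": "BOTH",
}

def build_args(overrides, doit=False):
    """Hypothetical helper: argument list containing only non-default options."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    args = ["create_gen_solv.pl"]
    for name, value in overrides.items():
        if value != DEFAULTS[name]:
            # Assumption: the value follows the option as a separate argument.
            args += [f"--{name}", str(value)]
    if doit:
        args.append("--doit")  # documented switch that creates the macro
    return args

print(build_args({"peak": 2.5, "what": "NEW"}, doit=True))
# ['create_gen_solv.pl', '--peak', '2.5', '--what', 'NEW', '--doit']
```

Running the script with -h first is the safer way to confirm the current values before passing --doit.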
Old solvent molecules should be included in the segment WAT and their residue names should be H2O. The newly generated ones will be included in the segment WAT2. When the procedure is rerun, all solvent molecules in WAT2 are deleted and generated again under the (new) criteria specified in "gen_solvent.com".
The SUBROUTINE command converts the passed parameters (maps and segment names) to local variables.
subroutine int 2FOFC int FOFC char SEGMENTS char PROTEIN
These are the cutoffs you should probably tailor through "main_config" or with a text editor.
set vari TOO_CLOSE local real = 2.3
set vari TOO_FAR local real = 4.5
set vari PAIR_CUT local real = 2.3
set vari PAIR_LIST local real = 2.5
set vari 2FO-FC_CUT local real = 1.1
set vari FO-FC_CUT local real = 2.3
set vari FO-FC_ION local real = 1.2
set vari TEMP_CUT local real = 70.0
Two or more peaks are merged into one if they are closer than PEAK_SIZE.
set vari PEAK_SIZE local real = 1.5
The rest of this section is of interest to programmers and curious people who want to know how MAIN deals with solvent generation.
Since peaks are searched only within a map, the procedure below generates a new map from the Fo-Fc map to ensure that the difference map really lies around your molecular model.
set vari MAP_DIFF local int = FOFC + 1
if ( 2FOFC .gt. FOFC ) set vari MAP_DIFF local int = 2FOFC + 1
make map MAP_DIFF from FOFC init 9999 \
  around 8 sele segm name SEGMENTS end copy
All atoms included in segment name WAT2 are deleted, so that a new rerun of the procedure does not cause any conflicts with the previous runs. It is assumed that the rerun will be done with modified peak criteria.
delete atom sele segment name WAT2 end
Assigns atoms to the keys used below (old water molecules, protein nonhydrogen atoms, all nonhydrogen atoms, all dummy nonhydrogen protein atoms).
key wat select .not segment name PROTEIN end
key protein select segment name PROTEIN .and. .not. atom name H* end
key orig select segment name SEGMENTS .and. .not. atom name H* end
key dummy select weight -0.1 0.1 .and. protein end
key old select all end
Removes all water molecules whose oxygen atoms lie at positions where the electron density is lower than 2FO-FC_CUT:
key dw select map 2FOFC -20.0 2FO-FC_CUT from wat .and. atom name O* end
delete atom select by resi dw end
Removes all water molecules with oxygen temperature factors higher than TEMP_CUT:
key dw select temp TEMP_CUT 1000.0 .and. atom name O* .and. wat end
delete atom select by resi dw end
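The two checks applied to existing water molecules can be summarized in a few lines of Python. This is only a sketch of the logic described above, not of how MAIN evaluates it: the Water class and keep_water function are hypothetical names, the map value is assumed to be already sampled at the oxygen position, and the treatment of values exactly at a cutoff is an assumption.

```python
from dataclasses import dataclass

@dataclass
class Water:
    name: str
    density_2fofc: float  # 2Fo-Fc map value at the oxygen position
    b_value: float        # temperature factor of the oxygen

def keep_water(w, density_cut=1.1, temp_cut=70.0):
    """True if the water survives both checks (defaults: 2FO-FC_CUT, TEMP_CUT)."""
    # Deleted when the density is lower than the cutoff or B is higher than the cutoff.
    return w.density_2fofc >= density_cut and w.b_value <= temp_cut

waters = [
    Water("W1", 1.8, 25.0),   # good density, low B: kept
    Water("W2", 0.6, 30.0),   # density below 2FO-FC_CUT: removed
    Water("W3", 1.5, 85.0),   # B above TEMP_CUT: removed
]
print([w.name for w in waters if keep_water(w)])  # ['W1']
```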
Writes the solvent molecules that still show positive difference electron density to the file ion.list. Their electron count seems to be too small for their position, which suggests they may be ions rather than water.
write over file ion.list select .not. atom name H* .and. \
  map MAP_DIFF FO-FC_ION 100. from wat end coordinates xplor
make point init from map MAP_DIFF extrem FO-FC_CUT 200.0
show point dens
Makes atoms from points (peaks in the Fo-Fc map) and assigns the newly created peaks to the key xxx.
make atom from point reindex
key xxx select .not old end
show key xxx
Deletes all peaks located at positions where the 2Fo-Fc electron density map (2FOFC) is lower than 2FO-FC_CUT sigma.
delete atom select map 2FOFC -300. 2FO-FC_CUT from xxx end
Deletes the peaks closer than TOO_CLOSE to any "orig" atom, the peaks further than TOO_FAR from any "protein" atom and the peaks closer than TOO_FAR to any dummy protein atom.
delete atom select around key orig dist TOO_CLOSE from xxx end
delete atom \
  select .not. around key protein dist TOO_FAR from xxx .and. xxx end
delete atom select around key dummy dist TOO_FAR from xxx end
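The three distance criteria can be written out as a brute-force Python sketch. It assumes plain Cartesian coordinates in ångströms and ignores symmetry and unit cell; filter_peaks and min_dist are hypothetical names, and the defaults mirror TOO_CLOSE and TOO_FAR above.

```python
import math

def min_dist(point, atoms):
    """Shortest distance from point to any atom in a non-empty list."""
    return min(math.dist(point, a) for a in atoms)

def filter_peaks(peaks, orig, protein, dummy, too_close=2.3, too_far=4.5):
    """Keep peaks that pass the three distance criteria described above."""
    kept = []
    for p in peaks:
        if orig and min_dist(p, orig) < too_close:
            continue  # overlaps an existing nonhydrogen atom
        if not protein or min_dist(p, protein) > too_far:
            continue  # too far from the protein to be ordered solvent
        if dummy and min_dist(p, dummy) < too_far:
            continue  # near a dummy (unmodelled) protein atom
        kept.append(p)
    return kept

protein = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
dummy = [(10.0, 0.0, 0.0)]           # second protein atom has zero weight
peaks = [(1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 6.0, 0.0), (11.0, 3.0, 0.0)]
print(filter_peaks(peaks, orig=protein, protein=protein, dummy=dummy))
# [(3.0, 0.0, 0.0)]
```

The third test reflects the reasoning that density next to a dummy atom is more likely unmodelled protein than solvent.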
Since the new atoms (peaks) were created as dummy atoms (X), they have to be renamed to oxygen (OH2) for further treatment. The peaks are then merged according to the clusters they form.
rename select xxx end atom OH2
calc bond distance PEAK_SIZE select xxx end
make segm select xxx end
make atom merge resi select xxx end
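The merging step can be pictured as single-linkage clustering: peaks connected by a chain of distances shorter than PEAK_SIZE end up in one cluster, and each cluster becomes one atom. The Python sketch below illustrates that idea only. Placing the merged atom at the cluster centroid is an assumption, as is the function name merge_peaks; the text above does not say where MAIN puts the merged atom.

```python
import math

def merge_peaks(peaks, peak_size=1.5):
    """Merge peaks linked by distances below peak_size into one position each."""
    parent = list(range(len(peaks)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(peaks)):
        for j in range(i + 1, len(peaks)):
            if math.dist(peaks[i], peaks[j]) < peak_size:
                parent[find(i)] = find(j)  # "bond" the two peaks

    clusters = {}
    for i, p in enumerate(peaks):
        clusters.setdefault(find(i), []).append(p)

    # Assumption: the merged atom is placed at the centroid of its cluster.
    return [tuple(sum(c) / len(members) for c in zip(*members))
            for members in clusters.values()]

peaks = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
print(sorted(merge_peaks(peaks)))  # [(1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
```

Note that the first and third peaks are 2.0 apart, more than PEAK_SIZE, yet they merge because both are linked to the second one.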
Gives the peaks a single segment name (WAT2) and residue name (H2O), and sets their temperature factors to 30.0 and their crystallographic weights to 1.0.
rena select xxx end resi H2O
rena select xxx end seg WAT2
set temp select xxx end = 30.0
set weigh select xxx end = 1.0
Prepares the symmetry overlap check (applied later in the macro MAIN_UTILS:gen_solvent_remove_symm.com, which removes the symmetry-equivalent solvent atoms). The variable ii is initialized to the current number of segments and all current atoms are assigned to the key 'nnn'. Symmetry-related atoms are then generated from all solvent oxygen atoms that are closer than TOO_FAR to any atom in the key 'nnn', and the newly generated atoms are assigned to the key 'symm'.
set vari ii int = nsegm
key nnn select all end
symmetry select xxx .or. wat .and. atom name O* end \
  around select nnn end dist TOO_FAR cut delete on
key symm select .not. nnn end
Calculates the pair list between the solvent oxygen atoms and their symmetry-related ones. A pair of atoms (peaks) is assumed to be identical if they are closer than PAIR_CUT.
calc pair select symm end select xxx .or. wat .and. atom name O* end -
  rang 0.0 PAIR_CUT initialize
All atoms generated by the same symmetry operation get a unique segment name starting with the character #, and each symmetry operation creates another segment. The following command file runs iteratively until the pair list is empty.
The pair list shows the user which segments the interfering atoms belong to. The interfering atoms appear twice among the generated peak atoms and twice among the symmetry-related ones, so they are removed until only one of them remains.
@MAIN_UTILS:gen_solvent_remove_symm.com ii
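The idea of running until the pair list is empty can be sketched in Python as follows. This is not a transcription of gen_solvent_remove_symm.com, whose contents are not shown here. The sketch assumes that each pair has already been reduced to the indices of the two original solvent atoms involved, that the atom with the higher index is the one dropped, and that an atom paired with its own symmetry mate is left alone. All three are assumptions, as is the function name.

```python
def drop_symmetry_duplicates(n_atoms, pairs):
    """Return indices of atoms kept after resolving symmetry-duplicate pairs.

    pairs holds (i, j) tuples meaning: atom i coincides with a symmetry
    mate of atom j.
    """
    alive = set(range(n_atoms))
    # Assumption: self-pairs (atom next to its own mate) are ignored here.
    pending = [(i, j) for i, j in pairs if i != j]
    while pending:
        i, j = pending[0]
        alive.discard(max(i, j))  # assumption: keep the lower index
        # Rebuild the pair list from the atoms that are still present.
        pending = [(a, b) for a, b in pending if a in alive and b in alive]
    return sorted(alive)

# Atoms 0 and 2 are symmetry copies of each other, so the pair shows up
# in both directions; atom 1 is close to its own mate.
print(drop_symmetry_duplicates(4, [(0, 2), (2, 0), (1, 1)]))  # [0, 1, 3]
```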
Removes the symmetry related peaks by checking the protein atoms.
key out sele by pair orig .and. ( xxx .or. symm ) end
del atom sele by sequence out .and. ( xxx .or. symm ) end
Calculates the pair list of the remaining peaks that are closer than PAIR_LIST to any protein atom. The list is written to the file pair.list, and the positions of the paired protein residues should be checked at the display.
calc pair select protein end -
  select ( symm .or. xxx .or. wat ) .and. atom name O* end -
  rang 0.0 PAIR_LIST initial
write over file pair.list pair
Performs hydrogen bond tests: all solvent atoms that have no hydrogen donor or acceptor atom (O*, N*) in their vicinity (3.4 Å) are deleted with the command file ATOM_NEIGH.COM.
@>doc/gen_solvent/atom_neigh 3.4
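The neighbour test amounts to asking whether any atom whose name starts with O or N lies within the cutoff of the solvent oxygen. A minimal Python sketch of that test is given below; it is not the content of the atom_neigh command file. The function name is hypothetical, coordinates are plain Cartesian values without symmetry, and whether other solvent oxygens count as neighbours in MAIN is not stated above (here any O* atom counts).

```python
import math

def has_polar_neighbour(water_xyz, atoms, cutoff=3.4):
    """True if an O* or N* atom lies within cutoff of the water oxygen.

    atoms is a list of (atom_name, (x, y, z)) tuples.
    """
    for name, xyz in atoms:
        d = math.dist(water_xyz, xyz)
        # d > 0 skips the water itself if it is part of the atom list.
        if name[:1] in ("O", "N") and 0.0 < d <= cutoff:
            return True
    return False

atoms = [("N", (0.0, 0.0, 0.0)), ("CA", (1.5, 0.0, 0.0)), ("O", (8.0, 0.0, 0.0))]
print(has_polar_neighbour((2.9, 0.0, 0.0), atoms))  # True: N is 2.9 away
print(has_polar_neighbour((4.5, 0.0, 0.0), atoms))  # False: only CA is close
```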
Renames the solvent sequence IDs so that they run from W1 to Wnn.
rename seq W sele wat .or. xxx end auto
Cleans up: deletes the remaining symmetry-generated atoms (segments #*) and drops the keys.
delete atom sele segm name #* end
key xxx drop
key dw drop
key dummy drop
key nnn drop
key out drop
key protein drop
key symm drop
key orig drop
It remains to generate the missing hydrogens and update the WORK_SEGM variable.
build sele segm name WAT2 end fill exit
calc coor append sele segm name WAT2 end
set vari WORK_SEGM global char segm sele segm name WORK_SEGM WAT2 end
return