Generation of solvent molecules

Macro "gen_solvent.com"

At the final stages of refinement against sufficiently high-resolution data, solvent molecules can be identified and added to a macromolecular structure. When properly inserted, they improve the density and lower the R-value. In many cases, however, adding solvent molecules merely fits noise. The number of solvent molecules that you can or should insert is therefore usually judged as a ratio against the number of residues in the macromolecule. In general, higher resolution allows more solvent.

A click on "GEN_SOLV" on the "REFINE" page invokes a macro that performs the whole procedure. Before invoking the macro, you need to generate the two difference maps ("MAP_2FOFC" and "MAP_FOFC").

Existing solvent molecules are checked for their B values, their consistency with the 2Fobs-Fcalc map, and symmetry overlap. The segment name of already existing (old) solvent molecules must be "WAT". The newly created ones get the segment name "WAT2". At the beginning of each trial all atoms with segment name "WAT2" are deleted, so you should rename the WAT2 segment to WAT once a cycle of solvent generation is completed.

New solvent molecules are generated by a peak search in the difference map. The number of peaks is then reduced on the basis of their consistency with the 2Fobs-Fcalc map density, their distance from macromolecular atoms (overlap, and vicinity of possible hydrogen donors or acceptors), and symmetry overlap.

Kicking atoms before calculating the maps may help you to distinguish noise peaks from real solvent molecules.

In the final stages of a macromolecular structure refinement, well-ordered solvent molecules are placed around the structure. Positioning them and later verifying their positions often requires a substantial amount of work. The procedure described below was developed to avoid most of it. The procedure

removes the already existing water molecules placed at too low electron density (2Fobs-Fcalc),
removes ones with too high temperature factor,
inserts oxygen atoms at positions of peaks found in the difference electron density map (Fobs-Fcalc),
removes peaks placed at too low electron density in the 2Fobs-Fcalc map,
removes all peaks that are closer than 1.5A to any already present atom,
removes all peaks that are further than 5.0A from any protein atom,
removes all peaks that are closer than 5.0A to any dummy protein atom,
removes all peaks related by crystal symmetry,
gives a list of possibly wrongly placed atoms,
removes all peaks that have no hydrogen donor or acceptor atom in their vicinity.

The simplest way to create and update your "gen_solvent.com" is through the "create_config" tools, using MAIN_CONF:create_gen_solv.pl.


 create_gen_solv.pl
  -h|--help)      prints this message with available options and current status
  -b|--b_value)   B-value cut for previous rounds in WAT [70.0]
  -p|--peak)      peak threshold in diff density map (fo-fc) [2.3]
  -d|--density)   density map (2fo-fc) cutoff [1.1]
  -c|--close)     the closest density pick existing atom distance [2.3]
  -f|--far)       the farthest density pick distance to an existing atom [4.5]
  -m|--merge)     remove symmetry related peaks if they get closer than [2.3]
  -l|--list)      key "check_list" contains residues closer than [2.5] to peaks
  -i|--ion)       key "ion_list" conatins WAT atoms with fo-fc dens higher than [1.2]
  -w|--what)      what to do: [CHECK/NEW/BOTH] [BOTH]
     --doit)      create the macro

Old solvent molecules should be included in the segment WAT and their residue names should be H2O. The newly generated ones will be included in the segment WAT2. When the procedure is rerun, all solvent molecules in WAT2 are deleted and generated again under the (new) criteria specified in "gen_solvent.com".

Macro "gen_solvent.com"

The SUBROUTINE command converts the passed parameters (maps and segment names) to local variables.


 subroutine int 2FOFC int FOFC char SEGMENTS char PROTEIN

These are the cutoffs you should probably tailor through "main_config" or with a text editor.


 set vari TOO_CLOSE local real = 2.3
 set vari TOO_FAR local real = 4.5
 set vari PAIR_CUT local real = 2.3
 set vari PAIR_LIST local real = 2.5


 set vari 2FO-FC_CUT local real = 1.1
 set vari FO-FC_CUT local real = 2.3
 set vari FO-FC_ION local real = 1.2
 set vari TEMP_CUT local real = 70.0

Two or more peaks are merged into one if they are closer than PEAK_SIZE.


 set vari PEAK_SIZE local real = 1.5

The rest of this section is of interest to programmers and curious people who want to know how MAIN deals with solvent generation.

Since peaks are searched only within a map, the procedure below generates a new map from the Fo-Fc map to ensure that the difference map really lies around your molecular model.


 set vari MAP_DIFF  local int = FOFC + 1
 if ( 2FOFC .gt. FOFC ) set vari MAP_DIFF local int = 2FOFC + 1
 make map MAP_DIFF from FOFC init 9999 \
         around 8 sele segm name SEGMENTS end copy
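The choice of the map number amounts to taking the slot just past the higher of the two input map numbers, so that the new map overwrites neither of them. A minimal Python sketch of that arithmetic, for illustration only (the function name `pick_diff_map_slot` is hypothetical and not part of MAIN):

```python
def pick_diff_map_slot(two_fofc: int, fofc: int) -> int:
    """Return a map number that collides with neither input map.

    Mirrors the two 'set vari MAP_DIFF' lines: start from FOFC + 1 and
    move to 2FOFC + 1 when the 2Fo-Fc map has the higher number.
    """
    slot = fofc + 1
    if two_fofc > fofc:
        slot = two_fofc + 1
    return slot
```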

All atoms included in segment name WAT2 are deleted, so that a new rerun of the procedure does not cause any conflicts with the previous runs. It is assumed that the rerun will be done with modified peak criteria.


 delete atom sele segment name WAT2 end

Assigns atoms to the keys used below (old water molecules, protein nonhydrogen atoms, all nonhydrogen atoms, and all dummy nonhydrogen protein atoms):


 key wat select .not segment name PROTEIN end
 key protein select segment name PROTEIN .and. .not. atom name H* end
 key orig select segment name SEGMENTS .and. .not. atom name H* end
 key dummy select weight -0.1 0.1 .and. protein end
 key old select all end

Removes all water molecules whose oxygens are placed at positions with electron density lower than 2FO-FC_CUT:


 key dw select map 2FOFC -20.0 2FO-FC_CUT from wat .and. atom name O* end
 delete atom select by resi dw end

Removes all water molecules with oxygen temperature factors higher than TEMP_CUT:


 key dw select temp TEMP_CUT 1000.0 .and. atom name O* .and. wat end
 delete atom select by resi dw end
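The two checks on old water molecules can be summarised in a short Python sketch. It is not MAIN code: the `Water` record and the function `keep_old_waters` are hypothetical, and whether a value exactly equal to a cutoff is kept or removed is an assumption. The defaults are the 2FO-FC_CUT and TEMP_CUT values set earlier in the macro.

```python
from dataclasses import dataclass


@dataclass
class Water:
    name: str
    density: float  # 2Fo-Fc map value at the oxygen position
    b_value: float  # temperature factor of the oxygen


def keep_old_waters(waters, density_cut=1.1, temp_cut=70.0):
    """Return the waters that pass both the density and the B-value test."""
    return [
        w for w in waters
        if w.density >= density_cut and w.b_value <= temp_cut
    ]
```

A water molecule has to pass both tests to survive; the macro applies them one after the other, which gives the same result.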

Writes the solvent molecules that still show positive difference electron density to the file ion.list. A water oxygen seems to have too few electrons for these positions, so they are candidates for ions.


 write over file ion.list select .not. atom name H* .and. \
   map MAP_DIFF FO-FC_ION 100. from wat end coordinates  xplor

Searches the map MAP_DIFF for peaks higher than FO-FC_CUT and stores them as points.


 make point init from map MAP_DIFF extrem FO-FC_CUT 200.0 show point dens

Makes atoms from the points (peaks in the Fo-Fc map) and assigns the newly created peaks to the key xxx.


 make atom from point reindex
 key xxx select .not old end
 show key xxx

Deletes all peaks placed at positions where the 2Fo-Fc electron density map (2FOFC) is lower than 2FO-FC_CUT sigma.


 delete atom select map 2FOFC -300. 2FO-FC_CUT from xxx end

Deletes the peaks closer than TOO_CLOSE to any "orig" atom, the peaks further than TOO_FAR from any "protein" atom, and the peaks closer than TOO_FAR to any dummy protein atom.


 delete atom select around key orig dist TOO_CLOSE from xxx end
 delete atom \
    select .not. around key protein dist TOO_FAR from xxx .and. xxx end
 delete atom select around key dummy dist TOO_FAR from xxx end
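The three distance tests above can be illustrated outside MAIN with a minimal Python sketch. It is not MAIN code: the function name `filter_peaks`, the plain coordinate tuples, and the treatment of distances exactly equal to a cutoff are assumptions made for illustration; the default cutoffs are the TOO_CLOSE and TOO_FAR values set earlier in the macro.

```python
import math


def filter_peaks(peaks, orig_atoms, protein_atoms, dummy_atoms,
                 too_close=2.3, too_far=4.5):
    """Return the peaks that survive the three distance tests.

    All arguments are sequences of (x, y, z) tuples in Angstrom.
    Hypothetical helper, written only to mirror the macro's logic.
    """
    def nearest(point, atoms):
        # Distance to the closest atom; infinity when the set is empty.
        return min((math.dist(point, a) for a in atoms), default=math.inf)

    kept = []
    for peak in peaks:
        if nearest(peak, orig_atoms) < too_close:
            continue  # overlaps an atom that is already present
        if nearest(peak, protein_atoms) > too_far:
            continue  # too far from the protein to be ordered solvent
        if nearest(peak, dummy_atoms) < too_far:
            continue  # sits next to an unmodelled (dummy) protein atom
        kept.append(peak)
    return kept
```

The third test discards peaks near dummy atoms because density there is more likely to belong to the not yet modelled part of the protein than to solvent.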

Since the newly created atoms (peaks) were created as dummy atoms (X), they must be renamed to oxygen (OH2) for further treatment. The peaks are then merged according to the clusters they form.


 rename select xxx end atom OH2
 calc bond distance PEAK_SIZE select xxx end
 make segm select xxx end
 make atom merge resi select xxx end
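The idea behind the merge step can be sketched in Python as single-linkage clustering: peaks closer than PEAK_SIZE are linked, and every linked group collapses into one position. This is an illustration only, not MAIN code; in particular, placing the merged atom at the centroid of its cluster is an assumption, since the text does not say how MAIN positions it, and the name `merge_peaks` is hypothetical.

```python
import math


def merge_peaks(peaks, peak_size=1.5):
    """Merge peaks linked by distances below peak_size into single positions.

    peaks is a list of (x, y, z) tuples. Each cluster is replaced by its
    centroid (an assumption; MAIN may choose the position differently).
    """
    parent = list(range(len(peaks)))

    def find(i):
        # Union-find root lookup with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of peaks that lies closer than peak_size.
    for i in range(len(peaks)):
        for j in range(i + 1, len(peaks)):
            if math.dist(peaks[i], peaks[j]) < peak_size:
                parent[find(i)] = find(j)

    # Collect the members of each cluster and average their coordinates.
    clusters = {}
    for i, peak in enumerate(peaks):
        clusters.setdefault(find(i), []).append(peak)
    return [
        tuple(sum(axis) / len(members) for axis in zip(*members))
        for members in clusters.values()
    ]
```

Because the linking is transitive, a chain of peaks can merge into one atom even when its two ends are further apart than PEAK_SIZE.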

Gives the peaks a single segment (WAT2) and residue (H2O) name, and sets their temperature factors to 30.0 and their crystallographic weights to 1.0.


 rena select xxx end resi H2O
 rena select xxx end seg WAT2
 set temp select xxx end = 30.0
 set weigh select xxx end = 1.0

Prepares the symmetry overlap check (applied further on in the macro MAIN_UTILS:gen_solvent_remove_symm.com, where the symmetry-equivalent solvent atoms are removed): initializes the variable ii to the current number of segments, stores all current atoms in the key 'nnn', and generates the symmetry-related atoms of all solvent oxygen atoms that lie closer than TOO_FAR to any atom in key 'nnn'.


 set vari ii int = nsegm
 key nnn select all end
 symmetry select xxx .or. wat .and. atom name O* end \
    around select nnn end dist TOO_FAR cut delete on
 key symm select .not. nnn end

Calculates the pair list between the solvent oxygen atoms and their symmetry-related ones. A pair of atoms (peaks) is assumed to be identical if the two are closer than PAIR_CUT.


 calc pair select symm end select xxx .or. wat .and. atom name O* end -
   rang 0.0 PAIR_CUT initialize
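What the symmetry pair search looks for can be illustrated with a small Python sketch. It is a simplification and not MAIN code: it assumes fractional coordinates, a cell with orthogonal axes, and a single symmetry operator given as a rotation matrix and a translation. All function names are hypothetical.

```python
import math


def symmetry_mate(frac, rotation, translation):
    """Apply one symmetry operator (rotation matrix plus translation)
    to a fractional coordinate triple."""
    return tuple(
        sum(r * x for r, x in zip(row, frac)) + t
        for row, t in zip(rotation, translation)
    )


def min_image_distance(frac_a, frac_b, cell):
    """Shortest distance in Angstrom between two fractional positions,
    allowing for lattice translations. Assumes orthogonal cell axes."""
    total = 0.0
    for a, b, length in zip(frac_a, frac_b, cell):
        d = a - b
        d -= round(d)  # wrap the difference into [-0.5, 0.5]
        total += (d * length) ** 2
    return math.sqrt(total)


def symmetry_pairs(peaks, rotation, translation, cell, pair_cut=2.3):
    """Return (i, j) whenever the symmetry mate of peak i lies closer
    than pair_cut to peak j."""
    pairs = []
    for i, peak in enumerate(peaks):
        mate = symmetry_mate(peak, rotation, translation)
        for j, other in enumerate(peaks):
            if min_image_distance(mate, other, cell) < pair_cut:
                pairs.append((i, j))
    return pairs
```

Each overlapping pair is reported in both directions, (i, j) and (j, i), which corresponds to the remark that interfering atoms show up twice and only one copy should remain.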

All atoms generated with the same symmetry operation obtain a unique segment name starting with the character #, and each symmetry operation creates another segment. The following command file runs iteratively until the pair list is empty.

From the pair list of interfering atoms the user can see which segments the atoms belong to. The interfering atoms appear twice among the generated peak atoms and twice among the symmetry-related ones. Therefore they are removed so that only one of them remains.


 @MAIN_UTILS:gen_solvent_remove_symm.com  ii

Removes the symmetry related peaks by checking the protein atoms.


 key out sele by pair orig .and. ( xxx .or. symm ) end
 del atom sele by sequence out .and. ( xxx .or. symm ) end

Calculates the pair list of the remaining peaks closer than PAIR_LIST to any protein atom and writes it to the file pair.list. The positions of the paired protein residues should be checked on the display.


 calc pair select protein end -
          select ( symm .or. xxx .or. wat ) .and. atom name O* end -
          rang 0.0 PAIR_LIST initial


 write over file pair.list pair

Performs hydrogen bond tests: all solvent atoms that have no hydrogen donor or acceptor atom (O*, N*) in their vicinity (3.4A) are deleted with the command file ATOM_NEIGH.COM.


 @>doc/gen_solvent/atom_neigh  3.4
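The hydrogen bond test can be sketched in Python as follows. This is an illustration and not the content of ATOM_NEIGH.COM: the function names are hypothetical, atoms are given as (name, coordinates) pairs, and a donor or acceptor is recognised simply by an atom name starting with O or N, following the O*, N* pattern quoted above.

```python
import math


def has_hbond_partner(water_xyz, atoms, max_dist=3.4):
    """True if any O* or N* atom lies within max_dist of the water oxygen.

    atoms is an iterable of (atom_name, (x, y, z)) pairs.
    """
    return any(
        name[:1] in ("O", "N") and math.dist(water_xyz, xyz) <= max_dist
        for name, xyz in atoms
    )


def prune_isolated(waters, atoms, max_dist=3.4):
    """Keep only the solvent positions that have a possible hydrogen bond partner."""
    return [w for w in waters if has_hbond_partner(w, atoms, max_dist)]
```

Whether other solvent oxygens count as partners depends on which atoms are passed in as `atoms`; the sketch leaves that choice to the caller.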

Renames the solvent sequence IDs so that they run from W1 to Wnn.


 rename seq W sele wat .or. xxx end auto

Cleans up: deletes the temporary symmetry segments (#*) and drops the keys.


 delete atom sele segm name #* end
 key xxx drop
 key dw drop
 key dummy drop
 key nnn drop
 key out drop
 key protein drop
 key symm drop
 key orig drop

It remains to generate the missing hydrogens and update the WORK_SEGM variable.


 build
 sele segm name WAT2 end
 fill
 exit
 calc coor append sele segm name WAT2 end
 set vari WORK_SEGM global char segm sele segm name WORK_SEGM WAT2 end
 return