9. Use of non-crystallographic similarity (NCS)
Keyword: local symmetry, density averaging, non-crystallographic similarity, NCS

The presence of several equivalent molecules in the asymmetric unit of
a crystal is a blessing for crystal structure determination.  In such
cases the electron density defining the molecular structures can be
averaged, and the atomic positions and B-values of equivalent atoms can
be subjected to additional constraints (NCS, Non-Crystallographic-Similarity
constraints).  Density averaging reduces model bias and makes it
possible to interpret electron density maps with rather poor initial
phase information as well as to solve structures with low completeness.

Before continuing with this chapter you are advised to become acquainted
with the chapter "1 molecule case" ("MAIN_DOC:1mol/1mol.txt"), which
describes the other MAIN tools that also underlie cases with multiple
equivalent subunits.

There are a few assumptions in this chapter and in the macros mentioned
herein that you should bear in mind:
- each subunit should have its own segment name,
- all equivalent subunits should be included in the same group,
- the residue names and sequence IDs must be the same throughout all
  equivalent molecules.
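
The assumptions above can be checked before the macros are generated.
The following is a minimal Python sketch and not part of MAIN: the
function name "check_equivalent_segments" and the in-memory model layout
(segment name mapped to a list of sequence ID and residue name pairs) are
assumptions made for illustration only.

```python
def check_equivalent_segments(model, groups):
    """Return a list of problems; an empty list means the groups look consistent.

    model  -- dict: segment name -> list of (sequence ID, residue name) pairs
              (hypothetical layout, not a MAIN data structure)
    groups -- list of lists of segment names, e.g. [["MOLA", "MOLB"], ["IA", "IB"]]
    """
    problems = []
    seen = set()
    for group in groups:
        for segment in group:
            # each subunit should have its own segment name
            if segment in seen:
                problems.append("segment %s listed more than once" % segment)
            seen.add(segment)
            if segment not in model:
                problems.append("segment %s not found in model" % segment)
        present = [s for s in group if s in model]
        if not present:
            continue
        reference = dict(model[present[0]])
        for segment in present[1:]:
            residues = dict(model[segment])
            # residue names and sequence IDs must agree within a group
            for seq_id in sorted(set(reference) | set(residues), key=str):
                if reference.get(seq_id) != residues.get(seq_id):
                    problems.append("%s/%s differ at sequence ID %s"
                                    % (present[0], segment, seq_id))
    return problems
```

A residue that is present in one molecule but missing in another is
reported in the same way as a residue with a different name.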

9.1. Run main_config first
Keyword: main_config

The cathepsin L - p41 fragment case (Guncar et al., 1999, EMBO J. 18,
793-803) serves to explain the MAIN philosophy and most of the tools
dealing with non-crystallographic similarity. The complete case with
all data and macros is available in "../cases/catl_p41".  The macros
shown were all configured and created using the "main_config" utilities.


 > create_main_config.pl -m MOLA MOLB IA IB -g "MOLA MOLB" "IA IB" --doit
 creating ".main" 
 creating "read.com" 
 creating "save_file.cmds" 
 creating "re_image.cmds" 
 creating "symmetry.cmds" 
 creating "symmetry_ca.cmds" 
 creating "refine.cmds" 
 creating "refine_b.cmds" 
 creating "gen_solvent.com" 
 creating "make_masks.cmds" 
 creating "dm_prep.cmds" 
 creating "dm_next.cmds" 
 creating "dm_loop.com" 
  1 strategy EACH group MOLA MOLB 
  2 strategy EACH group IA IB 
 creating "load_4mol.com" 
 creating "rms_fit.cmds" 
 creating "create_all_others.cmds" 
 
 MAIN configuration files have now been generated.  
 
 Now type "mainps" to start your MAIN session or 
 if you are not happy with defaults see the chapter below 
 adjust your input data using "menu_read.sh" and other scripts.
 

The macros are now created and an interactive MAIN session can be
started by invoking "mainps":

 > mainps


9.2. Non-default setup

This case has two enzyme molecules with segment names "MOLA" and "MOLB"
and two inhibitors attached to them with segment names "IA" and "IB".

9.2.2.  Changing the averaging strategy
Keyword: density averaging strategy, EACH, ONE, LINK, WHOLE

Currently there are two strategies available in MAIN, and by default
MAIN switches between them ("EACH" and "ONE") according to the number
of group members, as described below.

Typing "create_main_config.pl" without any parameters prints the
current "main_config" setup (as do the other configuration scripts,
the "create_*.pl" Perl scripts and shell scripts such as
"menu_dens_mod.sh"):

 create_main_config.pl

   -g|--group )     defines NCS groups:
                   specify the list of segments belonging to groups
                   for each group embrace the list in " "
                   1 EACH  | MOLA MOLB
                   2 EACH  | IA IB
  -s|--strategy)   defines averaging scripts strategy -
                   specify parameters for each group separately [1 EACH]
                   EACH: average each molecule separately (default)
                   ONE: average one and distribute the averaged density to others
                   LINK: use operators from another group [ 2 LINK 1 ]
                   WHOLE: the whole group builds a single mask for proper (spherical) symmetry


Strategy "EACH" means that the density for each member of the group will
be calculated separately by averaging it with the rotated densities of
all other members.  Choosing "ONE" means that the density will be
averaged only for the first member of each group and then distributed
to all equivalent ones by map rotation and translation.  "LINK" will use
geometrical operators from the linked group and "WHOLE" will do
averaging within a single mask defined by all members of the group.  By
default the strategy "EACH" is applied as long as the number of group
members is smaller than 5, whereas "ONE" is chosen for groups with a
higher number of group members.
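
The default rule described above can be written down in a few lines.
This is only an illustrative Python sketch of the rule as stated in the
text; the function name "default_strategy" is hypothetical and the code
is not taken from "main_config".

```python
def default_strategy(n_members):
    """Default averaging strategy as described in the text:
    "EACH" for fewer than 5 group members, "ONE" otherwise."""
    return "EACH" if n_members < 5 else "ONE"


groups = {1: ["MOLA", "MOLB"], 2: ["IA", "IB"]}
for number, segments in groups.items():
    # mimics the layout of the group listing shown above
    print(number, default_strategy(len(segments)), "|", " ".join(segments))
```

For the two groups of this case both lines report "EACH", in agreement
with the listing printed by "create_main_config.pl".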

For example, to choose "ONE" for the first group and "EACH" for the
second, type the group number followed by the strategy name after the
strategy specifier "-s":

 create_main_config.pl -s 1 ONE 2 EACH

9.2.3.  Group inter-dependency
Keyword: NCS groups

When an inhibitor or cofactor is attached to an enzyme or to some other
part of a well-known structure, there is no need to change the
well-defined part each time you actually want to update only the ligand
part.

The "LINK" strategy will cover this in the future; for now, however, one
still needs to do some typing explicitly.

The ligand is presumably bound to the enzyme in the same orientation in
all equivalent complexes, so it makes sense to use the superposition
parameters from the enzymatic parts rather than to calculate them from
(sometimes even partial) ligand models.

So one can edit the created "rms_fit.cmds" file and simply copy
"mol_MOLA_to_MOLB.com" into "mol_IA_to_IB.com" and correspondingly
create the file "mol_IB_to_IA.com".  (One can also change the
"rms_fit.cmds" macro so that it writes the inhibitor superposition
files "mol_IB_to_IA.com" and "mol_IA_to_IB.com" based on the
superpositions of "MOLA" and "MOLB".)
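
The copying step can also be scripted outside MAIN.  The sketch below is
a minimal Python example under stated assumptions: the function name
"copy_superposition_macros" is hypothetical, the files are assumed to
reside in one directory, and "mol_IB_to_IA.com" is assumed to be taken
from "mol_MOLB_to_MOLA.com".  If the superposition macros refer to
segment names internally, the copies would still have to be edited by
hand.

```python
import os
import shutil


def copy_superposition_macros(enzyme=("MOLA", "MOLB"), ligand=("IA", "IB"),
                              directory="."):
    """Copy the enzyme superposition macros to the matching ligand file names."""
    copied = []
    for i, j in ((0, 1), (1, 0)):
        source = os.path.join(directory,
                              "mol_%s_to_%s.com" % (enzyme[i], enzyme[j]))
        target = os.path.join(directory,
                              "mol_%s_to_%s.com" % (ligand[i], ligand[j]))
        shutil.copyfile(source, target)  # overwrites an existing target
        copied.append((source, target))
    return copied
```

The copy has to be repeated whenever "rms_fit.cmds" has recalculated the
enzyme superpositions, otherwise the ligand files become stale.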

9.3. Summary of differences between 1mol and nmol case

The creation procedure is essentially the same as in the
elementary real case chapter ("MAIN_DOC:1mol/1mol.txt"); however,
several macros differ and some additional ones are created.  The
differences reflect the relations between the topologically identical
segments "MOLA" and "MOLB".

- The "read.com" macro contains a line that loads an additional
menu block ("load_4mol.com").
- The "refine.cmds" and "refine_b.cmds" macros contain code defining
NCS constraints between different molecules.
- The "load_4mol.com" macro is a new file. It loads the additional menu
block "N_MOLECUL" to page 7 (see also "MAIN_MENU:nmol.txt"). It contains
items that assist in model building of topologically equivalent
molecules.
- The "rms_fit.cmds" macro has been created to calculate superposition
parameters and save the results in macros for later use.
- "dm_loop.com" now contains density averaging.
- The "create_all_others.cmds" macro is the only building macro created
in advance.  Other macros are created at run time.  For details see
the description of the menu block "N_MOLECUL" in "MAIN_MENU:nmol.txt".

9.4. Startup file for N molecules
Keyword: read.com

There is only one difference between the 1 molecule startup file
("MAIN_DOC:1mol/read.com") and this one ("read.com"): in order to
enable manipulation of several topologically equivalent molecules, an
additional menu block ("load_4mol.com") is loaded.

9.5. Different segments have different colors
Keyword: re_image.cmds, symmetry.cmds, symmetry_ca.cmds

The "re_image.cmds" macro displays all molecules.
Crystallographically weighted atoms are colored close to yellow,
whereas the ones with occupancy 0.0 are colored green.

The symmetry mates ("symmetry.cmds" and "symmetry_ca.cmds") are also
colored differently.  Each different segment gets a new color close to
red regardless of the crystallographic weight of the atoms.

9.3. Refining with NCS
Keyword: refine.cmds, refine_b.cmds

Refining with NCS constraints helps.  It can also improve your
parameters (geometrical superposition of maps) for electron density
averaging.  Here NCS constraints are applied to atomic positions.

It is important, especially towards the end of refinement, that you
recognize which parts of the various molecules differ and exclude them
from the NCS constraints.  Their sequence IDs are to be included in the
selection key "out".  The "refine.cmds" macro must therefore be
modified using a text editor.  The NCS groups are defined only once;
B-factor refinement ("refine_b.cmds") simply uses the keys defined for
positional refinement.  The most appropriate way is to use the SEQUENCE
keyword. In order to use an equivalent selection of residues throughout
the session in the macros "rms_fit.cmds", "refine.cmds" and
"refine_b.cmds", it is advisable to write the "define_key_out.com"
macro and call it from each of them.

 key out sele  .not all end

 ! include all the non equivalent residues here
 ! key out sele ( seq  x1 x2 .or seq X7 .or seq 167 : 190 ) \
 !       end

 <define_key_out.com

 ! defining groups

 define init constr ncs
 define constr ncs sele atom name CA N C O H CB %G* %D* \
           .a ( segm name MOLA .a .not out ) end
 define constr ncs sele atom name CA N C O H CB %G* %D* \
           .a ( segm name MOLB .a .not out ) end
 define constr ncs force  5.
 define constr ncs b-force 0.004

 ener ncs on
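
With more than two molecules the constraint block above grows by one
selection per segment.  The following Python sketch merely repeats the
pattern of the excerpt for a list of segments of a single group; the
function name "ncs_constraint_lines" is hypothetical, and it is an
assumption that the pattern generalizes unchanged to larger groups.
How several groups are declared is not shown in the excerpt, so it is
not covered here.

```python
def ncs_constraint_lines(segments, force="5.", b_force="0.004"):
    """Return macro lines that mirror the excerpt above for one NCS group."""
    lines = ["define init constr ncs"]
    for segment in segments:
        # one selection per segment, excluding residues in the key "out"
        lines.append("define constr ncs sele atom name CA N C O H CB %G* %D* \\")
        lines.append("          .a ( segm name " + segment + " .a .not out ) end")
    lines.append("define constr ncs force  " + force)
    lines.append("define constr ncs b-force " + b_force)
    lines.append("ener ncs on")
    return lines


print("\n".join(ncs_constraint_lines(["MOLA", "MOLB"])))
```

The generated text should be compared with the "refine.cmds" written by
"main_config" before it is used.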

9.4. Density modification 
Keyword: density modification, nmol, averaging

Having several identical subunits allows you to utilize the benefits
of electron density averaging.  Although the averaging procedure is
based on superimposed regions of molecular masks, do not forget that
the masks may not be complete.  Therefore it makes sense to keep
regions of density in the cyclic procedure even if you are not sure to
which molecule they belong. See "MAIN_DOC:1mol/1mol.txt" and
"MAIN_MENU:map_atom.txt" for instructions.  Besides, you may
skeletonize these regions or build dummy models for mask creation and
superposition ("MAIN_DOC:mol_repl/mol_repl.txt").

The same menu items and files with the same names that run density
modification procedures for 1 molecule ("MAIN_DOC:1mol/1mol.txt") are
also used when electron density averaging is involved.  Procedures are
invoked via the items of the menu block "DENS_MOD"
("MAIN_MENU:dens_mod.txt").  The only new item here is "RMS_FIT", which
calls the "rms_fit.cmds" macro to calculate RMS-fit-based superposition
matrices for all possible combinations of segments within each group.


9.4.1. Generation of superposition parameters (RMS_FIT)
Keyword: rms_fit.cmds

The item "RMS_FIT" from the "DENS_MOD" menu block calls the macro
"rms_fit.cmds".  As the number of files increases with the number of
molecules in each group, it may make sense to store the rotational and
translational parameter files in some other directory.
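
To get a feeling for how fast the number of files grows, the file names
can be enumerated.  The Python sketch below assumes one file per ordered
pair of segments within a group, named after the pattern
"mol_MOLA_to_MOLB.com" seen in this chapter; the function name
"superposition_macro_names" is hypothetical.

```python
from itertools import permutations


def superposition_macro_names(groups, directory=""):
    """List the expected superposition macro names for all groups."""
    names = []
    for group in groups:
        # every ordered pair (source, target) within a group gets its own file
        for source, target in permutations(group, 2):
            name = "mol_%s_to_%s.com" % (source, target)
            names.append(directory + "/" + name if directory else name)
    return names


for name in superposition_macro_names([["MOLA", "MOLB"], ["IA", "IB"]]):
    print(name)
```

Under this assumption a group of n segments needs n*(n-1) files, that
is 2 files for a dimer but 20 files for a group of five.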

For details see "MAIN_MENU:dens_mod.txt".

The following options of the density modification procedure affect the
macro:

    [rot_tran/rt] directory for rotation and translational macros: 

If you have no model (MIR case) you still need these parameters; in
that case the rotational and translational parts will probably be
constructed from a self-rotation function, heavy atom positions or
density map positions (see "MAIN_DOC:intro/intro.txt").

9.4.2. Molecular envelope creation
Keyword: make_masks.cmds

For each molecule (with a unique segment name) a separate mask is
created and saved to a file.  Overlaps with other molecules as well as
with their symmetry equivalents are taken care of.  For details about
the parameters see "MAIN_MENU:dens_mod.txt".

As masks based on molecular models are precalculated and saved to
files, it may make sense to use some other directory to store them.
Let "main_config" create and modify the file "make_masks.cmds".
The following options of the density modification procedure affect the
macro:

    [atom/a]      mask atom maximal radius: 6.0
    [mask_dir/m]  directory for mask files: 


9.4.6. Density modification loop
Keyword: dm_loop.com

The "dm_loop.com" macro deals with the protein region of electron
density maps.

A whole averaging cycle is encoded in the "dm_loop.com" macro, which
you do not really wish to edit.  Normally "main_config" or the shortcut
"MAIN_CONF:menu_dens_mod.sh" should be used for its creation and
modification.

Averaging parts involve copying density into the mask,

 ! first copy density into mask of MOLA
 set vari FILE_MASK = mask_MOLA.xmap
 read file FILE_MASK map xpl over MAP_WORK
 make map MAP_WORK from MAP_FROM copy
 
adding rotated densities to the "MOLA" mask one by one,

 <mol_MOLA_to_MOLB.com
 <?MAIN_UTILS:add_a_map FILE_MASK MAP_ADD MAP_FROM MAP_WORK
 
rescaling of the resulting sum of the density maps and building
the unit cell using the symmetry operators.

 make map MAP_WORK rescale
 make map MAP_TO from MAP_WORK cell
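
The three steps above (copy, add, rescale) can be illustrated on a toy
example.  The Python sketch below is not how MAIN works internally: it
uses a one-dimensional periodic map, integer grid shifts in place of the
rotation and translation macros, and it assumes that rescaling means
dividing the summed density by the number of contributions.  All names
in it are hypothetical.

```python
def average_each(density, masks, shifts):
    """Toy version of strategy "EACH" on a 1-D periodic map.

    density -- list of density values on a periodic grid
    masks   -- dict: segment name -> list of grid indices inside its mask
    shifts  -- dict: (segment, other) -> integer grid shift mapping a point
               of 'segment' onto the equivalent point of 'other'
    """
    n = len(density)
    result = list(density)
    for segment, mask in masks.items():
        others = [name for name in masks if name != segment]
        for i in mask:
            total = density[i]                      # copy density into the mask
            for other in others:                    # add related densities one by one
                total += density[(i + shifts[(segment, other)]) % n]
            result[i] = total / (1 + len(others))   # rescale the sum
    return result


density = [0.0, 1.0, 3.0, 0.0, 0.0, 2.0, 2.0, 0.0]
masks = {"MOLA": [1, 2], "MOLB": [5, 6]}
shifts = {("MOLA", "MOLB"): 4, ("MOLB", "MOLA"): -4}
print(average_each(density, masks, shifts))
```

Grid points outside all masks keep their original values, and both
masks end up with the same averaged density.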

In the case of a large number of local repetitions of subunits (more
than five), it makes sense to average only one molecule and then
distribute the averaged density of that monomer to the others.
Choosing the strategy "ONE" results in a "dm_loop.com" that differs
from the "each" case in the distribution part:

 ! distribute MOLA density into mask of MOLB
 set vari FILE_MASK = mask_MOLB.xmap
 read file FILE_MASK map xpl over MAP_WORK
 make map MAP_WORK set 9000 100000 0.0
 <mol_MOLB_to_MOLA.com
 <?MAIN_UTILS:add_a_map FILE_MASK MAP_ADD MAP_TO MAP_WORK
 make map MAP_WORK rescale
 make map MAP_TO from MAP_WORK cell

The strategy "one" becomes computationally faster than the default
strategy "each" as the number of molecules within a group increases,
although it is less accurate because density that has already been
interpolated is interpolated a second time. I recommend using the
"one" method only within groups of more than 4 segments.
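
The difference in cost can be estimated by counting map rotations per
cycle.  The counts in the Python sketch below are an assumption derived
from the description in this chapter, not measured values: "each" adds
the densities of all other members into every mask, while "one" averages
into the first mask and then distributes the result to the remaining
members.  The function name "map_rotations" is hypothetical.

```python
def map_rotations(n_members, strategy):
    """Estimated number of map rotations per averaging cycle for one group."""
    if strategy == "EACH":
        # every member receives density from all the others
        return n_members * (n_members - 1)
    if strategy == "ONE":
        # average into the first member, then distribute to the rest
        return 2 * (n_members - 1)
    raise ValueError("unknown strategy: %s" % strategy)


for n in (2, 5, 60):
    print(n, map_rotations(n, "EACH"), map_rotations(n, "ONE"))
```

For a dimer both strategies cost the same under this estimate, so there
is nothing to gain from the second interpolation of strategy "one".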

9.5. Auxiliary macros for model building
Keyword: load_nmol.com

The "N_MOLECUL" menu block is described in "MAIN_MENU:nmol.txt"; the
"load_4mol.com" macro presented here is only an example.

9.5.1. Pick your working model

The idea of a working model is that during model building you modify
only a single molecule and distribute the changes to the others. So
you decide which molecule becomes your working model. The images of
the other molecules are displayed superimposed on the working model as
background images.

In this case there are only two possible working models, MOLA and
MOLB.  In more complicated cases their number may increase, and with it
the number of new menu items, but it is obvious that there are
limits.  It makes no sense to deal independently with 60 molecules
within an icosahedron.  However, "main_config" does not cover that
yet.

9.5.2. Create all related molecules 
Keyword: create_all_others.cmds

The "create_all_others.cmds" macro creates all related segments from
your current working model using the rotational and translational
parameters as last calculated by the "rms_fit.cmds" procedure.  The
"PUT_TO" and "GET_FROM" commands will only work for topologically
equivalent molecules, whereas this one simply enforces topological and
geometrical equivalence.

9.5.3. Distribute parts of your model from one to another segment

"PUT_TO" and "GET_FROM" are commands that transfer coordinates,
occupancies (weights) and temperature factors between equivalent parts
of the working model and the other molecules.  Equivalence is based on
sequence ID matching between the key "active" selection of the working
model and the other molecule.  "PUT" and "GET" indicate the direction
of the data transfer.

Coordinates are transferred on a basis of a RMS superposition between
the working and the other model.
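
The principle of such a transfer can be sketched in Python.  The example
below is not the MAIN implementation: the function names, the model
layout (sequence ID mapped to atom names and coordinates) and the
decision to update only residues present in both molecules are
assumptions, and occupancies and temperature factors are left out for
brevity.

```python
def transform(point, rotation, translation):
    """Apply a 3x3 rotation matrix and a translation vector to one point."""
    x, y, z = point
    return tuple(
        rotation[r][0] * x + rotation[r][1] * y + rotation[r][2] * z
        + translation[r]
        for r in range(3)
    )


def put_to(working, other, active_ids, rotation, translation):
    """Copy coordinates of the active residues from 'working' to 'other'.

    working, other -- dict: sequence ID -> dict of atom name -> (x, y, z)
    active_ids     -- sequence IDs selected in the working model
    Returns the sequence IDs that were actually updated.
    """
    updated = []
    for seq_id in active_ids:
        # equivalence is decided by sequence ID matching only
        if seq_id in working and seq_id in other:
            other[seq_id] = {
                atom: transform(xyz, rotation, translation)
                for atom, xyz in working[seq_id].items()
            }
            updated.append(seq_id)
    return updated
```

A transfer in the opposite direction ("GET_FROM") would use the same
matching with the inverse superposition.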

9.5.4. Map rotations
Keyword: rotate_maps.cmds

Rotates maps of equivalent molecules to the map of the current working
segment.

The macro "rotate_maps.cmds" is created on the fly at the moment when
a work segment is defined.

9.99. Advice for undescribed cases
Keyword: phase extension, proper symmetry, undescribed, unknown

If your case belongs here, study the files that you can get with
"main_config" and learn the MAIN programming language, in particular
the commands MAKE ("MAIN_COM:make.txt"), REFLECT
("MAIN_COM:reflect.txt"), FOURIER ("MAIN_COM:fourier.txt"), ENERGY
("MAIN_COM:energy.txt") and SELECT ("MAIN_COM:select.txt"), as well as
some others.

For dealing with several crystal forms see "MAIN_DOC:2crys/2crys.txt".

To use phase extension, modify the resolution limits by redefining the
reflection key "WORK_REFL" in "dm_next.cmds" and then invoke about 4
cycles in "dm_loop.com" after each resolution range.
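
A phase extension run can be planned as a simple schedule of resolution
limits.  The Python sketch below only produces such a list; the function
name "phase_extension_schedule", the linear step in Angstrom and the
example numbers are hypothetical, and redefining "WORK_REFL" for each
entry still has to be done in "dm_next.cmds".

```python
def phase_extension_schedule(start, final, step, cycles_per_range=4):
    """Return (high resolution limit, number of cycles) pairs.

    start, final -- first and last high resolution limit in Angstrom
    step         -- decrease of the limit between ranges, must be positive
    """
    if step <= 0:
        raise ValueError("step must be positive")
    schedule = []
    limit = start
    while limit > final:
        schedule.append((round(limit, 3), cycles_per_range))
        limit -= step
    schedule.append((final, cycles_per_range))  # always finish at the final limit
    return schedule


for limit, cycles in phase_extension_schedule(4.0, 3.0, 0.5):
    print("resolution limit %.2f A: %d cycles" % (limit, cycles))
```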

To use proper symmetry you have to provide your own "rot_tran"
parameter files and generate your own masks.

If you don't know how to solve your problem use E-mail.