9. Use of non-crystallographic similarity (NCS)

Keyword: local symmetry, density averaging, non-crystallographic similarity, NCS

Multiplicity of equivalent molecules in the asymmetric unit of a crystal is a blessing for crystal structure determination. In such cases the electron density defining the molecular structures can be averaged, and the atomic positions and B-values of equivalent atoms can be subjected to additional constraints (NCS, Non-Crystallographic-Similarity, constraints). Density averaging reduces model bias and makes it possible to interpret electron density maps with rather poor initial phase information, as well as to solve structures from data with low completeness.

Before continuing with this chapter you are advised to become acquainted with the chapter "1 molecule case" ("MAIN_DOC:1mol/1mol.txt"), which describes the other MAIN tools that also underlie cases with multiple equivalent subunits.

There are a few assumptions in this chapter and in the macros mentioned herein that you should bear in mind:

- each subunit should have its own segment name,
- all equivalent subunits should be included in the same group,
- the residue names and sequence IDs must be the same throughout all equivalent molecules.

9.1. Run main_config first

Keyword: main_config

The cathepsin L - p41 fragment complex (Guncar et al., 1999, EMBO J. 18, 793-803) serves to explain the MAIN philosophy and most of the tools dealing with non-crystallographic similarity. The complete case with all data and macros is available in "../cases/catl_p41". The macros shown are all configured and created using the "main_config" utilities.

    > create_main_config.pl -m MOLA MOLB IA IB -g "MOLA MOLB" "IA IB" --doit
    creating ".main"
    creating "read.com"
    creating "save_file.cmds"
    creating "re_image.cmds"
    creating "symmetry.cmds"
    creating "symmetry_ca.cmds"
    creating "refine.cmds"
    creating "refine_b.cmds"
    creating "gen_solvent.com"
    creating "make_masks.cmds"
    creating "dm_prep.cmds"
    creating "dm_next.cmds"
    creating "dm_loop.com"
    1 strategy EACH group MOLA MOLB
    2 strategy EACH group IA IB
    creating "load_4mol.com"
    creating "rms_fit.cmds"
    creating "create_all_others.cmds"

    MAIN configuration files have now been generated.
    Now type "mainps" to start your MAIN session
    or if you are not happy with defaults see the chapter below
    adjust your input data using "menu_read.sh" and other scripts.

The macros are now created and an interactive MAIN session can be started by invoking MAIN (use "mainps"):

    > mainps

9.2. Non-default setup

This case has two enzyme molecules with segment names "MOLA" and "MOLB" and two inhibitors attached to them with segment names "IA" and "IB".

9.2.2. Changing the averaging strategy

Keyword: density averaging strategy, EACH, ONE, LINK, WHOLE

Currently there are two strategies available in MAIN by default, "EACH" and "ONE"; MAIN switches between them according to the number of group members (see below).

Typing "create_main_config.pl" without any parameters writes the current main configuration setup, as do the other configuration scripts (the "create_*.pl" Perl scripts and shell scripts such as "menu_dens_mod.sh"):

    create_main_config.pl
      -g|--group )    defines NCS groups: specify the list of segments belonging to groups
                      for each group embrace the list in " "
                          1 EACH | MOLA MOLB
                          2 EACH | IA IB
      -s|--strategy)  defines averaging scripts strategy - specify parameters
                      for each group separately [1 EACH]
                          EACH:  average each molecule separately (default)
                          ONE:   average one and distribute the averaged density to others
                          LINK:  use operators from another group [ 2 LINK 1 ]
                          WHOLE: the whole group builds a single mask for proper (spherical) symmetry

Strategy "EACH" means that the density for each member of the group is calculated as the average of all the others. Choosing "ONE" means that the density is averaged only for the first member of each group and then distributed to all equivalent ones by map rotation and translation. "LINK" uses the geometrical operators from the linked group, and "WHOLE" does the averaging within a single mask defined by all members of the group.

By default the strategy "EACH" is applied as long as the number of group members is smaller than 5, whereas "ONE" is chosen for larger groups.

For example, to choose "ONE" for the first group and "EACH" for the second, type the group number followed by the strategy after the strategy specifier "-s":

    create_main_config.pl -s 1 ONE 2 EACH

9.2.3. Group inter-dependency

Keyword: NCS groups

When an inhibitor or cofactor is attached to an enzyme or to some other part of a well-known structure, there is no need to change the well-defined part each time you actually want to update the ligand part only.
The "LINK" strategy will cover this in the future; for now, however, one still needs to do some typing explicitly. The ligand is presumably bound to the enzyme in the same orientation in all equivalent complexes, so it makes sense to use the superposition parameters from the enzymatic parts rather than to calculate them from (sometimes even partial) ligand models. So one can edit the created "rms_fit.cmds" file and simply copy "mol_MOLA_to_MOLB.com" into "mol_IA_to_IB.com", and correspondingly create the file "mol_IB_to_IA.com". (One can also change the "rms_fit.cmds" macro so that it writes the inhibitor superposition files "mol_IB_to_IA.com" and "mol_IA_to_IB.com" based on the superpositions of "MOLA" and "MOLB".)

9.3. Summary of differences between 1mol and nmol case

The creation procedure is essentially the same as in the elementary real case chapter ("MAIN_DOC:1mol/1mol.txt"); however, several macros differ and some additional ones are created. The differences reflect the relations between the topologically identical segments "MOLA" and "MOLB".

- The "read.com" macro contains a line which loads an additional menu block ("load_4mol.com").
- The "refine.cmds" and "refine_b.cmds" macros contain code defining NCS constraints between different molecules.
- The "load_4mol.com" macro is a new file. It loads the additional menu block "N_MOLECUL" onto page 7 (see also "MAIN_MENU:nmol.txt"). It contains items that assist in model building of topologically equivalent molecules.
- The "rms_fit.cmds" macro has been created to calculate superposition parameters and save the results in macros for later use.
- "dm_loop.com" now contains density averaging.
- The "create_all_others.cmds" macro is the only building macro created in advance. Other macros are created at run time. For details see the description of the menu block "N_MOLECUL" in "MAIN_MENU:nmol.txt".

9.4. Startup file for N molecules

Keyword: read.com

There is only one difference between the 1 molecule startup file ("MAIN_DOC:1mol/read.com") and this one ("read.com"): in order to enable manipulation of several topologically equivalent molecules, an additional menu block ("load_4mol.com") is loaded.

9.5. Different segments have different colors

Keyword: re_image.cmds, symmetry.cmds, symmetry_ca.cmds

The "re_image.cmds" macro displays all molecules. Crystallographically weighted atoms are colored close to yellow, whereas the ones with occupancy 0.0 are colored green. The symmetry mates ("symmetry.cmds" and "symmetry_ca.cmds") are also colored differently: each different segment gets a new color close to red, regardless of the crystallographic weight of the atoms.

9.6. Refining with NCS

Keyword: refine.cmds, refine_b.cmds

Refining with NCS constraints helps. It can also improve your parameters (the geometrical superposition of maps) for electron density averaging. Here the NCS constraints are applied to atomic positions. It is important, especially towards the end of refinement, that you recognize which parts of the various molecules differ and exclude them from the NCS constraints. Their sequence IDs are to be included in the selection key "out". The "refine.cmds" macro must therefore be modified using a text editor. The NCS groups are defined only once; B-factor refinement ("refine_b.cmds") simply uses the keys defined for positional refinement. The most appropriate way to select the residues is to use the SEQUENCE keyword.

In order to use an equivalent selection of residues throughout the session in the macros "rms_fit.cmds", "refine.cmds" and "refine_b.cmds", it is advisable to write the "define_key_out.com" macro:

    key out sele .not all end
    ! include all the non equivalent residues here
    ! key out sele ( seq x1 x2 .or seq X7 .or seq 167 : 190 ) \
    ! end
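
When the list of non-equivalent residues grows, the "key out" line can be generated rather than typed by hand. The following is only a minimal sketch in Python, not part of MAIN. The function name "build_key_out" is hypothetical, and the selection syntax it emits is an assumption copied from the commented template above: "seq N" for a single residue, "seq N : M" for a range, terms joined with ".or". Check the result against the MAIN selection documentation before using it.

```python
from typing import Iterable, Tuple, Union

# A residue entry is either one sequence ID or a (first, last) range.
Residues = Union[int, Tuple[int, int]]


def build_key_out(non_equivalent: Iterable[Residues]) -> str:
    """Return the text of a 'key out' definition.

    ASSUMPTION: the syntax mirrors the commented template in this chapter
    ("seq N", "seq N : M", terms joined by ".or"); it has not been checked
    against the MAIN parser.
    """
    terms = []
    for item in non_equivalent:
        if isinstance(item, tuple):
            first, last = item
            terms.append(f"seq {first} : {last}")
        else:
            terms.append(f"seq {item}")
    if not terms:
        # No residues differ: keep the empty selection from the template.
        return "key out sele .not all end\n"
    return "key out sele ( " + " .or ".join(terms) + " ) end\n"


if __name__ == "__main__":
    # Hypothetical residue list, for illustration only.
    print(build_key_out([7, (167, 190)]), end="")
```

The printed line can be saved as "define_key_out.com", so that "rms_fit.cmds", "refine.cmds" and "refine_b.cmds" all share the same selection.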