One molecule in an asymmetric unit

This chapter is the elementary tutorial for MAIN. It gives you an overview of a working session. The objective is to build and rebuild a model, calculate its structure factors and maps, and refine and analyze (validate) it.

This chapter covers the case of one molecule in the asymmetric unit of a crystal, so it does not deal with local symmetry (Non-Crystallographic Symmetry or Non-Crystallographic Similarity, NCS). If your crystal has NCS, you are still advised to work through this basic chapter, because the chapters dealing with several molecules per asymmetric unit (MAIN_DOC:nmol/nmol.html) and with various crystal forms (MAIN_DOC:2crys/2crys.html) only address the features connected with NCS.

The section "Run main_config first" guides you through the creation of your initial MAIN macros. The section "The read.com file" presents a read.com file as usually used to start a MAIN session. The section "My first MAIN session" explains how to start, how to debug the first errors, and then how to do some elementary things such as displaying, modeling, refining and analyzing a structure using as many defaults as possible. The "config" scripts are meant to be self-explanatory. The section MAIN_DOC:config/config.html gives a few hints on the "main_config" utilities in general, and the section "Learning more about main_config" explains the organization and principles of the "main_config" utilities.

Run main_config first

At first, use most of the defaults that the "main_config" tools provide (MAIN_DOC:config/config.html). If you have Tk-perl installed, the GUI "tk_main_config.pl" provides an easy start and can be used instead of the command line tool "create_main_config.pl". It does, however, use defaults for the extraction of structure factor data, including the space group and resolution range. When you want to use other labels, the "use_...pl" scripts are the way to go.

tk_main_config.pl

Run


$ tk_main_config.pl

A graphics window with fields and options will appear. It contains four frames

Through these frames and additional menu items the whole MAIN session can be configured. Alternatively, the command line syntax can be used as described below.

Crystal forms data

First import the diffraction data file. The configure tool recognizes the format from the file extension:

When the default labels (the first column is taken) seem inappropriate, use "use_mtz.pl" or "use_cif.pl" to choose your own.

The "cif" files may not contain the unit cell constants, which are necessary for defining the resolution range of the data. In that case, import a PDB file first.
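To illustrate why the cell constants are needed for the resolution range, here is a minimal Python sketch (not part of MAIN) that computes the resolution (d-spacing) of a reflection from the six cell constants using the standard metric tensor relation. The function name d_spacing is hypothetical and chosen only for this illustration.

```python
import math

def d_spacing(cell, hkl):
    """Resolution d of reflection (h, k, l) for cell (a, b, c, alpha, beta, gamma).

    Lengths in Angstrom, angles in degrees. Uses 1/d^2 = h^T G^-1 h,
    where G is the direct-space metric tensor.
    """
    a, b, c, alpha, beta, gamma = cell
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    g = [[a * a, a * b * cg, a * c * cb],
         [a * b * cg, b * b, b * c * ca],
         [a * c * cb, b * c * ca, c * c]]
    det = (g[0][0] * (g[1][1] * g[2][2] - g[1][2] * g[2][1])
           - g[0][1] * (g[1][0] * g[2][2] - g[1][2] * g[2][0])
           + g[0][2] * (g[1][0] * g[2][1] - g[1][1] * g[2][0]))
    # Invert the 3x3 matrix via cofactors: inv[j][i] = cofactor(i, j) / det
    inv = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            r = [k for k in range(3) if k != i]
            s = [k for k in range(3) if k != j]
            minor = g[r[0]][s[0]] * g[r[1]][s[1]] - g[r[0]][s[1]] * g[r[1]][s[0]]
            inv[j][i] = (-1) ** (i + j) * minor / det
    inv_d2 = sum(hkl[i] * inv[i][j] * hkl[j] for i in range(3) for j in range(3))
    if inv_d2 <= 0.0:
        raise ValueError("d-spacing is undefined for reflection (0, 0, 0)")
    return 1.0 / math.sqrt(inv_d2)

# Cell constants in the cell.dat order: a b c alpha beta gamma
print(d_spacing((96.0, 120.0, 166.13, 90.0, 90.0, 90.0), (0, 0, 2)))
```

Without the cell constants there is no way to convert HKL indices into a resolution, which is why a file lacking them needs a second source such as a PDB file.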

SHELX files can be read into MAIN directly; however, the format must be defined in "create_read.pl" or with the "tk_read" button.

When you have an "mtz" or "sca" diffraction data file - use it:


 $ create_main_config.pl -u HamSe_i.mtz -m MOLA --doit

The "use_sca.pl", "use_mtz.pl" and "use_cif.pl" scripts assume several defaults. From the "scalepack" file, IOBS are transformed into FOBS, whereas from "mtz" files the first "F" record is accepted as FOBS, the first "P" record as the phase, the first "W" record as the weight, and the first HL coefficient labels are kept for later use in phase combination. The resolution, cell constants and space group are extracted too. (To change the defaults use "use_mtz.pl" and "create_read.pl" separately.)

Note that the macro of "use_mtz.pl" itself can only extract diffraction data from native binary files. For files from other computer architectures use the CCP4 scripts (see MAIN_DOC:config/config.html).


reading mtz file "HamSe_i.mtz"
  CELL    96.0000  120.0000  166.1300   90.0000   90.0000   90.0000
             CELL_DATA:   96.0000 120.0000 166.1300 90.0000 90.0000 90.0000
       FILE_MTZ_NATIVE:   OFF
              INIT_MAP:   NONE
              MTZ_FOBS:   F
               MTZ_HLA:
               MTZ_HLB:
               MTZ_HLC:
               MTZ_HLD:
             MTZ_PHASE:   P
             MTZ_SIGMA:   J
              MTZ_TEST:   FreeR_flag
            MTZ_WEIGHT:   W
            RESOLUTION:   83.05 2.50
           SPACE_GROUP:   c2_2_21

Labels of unused MTZ columns are printed here:


 UNUSED MTZ FILE COLUMNS:

Then the main working environment files are created in the current directory:


creating cell.dat
creating "c2_2_21.symm"
creating "mtz_2_main.com"
executing "mtz_2_main.com" > mtz_2_main.com.log
creating .main
creating read.com
creating save_file.cmds
creating re_image.cmds
creating symmetry.cmds
creating symmetry_ca.cmds
creating refine.cmds
creating refine_b.cmds
creating gen_solvent.com
creating make_masks.cmds
 ! EACH MASK crystal form 0 segment: MOLA from map: 1
creating make_masks.cmds
 ! EACH MASK crystal form 0 segment: MOLA from map: 1
creating "map_mask_score_map.cmds"
creating dm_prep.cmds
creating dm_next.cmds
creating dm_loop.com


 MAIN configuration files have now been generated.


Now type "mainps" to start your MAIN session or
 if you are not happy with defaults
adjust your input data using "menu_read.sh" and other scripts.

How to end

Save your structural data by clicking "SAVE_FILE" and then click "PROMPT" to exit the "image dialogue" mode and enter the "MAIN>" command mode.

You can type "QUIT" to end your MAIN session only from the MAIN command mode.

To enter the "image dialogue" mode from the "MAIN>" mode, type "ima dial".

Input data: files and parameters

All input files are read within the read.com macro. They are ASCII files and can therefore be edited with a text editor such as Emacs.

To extract data from an MTZ file, apply the "use_mtz.pl" tool if you haven't done so already. This will create a proper file with the unit cell, symmetry operations and diffraction data. MAIN also reads "SCALEPACK" (".sca") files as well as a variety of "SHELX" files. You can specify these formats with "create_read.pl".

Atom coordinate file

By default PDB files are read. The default file name is "SAVE_FILE.PDB". Note that only ATOM, HETATM and CRYST records are read. (Be careful with the CRYST records, as they may change your cell constants and space group.)


 ATOM      2  CA  SER      1     61.851  10.648  -2.722  1.00 27.00      MOLA

Cell constants file

The six numbers in cell.dat represent the unit cell lengths a, b, c and the angles alpha, beta and gamma:


 120.5 120.5 63.15 90. 90. 120.
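As a minimal sketch of how such a file can be checked outside MAIN, the following Python snippet parses the six numbers into named fields. The names UnitCell and parse_cell are hypothetical; it assumes that the file holds exactly six whitespace-separated numbers, as in the example above.

```python
from typing import NamedTuple

class UnitCell(NamedTuple):
    a: float
    b: float
    c: float
    alpha: float
    beta: float
    gamma: float

def parse_cell(text: str) -> UnitCell:
    """Parse 'a b c alpha beta gamma' (Angstrom, degrees) from the text of cell.dat."""
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError(f"expected 6 cell constants, found {len(values)}")
    return UnitCell(*values)

cell = parse_cell(" 120.5 120.5 63.15 90. 90. 120.")
print(cell.a, cell.c, cell.gamma)
```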

Diffraction data file

The default diffraction data file name is "diffract.dat". The default format is "MAIN", which assumes keyword-based records modeled on the old X-PLOR format.

For molecular replacement cases it is sufficient to read FOBS and corresponding INDEX parameters:


 INDEX    8     1   -20  FOBS  1262.150  SIGMA 344.3 PHA 132.23 WE 0.345 TEST 0
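The keyword-based layout of such a record can be illustrated with a short Python sketch. It is not MAIN's own reader: the function name parse_reflection is hypothetical, and the sketch assumes that INDEX is followed by three integers and every other keyword by a single number, as in the example record above.

```python
def parse_reflection(line):
    """Split one keyword-based reflection record into a dictionary."""
    tokens = line.split()
    record = {}
    i = 0
    while i < len(tokens):
        key = tokens[i].upper()
        if key == "INDEX":
            # INDEX carries the three Miller indices h, k, l
            record["INDEX"] = tuple(int(t) for t in tokens[i + 1:i + 4])
            i += 4
        else:
            # Assumption: every other keyword is followed by one number
            record[key] = float(tokens[i + 1])
            i += 2
    return record

rec = parse_reflection(
    " INDEX    8     1   -20  FOBS  1262.150  SIGMA 344.3 PHA 132.23 WE 0.345 TEST 0"
)
print(rec["INDEX"], rec["FOBS"], rec["TEST"])
```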

Use "use_mtz.pl" or CCP4 tools to change the column labels you want to extract. You can also tailor the local "main_2_mtz.com" or MAIN_UTILS:mtz_phase_2_main.com to your needs by adjusting file names and record labels.

For SCALEPACK (".sca") files the format identifier word "SCALEPACK" has to be set with "create_read.pl".

For SHELX (".php" and ".f2c") files the format identifier word "SHELX" has to be set with "create_read.pl".

Sequence information data file

The sequence file has no default name. Each segment requires its own entry with its own segment identifier. In the case of NCS, each repetition of the same molecule should get its own entry in the protein sequence file.

MAIN format

Each new segment starts with the "MOLECULE" keyword immediately followed by the protein sequence in the single letter code. The delimiter between molecules is a blank line. The end of the sequence information is given by the keyword "END" or simply by the end of the file.


 molecule SIG
 MGAGPSLLLA ALLLLLSGDG AVRC


 molecule PRO
 DTPANCTYLD LLGTWVFQVG SSGSQRDVNC SVMGPQEKKV VVYLQKLDTA
 YDDLGNSGHF TIIYNQGFEI VLNDYKWFAF FKYKEEGSKV TTYCNETMTG
 WVHDVLGRNW ACFTGKKVG


 molecule PRO2
 		     T ASENVYVNTA HLKNSQEKYS NRLYKYDHNF
 VKAINAIQK SWTATTYMEYE TLTLGDMIRR SGGHSRKIPR PKPAPLTAEI
 QQKILH


 end
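The layout of the example above can be sketched as a small Python parser. This is only an illustration, not MAIN's reader: the function name parse_main_sequences is hypothetical, and it assumes that the "molecule" keyword is followed by the segment identifier on the same line, that blanks and tabs inside the sequence are insignificant, and that a line holding only "end" terminates the input.

```python
def parse_main_sequences(text):
    """Return {segment_id: sequence} from MAIN-style sequence text."""
    sequences = {}
    current = None
    for raw in text.splitlines():
        words = raw.split()
        if not words:
            continue  # blank lines only separate the molecules
        keyword = words[0].lower()
        if keyword == "end" and len(words) == 1:
            break
        if keyword == "molecule":
            current = words[1]
            # Any residues on the same line after the identifier are kept
            sequences[current] = "".join(words[2:])
            continue
        if current is None:
            raise ValueError("sequence line found before any 'molecule' keyword")
        sequences[current] += "".join(words)
    return sequences

example = """
 molecule SIG
 MGAGPSLLLA ALLLLLSGDG AVRC

 end
"""
print(parse_main_sequences(example))
```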

FASTA format

Each new segment ID is extracted from the first four characters following the ">" character at the beginning of the line.
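A minimal Python sketch of this rule follows. The function name fasta_segment_ids is hypothetical, and stripping blanks from identifiers shorter than four characters is an assumption of this sketch, not a documented MAIN behaviour.

```python
def fasta_segment_ids(text):
    """Collect the segment IDs: up to four characters after each leading '>'."""
    ids = []
    for line in text.splitlines():
        if line.startswith(">"):
            # Assumption: surrounding blanks are not part of the identifier
            ids.append(line[1:5].strip())
    return ids

print(fasta_segment_ids(">PRO2 example header\nTASENVYVNTA\n>SIG\nMGAGPSLLLA\n"))
```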

The auto start file .main

The .main file automatically calls the read.com script and launches the session into the "image dialog" mode.


 -h|--help)     prints this message with available options and current status
  -c|--command)  define macro with MY COMMANDS [?MAIN_UTILS:my_commands.com]
  -i|--input)    set file name of the data input macro [read.com]
  -g|--groups)   define NCS groups by [NONE/CHAINS/SEGMENTS] [SEGMENTS]
  -o|--output)   set file name of the created macro [.main]
  -p|--pdb)      extract information from a PDB file []
  -x|--crystal)  PDB CRYSTAL record use [USE/IGNORE] [USE]
  -d|--depp)     depp pages [NEW/OLD] menu [NEW]
  -s|--display)  display position and sizes [0 0 1280 1024] menu []
     --doit)     create the macro

If you want to work in the old-fashioned way, turn off the "auto" mode in "create_main_config.pl" or remove the .main file.

Changing the read.com

create_read.pl


  -h|-help|--help) prints this message with available options and current status
  -a|--atom|--pdb) file with molecular model [SAVE_FILE.PDB]
  -b|--back)       file of background molecules []
  -c|--ctab)       file with molecular connectivity table [SAVE_FILE.CTAB]
  -u|--cell)       file with unit cell constants [cell.dat]
  -d|--diffr )     file with diffraction data [diffract.dat]
                             format [MAIN/SCALEPACK/SHELX] [MAIN]
  -i|--init )      initial maps to be calculated [NONE/MODEL/FOBS/COMB] [MODEL]
  -l|--limit|--hkl)  set HKL limits for diffraction data [0, 0, 0]
  -m|--map)        file names of additional maps to be read []
  -o|--output)     file name of the created macro [read.com]
  -q|--seq)        file name with sequence information []
  -r|--resol)      resolution limits for diffraction data [10.0 - 3.0]
  -s|--symm)       space group symmetry [r3_2]
  -t|--topo)       basic topology libraries []
                   more than one can be specified:
                   [ENGH-HUBER/PURY/PURY_ALLH/SUGAR/RCSB_NUC]
  -y|--my_topo)    additional topology libraries []
     --doit)     create the macro read.com

Space group

The "use_mtz.pl" script extracts the symmetry operators from the "mtz" file to your working directory. If you do not start with an "mtz" file, specify the space group yourself. The default space group is "p1". A change to "p21" requires the use of the keyword "-s" or "--symm".


 -s p21

Note that "p21_21_21" is the correct form of a space group definition and not "p212121".

Resolution

The "use_mtz.pl" takes the complete resolution range from the "mtz" file. Usually it makes sense to start with a lower resolution.

The default resolution limit for data from a non-mtz file is 3.0 A. To change it, define the new range by


 -r 20. 2.5

Diffraction data and formats

The default format is "MAIN" (the old X-PLOR format); Scalepack and SHELX (3 forms) are supported as well. The SHELX form is recognized from the file name itself.


  -d|--diffr )     file with diffraction data [diffract.dat]
                             format [MAIN/SCALEPACK/SHELX] [MAIN]
  -d diffract.dat MAIN

Diffraction data HKL limits

Monoclinic crystals, with the two-fold axis along K, usually also have negative H indices. To avoid reading the data twice, the HKL limits can also be set explicitly:


 -l -999 0 0

Initial map calculation

The initial map is calculated from the input phases; otherwise it is assumed that the atomic model is the source of your phase information. (Of course you can also read in maps instead of structure factors.)


  -i|--init )      initial maps to be calculated [NONE/MODEL/FOBS/COMB] [MODEL]


   -i model

External maps

Reading of external maps (in the XPLOR ASCII format) can be included in the read.com macro. A list of files follows the -m keyword:


 -m 2fo-fc_cns.xmap fo-fc_cns.xmap

Additional topology (for sugar rings)

Besides amino acids, other topologies can also be used. Some additional residue topologies, such as sugar rings, are already available.


  -t|--topo)       basic topology libraries []
                   more than one can be specified:
                   [ENGH-HUBER/PURY/PURY_ALLH/SUGAR/RCSB_NUC]


 -t SUGAR

Additional topology (for sugar rings)

A macro such as MAIN_UTILS:get_top_par_sugar.com is needed to read additional topology and parameter files.


 -y

Finish the read.com

Submitting read.com creation script


 create_read.pl
  -h|-help|--help) prints this message with available options and current status
  -a|--atom|--pdb) file with molecular model [SAVE_FILE.PDB]
  -b|--back)       file of background molecules []
  -g|--target)     file of target molecules []
  -u|--cell)       file with unit cell constants [cell.dat]
  -d|--diffr )     file with diffraction data [diffract.dat]
  -f|--format_dfr) diffraction data format [MAIN/SCALEPACK/SHELX] [MAIN]
  -i|--init )      initial maps to be calculated [NONE/MODEL/FOBS/COMB] [NONE]
  -l|--limit|--hkl)  set HKL limits for diffraction data [0, 0, 0]
  -m|--map)        file names of additional maps to be read []
  -o|--output)     file name of the created macro [read.com]
  -q|--seq)        file name with sequence information [sequence.dat]
  -r|--resol)      resolution limits for diffraction data [10.0 - 3.0]
  -s|--symm)       space group symmetry [p1]
  -t|--topo)       basic topology libraries more than one can be specified
                   [ENGH-HUBER/PURY_ALLH/SUGAR/RCSB_NUC]
                   current []
  -y|--my_topo)    additional topology libraries []
     --doit)     create the macro read.com


creating read.com

Understanding the read.com file

MAIN by itself is essentially empty. It knows neither data nor file names. The mission of the read.com macro is to enable MAIN to read in the data it is supposed to handle. It is not mandatory that a session starts with all listed files successfully read; MAIN cannot, however, use data it has not read.

An understanding of what is going on in a read.com file is not essential for a beginner; it may, however, help you to run MAIN more smoothly and with less frustration, especially if you see a flood of error messages at the beginning. It is wrong to ignore them!

The file starts with assigning file names to variables.


 set vari FILE_ATOM = SAVE_FILE.PDB
 set vari FILE_CTAB = SAVE_FILE.CTAB
set vari FILE_SYMM = MAIN_SYMM:p31_2_1.symm
 set vari FILE_FOBS = diffract.dat
 set vari FILE_CELL = cell.dat

Then comes resolution range.


 set vari RESOL_MIN global real = 35.81
 set vari RESOL_MAX global real = 2.40

The map variables help you keep control of the maps on a descriptive level. Usually two maps are sufficient. Some variable names, such as MAP_2FOFC, MAP_FOFC, MAP_FO and MAP_WORK, are coded into macros, so they should point to an existing map. Two or more of them can even point to the same map; however, you should then remember which command addresses which map.


 set vari MAP_2FOFC global = 1
 set vari MAP_FO global = 1
 set vari MAP_FOFC global = 2
 set vari MAP_WORK global = 2


 ! normally you do not edit below this line
 ! ==== end of variables data section  ==================

The next step is to load the topology and force field libraries. "CSD" stands for the Engh & Huber parameter set.


< ?MAIN_UTILS:get_top_par_19_csd.com

Read a file with atomic coordinates and immediately store the segment names of your structure in the character variable "WORK_SEGM". Many MAIN macros invoked by a menu click accept this variable as an argument, so do not forget to set it. You can redefine it at any point.


 read file FILE_ATOM coor xpl
 set vari WORK_SEGM char global segm sele all end

If you have worked with MAIN before, then you have probably saved your connectivity table. If no bonds have been read, the connectivity table is calculated from the interatomic distances. (Comment out the option you don't want.)


 read file FILE_CTAB ctable
 show bonds
if ( IRESULT_0 .le. 0 )<>utils/calculate.bonds WORK_SEGM

If the processing of the read.com macro has successfully reached this point, MAIN can already display your molecular model, regardless of the rest.

The next step is to enable energy calculations. After the connectivity table is created, the atom types and charges for nonbonding interactions can be set and the lists of bonding energy terms can be created. The "DEF_ALL" is a character variable, defined in a "get_top_par" file, that points to a command file. You can redefine it and use your own file. When using crystallographic constraints it makes sense to set the side chain charges to zero.


<DEF_ALL WORK_SEGM
 define init group
 set charge sele .not atom name C CA N H O .a \
     resi name ASP GLU LYS ARG end = 0.0

If you get a bunch of warning and error messages, look carefully at each of them. They essentially tell you which atom and residue names do not match the topology library descriptions and which force field parameters are missing. The simplest way to proceed is to go to depp page 7 and click on the "xpl2MAIN" menu item, which will probably fix most of the mismatches among the various standards. Then return to menu page 9 and click "DEFINE". If the errors are still present there are two possibilities: either some atoms are missing or you have problems with residues. The missing atoms can be inserted with the "FILL_ATO" item on depp page 7. (Be aware that atoms not belonging to a residue topology are deleted during this step.) For residues, you should check which additional topology and parameter library files you have to read in. (For further instructions see the section below "Extended read.com parameters" and the topology and parameter files chapter MAIN_DOC:top_par/top_par.html.)

WARNING: Without successfully defining atom types (CLASSES) and energy lists you will still be able to perform energy calculations on all parts of the structure; however, the guessed parameters may not be accurate enough.

Then the crystallographic data for each crystal form are read and the maps are created.

As soon as the program gets the data about the unit cell and symmetry operators, you can generate and display the symmetry related molecules.


 read file FILE_CELL cell
 read file FILE_SYMM symm

The file of reflection records should be processed correctly. Pay particular attention to the number of rejected reflections. Reflections may be rejected because of resolution, but they should not be rejected because of the lower HKL limits; the HKL limits are therefore important parameters. They depend on the symmetry group. Try to limit the reflection space by setting HKL limits as strict as is still meaningful. Only the limit of an index that takes negative values should be set to a sufficiently small value; all other HKL limits should be set to zero.

The HKL limits require some attention. The default setting here tries to make a guess; however, a second pass through the reflection file may be necessary to get the HKL limits correct.


 read file FILE_FOBS refl init limit 0 0 0 re_read friedel \
     reso RESOL_MIN RESOL_MAX

After the correct limits have been established and the file successfully read, it may make sense to replace the appropriate zeros with -999 in order to avoid reading the reflection file twice.

After reading, the "WORK_REFL" and "TEST" reflection keys are specified. If the TEST key does not exist, it is defined below by randomly taking one out of every 20 reflections.


 refl key WORK_REFL sele defined end
 set vari IRESULT_0 global = -1
 refl show key TEST
 if ( IRESULT_0 .le. 0 ) then
 refl key TEST sele random 20 1 end
 end_if

When the "TEST" set originates from an "mtz" file, it works properly only when the native form of "use_mtz.pl" has been used. When going through "mtz_2_main.com", the TEST key has to be inverted before "WORK_REFL" is defined.


 refl key TEST sele defined .and. .not. TEST end


 refl key WORK_REFL sele defined .a .not TEST end
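The relation between the TEST and WORK_REFL keys can be pictured with a short Python sketch. It is not MAIN code: the function name split_test_work is hypothetical, and it assumes that every reflection is flagged independently with a probability of 1/20, which may differ from the way MAIN's own random selection works.

```python
import random

def split_test_work(n_reflections, one_in=20, seed=0):
    """Return (test, work) lists of reflection indices.

    Assumption: each reflection enters the test set independently with
    probability 1/one_in.
    """
    rng = random.Random(seed)
    test = [i for i in range(n_reflections) if rng.random() < 1.0 / one_in]
    flagged = set(test)
    # The working set is everything that is defined and not in the test set
    work = [i for i in range(n_reflections) if i not in flagged]
    return test, work

test, work = split_test_work(20000)
print(len(test), len(work))
```

The two sets never overlap, so reflections in the test set stay out of the working set.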

In order to enable structure factor calculations, two whole unit cell maps are created. The number of grid points per unit cell length is chosen automatically from the resolution of your data and the unit cell length. You may wish, however, to supply your own values. In that case replace the command word AUTO_GRID by GRID and provide the 3 numbers. A grid step should be about 1/3 of the maximal resolution, which makes the number of grid points approximately equal to (cell length / RESOL_MAX) * 3. (The largest prime factor the FFT can handle is 19, so you should choose grid numbers that decompose into prime factors not exceeding 19.) These numbers of grid points are applied below in the generation of the two unit cell maps. The two maps will in general be enough for most of the map calculations you will perform.


 ! map section


 make map 1 from 0 init -9999 auto_grid cell real
 make map 2 from 1 init -9999
 ! copy map 2 3
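If you prefer to supply your own GRID numbers, the rule of thumb described above can be sketched in Python as follows. This is an illustration of the rule only and not the algorithm behind AUTO_GRID; the function names largest_prime_factor and grid_points are hypothetical.

```python
import math

def largest_prime_factor(n):
    """Largest prime factor of n (returns 1 for n == 1)."""
    largest = 1
    p = 2
    while p * p <= n:
        while n % p == 0:
            largest = p
            n //= p
        p += 1
    if n > 1:
        largest = n
    return largest

def grid_points(cell_length, resol_max, max_prime=19):
    """Smallest grid number >= 3 * cell_length / resol_max
    whose prime factors do not exceed max_prime."""
    n = max(1, math.ceil(3.0 * cell_length / resol_max))
    while largest_prime_factor(n) > max_prime:
        n += 1
    return n

# Cell lengths and resolution limit taken from the mtz example earlier in this chapter
for length in (96.0, 120.0, 166.13):
    print(length, grid_points(length, 2.5))
```

For the 96 A axis at 2.5 A resolution the first candidate, 116, contains the prime factor 29, so the sketch moves on to 117 (3 * 3 * 13).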

The initial SCALE for the DENSITY ENERGY is set to 200, which pulls your model into the density and also starts to introduce geometry distortions. You should keep adjusting the density scale during a modeling session: increase it to improve the convergence of a minimization run, and decrease it to bring the model geometry closer to ideal values.


 ener all on dens on dens map MAP_2FOFC dens scal 25.

Map calculation usually follows. This is the form for phases read from a file:


 ! map calculation
<?MAIN_CMDS:calc_fo_map.cmds MAP_FO

and these commands calculate phases from the model. Both are defined via the "menu_read.sh" and "create_read.pl" scripts and may also be guessed from the "mtz" file records.


<?MAIN_CMDS:re_phase.cmds WORK_SEGM MAP_FOFC
<?MAIN_CMDS:calc_2fofc_map.cmds MAP_2FOFC

At this stage an interactive model building session can be started. So the IMAGE is initialized (opened) and its center is set to the arithmetic mean of your atomic coordinates. The macro "model.view" is the result of a save command from your previous session. It contains the complete information about the last saved view and restores it. When it is not found, nothing happens.


 !      final part: opening display, unit cell, chain trace ...


 ima init cent calc
<model.view

At this point you may wish to load a Ramachandran plot on the screen. If so, remove the exclamation mark. You can do it also later by clicking "RAMACHAN" on menu page 7 (see MAIN_MENU:utils.html).


!<>utils/rama

The show.cell macro displays a part of the crystal lattice of your unit cell and the chain trace of your starting model in white, and stores the images in the background (FRODO MOL images are an equivalent).


<>utils/show.cell


 ima sele atom name CA C N end col 1 bond
 image from erase
 ima col 245 map 1 dens 0.8 16 16 16
 ima col  90 map 2 dens 2.0 16 16 16

Here we meet the re_image.cmds command used non-interactively. The question mark in front of the file name indicates that the file is first searched for locally; when it is not found, MAIN picks the file from the installation directory of the MAIN cmds files.


<?MAIN_CMDS:re_image.cmds WORK_SEGM
 return

My first MAIN session

A MAIN interactive session usually starts by typing "mainps". See local instructions when it is not so.

If you haven't yet learned how to use menu items, see MAIN_MENU:depp_pages.html before you proceed.

My first troubleshooting

Before you actually start working with your structure, it makes sense to be sure that the data have been read into MAIN correctly.

It is quite likely that you will get a bunch of error messages. If files are missing, it is best to start again. Quit the MAIN session by typing "Q" or "QUIT" (command words are not case sensitive). If you have entered the dialog mode, exit it first (click the menu item "PROMPT" or press "e" while the mouse is in the image window) to get back to the MAIN command prompt. Move or copy the files to the correct places and then start again. The "main_config" tools print out warnings about missing files, so do not disregard them.

If the list of error messages resulting from your read.com is too long, you most likely have problems with atom and residue name definitions. See the hints in the above section "The read.com file". A click on the "xpl2MAIN" menu item on the depp page "REN/SHOW" will resolve most of your problems. For special topologies (synthetic inhibitors) clicking will not help, so get PURY topology files from http://pury.ijs.si/.

If you don't see an image of your molecule, it is probably out of sight. Atom crosses indicate that the bonds were not calculated or that the atoms are too far apart. If the list of image objects is also missing (the image window on the right contains only two blue object names), then the coordinate file is either empty or was not read. Clicking the menu item "SHW_SEGM" on page "REN/SHOW" gives you an idea about that. Typing the commands


> show atom
> show segments
> show vari WORK_SEGM

is an alternative way of doing the same thing.

So, if none of the above conditions applies and you still see nothing, you should find the image. Try to center it on an atom (for more see the "IMAGE CENTER" command in the reference manual MAIN_COM:image.html).


> image center atom 1

or


> image center seq P65

The crystal lattice image is presented as blue lines. If the lattice appears smaller than your model, then your cell constants file is probably missing and MAIN defaults to a 1 A cell.

Check the symmetry operators by clicking an atom and displaying the images of the symmetry mates (click the "SYMMETRY" and "SYMM_CA" items on the main depp page). To get rid of the symmetry images click "RE_IMAGE". If the cell constants are OK and the symmetry image is missing or overlapping, the space group operators are either wrong or not read. It could also mean that the positioning of your molecular model in the crystal is wrong.

Diffraction data must be read properly. To get the diffraction data file into the right format see MAIN_DOC:intro/intro.html. If the data have not been read properly, you must think about the HKL limits (see "READ REFLECTION" in MAIN_COM:read.html).

If there is a sufficient number of model atoms, the next test is the R-value calculation. When the R-value is reasonable, it is safe to proceed with the structure determination tasks.
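For orientation, the conventional crystallographic R-value, R = sum(| |Fobs| - k |Fcalc| |) / sum(|Fobs|), can be sketched in Python as below. The function name r_value is hypothetical, and the single overall scale factor k = sum(Fobs) / sum(Fcalc) is a simplifying assumption; the scaling used by MAIN itself may be more elaborate.

```python
def r_value(fobs, fcalc):
    """Conventional R-value between observed and calculated amplitudes."""
    if len(fobs) != len(fcalc) or not fobs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    # Assumption: one overall scale factor brings Fcalc onto the scale of Fobs
    scale = sum(fobs) / sum(fcalc)
    numerator = sum(abs(fo - scale * fc) for fo, fc in zip(fobs, fcalc))
    return numerator / sum(fobs)

print(r_value([10.0, 20.0, 30.0], [5.0, 10.0, 15.0]))
```

Amplitudes that differ only by a constant factor give an R-value of zero; the more the model disagrees with the data, the larger the value becomes.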

Displaying a molecule and its symmetry mates

Molecular objects should be clearly presented and clearly differentiated from each other. By default all bonded atoms are presented as lines and the rest as crosses. Color is used to differentiate atomic occupancy: yellow atoms are the real ones and green atoms the dummies. A left mouse button click on "RE_IMAGE" (re_image.cmds) erases the images and redisplays the current model.

create_re_image.pl


 -h|--help)      prints this message with available options and current status
 -o|--output)    set file name of the created macro [re_image.cmds]
 -b|--bonds)     display bond type style [LINE/STICK] [LINE]
 -c|--color)     starting color index for molecules [160]
 -y|--hydrogens) display [ON/OFF] of hydrogen atoms [ON]
 -i|--increment) increment of next molecule color [5]
 -s|--style)     styles [WEIGHTS/TYPES/CHAINS] of molecular images [WEIGHTS]
    --doit)    creates the macro

A click with the right mouse button opens the GUI window for the re_image.cmds configuration.

You can manipulate the occupancy ("WEIGHT") manually with the "WEIG" items. "WEIGH_0" and "WEIGH_1" set the current weight variable to "0.0" or "1.0". Subsequent clicks on "WEIG_ATO", "WEIG_SID", "WEIG_RES" and "WEIG_ACT" set the crystallographic weight (occupancy) of the last clicked atom, of the side chain of an amino acid, of the whole clicked residue and of all atoms selected within the key active, respectively.

The symmetry mates are generated and displayed in a certain radius around the last clicked atom. By default they are shown in red. "SYMMETRY" (symmetry.cmds) displays all atoms in the vicinity and "SYMM_CA" (symmetry_ca.cmds) displays a CA plot showing crystal packing.

Several symmetry generation modes as well as other parameters are available:


 create_symmetry.pl
  -h|--help)    prints this message with available options and current status
  -o|--output)  set file name of the created macro [symmetry.cmds]
  -c|--color)   starting color index for molecules [105]
  -s|--step)    increment of next molecule color [5]
  -r|--radius)  symmetry generation in specified sphere radius
                around the last picked atom [30]
  -g|--gener)   symmetry generation of symmetry mates
                by [SPHERE/WHOLE/CLOSEST/EXACT]  current [SPHERE]
                SPHERE: within a sphere of the above specified radius
                CLOSEST: whole segments positioned closes to the last clicked atom
                WHOLE: whole segments positioned into the unit cell
                EXACT: segments are cut and shiftd to fit exactly the unit cell edges
     --doit)    create the macro

A click with the right mouse button opens the Tk_config window for the "symmetry...cmds" configuration.

For CA plots only the "SPHERE" generation mode of symmetry atoms is available:


 create_symmetry_ca.pl
  -h|--help)    prints this message with available options and current status
  -o|--output)  set file name of the created macro [symmetry_ca.cmds]
  -c|--color)   starting color index for molecules [105]
  -s|--step)    increment of next molecule color [5]
  -r|--radius)  sphere radius for symmetry atoms generation [60]
     --doit)    create the macro

A subsequent click on "RE_IMAGE" deletes all the symmetry mates and redraws the image of the molecule.

The following options of "main_config" affect the macros:


    [molec/m]      modifies the file containing the list of molecular segments
                   names and their total number:

See also MAIN_MENU:xray_build.html.

Map calculations

The "RE_PHASE" item (MAIN_CMDS:re_phase.cmds) calculates structure factors from your model and its R-value; subsequent clicks on the "2FO-FC_M" and "FO-FC_MA" items (MAIN_CMDS:calc_fofc_diff_map.cmds) calculate the respective maps (2Fobs - Fcalc, Fobs - Fcalc). The "RE_PHASE" step is unnecessary after a "REFINE" or "REFINE_B" run, as structure factors are calculated therein as well.

For more see MAIN_DOC:calc_map/calc_map.html and MAIN_MENU:map_atom.html.

Displaying electron density maps

When displaying a map for the first time you have to choose it ("MAP_ACT?"), specify its contour level ("MAP_CONT") and its box size in grid points away from the center ("MAP_BOX").

The middle of the screen is the implicit center for contouring the maps. The image center can be moved manually using the mouse or dials, or set to an atom. If you want to contour around some distant atom, click the atom first, then hit "CENTER" and "RE_MAP" or press "c" and "r" on the keyboard.

Maps have default colors; you can change them using dials. Click the color mode in the image window and then a map. A dial will be attached to the color of the clicked map.

For more see MAIN_MENU:image_map.html and MAIN_DOC:calc_map/calc_map.html.

Moving atoms around

The depp page contains the five elementary functions for dial or mouse driven manual modifications of the model geometry. "MOV_ATOM" moves the last clicked atom around. "MOV_RESI" moves (translates and rotates) the last clicked residue around, with the clicked atom serving as the center of rotation. In both cases no bonds need to be broken in order to manipulate the atoms. "ROT_TRAN", however, acts on the whole network of covalently bonded atoms. The last clicked atom is the center of rotation. If you want to move only a couple of residues around, you have to break the connections ("DELE_BON") to the rest of the structure.

You can rotate atoms about each bond in the chain between two clicked atoms using "RT_CHAIN". Each bond rotation gets a dial. You can load the rotations on top of "MOV_SELE" or "MOV_RESI" functions or other "RT_CHAIN" segments. You can rotate about the same bond in both directions, etc. "RT_BETWE" rotates the atoms that are part of the chain between the last two clicked atoms about the line connecting them. Use this function to flip peptide bonds.

"OBJ_ACCEP", "OB_REJEC" and "OB_START" accept the current geometry, reject the changes or return you to the initial stage without deleting the invoked dial functions governing the geometry changes.

The "DIAL_INI", "DIAL_SAV" and "DIAL_RST" items allow you to exchange dial definitions. "DIAL_RST" scrolls through the dial sets by calling one after another. "DIAL_INI" sets the default dial definitions, which control the image rotation, center, scale and clipping planes.

For more details about the items see MAIN_MENU:modeler.html and MAIN_MENU:dials.html.

Energy calculations and minimization

The "ENERGY" menu block provides switches for the energy terms used in a modeling session. The "SHOW_ENE" item calculates the energy of the "active" atomic selection against the "passive" atomic selection using the energy terms that are turned on. Besides the generally known energy terms, you can also use atomic pair distance constraints ("ENE_PAIR"), hydrogen bonds ("EN_H_BON") and a density map correlation term ("ENE_DENS").

The density term ("ENE_DENS") requires special attention. During a minimization run it pulls your model into the density, more or less strongly, and thereby distorts it to a corresponding degree. In a rescaled map (1.0 sigma of the map equals 1.0), an "ENERGY DENSITY SCALE" of less than 100. pulls your model only moderately into the density; it essentially prevents the other energy terms from pulling it out. Scales above 2000. may severely distort your model, although they also allow the model atoms to travel across longer distances during a minimization run. So change the "DENSITY SCALE" according to your needs and "MINIMIZE" against various maps during your working session:


> energy density map MAP_2FOFC density scale 100.

The "DEFINE" item invokes the "DEF_ALL" procedure and redefines the bonding lists and nonbonding atom types ("CLASSES"). Whenever you change the model topology, including the hydrogen bond pattern, you should redefine the energy terms.

After the keys "active" and "passive" have been defined, "MINIMIZE" will try to improve the geometry of your "active" atoms using the energy terms that are turned on. "UN_DO" can restore the atom positions to their state before the last minimization run. Only one level of "UN_DO" exists. If you want, however, you can create your own internal copies of the atomic model and save and retrieve geometry to and from them at your command.

Structure refinement

Menu items "REFINE" and "REFINE_B" invoke the macros refine.cmds and refine_b.cmds, respectively. Both files are created with "main_config". You can choose between the target functions, either "DENSITY" or "X-TARGET", and then click on "REFINE" or "REFINE_B" to refine positions or atomic B-factors, respectively.

The following options of "main_config" affect the macros only when you have local symmetry present (NCS):


    [molec/m]      modifies the file containing the list of molecular segments
                   names and their total number:
    [groups/g]     modifies group arrangements: averaging scripts are created
                   only for groups containing more than one segment

Parameters in both ".cmds" files are to be adjusted using a text editor. For detailed information see MAIN_MENU:map_atom.html and the chapter "Crystallographic refinement" (MAIN_DOC:refine/refine.html).

Defining 'active' and 'passive' keys

The two selection keys are arguments to quite a few MAIN menu functions, for example energy calculations and secondary structure manipulations.

Use "ACT_SEGM" if you want to work with a whole molecule. If you want to work with fragments, you first have to break the covalent bonds between them and the rest of the structure. "ACT_SEGM" selects all atoms that are part of the covalent bond network of the last clicked atom.

Use "ACT_2RES" if you want to work on a part of a chain, but keep its covalent bond attachments to the rest of the molecule. "ACT_2RES" selects all atoms between the two clicked residues in the key "active" and extends the key "passive" to all residues attached by a covalent bond to any of the active atoms.

When rebuilding only a few residues, "ACT_NEIGH" is probably the most desirable key definition for refining the local geometry. "ACT_NEIGH" selects in the key "active" all atoms that are within a certain number of steps through the covalent bond network away from the last clicked atom.
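The difference between selecting a whole covalent bond network and selecting only the atoms within a few bond steps can be illustrated with a breadth-first search over a bond list. The sketch below is a conceptual illustration in Python, not the way MAIN implements its keys; the function name, the atom labels and the bond list are hypothetical.

```python
from collections import deque

def select_bond_network(bonds, start, max_steps=None):
    """Return the atoms reachable from start through bonds.

    With max_steps=None the whole covalent network is returned;
    otherwise only atoms at most max_steps bonds away from start.
    """
    neighbours = {}
    for a, b in bonds:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    steps_to = {start: 0}          # atom -> number of bond steps from start
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        steps = steps_to[atom]
        if max_steps is not None and steps >= max_steps:
            continue               # do not walk further from this atom
        for nxt in neighbours.get(atom, ()):
            if nxt not in steps_to:
                steps_to[nxt] = steps + 1
                queue.append(nxt)
    return set(steps_to)

# Hypothetical toy model: one small bonded fragment and a separate bonded pair.
bonds = [("N", "CA"), ("CA", "C"), ("C", "O"), ("CA", "CB"), ("W1", "W2")]
whole_network = select_bond_network(bonds, "CA")            # all bonded atoms
two_steps = select_bond_network(bonds, "N", max_steps=2)    # local neighbourhood
```

In this toy model the separate pair is never reached from "CA", which mirrors why breaking a bond is enough to exclude a fragment from a network-wide selection.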

When "ADD_MODE" is off (empty square) the keys are redefined; when it is on (filled square) the selections are added (appended) to the current ones.

"FIX_ATOMS" allows you to fix all clicked atoms (it removes them from the key "active") so that a minimizer run cannot affect their positions.

For details see MAIN_MENU:nice_sel.html, MAIN_MENU:select.html and MAIN_COM:select.html.

Changing topology of your model

The topology of a model changes when the number of atoms or bonds changes.

The creative commands are part of the "BUILD" menu block. "START", "ATTACH", "INSERT" and "CHANGE" are flags which tell MAIN what kind of change will be invoked after you click the "RESID ??" item and respond with a list of residue names.

"START" mode requires no atomic argument (the history list may be empty). MAIN will start building a chain from the middle of the screen. The other three modes require the last clicked atom as a reference point. "ATTACH" builds a chain after the clicked residue, "INSERT" builds one before it and "CHANGE" changes the topology of a single residue only. "CHANGE" takes only a single residue from the "RESID ??" arguments.

When "START"-ing a new chain you should provide MAIN with a segment name (by clicking the item "SEGMENT") and a sequence root ("CHAIN ?"). A root A results in sequence names A1, A2 etc.
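The effect of the four build modes on the order of residues, and the way a sequence root turns into residue names, can be pictured with plain list operations. The Python sketch below is purely conceptual: it assumes a chain can be represented as a list of residue names indexed by position, which is a simplification made for this illustration and not a description of MAIN internals. The function names are hypothetical.

```python
def sequence_names(root, count, first=1):
    """Names built from a sequence root, e.g. root 'A' gives A1, A2, ..."""
    return [f"{root}{number}" for number in range(first, first + count)]

def build(chain, mode, residues, clicked=None):
    """Return a new residue list after one build operation (conceptual only).

    chain    -- current residue names in chain order
    mode     -- 'START', 'ATTACH', 'INSERT' or 'CHANGE'
    residues -- residue names given in response to the residue prompt
    clicked  -- index of the clicked residue (not needed for 'START')
    """
    if mode == "START":
        return list(residues)                      # a new chain, no reference atom
    if clicked is None:
        raise ValueError(f"{mode} needs a clicked residue as reference")
    if mode == "ATTACH":                           # new residues after the clicked one
        return chain[:clicked + 1] + list(residues) + chain[clicked + 1:]
    if mode == "INSERT":                           # new residues before the clicked one
        return chain[:clicked] + list(residues) + chain[clicked:]
    if mode == "CHANGE":                           # only the first given residue is used
        return chain[:clicked] + [residues[0]] + chain[clicked + 1:]
    raise ValueError(f"unknown mode: {mode}")

chain = ["ALA", "GLY", "SER"]
attached = build(chain, "ATTACH", ["TRP"], clicked=1)
inserted = build(chain, "INSERT", ["TRP"], clicked=1)
changed = build(chain, "CHANGE", ["TRP", "LYS"], clicked=1)
names = sequence_names("A", 3)
```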

For more see MAIN_MENU:build.html and MAIN_DOC:build/build.html.

After inserting ("INSERT") a chain you should attach its terminus to the subsequent residue by clicking the two atoms that should become covalently bound and then the item "CONNECT", positioned just below the "ROT_TRAN" item on page 9.

Destructive items delete the clicked atom ("DELE_ATO"), residue ("DELE_RES") or all the atoms selected within the key "active" ("DELE_ACT").

For more see MAIN_MENU:make_delete.html.

Using the secondary structure library

There was no room on Depp page 9 for the secondary structure menu block, so it is placed on page 8. The secondary structure commands have two modes, FORWARD and BACKWARD. When forward mode is active, the dihedral angles along the chain are adjusted forward (from the N to the C terminus of a polypeptide chain, for example). Adjustment of each dihedral rotates the rest of the attached structure about a bond, so you may need to break bonds if you don't want to move the attached residues.

Secondary structure motifs (helices, strands etc.) act on the group of atoms selected with the key "active", while the turns apply from the last clicked residue forwards or backwards along the chain.

The item "DIHEDRAL" is an exception. It requires four atoms as arguments and asks you for the dihedral value you want to set for these four atoms. The value is then applied to the whole "active" selection. This way you can also set secondary structure for polymers other than peptides, DNA and RNA chains for example.
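For readers who want to check a dihedral value outside of MAIN, the angle defined by four atoms can be computed from their coordinates with the standard atan2 formula. The following Python sketch is a generic textbook calculation, not MAIN code; the function names and the test coordinates are hypothetical, and the sign of the result follows from the formula as written rather than from any convention stated in this manual.

```python
import math

def _sub(a, b):
    return [x - y for x, y in zip(a, b)]

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]

def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points, in (-180, 180]."""
    b1 = _sub(p1, p0)            # first bond vector
    b2 = _sub(p2, p1)            # central bond, the rotation axis
    b3 = _sub(p3, p2)            # last bond vector
    n1 = _cross(b1, b2)          # normal of the plane p0-p1-p2
    n2 = _cross(b2, b3)          # normal of the plane p1-p2-p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates: the first and last atom on the same side
# of the central bond (cis) and on opposite sides (trans).
cis = dihedral_deg([1, 0, 0], [0, 0, 0], [0, 0, 1], [1, 0, 1])
trans = dihedral_deg([1, 0, 0], [0, 0, 0], [0, 0, 1], [-1, 0, 1])
```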

For more see MAIN_MENU:secondary.html.

Saving the current model to disk

The "SAVE" item invokes the save_file.cmds macro. If you change it, it can write out anything. Usually it is enough to "SAVE" the atomic coordinates, their connectivity table and the current view.

MAIN doesn't save anything automatically, so it is you who has to hit the item.

Solvent generation

Towards the end of refinement of a structure with sufficiently high resolution of diffraction data (beyond 2.8 A), solvent molecules are built into the peaks of the difference electron density map. For a description and explanation of the utility itself see MAIN_DOC:gen_solvent/gen_solvent.html.

Here only the way to adjust your local gen_solvent.com macro using "main_config" is described. So choose the gen_solvent option in the "main_config" menu.

For the first run the defaults should be fine. So just "go":

and then click the "GEN_SOLV" item on page 10.

See also MAIN_MENU:map_atom.html, the "GEN_SOLV" item.

Density modification

The default "main_config" procedure actually creates all the macros for density modification, however, it skips the details of the settings.

Density modification refers to tasks that modify electron density maps based on some prior knowledge. The major effort goes into separating the region occupied by (diffracting) macromolecular atoms from that of the solvent and then treating each separately from the other.

The scripts MAIN_CONF:create_rms_fit.pl, MAIN_CONF:create_make_masks.pl, MAIN_CONF:create_dm_prep.pl, MAIN_CONF:create_dm_next.pl and MAIN_CONF:create_dm_loop.pl create the necessary macros.

These files contain a cyclic density modification procedure. They are invoked via menu block "DENS_MOD" items (MAIN_MENU:dens_mod.html).

For details see MAIN_MENU:nmol.html.

Molecular envelope creation from atomic model

When an atomic model is at least partially available, it can provide the best possible mask over the molecular region.

Let "main_config" create and modify the file make_masks.cmds. The following options of the density modification procedure affect the macro:


    [atom/a]      mask atom maximal radius: 6.0
    [mask_dir/m]  directory for mask files:

Molecular envelope creation from electron density

When the molecular model is more or less complete, we can accept all the defaults. In this case, with one molecule in the asymmetric unit, this actually means that the density modification procedure is unnecessary. Quite often, however, substantial parts of a molecular model (or even all atoms in an MIR case) are missing, so it makes sense to adjust the parameters as well as the density modification procedure to each particular case and to optimize them by trial.

During each density modification cycle a procedure recalculates the distribution of electron density by statistical means and demarcates the protein region from the solvent region.

Here the use of the "solomon" method with its default radius is demonstrated. See MAIN_DOC:1mol/1mol.html, MAIN_MENU:dens_mod.html and MAIN_MENU:map_mask.html for instructions on how to choose a proper solvent ratio.

Treatment of solvent region

Here two approaches to solvent region modification are mentioned: the region can be "flattened" or "flipped". (For other ways you are referred to the command syntax manual MAIN_COM:make.html.)

A zero flip makes the solvent flat, whereas a nonzero number flips the electron density of the solvent region around its average value. Flipping requires good envelopes and good flipping factors, otherwise it will only tear your density apart. Try various values; 0.4 is a good starting point.


 enter your choice - f


 flip calculated from solvent content flip = value / (1 - value)
 type new flip value - current [0.0]=0.4

The flipping factor is applied in re_fft_map.com.
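The relation quoted in the prompt, flip = value / (1 - value), and the idea of flipping solvent density around its average can be made concrete with a small worked example in Python. This is a sketch under stated assumptions, not the contents of re_fft_map.com: it assumes that each solvent grid point is replaced by mean - flip * (density - mean), a common way of writing solvent flipping that reduces to flattening when the flip is zero. The function names and the toy density values are hypothetical.

```python
def flip_factor(value):
    """Flip factor from the entered value: flip = value / (1 - value)."""
    if not 0.0 <= value < 1.0:
        raise ValueError("value must be in the range [0, 1)")
    return value / (1.0 - value)

def modify_solvent(density, in_solvent, flip):
    """Flatten (flip == 0) or flip solvent density around its average.

    ASSUMPTION: solvent points become mean - flip * (rho - mean);
    points outside the solvent mask are left untouched.
    """
    solvent = [rho for rho, is_solvent in zip(density, in_solvent) if is_solvent]
    mean = sum(solvent) / len(solvent)
    return [
        mean - flip * (rho - mean) if is_solvent else rho
        for rho, is_solvent in zip(density, in_solvent)
    ]

# Toy one-dimensional "map": one protein point followed by three solvent points.
density = [1.0, 0.2, 0.4, 0.6]
in_solvent = [False, True, True, True]

flattened = modify_solvent(density, in_solvent, flip_factor(0.0))
flipped = modify_solvent(density, in_solvent, flip_factor(0.4))
```

With an entered value of 0.4 the relation gives a flip factor of 0.4 / 0.6, roughly 0.67, so under the assumed formula deviations from the solvent average are inverted and damped by about a third.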

Preparation step for density modification cycles

Bear in mind that without a starting map there will be nothing to modify. Here you have to decide where to start from (your molecular model, a file containing experimental phases or the current map 1).

create_dm_prep.pl


  -h|--help)    prints this message with available options and current status
  -o|--output)  set file name of the created macro [dm_prep.cmds]
  -p|--phases)  using [MODEL/FILE/MAP1] phases for map calculation [MODEL]
  -m|--map_in)  input map [3FO2FC/2FOFC/FOFC/COMB] (valid for MODEL phases only) [2FOFC]
     --doit)    create the macro

After successful creation, map 1 is copied into map 2, which will be used during averaging cycles and will at the end of each cycle contain the modified map resulting from the final Fourier transformation.

The macro is invoked by a click on "DM_PREP". See also MAIN_MENU:dens_mod.html.

Density modification cycles

Here you actually only specify the number of density modification cycles. If you only want to see the density in the masked area, set the cycle counter to 0.


 create_dm_next.pl
  -h|--help)    prints this message with available options and current status
  -o|--output)  set file name of the created macro [dm_next.cmds]
  -e|--extend)  phase extension range [ - ]
  -c|--cycles)  number of full density modification cycles at given resolution [1]
                0 exits before re_fft
  -s|--step)    steps of phase extensions for
                a single reciprocal space lattice point [2]

A click on "DM_NEXT" invokes the macro dm_next.cmds. See also MAIN_MENU:dens_mod.html.

Density modification loop


 create_dm_loop.pl
  -h|--help       prints this message with available options and current status
  -o|--output     set file name of the created macro [dm_loop.com]
  -b|--bval       mask function spread (B-value) [1.0]
  -c|--content    solvent content [0.5]
  -x|--expression expression form of protein mask [LINEAR]
  -p|--procedure  solvent flattening procedure [WANG/SOLOMON] [WANG]
  -m|--map_out    output kind of map [FOBS/3FO2FC/2FOFC/COMB] [2FOFC]
  -f|--flip)      solvent flip (flip = value / (1 - value)) [0.0]
  -s|--sphere     solvent flattening sphere radius [10.0]
  -t|--treat      mask generation [ATOMS/STATISTICS/BOTH] [NONE]
  -w|--way        solvent flattening way [REAL_SPACE/FFT] [FFT]
     --doit       create the macro

The dm_loop.com macro deals with the protein region of electron density maps. Besides using the explicit masks derived from your model ("ATOMS", as created by a click on "MAK_MASK"), you have to decide which statistical method to use. Choose "STATISTICS" for statistical approaches based on the "WANG" or "SOLOMON" procedures. Use "BOTH" for deriving both "ATOMS" and "STATISTICS" envelopes.

Use of phase combination

In the density modification menu of "main_config" there is, among others, also an option to include phase combination scripts in the "input" and "output" map calculations:


  -m|--map_in)  input map [3FO2FC/2FOFC/FOFC/COMB] (valid for MODEL phases only) [COMB]
  -m|--map_out    output kind of map [FOBS/3FO2FC/2FOFC/COMB] [COMB]

An interface to the CCP4 "sigma_A" program is provided. If you use something else, you should replace the "main_2_sigmaa_2_main" shell script with one suited to the other program. If you have done so, please send me an example so that it can be included in future releases.

You can, however, also invoke phase combination independently of a density modification procedure, via the "PHAS_CMB" item. If you don't apply any density modification procedure, you should create the shell script explicitly via the "main_config" menu:

create_phase_comb.pl


  -h|--help)    prints this message with available options and current status
  -a|--hlabcd)  HL ABCD coefficients [HLA HLB HLC HLD]
  -m|--mtz)     mtz file name with HL ABCD coefficients [eden.mtz]
  -f|--fobs)    fobs LABEL from mtz file [F2]
  -p|--phase)   phase on fobs from mtz file [PHIB]
  -r|--resol)   set resolution limits for diffraction data [35.81 - 2.40]
  -s|--sigma)   sigma fobs LABEL from mtz file [SIGF2]
  -w|--weigh)   figure of merrit (weight) on fobs from mtz file [FOM]
  -o|--output)  set file name of the created macro [phase_comb.cmds]
     --doit)    create the macro(s)

Undo

There are three UNDO levels. With the first one, atomic coordinates can be retrieved from the internal backup set. Either the menu item "UNDO" on pages "BLD_MAIN" and "BLD_RESI" or the keyboard shortcut "u" will invoke it. "UNDO" exchanges coordinates between the saved and the working set. Each time coordinates are modified by a geometry modification command, including model building, energy minimization or kicking, the atomic coordinates are copied from the working set into the saved set. If the number of atoms changes, the saved coordinates are lost.
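The behaviour of this first level (one backup set, exchanged with the working set on every undo, and discarded when the atom count changes) can be modelled in a few lines. The Python class below is a conceptual sketch built only from the description above; the class and method names are hypothetical and do not correspond to MAIN commands.

```python
class CoordinateUndo:
    """One-level undo that swaps a saved and a working coordinate set."""

    def __init__(self, coords):
        self.working = [tuple(xyz) for xyz in coords]
        self.saved = None                    # no backup until the first change

    def modify(self, new_coords):
        """Apply a geometry change, backing up the working set first."""
        new_coords = [tuple(xyz) for xyz in new_coords]
        if len(new_coords) != len(self.working):
            self.saved = None                # atom count changed: backup is lost
        else:
            self.saved = self.working        # copy working set into saved set
        self.working = new_coords

    def undo(self):
        """Exchange saved and working sets; return False if nothing is saved."""
        if self.saved is None:
            return False
        self.working, self.saved = self.saved, self.working
        return True

model = CoordinateUndo([(0.0, 0.0, 0.0)])
model.modify([(1.0, 0.0, 0.0)])              # e.g. after moving an atom
model.undo()                                 # back to the original position
after_first_undo = list(model.working)
model.undo()                                 # a second undo restores the change
after_second_undo = list(model.working)
```

Because the two sets are exchanged rather than overwritten, a second undo acts as a redo in this model.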

The second level: it is possible to reread the atomic coordinates from the input pdb file by clicking "REST_FIL".

The third level is the "input.cop" file, in which all invoked commands are stored. Each session starts with the "!date" record. The desired commands are to be copied from the "input.cop" file into a macro, which is then either invoked from the MAIN prompt or inserted at the end of the ".main" file before the "return" line.