3.3.1. Cathepsin-B

The lysosomal cysteine proteinases play an important role in
intracellular protein degradation (see Barrett et al., 1988). Of these
proteinases, cathepsin-B is the most abundant and the most thoroughly
studied. Besides its involvement in intracellular protein turnover, it
has been implicated in tumor metastasis and in other disease states.
Cathepsin-B exhibits optimal activity in slightly acidic media and is
irreversibly inactivated at alkaline pH values.  It acts as an
endopeptidase with relatively broad specificity and has a slight
preference for basic residues or phenylalanine at P2 (using the
nomenclature of Schechter and Berger, 1967).  Bulky side chains at P1
are disfavoured (see Shaw et al., 1990).  A remarkable feature of
cathepsin-B is its distinctive peptidyl dipeptidase activity at the
carboxy terminus (Aronson and Barrett, 1978; Bond and Barrett, 1980;
Takahashi et al., 1986; Polgar and Csoma, 1987).  Cathepsin-B is
inhibited by typical protein inhibitors of cysteine proteinases such as
cystatins and stefins (see Biol. Chem. Hoppe-Seyler 371, Suppl.).

The complete amino acid sequences of rat (Takio et al., 1983), human
(Ritonja et al., 1985) and bovine (Meloun et al., 1988) cathepsin-B
and the partial sequence of the porcine (Takahashi et al., 1984)
cathepsin-B have been communicated.  According to the nucleotide
sequences (Chan et al., 1986; Fong et al., 1986; Ferrara et al.,
1990), cathepsin-B from human, rat or mouse is synthesized as a
polypeptide chain of 339 amino acid residues, which is processed to the
mature single-chain molecule of 254 amino acid residues. In mammalian
tissues, most of the active cathepsin-B is found as a two-chain
molecule consisting of 47 (or 49) and 205 (or 204) residue polypeptide
chains (light and heavy chain) covalently cross-linked by a disulfide
bridge.

The cathepsin-B sequence indicates a close structural homology with
the plant proteinase papain (Takio et al., 1983).  Comparisons of
the sequences of cathepsins -L and -H with those of papain and
actinidin resulted in alignment proposals for cathepsin-B (Takio et
al., 1983; Kamphuis et al., 1985).  Based on the 3-dimensional
structures of papain (Kamphuis et al., 1984) and actinidin (Baker,
1980), the common structural features as well as sites of insertions
and deletions were made more precise (Kamphuis et al., 1985; Baker &
Drenth, 1987). Cathepsin-B is considerably larger than papain or actinidin,
and the accommodation of some of the longer polypeptide insertions and
the arrangement of the active site residues remained unclear.
A clear understanding of its specificity and of its catalytic
properties requires an experimental structure as
provided by X-ray crystallography.

Human and rat liver cathepsin-B are the first crystallographically
determined structures of lysosomal cysteine proteinases (Musil et al.,
1991; Zucic et al., 1992).  They are structurally related to the plant
cysteine proteinases papain and actinidin.  The monoclinic
crystals of both proteins had P21 symmetry but quite different
cell constants (human cathepsin-B: a= 86.23A, b= 34.16A, c= 85.56A,
$\beta$= 102.9o; rat cathepsin-B: a= 59.98A, b= 128.01A, c= 59.12A,
$\beta$= 121.47o).  Human cathepsin-B crystallized with two
molecules per asymmetric unit in a quasi-tetragonal form and rat
cathepsin-B with three molecules per asymmetric unit in an almost
perfectly hexagonal form (FIGURES 3.3.1.1 and 3.3.1.2).
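
The cell constants quoted above can be checked numerically.  The
following minimal sketch computes the monoclinic cell volume
V = a b c sin($\beta$) and measures how far the rat cell is from an
ideal hexagonal metric (a = c, $\beta$ = 120o).  The function names
are illustrative only and are not part of any program mentioned in
the text.

```python
import math

def monoclinic_volume(a, b, c, beta_deg):
    """Volume of a monoclinic cell in cubic Angstrom: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))

def hexagonal_deviation(a, c, beta_deg):
    """Relative a/c mismatch and deviation of beta from 120 degrees."""
    return abs(a - c) / max(a, c), abs(beta_deg - 120.0)

# Cell constants as quoted in the text.
human = dict(a=86.23, b=34.16, c=85.56, beta_deg=102.9)
rat = dict(a=59.98, b=128.01, c=59.12, beta_deg=121.47)

vol_human = monoclinic_volume(**human)
vol_rat = monoclinic_volume(**rat)
axis_mismatch, angle_mismatch = hexagonal_deviation(rat["a"], rat["c"], rat["beta_deg"])
```

For the rat cell the a and c axes differ by less than 2% and $\beta$
is within 1.5o of 120o, which is consistent with the description of an
almost hexagonal arrangement.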

The structures were solved by molecular replacement, using
a molecular model based on the papain and actinidin structures.  It is
worth mentioning that the approximate position of the second molecule in the
human cathepsin-B crystals was deduced from heavy atom positions.

We first tried to refine the structure of human cathepsin-B.  The
electron density inside the mask of each separate molecule (FIGURE
3.3.1.3) was averaged by applying the improper symmetry operations
obtained by superposition of the molecular models.  The cyclic averaging
procedure (including iterative Fourier transformations of the density
into structure factors and back) was not applied, since we assumed that
averaging over two molecules would not suffice to improve the phases.
Unfortunately, the model could not be refined below an R-factor of
0.30.

Nevertheless, the current model of human cathepsin-B was used to solve
the structure of rat cathepsin-B.  After a successful rotational and
translational search, the models were crystallographically refined and
the residues adjusted to the rat cathepsin-B sequence.  The resulting
electron density was averaged over all three molecules within a cyclic
procedure, the CHAR_LONG procedure described in detail in APPENDIX C.
In the resulting electron density map the loop 129-140 could be traced
immediately (FIGURES 3.3.1.4 and 3.3.1.5).  Afterwards, electron
density averaging was continued for as long as the molecular models
did not start to diverge from each other during the course of
refinement.
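
The idea of a cyclic averaging procedure can be illustrated with a toy
example.  The sketch below is not the CHAR_LONG procedure of APPENDIX
C; it is a one-dimensional illustration under simplifying assumptions:
two copies of a molecule related by a pure translation on a periodic
grid, fixed "observed" amplitudes, and no scaling or phase weighting.
All names are hypothetical.

```python
import numpy as np

def average_copies(rho, mask_a, shift):
    """Average region A with its copy B, which lies `shift` grid points further along."""
    copy_b = np.roll(rho, -shift)              # bring copy B onto region A
    mean_a = 0.5 * (rho + copy_b) * mask_a     # averaged density, kept inside A only
    return mean_a + np.roll(mean_a, shift)     # write the average back into A and B

def cyclic_averaging(f_obs, phases, mask_a, shift, n_cycles):
    """Iterate: map -> averaging -> structure factors -> new phases."""
    r_values = []
    for _ in range(n_cycles):
        rho = np.fft.ifft(f_obs * np.exp(1j * phases)).real
        rho_avg = average_copies(rho, mask_a, shift)
        f_calc = np.fft.fft(rho_avg)
        # R-value between amplitudes of the averaged map and observed amplitudes
        r_values.append(np.abs(np.abs(f_calc) - f_obs).sum() / f_obs.sum())
        phases = np.angle(f_calc)              # keep observed amplitudes, update phases
    return rho_avg, r_values

# Toy data: two identical copies on a 64-point periodic grid.
rng = np.random.default_rng(0)
n, shift = 64, 32
mask_a = np.zeros(n)
mask_a[4:24] = 1.0
true_rho = rng.random(n) * mask_a
true_rho = true_rho + np.roll(true_rho, shift)
f_true = np.fft.fft(true_rho)
f_obs = np.abs(f_true)
start_phases = np.angle(f_true) + rng.normal(0.0, 0.5, n)   # perturbed phases
rho_avg, r_values = cyclic_averaging(f_obs, start_phases, mask_a, shift, n_cycles=10)
```

In each cycle the density outside the two masks is set to zero and the
two copies are forced to be identical; only the phases are carried
over to the next cycle.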


3.3.2. Riboflavin synthase 

Riboflavin synthase is an enzyme active in the final steps of riboflavin
(vitamin B2) synthesis (reviewed by M\"uller et al., 1988; see FIGURE
3.3.2.1).  Riboflavin is synthesized in microorganisms and plants.
Heavy riboflavin synthase from \it Bacillus subtilis \rm is a
complex of two enzymes that differ considerably in molecular weight.  The
complex consists of three $\alpha$-subunits and 60 $\beta$-subunits.
It is the $\alpha$-subunit, and not the $\beta$-subunit, that
catalyzes the final step in riboflavin synthesis.  The
appropriate name for the $\beta$-subunit is therefore lumazine synthase and
not riboflavin synthase.  The complete sequences of the $\beta$-subunit
(Ludwig et al., 1987) and the $\alpha$-subunit (Schott et al., 1990a) have
been communicated.  The crystal structure of heavy riboflavin synthase
(Ladenstein et al., 1988) has shown that the enzyme forms an
icosahedral capsid consisting of 60 $\beta$-subunits.  The
investigated hexagonal crystals belonged to space group P6322 with
cell constants a=b= 156.4A, c= 298.5A and $\gamma$= 120o (Ladenstein
et al., 1983), with 10 $\beta$-subunits per asymmetric unit.  That
structure had an R-factor of 0.399 at 3.3A resolution.

Later, the lumazine synthase-riboflavin synthase complex was dissociated
into subunits, and its icosahedral capsid, consisting of $\beta$-subunits
only, could be reassembled.  Three crystal forms of this "riboflavin
synthase" were communicated (Schott et al., 1990b).  A monoclinic
modification belonged to space group C2 with cell constants
a= 235.5A, b= 191.2A, c= 165.4A and $\beta$= 134.5o and 30
molecules per asymmetric unit.  These crystals diffracted to 2.8A
resolution.  A hexagonal form belonged to space group P6322 with cell
constants a=b= 157.2A, c= 300.8A and $\gamma$= 120o, with 10 molecules
per asymmetric unit, similar to the hexagonal form of heavy riboflavin synthase.
FIGURE 3.3.2.2 shows the spatial distribution of all 60 molecules.
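
The numbers of subunits per asymmetric unit quoted above can be
related to the 60-subunit capsid by simple bookkeeping.  The sketch
below uses the general-position multiplicities of the two space groups
(12 for P6322, 4 for C2); the conclusion that both cells hold two
capsids is derived from these numbers and is not stated in the cited
papers as quoted here.

```python
# Number of general equivalent positions per unit cell.
GENERAL_POSITIONS = {"P6322": 12, "C2": 4}

def capsids_per_cell(space_group, subunits_per_asu, subunits_per_capsid=60):
    """Capsids per unit cell implied by the subunit count per asymmetric unit."""
    subunits_per_cell = GENERAL_POSITIONS[space_group] * subunits_per_asu
    return subunits_per_cell / subunits_per_capsid

def site_symmetry_order(subunits_per_asu, subunits_per_capsid=60):
    """Order of the crystallographic symmetry shared by one capsid."""
    return subunits_per_capsid // subunits_per_asu

hexagonal = capsids_per_cell("P6322", 10)   # 12 * 10 / 60
monoclinic = capsids_per_cell("C2", 30)     # 4 * 30 / 60
```

Under this bookkeeping, the capsid shares six of its symmetry
operations with the hexagonal crystal and two with the monoclinic
crystal, which leaves 10 and 30 crystallographically independent
subunits respectively.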

This new hexagonal form of $\beta$ subunit was refined further to an
R-factor of about 0.32 (Ladenstein, unpublished results).  This model
was used in electron density averaging and refinement of the
monoclinic form, for which data to 2.45A resolution were collected.
The structure was subjected to rigid body refinement applying X-PLOR
including reflection data to 3.5A resolution (Ritsert, unpublished
results).  At this stage I joined the project by adapting MAIN
routines for electron density manipulation to handle electron density
maps of large crystal cells.

At first only the array sizes were changed and the CHAR_FAST
procedure as described in APPENDIX C was applied.  However, the
R-factor of the averaged map did not converge.  The procedure was
therefore reexamined and reprogrammed with many modifications.  Finally, the
REAL_LONG procedure enabled us to start a successful electron
density averaging procedure, gradually extending the
phases to 3.0A resolution.  Then the grid size was changed from 1.0A to 0.8A
and phase extension continued until reflections to 2.45A
resolution were included.  The procedure was essentially the same as the
cathepsin-B REAL_LONG procedure.  The only significant difference
was that maps were summed immediately after being
transformed, rather than first being stored on disk and afterwards
averaged in a separate procedure (AVER_MAPS.COM).  During electron
density averaging of "riboflavin synthase" proper local symmetry
operations were applied.
The model was adapted to the resulting averaged density,
crystallographically refined and new electron density maps were
calculated by further extending the phases via an electron density
averaging procedure.  This procedure was repeated several times until
all reflections were included.  The current R-factor of the model is
0.23 for data to 2.45A resolution.  FIGURE 3.3.2.3 shows an
averaged 2Fo-Fc electron density map.  A complete description of
the refinement of the monoclinic form will be presented by Ritsert et al.
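
The difference between storing all transformed maps before averaging
and summing each map as soon as it has been transformed can be
sketched as follows.  This is an illustration only, with hypothetical
function names; it does not reproduce the MAIN command files.

```python
import numpy as np

def average_stored(maps):
    """Keep every transformed map, then average them in a separate step."""
    return np.mean(np.stack(list(maps)), axis=0)

def average_streaming(maps):
    """Add each transformed map to a running sum as soon as it is available."""
    total, count = None, 0
    for m in maps:
        if total is None:
            total = np.array(m, dtype=float)   # copy, so the input is not modified
        else:
            total += m
        count += 1
    if count == 0:
        raise ValueError("no maps to average")
    return total / count

# Thirty toy maps, one per copy in the asymmetric unit, produced one at a time.
rng = np.random.default_rng(3)
copies = [rng.random((6, 5, 4)) for _ in range(30)]
streamed = average_streaming(m for m in copies)
stored = average_stored(copies)
```

Both routes give the same average, but the streaming variant needs
memory (or disk space) for only one accumulated map instead of thirty,
which matters for large crystal cells.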


3.3.3. Carbamoylsarcosine hydrolase

Only a brief review of the applied methods, without a description of
the structure, is presented here in order to illustrate the use of MAIN in this
particular case.  The complete work is described by Rom\~{a}o et al. (1992).

3.3.3.1. Introduction

The crystals of carbamoylsarcosine hydrolase were obtained from the cloned
gene.  The crystals diffract beyond 3.0A resolution and belong to the
monoclinic space group C2, with cell dimensions a= 136.22A, b= 122.29A,
c= 70.87A, $\beta$= 91.82o.  The self-rotation function of the Patterson
map was used to search for local two-fold axes, employing PROTEIN search
routines.  The peak at $\psi$=0o, $\phi$=0o corresponds to the
crystallographic b axis.  The other large peaks indicate local diads
relating the subunits within the tetramer.  Peaks show up at the polar
angles ($\kappa$=180o) $\psi$=90o, $\phi$=82o; $\psi$=90o, $\phi$=172o;
$\psi$=45o, $\phi$=172o; and $\psi$=46o, $\phi$=352o, with correlation
values of 0.583, 0.575, 0.521 and 0.503 respectively, relative to the
origin peak (see FIGURE 3.3.3.2).

From crystal density measurements and auto-correlation of the native
Patterson map, it became evident that there are four molecules of
carbamoylsarcosine hydrolase per asymmetric unit.
Since there was no molecular model of a related enzyme available,
heavy atom derivatives had to be prepared.  A combination of two
uranium-, rhodium-, mercury- and osmium-derivative phase sets was used
to calculate the first 3.0A resolution map.  Phases were weighted
by the figure of merit.
The resulting map was noisy, and no secondary structural elements or
molecular boundaries could be recognized.  The m.i.r.
phases were then modified by solvent flattening at 3.0A resolution.
The density in the solvent regions was set to zero (Wang, 1985) using
programs of M. Schneider.  The unit cell was sampled at 130x120x70 grid
points.  The radius of the averaging sphere was 9A and the solvent
level was adjusted to 0.51.
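
The principle of this kind of solvent flattening can be sketched in a
few lines.  The code below is a simplified illustration and not the
program actually used: it assumes a uniform averaging sphere (Wang's
method uses a distance-dependent weight), a periodic grid with equal
spacing along all axes, and it interprets the solvent level of 0.51 as
the fraction of the cell assigned to solvent.  All names are
hypothetical.

```python
import numpy as np

def sphere_kernel(shape, radius):
    """Periodic kernel equal to 1 within `radius` grid units of the origin."""
    axes = [np.minimum(np.arange(n), n - np.arange(n)) for n in shape]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    return (gx**2 + gy**2 + gz**2 <= radius**2).astype(float)

def solvent_flatten(rho, radius, solvent_fraction):
    """Set the density to zero in the region judged to be solvent."""
    positive = np.clip(rho, 0.0, None)            # truncate negative density
    kernel = sphere_kernel(rho.shape, radius)
    # Local sum of density inside the sphere, computed as a periodic convolution.
    smooth = np.fft.ifftn(np.fft.fftn(positive) * np.fft.fftn(kernel)).real
    cutoff = np.quantile(smooth, solvent_fraction)
    protein = smooth > cutoff                     # the rest is treated as solvent
    return np.where(protein, rho, 0.0), protein

# Toy map: a block of high density in a noisy background.
rng = np.random.default_rng(1)
rho = rng.normal(0.0, 0.2, (16, 16, 16))
rho[4:12, 4:12, 4:12] += 1.0
flat, protein = solvent_flatten(rho, radius=3, solvent_fraction=0.51)
```

In a real application the flattened map would then be Fourier
transformed and its phases combined with the m.i.r. phases.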

The modified electron density was Fourier transformed, and the resulting
phases were combined with m.i.r.  phases by applying the phase combination
procedure from Hendrickson and Lattman (1970).  Seven cycles of such
calculations were performed until convergence (R=0.233).  The quality
of the solvent-flattened density map allowed us to
define the boundaries of the tetramer in one asymmetric unit.  However,
polypeptide chains could still not be identified.


3.3.3.2. Determination of the local symmetry of the tetramer

The presence of four crystallographically independent subunits in one
asymmetric unit allows averaging of the electron density and
improvement in the quality of the final map.  In order to perform this
calculation, we needed to know the exact orientation and position of
the local symmetry axes.  Their orientations were obtained from the
self-rotation function, although the distinction between
intramolecular axes and axes generated by crystal symmetry was still
ambiguous.  The correct
position of the rotational axes was found from the electron density
as follows:

In the first step, a molecular envelope of an asymmetric unit  was
defined in the solvent-flattened density, using the program X-CONTOUR
(Buchberger,1991).  The density inside the selected envelope was
placed in an oversize P1 cell (204x183x105A) in order to avoid
intermolecular contacts.  This cell was Fourier-transformed, and,
using the newly calculated structure factors, a Patterson synthesis
(now of a single asymmetric unit) was performed.  As before, a
self-rotation function of the Patterson map was calculated.  The
obtained solutions were consistent with the previously
determined orientation of the non-crystallographic axes.  The presence
of a peak  corresponding to the crystallographic diad b indicated that
the selected envelope still included crystallographically equivalent
parts of another asymmetric unit.
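
The step of placing the masked density in an oversize P1 cell and
computing its Patterson synthesis can be sketched as follows.  The
sketch assumes an orthogonal grid and uses the standard relation that
the Patterson function is the Fourier transform of the squared
structure factor amplitudes; the function names are illustrative and
do not correspond to X-CONTOUR or PROTEIN routines.

```python
import numpy as np

def embed_in_p1_cell(density, cell_shape):
    """Place a block of masked density in the middle of a larger, empty P1 cell."""
    big = np.zeros(cell_shape)
    offset = [(n - m) // 2 for n, m in zip(cell_shape, density.shape)]
    region = tuple(slice(o, o + m) for o, m in zip(offset, density.shape))
    big[region] = density
    return big

def patterson(rho):
    """Patterson map of a P1 cell: inverse transform of |F|^2."""
    f = np.fft.fftn(rho)
    return np.fft.ifftn(np.abs(f) ** 2).real

# Toy envelope content in a cell 1.5 times larger along each axis.
rng = np.random.default_rng(4)
masked_density = rng.random((8, 8, 8))
cell = embed_in_p1_cell(masked_density, (12, 12, 12))
pmap = patterson(cell)
```

The empty margin around the envelope reduces the overlap between
vectors within the selected density and vectors to its periodic
images, so that the self-rotation function is dominated by the content
of the envelope.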

To position the local symmetry axes correctly in the asymmetric unit,
a translation function for each of the four possible local axes was
calculated using the real-space routines of PROTEIN.

The peaks of electron density selected inside the mask were rotated
about each of the local axes and translated in small increments with
respect to the unchanged m.i.r. density.  The calculated correlation
function showed maxima for the best positions of three of the axes
inside the asymmetric unit.  These are the three genuine
intramolecular local axes.  For the fourth axis, defined by the
polar angles $\psi$=90o, $\phi$=172o, $\kappa$=180o, no maximum was found;
this axis is generated by the crystallographic b axis and the local
diad at $\psi$=90o, $\phi$=82o, $\kappa$=180o (see FIGURE 3.3.3.2).
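
That the fourth axis follows from the other two can be verified with
rotation matrices.  The sketch below assumes a polar angle convention
in which $\psi$ is the inclination from the orthogonalized b (y) axis
and $\phi$ the azimuth in the plane perpendicular to it; the exact
PROTEIN convention may differ in the sense of $\phi$, which does not
affect the result.  The function names are hypothetical.

```python
import numpy as np

def axis_from_polar(psi_deg, phi_deg):
    """Unit vector for polar angles psi (from y) and phi (azimuth in the xz plane)."""
    psi, phi = np.radians([psi_deg, phi_deg])
    return np.array([np.sin(psi) * np.cos(phi),
                     np.cos(psi),
                     -np.sin(psi) * np.sin(phi)])

def twofold(axis):
    """Rotation matrix for kappa = 180 degrees about `axis`: R = 2 n n^T - I."""
    n = axis / np.linalg.norm(axis)
    return 2.0 * np.outer(n, n) - np.eye(3)

r_b = twofold(axis_from_polar(0, 0))            # crystallographic b axis
r_local = twofold(axis_from_polar(90, 82))      # genuine local diad
r_generated = twofold(axis_from_polar(90, 172)) # axis without translation maximum

product = r_b @ r_local
```

Two successive two-fold rotations about perpendicular axes are
equivalent to a two-fold rotation about the third perpendicular
direction, so `product` equals `r_generated`.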

The orientations and positions of the three local diads were refined.
The final results indicate three mutually
perpendicular two-fold axes of symmetry: axis (1) is 6o
away from the c-axis of the crystal, while the other two axes,
(2) and (3), make angles of 45o with the crystal b-axis.

Since the center of rotation about each of the local symmetry axes was
in the lower half of the masked area, and since molecular boundaries
were not clearly recognized in regions where crystallographically
equivalent molecules came into contact, the molecular
envelope had to be improved.

The solvent-flattened density inside the current envelope was
placed in the large P1 cell.  This density was averaged by applying the
local symmetry operations, in the expectation that density not
belonging to the same asymmetric unit would smear out.  With the
transformed averaged density, a second, more clearly defined envelope
was produced.  The local symmetry operations were again determined as
described above.  The self-rotation function of the Patterson map,
calculated as before, confirmed the three local axes as major peaks,
now sharper than in the case of the first envelope.  The subsequent
rotational and translational search gave a more accurate orientation
and position for each of the non-crystallographic axes.  Averaging of
the electron density inside the chosen asymmetric unit was now
possible.


3.3.3.3. Initial averaging with ideal 222 symmetry

With the new envelope and new symmetry operations, the first averaged
map was calculated.  The solvent-flattened electron density was
averaged inside the mask by applying ideal 222 symmetry.  Afterwards,
the whole unit cell was generated and its density Fourier-transformed.
The Fourier transformations were carried out using programs in
PROTEIN.  The procedure was repeated until convergence of the electron
density R-value, which dropped from 0.44 to 0.29 after 7 cycles of
averaging.  Comparison of the first averaged map with the one
resulting after 7 cycles of the averaging procedure showed that
cyclic averaging did not improve the map, suggesting that the
asymmetric unit does not obey ideal 222 symmetry.  The first
averaged map was nevertheless markedly improved in comparison to the
original m.i.r. map, but still only a few secondary structural
elements (two $\alpha$-helical segments and some $\beta$-strands)
could be recognized.  Since model building could not proceed, the map
had to be improved further.


3.3.3.4. Proper-improper symmetry averaging

Using MAIN, a simplified representation of the solvent-flattened
density corresponding to one selected asymmetric unit was displayed
on a PS300 Evans & Sutherland graphics system.  We observed that the
original masked region could be split into two separate (upper and
lower) parts (see FIGURE 3.3.3.3), suggesting that improper
averaging could be used.  For each of the individual regions, a new envelope was
defined using MAIN.  A self-rotation function was calculated for the
density inside each envelope, confirming axis 3 (see FIGURE 3.3.3.2).
Axes 1 and 2 must
therefore lie in the plane separating the two identified halves of
the asymmetric unit (see FIGURE 3.3.3.3).  The positions of the local axes
were then optimized by an interactive translational search procedure
followed by combined rotational and translational gradient
optimization using MAIN.  First, the position and orientation of an
ideal two-fold axis (axis 3) were optimized for the upper and lower halves.
The correlations of the maxima were 0.167 and 0.176 for the upper and
lower half respectively; the autocorrelation values were 1.0.  The
density inside each half was then averaged by applying the obtained
parameters for proper two-fold averaging.  Afterwards, both halves
with averaged densities were superimposed by rotations about  axes 1
and 2 and  the correlations maximized.  In these calculations, ideal
two-fold symmetry was no longer maintained.  Four additional
transformations were obtained - two for the  superposition of the
averaged upper half density to the averaged lower half rotated about
axes 1 and 2, and two for the reverse transformations.  The maximal
correlations obtained were in all four cases higher than 0.3.
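
A correlation value of this kind can be illustrated by a normalized
correlation coefficient between two densities sampled on the same grid
points.  The exact definition used in MAIN is not given in the text;
the sketch below is one plausible choice, for which the
autocorrelation is 1.0 by construction.

```python
import numpy as np

def density_correlation(rho_a, rho_b, mask):
    """Normalized correlation of two densities over the grid points selected by `mask`."""
    a = rho_a[mask] - rho_a[mask].mean()
    b = rho_b[mask] - rho_b[mask].mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

# Toy densities: a reference map and a noisy copy of it.
rng = np.random.default_rng(5)
reference = rng.random((8, 8, 8))
noisy = reference + rng.normal(0.0, 0.5, reference.shape)
mask = np.ones(reference.shape, dtype=bool)

auto = density_correlation(reference, reference, mask)
cross = density_correlation(reference, noisy, mask)
```

In a search for the position of a local axis, `rho_b` would be the
density rotated about the trial axis and interpolated back onto the
grid, and the trial position giving the highest correlation would be
retained.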

These parameters were then applied in a cyclic averaging procedure
combining proper and improper averaging.  Upper and lower halves were
first averaged by applying proper symmetry (axis 3), and the averaged
halves were then averaged with improper symmetry relations (axes 1 and
2).  Improper averaging was done by transforming the averaged density
of the lower half about both axes 1 and 2 to the upper half region.
The averaged upper half and both transformed lower half density maps
were then added together and averaged.  The analogous procedure was
applied for the lower half.

The density of the complete crystal cell was generated by applying
crystal symmetry operations to each averaged half separately.  The
resulting cell was Fourier-transformed.  This procedure was repeated
in cycles and converged after 8 cycles of proper-improper symmetry
averaging (the R-factor dropped from 0.43 to 0.25 for data between 20
and 3A resolution).  The resulting electron
density map was markedly improved in comparison to the map obtained
after ideal 222 symmetry averaging.  Many segments of the main chain
could now be traced and were built as polyalanine chains using
the program system FRODO (Jones, 1978) on a PS300 Evans & Sutherland
graphics system.  About 60% of the total number of residues were
built as unconnected segments, and the tetramer was generated using
the previous local symmetry operations.


3.3.3.5. Improper symmetry averaging

With these partial models of the four subunits, masks for each of them
could be defined using MAIN.  Intermolecular contacts were taken into
account in the program in order to avoid overlap of the masks.
Optimization of rotational and translational parameters between the
four density areas was repeated.  At this stage, positioning of each
model was first optimized in the electron density.  Molecules A and D
remained in their positions, while molecules B and C were slightly
moved.  All four molecular models were then superimposed by minimizing
the r.m.s. distances between equivalent atoms.  Twelve new rotation
matrices and translation vectors were thus determined.  These new
local symmetry operations enabled us to apply non-ideal, improper
symmetry averaging to the tetrameric asymmetric unit.
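
The count of twelve operations corresponds to the ordered pairs of
four subunits (4 x 3).  A least-squares superposition of equivalent
atoms can be sketched with the Kabsch algorithm, as below.  This is a
generic illustration with hypothetical names, not the routine
implemented in MAIN.

```python
import numpy as np
from itertools import permutations

def superimpose(mobile, target):
    """Rotation R and translation t minimizing the r.m.s. of R @ x + t - y."""
    mc, tc = mobile.mean(axis=0), target.mean(axis=0)
    h = (mobile - mc).T @ (target - tc)          # 3x3 covariance matrix
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))       # guard against a reflection
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return rot, tc - rot @ mc

def all_pair_operators(models):
    """One (R, t) for every ordered pair of models: n * (n - 1) operators."""
    return {(i, j): superimpose(models[i], models[j])
            for i, j in permutations(range(len(models)), 2)}

def rot_z(deg):
    """Rotation about z, used here only to generate toy copies."""
    c, s = np.cos(np.radians(deg)), np.sin(np.radians(deg))
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Four toy copies of one set of 20 atoms.
rng = np.random.default_rng(2)
base = rng.normal(0.0, 10.0, (20, 3))
models = [base @ rot_z(a).T + np.array([a / 10.0, 1.0, -2.0])
          for a in (0, 90, 180, 270)]
operators = all_pair_operators(models)
```

Each operator maps the coordinates (and hence the density) of one
subunit onto another, which is what improper averaging requires when
the subunits no longer obey ideal 222 symmetry.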

In the first stages of improper symmetry averaging, averaging was
still performed without including phases from the partial model, but
the masks were expanded according to the `growth' of the molecular
model.  The solvent-flattened density was averaged four times,
independently for each subunit of the tetramer.  The cell was
reconstructed from the four averaged densities and Fourier
back-transformed.  The new density was again averaged and the
procedure was repeated.  It converged after 12 cycles of averaging,
with the R-factor dropping from 0.42 to 0.24.

With these new density maps, the model could be further improved, and
its calculated phases were then combined with the original m.i.r.
phases (Hendrickson & Lattman procedure).  Parts of the model that
were considered questionable were omitted from the phase calculation.
Model phases were weighted using the formulas of Sim (1959).  The whole
averaging procedure was then carried out as follows:

The model was built in one of the subunits of the tetramer.  The other
three subunits were generated by applying the local symmetry
operations.  The whole tetramer was refined with the program X-PLOR
(Br\"unger et al., 1989).
With the refined model, four new masks were determined and new
local symmetry parameters calculated.  M.i.r. phases were combined
with phases from the refined model and an electron density map was
calculated.  Cyclic averaging was then applied to this initial map
using the new masks and symmetry operations.  After convergence
the whole model was re-evaluated and refitted to the electron
density on the graphics system.  The fit was also checked against the
m.i.r. map.  Over several rounds of model building, crystallographic
refinement, phase combination and electron density averaging, the
electron density slowly improved.

FIGURE 3.3.3.4 shows the improvement of the electron density maps.
The final R-factor for 56641 reflections between 10.0 and 2.0A resolution
and 8304 atoms of the tetramer is 0.186.

The procedure, stressing data manipulations done with MAIN, is
described in detail in APPENDIX D.


3.4. Maps

A map is a 3-dimensional array of grid points, each of which has
a value.  According to their values, grid points are treated as empty, density or
mask points.  Grid points with values inside the density interval
are density points, those with values below the density interval
are empty points, and those with values above it are mask points.
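
This three-way classification can be sketched as follows.  The limits
of the density interval and the integer codes are hypothetical values
chosen for illustration; the actual limits used by MAIN are not given
here.

```python
import numpy as np

EMPTY, DENSITY, MASK = 0, 1, 2   # illustrative codes for the three kinds of points

def classify(values, density_min, density_max):
    """Label each grid point according to its value relative to the density interval."""
    values = np.asarray(values, dtype=float)
    kinds = np.full(values.shape, DENSITY)
    kinds[values < density_min] = EMPTY    # below the interval
    kinds[values > density_max] = MASK     # above the interval
    return kinds

# Hypothetical density interval of -100 to 100.
kinds = classify([-1000.0, 0.3, 99.0, 1000.0], density_min=-100.0, density_max=100.0)
```

Storing the point type in the value itself means that one array serves
both as the density and as the mask.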

Each map has a size, starting coordinates and cell constants.
Grid points lying in different unit cells with the same fractional
coordinates are identical.  This means that, once a whole unit cell is
defined in a map, the program can expand the density through the whole of space.

There are 6 elementary operations that can be done with map data:
- Creation of a map,
- Rotations and translations of a map,
- Building the crystal unit cell from an asymmetric unit or its parts,
- Setting values of selected grid points,
- Scaling a single map with a constant,
- Adding two maps.

The smallest map contains only a header in which the cell constants, the crystal
symmetry operations and the number of grid points along the cell axes are stored.
The program can read electron density maps in PROTEIN and in its own native
formats.  Maps can be written in the Lyn Ten Eyck and MAIN native
formats.  The native formats are ASCII files with a record length of 80,
so that they can be inspected and changed with a text editor.
In addition, a variety of smaller conversion programs were written that allow
conversion of maps between the PROTEIN, X-PLOR, P1SF, FRODO and native MAIN
formats.

Operations can be applied to the grid points of a map only when they are
empty or masked.  (The exception is the SET command, which can set
the value of a grid point in any specified range.)

A new map can be created from scratch or from an already existing one by
taking its cell constants and size.
A map's size, origin, number of grid points per cell length and
cell constants can be taken from an already existing map or
modified.  The grid points are initialized to a specified value.
A map's origin and size can also be defined from an atom selection,
so that the selection plus some boundary grid points lie inside it.

A map can be transformed (rotated, translated or copied) into the mask 
points of another map by linear interpolation. This is done so that
the position of a mask point is transformed into the space of 
the density map. Its value is then obtained by interpolation from
the eight surrounding grid points. 
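
Interpolation from the eight surrounding grid points (trilinear
interpolation) can be sketched as below.  The sketch works in grid
units on a periodic map and assumes that the rotation and translation
are already expressed in grid coordinates; the function names are
hypothetical.

```python
import numpy as np

def trilinear(rho, point):
    """Value of the periodic map `rho` at a non-integer grid position."""
    p = np.asarray(point, dtype=float)
    base = np.floor(p).astype(int)
    frac = p - base
    value = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                weight = ((frac[0] if dx else 1.0 - frac[0]) *
                          (frac[1] if dy else 1.0 - frac[1]) *
                          (frac[2] if dz else 1.0 - frac[2]))
                idx = tuple((base + (dx, dy, dz)) % rho.shape)  # wrap around the cell
                value += weight * rho[idx]
    return value

def transform_into_mask(density, target, mask, rot, trans):
    """Fill the mask points of `target` with density fetched from `density`."""
    out = target.copy()
    for idx in np.argwhere(mask):
        source = rot @ idx + trans           # mask point moved into the density map
        out[tuple(idx)] = trilinear(density, source)
    return out

# Toy map with a linear density, for which trilinear interpolation is exact.
i, j, k = np.meshgrid(np.arange(6), np.arange(6), np.arange(6), indexing="ij")
linear_map = (i + 2 * j + 3 * k).astype(float)
value = trilinear(linear_map, (1.5, 2.25, 0.5))
```

Note that the loop runs over the mask points of the target map and
fetches the density for each of them, so every mask point receives
exactly one interpolated value.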

Empty points of a map can be filled with values of the density points
of another map by applying crystal symmetry operations.  This routine is
independent of the crystal symmetry group.  In order to
find the position of the grid point into which the density value
should be copied, it does not apply space-group-specific
rules, but the rotation matrices and translation vectors.
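
A minimal sketch of such a space-group-independent expansion is given
below.  It assumes that symmetry operators are supplied as rotation
matrices and translation vectors in fractional coordinates and that
they map grid points onto grid points; the example operator is the
screw axis of space group P21.  The function name is hypothetical.

```python
import numpy as np

def fill_by_symmetry(grid, is_density, operators):
    """Copy every density point to its symmetry mates that are still empty."""
    shape = np.array(grid.shape)
    out, filled = grid.copy(), is_density.copy()
    for idx in np.argwhere(is_density):
        frac = idx / shape                               # fractional coordinates
        for rot, trans in operators:
            mate = np.rint((rot @ frac + trans) * shape).astype(int) % shape
            mate = tuple(mate)
            if not filled[mate]:
                out[mate] = grid[tuple(idx)]
                filled[mate] = True
    return out, filled

# P21 screw axis along b: (x, y, z) -> (-x, y + 1/2, -z).
screw = (np.diag([-1.0, 1.0, -1.0]), np.array([0.0, 0.5, 0.0]))

# Toy asymmetric unit: the half of a 4x4x4 cell with 0 <= y < 1/2.
rng = np.random.default_rng(6)
grid = np.zeros((4, 4, 4))
is_density = np.zeros((4, 4, 4), dtype=bool)
is_density[:, :2, :] = True
grid[is_density] = rng.random(int(is_density.sum()))
cell, filled = fill_by_symmetry(grid, is_density, [screw])
```

Because only matrices and vectors are used, the same routine serves
any space group once its list of operators is supplied.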

Mask points can be defined in several ways:

- By setting all grid points with values
in a specified range to the mask value.
- By a distance criterion from a selection of atomic center positions.
- By converting grid points to real space points, then to atoms and
further to mask grid points.

 Real space points can be created from a density by
 - defining a range of grid point values in a box
 - searching for density peaks via
     - a square function
     - the sum of the density of the surrounding grid points
             (solvent flattening algorithm)
     - the weight center along the a axis