POINTLESS (CCP4: Supported Program)

NAME

pointless

SYNOPSIS

pointless [HKLIN] foo_in.mtz  [HKLREF reference_file.mtz] [HKLOUT output_file.mtz]
[Keyworded Input]

References
Input and Output files
Examples
Release Notes

DESCRIPTION

General disclaimer: this program is very much under development and is likely to change.

Pointless has at least two possible functions:

Mode 1. (MODE LAUEGROUP). This mode is selected if no HKLREF dataset is specified.

Given a test dataset of unmerged observations (file HKLIN), the program looks for possible symmetry based on the unit cell in order to determine the Laue group, ie the symmetry of the diffraction pattern. It then examines axial reflections (and zones for non-chiral crystals) to look for systematic absences to determine the space group.

Mode 2. (MODE ALTERNATIVE). Given a test dataset, merged or unmerged (file HKLIN), and a merged reference dataset in a known space group (file HKLREF), the program tests all possible alternative indexing schemes of the test dataset to find which one best matches the reference set. Alternative indexing schemes arise in high symmetry space groups when the lattice symmetry is higher than the point group symmetry (eg for trigonal space groups), but they can also arise in any space group from special relationships between cell parameters (eg an orthorhombic cell with a=b).

In either mode, if a file is assigned to HKLOUT, then the reindexed file will be written out from the HKLIN file, using the best reindexing. In LAUEGROUP mode, the hklout file is assigned to the "best" space group or point group (or the SPACEGROUP if given), and in ALTERNATIVE mode, to the spacegroup of the reference file.

The program may also be used just to reindex or change the spacegroup, if both HKLIN and HKLOUT files are defined, and keywords SPACEGROUP and/or REINDEX are given.

Details of LAUEGROUP mode

  1. The maximum lattice symmetry consistent with the unit cell dimensions from the HKLIN file is determined, within an angular tolerance of 2 degrees (or that given on the TOLERANCE command). Alternatively, if the command ORIGINALLATTICE is given, the lattice symmetry corresponding to the space group in the HKLIN file is used, or the LAUEGROUP command may be used to specify a particular Laue group.
  2. The intensity data are read from the HKLIN file, reindexed in the asymmetric unit of the lattice symmetry (if necessary), and sorted to bring potentially equivalent observations together.
  3. The intensities are normalised to E^2, making <E^2> = 1, using an overall B-factor and a further correction smoothed on resolution bins. Unless resolution limits are explicitly set (RESOLUTION command), an automatic high resolution limit is applied, at the approximate point where <I>/<sigmaI> < IsigLimit (default value 4.0, set with ISIGLIMIT command). It is best to exclude weak high resolution data from the scoring functions, as they contain no useful information for this purpose.
  4. All rotational symmetry elements of the lattice symmetry are first scored separately. For example, in a tetragonal lattice, the symmetry elements are: 4-fold axis along c; 2-fold axes along a, b, c, (110) and (1-10).
  5. The most useful scoring function seems to be a correlation coefficient (CC) on E^2, calculated for pairs of observations related by a particular symmetry element. Another score calculated is Rmeas, the multiplicity-weighted R-factor. In order to allow for small samples, the CC score is converted to a "significance" score or Z-score by dividing by an estimated standard deviation. This is estimated by taking many pairs of observations at the same resolution which cannot be related by symmetry and dividing them into groups of the same size as the test sample (with a maximum of 200). The score is calculated for each group, and the mean and standard deviation of these scores are then calculated. Then

    Z(score) = [Score - Mean(UnrelatedScore)] / Sigma(UnrelatedScore)

  6. All Laue groups which are sub-groups of the lattice group are generated by combining pairs of symmetry elements (including the identity) and completing the groups. The sub-groups are then scored by combining the scores for the individual elements, counting scores for elements present in the sub-group as positive and those for elements not in the sub-group as negative. It is not clear what the best way of combining the element scores is; at present two methods are used:
    1. Combined Score ("Zc" in logfile)
      The correlation coefficients are recalculated (summed) over all "for" and over all "against" elements, Z(for) and Z(against) are calculated, then NetZ = Z(for) - Z(against)
    2. RMS average score ("Za" in logfile)
      Z(for or against) = +/-Sqrt(Sum(+/-Z(element)^2))               where "+/-" follows the sign of Z(element)
      NetZ = Z(for) - Z(against)
  7. The potential Laue groups are ranked according to scoring method (1), and tested for acceptance for further output and testing. A group is accepted if:
    1. its score is greater than (AcceptanceLimit * Maximum score), where AcceptanceLimit is set by the ACCEPT command [default 0.9], or
    2. its score is greater than (Maximum score - AcceptanceDifference), where AcceptanceDifference is set by the ACCEPT command [default 1], or
    3. its score from method (2) is greater than the method (2) score of the first ranked group, or
    4. it is the original Laue group from the HKLIN file.

    If too many of the symmetry elements have no observations, then only the Laue group from the HKLIN file is accepted, unless the command LAUEGROUP ALL is given.
  8. All accepted Laue groups are tested for relevant systematic absences. These are scored to produce a combined score for possible space groups.
    1. Each relevant "zone" in the lattice group is tested for absences. Zones are typically axes tested for absences due to screws, or (in the case of non-chiral spacegroups) zones to be tested for glide planes.
    2. Observations which lie in the zone, such as along an axis, are scored by a Fourier analysis of I/sigma(I). The score used is the peak height at the appropriate point in Fourier space, eg at 1/2 for a 2(1) screw, relative to the origin: a perfect score for exact absences is thus 1.0, a score for no absences might be 0.0.
    3. For each spacegroup in the Laue group an unnormalised probability estimate is made from the Laue group score and the systematic absence score, and the space groups are ranked accordingly. In some cases, more than one space group may have identical scores.
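
The scoring arithmetic in steps 5, 6 and 8 can be sketched in a few lines of Python. This is an illustration only, not the program's code. The function names are hypothetical. The use of a sample standard deviation, the sign convention of the outer square root, and the cosine form of the Fourier sum are assumptions based on the formulas quoted above.

```python
import math
import statistics


def z_score(score, unrelated_scores):
    """Z = [Score - Mean(UnrelatedScore)] / Sigma(UnrelatedScore)."""
    mean = statistics.mean(unrelated_scores)
    # Assumption: sample standard deviation; the estimator used is not stated.
    sigma = statistics.stdev(unrelated_scores)
    return (score - mean) / sigma


def signed_rms(z_elements):
    """+/-Sqrt(Sum(+/-Z^2)): each squared term keeps the sign of its Z."""
    total = sum(math.copysign(z * z, z) for z in z_elements)
    # Assumption: the outer sign follows the sign of the signed sum.
    return math.copysign(math.sqrt(abs(total)), total)


def net_za(z_for, z_against):
    """NetZ = Z(for) - Z(against) for the RMS average score ("Za")."""
    return signed_rms(z_for) - signed_rms(z_against)


def absence_score(i_over_sigma_by_index, fourier_point=0.5):
    """Fourier peak at fourier_point, relative to the origin, along one axis."""
    origin = sum(i_over_sigma_by_index.values())
    peak = sum(value * math.cos(2.0 * math.pi * fourier_point * index)
               for index, value in i_over_sigma_by_index.items())
    return peak / origin


# Made-up numbers, for illustration only.
z_element = z_score(0.92, [0.05, -0.03, 0.02, -0.04, 0.01])
za = net_za(z_for=[6.0, 8.0], z_against=[1.0, -0.5])
screw_axis = {1: 0.0, 2: 12.0, 3: 0.0, 4: 9.0, 5: 0.0, 6: 15.0}  # odd l absent
print(z_element, za, absence_score(screw_axis))
```

With odd-index reflections exactly absent, the terms at the Fourier point 1/2 all add with the same sign, so the ratio to the origin approaches 1.0. With equal intensities on odd and even indices the terms cancel and the ratio approaches 0.0.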

KEYWORDED INPUT - DESCRIPTION

Keywords are:
ORIGINALLATTICE, RESOLUTION, ISIGLIMIT, NONCHIRAL, LABREF, LABIN, LAUEGROUP, SPACEGROUP, REINDEX, TOLERANCE, ACCEPT

All input is optional. Only the first four characters of each keyword are significant.

ORIGINALLATTICE

Use the original lattice symmetry from the file instead of determining the maximum lattice symmetry from the cell dimensions.

RESOLUTION [[LOW] <ResMin>] [[HIGH] <ResMax>]

Resolution limits in Angstroms, given in either order or with the keys HIGH or LOW. If this command is absent, the program imposes an automatic high resolution limit based on a minimum value for <I>/<sigmaI> within resolution shells (see ISIGLIMIT). Limits given here override the I/sigma limits.

ISIGLIMIT <minimum<I>/<sigmaI>>

Minimum value for <I>/<sigmaI> within resolution shells. This is used to set the maximum resolution for inclusion of data in the scoring. This is overridden by explicit RESOLUTION limits. Default value 4.0.
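
As a rough illustration of how such a cutoff can work, the sketch below walks through resolution shells from low to high resolution and stops at the first shell where <I>/<sigmaI> falls below the limit. The exact procedure used by the program is not described here, so the shell layout, the function name and the stopping rule are all assumptions.

```python
def auto_high_resolution_limit(shells, isig_limit=4.0):
    """Return the d-spacing of the last shell with <I>/<sigmaI> >= isig_limit.

    shells: list of (d_min, mean_i, mean_sigma_i) tuples, ordered from low
    resolution (large d) to high resolution (small d). This layout is a
    hypothetical one chosen for the example.
    Returns None if even the first shell is below the limit.
    """
    limit = None
    for d_min, mean_i, mean_sigma_i in shells:
        if mean_i / mean_sigma_i < isig_limit:
            break  # stop at the first shell that is too weak
        limit = d_min
    return limit


# Made-up shell statistics: <I>/<sigmaI> = 20, 10, 5, 2.
example_shells = [(4.0, 100.0, 5.0), (3.0, 60.0, 6.0),
                  (2.5, 30.0, 6.0), (2.0, 12.0, 6.0)]
print(auto_high_resolution_limit(example_shells))        # default limit 4.0
print(auto_high_resolution_limit(example_shells, 6.0))   # stricter limit
```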

NONCHIRAL [CENTROSYMMETRIC]

If this is present, the lists of possible space groups include non-chiral (or just centrosymmetric) ones as well as the [default] chiral ones.

LABREF  [F | I =]<columnlabel>

Only for MODE ALTERNATIVE (ie if HKLREF is assigned). For the reference dataset, this defines the column label for intensity or amplitude (which will be squared to an intensity). If this command is omitted, the first intensity or amplitude will be used. The next column is assumed to contain the corresponding sigma.

LABIN  [F | I =]<columnlabel>

Only for MODE ALTERNATIVE (ie if HKLREF is assigned) and if the test dataset is merged. For the test dataset, this defines the column label for intensity or amplitude (which will be squared to an intensity). If this command is omitted, the first intensity or amplitude will be used. The next column is assumed to contain the corresponding sigma.

LAUEGROUP  HKLIN || <Laue group name> || ALL

Select a Laue group instead of testing all possible ones, ie select one solution for further processing. A REINDEX command may be given to specify a particular reindexing operator. The keyword HKLIN indicates that the Laue group from the input HKLIN file should be used, otherwise the Laue group name.   The keyword ALL may be used to force the program to accept all possible Laue groups, even if there appear to be insufficient symmetry-related observations to distinguish them.

SPACEGROUP  HKLIN || <Space group name>

Select a space group to write to the output HKLOUT file. A REINDEX command may be given to specify a particular reindexing operator. The keyword HKLIN indicates that the space group from the input HKLIN file should be used; otherwise give the space group name. In this case all that the program does is to reindex the data into the given space group and write it out.

REINDEX  [LEFTHANDED] <reindex operator>

Specify a reindex operator (in the form eg "k,h,-l") to go with a specified Laue or space group. Note that there is no check that the operator is sensible and consistent with the Laue group.

Normally this must be a right-handed operator (ie correspond to a matrix with positive determinant), and the program will fail if it is not, but the keyword LEFTHANDED allows a negative-determinant transformation to be applied. Be very sure that you really want to do this! It is only valid if the hand of the data has been inverted by some previous mistake in the integration program.
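
The handedness test can be checked by hand before running the program. The sketch below turns an operator string such as "k,h,-l" into a 3x3 matrix and evaluates its determinant. It is a simplified parser written for this example: it handles only integer coefficients, and it is not the program's own parser. The function names are hypothetical.

```python
import re


def parse_reindex(operator):
    """Convert eg 'k,h,-l' to a 3x3 matrix (rows: new h, k, l from old h, k, l)."""
    parts = operator.replace(" ", "").lower().split(",")
    if len(parts) != 3:
        raise ValueError("operator needs three comma-separated parts")
    matrix = []
    for part in parts:
        terms = re.findall(r"([+-]?)(\d*)([hkl])", part)
        # Reject anything the simple term pattern did not fully consume.
        if not terms or "".join(s + n + a for s, n, a in terms) != part:
            raise ValueError("cannot parse " + repr(part))
        row = [0, 0, 0]
        for sign, coeff, axis in terms:
            value = int(coeff) if coeff else 1
            row["hkl".index(axis)] += -value if sign == "-" else value
        matrix.append(row)
    return matrix


def determinant(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def is_right_handed(operator):
    return determinant(parse_reindex(operator)) > 0


print(is_right_handed("k,h,-l"))  # swap h and k, invert l
print(is_right_handed("k,h,l"))   # swap h and k only
```

Swapping h and k alone inverts the hand (determinant -1); also negating l restores it (determinant +1), which is why "k,h,-l" is an acceptable operator without LEFTHANDED.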

TOLERANCE <LatticeTolerance>

Tolerance in degrees for determination of lattice symmetry [default 2 degrees]. Tolerance is the maximum deviation from the expected angle between two-fold axes in the lattice group, eg for a putative tetragonal lattice where a~=b, the expected angle between the diagonals is 90 degrees, and the deviation   delta = 2 tan^-1(a/b) - 90
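
A short numerical check of the deviation formula above, with the angle expressed in degrees. The function name and the cell lengths are made up for the example.

```python
import math


def diagonal_angle_deviation(a, b):
    """delta = 2 tan^-1(a/b) - 90, in degrees, for cell edges a and b."""
    return 2.0 * math.degrees(math.atan(a / b)) - 90.0


tolerance = 2.0  # default lattice tolerance in degrees

for a, b in [(100.0, 100.0), (100.0, 101.0), (100.0, 110.0)]:
    delta = diagonal_angle_deviation(a, b)
    print(a, b, round(delta, 3), abs(delta) <= tolerance)
```

A 1% difference between a and b gives a deviation of roughly half a degree, well inside the default tolerance, while a 10% difference gives more than 5 degrees and would be rejected.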

ACCEPT <AcceptanceLimit>  <AcceptanceDifference>

Parameters for acceptance criterion. A group is accepted if its score is:
  1. greater than (AcceptanceLimit * Maximum score)  where AcceptanceLimit is set by the ACCEPT command [default 0.9]   or
  2. greater than (Maximum score - AcceptanceDifference) where AcceptanceDifference is set by the ACCEPT command [default 1]
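
The two criteria can be written as a single test. The sketch below applies them with the default parameters to a set of made-up scores; the Laue group names and score values are purely illustrative.

```python
def is_accepted(score, max_score, acceptance_limit=0.9, acceptance_difference=1.0):
    """True if the score passes either ACCEPT criterion."""
    return (score > acceptance_limit * max_score
            or score > max_score - acceptance_difference)


# Made-up scores for candidate Laue groups.
scores = {"P4/mmm": 8.0, "P4/m": 7.5, "Pmmm": 6.5, "P-1": 2.0}
best = max(scores.values())
accepted = [name for name, score in scores.items() if is_accepted(score, best)]
print(accepted)
```

With a maximum score of 8.0 the thresholds are 7.2 and 7.0, so only the groups scoring 8.0 and 7.5 pass in this example.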

Input and output files

HKLIN

The file containing the test dataset.

Mode 1. This must be an unmerged file of intensities eg from Mosflm

Compulsory columns are H, K, L, M/ISYM, BATCH, I, SIGI
Optional columns are IPR, SIGIPR, TIME, XDET, YDET, ROT, WIDTH, MPART, FRACTIONCALC, LP, FLAG, BGPKRATIOS, SCALE, SIGSCALE

If a SCALE column is present it will be applied on input.

Mode 2. This may be unmerged (as above) or merged. Unless a column is specified in the control input, the first column of type J (intensity) or F (amplitude) will be used for comparison with the reference dataset. Amplitudes are squared to intensities on input.

HKLREF

The file containing the reference dataset for Mode 2 (alternative). This must be merged. Unless a column is specified in the control input, the first column of type J (intensity) or F (amplitude) will be used for comparison with the test dataset. Amplitudes are squared to intensities on input.

HKLOUT

In LAUEGROUP mode, the test dataset reindexed in the "best" pointgroup (the Laue group without a centre of symmetry). In ALTERNATIVE mode, the test dataset with the best reindexing, in the spacegroup of the reference dataset. Note that in ALTERNATIVE mode with a merged test dataset (HKLIN), reindexed reflections are not reduced to the asymmetric unit, because reindexing may generate a Bijvoet-related index and, if there are anomalous differences, these would need to be inverted.

Examples

Simple usage, all defaults (mode Lauegroup):

pointless [hklin] <filename.mtz>

With reference dataset (mode alternative):

pointless hklref amph_I.mtz hklin amph_scaled.mtz

With reference dataset and control input (mode alternative):

pointless hklref n20n6c1c2n6x2x14e10e7.mtz \
      hklin cd3_1_F.mtz << eof
resolution 4.0
labref  F_nat20
labin F_cd3_1
eof

Just reindexing:

pointless hklin <infile> hklout <outfile> << eof
reindex k,h,-l
spacegroup P31
eof

Release notes

1.0.0, 1

Systematic absence analysis to choose spacegroup
Correction of various bugs

0.5.0,1,2

Alternative ways of combining Z+ & Z- (Za, Zc), remove combined RMSD printing, more input controls (ORIGINALLATTICE, LAUEGROUP, REINDEX, TOLERANCE, ACCEPT)

0.4.0

HKLOUT output added

0.3.0

Mode Alternative, labin, labref

0.2.0

User input, resolution, Isiglimit, Nonchiral