Using REDCRAFT

REDCRAFT is a suite of tools and scripts for processing RDC data and generating PDB files.

When REDCRAFT is first installed on UNIX or UNIX-like systems, a symbolic link to <REDCRAFT Install Directory>/bin/redcraft is created in /usr/local/bin.

The redcraft binary simply acts as a launcher for the rest of the binaries in the bin/ folder.

Overview of binaries

To run any REDCRAFT binary, use redcraft <binary>. The command redcraft --help lists the available REDCRAFT scripts and binaries.

Examples:

  • stage1 : Stage 1 accepts an RDC file prefix and generates REDCRAFT-compatible .angles and .out files, which are used by stage2 to generate an interactive .pdb file.
  • stage2 : Stage 2 requires no additional arguments and uses the .angles, .out, and .redcat.m* files to generate PDB files. It does, however, require a redcraft.conf config file.

Stage 1

The first stage of structure determination begins by eliminating torsion angles that are incompatible with Ramachandran space and J-coupling. The surviving torsion angles are then ranked by their fitness to the RDC data available from the neighboring peptide planes.

Stage 1 is executed by running redcraft stage1 <RDC prefix> [Ramachandran space] [GLY-Ramachandran space] [RDC RMSD cutoff].

  • <RDC prefix> - required
    • The data for each alignment medium should be in its own file. All files share the same prefix and end in .1, .2, etc. stage1 needs to know the prefix (such as RDC if your file names are RDC.1 and RDC.2).

    stage1 determines the number of RDC files automatically.

  • [Ramachandran space] (For non-GLY residues) - optional
    • 1 (default, most strict)
    • 2 (less strict)
    • 3 (entire space)

    A value of 1 is the most restrictive Ramachandran space. The torsion angle clusters for beta-sheets and alpha-helices do not touch.

  • [GLY-Ramachandran space] - optional
    • 0 (default, most strict)
    • 1 (less strict)
    • all (entire space)

    Glycines are known to have greater freedom of variation because they lack a side chain, so it is important to consider these extra possible torsion angles. A value of 0 uses a less restrictive space common for GLY residues. A value of 1 applies the same restriction as the non-glycine space.

  • [RDC RMSD cutoff] - optional
    • Any decimal value
    • skip

    Stage 1 ranks and sorts each torsion angle combination based on its local fitness to the available RDC data. The user can specify a cutoff fitness beyond which torsion angles are discarded. By default, no values are discarded. The keyword skip tells stage1 not to compute the fitness at all, which greatly speeds up Stage 1. Skipping does not reduce the quality of Stage 2, but may inconvenience further analysis.
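
The argument list above can be assembled and checked before stage1 is launched. The following is a minimal Python sketch, not part of REDCRAFT: the helper names build_stage1_command and count_media_files are hypothetical, and the sketch assumes the optional arguments are positional in the order shown in the usage line. Counting the Prefix.1, Prefix.2, ... files is only an illustration of the naming rule; stage1 performs its own count.

```python
# Hypothetical helpers (not shipped with REDCRAFT) that mirror the documented
# usage: redcraft stage1 <RDC prefix> [Ramachandran space]
#        [GLY-Ramachandran space] [RDC RMSD cutoff]

RAMA_CHOICES = ("1", "2", "3")      # documented non-GLY Ramachandran options
GLY_CHOICES = ("0", "1", "all")     # documented GLY Ramachandran options


def build_stage1_command(prefix, rama="1", gly_rama="0", rmsd_cutoff=None):
    """Return the stage1 command as an argument list.

    Assumption: the optional arguments are positional, so the documented
    defaults are written out explicitly whenever a cutoff is supplied.
    """
    if str(rama) not in RAMA_CHOICES:
        raise ValueError(f"Ramachandran space must be one of {RAMA_CHOICES}")
    if str(gly_rama) not in GLY_CHOICES:
        raise ValueError(f"GLY-Ramachandran space must be one of {GLY_CHOICES}")
    command = ["redcraft", "stage1", prefix, str(rama), str(gly_rama)]
    if rmsd_cutoff is not None:
        if rmsd_cutoff != "skip":
            float(rmsd_cutoff)  # raises ValueError if not a decimal value
        command.append(str(rmsd_cutoff))
    return command


def count_media_files(prefix, filenames):
    """Count consecutive files named prefix.1, prefix.2, ... in a listing."""
    names = set(filenames)
    count = 0
    while f"{prefix}.{count + 1}" in names:
        count += 1
    return count
```

For example, build_stage1_command("RDC", rmsd_cutoff="skip") yields the arguments for redcraft stage1 RDC 1 0 skip, and count_media_files("RDC", os.listdir(".")) (after importing os) reports how many consecutively numbered RDC files are present in the current directory.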

Stage 2

Stage 2 uses the ranked lists of surviving local geometries (from Stage 1) to create and extend the fragment that best fits the experimental data. Stage 2 begins with the starting peptide plane and keeps adding peptide planes until the ending residue has been reached. The list of predicted structures is re-ranked after each new peptide plane is added.

Below is an example of a typical redcraft.conf (used during Stage 2).

The Stage 2 configuration file is in a modified .INI format that allows for comments with #. Closing tags, e.g. [/Run_Settings], are optional.

This file can be generated by running redcraft stage2 --create-new.

Note: The order of the sections may vary when using the GUI.

[Run_Settings]
Run_Type=new
Start_Residue=1
Stop_Residue=20
Media_Count=2 # number of <RDC file prefix>.x files present
Data_Path="."
RDC_File_Prefix=nefRDC
Default_Search_Depth=100
LJ_Threshold=50.0
[/Run_Settings]

[Depth_Search_Settings]
Residue_16=2500
Residue_19=1500
[/Depth_Search_Settings]

[Decimation_Settings]
#Cluster_Sensitivity=1
#Score_Threshold=2.5
#Decimation_Ranges=3, 4-6, 9
#Cluster_Count=100
#Maximum_Number_of_Additional_Structures=100
[/Decimation_Settings]

[OTEstimation]
# syntax for OrderTensorEstimation is S?=Sxx Syy Sxy Sxz Syz
#S1=Sxx Syy 0 0 0
#S2=Sxx2 Syy2 Sxy2 Sxz2 Syz2
#Tolerance=1.0
#Weight=1.0
#Estimation_Range=5-25,40,42
#Dmax
[/OTEstimation]

[Refinement]
script1=./unconstrained.prl
[/Refinement]
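
For scripting around a run, a file in this format can be read with Python's standard configparser module. The sketch below is an illustration only and is not how REDCRAFT itself reads the file: the function name load_redcraft_conf is hypothetical, and it assumes that closing tags such as [/Run_Settings] can be treated as empty sections and ignored, and that surrounding double quotes on values can be stripped.

```python
import configparser


def load_redcraft_conf(text):
    """Parse redcraft.conf-style text into {section: {key: value}}.

    Assumptions (not confirmed by REDCRAFT): closing tags like
    [/Run_Settings] are skipped, '#' starts a comment anywhere on a
    line, and double quotes around a value are not part of the value.
    """
    parser = configparser.ConfigParser(
        comment_prefixes=("#",),
        inline_comment_prefixes=("#",),
        allow_no_value=True,
        interpolation=None,
    )
    parser.optionxform = str  # keep key case, e.g. Media_Count
    parser.read_string(text)

    settings = {}
    for section in parser.sections():
        if section.startswith("/"):
            continue  # optional closing tag, carries no settings
        settings[section] = {
            key: (value or "").strip('"')
            for key, value in parser.items(section)
        }
    return settings


EXAMPLE = """
[Run_Settings]
Run_Type=new
Media_Count=2 # number of <RDC file prefix>.x files present
Data_Path="."
[/Run_Settings]

[Decimation_Settings]
#Cluster_Count=100
[/Decimation_Settings]
"""

conf = load_redcraft_conf(EXAMPLE)
print(conf["Run_Settings"]["Media_Count"])
```

All values come back as strings, so numeric settings such as Media_Count need an explicit int() conversion. Commented-out keys, like those in the [Decimation_Settings] block of the example, are simply absent from the result.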

The first block, [Run_Settings], holds the core settings for your REDCRAFT run. It includes the following parameters:

  • Run_Type {new | continue} :
    • new starts a fresh run, beginning with a single residue at Start_Residue and continuing until Stop_Residue
    • continue allows you to stop a run and pick up where you left off at a later time
  • Start_Residue {numerical, 1 or greater} :
    • This value specifies a starting point for angle files from Stage 1 and RDC data. (Useful for fragmented folding)
  • Stop_Residue {numerical} :
    • This parameter is the last residue number that will have predicted torsion angles. It can be any residue number that is less than or equal to the last residue in the RDC data files.

    REDCRAFT also enables reverse folding, which allows you to swap the start and stop residues. A tutorial demonstrating this feature is available in the tutorial section.

  • Media_Count :
    • This is the number of alignment media created for the protein; we shall call this parameter m. The alignment media must have the names Prefix.1, Prefix.2, etc. They must always be numbered in order from 1 to m. See the section on Data Generation to see how to create these files.
  • Data_Path :
    • Path to the data sets you are using
    • Default is “.” indicating that your data is in the same directory that you plan to run the program from
  • RDC_File_Prefix {string data file prefix} :
    • The data files containing the RDC information do not have to be named RDC.1, RDC.2, etc. This field allows a custom prefix. Stage 2 will expect the data files to be in the format Prefix.1, Prefix.2, etc.
    • Default is RDC (so if you have 3 data sets your file names would be RDC.1, RDC.2 and RDC.3)
  • Default_Search_Depth {numerical, greater than 0} :
    • This parameter is the depth of search, denoted d. When a fragment is extended by adding a peptide plane, the lists of angles are combined into one large list (usually at least 10,000 angle combinations). REDCRAFT keeps the top d angle combinations and eliminates the rest. A larger depth of search gives better results but increases computation time. We recommend a typical search depth of 2000. Deeper search depths may be required when the data are more corrupt or sparse.
    • Default is 100
  • LJ_Threshold | DEPRECATED
    • Lennard-Jones Threshold is not entirely functional and is thus not dependable.
    • Default is 50.0

The second block, [Depth_Search_Settings], allows you to increase the search depth of individual residues. You may include as many individual depths as you would like, one per line.

  • Example usage: Residue_N=M where N is a residue between Start_Residue and Stop_Residue and M is a search depth, typically upper-bounded by 10000.
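
The interplay between Default_Search_Depth and the per-residue Residue_N entries can be sketched as follows. This is a simplified Python illustration and not REDCRAFT's implementation: the function names are hypothetical, candidates are represented as (score, angles) pairs, and a lower score is assumed to mean a better fit to the RDC data.

```python
import heapq


def overrides_from_section(section):
    """Turn {'Residue_16': '2500'} into {16: 2500}."""
    overrides = {}
    for key, value in section.items():
        if key.startswith("Residue_"):
            overrides[int(key[len("Residue_"):])] = int(value)
    return overrides


def depth_for_residue(residue, default_depth, overrides):
    """Use the per-residue depth if one is set, else the default depth."""
    return overrides.get(residue, default_depth)


def keep_top(candidates, depth):
    """Keep the `depth` best (score, angles) pairs, best first.

    Assumption: lower scores are better, as with an RDC RMSD.
    """
    return heapq.nsmallest(depth, candidates, key=lambda c: c[0])


# Values taken from the example redcraft.conf above.
overrides = overrides_from_section({"Residue_16": "2500", "Residue_19": "1500"})
print(depth_for_residue(16, 100, overrides))  # depth used at residue 16
print(depth_for_residue(5, 100, overrides))   # falls back to the default
```

With the example configuration, residues 16 and 19 are searched at depths 2500 and 1500, while every other residue uses the default depth of 100.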

The third block, [Decimation_Settings], allows for the preservation of structures that have been rejected by your search depth. These structures are clustered and passed forward along with the structures that did meet the search criteria.

  • Cluster_Sensitivity {numerical, 1 or greater} :
    • This sets the number of structures in each cluster. At 1, every structure is its own cluster, which effectively means no clustering at all. Cannot be used with Cluster_Count.
  • Score_Threshold {numerical} :
    • This value acts as the cutoff for accepting additional structures through decimation. The value entered here is a percent. REDCRAFT calculates the exact score threshold as X*(1+percent), where X is the score of the previous worst structure considered and percent is the value entered in the config file. For example, if the last worst score is 2.00 and the score threshold is 50, corresponding to 50%, then decimation will cluster structures with an RDC RMSD up to 3.00. In general, 50% is much too high for most proteins.
  • Decimation_Ranges :
    • This controls which residues decimation is performed on. Separate entries with ‘,’ and indicate ranges with a ‘-’. 3, 4-6, 9 translates to residues 3, 4, 5, 6, and 9.
  • Cluster_Count {numerical} :
    • This feature tries to make the total number of clusters approximately the value entered. The actual number of clusters will vary. Cannot be used with Cluster_Sensitivity
  • Maximum_Number_of_Additional_Structures {numerical} :
    • This caps the number of structures added to the .out files. Decimation can get memory intensive, so this feature is a failsafe. The total number of structures in the .out file is less than or equal to your search depth plus this value.
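
The arithmetic behind these settings can be checked with a short Python sketch. The function names are hypothetical and the code only restates the rules described above; it is not taken from REDCRAFT.

```python
def score_cutoff(worst_kept_score, threshold_percent):
    """Score_Threshold rule: X * (1 + percent), with percent given as e.g. 50."""
    return worst_kept_score * (1 + threshold_percent / 100.0)


def parse_decimation_ranges(text):
    """Expand a Decimation_Ranges value such as '3, 4-6, 9' into residues."""
    residues = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            residues.extend(range(first, last + 1))
        else:
            residues.append(int(part))
    return residues


def max_structures_in_out_file(search_depth, max_additional):
    """Upper bound on .out file size: search depth plus the decimation cap."""
    return search_depth + max_additional


print(score_cutoff(2.00, 50))                 # the worked example: 3.0
print(parse_decimation_ranges("3, 4-6, 9"))   # [3, 4, 5, 6, 9]
print(max_structures_in_out_file(100, 100))   # 200
```

The first call reproduces the example from the Score_Threshold description: a worst score of 2.00 with a 50% threshold gives a cutoff of 3.00.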

The fourth block is [OTEstimation]. It is not currently functional.

The fifth block, [Refinement], can run a number of different scripts to further refine the output of REDCRAFT. More than one script can be used at the same time by adding script2=…, script3=…, and so on.