predictiveinfo is a program within the sortseq_tools package that computes the mutual information between the predictions of a linear energy matrix model and a data set. Models that describe the data more accurately yield a higher mutual information value.
After you install `sortseq_tools`_, this program will be available to run at the command line.
usage: sortseq predictiveinfo [-h] [-ds DATASET]
[-expt {None,sortseq,selex,dms,mpra}] [-s START]
[-e END] [-m M] [-o OUT]
| -ds, --dataset | Undocumented |
| -expt, --exptype | Undocumented. Possible choices: None, sortseq, selex, dms, mpra |
| -s=0, --start=0 | Position to start your analyzed region |
| -e, --end | Position to end your analyzed region |
| -m, --m | Model file, otherwise input through the standard input. |
| -o, --out | Undocumented |
The input table should be a sorted library data set; the model is specified with the -m flag. Use the --start and --end flags to specify the region of the data set that the model corresponds to.
Example Input Table:
seq ct_0 ct_1 ...
AGTT 20 13
CCTA 35 40
...
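As a rough illustration of how a linear energy matrix model applies to the region selected with the start and end positions, the sketch below sums one matrix entry per position of the region. The function name `sequence_energy`, the `A, C, G, T` column order, the toy matrix values, and the Python-slice convention (end position excluded) are all assumptions made for this example; they are not taken from sortseq_tools.

```python
import numpy as np

BASES = "ACGT"  # assumed column order of the matrix (hypothetical)

def sequence_energy(seq, matrix, start=0, end=None):
    """Sum the matrix entries for the bases in seq[start:end].

    matrix has one row per position in the region and one column per base.
    """
    region = seq[start:end]
    if len(region) != matrix.shape[0]:
        raise ValueError("region length must equal the number of matrix rows")
    cols = [BASES.index(base) for base in region]
    return float(matrix[np.arange(len(region)), cols].sum())

# Hypothetical 2-position matrix (rows: positions, columns: A, C, G, T).
matrix = np.array([[0.0, 1.0, 2.0, 3.0],
                   [0.5, 0.0, 0.0, 1.5]])

# Score positions 1 and 2 ("GT") of the first sequence in the example table.
energy = sequence_energy("AGTT", matrix, start=1, end=3)
```

Here the region length must match the number of matrix rows, which is why the start and end positions have to describe the stretch of sequence the model was built for.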
Example Output Table:
info
.94
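To give a feel for what the reported value measures, here is a minimal sketch of a plug-in mutual information estimate, in bits, between model predictions and sorting bins. It assumes that each sequence already has a predicted energy, groups sequences into equal-sized energy groups by rank, and tallies the `ct_0`, `ct_1`, ... counts per group. The estimator used by predictiveinfo is not described in this document and may differ; the function names and the grouping scheme here are hypothetical.

```python
import numpy as np

def mutual_information_bits(joint_counts):
    """Plug-in mutual information (in bits) from a 2-D table of counts."""
    joint = np.asarray(joint_counts, dtype=float)
    p = joint / joint.sum()                # joint distribution
    px = p.sum(axis=1, keepdims=True)      # marginal over rows
    py = p.sum(axis=0, keepdims=True)      # marginal over columns
    nonzero = p > 0                        # skip empty cells (0 * log 0 = 0)
    return float((p[nonzero] * np.log2(p[nonzero] / (px @ py)[nonzero])).sum())

def predictive_info_sketch(energies, bin_counts, n_energy_groups=2):
    """Estimate I(energy group; sort bin) from per-sequence energies and counts."""
    energies = np.asarray(energies, dtype=float)
    bin_counts = np.asarray(bin_counts, dtype=float)
    # Rank the sequences by energy and split the ranks into equal-sized groups.
    ranks = np.argsort(np.argsort(energies))
    groups = ranks * n_energy_groups // len(energies)
    # Accumulate the counts of every sequence into its energy group.
    joint = np.zeros((n_energy_groups, bin_counts.shape[1]))
    for group, counts in zip(groups, bin_counts):
        joint[group] += counts
    return mutual_information_bits(joint)

# Toy data: low-energy sequences fall only in bin 0, high-energy ones in bin 1.
informative = predictive_info_sketch(
    [0.0, 1.0, 2.0, 3.0], [[10, 0], [10, 0], [0, 10], [0, 10]])
# Toy data: counts are independent of energy.
uninformative = predictive_info_sketch(
    [0.0, 1.0, 2.0, 3.0], [[5, 5], [5, 5], [5, 5], [5, 5]])
```

With two equally populated bins, a model whose energies separate the bins perfectly reaches the maximum of one bit, while a model whose energies are unrelated to the bins gives zero.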
Example command to run the analysis:
sortseq predictiveinfo -ds my_dataset.txt -m my_linear_matrix_model.txt -s 5 -e 20