profile_info

Overview

profile_info is a program within the sortseq_tools package which calculates the mutual information between base identity at a given position and expression for each position in the given data set.

After you install `sortseq_tools`_, this program will be available to run at the command line.

Command-line usage

usage: sortseq profile_info [-h] [-s START] [-i I] [-e END]
                            [-t {dna,rna,protein}] [-o OUT]
Options:
-s=0, --start=0
 Position to start your analyzed region
-i, --i Input file, otherwise input through the standard input.
-e, --end Position to end your analyzed region
-t=dna, --type=dna
 

Undocumented

Possible choices: dna, rna, protein

-o, --out Undocumented

Example Input and Output

The input to the function must be a sorted library a column for sequences and columns of counts for each bin. For selection experiments, ct_0 should label the pre-selection library and ct_1 should be the post selection library. For MPRA experiments, ct_0 should label the sequence library counts, and ct_1 should label the mRNA counts.

Example input table:

seq       ct_0    ct_1     ct_2...
ACATT     1       4        3
GGATT     2       5        5
...

Example output table:

pos    info    info_err
0      .02     .004
1      .04     .004
...

The mutual information is given in bits.

An example command to run this analysis is:

sortseq profile_info -i sorted_library.txt -s 20 -e 80 -o info_profile.txt

Table Of Contents

Previous topic

profile_enrichment

Next topic

fromto_mutrate

This Page