DeepRibo: browse model predictions
S. typhimurium (3,474 positive predictions)
E. coli (2,999 positive predictions)
C. crescentus (2,203 positive predictions)
M. smegmatis (4,607 positive predictions)
B. subtilis (2,779 positive predictions)
S. coelicolor (1,359 positive predictions)
S. aureus (852 positive predictions)
DISCLAIMER: For each organism, a model was trained on the ribosome profiling data from the other five datasets and then used to make predictions on the listed dataset.
Because the model outputs probability scores, the cut-off separating positive from negative predictions is chosen so that the number of positive predictions equals the number of positively annotated genes in the test set. These predictions are visualised using the GWIPS-viz browser. More information is given in the full article and the GitHub repository.
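The cut-off rule described above can be illustrated with a minimal sketch. This is not the DeepRibo code: the function name `cutoff_for_equal_positives`, the scores, and the labels are hypothetical and chosen only to show how a threshold can be picked so that the number of positive calls matches the number of annotated positives.

```python
def cutoff_for_equal_positives(scores, n_annotated):
    """Return the lowest score among the n_annotated highest-scoring candidates.

    Calling every candidate with score >= cutoff positive then yields
    n_annotated positives (more only if scores tie at the cutoff).
    """
    if n_annotated <= 0 or not scores:
        return float("inf")  # nothing is called positive
    ranked = sorted(scores, reverse=True)
    k = min(n_annotated, len(ranked))
    return ranked[k - 1]


# Hypothetical probability scores and annotation labels for six candidate ORFs.
scores = [0.91, 0.15, 0.78, 0.40, 0.66, 0.05]
labels = [1, 0, 1, 0, 0, 1]

cutoff = cutoff_for_equal_positives(scores, sum(labels))
predicted = [s >= cutoff for s in scores]
```

With these made-up values the cut-off lands on the third-highest score, so three candidates are called positive, matching the three annotated positives. Note that the positives need not be the same candidates as the annotated ones; only the counts are matched.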
Multiple tracks are visualised:
- ribo-seq signal: The ribo-seq signal which the model uses to make predictions.
- all_TIS_samples: These list the Translation Initiation Sites (TISs) of all Open Reading Frames (ORFs) on which the model has made predictions. Only the TISs are shown, because displaying the full input ORFs would obscure the other tracks.
- ann_TIS_samples: These list the positively labeled TISs.
- mutual_TISs: These list the TISs of the positive predictions given by DeepRibo in agreement with the putative label given by the NCBI assembly (single start site setting).
- mutual_ORFs: These list the ORFs of the positive predictions given by DeepRibo in agreement with the putative label given by the NCBI assembly (single start site setting).
- DeepRibo_TISs: These list the TISs of the positive predictions given by DeepRibo not in agreement with the putative label given by the NCBI assembly (single start site setting).
- DeepRibo_ORFs: These list the ORFs of the positive predictions given by DeepRibo not in agreement with the putative label given by the NCBI assembly (single start site setting).
- MS_predictions: These list the predicted TISs in a multiple start site setting. The top k ranked predictions are given a positive label, where k equals the number of positive labels present in the NCBI assembly.
- Mass_spectrometry: (E. coli only) These list the aligned peptide fragments from the mass spectrometry data.
- Edman_Sequencing: (E. coli only) These list the aligned peptide fragments from the verified protein set (hosted by EcoGene).
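The relationship between the mutual_* and DeepRibo_* tracks listed above can be sketched as a simple set partition. This is an illustrative assumption about how the tracks relate, not the DeepRibo implementation: the function `split_tracks`, the `(strand, tis, stop)` tuple representation, and all coordinates are hypothetical.

```python
def split_tracks(predicted_positive, annotated):
    """Partition positive predictions by agreement with the annotation.

    Both arguments are sets of (strand, tis_position, stop_position) tuples.
    Returns (mutual, deepribo_only): predictions that match an annotated ORF,
    and predictions that do not.
    """
    mutual = predicted_positive & annotated
    deepribo_only = predicted_positive - annotated
    return mutual, deepribo_only


# Hypothetical ORF coordinates, for illustration only.
annotated = {("+", 100, 400), ("-", 900, 600), ("+", 1500, 2100)}
predicted_positive = {("+", 100, 400), ("+", 1440, 2100), ("-", 900, 600)}

mutual, deepribo_only = split_tracks(predicted_positive, annotated)

# The *_TISs tracks would then hold only the start coordinates.
mutual_tis = sorted(tis for _, tis, _ in mutual)
deepribo_tis = sorted(tis for _, tis, _ in deepribo_only)
```

In this toy example the prediction starting at 1440 shares a stop site with an annotated ORF but uses a different start, so it falls in the DeepRibo-only group.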