Title: | 'rifi' analyses data from rifampicin time series created by microarray or RNAseq |
---|---|
Description: | 'rifi' analyses data from rifampicin time series created by microarray or RNAseq. 'rifi' is a transcriptome data analysis tool for the holistic identification of transcription and decay associated processes. The decay constants and the delay of the onset of decay are fitted for each probe/bin. Subsequently, probes/bins of equal properties are combined into segments by dynamic programming, independent of an existing genome annotation. This allows the detection of transcript segments of different stability or of transcriptional events within one annotated gene. In addition to the classic decay constant/half-life analysis, 'rifi' detects processing sites, transcription pausing sites, internal transcription start sites in operons and sites of partial transcription termination in operons, identifies areas of likely transcriptional interference by the collision mechanism and gives an estimate of the transcription velocity. All data are integrated to give an estimate of continuous transcriptional units, i.e. operons. Comprehensive output tables and visualizations of the full genome result and the individual fits for all probes/bins are produced. |
Authors: | Loubna Youssar [aut, ctb], Walja Wanney [aut, ctb], Jens Georg [aut, cre] |
Maintainer: | Jens Georg <[email protected]> |
License: | GPL-3 + file LICENSE |
Version: | 1.11.0 |
Built: | 2024-10-31 04:30:05 UTC |
Source: | https://github.com/bioc/rifi |
apply_ancova checks the variances between 2 segments showing either pausing site (ps) or internal starting site (ITSS) independently.
apply_ancova is a statistical test that uses ANCOVA to check whether the slopes of fragments showing ps or ITSS events differ significantly. ANCOVA is applied when the data contain independent variables, dependent variables and covariates. In this case, the segments are the independent variable, position is the dependent variable and the delay is the covariate.
apply_ancova(inp)
inp |
SummarizedExperiment: the input data frame with correct format. |
the SummarizedExperiment with the columns regarding statistics:
ID: |
The bin/probe specific ID. |
position: |
The bin/probe specific position. |
strand: |
The bin/probe specific strand. |
intensity: |
The relative intensity at time point 0. |
probe_TI: |
An internal value to determine which fitting model is applied. |
flag: |
Information on which fitting model is applied. |
position_segment: |
The position based segment. |
delay: |
The delay value of the bin/probe. |
half_life: |
The half-life of the bin/probe. |
TI_termination_factor: |
String, the factor of TI fragment. |
delay_fragment: |
The delay fragment the bin belongs to. |
velocity_fragment: |
The velocity value of the respective delay fragment. |
intercept: |
The intercept of the fit through the respective delay fragment. |
slope: |
The slope of the fit through the respective delay fragment. |
HL_fragment: |
The half-life fragment the bin belongs to. |
HL_mean_fragment: |
The mean half-life value of the respective half-life fragment. |
intensity_fragment: |
The intensity fragment the bin belongs to. |
intensity_mean_fragment: |
The mean intensity value of the respective intensity fragment. |
TU: |
The overarching transcription unit. |
TI_termination_fragment: |
The TI fragment the bin belongs to. |
TI_mean_termination_factor: |
The mean termination factor of the respective TI fragment. |
seg_ID: |
The combined ID of the fragment. |
pausing_site: |
presence of pausing site indicated by +/-. |
iTSS_I: |
presence of iTSS_I indicated by +/-. |
ps_ts_fragment: |
The fragments involved in pausing site or iTSS_I. |
event_duration: |
Integer, the duration between two delay fragments. |
event_ps_itss_p_value_Ttest: |
p_value of pausing site or iTSS_I. |
p_value_slope: |
Integer, the p_value added to the inp |
delay_frg_slope: |
Integer, the slope value of the fit through the respective delay fragment |
velocity_ratio: |
Integer, the ratio value of velocity from 2 delay fragments |
data(stats_minimal)
apply_ancova(inp = stats_minimal)
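The idea behind the slope test can be illustrated with base R alone. The following is a minimal sketch, not the code used inside apply_ancova: the toy data, the segment labels D_1 and D_2 and the variable names are invented for illustration, and the roles of the variables simply follow the description above (position as response, delay as covariate, segment as grouping factor).

```r
# Toy data (invented): two neighbouring delay fragments with different slopes.
set.seed(1)
position <- seq(100, 2000, by = 100)
segment <- factor(rep(c("D_1", "D_2"), each = 10))
delay <- ifelse(segment == "D_1",
                0.002 * position,
                2 + 0.004 * (position - 1000)) + rnorm(20, sd = 0.05)
toy <- data.frame(position, delay, segment)

# ANCOVA written as a linear model with an interaction term.
fit <- lm(position ~ delay * segment, data = toy)
tab <- anova(fit)

# The interaction row tests whether the slope differs between the segments.
p_value_slope <- tab["delay:segment", "Pr(>F)"]
p_value_slope
```

A small p-value for the interaction term suggests that the two fragments do not share a common slope.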
apply_event_position extracts event time duration for pausing site or iTSS
apply_event_position is a short version of the apply_Ttest_delay function that extracts the event time duration for pausing sites or iTSS. It adds a new column with the duration.
apply_event_position(inp)
inp |
SummarizedExperiment: the input data frame with correct format. |
The SummarizedExperiment with the columns regarding statistics:
The bin/probe specific ID.
The bin/probe specific position.
The bin/probe specific strand.
The relative intensity at time point 0.
An internal value to determine which fitting model is applied.
Information on which fitting model is applied.
The position based segment.
The delay value of the bin/probe.
The half-life of the bin/probe.
String, the factor of TI fragment.
The delay fragment the bin belongs to.
The velocity value of the respective delay fragment.
The intercept of the fit through the respective delay fragment.
The slope of the fit through the respective delay fragment.
The half-life fragment the bin belongs to.
The mean half-life value of the respective half-life fragment.
The intensity fragment the bin belongs to.
The mean intensity value of the respective intensity fragment.
The overarching transcription unit.
The TI fragment the bin belongs to.
The mean termination factor of the respective TI fragment.
The combined ID of the fragment.
presence of pausing site indicated by +/-.
presence of iTSS_I indicated by +/-.
The fragments involved in pausing site or iTSS_I.
Integer, the duration between two delay fragments.
p_value of pausing site or iTSS_I.
Integer, the p_value added to the inp.
Integer, the slope value of the fit through the respective delay fragment.
Integer, the ratio value of velocity from 2 delay fragments.
Integer, position of the event added to the input.
data(stats_minimal)
apply_event_position(inp = stats_minimal)
apply_manova checks if the ratio of the HL ratio to the intensity ratio is statistically significant.
apply_manova compares the variance between the two fold changes, HL and intensity, within the same TU ((half-life frgA / half-life frgB) / (intensity frgA / intensity frgB)). Because an HL fragment can cover two intensity fragments, the function first sets the fragment borders and then uses manova_function. MANOVA checks the variance between 2 segments (the independent variable) and two dependent variables (HL and intensity).
apply_manova(inp)
inp |
SummarizedExperiment: the input data frame with correct format. |
The probe data frame with the columns regarding statistics:
The bin/probe specific ID.
The bin/probe specific position.
The bin/probe specific strand.
The relative intensity at time point 0.
An internal value to determine which fitting model is applied.
Information on which fitting model is applied.
The position based segment.
The delay value of the bin/probe.
The half-life of the bin/probe.
String, the factor of TI fragment.
The delay fragment the bin belongs to.
The velocity value of the respective delay fragment.
The intercept of the fit through the respective delay fragment.
The slope of the fit through the respective delay fragment.
The half-life fragment the bin belongs to.
The mean half-life value of the respective half-life fragment.
The intensity fragment the bin belongs to.
The mean intensity value of the respective intensity fragment.
The overarching transcription unit.
The TI fragment the bin belongs to.
The mean termination factor of the respective TI fragment.
The combined ID of the fragment.
presence of pausing site indicated by +/-.
presence of iTSS_I indicated by +/-.
The fragments involved in pausing site or iTSS_I.
Integer, the duration between two delay fragments.
p_value of pausing site or iTSS_I.
Integer, the p_value added to the inp.
Integer, the slope value of the fit through the respective delay fragment.
Integer, the ratio value of velocity from 2 delay fragments.
Integer, position of the event added to the input.
Integer, the fold change value of 2 HL fragments.
String, the fragments corresponding to HL fold change.
Integer, the p_value added to the input of 2 HL fragments.
Integer, the fold change value of 2 intensity fragments.
String, the fragments corresponding to intensity fold change.
Integer, the p_value added to the input of 2 intensity fragments.
Integer, the value corresponding to synthesis rate.
String, the event assigned by synthesis rate either Termination or iTSS.
Integer, the value corresponding to HL and intensity fold change.
String, the fragments corresponding to intensity and HL fold change.
Integer, the fold change of half-life / fold change of intensity; the position of the half-life fragment is adapted to the intensity fragment.
Integer, the p_value added to the input.
data(stats_minimal)
apply_manova(inp = stats_minimal)
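As a rough illustration of the test described above, the sketch below runs a MANOVA in base R on invented data, with two segments as the independent variable and half-life and intensity as the two dependent variables. The fragment labels frg_A and frg_B, the toy values and the helper variable names are assumptions made for this example; the internals of apply_manova and manova_function may differ.

```r
# Toy data (invented): two neighbouring fragments, six bins each.
set.seed(2)
toy <- data.frame(
  segment   = factor(rep(c("frg_A", "frg_B"), each = 6)),
  half_life = c(rnorm(6, mean = 2, sd = 0.2), rnorm(6, mean = 4, sd = 0.2)),
  intensity = c(rnorm(6, mean = 1000, sd = 50), rnorm(6, mean = 400, sd = 50))
)

# Fold changes of the fragment means (frg_A relative to frg_B).
is_a <- toy$segment == "frg_A"
fc_hl  <- mean(toy$half_life[is_a]) / mean(toy$half_life[!is_a])
fc_int <- mean(toy$intensity[is_a]) / mean(toy$intensity[!is_a])
fc_hl_int <- fc_hl / fc_int

# MANOVA: do half-life and intensity jointly differ between the segments?
fit <- manova(cbind(half_life, intensity) ~ segment, data = toy)
p_value <- summary(fit)$stats["segment", "Pr(>F)"]
p_value
```

Here summary() uses the Pillai trace, which is the default of summary.manova in R.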
apply_t_test uses the statistical t-test to check whether the fold changes of half-life (HL) fragments and of intensity fragments, respectively, are significant.
apply_t_test compares the means of two neighboring fragments within the same TU to check if the fold change is significant. Fragments whose distance exceeds the threshold are not subjected to the t-test. Dataframes with fewer than 3 rows are excluded.
apply_t_test(inp, threshold = 300)
inp |
SummarizedExperiment: the input data frame with correct format. |
threshold |
integer: the maximum distance between two neighboring fragments for them to be subjected to the t-test (default 300). |
The functions used are:
fragment_function: checks the number of fragments inside a TU; TUs with fewer than 2 fragments are excluded, otherwise the fragments are gathered for analysis.
t_test_function: excludes dataframes with fewer than 3 rows, calculates the fold change and applies the t-test, assigns the fragment names and ratio, and adds columns with the corresponding p_values.
the SummarizedExperiment with the columns regarding statistics:
The bin/probe specific ID.
The bin/probe specific position.
The bin/probe specific strand.
The relative intensity at time point 0.
An internal value to determine which fitting model is applied.
Information on which fitting model is applied.
The position based segment.
The delay value of the bin/probe.
The half-life of the bin/probe.
String, the factor of TI fragment.
The delay fragment the bin belongs to.
The velocity value of the respective delay fragment.
The intercept of the fit through the respective delay fragment.
The slope of the fit through the respective delay fragment.
The half-life fragment the bin belongs to.
The mean half-life value of the respective half-life fragment.
The intensity fragment the bin belongs to.
The mean intensity value of the respective intensity fragment.
The overarching transcription unit.
The TI fragment the bin belongs to.
The mean termination factor of the respective TI fragment.
The combined ID of the fragment.
presence of pausing site indicated by +/-.
presence of iTSS_I indicated by +/-.
The fragments involved in pausing site or iTSS_I.
Integer, the duration between two delay fragments.
p_value of pausing site or iTSS_I.
Integer, the p_value added to the inp.
Integer, the slope value of the fit through the respective delay fragment.
Integer, the ratio value of velocity from 2 delay fragments.
Integer, position of the event added to the input.
Integer, the fold change value of 2 HL fragments.
String, the fragments corresponding to HL fold change.
Integer, the p_value added to the input of 2 HL fragments.
Integer, the fold change value of 2 intensity fragments.
String, the fragments corresponding to intensity fold change.
Integer, the p_value added to the input of 2 intensity fragments.
data(stats_minimal)
apply_t_test(inp = stats_minimal, threshold = 300)
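The steps described for apply_t_test (distance filter, minimum number of rows, fold change, t-test) can be sketched in base R as follows. This is an illustration on invented data, not the package code: compare_fragments is a hypothetical helper, and applying the 3-row minimum to each fragment separately is an assumption about how the rule is meant.

```r
# Toy data (invented): half-lives of two neighbouring HL fragments in one TU.
frag_a <- data.frame(position = c(100, 200, 300, 400),
                     half_life = c(2.1, 2.3, 1.9, 2.2))
frag_b <- data.frame(position = c(500, 600, 700, 800),
                     half_life = c(4.0, 4.4, 3.8, 4.1))

# Hypothetical helper: returns NULL when the pair is not tested.
compare_fragments <- function(a, b, threshold = 300) {
  gap <- min(b$position) - max(a$position)
  if (gap > threshold || nrow(a) < 3 || nrow(b) < 3) {
    return(NULL)
  }
  list(
    fold_change = mean(b$half_life) / mean(a$half_life),
    p_value = t.test(a$half_life, b$half_life)$p.value
  )
}

res <- compare_fragments(frag_a, frag_b, threshold = 300)
res$fold_change
res$p_value

# A pair further apart than the threshold is skipped.
far_b <- transform(frag_b, position = position + 1000)
skipped <- compare_fragments(frag_a, far_b, threshold = 300)
```

Returning NULL for skipped pairs keeps the filter and the test in one place; a real pipeline would more likely record NA in the p-value column.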
apply_t_test_ti compares the mean of two neighboring TI fragments within the same TU.
apply_t_test_ti uses the statistical t-test to check if two neighboring TI fragments differ significantly.
apply_t_test_ti(inp)
inp |
SummarizedExperiment: the input data frame with correct format. |
the SummarizedExperiment with the columns regarding statistics:
The bin/probe specific ID.
The bin/probe specific position.
The bin/probe specific strand.
The relative intensity at time point 0.
An internal value to determine which fitting model is applied.
Information on which fitting model is applied.
The position based segment.
The delay value of the bin/probe.
The half-life of the bin/probe.
String, the factor of TI fragment.
The delay fragment the bin belongs to.
The velocity value of the respective delay fragment.
The intercept of the fit through the respective delay fragment.
The slope of the fit through the respective delay fragment.
The half-life fragment the bin belongs to.
The mean half-life value of the respective half-life fragment.
The intensity fragment the bin belongs to.
The mean intensity value of the respective intensity fragment.
The overarching transcription unit.
The TI fragment the bin belongs to.
The mean termination factor of the respective TI fragment.
The combined ID of the fragment.
presence of pausing site indicated by +/-.
presence of iTSS_I indicated by +/-.
The fragments involved in pausing site or iTSS_I.
Integer, the duration between two delay fragments.
p_value of pausing site or iTSS_I.
Integer, the p_value added to the inp.
Integer, the slope value of the fit through the respective delay fragment.
Integer, the ratio value of velocity from 2 delay fragments.
Integer, position of the event added to the input.
Integer, the fold change value of 2 HL fragments.
String, the fragments corresponding to HL fold change.
Integer, the p_value added to the input of 2 HL fragments.
Integer, the fold change value of 2 intensity fragments.
String, the fragments corresponding to intensity fold change.
Integer, the p_value added to the input of 2 intensity fragments.
Integer, the value corresponding to synthesis rate.
String, the event assigned by synthesis rate either Termination or iTSS.
Integer, the value corresponding to HL and intensity fold change.
String, the fragments corresponding to intensity and HL fold change.
Integer, the fold change of half-life / fold change of intensity; the position of the half-life fragment is adapted to the intensity fragment.
Integer, the p_value added to the input.
Integer, the p_value added to the input.
String, the fragments subjected to statistical test.
data(stats_minimal)
apply_t_test_ti(inp = stats_minimal)
apply_Ttest_delay checks the significance of the point between 2 segments showing pausing site (ps) and internal starting site (ITSS) independently
apply_Ttest_delay is a statistical test to check the significance of the point between 2 segments showing pausing site (ps) and internal starting site (ITSS) independently. The function uses a t-test. The last point of the first segment and the first point of the second segment are selected, and each is added to the residuals of its model. The resulting sums are subjected to the t-test.
apply_Ttest_delay(inp)
inp |
SummarizedExperiment: the input data frame with correct format. |
the SummarizedExperiment with the columns regarding statistics:
The bin/probe specific ID.
The bin/probe specific position.
The bin/probe specific strand.
The relative intensity at time point 0.
An internal value to determine which fitting model is applied.
Information on which fitting model is applied.
The position based segment.
The delay value of the bin/probe.
The half-life of the bin/probe.
String, the factor of TI fragment.
The delay fragment the bin belongs to.
The velocity value of the respective delay fragment.
The intercept of the fit through the respective delay fragment.
The slope of the fit through the respective delay fragment.
The half-life fragment the bin belongs to.
The mean half-life value of the respective half-life fragment.
The intensity fragment the bin belongs to.
The mean intensity value of the respective intensity fragment.
The overarching transcription unit.
The TI fragment the bin belongs to.
The mean termination factor of the respective TI fragment.
The combined ID of the fragment.
presence of pausing site indicated by +/-.
presence of iTSS_I indicated by +/-.
The fragments involved in pausing site or iTSS_I.
Integer, the duration between two delay fragments.
p_value of pausing site or iTSS_I.
data(stats_minimal)
apply_Ttest_delay(inp = stats_minimal)
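One possible reading of the procedure described for apply_Ttest_delay is sketched below in base R. It is an interpretation on invented data, not the package implementation: each delay fragment gets its own linear fit, the fitted delay at the boundary point of each fragment is combined with that fit's residuals, and the two resulting samples are compared with a t-test. All object names are hypothetical.

```r
# Toy data (invented): the delay jumps between two neighbouring fragments.
set.seed(42)
seg1 <- data.frame(position = seq(100, 1000, by = 100))
seg1$delay <- 0.001 * seg1$position + rnorm(10, sd = 0.02)
seg2 <- data.frame(position = seq(1100, 2000, by = 100))
seg2$delay <- 1.5 + 0.001 * (seg2$position - 1000) + rnorm(10, sd = 0.02)

# One linear fit of delay against position per fragment.
fit1 <- lm(delay ~ position, data = seg1)
fit2 <- lm(delay ~ position, data = seg2)

# Fitted delay at the last point of fragment 1 and the first point of fragment 2.
end_1   <- predict(fit1, newdata = data.frame(position = max(seg1$position)))
start_2 <- predict(fit2, newdata = data.frame(position = min(seg2$position)))

# Boundary value plus residuals gives one sample per fragment.
sample_1 <- end_1 + residuals(fit1)
sample_2 <- start_2 + residuals(fit2)

event_duration <- unname(start_2 - end_1)
p_value <- t.test(sample_1, sample_2)$p.value
```

In this toy case the delay increases abruptly at the boundary, which is the pattern expected for a pausing site; a drop in delay would instead point towards an internal start site.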
check_input reviews the input given by the user
check_input stops the operation if the input data frame has severe faults. Less severe faults lead to the removal of the wrong IDs and a warning describing the problem. The SummarizedExperiment colData must have the column "timepoint", with timepoints convertible to numeric and containing the timepoint 0. If replicates are used, the corresponding column in colData must be called "replicate", and the replicates must be convertible to numeric. In the rowRanges, IDs can optionally be given as character (except ",", "|", "_"), but they need to refer to a unique position/strand combination. Strand information needs to be given. The relative intensity in the assay must be numeric. The relative intensity for the first time point cannot be 0 or NA.
check_input(inp, thrsh = 0)
inp |
SummarizedExperiment: the input data frame with correct format. |
thrsh |
numeric: the minimal allowed intensity for time point "0". |
the SummarizedExperiment object: checked, and with position, ID and filtration added to the rowRanges.
data(example_input_minimal)
check_input(inp = example_input_minimal, thrsh = 0)
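Some of the requirements listed for check_input can be verified by hand before building the SummarizedExperiment. The sketch below uses plain data frames and vectors in base R; check_sketch is a hypothetical helper written for this example, it covers only a subset of the rules above, and it is not how check_input itself is implemented.

```r
# Toy sample annotation (invented), shaped like the colData described above.
col_data <- data.frame(
  timepoint = c("0", "0", "2", "2", "4", "4"),
  replicate = c("1", "2", "1", "2", "1", "2")
)
ids <- c("probe1", "probe2", "bad_id")
intensity_t0 <- c(120, 0, NA)

# Hypothetical helper covering a few of the documented requirements.
check_sketch <- function(col_data, ids, intensity_t0) {
  tp <- suppressWarnings(as.numeric(col_data$timepoint))
  rp <- suppressWarnings(as.numeric(col_data$replicate))
  list(
    timepoint_numeric = !anyNA(tp),
    has_time_zero     = any(tp == 0, na.rm = TRUE),
    replicate_numeric = !anyNA(rp),
    # IDs must not contain ",", "|" or "_".
    bad_ids           = ids[grepl("[,|_]", ids)],
    # The intensity at the first time point must not be 0 or NA.
    bad_t0            = which(is.na(intensity_t0) | intensity_t0 == 0)
  )
}

res <- check_sketch(col_data, ids, intensity_t0)
res$bad_ids
res$bad_t0
```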
dataframe_summary creates two tables relating gene annotation to fragments
dataframe_summary creates two summary tables of segments and their half-lives. The first table is bin/probe based and the second one is intensity fragment based. The first table contains the columns feature_type, gene, locus_tag, position, strand, TU, delay_fragment, HL_fragment, half_life, intensity_fragment, intensity and velocity. The second table is similar to the first one but in compact form: it contains the same columns, except that the position is given as separate start and end positions.
dataframe_summary(data, input)
data |
SummarizedExperiment: the input data frame with correct format. |
input |
dataframe: dataframe from event_dataframe function. |
bin_df: |
all information regarding bins:
|
frag_df: |
all information regarding fragments:
|
data(stats_minimal)
data(res_minimal)
dataframe_summary(data = stats_minimal, input = res_minimal)
dataframe_summary_events creates one table with all events between the segments
dataframe_summary_events creates one table with the following columns: event, features, p_value, event_position, event_duration, position, region, gene, locus_tag, strand, TU, segment_1, segment_2, length, velocity_ratio, FC_HL, FC_intensity, FC_HL/FC_intensity.
dataframe_summary_events(data, data_annotation)
data |
SummarizedExperiment: the input data frame with correct format. |
data_annotation |
dataframe: dataframe from processed gff3 file. |
String, event type either pausing site, iTSS_I, iTSS_II, Termination, HL_event, Int_event, HL_Int_event and velocity_change
Integer, p_value of the event
Integer, p_value adjusted
Integer, the fold change value of 2 HL fragments
Integer, the fold change value of 2 intensity fragments
Integer, the fold change of half-life/ fold change of intensity, position of the half-life fragment is adapted to intensity fragment
Fold change of half-life/ fold change of intensity
Integer, the position middle between 2 fragments with an event
Integer, ratio of velocity between 2 delay fragments
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
Boolean. The bin/probe specific strand (+/-)
String, The overarching transcription unit
String, the first segment of the event, includes the segment, TU, delay fragment in case of ps or iTSS_I. The rest of the events include HL fragment and intensity fragment
String, same description as segment_1 but is the second fragment of the event
Integer, the difference (min) between 2 delay fragment when ps or iTSS_I happen
Integer, length in position (nt), calculated by the difference between the last position of the first fragment and the first position of the second fragment.
Integer, number of fragments involved in the event
if (!require(SummarizedExperiment)) {
  suppressPackageStartupMessages(library(SummarizedExperiment))
}
data(stats_minimal)
dataframe_summary_events(data = stats_minimal, data_annotation = metadata(stats_minimal)$annot[[1]])
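The event position and length described in the value list above are simple functions of the fragment borders. A minimal worked example in base R, using invented bin positions and hypothetical variable names:

```r
# Toy positions (invented) of the bins in two neighbouring fragments.
frag_1 <- c(100, 200, 300)
frag_2 <- c(400, 500, 600)

# Middle position between the two fragments.
event_position <- (max(frag_1) + min(frag_2)) / 2

# Distance in nt between the last position of the first fragment
# and the first position of the second fragment.
event_length <- min(frag_2) - max(frag_1)

event_position  # 350
event_length    # 100
```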
dataframe_summary_events_HL_int creates one table with all events between the segments
The dataframe_summary_events_HL_int creates one table with the following columns: event, features, p_value, event_position, position, region, gene, locus_tag, strand, TU, segment_1, segment_2, length, FC_HL, FC_intensity, FC_HL/FC_intensity.
dataframe_summary_events_HL_int(data, data_annotation)
data |
SummarizedExperiment: the input data frame with correct format. |
data_annotation |
dataframe: dataframe from processed gff3 file. |
String, event type.
Integer, p_value of the event.
Integer, p_value adjusted.
Integer, the fold change value of 2 HL fragments.
Integer, the fold change value of 2 intensity fragments.
Integer, the fold change of half-life/ fold change of intensity, position of the half-life fragment is adapted to intensity fragment.
Fold change of half-life/ fold change of intensity.
String, region annotation covering the fragments.
String, gene annotation covering the fragments.
String, locus_tag annotation covering the fragments.
Boolean. The bin/probe specific strand (+/-).
String, The overarching transcription unit.
String, the first segment of the event, includes the segment, TU, delay fragment in case of ps or iTSS_I. The rest of the events include HL fragment and could be extended intensity fragment.
String, the second fragment of the two of fragments subjected to analysis.
Integer, the difference (min) between 2 delay fragment when ps or iTSS_I happen.
Integer, length in position (nt), calculated by the difference between the last position of the first fragment and the first position of the second fragment.
Integer, number of fragments involved in the event.
if (!require(SummarizedExperiment)) {
  suppressPackageStartupMessages(library(SummarizedExperiment))
}
data(stats_minimal)
dataframe_summary_events_HL_int(data = stats_minimal, data_annotation = metadata(stats_minimal)$annot[[1]])
dataframe_summary_events_ps_itss creates one table with all events between the segments.
The dataframe_summary_events_ps_itss creates one table with the following columns: event, features, p_value, event_position, event_duration, position, region, gene, locus_tag, strand, TU, segment_1, segment_2, length, velocity_ratio.
dataframe_summary_events_ps_itss(data, data_annotation)
data |
SummarizedExperiment: the input data frame with correct format. |
data_annotation |
dataframe: dataframe from processed gff3 file. |
String, event type.
Integer, p_value of the event.
Integer, p_value adjusted.
Integer, the position middle between 2 fragments with an event.
Integer, the ratio value of velocity from 2 delay fragments.
String, region annotation covering the fragments.
String, gene annotation covering the fragments.
String, locus_tag annotation covering the fragments.
Boolean. The bin/probe specific strand (+/-).
String, The overarching transcription unit.
String, the first segment of the event, includes the segment, TU, delay fragment in case of ps or iTSS_I. The rest of the events include HL fragment and could be extended intensity fragment.
String, the second fragment of the two of fragments subjected to analysis.
Integer, the difference (min) between 2 delay fragment when ps or iTSS_I happen.
Integer, length in position (nt), calculated by the difference between the last position of the first fragment and the first position of the second fragment.
Integer, number of fragments involved in the event.
data(stats_minimal)
if (!require(SummarizedExperiment)) {
  suppressPackageStartupMessages(library(SummarizedExperiment))
}
dataframe_summary_events_ps_itss(data = stats_minimal, data_annotation = metadata(stats_minimal)$annot[[1]])
dataframe_summary_events_velocity creates one table with all events between the segments.
The dataframe_summary_events_velocity creates one table with the following columns: event, features, p_value, event_position, event_duration, position, region, gene, locus_tag, strand, TU, segment_1, segment_2, length, velocity_ratio.
dataframe_summary_events_velocity(data, data_annotation)
data |
SummarizedExperiment: the input data frame with correct format. |
data_annotation |
dataframe: dataframe from processed gff3 file. |
String, event type.
Integer, p_value of the event.
Integer, p_value adjusted.
Integer, the position of the event, calculated as the sum of the last position of the first fragment and the first position of the next fragment, divided by 2.
Integer, the ratio value of velocity from 2 delay fragments
String, region annotation covering the fragments.
String, gene annotation covering the fragments.
String, locus_tag annotation covering the fragments.
Boolean. The bin/probe specific strand (+/-).
String, The overarching transcription unit.
String, the first segment of the event, includes the segment, TU, delay fragment in case of ps or iTSS_I. The rest of the events include HL fragment and could be extended intensity fragment.
String, the second fragment of the two of fragments subjected to analysis
Integer, the difference (min) between 2 delay fragment when ps or iTSS_I happen.
Integer, length in position (nt), calculated by the difference between the last position of the first fragment and the first position of the second fragment.
Integer, number of fragments involved in the event.
if (!require(SummarizedExperiment)) {
  suppressPackageStartupMessages(library(SummarizedExperiment))
}
data(stats_minimal)
dataframe_summary_events_velocity(data = stats_minimal, data_annotation = metadata(stats_minimal)$annot[[1]])
dataframe_summary_TI creates one table with all TI fragments, p_value and the coordinates
dataframe_summary_TI creates one table with the following columns: event, TI_fragment, TI_factor, TI_fragments_TU, p_value, feature_type, gene, locus_tag, strand, TU, features, event_position, position_1 and position_2.
dataframe_summary_TI(data, input)
data |
SummarizedExperiment: the input data frame with correct format. |
input |
dataframe: dataframe from event_dataframe function. |
String, event type, transcription interference.
String, the fragment with TI.
String, the factor of TI fragment.
Integer, p_value of the event.
Integer, p_value adjusted.
String, region annotation covering the fragments.
String, gene annotation covering the fragments.
String, locus_tag annotation covering the fragments.
Boolean. The bin/probe specific strand (+/-).
String, The overarching transcription unit.
Integer, The number of segments within the TI.
Integer, the position middle between 2 TI fragments.
String, the first position of TI fragment, if 2 fragments, first position is from the first fragment.
String, the last position of TI fragment, if 2 fragments, last position is from the second fragment.
WIP
data(stats_minimal)
data(res_minimal)
dataframe_summary_TI(data = stats_minimal, input = res_minimal)
event_dataframe creates a dataframe only with events
event_dataframe creates a dataframe connecting segments, events and the annotation.
event_dataframe(data, data_annotation)
data |
dataframe: the probe based data frame. |
data_annotation |
dataframe: the coordinates are extracted from gff3 |
The functions used are:
position_function: adds the specific position of ps or iTSS event.
annotation_function_event: adds the events to the annotated genes.
An annotation file needs to be supplied. The strand is indicated in case of stranded data. event_dataframe selects the columns with statistical features. The ID, position, strand and TU columns are required.
A dataframe with unique intensity fragments
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
Boolean. The bin/probe specific strand (+/-)
String, The overarching transcription unit
Integer, position of the bin/probe on the genome
String, the bin/probe segment on the genome
String, the fragments subjected to fold change
Integer, the fold change value of 2 intensity fragments
Integer, p_value of the FC_intensity
String, the fragments subjected to fold change
Integer, the fold change value of 2 HL fragments
Integer, p_value of the FC_HL
String, fragments subjected to FC_HL/FC_intensity
Integer, the value of FC_HL/FC_intensity
Integer, the fold change of half-life/ fold change of intensity, position of the half-life fragment is adapted to intensity fragment
Integer, p_value of the event FC_HL/FC_intensity
Integer, the value corresponding to synthesis rate
String, the event assigned by synthesis rate either Termination or iTSS
Boolean, presence or absence of pausing_site event (ps)
Boolean, presence or absence of internal starting site event (iTSS_I)
String, fragments involved on the event ps or iTSS_I
Integer, the position middle between 2 fragments with an event
Integer, the duration between two delay fragments
Integer, the delay value of the bin/probe
The half-life of the bin/probe
The relative intensity at time point 0
Integer, the slope value of the fit through the respective delay fragment
Integer, the p_value added to the inp
if (!require(SummarizedExperiment)) {
  suppressPackageStartupMessages(library(SummarizedExperiment))
}
data(stats_minimal)
event_dataframe(data = stats_minimal, data_annotation = metadata(stats_minimal)$annot[[1]])
An example SummarizedExperiment from E. coli
An example SummarizedExperiment from RNA-seq containing information about the intensities at all time points (assay), Seqnames, IRanges and strand columns (rowRanges), and colData with time point series and replicates.
data(example_input_e_coli)
An assay:
relative intensities at 0 min
relative intensities at 1 min
relative intensities at 10 min
relative intensities at 15 min
relative intensities at 2 min
relative intensities at 20 min
relative intensities at 3 min
relative intensities at 4 min
relative intensities at 5 min
relative intensities at 6 min
relative intensities at 8 min
https://github.com/CyanolabFreiburg/rifi
An artificial example SummarizedExperiment
An example SummarizedExperiment containing information about the intensities at all time points (assay), Seqnames, IRanges and strand columns (rowRanges) and colData with time point series and replicates.
data(example_input_minimal)
An object of class RangedSummarizedExperiment
with 4 rows and 33 columns.
https://github.com/CyanolabFreiburg/rifi
An example input data frame from Synechocystis PCC 6803
A SummarizedExperiment from microarray data containing information about the intensities at all time points (assay), Seqnames, IRanges and strand columns (rowRanges) and colData with time point series and averaged replicates.
data(example_input_synechocystis_6803)
Assay with 3000 rows and 10 variables:
relative intensities at 0 min
relative intensities at 2 min
relative intensities at 4 min
relative intensities at 8 min
relative intensities at 16 min
relative intensities at 32 min
relative intensities at 64 min
https://github.com/CyanolabFreiburg/rifi
finding_PDD Flags potential candidates for post transcription decay
'finding_PDD' uses 'score_fun_linear_PDD' to make groups by the difference to the slope. The slope is further checked for steepness to decide for PDD. 'PDD' is added to the 'flag' column. Post transcription decay is characterized by a strong decrease of intensity by position. The rowRanges need to contain at least 'ID', 'intensity', 'position' and 'position_segment'!
finding_PDD(inp, cores = 1, pen = 2, pen_out = 1, thrsh = 0.001)
inp |
SummarizedExperiment: the input. |
cores |
integer: the number of assigned cores for the task |
pen |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer fragments. Advised to be kept at 2. Default is 2. |
pen_out |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer possible outliers. Advised to be kept at 1. Default is 1. |
thrsh |
numeric: an internal parameter that allows fragments with slopes steeper than the thrsh to be flagged with 'PDD'. Higher values result in fewer candidates. Advised to be kept at 0.001. Default is 0.001. |
The SummarizedExperiment object: with "PDD" added to the flag column.
data(preprocess_minimal)
finding_PDD(inp = preprocess_minimal, cores = 2, pen = 2, pen_out = 1, thrsh = 0.001)
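As a rough illustration of the steepness check described for finding_PDD, the sketch below fits a straight line to intensity against position with base R's lm() and flags the fragment when the slope is negative and steeper than thrsh. The data, the helper name is_pdd_candidate and the exact comparison are assumptions made for illustration. They are not taken from the rifi source, which groups the probes by dynamic programming first.

```r
# Hypothetical helper, not part of rifi: flag a fragment whose intensity
# falls along the position more steeply than thrsh.
is_pdd_candidate <- function(position, intensity, thrsh = 0.001) {
  slope <- unname(coef(lm(intensity ~ position))[2])
  slope < -thrsh
}

position <- seq(100, 1000, by = 100)
decaying <- 50 - 0.04 * position        # strong decrease of intensity by position
flat     <- rep(20, length(position))   # no change of intensity by position

pdd_decaying <- is_pdd_candidate(position, decaying)
pdd_flat     <- is_pdd_candidate(position, flat)
```

With these made-up values only the decaying fragment would be flagged. Raising thrsh makes the check stricter, in line with "higher values result in fewer candidates".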
finding_TI flags potential candidates for transcription interference
finding_TI uses 'score_fun_ave' to make groups by the mean of "probe_TI". "TI" is added to the "flag" column. TI is characterized by relative intensities at time points later than "0". The rowRanges need to contain at least "ID", "probe_TI" and "position_segment"!
finding_TI(inp, cores, pen = 10, thrsh = 0.5, add = 1000)
inp |
SummarizedExperiment: the input. |
cores |
integer: the number of assigned cores for the task |
pen |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer fragments. Advised to be kept at 10. Default is 10. |
thrsh |
numeric: an internal parameter that allows fragments with a certain amount of IDs with higher relative intensities at time points later than "0" to be flagged as "TI". Higher values result in fewer candidates. -0.5 is 25%, 0 is 50%, 0.5 is 75%. Advised to be kept at 0.5. Default is 0.5. |
add |
integer: range of nucleotides before and after a potential TI event wherein IDs are fitted with the TI fit. |
The SummarizedExperiment object: with "TI" added to the flag column.
data(preprocess_minimal)
finding_TI(inp = preprocess_minimal, cores = 2, pen = 10, thrsh = 0.5, add = 1000)
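The percentages quoted for thrsh (-0.5 is 25%, 0 is 50%, 0.5 is 75%) follow if one assumes that "probe_TI" is coded as 1 for IDs with higher relative intensities at later time points and as -1 otherwise. The fragment mean m then relates to the fraction p of such IDs by m = 2p - 1. This coding is an assumption made for the sketch, and the helper thrsh_to_fraction is hypothetical.

```r
# Assumed coding (not confirmed by the source): probe_TI is 1 or -1.
# A fragment mean of thrsh then corresponds to this fraction of TI-like IDs.
thrsh_to_fraction <- function(thrsh) (thrsh + 1) / 2

fractions <- thrsh_to_fraction(c(-0.5, 0, 0.5))   # 0.25 0.50 0.75

# A made-up fragment with 8 IDs, 7 of which look like TI
probe_TI <- c(1, 1, 1, 1, 1, 1, 1, -1)
flag_TI  <- mean(probe_TI) > 0.5                   # the mean is 0.75
```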
The result of rifi_fit for E. coli example data
A SummarizedExperiment containing the output from rifi_fit as an extension of rowRanges and metadata.
data(fit_e_coli)
Three data frames with 290 rows and 10 variables, 155 rows and 5 variables, and 135 rows and 9 variables are generated. The columns of the first data frame are added to the rowRanges and the rest are added as metadata.
The SummarizedExperiment:
The bin/probe specific ID
The bin/probe specific position
The relative intensity at time point 0
An internal value to determine which fitting model is applied
Information on which fitting model is applied
The position based segment
The delay value of the bin/probe
The half-life of the bin/probe
String, the factor of TI fragment
the fit object for the standard fit:
The bin/probe specific ID
The delay value of the bin/probe
The half-life of the bin/probe
The relative intensity at time point 0
The background value of the fit
the fit object for the TI fit:
The delay value of the bin/probe
The ti-delay value of the bin/probe
The half-life of the bin/probe
The ti-value of the bin/probe
String, the factor of TI fragment
The synthesis rate of the bin/probe
The background value of the fit
The bin/probe specific position
The bin/probe specific ID
https://github.com/CyanolabFreiburg/rifi
The result of rifi_fit for artificial example data
A SummarizedExperiment containing the output from rifi_fit.
data(fit_minimal)
An object of class RangedSummarizedExperiment
with 4 rows and 33 columns.
https://github.com/CyanolabFreiburg/rifi
The result of rifi_fit for Synechocystis 6803 example data
A SummarizedExperiment containing the output from rifi_fit as an extension of rowRanges and metadata.
data(fit_synechocystis_6803)
Three data frames with 3000 rows and 10 variables, 2811 rows and 5 variables, and 189 rows and 9 variables are generated. The columns of the first data frame are added to the rowRanges and the rest are added as metadata.
the SummarizedExperiment:
The bin/probe specific ID
The bin/probe specific position
The bin/probe specific strand
The relative intensity at time point 0
An internal value to determine which fitting model is applied
Information on which fitting model is applied
The position based segment
The delay value of the bin/probe
The half-life of the bin/probe
String, the factor of TI fragment
the fit object for the standard fit:
The bin/probe specific ID
The delay value of the bin/probe
The half-life of the bin/probe
The relative intensity at time point 0
The background value of the fit
the fit object for the TI fit:
The delay value of the bin/probe
The ti-delay value of the bin/probe
The half-life of the bin/probe
The ti-value of the bin/probe
String, the factor of TI fragment
The synthesis rate of the bin/probe
The background value of the fit
The bin/probe specific position
The bin/probe specific ID
https://github.com/CyanolabFreiburg/rifi
fold_change sets a fold-change ratio between the neighboring fragments of Half-life (HL) and intensity
fold_change calculates the fold change of intensity and the fold change of HL between two successive fragments. Two intensity fragments can belong to one HL fragment. The function first sets the borders using the position and then applies the fold change ratio between the neighboring fragments of HL and those of intensity: log2((intensity frgA / intensity frgB) / (half-life frgA / half-life frgB)). All selected fragments are from the same TU, excluding outliers.
fold_change(inp)
inp |
SummarizedExperiment: the input data frame with correct format. |
The function used is synthesis_r_Function, which assigns events depending on the ratio between HL and intensity of two consecutive fragments. At steady state, intensity (int) = synthesis rate (k) / decay (deg), so int1/int2 = (k1/deg1) * (deg2/k2) and therefore k1/k2 = (int1 * deg1) / (int2 * deg2), the synthesis ratio. The event is assigned from the synthesis ratio:
synthesis ratio > 0 -> New start
synthesis ratio < 0 -> Termination
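The steady-state relation can be checked with a few lines of base R. The sketch below uses made-up values for two consecutive fragments, converts half-lives to decay constants with deg = log(2) / half-life, and computes the synthesis ratio on the log2 scale in two equivalent ways. It does not call synthesis_r_Function, and it leaves open which fragment the package treats as the first one.

```r
# Illustrative values for two consecutive fragments (not rifi output)
int1 <- 80; int2 <- 20      # mean intensities
hl1  <- 2;  hl2  <- 2       # mean half-lives in minutes

deg1 <- log(2) / hl1        # decay constants derived from the half-lives
deg2 <- log(2) / hl2

# Steady state: int = k / deg, hence k1 / k2 = (int1 * deg1) / (int2 * deg2)
synthesis_ratio <- log2((int1 * deg1) / (int2 * deg2))

# The same quantity written with the fold changes of intensity and half-life
fc_intensity <- int1 / int2
fc_hl        <- hl1 / hl2
synthesis_ratio_fc <- log2(fc_intensity / fc_hl)
```

Because the half-lives are equal here, the whole intensity difference is attributed to a change in synthesis rate, and both expressions give the same value.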
the SummarizedExperiment with the columns regarding statistics:
The bin/probe specific ID.
The bin/probe specific position.
The bin/probe specific strand.
The relative intensity at time point 0.
An internal value to determine which fitting model is applied.
Information on which fitting model is applied.
The position based segment.
The delay value of the bin/probe.
The half-life of the bin/probe.
String, the factor of TI fragment.
The delay fragment the bin belongs to.
The velocity value of the respective delay fragment.
The intercept of the fit through the respective delay fragment.
The slope of the fit through the respective delay fragment.
The half-life fragment the bin belongs to.
The mean half-life value of the respective half-life fragment.
The intensity fragment the bin belongs to.
The mean intensity value of the respective intensity fragment.
The overarching transcription unit.
The TI fragment the bin belongs to.
The mean termination factor of the respective TI fragment.
The combined ID of the fragment.
presence of pausing site indicated by +/-.
presence of iTSS_I indicated by +/-.
The fragments involved in pausing site or iTSS_I.
Integer, the duration between two delay fragments.
p_value of pausing site or iTSS_I.
Integer, the p_value added to the inp.
Integer, the slope value of the fit through the respective delay fragment.
Integer, the ratio value of velocity from 2 delay fragments.
Integer, position of the event added to the input.
Integer, the fold change value of 2 HL fragments.
String, the fragments corresponding to HL fold change.
Integer, the p_value added to the input of 2 HL fragments.
Integer, the fold change value of 2 intensity fragments.
String, the fragments corresponding to intensity fold change.
Integer, the p_value added to the input of 2 intensity fragments.
Integer, the value corresponding to synthesis rate.
String, the event assigned by synthesis rate either Termination or iTSS.
Integer, the value corresponding to HL and intensity fold change.
String, the fragments corresponding to intensity and HL fold change.
Integer, the fold change of half-life / fold change of intensity; the position of the half-life fragment is adapted to the intensity fragment.
data(stats_minimal)
fold_change(inp = stats_minimal)
fragment_delay performs the delay fragmentation
fragment_delay makes delay_fragments based on position_segments and assigns all gathered information to the SummarizedExperiment object. The columns "delay_fragment", "velocity_fragment", "intercept" and "slope" are added. fragment_delay also assigns the slopes, which are 1/velocity, and the intercepts needed for the TU calculation. The function used is score_fun_linear. The input is the SummarizedExperiment object. pen is the penalty for new fragments in the dynamic programming, pen_out is the outlier penalty.
fragment_delay(inp, cores = 1, pen, pen_out)
inp |
SummarizedExperiment: the input data frame with correct format. |
cores |
integer: the number of assigned cores for the task. |
pen |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer fragments. Default is the auto generated value. |
pen_out |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer allowed outliers. Default is the auto generated value. |
the SummarizedExperiment object:
ID: |
The bin/probe specific ID. |
position: |
The bin/probe specific position. |
intensity: |
The relative intensity at time point 0. |
probe_TI: |
An internal value to determine which fitting model is applied. |
flag: |
Information on which fitting model is applied. |
position_segment: |
The position based segment. |
delay: |
The delay value of the bin/probe. |
half_life: |
The half-life of the bin/probe. |
TI_termination_factor: |
String, the factor of TI fragment. |
delay_fragment: |
The delay fragment the bin belongs to. |
velocity_fragment: |
The velocity value of the respective delay fragment. |
intercept: |
The intercept of the fit through the respective delay fragment. |
slope: |
The slope of the fit through the respective delay fragment. |
data(fragmentation_minimal)
fragment_delay(inp = fragmentation_minimal, cores = 2, pen = 2, pen_out = 1)
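The relation between slope and velocity stated for fragment_delay can be illustrated with a plain linear fit. The sketch below simulates one delay fragment with made-up values, fits delay against position with lm(), and recovers the velocity as 1/slope. It only illustrates the relation and is not the score_fun_linear implementation, which additionally handles fragment borders and outliers.

```r
# Simulated delay fragment (illustrative values, not rifi output)
velocity <- 25                          # nucleotides per minute
position <- seq(1000, 3000, by = 250)
delay    <- 0.5 + position / velocity   # delay grows linearly with position

fit       <- lm(delay ~ position)
intercept <- unname(coef(fit)[1])
slope     <- unname(coef(fit)[2])

velocity_fragment <- 1 / slope          # the slope is 1 / velocity
```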
fragment_HL performs the half_life fragmentation
fragment_HL makes HL_fragments based on delay_fragments and assigns all gathered information to the SummarizedExperiment object.
fragment_HL(inp, cores = 1, pen, pen_out)
inp |
SummarizedExperiment: the input data frame with correct format. |
cores |
integer: the number of assigned cores for the task. |
pen |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer fragments. Default is the auto generated value. |
pen_out |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer allowed outliers. Default is the auto generated value. |
The columns "HL_fragment" and "HL_mean_fragment" are added.
fragment_HL makes half-life_fragments and assigns the mean of each fragment.
The function used is:
.score_fun_ave.
The input is the SummarizedExperiment object.
pen is the penalty for new fragments in the dynamic programming, pen_out is the outlier penalty.
The SummarizedExperiment object:
The bin/probe specific ID
The bin/probe specific position
The relative intensity at time point 0
An internal value to determine which fitting model is applied
Information on which fitting model is applied
The position based segment
The delay value of the bin/probe
The half-life of the bin/probe
String, the factor of TI fragment
The delay fragment the bin belongs to
The velocity value of the respective delay fragment
The intercept of the fit through the respective delay fragment
The slope of the fit through the respective delay fragment
The half-life fragment the bin belongs to
The mean half-life value of the respective half-life fragment
data(fragmentation_minimal)
fragment_HL(inp = fragmentation_minimal, cores = 2, pen = 2, pen_out = 1)
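To make the role of pen more concrete, the following sketch segments a vector of half-lives by their mean with a small dynamic program: each fragment costs its sum of squared deviations from the fragment mean plus a fixed penalty pen. This is a generic penalised segmentation written for illustration. The function segment_by_mean is hypothetical, it is not .score_fun_ave, and the outlier penalty pen_out is not modelled.

```r
# Penalised segmentation by mean via dynamic programming (illustrative only)
segment_by_mean <- function(x, pen) {
  n <- length(x)
  # cost of one fragment covering x[i..j]: squared deviations from its mean
  seg_cost <- function(i, j) {
    v <- x[i:j]
    sum((v - mean(v))^2)
  }
  best       <- c(0, rep(Inf, n))  # best[j + 1]: minimal score for x[1..j]
  last_start <- integer(n)         # start of the last fragment ending at j
  for (j in seq_len(n)) {
    for (i in seq_len(j)) {
      score <- best[i] + seg_cost(i, j) + pen
      if (score < best[j + 1]) {
        best[j + 1]   <- score
        last_start[j] <- i
      }
    }
  }
  # backtrack the fragment start positions
  starts <- integer(0)
  j <- n
  while (j > 0) {
    starts <- c(last_start[j], starts)
    j <- last_start[j] - 1
  }
  frag <- findInterval(seq_len(n), starts)
  data.frame(value = x, fragment = frag, fragment_mean = ave(x, frag))
}

hl  <- c(2.1, 1.9, 2.0, 2.2, 6.1, 5.9, 6.0, 6.2)  # made-up half-lives
res <- segment_by_mean(hl, pen = 2)
```

With pen = 2 the two plateaus end up in separate fragments with their own means. A much larger penalty, for example pen = 100, merges them into one fragment, which mirrors the statement that higher values result in fewer fragments.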
fragment_inty performs the intensity fragmentation
fragment_inty makes intensity_fragments based on HL_fragments and assigns all gathered information to the SummarizedExperiment object.
fragment_inty(inp, cores = 1, pen, pen_out)
inp |
SummarizedExperiment: the input data frame with correct format. |
cores |
integer: the number of assigned cores for the task. |
pen |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer fragments. Default is the auto generated value. |
pen_out |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer allowed outliers. Default is the auto generated value. |
The columns "intensity_fragment" and "intensity_mean_fragment" are added.
fragment_inty makes intensity_fragments and assigns the mean of each fragment.
The function used is:
.score_fun_ave.
The input is the SummarizedExperiment object.
pen is the penalty for new fragments in the dynamic programming, pen_out is the outlier penalty.
The SummarizedExperiment object:
ID: |
The bin/probe specific ID |
position: |
The bin/probe specific position |
intensity: |
The relative intensity at time point 0 |
probe_TI: |
An internal value to determine which fitting model is applied |
flag: |
Information on which fitting model is applied |
position_segment: |
The position based segment |
delay: |
The delay value of the bin/probe |
half_life: |
The half-life of the bin/probe |
TI_termination_factor: |
String, the factor of TI fragment |
delay_fragment: |
The delay fragment the bin belongs to |
velocity_fragment: |
The velocity value of the respective delay fragment |
intercept: |
The intercept of the fit through the respective delay fragment |
slope: |
The slope of the fit through the respective delay fragment |
HL_fragment: |
The half-life fragment the bin belongs to |
HL_mean_fragment: |
The mean half-life value of the respective half-life fragment |
intensity_fragment: |
The intensity fragment the bin belongs to |
intensity_mean_fragment: |
The mean intensity value of the respective intensity fragment |
data(fragmentation_minimal)
fragment_inty(inp = fragmentation_minimal, cores = 2, pen = 2, pen_out = 1)
fragment_TI performs the TI fragmentation
fragment_TI makes TI_fragments based on TUs and assigns all gathered information to the SummarizedExperiment object. The columns "TI_termination_fragment" and "TI_mean_termination_factor" are added.
fragment_TI(inp, cores = 1, pen, pen_out)
inp |
SummarizedExperiment: the input data frame with correct format. |
cores |
integer: the number of assigned cores for the task. |
pen |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer fragments. Default is the auto generated value. |
pen_out |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer allowed outliers. Default is the auto generated value. |
The function used is:
.score_fun_ave.
The input is the SummarizedExperiment object.
pen is the penalty for new fragments in the dynamic programming, pen_out is the outlier penalty.
The SummarizedExperiment object:
ID: |
The bin/probe specific ID |
position: |
The bin/probe specific position |
intensity: |
The relative intensity at time point 0 |
probe_TI: |
An internal value to determine which fitting model is applied |
flag: |
Information on which fitting model is applied |
position_segment: |
The position based segment |
delay: |
The delay value of the bin/probe |
half_life: |
The half-life of the bin/probe |
TI_termination_factor: |
String, the factor of TI fragment |
delay_fragment: |
The delay fragment the bin belongs to |
velocity_fragment: |
The velocity value of the respective delay fragment |
intercept: |
The intercept of the fit through the respective delay fragment |
slope: |
The slope of the fit through the respective delay fragment |
HL_fragment: |
The half-life fragment the bin belongs to |
HL_mean_fragment: |
The mean half-life value of the respective half-life fragment |
intensity_fragment: |
The intensity fragment the bin belongs to |
intensity_mean_fragment: |
The mean intensity value of the respective intensity fragment |
TI_termination_fragment: |
The TI fragment the bin belongs to |
TI_mean_termination_factor: |
The mean termination factor of the respective TI fragment |
data(fragmentation_minimal)
fragment_TI(inp = fragmentation_minimal, cores = 2, pen = 2, pen_out = 1)
The result of rifi_fragmentation for E. coli example data
A SummarizedExperiment containing the output from rifi_fragmentation as an extension of rowRanges.
data(fragmentation_e_coli)
rowRanges of the SummarizedExperiment with 290 rows and 22 variables:
The bin/probe specific ID
The bin/probe specific position
The relative intensity at time point 0
An internal value to determine which fitting model is applied
Information on which fitting model is applied
The position based segment
The delay value of the bin/probe
The half-life of the bin/probe
String, the factor of TI fragment
The delay fragment the bin belongs to
The velocity value of the respective delay fragment
The intercept of the fit through the respective delay fragment
The slope of the fit through the respective delay fragment
The half-life fragment the bin belongs to
The mean half-life value of the respective half-life fragment
The intensity fragment the bin belongs to
The mean intensity value of the respective intensity fragment
The overarching transcription unit
The TI fragment the bin belongs to
The mean termination factor of the respective TI fragment
The combined ID of the fragment
https://github.com/CyanolabFreiburg/rifi
The result of rifi_fragmentation for artificial example data
A SummarizedExperiment containing the output from rifi_fragmentation as an extension of rowRanges and metadata.
data(fragmentation_minimal)
An object of class RangedSummarizedExperiment
with 24 rows and 33 columns.
https://github.com/CyanolabFreiburg/rifi
The result of rifi_fragmentation for Synechocystis 6803 example data
A SummarizedExperiment containing the output from rifi_fragmentation as an extension of rowRanges.
data(fragmentation_synechocystis_6803)
rowRanges of the SummarizedExperiment:
The bin/probe specific ID
The bin/probe specific position
The relative intensity at time point 0
An internal value to determine which fitting model is applied
Information on which fitting model is applied
The position based segment
The delay value of the bin/probe
The half-life of the bin/probe
String, the factor of TI fragment
The delay fragment the bin belongs to
The velocity value of the respective delay fragment
The intercept of the fit through the respective delay fragment
The slope of the fit through the respective delay fragment
The half-life fragment the bin belongs to
The mean half-life value of the respective half-life fragment
The intensity fragment the bin belongs to
The mean intensity value of the respective intensity fragment
The overarching transcription unit
The TI fragment the bin belongs to
The mean termination factor of the respective TI fragment
The combined ID of the fragment
https://github.com/CyanolabFreiburg/rifi
gff3_preprocess processes a gff3 file from a database for multiple usage
gff3_preprocess processes the gff3 file, extracting gene names and locus_tag from all coding regions (CDS). UTRs/ncRNA/asRNA are also extracted if available.
gff3_preprocess(path)
path |
path to the directory containing the gff3 file. |
The resulting dataframe contains region, positions, strand, gene and locus_tag.
A list with 2 items:
String, the region from the gff file
Integer, the start of the annotation
Integer, the end of the annotation
Boolean, the strand of the annotation
String, the annotated gene name
String, the annotated locus tag
a numeric vector containing the length of the genome
gff3_preprocess(
  path = gzfile(system.file("extdata", "gff_e_coli.gff3.gz", package = "rifi"))
)
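The kind of extraction gff3_preprocess performs can be sketched in base R. The example below reads two made-up GFF3 lines and pulls the gene and locus_tag entries out of the attribute column to build a data frame with the columns named above. The helper get_attr and the input lines are hypothetical. The package function itself may parse the file differently and also returns the genome length.

```r
# Two made-up GFF3 lines (tab separated, nine columns)
gff_lines <- c(
  "chr\tRefSeq\tCDS\t190\t255\t.\t+\t0\tID=cds0;gene=thrL;locus_tag=b0001",
  "chr\tRefSeq\tCDS\t337\t2799\t.\t+\t0\tID=cds1;gene=thrA;locus_tag=b0002"
)
gff <- read.delim(text = gff_lines, header = FALSE, stringsAsFactors = FALSE)

# Hypothetical helper: extract the value of "key=value" from the attributes
# (returns the input unchanged when the key is absent)
get_attr <- function(attributes, key) {
  sub(paste0(".*", key, "=([^;]+).*"), "\\1", attributes)
}

annot <- data.frame(
  region    = gff$V3,
  start     = gff$V4,
  end       = gff$V5,
  strand    = gff$V7,
  gene      = get_attr(gff$V9, "gene"),
  locus_tag = get_attr(gff$V9, "locus_tag"),
  stringsAsFactors = FALSE
)
```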
make_df adds important columns to the SummarizedExperiment object
'make_df' adds the columns "intensity", "probe_TI" and "flag" to the SummarizedExperiment object.
make_df(inp, cores = 1, bg = 0, rm_FLT = TRUE)
inp |
SummarizedExperiment: the (checked) input. |
cores |
integer: the number of assigned cores for the task. |
bg |
numeric: threshold over which the last timepoint has to be fitted with the above background mode. |
rm_FLT |
logical: remove IDs where all replicates are marked as filtered. Default is TRUE. |
The replicates are collapsed into their respective means.
"intensity" is the mean intensity from time point 0.
"probe_TI" is a value needed for the distribution for the different fitting models.
"flag" contains information on the distribution for the different fitting models.
Probes that do not drop to the background expression level by the last time point are flagged as "ABG" ("above background"). This is only needed for microarray data and is controlled by the bg parameter. The default is bg = 0, which results in all probes being above background (0 is advised for RNAseq data).
Probes where all replicates were filtered in the optional filtration step can be fully removed with rm_FLT = TRUE. To keep all information in the assay, set rm_FLT to FALSE.
the SummarizedExperiment object: with intensity, probe_TI and flag added to the rowRanges.
data(preprocess_minimal)
make_df(inp = preprocess_minimal, cores = 2, bg = 0, rm_FLT = TRUE)
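The collapsing of replicates and the "ABG" flag described for make_df can be pictured with a small base R sketch. It uses a made-up assay matrix and a vector of time points standing in for colData, and it assumes that the flag is set when the mean of the last time point exceeds bg. The values, the variable names and the threshold are illustrative and not taken from the package.

```r
# Made-up assay: 2 probes, 3 time points, 2 replicates each
assay_mat <- matrix(
  c(10, 12,  6,  8,  2,  4,
    30, 34, 28, 30, 25, 27),
  nrow = 2, byrow = TRUE
)
timepoint <- c(0, 0, 5, 5, 10, 10)   # time point of every assay column

# Collapse the replicates into their mean per time point
mean_by_time <- t(apply(assay_mat, 1, function(x) tapply(x, timepoint, mean)))

intensity <- mean_by_time[, "0"]                  # mean intensity at time point 0
last_tp   <- mean_by_time[, ncol(mean_by_time)]   # mean at the last time point

bg   <- 20                                        # hypothetical background threshold
flag <- ifelse(last_tp > bg, "ABG", "")
```

With bg = 0, as advised for RNAseq data, every probe with a positive last time point would count as above background.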
make_pen automatically assigns penalties
'make_pen' calls one of four available penalty functions to automatically assign penalties for the dynamic programming.
make_pen( inp, FUN, cores = 1, logs, dpt = 1, smpl_min = 10, smpl_max = 100, sta_pen = 0.5, end_pen = 4.5, rez_pen = 9, sta_pen_out = 0.5, end_pen_out = 3.5, rez_pen_out = 7 )
inp |
SummarizedExperiment: the input data frame with correct format. |
FUN |
function: one of the four bottom level functions (see details) |
cores |
integer: the number of assigned cores for the task |
logs |
numeric vector: the logbook vector. |
dpt |
integer: the number of times a full iteration cycle is repeated with a more narrow range based on the previous cycle. Default is 1. |
smpl_min |
integer: the smaller end of the sampling size. Default is 10. |
smpl_max |
integer: the larger end of the sampling size. Default is 100. |
sta_pen |
numeric: the lower starting penalty. Default is 0.5. |
end_pen |
numeric: the higher starting penalty. Default is 4.5. |
rez_pen |
numeric: the number of penalties iterated within the penalty range. Default is 9. |
sta_pen_out |
numeric: the lower starting outlier penalty. Default is 0.5. |
end_pen_out |
numeric: the higher starting outlier penalty. Default is 3.5. |
rez_pen_out |
numeric: the number of outlier penalties iterated within the outlier penalty range. Default is 7. |
The four functions to be called are:
fragment_delay_pen
fragment_HL_pen
fragment_inty_pen
fragment_TI_pen
These functions return the number of statistically correct and statistically wrong splits at a specific pair of penalties. 'make_pen' iterates over many penalty pairs and picks the most suitable pair based on the difference between wrong and correct splits. The sample size, penalty range and resolution as well as the number of cycles can be customized. The primary start parameters create a matrix with n = rez_pen rows and n = rez_pen_out columns with values between sta_pen/sta_pen_out and end_pen/end_pen_out. The best penalty pair is picked. If dpt is bigger than 1, the same process is repeated with a new matrix of the same size based on the result of the previous cycle. Only position segments with a length within the sample size range are considered for the penalties, to reduce run time. Returns a penalty object (list of 4 objects), the first being the logbook.
A list with 4 items:
Integer, the logbook vector containing all penalty information
Integer, a vector with the respective penalty and outlier penalty
Matrix, a matrix of the correct splits
Matrix, a matrix of the incorrect splits
data(fit_minimal)
make_pen(
  inp = fit_minimal, FUN = rifi:::fragment_HL_pen, cores = 2,
  logs = as.numeric(rep(NA, 8)), dpt = 1, smpl_min = 10, smpl_max = 50,
  sta_pen = 0.5, end_pen = 4.5, rez_pen = 9,
  sta_pen_out = 0.5, end_pen_out = 3.5, rez_pen_out = 7
)
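The penalty grid described in the details can be pictured as follows. The sketch builds the rez_pen by rez_pen_out grid from the default start parameters and picks the pair with the largest difference between correct and wrong splits. The two scoring functions are hypothetical stand-ins for the counts returned by the bottom level functions, and maximising "correct minus wrong" is an assumption about how the difference is used.

```r
sta_pen     <- 0.5; end_pen     <- 4.5; rez_pen     <- 9
sta_pen_out <- 0.5; end_pen_out <- 3.5; rez_pen_out <- 7

pen     <- seq(sta_pen, end_pen, length.out = rez_pen)
pen_out <- seq(sta_pen_out, end_pen_out, length.out = rez_pen_out)

# Stand-in scores (purely hypothetical): number of correct and wrong
# splits for every penalty pair
correct_splits <- function(p, po) 20 - (p - 2)^2 - (po - 1.5)^2
wrong_splits   <- function(p, po) p + po

correct <- outer(pen, pen_out, correct_splits)   # rez_pen x rez_pen_out matrix
wrong   <- outer(pen, pen_out, wrong_splits)

# Pick the pair with the largest difference between correct and wrong splits
score     <- correct - wrong
best      <- arrayInd(which.max(score), dim(score))
best_pair <- c(pen = pen[best[1]], pen_out = pen_out[best[2]])
```

A further cycle (dpt > 1) would repeat the same steps on a narrower grid centred on best_pair.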
nls2_fit estimates decay for each probe or bin
nls2_fit uses the nls2 function to fit a probe or bin using the intensities of the time series data from different time points. nls2 tries different starting values generated through expand.grid and selects the best fit. Different filters can be applied prior to fitting the model.
nls2_fit( inp, cores = 1, decay = seq(0.01, 0.11, by = 0.02), delay = seq(0, 10, by = 0.1), k = seq(0.1, 1, 0.2), bg = 0.2 )
inp |
SummarizedExperiment: the input with correct format. |
cores |
integer: the number of assigned cores for the task. |
decay |
numeric vector: A sequence of starting values for the decay. Default is seq(0.01, 0.11, by = 0.02) |
delay |
numeric vector: A sequence of starting values for the delay. Default is seq(0,10, by=.1) |
k |
numeric vector: A sequence of starting values for the synthesis rate. Default is seq(0.1,1,0.2) |
bg |
numeric vector: A sequence of starting values. Default is 0.2. |
Before applying the nls2_fit function, a filtration step can be applied.
generic_filter_BG: filters probes with intensities below the background using a threshold. Those probes are filtered.
filtration_below_backg: an additional function exclusive to microarrays can be applied. It is very strict regarding the background (not recommended in the usual case).
filtration_above_backg: selects probes with a very high intensity and above the background (recommended for special transcripts). Probes are flagged with "ABG". Those transcripts are usually related to a specific function in bacteria. This filter selects all probes with the same ID, applies the mean, selects the last time point and compares it to the threshold.
The model used estimates the delay, decay, intensity of the first time point (synthesis rate/decay) and the background. The coefficients are gathered in vectors with the corresponding IDs. A missing fit or a very bad fit is assigned NA. For probes with very high intensities and above the background, the model omits the background coefficient. The output of all coefficients is saved in the metadata. The fits are plotted using the function_plot_fit.r through rifi_fit.
the SummarizedExperiment object: with delay and decay added to the rowRanges. The full fit data is saved in the metadata as "fit_STD".
Integer, the delay value of the bin/probe
Integer, the half-life of the bin/probe
data(preprocess_minimal)
nls2_fit(inp = preprocess_minimal, cores = 2)
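The idea of trying a grid of starting values and keeping the best candidate can be sketched without nls2. The example below assumes a piecewise model with a constant plateau k/decay before the delay and an exponential decay towards the background afterwards. This model form is an assumption for illustration and is not quoted from the package. The sketch simulates a noise-free probe, evaluates the residual sum of squares for every row of an expand.grid() table and converts the best decay constant into a half-life.

```r
# Assumed piecewise model (illustration only): plateau k / decay before the
# delay, exponential decay towards the background bg afterwards
decay_model <- function(time, delay, decay, k, bg) {
  plateau <- k / decay
  ifelse(time < delay, plateau,
         bg + (plateau - bg) * exp(-decay * (time - delay)))
}

time <- c(0, 1, 2, 3, 4, 5, 6, 8, 10, 15, 20)

# Grids of candidate values, with a coarser delay grid than the default
decay_grid <- seq(0.01, 0.11, by = 0.02)
delay_grid <- seq(0, 10, by = 1)
k_grid     <- seq(0.1, 1, 0.2)
grid <- expand.grid(decay = decay_grid, delay = delay_grid, k = k_grid, bg = 0.2)

# Simulate a noise-free probe from known grid values
inty <- decay_model(time, delay = delay_grid[3], decay = decay_grid[3],
                    k = k_grid[2], bg = 0.2)

# Residual sum of squares for every candidate in the grid
rss <- mapply(
  function(decay, delay, k, bg) sum((inty - decay_model(time, delay, decay, k, bg))^2),
  grid$decay, grid$delay, grid$k, grid$bg
)

best      <- grid[which.min(rss), ]
half_life <- log(2) / best$decay
```

In the package, nls2 refines such candidates by nonlinear least squares rather than stopping at the grid values.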
The result of rifi_penalties for E. coli example data
A SummarizedExperiment containing the output from rifi_penalties including the logbook and the four penalty objects as metadata.
data(penalties_e_coli)
A list with 5 items:
The logbook vector containing all penalty information
A list with 4 items:
The logbook vector containing all penalty information
a vector with the delay penalty and delay outlier penalty
a matrix of the correct splits
a matrix of the incorrect splits
A list with 4 items:
The logbook vector containing all penalty information
a vector with the half-life penalty and half-life outlier penalty
a matrix of the correct splits
a matrix of the incorrect splits
A list with 4 items:
The logbook vector containing all penalty information
a vector with the intensity penalty and intensity outlier penalty
a matrix of the correct splits
a matrix of the incorrect splits
A list with 4 items:
The logbook vector containing all penalty information
a vector with the TI penalty and TI outlier penalty
a matrix of the correct splits
a matrix of the incorrect splits
https://github.com/CyanolabFreiburg/rifi
The result of rifi_penalties for artificial example data
A SummarizedExperiment containing the output from rifi_penalties including the logbook and the four penalty objects as metadata.
data(penalties_minimal)
An object of class RangedSummarizedExperiment
with 24 rows and 33 columns.
https://github.com/CyanolabFreiburg/rifi
The result of rifi_penalties for Synechocystis 6803 example data. A SummarizedExperiment containing the output from rifi_penalties including the logbook and the four penalty objects as metadata.
data(penalties_synechocystis_6803)
A list with 5 items:
The logbook vector containing all penalty information
A list with 4 items:
The logbook vector containing all penalty information
a vector with the delay penalty and delay outlier penalty
a matrix of the correct splits
a matrix of the incorrect splits
A list with 4 items:
The logbook vector containing all penalty information
a vector with the half-life penalty and half-life outlier penalty
a matrix of the correct splits
a matrix of the incorrect splits
A list with 4 items:
The logbook vector containing all penalty information
a vector with the intensity penalty and intensity outlier penalty
a matrix of the correct splits
a matrix of the incorrect splits
A list with 4 items:
The logbook vector containing all penalty information
a vector with the TI penalty and TI outlier penalty
a matrix of the correct splits
a matrix of the incorrect splits
https://github.com/CyanolabFreiburg/rifi
predict_ps_itss predicts pausing sites (ps) and internal starting sites (ITSS) between delay fragments.
predict_ps_itss predicts ps and ITSS within the same TU. Neighboring delay segments are compared to each other by positioning the intercept of the second segment into the first segment using slope and intercept coefficients.
predict_ps_itss(inp, maxDis = 300)
inp |
SummarizedExperiment: the input data frame with correct format. |
maxDis |
integer: the maximal distance allowed between two successive fragments. |
predict_ps_itss identifies ps and ITSS in the following steps:
select unique TU.
select from the input dataframe the columns: ID, position, strand, delay, delay fragment, TU, slope coordinates, velocity_fragment and intercept.
select delay segments in the TU.
loop over all delay segments and estimate the coordinates of the last point of the first segment using the coefficients of the second segment, and vice versa. This gives two predicted positions; the difference between them is compared to the threshold.
If the strand is "-", additional steps are applied:
The positions of both segments are ordered from the last position to the first one.
All positions are merged into one column and subtracted from the maximum position. The column is then split in two; the first and second parts correspond to the positions of the first and second segments, respectively.
Both segments are subjected to an lm fit, and the predicted positions are used in the same way as for the opposite strand.
If the difference between the predicted positions is lower than the negative threshold, ps is assigned; if the difference is higher than the positive threshold, ITSS is assigned.
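The comparison of two neighboring delay segments can be illustrated with a simplified base R sketch. It is not the rifi implementation: `classify_junction`, the column names `position` and `delay`, the threshold of 1 and the sign convention (first-segment prediction minus second-segment prediction) are all assumptions made for illustration, and only the "+" strand case is covered.

```r
# Hypothetical sketch, not the rifi implementation.
# seg1, seg2: data frames with columns `position` and `delay` for two
#             neighboring delay segments on the "+" strand.
# threshold : assumed cut-off (in delay units) for calling an event.
classify_junction <- function(seg1, seg2, threshold = 1) {
  fit1 <- lm(delay ~ position, data = seg1)
  fit2 <- lm(delay ~ position, data = seg2)
  # Evaluate both linear fits at the last position of the first segment.
  junction <- data.frame(position = max(seg1$position))
  d1 <- unname(predict(fit1, newdata = junction))
  d2 <- unname(predict(fit2, newdata = junction))
  difference <- d1 - d2
  if (difference < -threshold) {
    "ps"            # delay jumps upwards in the second segment
  } else if (difference > threshold) {
    "ITSS"          # delay drops in the second segment
  } else {
    NA_character_   # no event called
  }
}

pos1 <- seq(100, 500, by = 100)
pos2 <- seq(600, 1000, by = 100)
seg1 <- data.frame(position = pos1, delay = pos1 / 1000)
seg2 <- data.frame(position = pos2, delay = 3 + pos2 / 1000)
classify_junction(seg1, seg2)
```

In this toy input the second segment starts about 3 delay units later than the first one predicts, so the junction is classified as a pausing site.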
The SummarizedExperiment with the columns regarding statistics:
The bin/probe specific ID.
The bin/probe specific position.
The bin/probe specific strand.
The relative intensity at time point 0.
An internal value to determine which fitting model is applied.
Information on which fitting model is applied.
The position based segment.
The delay value of the bin/probe.
The half-life of the bin/probe.
String, the factor of TI fragment.
The delay fragment the bin belongs to.
The velocity value of the respective delay fragment.
The intercept of the fit through the respective delay fragment.
The slope of the fit through the respective delay fragment.
The half-life fragment the bin belongs to.
The mean half-life value of the respective half-life fragment.
The intensity fragment the bin belongs to.
The mean intensity value of the respective intensity fragment.
The overarching transcription unit.
The TI fragment the bin belongs to.
The mean termination factor of the respective TI fragment.
The combined ID of the fragment.
presence of pausing site indicated by +/-.
presence of iTSS_I indicated by +/-.
The fragments involved in pausing site or iTSS_I.
Integer, the duration between two delay fragments.
data(fragmentation_minimal)
predict_ps_itss(inp = fragmentation_minimal, maxDis = 300)
The result of rifi_preprocess for E.coli example data. A SummarizedExperiment containing the output from rifi_preprocess, including the inp and the modified input_df.
data(preprocess_e_coli)
A SummarizedExperiment:
the SummarizedExperiment:
The bin/probe specific ID
The bin/probe specific position
The relative intensity at time point 0
An internal value to determine which fitting model is applied
Information on which fitting model is applied
The position based segment
the fit object for the TI fit:
relative intensities at 0 min
relative intensities at 1 min
relative intensities at 10 min
relative intensities at 15 min
relative intensities at 2 min
relative intensities at 20 min
relative intensities at 3 min
relative intensities at 4 min
relative intensities at 5 min
relative intensities at 6 min
relative intensities at 8 min
The bin/probe specific ID
The bin/probe specific position
indicator whether the replicate is filtered or not
https://github.com/CyanolabFreiburg/rifi
The result of rifi_preprocess for artificial example data. A SummarizedExperiment containing the output from rifi_preprocess.
data(preprocess_minimal)
An object of class RangedSummarizedExperiment
with 4 rows and 33 columns.
https://github.com/CyanolabFreiburg/rifi
The result of rifi_preprocess for Synechocystis 6803 example data: a SummarizedExperiment containing the output of rifi_preprocess as an extension to rowRanges.
data(preprocess_synechocystis_6803)
A SummarizedExperiment:
the SummarizedExperiment:
The bin/probe specific ID
The bin/probe specific position
The bin/probe specific strand
The relative intensity at time point 0
An internal value to determine which fitting model is applied
Information on which fitting model is applied
The position based segment
the fit object for the TI fit:
relative intensities at 0 min
relative intensities at 2 min
relative intensities at 4 min
relative intensities at 8 min
relative intensities at 16 min
relative intensities at 32 min
relative intensities at 64 min
The bin/probe specific ID
The bin/probe specific position
indicator whether the replicate is filtered or not
https://github.com/CyanolabFreiburg/rifi
The result of event_dataframe for E.coli artificial example. A data frame combining the processed genome annotation and a SummarizedExperiment data from rifi_stats. The dataframe is
data(res_minimal)
A list with 2 items:
the region from the gff file
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
the strand of the annotation
The overarching transcription unit
The bin/probe specific position
String, fragments involved in fold change between 2 intensity fragments
Integer, the fold change value of 2 intensity fragments
p_value of the fold change of intensity fragments
Integer, the fold change value of 2 intensity fragments
Integer, the fold change value of 2 HL fragments
p_value of the fold change of HL fragments
fragments involved on ratio of fold change between 2 half-life fragments and fold change between 2 intensity fragments
ratio of fold change between 2 half-life fragments and fold change between 2 intensity fragments
Integer, the fold change of half-life/ fold change of intensity, position of the half-life fragment is adapted to intensity fragment
p_value of the variance between two fold-changes, HL and intensity
Integer, the value corresponding to synthesis rate
String, the event assigned by synthesis rate either Termination or iTSS
presence of pausing site indicated by +/-
presence of iTSS_I indicated by +/-
p_value of pausing site or iTSS_I
The fragments involved in pausing site or iTSS_I
Integer, the middle position between 2 fragments with an event
Integer, the duration between two delay fragments
the slope value of the respective delay fragment
p_value of the slope
The delay value of the bin/probe
The half-life of the bin/probe
The relative intensity at time point 0
https://github.com/CyanolabFreiburg/rifi
rifi_fit conveniently wraps all fitting steps.
rifi_fit wraps the functions:
nls2_fit
TI_fit
plot_nls2_function
plot_singleProbe_function
rifi_fit( inp, cores = 1, viz = FALSE, restr = 0.2, decay = seq(0.08, 0.11, by = 0.02), delay = seq(0, 10, by = 0.1), k = seq(0.1, 1, 0.2), bg = 0.2, TI_k = seq(0, 1, by = 0.5), TI_decay = c(0.05, 0.1, 0.2, 0.5, 0.6), TI = seq(0, 1, by = 0.5), TI_delay = seq(0, 2, by = 0.5), TI_rest_delay = seq(0, 2, by = 0.5), TI_bg = 0 )
inp |
SummarizedExperiment: the input with correct format. |
cores |
integer: the number of assigned cores for the task. |
viz |
logical: whether to visualize the output. |
restr |
numeric: a parameter that restricts the freedom of the fit to avoid wrong TI-term_factors, ranges from 0 to 0.2 |
decay |
numeric vector: A sequence of starting values for the decay. Default is seq(.08, 0.11, by=.02) |
delay |
numeric vector: A sequence of starting values for the delay. Default is seq(0,10, by=.1) |
k |
numeric vector: A sequence of starting values for the synthesis rate. Default is seq(0.1,1,0.2) |
bg |
numeric vector: A sequence of starting values. Default is 0.2. |
TI_k |
numeric vector: A sequence of starting values for the synthesis rate. Default is seq(0, 1, by = 0.5). |
TI_decay |
numeric vector: A sequence of starting values for the decay. Default is c(0.05, 0.1, 0.2, 0.5, 0.6). |
TI |
numeric vector: A sequence of starting values for the TI. Default is seq(0, 1, by = 0.5). |
TI_delay |
numeric vector: A sequence of starting values for the delay. Default is seq(0, 2, by = 0.5). |
TI_rest_delay |
numeric vector: A sequence of starting values. Default is seq(0, 2, by = 0.5). |
TI_bg |
numeric vector: A sequence of starting values. Default is 0. |
the SummarizedExperiment object: with delay, decay and TI_termination_factor added to the rowRanges. The full fit data is saved in the metadata as "fit_STD" and "fit_TI". A plot is given if viz = TRUE.
nls2_fit
TI_fit
plot_nls2
plot_singleProbe
data(preprocess_minimal)
rifi_fit( inp = preprocess_minimal, cores = 1, viz = FALSE, restr = 0.1, decay = seq(.08, 0.11, by = .02), delay = seq(0, 10, by = .1), k = seq(0.1, 1, 0.2), bg = 0.2, TI_k = seq(0, 1, by = 0.5), TI_decay = c(0.05, 0.1, 0.2, 0.5, 0.6), TI = seq(0, 1, by = 0.5), TI_delay = seq(0, 2, by = 0.5), TI_rest_delay = seq(0, 2, by = 0.5), TI_bg = 0 )
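The starting-value arguments are plain numeric vectors, so they can be inspected with base R before a run. The sketch below assumes, purely for illustration, that every combination of decay, delay, k and bg is a candidate starting point; how rifi_fit combines these vectors internally is not stated in this documentation, and `start_grid` is a hypothetical name.

```r
# Default starting values taken from the rifi_fit usage above.
decay <- seq(0.08, 0.11, by = 0.02)
delay <- seq(0, 10, by = 0.1)
k     <- seq(0.1, 1, 0.2)
bg    <- 0.2

# Assumption: a full grid of all combinations of starting values.
start_grid <- expand.grid(decay = decay, delay = delay, k = k, bg = bg)

decay             # the default sequence holds only two decay values
nrow(start_grid)  # number of candidate starting points in the grid
```

Because seq() never steps past its upper bound, seq(0.08, 0.11, by = 0.02) stops at 0.10; widening the range or reducing the step adds starting values at the cost of a larger grid.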
rifi_fragmentation conveniently wraps all fragmentation steps.
rifi_fragmentation is a wrapper for the following functions:
fragment_delay
fragment_HL
fragment_inty
TUgether
fragment_TI
rifi_fragmentation( inp, cores = 1, pen_delay = NULL, pen_out_delay = NULL, pen_HL = NULL, pen_out_HL = NULL, pen_inty = NULL, pen_out_inty = NULL, pen_TU = NULL, pen_TI = NULL, pen_out_TI = NULL )
inp |
SummarizedExperiment: the input data frame with correct format. |
cores |
integer: the number of assigned cores for the task. |
pen_delay |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer fragments. Default is the auto generated value. |
pen_out_delay |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer allowed outliers. Default is the auto generated value. |
pen_HL |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer fragments. Default is the auto generated value. |
pen_out_HL |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer allowed outliers. Default is the auto generated value. |
pen_inty |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer fragments. Default is the auto generated value. |
pen_out_inty |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer allowed outliers. Default is the auto generated value. |
pen_TU |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer fragments. Default is -0.75. |
pen_TI |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer fragments. Default is the auto generated value. |
pen_out_TI |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer allowed outliers. Default is the auto generated value. |
the SummarizedExperiment object: with delay_fragment, HL_fragment, intensity_fragment, TI_termination_fragment and TU, and the respective values added to the rowRanges.
fragment_delay
fragment_HL
fragment_inty
TUgether
fragment_TI
data(penalties_minimal)
rifi_fragmentation(inp = penalties_minimal, cores = 2)
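The effect of a penalty on the number of fragments can be shown with a generic penalised segmentation by dynamic programming. This is a textbook-style sketch in base R, not the rifi algorithm: it uses the squared deviation from the segment mean as cost, has no outlier penalty, and `segment_dp` is a hypothetical name.

```r
# Generic sketch, not the rifi implementation.
# Splits x into segments; each segment costs its sum of squared
# deviations from the segment mean plus a fixed penalty `pen`.
segment_dp <- function(x, pen) {
  n <- length(x)
  best <- c(0, rep(Inf, n))  # best[i + 1]: minimal cost of x[1..i]
  prev <- integer(n)         # prev[i]: start of the last segment ending at i
  for (i in seq_len(n)) {
    for (j in seq_len(i)) {  # candidate last segment x[j..i]
      seg <- x[j:i]
      cost <- best[j] + sum((seg - mean(seg))^2) + pen
      if (cost < best[i + 1]) {
        best[i + 1] <- cost
        prev[i] <- j
      }
    }
  }
  # Backtrack the start index of every segment.
  starts <- integer(0)
  i <- n
  while (i > 0) {
    starts <- c(prev[i], starts)
    i <- prev[i] - 1
  }
  starts
}

x <- c(1, 1.1, 0.9, 1, 3, 3.1, 2.9, 3)
segment_dp(x, pen = 0.5)  # low penalty: the step at index 5 is split off
segment_dp(x, pen = 100)  # high penalty: everything stays in one segment
```

The same trade-off applies to the pen_* arguments: a higher penalty makes each additional fragment more expensive, so fewer fragments are returned.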
rifi_penalties conveniently wraps all penalty steps.
rifi_penalties wraps the functions:
make_pen
viz_pen_obj
rifi_penalties( inp, details = FALSE, viz = FALSE, top_i = 25, cores = 1, dpt = 1, smpl_min = 10, smpl_max = 100, sta_pen = 0.5, end_pen = 4.5, rez_pen = 9, sta_pen_out = 0.5, end_pen_out = 4.5, rez_pen_out = 9 )
inp |
SummarizedExperiment: the input data frame with correct format. |
details |
logical: whether to return the penalty objects or just the logbook. |
viz |
logical: whether to visualize the output or not. Default is FALSE |
top_i |
integer: the number of top results visualized. Default is 25. |
cores |
integer: the number of assigned cores for the task. |
dpt |
integer: the number of times a full iteration cycle is repeated with a narrower range based on the previous cycle. Default is 1. |
smpl_min |
integer: the smaller end of the sampling size. Default is 10. |
smpl_max |
integer: the larger end of the sampling size. Default is 100. |
sta_pen |
numeric: the lower starting penalty. Default is 0.5. |
end_pen |
numeric: the higher starting penalty. Default is 4.5. |
rez_pen |
numeric: the number of penalties iterated within the penalty range. Default is 9. |
sta_pen_out |
numeric: the lower starting outlier penalty. Default is 0.5. |
end_pen_out |
numeric: the higher starting outlier penalty. Default is 4.5. |
rez_pen_out |
numeric: the number of outlier penalties iterated within the outlier penalty range. Default is 9. |
The SummarizedExperiment object: with the penalties in the logbook added to the metadata. Also adds logbook_details if details is TRUE, and plots the penalties if viz is TRUE.
make_pen
viz_pen_obj
data(fit_minimal)
rifi_penalties( inp = fit_minimal, details = FALSE, viz = FALSE, top_i = 25, cores = 2, dpt = 1, smpl_min = 10, smpl_max = 100, sta_pen = 0.5, end_pen = 4.5, rez_pen = 9, sta_pen_out = 0.5, end_pen_out = 4.5, rez_pen_out = 9 )
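The range arguments can be read as a grid of candidate penalties. The base R sketch below assumes that rez_pen and rez_pen_out give the number of evenly spaced values between the lower and higher starting penalties, and that every penalty is paired with every outlier penalty; both points are an interpretation of the argument descriptions, not confirmed behavior, and `pen_grid` is a hypothetical name.

```r
# Defaults taken from the rifi_penalties usage above.
sta_pen     <- 0.5
end_pen     <- 4.5
rez_pen     <- 9
sta_pen_out <- 0.5
end_pen_out <- 4.5
rez_pen_out <- 9

# Assumption: evenly spaced candidates within each range.
pen     <- seq(sta_pen, end_pen, length.out = rez_pen)
pen_out <- seq(sta_pen_out, end_pen_out, length.out = rez_pen_out)

# Assumption: all pairs of penalty and outlier penalty are evaluated.
pen_grid <- expand.grid(pen = pen, pen_out = pen_out)

pen
nrow(pen_grid)
```

Under these assumptions the defaults give penalties in steps of 0.5, and raising rez_pen or rez_pen_out refines the grid at the cost of more combinations per cycle.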
rifi_preprocess conveniently wraps all pre-processing steps.
rifi_preprocess wraps the functions:
check_input
make_df
function_seg
finding_PDD
finding_TI
rifi_preprocess( inp, cores, FUN_filter = function(x) { FALSE }, bg = 0, rm_FLT = FALSE, thrsh_check = 0, dista = 300, run_PDD = FALSE, pen_PDD = 2, pen_out_PDD = 1, thrsh_PDD = 0.001, pen_TI = 10, thrsh_TI = 0.5, add = 1000 )
inp |
SummarizedExperiment: the input. |
cores |
integer: the number of assigned cores for the task. |
FUN_filter |
function: A function of x, returning a logical. x is the numeric vector of the intensity from all time points for a specific replicate. |
bg |
numeric: threshold that the last time point must exceed for the probe/bin to be fitted with the above background mode. |
rm_FLT |
logical: remove IDs where all replicates are marked as filtered by the background check. Default is FALSE. |
thrsh_check |
numeric: the minimal allowed intensity for time point "0". Advised to be kept at 0! Default is 0. |
dista |
integer: the amount of nucleotides defining the gap. Default is 300. |
run_PDD |
logical: running the PDD flag function |
pen_PDD |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer fragments. Advised to be kept at 2. Default is 2. |
pen_out_PDD |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer possible outliers. Advised to be kept at 1. Default is 1. |
thrsh_PDD |
numeric: an internal parameter that allows fragments with slopes steeper than the threshold to be flagged with "PDD". Higher values result in fewer candidates. Advised to be kept at 0.001. Default is 0.001. |
pen_TI |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer fragments. Advised to be kept at 10. Default is 10. |
thrsh_TI |
numeric: an internal parameter that allows fragments with a certain amount of IDs with higher relative intensities at time points later than "0" to be flagged as "TI". Higher values result in fewer candidates. -0.5 is 25 %, 0 is 50%, 0.5 is 75%. Advised to be kept at 0.5. Default is 0.5. |
add |
integer: range of nucleotides before a potential TI event within which IDs are fitted with the TI fit. |
rifi_preprocess allows for the optional integration of filter functions. Filter functions mark replicates with TRUE; marked replicates are not considered in the fit. FUN_filter is a general filter, usually used to exclude probes with low expression or "bad" patterns.
The SummarizedExperiment object: checked, and with position, ID, intensity, probe_TI, position_segment, flag and filtration added to the rowRanges.
check_input
make_df
segment_pos
finding_PDD
finding_TI
data(example_input_minimal)
rifi_preprocess( inp = example_input_minimal, cores = 2, bg = 100, rm_FLT = FALSE, thrsh_check = 0, dista = 300, run_PDD = FALSE )
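A custom filter only has to take the numeric intensity vector of one replicate and return a single logical. The following sketch shows one possible filter; the name `low_expression_filter` and the cut-off of 50 are assumptions chosen for illustration and should be adapted to the scale of the data.

```r
# Hypothetical filter, not shipped with rifi.
# x: numeric vector of intensities from all time points of one replicate.
# Returns TRUE (replicate is filtered) when every intensity is below an
# assumed cut-off of 50; NA values are ignored.
low_expression_filter <- function(x) {
  all(x < 50, na.rm = TRUE)
}

low_expression_filter(c(12, 30, 8, 5))       # weakly expressed replicate
low_expression_filter(c(900, 400, 120, 30))  # clearly expressed replicate
```

Such a function would be supplied through the FUN_filter argument of rifi_preprocess in place of the default function(x) { FALSE }, which filters nothing. Note that a replicate consisting only of NA values is also flagged by this sketch.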
rifi_stats wraps the functions:
predict_ps_itss
apply_Ttest_delay
apply_ancova
apply_event_position
apply_t_test
fold_change
apply_manova
apply_t_test_ti
gff3_preprocess
rifi_stats(inp, dista = 300, path)
inp |
SummarizedExperiment: the input data frame with correct format. |
dista |
integer: the maximal distance allowed between two successive fragments. Default is 300. |
path |
path: to the directory containing the gff3 file. |
The SummarizedExperiment object: ID with position, strand, intensity, probe_TI, flag, position_segment, delay, half_life, TI_termination_factor, delay_fragment, velocity_fragment, intercept, slope, HL_fragment, HL_mean_fragment, intensity_fragment, intensity_mean_fragment, TU, TI_termination_fragment, TI_mean_termination_factor, seg_ID, pausing_site, iTSS_I, ps_ts_fragment, event_ps_itss_p_value_Ttest, p_value_slope, delay_frg_slope, velocity_ratio, event_duration, event_position, FC_HL, FC_fragment_HL, p_value_HL, FC_intensity, FC_fragment_intensity, p_value_intensity, FC_HL_intensity, FC_HL_intensity_fragment, FC_HL_adapted, synthesis_ratio, synthesis_ratio_event, p_value_Manova, p_value_TI, TI_fragments_p_value
predict_ps_itss
apply_Ttest_delay
apply_ancova
apply_event_position
apply_t_test
fold_change
apply_manova
apply_t_test_ti
gff3_preprocess
data(fragmentation_minimal)
rifi_stats(inp = fragmentation_minimal, dista = 300, path = gzfile(system.file("extdata", "gff_e_coli.gff3.gz", package = "rifi")))
rifi_summary conveniently wraps and summarizes all rifi outputs.
rifi_summary wraps the functions:
event_dataframe
dataframe_summary
dataframe_summary_events
dataframe_summary_events_HL_int
dataframe_summary_events_ps_itss
dataframe_summary_events_velocity
dataframe_summary_TI
rifi_summary(inp, data_annotation = metadata(inp)$annot[[1]])
inp |
SummarizedExperiment: the input data frame with correct format. |
data_annotation |
dataframe: gff3 dataframe after processing. |
WIP
event_dataframe
dataframe_summary
dataframe_summary_events
dataframe_summary_events_HL_int
dataframe_summary_events_ps_itss
dataframe_summary_events_velocity
dataframe_summary_TI
data(stats_minimal)
if(!require(SummarizedExperiment)){ suppressPackageStartupMessages(library(SummarizedExperiment)) }
rifi_summary(inp = stats_minimal, data_annotation = metadata(stats_minimal)$annot[[1]])
rifi_visualization plots all the data with fragments and events from both strands
rifi_visualization plots the whole genome with genes, transcription units (TUs), delay, half-life (HL), intensity fragments, features, events, velocity, annotation, coverage if available.
rifi_visualization( data, genomeLength, annot, coverage = 0, chr_fwd = NA, chr_rev = NA, region = c("CDS", "asRNA", "5'UTR", "ncRNA", "3'UTR", "tRNA"), color_region = c("grey0", "red", "blue", "orange", "yellow", "green", "white", "darkseagreen1", "grey50", "black"), color_text.1 = "grey0", color_text.2 = "black", color_TU = "blue", Alpha = 0.5, size_tu = 1.6, size_locusTag = 1.6, size_gene = 1.6, Limit = 10, shape = 22, col_outiler = "grey50", col_coverage = "grey", shape_outlier = 13, limit_intensity = NA, face = "bold", tick_length = 0.3, arrow.color = "darkseagreen1", minVelocity = 3000, medianVelocity = 6000, col_above20 = "#00FFFF", fontface = "plain", shape_above20 = 14, col_outlierabove10 = "darkorchid", shape_outlierabove10 = 5, axis_text_y_size = 3, axis_title_y_size = 6, TI_threshold = 1.1, termination_threshold = -0.5, iTSS_threshold = 0.5, p_value_int = 0.05, p_value_event = 0.05, p_value_hl = 0.05, p_value_TI = 0.05, p_value_manova = 0.05, event_duration_ps = 1, event_duration_itss = -1, HL_threshold_1 = log2(1.5), HL_threshold_2 = -log2(1.5), vel_threshold = 200, HL_threshold_color = "black", vel_threshold_color = "grey52", ps_color = "orange", iTSS_I_color = "blue" )
data |
SummarizedExperiment: the input data frame with correct format. |
genomeLength |
integer: genome length output of gff3_preprocess function and element of metadata of SummarizedExperiment. |
annot |
dataframe: the annotation file, output of gff3_preprocess function and element of metadata of SummarizedExperiment. |
coverage |
integer: in case the coverage is available. |
chr_fwd |
string object: coverage of the forward strand. |
chr_rev |
string object: coverage of the reverse strand. |
region |
dataframe: gff3 features of the genome. |
color_region |
string vector: vector of colors. |
color_text.1 |
string: TU color text |
color_text.2 |
string: genes color text |
color_TU |
string: TU color |
Alpha |
integer: color transparency degree. |
size_tu |
integer: TU size |
size_locusTag |
integer: locus_tag size |
size_gene |
integer: font size for gene annotation. |
Limit |
integer: value for y-axis limit. |
shape |
integer: value for shape. |
col_outiler |
string: outlier color. |
col_coverage |
string: color for coverage plot. |
shape_outlier |
integer: value for outlier shape. |
limit_intensity |
integer: intensity limit if applicable. |
face |
string: label font. |
tick_length |
integer: value for ticks. |
arrow.color |
string: arrows color. |
minVelocity |
integer: threshold to fix the minimum of velocity. |
medianVelocity |
integer: threshold to fix the maximum of velocity. |
col_above20 |
string: color for probes/bin above value 20. |
fontface |
string: font type |
shape_above20 |
integer: shape for probes/bins above value 20. |
col_outlierabove10 |
string: color for probes/bin outliers between 10 and 20. |
shape_outlierabove10 |
integer: shape for probes/bin outliers between 10 and 20. |
axis_text_y_size |
integer: text size for y-axis. |
axis_title_y_size |
integer: title size for y-axis. |
TI_threshold |
integer: threshold for TI between two fragments in case the TI termination factor drops from the first segment to the second, default 1.1. If the threshold is reached, a line is drawn to separate the two TI segments. |
termination_threshold |
integer: threshold for termination to plot, default -0.5. |
iTSS_threshold |
integer: threshold for iTSS_II selected to plot, default 0.5. |
p_value_int |
integer: p_value of intensity fragments fold-change to plot, default 0.05. |
p_value_event |
integer: p_value of t-test from pausing site and iTSS_I events to plot, default 0.05. |
p_value_hl |
integer: p_value of half_life fragments fold-change to plot, default 0.05. |
p_value_TI |
integer: p_value of TI fragments selected to be plotted, default 0.05. |
p_value_manova |
integer: p_value of manova test fragments to plot, default 0.05. |
event_duration_ps |
integer: threshold for pausing sites selected to plot, default 1. |
event_duration_itss |
integer: threshold for iTSS_I selected to plot, default -1. |
HL_threshold_1 |
integer: threshold for log2FC(HL) selected to plot, default log2(1.5). log2FC(HL) >= log2(1.5) are indicated by black color. If p_value <= p_value_hl (default 0.05), log2FC(HL) is indicated by HL* otherwise HL. |
HL_threshold_2 |
integer: threshold for log2FC(HL) selected to plot, default -log2(1.5). log2FC(HL) <= -log2(1.5) are indicated by green color. If p_value <= p_value_hl (default 0.05), log2FC(HL) is indicated by HL* otherwise HL. If the p_value is significant and the log2FC(HL) lies between -log2(1.5) and log2(1.5), the FC is shown in green and labelled HL*. |
vel_threshold |
integer: threshold for velocity ratio selected to plot, default 200. |
HL_threshold_color |
string: color for HL fold change plot. |
vel_threshold_color |
string: color for velocity ratio plot. |
ps_color |
string: color for pausing site plot. |
iTSS_I_color |
string: color for iTSS_I plot. |
rifi_visualization uses several functions to plot the genes, including asRNA and ncRNA, and the TUs as segments. The function plots delay, HL and intensity fragments together with the statistical t-test between neighboring fragments; a significant t-test is marked with '*'. Significant t-test and Manova results are likewise depicted as '*'.
The functions used are:
annotation_plot: plots the corresponding annotation.
positive_strand_function: plots delay, HL, intensity and events of positive strand.
negative_strand_function: plots delay, HL, intensity and events of negative strand.
empty_data_positive: plots empty boxes in case no data is available for positive strand.
empty_data_negative: plots empty boxes in case no data is available for negative strand.
strand_selection: check if data is stranded and arrange by position.
splitGenome_function: splits the genome into fragments.
indice_function: assign a new column to the data to distinguish between fragments, outliers from delay or HL or intensity.
TU_annotation: designs the segments border for the genes and TUs annotation
gene_annot_function: it requires gff3 file, returns a dataframe adjusting each fragment according to its annotation. It allows as well the plot of genes and TUs shared into two pages.
label_log2_function: used to add log scale to intensity values.
label_square_function: used to add square scale to coverage values.
coverage_function: this function is used only in case of coverage is available.
secondaryAxis: adjusts the half-life or delay to 20 if the dataframe has a single row and the half-life or delay exceeds the limit; such values are plotted with a different shape and color.
outlier_plot: plot the outliers with half-life between 10 and 30 on the maximum of the yaxis.
add_genomeBorders: annotated genes that lie on a page border cannot be plotted as a whole; the region is therefore split in 2, and the row corresponding to the split part is added to the next annotation (i + 1), except for the first page.
my_arrow: creates an arrow for the annotation.
arrange_byGroup: selects the last row for each segment and add 40 nucleotides in case of negative strand for a nice plot.
regr: plots the predicted delay from linear regression if the data is on negative strand.
meanPosition: assign a mean position for the plot.
delay_mean: adds a column in case of velocity is NA or equal to 60. The mean of the delay is calculated outliers.
my_segment_T: plots terminals and pausing sites labels.
my_segment_NS: plots internal starting sites 'iTSS'.
min_value: returns minimum value for event plots in intensity plot.
velocity_fun: function for velocity plot.
limit_function: for values above 10 or 20 in delay and hl. Limit of the axis is set differently. y-axis limit is applied only if we have more than 3 values above 10 and lower or equal to 20. An exception is added in case a dataframe has less than 3 rows and 1 or more values are above 10, the rest of the values above 20 are adjusted to 20 on "secondaryAxis" function.
empty_boxes: used only in case the dataframe from the positive strand is not empty, the TU are annotated.
function_TU_arrow: used to avoid plotting arrows when a TU is split into two pages.
terminal_plot_lm: draws a linear regression line when terminal outliers are consecutive and have an intensity above a certain threshold. These are usually small RNAs (ncRNA, asRNA).
slope_function: replaces slopes lower than 0.0009 with 0.
velo_function: replaces infinite velocity with NA.
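The two clean-up helpers slope_function and velo_function can be pictured with a minimal base R sketch. The function names below are hypothetical stand-ins rather than the package code, and computing the velocity as 1 / slope is an assumption made only for this illustration.

```r
# Hypothetical stand-ins for the helpers described above; not the rifi code.
# Set slopes below the threshold to 0 (the 0.0009 cutoff is taken from the text).
zero_small_slopes <- function(slope, threshold = 0.0009) {
  slope[!is.na(slope) & slope < threshold] <- 0
  slope
}

# Replace infinite velocities (for example the result of 1 / 0) with NA.
mask_infinite_velocity <- function(velocity) {
  velocity[is.infinite(velocity)] <- NA
  velocity
}

slopes <- c(0.0005, 0.002, 0, NA)
# Assumption for this sketch: velocity is the reciprocal of the slope.
velocities <- 1 / zero_small_slopes(slopes)
mask_infinite_velocity(velocities)
```

Zeroing very small slopes first means the reciprocal becomes infinite, which the second helper then turns into NA so that the value is skipped when plotting.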
plot the coverage of RNA_seq in exponential phase growth
The visualization plot.
data(stats_minimal)
if (!require(SummarizedExperiment)) {
  suppressPackageStartupMessages(library(SummarizedExperiment))
}
rifi_visualization(
  data = stats_minimal,
  genomeLength = metadata(stats_minimal)$annot[[2]],
  annot = metadata(stats_minimal)$annot[[1]]
)
rifi_wrapper conveniently wraps all functions included in the rifi workflow
rifi_wrapper wraps the functions:
rifi_preprocess
rifi_fit
rifi_penalties
rifi_fragmentation
rifi_stats
rifi_summary
rifi_visualization.
rifi_wrapper(inp, cores, path, bg, restr)
inp |
data frame: the input data frame with correct format. |
cores |
integer: the number of assigned cores for the task. |
path |
path: path to an annotation file in gff format. |
bg |
numeric: the threshold that the last time point must exceed to be fitted with the above-background mode. |
restr |
numeric: a parameter that restricts the freedom of the fit to avoid wrong TI-term_factors, ranges from 0 to 0.2 |
All intermediate objects
rifi_preprocess
rifi_fit
rifi_penalties
rifi_fragmentation
rifi_stats
rifi_summary
rifi_visualization
data(example_input_minimal)
rifi_wrapper(
  inp = example_input_minimal,
  cores = 2,
  path = gzfile(system.file("extdata", "gff_e_coli.gff3.gz", package = "rifi")),
  bg = 0,
  restr = 0.01
)
segment_pos divides all IDs by position into position_segments
segment_pos adds the column "position_segment" to the rowRanges. To reduce run time, the data is divided at regions of no expression larger than "dista" nucleotides.
segment_pos(inp, dista = 300)
inp |
SummarizedExperiment: the input. |
dista |
integer: the number of nucleotides defining the gap. Default is 300. |
The SummarizedExperiment object:
ID: |
The bin/probe specific ID |
position: |
The bin/probe specific position |
intensity: |
The relative intensity at time point 0 |
probe_TI: |
An internal value to determine which fitting model is applied |
flag: |
Information on which fitting model is applied |
position_segment: |
The position based segment |
data(preprocess_minimal)
segment_pos(inp = preprocess_minimal, dista = 300)
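The gap-based division that segment_pos describes can be sketched in base R on a plain vector of positions. This is only an illustration under stated assumptions: the helper name split_by_gap and the "S_" label prefix are hypothetical, strands are ignored, and the package itself works on the rowRanges of a SummarizedExperiment.

```r
# Hypothetical helper, not the rifi implementation: start a new position
# segment wherever the gap to the previous position exceeds dista.
split_by_gap <- function(position, dista = 300) {
  ord <- order(position)                 # work on sorted positions
  gaps <- diff(position[ord])            # distance to the previous position
  seg <- cumsum(c(1, gaps > dista))      # increment the counter at each large gap
  out <- character(length(position))
  out[ord] <- paste0("S_", seg)          # labels returned in the input order
  out
}

positions <- c(50, 150, 250, 900, 1000, 1600)
split_by_gap(positions, dista = 300)
```

Because each segment is separated from its neighbors by a stretch without expression, later steps can treat the segments independently, which is what reduces the run time.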
The result of rifi_stats for E. coli example data
A SummarizedExperiment containing the output from rifi_stats
data(stats_e_coli)
A SummarizedExperiment:
The bin/probe specific ID
The bin/probe specific position
The bin/probe specific strand
The relative intensity at time point 0
An internal value to determine which fitting model is applied
Information on which fitting model is applied
The position based segment
The delay value of the bin/probe
The half-life of the bin/probe
String, the factor of TI fragment
The delay fragment the bin belongs to
The velocity value of the respective delay fragment
The intercept of the fit through the respective delay fragment
The slope of the fit through the respective delay fragment
The half-life fragment the bin belongs to
The mean half-life value of the respective half-life fragment
The intensity fragment the bin belongs to
The mean intensity value of the respective intensity fragment
The overarching transcription unit
The TI fragment the bin belongs to
The mean termination factor of the respective TI fragment
The combined ID of the fragment
presence of pausing site indicated by +/-
presence of iTSS_I indicated by +/-
The fragments involved in pausing site or iTSS_I
p_value of pausing site or iTSS_I
p_value of the slope
the slope value of the respective delay fragment
Integer, ratio of velocity between 2 delay fragments
Integer, the duration between two delay fragments
Integer, the position middle between 2 fragments with an event
Integer, the fold change value of 2 HL fragments
Integer, the fold change value of 2 intensity fragments
p_value of the fold change of HL fragments
Integer, the fold change value of 2 intensity fragments
String, fragments involved in fold change between 2 intensity fragments
p_value of the fold change of intensity fragments
ratio of fold change between 2 half-life fragments and fold change between 2 intensity fragments
fragments involved in the ratio of fold change between 2 half-life fragments and fold change between 2 intensity fragments
Integer, the fold change of half-life/ fold change of intensity, position of the half-life fragment is adapted to intensity fragment
Integer, the value corresponding to synthesis rate
String, the event assigned by synthesis rate either Termination or iTSS
p_value of the variance between two fold-changes, HL and intensity
p_value of TI fragment
p_value of 2 TI fragments
https://github.com/CyanolabFreiburg/rifi
The result of rifi_stats for artificial example data
A SummarizedExperiment containing the output of rifi_stats as an extension to rowRanges and metadata (gff file processed, see gff file documentation)
data(stats_minimal)
A rowRanges of SummarizedExperiment with 24 rows and 45 variables:
The bin/probe specific ID
The bin/probe specific position
The relative intensity at time point 0
An internal value to determine which fitting model is applied
Information on which fitting model is applied
The position based segment
The delay value of the bin/probe
The half-life of the bin/probe
String, the factor of TI fragment
The delay fragment the bin belongs to
The velocity value of the respective delay fragment
The intercept of the fit through the respective delay fragment
The slope of the fit through the respective delay fragment
The half-life fragment the bin belongs to
The mean half-life value of the respective half-life fragment
The intensity fragment the bin belongs to
The mean intensity value of the respective intensity fragment
The overarching transcription unit
The TI fragment the bin belongs to
The mean termination factor of the respective TI fragment
The combined ID of the fragment
presence of pausing site indicated by +/-
presence of iTSS_I indicated by +/-
The fragments involved in pausing site or iTSS_I
p_value of pausing site or iTSS_I
p_value of the slope
the slope value of the respective delay fragment
Integer, ratio of velocity between 2 delay fragments
Integer, the duration between two delay fragments
Integer, the position middle between 2 fragments with an event
Integer, the fold change value of 2 HL fragments
Integer, the fold change value of 2 intensity fragments
p_value of the fold change of HL fragments
Integer, the fold change value of 2 intensity fragments
String, fragments involved in fold change between 2 intensity fragments
p_value of the fold change of intensity fragments
ratio of fold change between 2 half-life fragments and fold change between 2 intensity fragments
fragments involved in the ratio of fold change between 2 half-life fragments and fold change between 2 intensity fragments
Integer, the fold change of half-life/ fold change of intensity, position of the half-life fragment is adapted to intensity fragment
Integer, the value corresponding to synthesis rate
String, the event assigned by synthesis rate either Termination or iTSS
p_value of the variance between two fold-changes, HL and intensity
p_value of TI fragment
p_value of 2 TI fragments
https://github.com/CyanolabFreiburg/rifi
The result of rifi_stats for Synechocystis 6803 example data
A SummarizedExperiment containing the output of rifi_stats as an extension to rowRanges
data(stats_synechocystis_6803)
The rowRanges of SummarizedExperiment:
The bin/probe specific ID
The bin/probe specific position
The relative intensity at time point 0
An internal value to determine which fitting model is applied
Information on which fitting model is applied
The position based segment
The delay value of the bin/probe
The half-life of the bin/probe
String, the factor of TI fragment
The delay fragment the bin belongs to
The velocity value of the respective delay fragment
The intercept of the fit through the respective delay fragment
The slope of the fit through the respective delay fragment
The half-life fragment the bin belongs to
The mean half-life value of the respective half-life fragment
The intensity fragment the bin belongs to
The mean intensity value of the respective intensity fragment
The overarching transcription unit
The TI fragment the bin belongs to
The mean termination factor of the respective TI fragment
The combined ID of the fragment
presence of pausing site indicated by +/-
presence of iTSS_I indicated by +/-
The fragments involved in pausing site or iTSS_I
p_value of pausing site or iTSS_I
p_value of the slope
the slope value of the respective delay fragment
Integer, ratio of velocity between 2 delay fragments
Integer, the duration between two delay fragments
Integer, the position middle between 2 fragments with an event
Integer, the fold change value of 2 HL fragments
Integer, the fold change value of 2 intensity fragments
p_value of the fold change of HL fragments
Integer, the fold change value of 2 intensity fragments
String, fragments involved in fold change between 2 intensity fragments
p_value of the fold change of intensity fragments
ratio of fold change between 2 half-life fragments and fold change between 2 intensity fragments
fragments involved in the ratio of fold change between 2 half-life fragments and fold change between 2 intensity fragments
Integer, the fold change of half-life/ fold change of intensity, position of the half-life fragment is adapted to intensity fragment
Integer, the value corresponding to synthesis rate
String, the event assigned by synthesis rate either Termination or iTSS
p_value of the variance between two fold-changes, HL and intensity
p_value of TI fragment
p_value of 2 TI fragments
https://github.com/CyanolabFreiburg/rifi
The result of rifi_summary for E. coli example data
A SummarizedExperiment containing the output of rifi_stats as an extension to rowRanges
data(summary_e_coli)
The rowRanges of SummarizedExperiment:
all information regarding bins:
The bin/probe specific ID
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The bin/probe specific position
The bin/probe specific strand
The segment the bin/probe belongs to
The overarching transcription unit
The delay fragment the bin/probe belongs to
The delay of the bin/probe
The half-life fragment the bin/probe belongs to
The half-life of the bin/probe
The intensity fragment the bin/probe belongs to
The relative intensity at time point 0
The flag of the bin/probe(TI, PDD)
String, the factor of TI fragment
all information regarding fragments:
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The first position of the fragment on the genome
The last position of the fragment on the genome
The bin/probe specific strand
The overarching transcription unit
The segment the fragment belongs to
The delay fragment of the fragment
The half-life fragment of the fragment
The half-life mean of the fragment
The half-life standard deviation of the fragment
The half-life standard error of the fragment
The intensity_fragment of the fragment
The relative intensity at time point 0
The intensity standard deviation of the fragment
The intensity standard error of the fragment
The velocity value of the respective delay fragment
all information regarding events:
String, event type
Integer, p_value of the event
Integer, p_value adjusted
Integer, the fold change value of 2 HL fragments
Fold change of intensity
Integer, the fold change of half-life/ fold change of intensity, position of the half-life fragment is adapted to intensity fragment
Fold change of half-life/ fold change of intensity
Integer, the position middle between 2 fragments with an event
Integer, ratio of velocity between 2 delay fragments
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The bin/probe specific strand
The overarching transcription unit
String, the first fragment of the two of fragments subjected to analysis
String, the second fragment of the two of fragments subjected to analysis
Integer, the duration between two delay fragments
Integer, the distance between two delay fragments
Integer, number of fragments involved in the event
all information regarding events related to half-life and intensity:
String, event type
Integer, p_value of the event
Integer, p_value adjusted
Integer, the fold change value of 2 HL fragments
Integer, the fold change value of 2 intensity fragments
Integer, the fold change of half-life/ fold change of intensity, position of the half-life fragment is adapted to intensity fragment
Fold change of half-life/ fold change of intensity
Integer, the position middle between 2 fragments with an event
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The bin/probe specific strand
The overarching transcription unit
String, the first fragment of the two of fragments subjected to analysis
String, the second fragment of the two of fragments subjected to analysis
Integer, the duration between two delay fragments
Integer, the distance between two delay fragments
Integer, number of fragments involved in the event
all information regarding events related to pausing sites and iTSS_I:
String, event type
Integer, p_value of the event
Integer, p_value adjusted
Integer, the position middle between 2 fragments with an event
Integer, ratio of velocity between 2 delay fragments
Integer, the fold change of half-life/ fold change of intensity, position of the half-life fragment is adapted to intensity fragment
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The bin/probe specific strand
The overarching transcription unit
String, the first fragment of the two of fragments subjected to analysis
String, the second fragment of the two of fragments subjected to analysis
Integer, the duration between two delay fragments
Integer, the distance between two delay fragments
Integer, number of fragments involved in the event
all information regarding events related to velocity:
String, event type
Integer, p_value of the event
Integer, p_value adjusted
Integer, the position middle between 2 fragments with an event
Integer, ratio of velocity between 2 delay fragments
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The bin/probe specific strand
The overarching transcription unit
String, the first fragment of the two of fragments subjected to analysis
String, the second fragment of the two of fragments subjected to analysis
Integer, the duration between two delay fragments
Integer, the distance between two delay fragments
Integer, number of fragments involved in the event
all information regarding TI:
String, event type
String, the fragment with TI
String, the factor of TI fragment
Integer, p_value of the event
Integer, p_value adjusted
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The bin/probe specific strand
The overarching transcription unit
Integer, number of fragments involved in the event
Integer, the position middle between 2 fragments with an event
the first position of TI fragment, if 2 fragments, first position is from the first fragment
the last position of TI fragment, if 2 fragments, last position is from the second fragment.
https://github.com/CyanolabFreiburg/rifi
The result of rifi_summary for artificial example data
A SummarizedExperiment with the output from rifi_summary as metadata
data(summary_minimal)
A list of 7 data frames with 290 rows and 11 variables, 36 rows and 11 variables, 57 rows and 18 variables, and 8 rows and 14 variables:
all information regarding bins:
The bin/probe specific ID
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The bin/probe specific position
The bin/probe specific strand
The segment the bin/probe belongs to
The overarching transcription unit
The delay fragment the bin/probe belongs to
The delay of the bin/probe
The half-life fragment the bin/probe belongs to
The half-life of the bin/probe
The intensity fragment the bin/probe belongs to
The relative intensity at time point 0
The flag of the bin/probe(TI, PDD)
String, the factor of TI fragment
all information regarding fragments:
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The first position of the fragment on the genome
The last position of the fragment on the genome
The bin/probe specific strand
The overarching transcription unit
The segment the fragment belongs to
The delay fragment of the fragment
The half-life fragment of the fragment
The half-life mean of the fragment
The half-life standard deviation of the fragment
The half-life standard error of the fragment
The intensity_fragment of the fragment
The relative intensity at time point 0
The intensity standard deviation of the fragment
The intensity standard error of the fragment
The velocity value of the respective delay fragment
all information regarding events:
String, event type
Integer, p_value of the event
Integer, p_value adjusted
Integer, the fold change value of 2 HL fragments
Fold change of intensity
Integer, the fold change of half-life/ fold change of intensity, position of the half-life fragment is adapted to intensity fragment
Fold change of half-life/ fold change of intensity
Integer, the position middle between 2 fragments with an event
Integer, ratio of velocity between 2 delay fragments
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The bin/probe specific strand
The overarching transcription unit
String, the first fragment of the two of fragments subjected to analysis
String, the second fragment of the two of fragments subjected to analysis
Integer, the duration between two delay fragments
Integer, the distance between two delay fragments
Integer, number of fragments involved in the event
all information regarding events related to half-life and intensity:
String, event type
Integer, p_value of the event
Integer, p_value adjusted
Integer, the fold change value of 2 HL fragments
Integer, the fold change value of 2 intensity fragments
Integer, the fold change of half-life/ fold change of intensity, position of the half-life fragment is adapted to intensity fragment
Fold change of half-life/ fold change of intensity
Integer, the position middle between 2 fragments with an event
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The bin/probe specific strand
The overarching transcription unit
String, the first fragment of the two of fragments subjected to analysis
String, the second fragment of the two of fragments subjected to analysis
Integer, the duration between two delay fragments
Integer, the distance between two delay fragments
Integer, number of fragments involved in the event
all information regarding events related to pausing sites and iTSS_I:
String, event type
Integer, p_value of the event
Integer, p_value adjusted
Integer, the position middle between 2 fragments with an event
Integer, ratio of velocity between 2 delay fragments
Integer, the fold change of half-life/ fold change of intensity, position of the half-life fragment is adapted to intensity fragment
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The bin/probe specific strand
The overarching transcription unit
String, the first fragment of the two of fragments subjected to analysis
String, the second fragment of the two of fragments subjected to analysis
Integer, the duration between two delay fragments
Integer, the distance between two delay fragments
Integer, number of fragments involved in the event
all information regarding events related to velocity:
String, event type
Integer, p_value of the event
Integer, p_value adjusted
Integer, the position middle between 2 fragments with an event
Integer, ratio of velocity between 2 delay fragments
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The bin/probe specific strand
The overarching transcription unit
String, the first fragment of the two of fragments subjected to analysis
String, the second fragment of the two of fragments subjected to analysis
Integer, the duration between two delay fragments
Integer, the distance between two delay fragments
Integer, number of fragments involved in the event
all information regarding TI:
String, event type
String, the fragment with TI
String, the factor of TI fragment
Integer, p_value of the event
Integer, p_value adjusted
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The bin/probe specific strand
The overarching transcription unit
Integer, number of fragments involved in the event
Integer, the position middle between 2 fragments with an event
the first position of TI fragment, if 2 fragments, first position is from the first fragment
the last position of TI fragment, if 2 fragments, last position is from the second fragment.
https://github.com/CyanolabFreiburg/rifi
The result of rifi_summary for Synechocystis 6803 example data
A list containing the output from rifi_summary, including the fragment based data frame, bin based data frame, event data frame and the TI dataframe.
data(summary_synechocystis_6803)
A list of 4 data frames with 3000 rows and 11 variables, 297 rows and 11 variables, 486 rows and 18 variables, and 10 rows and 14 variables:
all information regarding bins:
The bin/probe specific ID
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The bin/probe specific position
The bin/probe specific strand
The segment the bin/probe belongs to
The overarching transcription unit
The delay fragment the bin/probe belongs to
The delay of the bin/probe
The half-life fragment the bin/probe belongs to
The half-life of the bin/probe
The intensity fragment the bin/probe belongs to
The relative intensity at time point 0
The flag of the bin/probe(TI, PDD)
String, the factor of TI fragment
all information regarding fragments:
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The first position of the fragment on the genome
The last position of the fragment on the genome
The bin/probe specific strand
The overarching transcription unit
The segment the fragment belongs to
The delay fragment of the fragment
The half-life fragment of the fragment
The half-life mean of the fragment
The half-life standard deviation of the fragment
The half-life standard error of the fragment
The intensity_fragment of the fragment
The relative intensity at time point 0
The intensity standard deviation of the fragment
The intensity standard error of the fragment
The velocity value of the respective delay fragment
all information regarding events:
String, event type
Integer, p_value of the event
Integer, p_value adjusted
Integer, the fold change value of 2 HL fragments
Fold change of intensity
Integer, the fold change of half-life/ fold change of intensity, position of the half-life fragment is adapted to intensity fragment
Fold change of half-life/ fold change of intensity
Integer, the position middle between 2 fragments with an event
Integer, ratio of velocity between 2 delay fragments
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The bin/probe specific strand
The overarching transcription unit
String, the first fragment of the two of fragments subjected to analysis
String, the second fragment of the two of fragments subjected to analysis
Integer, the duration between two delay fragments
Integer, the distance between two delay fragments
Integer, number of fragments involved in the event
all information regarding events related to half-life and intensity:
String, event type
Integer, p_value of the event
Integer, p_value adjusted
Integer, the fold change value of 2 HL fragments
Integer, the fold change value of 2 intensity fragments
Integer, the fold change of half-life/ fold change of intensity, position of the half-life fragment is adapted to intensity fragment
Fold change of half-life/ fold change of intensity
Integer, the position middle between 2 fragments with an event
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The bin/probe specific strand
The overarching transcription unit
String, the first fragment of the two of fragments subjected to analysis
String, the second fragment of the two of fragments subjected to analysis
Integer, the duration between two delay fragments
Integer, the distance between two delay fragments
Integer, number of fragments involved in the event
all information regarding events related to pausing sites and iTSS_I:
String, event type
Integer, p_value of the event
Integer, p_value adjusted
Integer, the position middle between 2 fragments with an event
Integer, ratio of velocity between 2 delay fragments
Integer, the fold change of half-life/ fold change of intensity, position of the half-life fragment is adapted to intensity fragment
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The bin/probe specific strand
The overarching transcription unit
String, the first fragment of the two of fragments subjected to analysis
String, the second fragment of the two of fragments subjected to analysis
Integer, the duration between two delay fragments
Integer, the distance between two delay fragments
Integer, number of fragments involved in the event
all information regarding events related to velocity:
String, event type
Integer, p_value of the event
Integer, p_value adjusted
Integer, the position middle between 2 fragments with an event
Integer, ratio of velocity between 2 delay fragments
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The bin/probe specific strand
The overarching transcription unit
String, the first fragment of the two of fragments subjected to analysis
String, the second fragment of the two of fragments subjected to analysis
Integer, the duration between two delay fragments
Integer, the distance between two delay fragments
Integer, number of fragments involved in the event
all information regarding TI:
String, event type
String, the fragment with TI
String, the factor of TI fragment
Integer, p_value of the event
Integer, p_value adjusted
String, region annotation covering the fragments
String, gene annotation covering the fragments
String, locus_tag annotation covering the fragments
The bin/probe specific strand
The overarching transcription unit
Integer, number of fragments involved in the event
Integer, the position middle between 2 fragments with an event
the first position of TI fragment, if 2 fragments, first position is from the first fragment
the last position of TI fragment, if 2 fragments, last position is from the second fragment.
https://github.com/CyanolabFreiburg/rifi
TI_fit estimates transcription interference and the termination factor using the nls function for probes or bins flagged as "TI".
TI_fit uses the nls2 function to fit the probes or bins flagged with "TI" by finding_TI.r. It estimates the transcription interference level (referred to below as TI) as well as the termination factor, fitting the probes/bins with the nls function while looping over several starting values.
TI_fit(
  inp,
  cores = 1,
  restr = 0.2,
  k = seq(0, 1, by = 0.5),
  decay = c(0.05, 0.1, 0.2, 0.5, 0.6),
  ti = seq(0, 1, by = 0.5),
  ti_delay = seq(0, 2, by = 0.5),
  rest_delay = seq(0, 2, by = 0.5),
  bg = 0
)
inp |
SummarizedExperiment: the input with correct format. |
cores |
integer: the number of assigned cores for the task. |
restr |
numeric: a parameter that restricts the freedom of the fit to avoid wrong TI-term_factors, ranges from 0 to 0.2. |
k |
numeric vector: A sequence of starting values for the synthesis rate. Default is seq(0, 1, by = 0.5). |
decay |
numeric vector: A sequence of starting values for the decay. Default is c(0.05, 0.1, 0.2, 0.5, 0.6). |
ti |
numeric vector: A sequence of starting values for the delay. Default is seq(0, 1, by = 0.5). |
ti_delay |
numeric vector: A sequence of starting values for the delay. Default is seq(0, 2, by = 0.5). |
rest_delay |
numeric vector: A sequence of starting values. Default is seq(0, 2, by = 0.5). |
bg |
numeric vector: A sequence of starting values. Default is 0. |
To determine the TI and the termination factor, TI_fit is applied to the flagged probes and to the probes located 1000 nucleotides upstream. Before TI_fit is applied, probes/bins below the background are filtered out using generic_filter_BG. The model loops over a dataframe of starting values, and the coefficients are extracted from the fit with the lowest residuals. When many residuals are equal to 0, the lowest residual cannot be determined and the extracted coefficients could be wrong, so a second filter was developed. First, all starting values are looped over and the nls objects and their residuals are collected. The residuals are sorted, and those not equal to 0 are collected in a vector. If the first residuals are not equal to 0, the best 20% of the residuals are collected in the tmp_r_min vector and the minimum termination factor is selected. If the first residuals are equal to 0, the values between 0 and 20% of those collected in the tmp_r_min vector are gathered. The minimum termination factor coefficient is determined and saved. The coefficients are gathered in the res vector and saved as an object.
The SummarizedExperiment object with delay, decay and TI_termination_factor added to the rowRanges. The full fit data is saved in the metadata as "fit_TI".
data(preprocess_minimal)
TI_fit(inp = preprocess_minimal, cores = 2, restr = 0.01)
TUgether combines delay fragments into TUs
TUgether combines delay fragments into transcription units (TUs) and adds the column "TU". It applies the scoring function .score_fun_increasing to the start and end points of the delay fragments.
TUgether(inp, cores = 1, pen = -0.75)
inp |
SummarizedExperiment: the input data frame with correct format. |
cores |
integer: the number of assigned cores for the task. |
pen |
numeric: an internal parameter for the dynamic programming. Higher values result in fewer fragments. Default is -0.75. |
The function used is: .score_fun_increasing
The input is the SummarizedExperiment object. pen is the penalty for opening a new fragment in the dynamic programming. Because the dynamic programming maximizes the total score, pen is negative.
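The role of a negative penalty in a score-maximizing dynamic program can be shown with a generic sketch. This is not the code behind TUgether: segment_dp and toy_score are hypothetical names, and the toy score (negative within-fragment sum of squares) merely stands in for .score_fun_increasing.

```r
# Generic segmentation by dynamic programming (illustrative only).
segment_dp <- function(x, pen, score_fun) {
  n <- length(x)
  best <- c(0, rep(-Inf, n))  # best[j + 1]: best total score for x[1..j]
  back <- integer(n)          # back[j]: start of the last fragment ending at j
  for (j in seq_len(n)) {
    for (i in seq_len(j)) {
      # Score of the fragment x[i..j] plus the penalty for opening it.
      cand <- best[i] + score_fun(x[i:j]) + pen
      if (cand > best[j + 1]) {
        best[j + 1] <- cand
        back[j] <- i
      }
    }
  }
  # Trace back the fragment start positions.
  starts <- integer(0)
  j <- n
  while (j > 0) {
    starts <- c(back[j], starts)
    j <- back[j] - 1
  }
  list(score = best[n + 1], starts = starts)
}

# Toy score: homogeneous fragments score close to 0, mixed ones strongly negative.
toy_score <- function(seg) -sum((seg - mean(seg))^2)

x <- c(1, 1.1, 0.9, 5, 5.2, 4.9)
res <- segment_dp(x, pen = -0.75, score_fun = toy_score)
res$starts
```

In this toy case the two-fragment solution starting at positions 1 and 4 wins: each fragment costs 0.75, so a further split is accepted only if it improves the summed fragment scores by more than that amount.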
The SummarizedExperiment with the columns regarding the TU:
The bin/probe specific ID.
The bin/probe specific position.
The relative intensity at time point 0.
An internal value to determine which fitting model is applied.
Information on which fitting model is applied.
The position based segment.
The delay value of the bin/probe.
The half-life of the bin/probe.
String, the factor of TI fragment.
The delay fragment the bin belongs to.
The velocity value of the respective delay fragment.
The intercept of the fit through the respective delay fragment.
The slope of the fit through the respective delay fragment.
The half-life fragment the bin belongs to.
The mean half-life value of the respective half-life fragment.
The intensity fragment the bin belongs to.
The mean intensity value of the respective intensity fragment.
The overarching transcription unit.
The TI fragment the bin belongs to.
The mean termination factor of the respective TI fragment.
The combined ID of the fragment.
data(fragmentation_minimal)
TUgether(inp = fragmentation_minimal, cores = 2, pen = -0.75)
viz_pen_obj visualizes penalty objects
viz_pen_obj provides an optional visualization of any penalty object created by make_pen. The function can be customized to show only the top n = top_i results.
viz_pen_obj(obj, top_i = nrow(obj[[3]][[1]]) * ncol(obj[[3]][[1]]))
obj |
object: a penalty object (the output of make_pen). |
top_i |
integer: the number of top results visualized. Default is all. |
A visualization of the penalty object
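What restricting the view to the top_i results amounts to can be sketched with base R. This is only an illustration, assuming that "top" refers to the largest entries of the matrix; the matrix m and its dimnames are hypothetical stand-ins for obj[[3]][[1]].

```r
# Hypothetical score matrix standing in for obj[[3]][[1]].
m <- matrix(c(3, 9, 1, 7, 5, 8), nrow = 2,
            dimnames = list(c("pen_1", "pen_2"),
                            c("out_1", "out_2", "out_3")))

top_i <- 3

# Linear indices of the top_i largest entries (assumption: larger is better).
idx <- order(m, decreasing = TRUE)[seq_len(min(top_i, length(m)))]

# Convert linear indices to row/column positions and collect them in a table.
pos <- arrayInd(idx, dim(m))
top <- data.frame(row   = rownames(m)[pos[, 1]],
                  col   = colnames(m)[pos[, 2]],
                  value = m[idx])
top
```

With the default top_i = nrow(obj[[3]][[1]]) * ncol(obj[[3]][[1]]), every cell of the matrix is included, which matches the documented default of showing all results.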
data(penalties_e_coli)
viz_pen_obj(penalties_e_coli$pen_obj_delay, 25)
The result of rifi_wrapper for E. coli example data
A list of SummarizedExperiment objects containing the output of rifi_wrapper. The list contains 6 SummarizedExperiment elements, the outputs of rifi_preprocess, rifi_fit, rifi_penalties, rifi_fragmentation, rifi_stats and rifi_summary. The plot is generated by rifi_visualization. For more detail, please refer to each function separately.
data(wrapper_e_coli)
An object of class list of length 6.
https://github.com/CyanolabFreiburg/rifi
The result of rifi_wrapper for E. coli artificial example
A list of SummarizedExperiment objects containing the output of rifi_wrapper. The list contains 6 SummarizedExperiment elements, the outputs of rifi_preprocess, rifi_fit, rifi_penalties, rifi_fragmentation, rifi_stats and rifi_summary. The plot is generated by rifi_visualization. For more detail, please refer to each function separately.
data(wrapper_minimal)
An object of class list of length 6.
https://github.com/CyanolabFreiburg/rifi
The result of rifi_wrapper for summary_synechocystis_6803 example data
A list of SummarizedExperiment objects containing the output of rifi_wrapper. The list contains 6 SummarizedExperiment elements, the outputs of rifi_preprocess, rifi_fit, rifi_penalties, rifi_fragmentation, rifi_stats and rifi_summary. The plot is generated by rifi_visualization. For more detail, please refer to each function separately.
data(wrapper_summary_synechocystis_6803)
An object of class list of length 6.
https://github.com/CyanolabFreiburg/rifi