NEWS
infercnv 1.18.1 (2023-12-01)
- Update to work with Seurat v5 object changes.
- New default setting for leiden_resolution now set to "auto" which will very roughly set a resolution value that scales with the number of cells to avoid over splitting subclusters when the number of cells increases.
- Fix for plotting with dynamic_resolution enabled when the png size limit of cairo is hit to not have parts of the plot that try to be plotted outside the limit.
- Add useRaster option to plot_per_group method and transfer it plot_cnv calls.
- Transfer cluster_by_groups option to HMM when running in samples mode to allow running the HMM on "all_observations".
infercnv 1.15.3 (2023-03-29)
- Now use the first of the 2 color bars on the left side of the observations heatmap to display subclusters when they have been calculated and k_obs_groups is not used (>1) when cluster_by_groups=F is.
- Add export of infercnv subclusters to Seurat object and features file generated by add_to_seurat.
- Add write_phylo option to run() and plot_cnv() to control if a file with newick strings for the dendrogram is generated.
- Fully transfer subclustering information when running plot_per_group to each annotation's object to take advantage of subcluster being displayed on color bars now.
- Fix for "meanvar" sim_method in hspike generation when there are no references and one of the observation annotations has only a single cell.
- Fix plot_subclusters() when using an object that was processed with cluster_by_groups=FALSE.
- Change default k_obs_groups in plot_cnv to 1 instead of 3 to use the new subclustering coloring by default.
infercnv 1.15.2 (2023-03-08)
- "infercnv_subclusters" plot after subclustering step in run() that displays the subclusters is now controlled by its own "inspect_subclusters" option.
infercnv 1.15.1 (2023-02-24)
- Add helper method plot_subclusters() to plot subclusters as the annotations to more easily check if the subclustering settings used produce good results or not, so settings can be adjusted.
- Change cluster_by_groups default to True.
infercnv 1.14.2 (2023-03-08)
- Add "infercnv_subclusters" plot after subclustering step in run() that displays the subclusters. Can be disabled with the exisiting no_plot argument.
infercnv 1.14.1 (2023-02-24)
- Fix per chromosome subclustering to use all the data and not only the data from the last annotation group when there are no outliers filtered by z_score (which would always happen when no references are defined).
- Apply per subcluster consensus on HMM predictions on the object directly when running with per chromosome subclustering enabled, rather than only running it before writting txt outputs and figure. This makes the infercnv_obj also contain the same values plotted.
- Disable per chromosome subclustering by default. The option remains available.
- Change how the subclustering is run on the hspike to avoid issues with Leiden settings tuning. Now simply keep the structure of how the fake cells are generated.
- Fix to order in which Leiden partition is made into an hclust/dendrogram to avoid issues with singleton (as long as there are non singletons).
- Fix options stored in backup infercnv objects not being updated properly updated when changing the settings on a rerun.
- Fix to properly restart past the Bayesian network step if it has already been run and only the BayesMaxPNormal filter has been changed.
infercnv 1.13.1 (2022-10-17)
- Updated how the Leiden subclustering is run to use igraph instead of the Python implementation called through reticulate, and to run a PCA before building neighboring graph.
- Added option to run a second round of subclustering that is per chromosome (for the HMM predictions).
- Added method that runs the i6 HMM predictions based on per chromosome subclustering.
- Add arguments/options for Leiden settings and change defaults.
- Change default analysis mode to subclusters.
- Fix plotting when no hclust is already stored in the object and have clusters with only 1 or 2 cells.
- Fix to add_to_seurat so that unsorted chromosomes that are only named by their number are read as character and not integers, which resulted in the named features using the unsorted index as a name.
- Allow adding a custom column prefix when using add_to_seurat (by @matt-sd-watson )
- Add an "assay_name" option to add_to_seurat in case the assay is not named "RNA" like expected.
- More helpful message displayed before erroring when trying to run add_to_seurat without having run the HMM.
- Add option to filter genes used in subclustering based on zscore.
- Fix i3 settings calculation.
- Change to not output "scaled" CNV proportion values when running the HMM in i3 mode as we don't know if a gain/loss is of only 1 copy or more.
- Update actually used "levels" in the factor used to store gene chromosome information so that contigs entirely filtered out don't mess up the color bars on the heatmap.
- Linked write_expr_matrix option in plot_cnv to the methods that plot the observations and references so that the split matrix also don't get output when set to false.
- Fix for plot_cnv(plot_chr_scale=T) to handle chromosomes with a single gene on them, though such chromosomes should be dropped entirely.
infercnv 1.11.3 (2022-04-11)
- Add tumor_subcluster_partition_method option to CLI script.
- Add sd_amplifier option to example run.R to match wiki better.
- Transfer denoise default parameter to reloading function when its not manually set.
- Add handling of annotation groups with 1 or 2 cells when running Leiden subclustering.
- Fix i3 HMM plotting color range in step 19.
- Fixes to i3 HMM and tweak of some default settings for it.
infercnv 1.11.2 (2022-02-03)
- Sort observation groups names when storing the list of indices so they are plotted in order, making it easier to change the sorting on the final figure.
- Fix plotting error on Windows by adding a check that bitmapType option is not null before checking what it is to prevent error in comparison.
- Fix for some group labels being cut off in width at the bottom of the figure
- Add conversion of expression data in sparse matrix format to dense matrix when splitting references in n groups as parallelDist does not handle them.
infercnv 1.11.1 (2021-11-08)
- Link "plot_chr_scale" and "chr_lengths" options from plot_cnv() to run().
- Added parameter k_obs_group to plot_per_group so that it can be applied to each of the subsequent group plotting calls if desired.
- Added a check to plot_cnv when using the dynamic_resize option that the increased size is not above the max size allowed when using Cairo as the graphical back end on Linux. If it is, set the size to the highest allowed instead.
- Replaced all calls to dist() with parallelDist(num_threads). (suggestion from @WalterMuskovic)
- Updated the code to run the Leiden clustering to be able to handle bigger datasets (work around R not having long vectors implemented) and be more efficient.
- Add a warning to infercnv object creation if the number of cells is more than half of the setting for scientific notation in R, as it may cause an issue while using as.hclust(phylo_obj) in the Leiden subclustering step.
- Updated plot_cnv method and imports to handle sparse matrices.
- Fix main heatmap drawing which in some cases caused the plotting to be out of field in the output.
- Fix method that compares arguments with backups with changes in R 4.1.1
infercnv 1.10.1 (2021-11-08)
- Fix missing colnames in subcluster information when using the Leiden method (used downstream by add_to_seurat).
- Fix "plot_per_group" to handle infercnv objects with NULL clustering information (mainly to be able to plot using existing results but changing the annotations).
infercnv 1.8.1 (2021-08-16)
- Fix name generation error for step 15.
- Handle annotation names as strings even if they look like numbers.
infercnv 1.7.2 (2021-05-05)
- New dependencies : RANN, leiden, phyclust
- Added new partition method for subclustering that uses the Leiden algorithm based on a K-nn adjacency matrix.
- Changed the default subclustering method to leiden which is much faster than the random trees method.
- Split Bayesian filtering step in two steps, one that runs the model, and one that applies the filter threshold. This allows updating the threshold without having to rerun the whole model.
- Fix what groupings of references the subclustering is done on.
- Updated expectations of the internally stored clustering information when plotting references to allow for results obtained with version between ~1.3 and this one to be plotted.
- Fix add_to_seurat method to work when no seurat object is provided after the reordering fix.
- Fix the random trees subclustering applying a different method of centering to the data between the hclust stored in infercnv_obj@tumors_subclusters$hc and the splits in infercnv_obj@tumors_subclusters$subclusters.
- Make denoising step figure only be generated if plot_steps is true. (it is identical to final figure that is plotted based on a different option, making it redundant)
infercnv 1.5.3 (2020-10-23)
- Add check that detectCores() doesn't return NA before comparing. In case it does, just use the value provided as an option directly.
infercnv 1.5.2 (2020-10-23)
- Added reordering of cells in metadata exported to a seurat object so that it always matches, in case the cells are not sorted in the same order in the data provided to infercnv and in seurat.
infercnv 1.5.1 (2020-10-06)
- Fix to reload in cases where comparing NULL/NA.
- Fixed issue in denoising when trying to reload results from step 18 or 19.
- Fixed MCMC Diagnostic plots by adding diagnostic generation.
- Update included data objects to contain additional option slot, and prevent common name collisions.
- Fully rename data objects and name of the vars they provide.
- Fix reference plotting not having access to the actual subclustering information but that of the previous provided data object (that was renamed to avoid name collision by mistake like this one).
- Added checks in add_to_seurat methods that there are gains/losses found when taking the top hits.
- Added check that output to write in add_to_seurat top regions is not null and output an empty file without erroring if it is.
- Change to remove genes in "chr_exclude" from counts before doing the read level filtering of cells.
- Added minimum read count requirement per cell of 1 after removal of "chr_exclude" genes so that there are no divisions by 0 when normalizing.
- Changed default min_max_counts_per_cell to select cells with at least 100 counts by default.
- Fix to plotting for HMM coloring of heatmap when the full range of values are not present.
- Fix to plotting when no reference groups are used to not produce warnings.
infercnv 1.3.6 (2020-03-20)
- Fix a read.table() call who's behavior was changed by the latest R base devel changes.
infercnv 1.3.5 (2020-03-02)
- Fix saving file name for preliminary object.
- Added an option to plot_cnv(), plot_chr_scale, that allows to plot chromosomes proportionally to their full size. Either a list of chromosomes sizes is provided, or the last gene stop position + 10k bp is assumed. Gene width is proportional to the area in the chromosomes they "cover" (midpoints between genes). This option is slower than the default as rasterization can not be applied when drawing the heatmap.
infercnv 1.3.4 (2020-02-18)
- Added options to save rds results after each step (save_rds) or final results (save_final_rds), with default TRUE.
- Fix size of pdf output settings to use the given width and height values instead of the "USr" format.
- Fixes in plot_cnv() when using observation groups with 1 or 2 cells.
- Fix checks in plot_cnv() of references for when subclusters are not defined in the object.
- Fix (sub)cluster definition when an annotation only has 1 cell (although this should probably be avoided).
infercnv 1.3.3 (2020-02-07)
- Large speed improvements in plot_cnv() that are more significant the bigger the matrix size.
- plot_cnv() now uses clustering information stored in the infercnv object for the references instead of rerunning the clustering every time.
- Fix assumption that length(class(obj)) == 1, in this case for matrix objects for R 2020-01-28 update.
- Made color.palette an exported function so that it can be used for the "custom_color_pal" option.
- Updated some logging to be more accurate.
infercnv 1.3.2 (2019-12-06)
- Added an new option "custom_color_pal" to plot_cnv() so a user can provide their choice of colors for the heatmap.
infercnv 1.2.2 (2019-12-06)
- Fixed check of how HMM results are reported by so that add_to_seurat works when HMM_report_by is "cell" but HMM analysis_mode was "subclusters".
- Fix check of HMM_report_by setting in add_to_seurat when the option was not specified.
infercnv 1.2.1 (2019-11-14)
- Fix _R_CHECK_LENGTH_1_LOGIC2_ related error because an argument was checked before matching to it's list of potential arguments.
- Fix text outputs for CNV reports when running with report_by="cell".
- Fix add_to_seurat method to handle CNVs reported by cells in HMM predictions.
- Fix how options used for a run are stored in the object and compared when trying to reload previous results so that it handles arguments given as variables in the call to run(). This also fixes the command-line script to run infercnv.
infercnv 1.1.4 (2019-10-29)
- Fix reading of input annotations when some are only digits to be properly read as characters.
- Added checks that HMM_report_by option is compatible with analysis_mode option. If not, change it automatically.
- Fix reading of bayesian filtered HMM results in add_to_seurat after previous version changes to keep CNV ids and states scale.
infercnv 1.1.3 (2019-09-16)
- Fix to reload checks on HMM steps.
- Added new smoothing method, 'coordinates", that smooths the per cell data using a window based on a base pairs distance (around 10.000.000 seems to be a good start for the window size) to the current gene. As the hspike does not model gene distances/positions on a chromosome at this time, the HMM i6 mode is not compatible with this smoothing method and an error will be returned if both try to be used together.
- Added a bp distance tolerance to "merge" top CNVs that are actually the same CNV in different subclusters in add_to_seurat method.
- Removed the top any type of CNV field, as it is redundant to top loss and top dupli.
- Added text output of identified top CNVs as they the base pair tolerance aggregates some compared to the original HMM output.
- Added an argument "up_to_step" to stop infercnv::run() after a given step.
- Fix contents of @options field in infercnv_obj to store non default run time arguments in the same form as object creation arguments.
- Update so that HMM predictions outputs have the analysis_mode in their name and do not get overwritten when using different modes. Can also know which pairs of file to reload together now.
- Update add_to_seurat so that it checks what analysis_mode was used in the run based on the @options field to reload the matching HMM predictions or Bayesian Network filtered predictions.
- Updated "subclustering" method for sample mode so that @tumor_subclusters$subclusters indices have the cell names attached to be able to map back in add_to_seurat.
- Allow HMM steps to be resumed if needed even if steps 20/21 are done.
infercnv 1.1.2 (2019-07-08)
- Added method to write table of wide array of predicted features from HMM results to file or add them as meta.data to a Seurat object if one is provided.
- Overhaul of save/reload system to store non default arguments and keep track of relevant options at each step when trying to reload backups. Also check for input counts matrix identity with reloaded one (hash at object creation time).
- Added linking of image() option useRaster to run() and plot_cnv() to be able to enable by default, speeding up plotting significantly.
infercnv 1.1.1 (2019-05-20)
- Added method to sample an infercnv object to a given number of cells, or at a given frequency, per annotation group. This is to make it easy to plot figures where all annotations groups have the same overall height, as well as downsample very large datasets that would otherwise take too long to plot (while still running the analysis on the full data)
- Added method to plot each annotation group to a different figure and combine with sampling. Mostly intended to split data for larger datasets.
- Added support for output_format option within run() to link to plot_cnv() to support only writting text outputs during the analysis.
infercnv 1.0.4 (2019-09-16)
- Fix check that contig to cluster by was found when specified.
- Added support to plot_cnv for cell groups with exactly 2 cells.
- Fix which input file type is checked.
- Made it so that plot_cnv recalculates clustering automatically if non null ref_contig argument is provided.
- Fix for plot_cnv() when providing multiple ref_contigs and cluster_by_group is False.
- Fix only 1/n genes being taken into account when using n ref_contig in plot_cnv.
- Fix error in file creation when using multiple ref_contig and cluster_by_groups=FALSE in plot_cnv.
- Bayesian filtering now preserves CNV ids in outputs
infercnv 1.0.3 (2019-07-05)
- Fix missing dendrograms in text output when drawing figures.
- Fix path to save object to when splitting references.
- Fix file name creation when using num_ref_groups option.
- Fix reference cells indices returned from method that splits references in num_ref_groups when references are not sorted and at the beginning of the matrix.
- Fix to support of data.frame as input type for counts matrix.
infercnv 1.0.2 (2019-05-21)
- Reduce peak memory usage.
- Fix to subclusters definition when using a sparse matrix and a non random trees method with no references.
infercnv 1.0.1 (2019-05-20)
- Improved when the clustering is defined for groups when running in sample mode.
- Fixed support for NA to be understood as an output_format value to plot_cnv() in case a user only wants to generate the text outputs and not the plot.
- Fix ordering of cells and color bars on the heatmap and text outputs when cells are not sorted in the same order in the input matrix and the annotation file.
- Fix to (sub)cluster definition when a group only has 1 cell.
- Fix plot_cnv() to handle observations groups with only 1 cell (that can't be hierarchically clustered), a single reference group when no references are not clusterd, and a single reference group with a single cell.
infercnv 1.0.0 (2019-05-02)
- Released as a BioConductor packages
infercnv 0.99.8 (2019-04-20)
- Fix missing labels on heatmap due to change in how axis() handles overlaps.
- Improve auto thresholding of heatmap colors to still work when using stronger denoise settings.
- Updated some examples to use tempfile() for output.
- Fix duplicate piece of code that produced an error when using plot_steps.
infercnv 0.99.0 (2019-03-15)
- Submitted to Bioconductor