Title: | Pedigree Analysis and Familial Aggregation |
---|---|
Description: | Framework providing basic pedigree analysis and plotting utilities as well as a variety of methods to evaluate familial aggregation of traits in large pedigrees. |
Authors: | J. Rainer, D. Taliun, C.X. Weichenberger |
Maintainer: | Johannes Rainer <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.35.0 |
Built: | 2024-11-29 05:55:51 UTC |
Source: | https://github.com/bioc/FamAgg |
The FABinTestResults object contains the results from a simple binomial test of whether the number of affected individuals for a trait is higher than expected by chance. For more details on the method, see binomialTest.
## S4 method for signature 'FABinTestResults'
result(object, method="BH")

## S4 replacement method for signature 'FABinTestResults'
trait(object) <- value
(in alphabetic order)
method |
The multiple hypothesis testing method. All methods supported by p.adjust are allowed. |
object |
The |
value |
For |
A call to the setter methods trait<- resets any test results present in the result slot, thus, the object can be re-used to perform a simulation analysis using the new trait data.
Refer to the method and function description above for detailed information on the returned result object.
FABinTestResults objects are returned by the binomialTest function.
Class FAData, directly.
The results data.frame.
Returns the result from the test as a data.frame with columns:
"trait_name": the name of the trait.
"total_phenotyped": total number of phenotyped individuals in the trait.
"total_affected": total number of affected individuals in the trait.
"family": the family id. If a global test is used (i.e. if the pedigree consists of a single family, or global = TRUE was provided), the column shows "full pedigree".
"phenotyped": the number of phenotyped individuals in the family.
"affected": the number of affected individuals in the family.
"pvalue": the p-value from the binomial test (conducted using the binom.test function).
"prob": the probability of being affected. Either a local probability calculated based on all affected and phenotyped individuals in the whole pedigree, or a global (population) probability that has to be provided with argument prob.
"padj": the p-value adjusted for multiple hypothesis testing using the method defined with argument method.
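The columns above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data, the use of alternative = "greater" and the object names are assumptions made for illustration; only binom.test and p.adjust are named or implied by the text.

```r
## Hypothetical toy data: one row per individual; `affected` is 0, 1 or NA.
ped <- data.frame(
    family   = rep(c("4", "5", "6"), times = c(6, 5, 4)),
    affected = c(1, 1, 1, 0, NA, 1,   0, 0, 1, 0, 0,   0, NA, 0, 0),
    stringsAsFactors = FALSE
)

## Keep phenotyped individuals only (non-NA trait value).
phen <- ped[!is.na(ped$affected), ]

## Local probability: affected / phenotyped across the whole pedigree.
prob <- sum(phen$affected) / nrow(phen)

## One binomial test per family. The one-sided alternative is an
## assumption based on "higher than expected by chance".
res <- do.call(rbind, lapply(split(phen, phen$family), function(z) {
    bt <- binom.test(sum(z$affected), nrow(z), p = prob,
                     alternative = "greater")
    data.frame(family = z$family[1], phenotyped = nrow(z),
               affected = sum(z$affected), pvalue = bt$p.value,
               prob = prob, stringsAsFactors = FALSE)
}))

## Adjust for multiple hypothesis testing and sort by p-value.
res$padj <- p.adjust(res$pvalue, method = "BH")
res <- res[order(res$pvalue), ]
res
```

Passing a population-level value for prob instead of the local estimate would correspond to the "global (population) probability" case described for the prob column.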
Set the trait information. This method will reset all simulation results saved in the sim slot.
Subsetting (using the [ operator) is not supported.
Johannes Rainer, Christian Weichenberger
FAData, kinship, trait, probabilityTest, kinshipGroupTest, kinshipSumTest, genealogicalIndexTest, familialIncidenceRateTest, fsirTest, plotPed
##########################
##
##  Perform the analysis
##
## Load the test data.
data(minnbreast)

## Subset to some families and generate a pedigree data.frame.
mbsub <- minnbreast[minnbreast$famid == 4 | minnbreast$famid == 5 |
                    minnbreast$famid == 6 | minnbreast$famid == 7 |
                    minnbreast$famid == 8, ]
PedDf <- mbsub[, c("famid", "id", "fatherid", "motherid", "sex")]
colnames(PedDf) <- c("family", "id", "father", "mother", "sex")

## Generate the FAData.
fad <- FAData(pedigree=PedDf)

## Specify the trait.
tcancer <- mbsub$cancer
names(tcancer) <- mbsub$id

## Perform the test:
bir <- binomialTest(fad, trait=tcancer, traitName="cancer")
result(bir)

## Calculating the probability of being affected from the whole data set.
prob <- sum(minnbreast$cancer, na.rm = TRUE) / sum(!is.na(minnbreast$cancer))
bir <- binomialTest(fad, trait = tcancer, prob = prob)
result(bir)

## Plot the pedigree of the family with the smallest p-value.
plotPed(bir, family = "8")
FAData objects conveniently store pedigree along with trait information. This object is the central data structure of the FamAgg package. Basic pedigree analysis methods are described on this page and on the PedigreeUtils help page, familial aggregation analysis methods on the PedigreeAnalysis help page.
See the section about the pedigree data.frame below for a detailed description of the encoding of missing trait data or founder individuals in FamAgg.
## S4 method for signature 'FAData'
affectedIndividuals(object)

## S4 method for signature 'FAData'
age(object)

## S4 replacement method for signature 'FAData'
age(object) <- value

## S4 method for signature 'FAData'
buildPed(object, id=NULL, family = NULL, max.generations.up=3,
         max.generations.down=16, prune=FALSE, ...)

## S4 method for signature 'FAData'
export(object, con, format="ped", ...)

FAData(pedigree, age, trait, traitName, header=FALSE, sep="\t",
       id.col="id", family.col="family", father.col="father",
       mother.col="mother", sex.col="sex")

## S4 method for signature 'FAData'
family(object, id=NULL, family=NULL, return.type="data.frame")

## S4 method for signature 'FAData'
kinship(id, ...)

## S4 method for signature 'FAData'
pedigree(object, return.type="data.frame")

## S4 replacement method for signature 'FAData'
pedigree(object) <- value

## S4 method for signature 'FAData'
pedigreeSize(object)

## S4 method for signature 'FAData'
phenotypedIndividuals(object)

## S4 method for signature 'FAData'
plotPed(object, id=NULL, family=NULL, filename=NULL, device="plot",
        symbol.related=NA, proband.id=NULL, highlight.ids=NULL,
        only.phenotyped=FALSE, label1=age(object), label2=NULL,
        label3=NULL, ...)

## S4 method for signature 'FAData'
show(object)

## S4 method for signature 'FAData'
trait(object, na.rm=FALSE)

## S4 replacement method for signature 'FAData'
trait(object) <- value
(in alphabetic order)
age |
For |
con |
For |
device |
For |
family |
For |
family.col |
For |
father.col |
For |
filename |
For |
format |
For |
header |
For |
highlight.ids |
A list of character vector(s) of ids that should be labeled. The name(s) of the character vector(s) is/are used as the text to label the individuals (the text is shown below the symbol of the individuals). Up to 3 character vectors are supported. Alternatively, a single character vector of ids can be submitted, in which case the individuals are labeled with an asterisk (*). |
id |
For method For method Note: |
id.col |
For |
label1 |
For |
label2 |
For |
label3 |
For |
max.generations.down |
For |
max.generations.up |
For |
mother.col |
For |
na.rm |
For |
object |
The |
only.phenotyped |
Whether only phenotyped individuals, i.e. individuals with a non-NA
value in column |
pedigree |
For |
proband.id |
For |
prune |
For |
return.type |
Either |
sep |
For |
sex.col |
For |
symbol.related |
For |
trait |
For |
traitName |
For |
value |
For For For |
... |
Additional arguments to be passed to the plotting functions
( |
See sections below for a description of the individual methods.
The buildPed method is a combination of the methods getAncestors, getChildren and getMissingMate, i.e. it first gets all ancestors for the specified id(s), then determines the children of all of the ids (submitted ids and their ancestors) and finally looks for any missing mates/spouses to complete the pedigree.
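The three steps can be sketched in base R on a toy pedigree. This is a simplified illustration, not the code behind buildPed: the helper names (ancestors_of, children_of, missing_mates) are hypothetical, only direct children are collected, and a "missing mate" is taken to be any parent of a collected individual that is not yet in the set.

```r
## Hypothetical toy pedigree; founders have NA for father and mother.
ped <- data.frame(
    id     = c("1", "2", "3", "4", "5", "6", "7"),
    father = c(NA,  NA,  "1", NA,  "3", "3", NA),
    mother = c(NA,  NA,  "2", NA,  "4", "4", NA),
    stringsAsFactors = FALSE
)

## Step 1: walk up the pedigree, one generation per iteration.
ancestors_of <- function(ped, ids, max.up = 3) {
    found <- character()
    current <- ids
    for (i in seq_len(max.up)) {
        rows <- ped[ped$id %in% current, ]
        parents <- unique(as.vector(na.omit(c(rows$father, rows$mother))))
        parents <- setdiff(parents, found)
        if (length(parents) == 0) break
        found <- c(found, parents)
        current <- parents
    }
    found
}

## Step 2: direct children of a set of ids (one generation down only).
children_of <- function(ped, ids) {
    ped$id[ped$father %in% ids | ped$mother %in% ids]
}

## Step 3: parents of collected individuals that are not yet included.
missing_mates <- function(ped, ids) {
    rows <- ped[ped$id %in% ids, ]
    setdiff(as.vector(na.omit(c(rows$father, rows$mother))), ids)
}

start <- "3"
ids <- union(start, ancestors_of(ped, start))
ids <- union(ids, children_of(ped, ids))
ids <- union(ids, missing_mates(ped, ids))
sub_ped <- ped[ped$id %in% ids, ]
sub_ped
```

In this toy example individual "4" enters the result only in the last step, as the second parent of the children of "3"; the unconnected individual "7" is never reached.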
The plotPed function uses either the external Perl program HaploPainter or the plotting capabilities of the kinship2 package. Because HaploPainter is an external tool, it is not possible to display the plot directly; instead, each plot is automatically saved to a file (either "pdf", "ps", "svg" or "png"; can be specified with the device parameter). HaploPainter plotting also supports device = "txt", in which case the data table is exported (in the format expected by HaploPainter) to a tab-delimited text file and the name of this text file is returned; no plot is created.
Plotting with kinship2 (the default) allows displaying the plot (device="plot") or exporting it to a file (device="pdf" or device="png").
The switchPlotfun function can be used to change the plotting system.
Refer to the method and function description above for detailed information on the returned result object.
FAData objects are created by the constructor function FAData and should not be directly created by a call to new.
A (named) numerical vector with the age of the individuals. It is suggested to use the getter and setter methods described below to access this slot.
A data.frame with the pedigree. It is suggested to use the getter and setter methods described below to access this slot.
The kinship matrix for the kinship of each individual in the pedigree with each other. This slot should not be accessed directly, but the kinship method should be used instead.
The name of the trait being stored in the object.
A numerical vector with the trait information, 0, 1, NA, for phenotyped but not affected, affected and not tested, respectively. This slot should not be accessed directly; use the trait and trait<- methods instead, which ensure that the data is matched to the information in the pedigree.
Constructor function to create a new FAData instance. In addition to submitting the pedigree information as data.frame, pedigree or pedigreeList, it is possible to specify the name of the file from which the pedigree information should be read. The function recognizes and imports plink ped and fam files (http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml) as well as generic text files. For the latter, arguments header, sep, family.col, id.col, father.col, mother.col and sex.col allow to further specify which columns of the file contain what information.
If argument pedigree is a data.frame, the column names "family", "id", "father", "mother" and "sex" are expected. Any additional columns are dropped.
The sex is expected to be encoded either as a numeric 1 (male), 2 (female) with any other number or NA representing unknown, or as a character vector or factor with "M", "m", "Male" or "male" for male and "F", "f", "Female" or "female" for female.
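The accepted sex encodings can be summarized with a small base R sketch. The helper normalize_sex is hypothetical and not part of FamAgg; as a simplification it also maps the character strings "1" and "2", which the text above only describes for numeric input.

```r
## Hypothetical helper (not part of FamAgg): map the accepted encodings
## to a factor with levels "M" and "F"; everything else becomes NA.
normalize_sex <- function(x) {
    x <- as.character(x)
    out <- rep(NA_character_, length(x))
    out[x %in% c("1", "M", "m", "Male", "male")] <- "M"
    out[x %in% c("2", "F", "f", "Female", "female")] <- "F"
    factor(out, levels = c("M", "F"))
}

## Numeric encoding: 1 = male, 2 = female, anything else unknown.
normalize_sex(c(1, 2, 3, NA))

## Character encoding.
normalize_sex(c("male", "F", "unknown"))
```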
Export pedigree data to a file.
Access the name column in the pedigree of the FAData object. The function returns a named vector with the names corresponding to the ids of the individuals or NULL if name does not correspond to a column name in the pedigree. The trait data can be accessed either by object$trait or object$affected.
Returns the age of the individuals as a named numeric vector. If the pedigree is set, the order of the values corresponds always to the ordering of the individuals in the pedigree with NA for individuals for which the age is unknown. In case the age was never set it returns a vector of NAs with length equal to the number of individuals.
Setter for the age. Value has to be a named numeric vector.
Returns the pedigree either as a data.frame or a pedigreeList object (defined in the kinship2 package) depending on the value of the parameter return.type (i.e. either return.type="data.frame" or return.type="pedigree"). If pedigree is called on any other object than a FAData object (or any object that inherits from that object), the pedigree method from the kinship2 package is called.
For the default return type (i.e. return.type="data.frame") a data.frame is returned with the following columns:
"family": the ID of the family,
"id": the ID of the individual,
"father": the ID of the individual's father. Founder individuals, i.e. individuals for whom the father and mother are not known in the data set, contain an NA in this column.
"mother": the ID of the individual's mother. Founder individuals, i.e. individuals for whom the father and mother are not known in the data set, contain an NA in this column.
"sex": the sex of the individuals encoded as a factor with levels "M" and "F" for male and female, or NA for not known.
If trait information is available in the object the returned data.frame will also contain a column named affected with the information whether the individual is affected (1), not affected (0) or was not tested/phenotyped (NA).
Setter for the pedigree slot. Value can be a data.frame with columns containing the family id, individual id, father id, mother id and sex (in this order) or a pedigree or pedigreeList object as defined in the kinship2 package.
Subsets the FAData object to individuals specified with i which can be a logical, numeric or character vector. For the latter, the elements have to be the ids of the individuals (i.e. rownames of pedigree(object)). Returns the sub-setted object.
Note that subsetting other than by family might result in a non-valid pedigree (e.g. if mother or father ID are not available in the sub-setted pedigree).
Get the trait vector from the object. By default, the ordering is the same as in the pedigree; setting argument na.rm=TRUE removes all NA values, so the ordering and length might be different. Returns a named vector with the names corresponding to the ids of the individuals.
Setter for the trait slot. Can be a named numeric vector (values 0, 1 and NA) or logical vector (values FALSE, TRUE and NA) with the names matching the ids of the individuals in the pedigree. The method internally matches and re-orders the trait vector to match the ordering of the ids in the pedigree.
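The matching and re-ordering step can be pictured with a few lines of base R. This is a sketch of the idea only, under the assumption that ids without a supplied value end up as NA; the object names are made up for illustration.

```r
## Ids in pedigree order (hypothetical).
ped_ids <- c("1", "2", "3", "4", "5")

## Named trait vector in arbitrary order, logical encoding, no value
## supplied for id "4".
tr <- c("3" = TRUE, "1" = FALSE, "5" = NA, "2" = TRUE)

## Re-order to the pedigree; ids without a value become NA.
matched <- as.numeric(tr[match(ped_ids, names(tr))])
names(matched) <- ped_ids
matched
```

After matching, the vector has one entry per individual in pedigree order, with TRUE/FALSE converted to 1/0.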
Returns a character vector with the ids of the affected individuals, i.e. the id of the individuals with a value other than 0 or NA in the trait. If no trait data is available the method returns NULL.
Builds a pedigree for the specified id(s) containing generations defined by max.generations.up and max.generations.down and returns it as a data.frame. The pedigree contains all individuals in the family sharing kinship with the input individual(s) and mates needed to complete the pedigree. For prune=TRUE the function tries to find the smallest connected pedigree for all the submitted ids.
Returns the pedigree for a full family. In contrast to buildPed which constructs a (sub)pedigree for a specific individual, this method returns the pedigree of the complete family for an individual (if id is specified). The function returns either a data.frame or a pedigreeList with the pedigree for the family.
Extracts the pre-calculated kinship matrix, i.e. a symmetric matrix with the kinship between all individuals in the pedigree. The matrix is calculated using the kinship method provided by the kinship2 package [Sinnwell (2014)]. The function returns a dsCMatrix from the Matrix package.
Returns the size, i.e. the number of individuals (rows) in the pedigree.
Returns a character vector with the ids of the phenotyped individuals, i.e. the id of all individuals that have a non-NA value in the trait. If no trait data is available the method returns NULL.
Creates the pedigree for the submitted id(s) or family and plots it (i.e. saves it to the specified file). See details above for more information.
Returns the file name of the file to which the pedigree plot was exported or NULL for kinship2 plotting and device="plot". For HaploPainter plotting and device = "txt" the name of the file to which the plotting data has been exported is returned. See doPlotPed for more information.
Methods for familial aggregation and other pedigree analysis methods are described on the PedigreeAnalysis help page.
A variety of different pedigree utilities are defined for FAData objects. For the full list of methods see the PedigreeUtils help page.
The ids of individuals, father, mother and family can be either numeric or character; internally, however, all ids are handled as characters.
The pedigree<- setter method removes all white spaces in columns "id", "family", "father" and "mother" of the pedigree.
Johannes Rainer.
Sinnwell JP, Therneau TM & Schaid DJ (2014) The kinship2 R package for pedigree data. Human Heredity 78:91-93.
pedigree, FAProbResults, FAKinGroupResults, FAKinSumResults, FAGenIndexResults, doPlotPed, PedigreeUtils, getAll, PedigreeAnalysis
##########################
##
##  Create a new FAData object
##
## Load the Minnesota Breast Cancer record and subset to the
## first families.
data(minnbreast)
mbsub <- minnbreast[minnbreast$famid==4 | minnbreast$famid==5, ]
mbped <- mbsub[, c("famid", "id", "fatherid", "motherid", "sex")]
## Renaming column names
colnames(mbped) <- c("family", "id", "father", "mother", "sex")
## Defining the optional argument age.
Age <- mbsub$endage
names(Age) <- mbsub$id
## Create the object
fad <- FAData(pedigree=mbped, age=Age)
fad

## Extract the ids directly...
head(fad$id)

## Extract the kinship matrix
dim(kinship(fad))

## What's the size of the pedigree?
pedigreeSize(fad)

## Importing a "ped" file.
pedFile <- system.file("txt/minnbreastsub.ped.gz", package="FamAgg")
## Quick glance at the file.
readLines(pedFile, n=1)
fad <- FAData(pedFile)

head(pedigree(fad))

## Creating the FAData reading data from a txt file.
pedFile <- system.file("txt/minnbreastsub.txt", package="FamAgg")
fad <- FAData(pedigree=pedFile, header=TRUE, id.col="id",
              family.col="famid", father.col="fatherid",
              mother.col="motherid")

## Adding the age
age(fad) <- Age
fad

## List all families in the pedigree along with the number of
## individuals
table(fad$family)

##########################
##
##  Basic usage
##
## Extracting the pedigree information
ped <- pedigree(fad)
## By default the pedigree is returned as a data.frame.
class(ped)
head(ped)

## In addition, we can extract the pedigree as a pedigreeList
pedigree(fad, return.type="pedigree")

## Return the ids of all ancestors of individual 6
## up to 3 generations
getAncestors(fad, id="6")

## Build the pedigree for individual 6: this includes all of its
## children and all of its ancestors up to the maximal number of
## specified generations.
buildPed(fad, id=6)

## Which is a sub-pedigree of the complete family:
family(fad, id=6)

## In addition we can specify manually some ids in the pedigree and
## generate the smallest possible pedigree containing all ids:
buildPed(fad, id=c(6, 23, 28), prune=TRUE)

## Get the list of all ids sharing kinship with individuals
## 5 and 9
shareKinship(fad, id=c("5", "9"))

## Subset the fad to family "4"
subFad <- fad[fad$family == "4", ]
subFad

## Export the pedigree from this family to a ped file
tmpFile <- tempfile()
export(subFad, con=tmpFile, format="ped")
head(read.table(tmpFile, sep="\t"))

##########################
##
##  Plotting
##
## Plot the pedigree for individual 6.
plotPed(fad, id=6)

## Alternatively, export it to a temporary file
pfile <- plotPed(fad, id=6, device="pdf")
pfile

## Highlighting some of the individuals:
## first get to know which other individuals are in the pedigree
plotPed(fad, id=6, highlight.ids=list(hello=c(1, 2, 4)))

##########################
##
##  Adding trait data
##
fad <- FAData(pedigree=mbped, age=Age)
tcancer <- mbsub$cancer
names(tcancer) <- mbsub$id
trait(fad) <- tcancer

## Now we can plot the pedigree also showing the affected status.
plotPed(fad, id=6)

## Alternatively, create the FAData with the trait data
fad <- FAData(pedigree=mbped, trait=mbsub$cancer, traitName="cancer")
plotPed(fad, id=6)
The genealogical index [Hill, 1980], also referred to as the genealogical index of familiality (GIF) in the literature, is a method to identify familial clustering of diseases or other traits. For a given trait, the method computes the mean kinship between affected in the whole pedigree along with mean kinships of randomly drawn sets of individuals. The distribution of average kinship values among the control sets is used to estimate the probability that the observed level of kinship among the cases is due to chance.
## S4 method for signature 'FAGenIndexResults'
plotPed(object, id=NULL, family=NULL, filename=NULL, device="plot", ...)

## S4 method for signature 'FAGenIndexResults'
plotRes(object, id=NULL, family=NULL, addLegend=TRUE, type="density", ...)

## S4 method for signature 'FAGenIndexResults'
result(object, method="BH")

## S4 method for signature 'FAGenIndexResults'
runSimulation(object, nsim=50000, perFamilyTest=FALSE,
              controlSetMethod="getAll", rm.singletons=TRUE,
              strata=NULL, ...)

## S4 replacement method for signature 'FAGenIndexResults'
trait(object) <- value
(in alphabetic order)
addLegend |
For |
controlSetMethod |
For |
device |
For |
family |
For |
filename |
For |
id |
For |
method |
The multiple hypothesis testing method. All methods supported by p.adjust are allowed. |
nsim |
Number of simulations. |
perFamilyTest |
For |
rm.singletons |
For |
object |
The |
strata |
For |
type |
For |
value |
For |
... |
For For |
This implementation differs from the original method by Hill: in addition to per-family analyses, it supports stratified sampling and a more flexible definition of the set of matched control individuals.
The controlSetMethod parameter allows to specify a method to define the matched control set (e.g. matched by sex or matched by any externally provided vector).
Stratified sampling allows to fine-tune the selection of matched controls even further. Assuming that in a pedigree the group of affected consists of 5 female and 3 male individuals, passing the sex of all individuals to the function (e.g. strata=fad$sex, with fad being the FAData object containing the pedigree to be analyzed) results in random sets with the same proportion of male/female individuals (i.e. 5 females, 3 males). Note that, if strata is specified, all individuals with a missing value in strata (also affected individuals) are excluded from the analysis.
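The 5 females / 3 males example can be reproduced with a short base R sketch of stratified sampling. This is an illustration of the sampling idea under made-up data, not the package's code; draw_stratified is a hypothetical helper.

```r
set.seed(1)

## Hypothetical toy data: sex of 20 individuals, 8 of them affected.
ids <- as.character(1:20)
sex <- setNames(rep(c("F", "M"), each = 10), ids)
affected <- c("1", "2", "3", "4", "5", "11", "12", "13")  # 5 F, 3 M

## Number of affected individuals per stratum.
n_per_stratum <- table(sex[affected])

## Draw one random set with the same composition as the affected group.
## which() drops individuals with a missing value in `strata`.
draw_stratified <- function(ids, strata, n_per_stratum) {
    unlist(lapply(names(n_per_stratum), function(s) {
        sample(ids[which(strata == s)], n_per_stratum[[s]])
    }))
}

rand_set <- draw_stratified(ids, sex, n_per_stratum)
table(sex[rand_set])
```

Each call to draw_stratified returns 8 ids, 5 of them female and 3 male, which is the property the paragraph above describes for the random sets.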
Note that by default singletons (i.e. unconnected individuals in the pedigree) are removed from the pedigree prior to the analysis. Set rm.singletons=FALSE if you do not want them to be removed.
By default, the genealogical index is calculated on the whole pedigree, but it is also possible to evaluate within-family clustering of cases by specifying perFamilyTest=TRUE. In that case, it is also possible to use the getGenerationMatched and getGenerationSexMatched functions to define the set of matched controls from which random samples will be taken.
A call to the setter methods trait<- resets any simulation results present in the sim slot, thus, the object can be re-used to perform a simulation analysis using the new trait data.
Refer to the method and function description above for detailed information on the returned result object.
FAGenIndexResults objects are created calling the genealogicalIndexTest method on a FAData object.
Class FAData, directly.
A character specifying the name of the method used to define the set of control individuals from which random samples were taken.
Number of simulations.
Logical indicating whether a per-family test was performed.
The result of the simulation. This slot should not be accessed directly, use the result method to extract result information.
plotPed
Plots a pedigree for one of the affected individuals in the simulation results. The id of the selected affected individual (specified with argument id) is highlighted in red. See plotPed for more details.
plotRes
Plots the results from a genealogical index simulation analysis. The distribution of the mean kinship values of the randomly drawn controls are displayed as a grey density plot, the observed mean kinship value of all affected as a blue vertical line.
Returns the result from the simulation as a data.frame with columns:
"trait_name": the name of the trait.
"total_phenotyped": total number of individuals in the pedigree phenotyped in the analyzed trait.
"total_affected": total number of individuals in the pedigree that are affected in the analyzed trait (i.e. number of cases).
"entity_id": the id for the analyzed entity, being either the whole pedigree (in which case the id will be "1") or the id of the family (if perFamilyTest=TRUE).
"entity_ctrls": the number of (matched) control individuals from which the random samples were drawn.
"entity_affected": the number of affected individuals in the entity. This number can differ from the number of affected, if strata was specified and some of the affected have a missing value in strata.
"genealogical_index": the genealogical index of familiality (GIF), i.e. the mean kinship value between all affected in the entity (pedigree or family). To be consistent with the original implementations, the genealogical index is the mean kinship multiplied by 100000.
"pvalue": the p-value for the significance of the mean kinship.
"padj": the p-value adjusted for multiple hypothesis testing (with the method specified with argument method).
The returned data.frame is sorted by column "pvalue", its rownames correspond to column "entity_id".
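How the "genealogical_index" and "pvalue" columns relate to a kinship matrix can be sketched in base R. Everything below is an assumption-laden illustration, not the package's implementation: the kinship matrix is made up, the mean is taken over unordered pairs, controls are drawn from all individuals, and the empirical p-value formula (share of random sets with a mean kinship at least as large as observed) is one common choice that may differ from what FamAgg computes.

```r
set.seed(123)

## Hypothetical kinship matrix for 6 individuals: 1-3 are siblings
## (pairwise kinship 0.25), 4-6 are unrelated to everyone else.
ids <- as.character(1:6)
kin <- matrix(0, 6, 6, dimnames = list(ids, ids))
kin[1:3, 1:3] <- 0.25
diag(kin) <- 0.5

## Mean pairwise kinship within a set (each pair counted once,
## self-kinship on the diagonal ignored).
mean_kinship <- function(kin, set) {
    sub <- kin[set, set]
    mean(sub[upper.tri(sub)])
}

affected <- c("1", "2", "3")
obs <- mean_kinship(kin, affected)

## Scale as described for the "genealogical_index" column.
gif <- obs * 100000

## Random control sets of the same size as the affected group.
nsim <- 1000
sim <- replicate(nsim, mean_kinship(kin, sample(ids, length(affected))))

## Empirical p-value (assumed formula, see lead-in).
pvalue <- sum(sim >= obs) / nsim
c(genealogical_index = gif, pvalue = pvalue)
```

In this toy setting only one of the 20 possible sets of three individuals reaches the observed mean kinship, so the empirical p-value is expected to be close to 0.05.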
Performs the simulation analysis based on the pedigree and trait information stored in the object. Returns a FAGenIndexResults object with the results of the simulation.
Set the trait information. This method will reset all simulation results saved in the sim slot.
Subsetting (using the [ operator) is not supported.
Johannes Rainer
Hill, J. R. (1980) A survey of cancer sites by kinship in the Utah Mormon population. In Cairns J, Lyon JL, Skolnick M (eds): Cancer Incidence in Defined Populations. Banbury Report 4. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press, pp 299–318.
FAData, trait, probabilityTest, kinshipGroupTest, kinshipSumTest, familialIncidenceRateTest, fsirTest, plotPed
##########################
##
##  Perform the simulation analysis
##
## Load the Minnesota Breast Cancer data set.
data(minnbreast)

## Subset to some families and generate a pedigree data.frame
mbsub <- minnbreast[minnbreast$famid == 4 | minnbreast$famid == 14 |
                    minnbreast$famid == 6 | minnbreast$famid == 8, ]
PedDf <- mbsub[, c("famid", "id", "fatherid", "motherid", "sex")]
colnames(PedDf) <- c("family", "id", "father", "mother", "sex")

## Generate the FAData.
fad <- FAData(pedigree=PedDf)

## Specify the trait.
tcancer <- mbsub$cancer
names(tcancer) <- mbsub$id

## Perform the test with default settings, i.e. use all individuals
## in the pedigree as control set from which random samples are drawn
## and perform the analysis on the whole pedigree.
gi <- genealogicalIndexTest(fad, trait=tcancer, traitName="cancer",
                            nsim=1000)

## Just show some information
gi

## Show the results
result(gi)

## Plot the observed mean kinship and the distribution of the mean kinship of
## random samples.
plotRes(gi)

## Plot the pedigree for one of the families. All individuals
## used as matched control set are highlighted in red.
plotPed(gi, family="8")

## Repeat the analysis using the sex as strata. This will result in stratified
## random sampling with the number of female and male individuals selected in
## each permutation corresponding to the numbers below
table(gi$sex[affectedIndividuals(gi)])

giStrata <- runSimulation(gi, nsim=1000, strata=gi$sex)
result(giStrata)

## Alternatively, we can use "getSexMatched" as the function to define the set
## of control individuals. In the present case, however, both male and female
## individuals will be selected, since there are both male and female
## individuals among the affected cases.
giPerFam <- runSimulation(gi, nsim=1000, controlSetMethod="getSexMatched",
                          perFamilyTest=TRUE)
result(giPerFam)

## For those families in which there are only female cases, random samples
## were drawn among only female individuals (within the same family). These
## are highlighted in red in the pedigree plot:
plotPed(giPerFam, family="14", cex=0.5)

## Plot the simulation result for this family:
plotRes(giPerFam, family="14")
The FAIncidenceRateResults
object contains the results from a
familial incidence rate calculation that additionally employs Monte
Carlo simulations to assess significance levels for the familial
incidence rate of each individual.
The familial incidence rate (FIR, also referred to as FR in Kerber
(1995)) is an estimate of the risk per gene-time for each individual
for a certain disease given the disease experience in the cohort. The
measure considers the kinship of each individual with any affected
individual in the pedigree and the time at risk for each.
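The following base R sketch illustrates how a kinship-weighted rate of this kind could be computed. It assumes the form FIR_i = sum_j(k_ij * a_j) / sum_j(k_ij * t_j), with k_ij the kinship between individuals i and j, a_j the 0/1 affection status and t_j the time at risk. This form, the toy kinship matrix and all variable names are assumptions made for illustration; the package's own calculation may differ in its details.

```r
## Toy sketch of a kinship-weighted incidence rate (assumed form, not FamAgg code).
ids <- c("A", "B", "C", "D")

## Hypothetical symmetric kinship matrix; D is unrelated to everybody else.
kin <- matrix(c(0.50, 0.25, 0.25, 0.00,
                0.25, 0.50, 0.25, 0.00,
                0.25, 0.25, 0.50, 0.00,
                0.00, 0.00, 0.00, 0.50),
              nrow = 4, dimnames = list(ids, ids))

## Hypothetical affection status (1 = affected) and time at risk (e.g. age).
affected <- c(A = 1, B = 0, C = 1, D = 0)
tar <- c(A = 60, B = 45, C = 52, D = 70)

## Each individual's kinship with itself does not contribute.
diag(kin) <- 0

## Kinship-weighted number of cases and kinship-weighted time at risk.
num <- as.vector(kin %*% affected)
den <- as.vector(kin %*% tar)

## Individuals sharing no kinship with anybody else get NA.
fir <- ifelse(den > 0, num / den, NA_real_)
names(fir) <- ids
fir
```

In this toy pedigree, individual D shares no kinship with any other individual and therefore receives a value of NA.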
Note that in contrast to e.g. FAKinSumResults
a familial
incidence rate and corresponding p-value are calculated and available
for all individuals in the pedigree, not only for affected individuals.
## S4 method for signature 'FAIncidenceRateResults'
familialIncidenceRate(object, trait=NULL, timeAtRisk=NULL, ...)

## S4 method for signature 'FAIncidenceRateResults'
plotPed(object, id=NULL, family=NULL, filename=NULL,
        device="plot", only.phenotyped=FALSE, ...)

## S4 method for signature 'FAIncidenceRateResults'
plotRes(object, id=NULL, family=NULL, addLegend=TRUE,
        type="density", ...)

## S4 method for signature 'FAIncidenceRateResults'
result(object, method="BH")

## S4 method for signature 'FAIncidenceRateResults'
runSimulation(object, nsim=50000, timeAtRisk=NULL,
              strata=NULL, ...)

## S4 method for signature 'FAIncidenceRateResults'
timeAtRisk(object)

## S4 replacement method for signature 'FAIncidenceRateResults'
timeAtRisk(object) <- value

## S4 replacement method for signature 'FAIncidenceRateResults'
trait(object) <- value
(in alphabetic order)
addLegend |
For |
device |
For |
family |
For |
filename |
For |
id |
For |
method |
The multiple hypothesis testing method. All methods supported by
|
nsim |
Number of simulations. |
object |
The |
only.phenotyped |
For |
strata |
For |
timeAtRisk |
For |
trait |
For |
type |
For |
value |
For |
... |
For For |
Monte Carlo simulation and empirical p-value estimation: the background distribution to calculate the p-value for a familial incidence rate (FIR) is determined by randomly sampling N affected individuals (N being the number of affected) and calculating the expected FIR for all individuals in each simulation iteration. The p-value for an individual is thus the number of iterations in which the expected FIR for that individual was larger than or equal to the observed FIR, divided by the total number of iterations.
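The empirical p-value described above can be sketched in a few lines of base R. The observed value and the simulated background values below are made-up numbers for a single individual, not output of the package.

```r
## Toy illustration of the empirical p-value (not FamAgg code).
set.seed(123)
nsim <- 1000

## Hypothetical observed FIR for one individual.
observed <- 0.012

## Hypothetical expected FIRs for that individual, one per simulation iteration.
expected <- rexp(nsim, rate = 200)

## Number of iterations with an expected FIR >= the observed FIR,
## divided by the number of iterations.
pvalue <- sum(expected >= observed) / nsim
pvalue
```

With this definition the resolution of the p-value is 1/nsim, so the number of simulations limits how small a p-value can be distinguished from zero.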
Calling the runSimulation
method on a FAIncidenceRateResults
object is the same as calling
familialIncidenceRateTest
on a
FAData
object. In the first case the simulation
is performed using the trait information stored internally in the
object, while in the latter case the trait information has to be
passed to the function call.
By providing argument strata
, stratified random sampling is
performed. See example below and the details section in
PedigreeAnalysis
for more details.
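As a rough illustration of what stratified random sampling means here, the base R sketch below draws a random set of individuals that contains the same number of individuals per stratum as observed among the affected. The data and the helper name drawStratified are hypothetical and only serve the illustration; the package's internal sampling code is not shown.

```r
## Toy sketch of stratified random sampling (not FamAgg code).
set.seed(42)

## Hypothetical annotation: sex as stratum and a 0/1 affection status.
sex <- rep(c("F", "M"), times = c(12, 8))
affected <- c(rep(c(1, 0), times = c(4, 8)),
              rep(c(1, 0), times = c(1, 7)))
names(sex) <- names(affected) <- paste0("id", seq_along(sex))

## Number of affected per stratum; every random draw reproduces these counts.
nPerStratum <- vapply(split(affected, sex), sum, numeric(1))

## Hypothetical helper: sample n[s] individuals from each stratum s.
drawStratified <- function(strata, n) {
    unlist(lapply(names(n), function(s) {
        sample(names(strata)[strata == s], size = n[[s]])
    }), use.names = FALSE)
}

sampled <- drawStratified(sex, nPerStratum)
table(sex[sampled])
```

Each simulated set thus has the same composition with respect to the stratum as the set of affected individuals, which guards against differences that are driven by the stratification variable alone.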
The familial incidence rate can also be directly calculated, without
simulation, using the familialIncidenceRate
method of a
FAData
object.
A call to the setter method trait<-
resets any simulation
results present in the sim
slot; the object can thus be
re-used to perform a simulation analysis using the new trait data.
By default the Monte Carlo p-value estimation in the
runSimulation
method is quite memory demanding. For very large
pedigrees the optional argument lowMem=TRUE
can be passed to
the method, which results in faster and less memory-demanding
calculations. This will, however, disable the plotRes
method on
the resulting FAIncidenceRateResults
as the distribution of
familial incidence rates from the simulation runs is no longer
reported.
Note: the FIR for singletons and individuals that do not share kinship
with at least one other phenotyped individual that also has a valid
value in argument timeAtRisk
(and, if provided, strata
)
will be NA
.
Refer to the method and function description above for detailed information on the returned result object.
FAIncidenceRateResults
objects are created by the
familialIncidenceRateTest
method on a
FAData
object.
Class FAData
directly.
Number of simulations.
The result of the simulation. This slot should not be accessed
directly, use the result
method to extract result
information.
Numeric vector with the time at risk for each individual. Use the
accessor method timeAtRisk
or use object$tar
to
extract this data.
Access the familial incidence rate using object$fir
, the
(raw) p-value from the simulation using object$pvalue
and the
time at risk for each individual using object$tar
or
object$timeAtRisk
with object
being the
FAIncidenceRateResults
object.
Returns the familial incidence rate values calculated by a call to
the runSimulation
method or
familialIncidenceRate
method on a
FAData
object. In contrast to that latter method,
which directly calculates the values, this method returns the
values from a calculation stored inside the
FAIncidenceRateResults
object.
The method returns a named numeric vector with the familial
incidence rates for all individuals in the pedigree, the names
being the IDs of the individuals. Singletons as well as individuals
that, after removing non-phenotyped individuals or individuals
without time at risk, do not share kinship with any other
individual in the pedigree have a value of NA
.
Plots a pedigree for one of the affected individuals in the
simulation results. The id of the selected affected individual
(specified with argument id
) is highlighted in red. The
familial incidence rate value for each individual is drawn below
the individual's id.
See plotPed
for more details.
Plots the distribution of expected familial incidence rates calculated for the selected individuals from Monte Carlo simulations along with the actually observed familial incidence rate.
Returns the result from the simulation as a data.frame
with
columns:
"trait_name"
: the name of the trait.
"total_phenotyped"
: total number of phenotyped individuals in
the trait.
"total_affected"
: total number of affected individuals in the
trait.
"total_tested"
: the number of individuals in the pedigree
considered for the simulation. This corresponds to all individuals
with valid, non-NA values in trait, timeAtRisk and, if provided,
strata.
"id"
: the id of the individual.
"family"
: the family id.
"fir"
: the familial incidence rate. Note that this will be
NA
for all non-phenotyped individuals and singletons in the pedigree as well as for
individuals that do not share kinship with at least one other
phenotyped individual with valid time at risk (or valid value in
parameter strata
).
"pvalue"
: the p-value for the significance of the familial
incidence rate assessed by Monte Carlo simulations.
"padj"
: the p-value adjusted for multiple hypothesis
testing (with the method specified with argument method
).
The returned data.frame
is sorted by column
"pvalue"
, its row names correspond to column
"id"
.
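The adjustment and sorting of the result table can be mimicked with base R. The sketch assumes that the adjustment corresponds to a call to p.adjust with the chosen method; the toy table below is made up and contains only a subset of the columns listed above.

```r
## Toy sketch: adjust raw p-values and sort the table (not FamAgg code).
res <- data.frame(id = c("1", "2", "3", "4"),
                  pvalue = c(0.04, 0.001, 0.2, 0.03),
                  stringsAsFactors = FALSE)

## Adjust for multiple hypothesis testing; "BH" is the Benjamini-Hochberg method.
res$padj <- p.adjust(res$pvalue, method = "BH")

## Sort by raw p-value and use the ids as row names.
res <- res[order(res$pvalue), ]
rownames(res) <- res$id
res
```

Other values accepted by p.adjust, such as "bonferroni" or "holm", can be used in the same way.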
Performs the simulation analysis based on the pedigree and trait
information stored in the object as well as the time at risk
provided with argument timeAtRisk
.
Returns a FAIncidenceRateResults
object with the results
from the simulation.
Set the trait information. This method will reset all simulation
results saved in the sim
slot.
Subsetting (using the [
operator) is not supported.
Johannes Rainer
Kerber, R.A. (1995) Method for calculating risk associated with family history of a disease. Genet Epidemiol, pp 291–301.
FAData
,
kinship
,
trait
,
probabilityTest
,
kinshipGroupTest
,
kinshipSumTest
,
genealogicalIndexTest
,
familialIncidenceRateTest
,
fsirTest
,
plotPed
,
estimateTimeAtRisk
##########################
##
##  Perform the simulation analysis
##
library(FamAgg)

## Load the test data.
data(minnbreast)

## Subset to some families and generate a pedigree data.frame.
mbsub <- minnbreast[minnbreast$famid == 4 | minnbreast$famid == 5 |
                    minnbreast$famid == 6 | minnbreast$famid == 7 |
                    minnbreast$famid == 411, ]
PedDf <- mbsub[, c("famid", "id", "fatherid", "motherid", "sex")]
colnames(PedDf) <- c("family", "id", "father", "mother", "sex")

## Generate the FAData.
fad <- FAData(pedigree=PedDf)

## Specify the trait.
tcancer <- mbsub$cancer
names(tcancer) <- mbsub$id

## Specify the "time at risk"; we are using column "endage"
tar <- mbsub$endage

## Perform the simulation test:
far <- familialIncidenceRateTest(fad, trait=tcancer, traitName="cancer",
                                 timeAtRisk=tar, nsim=1000)
head(result(far))

## We can easily extract the actual FIR values:
head(far$fir)
## Or
head(familialIncidenceRate(far))

## Access the p-value directly.
head(far$pvalue)

## Access the time at risk
head(timeAtRisk(far))
head(far$tar)
head(far$timeAtRisk)

## Plot the pedigree for a family with significant FIRs.
## The numbers below the IDs of the individuals represent the actual
## FIR values.
plotPed(far, family=result(far)$family[1])

## Plot also the result from the simulation run.
plotRes(far, id=result(far)$id[1])
The FAKinGroupResults
object contains the results from the
kinship test. This test performs a familial aggregation
analysis on a subset of individuals within a family. Two actual tests
are conducted, a ratio test that evaluates whether the number
of affected individuals within the group is higher than expected by
chance, and a kinship test that compares the largest kinship
value between affected in the group to the one between randomly
sampled individuals.
For more details see kinshipGroupTest
.
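To give an intuition for the ratio test, the base R sketch below estimates by Monte Carlo simulation how often a random set of "affected" individuals places at least as many cases into a given group as were actually observed. The group, the ids and the sampling scheme are assumptions made for illustration only; they are not taken from the package's implementation.

```r
## Toy sketch of a Monte Carlo ratio test for one group (not FamAgg code).
set.seed(7)

## Hypothetical phenotyped individuals and affected cases.
phenotyped <- paste0("id", 1:40)
affected <- phenotyped[c(1, 2, 3, 10, 25)]

## Hypothetical kinship group; it contains three of the five affected.
group <- phenotyped[1:6]
obsCount <- sum(group %in% affected)

## Randomly reassign the affection status and count the cases in the group.
nsim <- 5000
simCount <- replicate(nsim, {
    sum(sample(phenotyped, length(affected)) %in% group)
})

## Fraction of simulations with at least as many cases in the group.
pvalueRatio <- mean(simCount >= obsCount)
pvalueRatio
```

A small value indicates that the group contains more affected individuals than would be expected if cases were distributed at random among the phenotyped individuals.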
## S4 method for signature 'FAKinGroupResults'
affectedKinshipGroups(object)

## S4 method for signature 'FAKinGroupResults'
buildPed(object, id=NULL, max.generations.up=3,
         max.generations.down=16, prune=FALSE)

## S4 method for signature 'FAKinGroupResults'
plotPed(object, id=NULL, family=NULL, filename=NULL,
        device="plot", ...)

## S4 method for signature 'FAKinGroupResults'
plotRes(object, id=NULL, family=NULL, addLegend=TRUE,
        type="density", ...)

## S4 method for signature 'FAKinGroupResults'
result(object, method="BH")

## S4 method for signature 'FAKinGroupResults'
runSimulation(object, nsim=50000, strata=NULL)

## S4 method for signature 'FAKinGroupResults'
shareKinship(object, id=NULL, rmKinship=0)

## S4 replacement method for signature 'FAKinGroupResults'
trait(object) <- value
(in alphabetic order)
addLegend |
For |
device |
For |
family |
For |
filename |
For |
id |
For Note: |
max.generations.down |
For |
max.generations.up |
For |
method |
The multiple hypothesis testing method. All methods supported by
|
nsim |
Number of simulations. |
object |
The |
prune |
For |
rmKinship |
For |
strata |
For |
type |
For |
value |
For |
... |
For |
Calling the runSimulation
method on a FAKinGroupResults
object is the same as calling the kinshipGroupTest
on a
FAData
object. In the first case the
simulation is performed using the trait information
stored internally in the object, while in the latter case the
trait information has to be passed to the function
call.
A call to the setter method trait<-
resets any simulation
results present in the sim
slot; the object can thus be
re-used to perform a simulation analysis using the new trait data.
The buildPed
method returns by default the full pedigree (all
ancestors and all children) up to the maximal number of
generations. By setting prune=TRUE
the method restricts the
pedigree to all individuals with a kinship >= the minimal kinship
between the individual (with the id equal to the group id) and any
other affected individual in its pedigree.
The plotPed
method allows plotting the pedigree for a kinship
group. This pedigree consists of the full pedigree (all ancestors and
children) of all individuals in the kinship group. Similar to the
buildPed
method, the pedigree can be restricted to the kinship
group (and any missing parents etc.) by setting prune=TRUE
.
Refer to the method and function description above for detailed information on the returned result object.
FAKinGroupResults
objects are created by the
kinshipGroupTest
method for FAData
objects.
Class FAData
directly.
Number of simulations.
The list containing the result of the simulation. Element
"pvalueKinship"
contains the p-values from the kinship test,
"pvalueRatio"
the p-values from the ratio test,
"expDensity"
and "expHist"
density
and
hist
objects representing the background distribution from
the Monte Carlo simulation.
This slot should not be directly accessed, use the result
method to extract result information.
A list of lists, each element representing one kinship group,
$aff
the ids of all affected persons in that group sharing
kinship with the individual (being the id of the group),
$phe
ids of all phenotyped individuals in the pedigree with
kinship to the individual up to a kinship being smaller or equal
to the smallest kinship of the individual with any other affected
in that group. $kinfreq
: a table with the frequency
(counts) of kinship values (smaller than 0.5). $meankin
: the
mean kinship in that group.
Get groups of affected individuals in the pedigree along with
all phenotyped individuals with kinship larger or equal to the
smallest kinship between affected individuals in the
group. Returns a list
of lists with elements aff
,
phe
, kinfreq
and meankin
for each list item:
aff
: character vector with the ids of all affected in the
group, phe
: character vector with the ids of all phenotyped
in the group, kinfreq
: a table
with the frequency
(counts) of kinship values (self-self kinships removed; the names of the
table correspond to the kinship values in increasing order), and
meankin
: the mean kinship value in the group. The names of
the list correspond to the id of the affected individual for which
the affected kinship group was determined.
Builds the pedigree for the submitted id (which represents the id
of the group, i.e. one of
names(affectedKinshipGroups(object))
, respectively ids in
column "group_id"
of result(object)
). Building a
pedigree by submitting the family id is not supported. See details
below for more information. The resulting pedigree is returned as
a data.frame
.
Plots a pedigree for one of the affected kinship groups in the
simulation results. The ids of all individuals of the affected
kinship group are highlighted in the plot in red.
See plotPed
for more details on
the plotting and details below for additional settings.
Plots the distribution of the number of affected individuals in random samples within the kinship group along with the actually observed kinship sum.
Returns the result from the simulation as a data.frame
with
columns:
"trait_name"
: the name of the trait.
"total_phenotyped"
: the total number of phenotyped
individuals in the trait.
"total_affected"
: the total number of affected individuals
in the trait.
"phenotyped"
: the number of phenotyped individuals in all
analyzed kinship groups.
"affected"
: the number of affected individuals in all
analyzed kinship groups.
"group_id"
: the id for the kinship group (represents the id
of one of the affected individuals in the group).
"family"
: the family id of the affected/group.
"group_phenotyped"
: the number of phenotyped individuals in the
current kinship group.
"group_affected"
: the number of affected individuals in the
current kinship group.
"ratio_pvalue"
: the p-value from the ratio test.
"ratio_padj"
: the p-value from the ratio test adjusted for
multiple hypothesis testing using the method specified with
argument method
.
"mean_kinship"
: the mean kinship value between all
individuals in the current kinship group.
"kinship_pvalue"
: the p-value from the kinship test.
"kinship_padj"
: the p-value from the kinship test adjusted
for multiple hypothesis testing using the method specified with
argument method
.
The data.frame
is sorted by column "ratio_pvalue"
,
its rownames correspond to the "group_id"
.
Performs the simulation analysis based on the pedigree and trait
information stored in the object. Stratified sampling can optionally
be performed. Returns a FAKinGroupResults
object with the results of the simulation.
Returns a character vector with ids of all individuals that share
kinship with any of the individuals in the kinship group
identified by the argument id
. If rmKinship
is
specified, only individuals with a kinship >rmKinship
to the group defined by id
will be reported. This essentially
restricts only the inclusion of individuals outside of the group.
Everyone inside the group will be reported independently of the
threshold defined by rmKinship
. This feature has mainly been
implemented for reasons of API compatibility with the remaining
versions of shareKinship
.
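The effect of the rmKinship threshold described above can be sketched with a toy kinship matrix in base R. The helper shareWithGroup and all data are hypothetical and only mirror the described behaviour: members of the group are always reported, while individuals outside the group are reported only if their kinship with a group member exceeds the threshold.

```r
## Toy sketch of the rmKinship threshold (not FamAgg code).
ids <- c("a", "b", "c", "d", "e")
kin <- matrix(0, nrow = 5, ncol = 5, dimnames = list(ids, ids))
kin["a", "b"] <- kin["b", "a"] <- 0.25
kin["a", "c"] <- kin["c", "a"] <- 0.0625
kin["b", "d"] <- kin["d", "b"] <- 0.125

## Hypothetical kinship group.
group <- c("a", "b")

## Hypothetical helper mirroring the described behaviour.
shareWithGroup <- function(k, group, rmKinship = 0) {
    outside <- setdiff(rownames(k), group)
    ## Largest kinship of each outside individual with any group member.
    maxKin <- apply(k[outside, group, drop = FALSE], 1, max)
    ## Group members are always kept; the threshold applies to the others only.
    c(group, outside[maxKin > rmKinship])
}

shareWithGroup(kin, group)
shareWithGroup(kin, group, rmKinship = 0.1)
```

With the default threshold, every individual with any kinship to the group is included; raising the threshold drops the more distantly related individuals outside the group.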
Set the trait information. This method will reset all simulation
results saved in the sim
slot.
Subsetting (using the [
operator) is not supported.
Johannes Rainer, Daniel Taliun
FAData
,
kinship
,
trait
,
probabilityTest
,
kinshipSumTest
,
genealogicalIndexTest
,
familialIncidenceRateTest
,
fsirTest
,
buildPed
,
plotPed
,
switchPlotfun
##########################
##
##  Perform the simulation analysis
##
library(FamAgg)

## Load the test data.
data(minnbreast)

## Subset to some families and generate the pedigree data.frame
mbsub <- minnbreast[minnbreast$famid == 165 | minnbreast$famid == 432, ]
PedDf <- mbsub[, c("famid", "id", "fatherid", "motherid", "sex")]
colnames(PedDf) <- c("family", "id", "father", "mother", "sex")

## Generate the FAData.
fad <- FAData(pedigree=PedDf)

## Specify the trait.
tcancer <- mbsub$cancer
names(tcancer) <- mbsub$id

## Perform the kinship group test.
far <- kinshipGroupTest(fad, trait=tcancer, traitName="cancer",
                        nsim=1000)
res <- result(far)
head(res)

## Plot the pedigree for the most significant kinship group
plotPed(far, id=res[1, "group_id"])

## The full pedigree for this affected individual and its kinship group is
## large:
nrow(buildPed(far, id=res[1, "group_id"]))

## We can however restrict it to a reduced pedigree containing only the
## kinship group and all individuals with a kinship >= the smallest kinship
## between the individual and any other affected individual in the pedigree:
nrow(buildPed(far, id=res[1, "group_id"], prune=TRUE))

## By specifying prune=TRUE we can restrict the pedigree plot to these
## individuals
plotPed(far, id=res[1, "group_id"], prune=TRUE)

## Get the ids of all individuals sharing kinship with any of the individuals
## in that kinship group:
shareKinship(far, id=res[1, "group_id"])

## Plot the simulation analysis result for the ratio test.
plotRes(far, id=res[1, "group_id"], type="hist")
The FAKinSumResults
object contains the results from a kinship
cluster test which evaluates familial aggregation based on the sum of kinship
values between affected cases. This test highlights individuals that
exhibit a higher than chance relationship to other affected
individuals, therefore highlighting individuals within families
aggregating the phenotype. To achieve this, for each affected
individual the sum of kinship values to all other affected cases is
computed. In a Monte Carlo simulation this is repeated with the same
number of cases and the resulting background distribution is used to
compute p-values for the kinship sums obtained from the observed
cases.
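The idea of the kinship sum test can be sketched with a toy kinship matrix in base R. The matrix, the set of affected individuals and the way the simulated sums are pooled into a single background distribution are assumptions made for illustration; the package's implementation may handle these details differently.

```r
## Toy sketch of the kinship sum test (not FamAgg code).
set.seed(1)
ids <- paste0("id", 1:8)

## Hypothetical kinship matrix: two unrelated families of four individuals.
kin <- matrix(0, nrow = 8, ncol = 8, dimnames = list(ids, ids))
kin[1:4, 1:4] <- 0.25
kin[5:8, 5:8] <- 0.125
diag(kin) <- 0   # self-kinship does not contribute to the sums

## Hypothetical affected cases.
affected <- c("id1", "id2", "id3", "id6")

## For each case: sum of the kinship values to all other cases.
kinshipSums <- function(k, cases) {
    rowSums(k[cases, cases, drop = FALSE])
}
obs <- kinshipSums(kin, affected)

## Background distribution: repeat with the same number of random "cases".
nsim <- 2000
bg <- unlist(lapply(seq_len(nsim), function(i) {
    kinshipSums(kin, sample(ids, length(affected)))
}), use.names = FALSE)

## Empirical p-value for each observed kinship sum.
pvalue <- vapply(obs, function(x) mean(bg >= x), numeric(1))
pvalue
```

In this toy example the three affected individuals from the first family obtain a kinship sum of 0.5 each, while the single affected individual from the second family has a kinship sum of 0 and hence a p-value of 1.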
## S4 method for signature 'FAKinSumResults'
plotPed(object, id=NULL, family=NULL, filename=NULL,
        device="plot", only.phenotyped=FALSE, ...)

## S4 method for signature 'FAKinSumResults'
plotRes(object, id=NULL, family=NULL, addLegend=TRUE,
        type="density", ...)

## S4 method for signature 'FAKinSumResults'
result(object, method="BH", cutoff=0.05, rmKinship=0)

## S4 method for signature 'FAKinSumResults'
runSimulation(object, nsim=50000, strata=NULL, ...)

## S4 replacement method for signature 'FAKinSumResults'
trait(object) <- value
(in alphabetic order)
addLegend |
For |
cutoff |
For |
device |
For |
family |
For |
filename |
For |
id |
For |
method |
The multiple hypothesis testing method. All methods supported by
|
nsim |
Number of simulations. |
object |
The |
only.phenotyped |
For |
rmKinship |
For |
strata |
For |
type |
For |
value |
For |
... |
For |
Calling the runSimulation
method on a FAKinSumResults
object is the same as calling the kinshipSumTest
on a
FAData
object. In the first case the
simulation is performed using the trait information
stored internally in the object, while in the latter case the
trait information has to be passed to the function
call.
A call to the setter method trait<-
resets any simulation
results present in the sim
slot; the object can thus be
re-used to perform a simulation analysis using the new trait data.
The expected frequency (column "freq"
) in the result
data.frame
is NA
if the corresponding sum of kinship
coefficients reported in column "kinship_sum"
was never sampled
in the simulation. Still, a p-value can be reported.
The plotPed
function does not support drawing pedigrees for
individuals for which no simulation test has been performed. To draw a
pedigree for any individual (with or without trait information, being
affected or not in the trait) refer to the plotPed
method for
FAData
objects.
Refer to the method and function description above for detailed information on the returned result object.
FAKinSumResults
objects are created by the
kinshipSumTest
method on a
FAData
object.
Class FAData
directly.
Number of simulations.
The result of the simulation. This slot should not be accessed
directly, use the result
method to extract result information.
Plots a pedigree for one of the affected individuals in the
simulation results. The id of the selected affected individual
(specified with argument id
) is highlighted in red.
See plotPed
for more details.
Plots the distribution of kinship sums between randomly sampled
sets of individuals from the Monte Carlo simulation along with the
actually observed kinship sum for the affected individual
specified with parameter id
. For id
only affected
individuals for which the analysis has been performed are
allowed. The ids of these individuals are listed in column
"affected_id"
of the data.frame
returned by
result
.
Returns the result from the simulation as a data.frame
with
columns:
"trait_name"
: the name of the trait.
"total_phenotyped"
: total number of phenotyped individuals in
the trait.
"total_affected"
: total number of affected individuals in the
trait.
"affected_id"
: the id of the affected individual for whom
the test has been performed.
"family"
: the family id of the affected.
"ksgrp"
: Numeric identifier that specifies a group of affected
individuals related by kinship. Group assignment starts with the top
ranking individual (by padj
), NA
is assigned to those
that did not pass the threshold cutoff
supplied to result
.
If parameter rmKinship
is passed, assignment is restricted to
kinship values >rmKinship
between the top ranking individual
that founded this group and the rest. Kinship-related individuals that
have a lower kinship value will be left unassigned, therefore they may
end up in a separate group.
"kinship_sum"
: the sum of kinship values.
"freq"
: the expected frequency of the kinship sum from the
simulation.
"pvalue"
: the p-value for the significance of the kinship
sum.
"padj"
: the p-value adjusted for multiple hypothesis
testing (with the method specified with argument method
).
The returned data.frame
is sorted by column
"pvalue"
, its row names correspond to column
"affected_id"
.
Performs the simulation analysis based on the pedigree and trait
information stored in the object. Returns a
FAKinSumResults
object with the results of the simulation.
Set the trait information. This method will reset all simulation
results saved in the sim
slot.
Subsetting (using the [
operator) is not supported.
Johannes Rainer, Christian Weichenberger
FAData
,
kinship
,
trait
,
probabilityTest
,
kinshipGroupTest
,
kinshipSumTest
,
genealogicalIndexTest
,
familialIncidenceRateTest
,
fsirTest
,
plotPed
##########################
##
##  Perform the simulation analysis
##
library(FamAgg)

## Load the test data.
data(minnbreast)

## Subset to some families and generate a pedigree data.frame.
mbsub <- minnbreast[minnbreast$famid == 4 | minnbreast$famid == 5 |
                    minnbreast$famid == 6 | minnbreast$famid == 7, ]
PedDf <- mbsub[, c("famid", "id", "fatherid", "motherid", "sex")]
colnames(PedDf) <- c("family", "id", "father", "mother", "sex")

## Generate the FAData.
fad <- FAData(pedigree=PedDf)

## Specify the trait.
tcancer <- mbsub$cancer
names(tcancer) <- mbsub$id

## Perform the test:
far <- kinshipSumTest(fad, trait=tcancer, traitName="cancer",
                      nsim=1000)
head(result(far))

## Plot the pedigree for one of the affected individuals. The id of the
## affected individual is highlighted in red.
plotPed(far, id=result(far)$affected_id[1])

## Replace the trait, this will delete all simulation results
## on the existing FAKinSumResults object
tpreg <- mbsub$everpreg
names(tpreg) <- mbsub$id
trait(far) <- tpreg

## The analysis can be performed using the runSimulation method.
far <- runSimulation(far, nsim=1000)
head(result(far))

## Plot the pedigree of one of the affected; note that "affected" now
## indicates whether the individual was ever pregnant.
plotPed(far, id="9")

## Plot also the result from the simulation run.
plotRes(far, id="9")
The FAProbResults
object contains the results from the
probability test.
The probability test is only a convenience method that calls the gap
package's
method pfc.sim
to compute probabilities of familial clustering
of phenotypes [Yu and Zelterman (2002)]. One drawback of that method
is that it is limited to families with at most 22 individuals. Thus,
pedigrees need to be split with specialized software such as Jenti
[Falchi and Fuchsberger (2008)], which defines cliques within large
families that can then be used as input to this algorithm.
DEPRECATION WARNING:
Due to problems of the gap
package on MS Windows systems, this
test will be removed in the next Bioconductor release (3.8).
## S4 method for signature 'FAProbResults'
buildPed(object, id=NULL, max.generations.up=3,
         max.generations.down=16, prune=FALSE)

## S4 method for signature 'FAProbResults'
cliqueAndTrait(object, na.rm=FALSE)

## S4 method for signature 'FAProbResults'
cliques(object, na.rm=FALSE)

## S4 replacement method for signature 'FAProbResults'
cliques(object) <- value

## S4 method for signature 'FAProbResults'
plotPed(object, id=NULL, family=NULL, filename=NULL,
        device="plot", ...)

## S4 method for signature 'FAProbResults'
result(object, method="BH")

## S4 method for signature 'FAProbResults'
runSimulation(object, nsim=50000)

## S4 method for signature 'FAProbResults'
shareKinship(object, id=NULL)

## S4 replacement method for signature 'FAProbResults'
trait(object) <- value

## S4 method for signature 'FAProbResults'
traitByClique(object)
(in alphabetic order)
device |
For |
family |
For |
filename |
For |
id |
For Note: |
max.generations.down |
For |
max.generations.up |
For |
method |
The multiple hypothesis testing method. All methods supported by
|
na.rm |
Whether |
nsim |
Number of simulations. |
object |
The |
prune |
For |
value |
For |
... |
For |
Calling the runSimulation
method on a FAProbResults
object is the same as calling the probabilityTest
on a
FAData
object. In the first case the
simulation is performed using the clique and trait information
data stored internally in the object, while in the latter case the
clique and trait information have to be submitted to the function
call.
A call to the setter methods trait<-
or cliques<-
resets
any simulation results present in the sim
slot, thus, the
object can be re-used to perform a simulation analysis using the new
trait or clique data.
Refer to the method and function description above for detailed information on the returned result object.
FAProbResults
objects are created by the
probabilityTest
method of FAData
objects.
Class FAData
directly.
Number of simulations.
The result of the simulation. This slot should not be directly
accessed, use the result
method to extract result information.
A factor with the assignment of individuals to cliques. This slot
should not be accessed directly, but the cliques
and
cliques<-
methods should be used instead that ensure that
the data is matched to the information in the pedigree.
Get a data.frame
with the clique ID and the value from the
trait for each individual. If na.rm=TRUE
all rows
(individuals) with either a missing clique ID or trait value are
removed.
Builds the pedigree for the submitted id, which represents the id
of the group, i.e. the clique (e.g. column "group_id"
of
result(object)
). By default the method builds the full
pedigree for all individuals in the clique but the argument
prune
allows reducing it to the individuals of the clique.
Building a pedigree by submitting the family id is not supported.
The pedigree is returned as a data.frame
.
Returns a factor
vector representing the clique/group
assignment of the individuals. By default, the ordering
is the same as in the pedigree; setting argument na.rm=TRUE
removes all NA
values, thus the ordering and length might
differ (e.g. if some individuals are not part of any clique).
Setter for the cliques vector. Can be a named numeric vector, character
vector or factor with the names matching the ids of the
individuals in the pedigree. The method internally matches and
re-orders the cliques vector to match the ordering of the ids in
the pedigree.
In addition, the function resets any simulation results
stored in the sim
slot.
plotPed
Plots a pedigree for one of the cliques in the
simulation results. The ids of all individuals of the selected
clique are highlighted in red.
See plotPed
for more details on
the plotting and details below for additional settings.
Returns the result from the simulation as a data.frame
with
columns:
trait_name
: the name of the trait.
total_phenotyped
: the total number of phenotyped
individuals in the trait.
total_affected
: the total number of affected individuals in
the trait.
phenotyped
: the number of phenotyped individuals in
all groups (i.e. cliques specified by the clique
parameter).
affected
: the number of affected individuals in
all groups (i.e. cliques specified by the clique
parameter).
group_id
: the id of the group (clique).
family
: the id of the family in which the clique was defined.
group_phenotyped
: the number of phenotyped individuals in the
current group.
group_affected
: the number of affected individuals in the current
group.
pvalue
: the p-value from the Monte Carlo simulation.
padj
: the p-value adjusted for multiple hypothesis testing
using the method specified with argument method
.
Performs the simulation analysis based on the pedigree, trait and
clique information stored in the object. Returns a
FAProbResults
object with the results of the simulation.
Returns a character vector with the ids of all individuals that
share kinship with any of the individuals in the clique identified
by the argument id
.
Set the trait information. This method will reset all simulation
results saved in the sim
slot.
Summarize the trait information by clique. Returns a matrix
with the size of the clique (i.e. individuals in the clique with
available trait information) and count of individuals in the
clique with a trait value other than zero (column
affected_size
). The clique ids are used as rownames of the
matrix
.
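The per-clique summary described above can be approximated in base R as follows. This is a minimal sketch on made-up data: the `ct` data frame, its ids and the `size` column name are assumptions for illustration, and the code does not call the package's `traitByClique` or `cliqueAndTrait` methods.

```r
## Hypothetical clique assignment and trait values for six individuals.
ct <- data.frame(
    id = as.character(1:6),
    clique = factor(c("A", "A", "A", "B", NA, "B")),
    trait = c(1, 0, NA, 1, 0, 1)
)

## Analogous to na.rm = TRUE: drop rows lacking a clique id or trait value.
ct <- ct[!is.na(ct$clique) & !is.na(ct$trait), ]

## Per clique: number of individuals with trait information and number
## of individuals with a trait value other than zero.
groups <- split(ct$trait, ct$clique)
by_clique <- cbind(
    size = lengths(groups),
    affected_size = vapply(groups, function(z) sum(z != 0), numeric(1))
)
by_clique
```

The clique ids end up as row names of the matrix, mirroring the layout described for the summary.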
Subsetting (using the [
operator) is not supported.
Johannes Rainer, Daniel Taliun
Yu C & Zelterman D (2002) Statistical inference for familial disease clusters. Biometrics, pp 481-491
Falchi M & Fuchsberger C (2008) Jenti: an efficient tool for mining complex inbred genealogies. Bioinformatics, pp 724-726
FAData
,
buildPed
,
plotPed
,
trait
,
probabilityTest
,
kinshipGroupTest
,
kinshipSumTest
,
genealogicalIndexTest
,
familialIncidenceRateTest
,
fsirTest
The FAStdIncidenceRateResults
object contains the results from a
familial standardized incidence rate (FSIR) calculation employing in
addition Monte Carlo simulations to assess significance levels for the
individuals' FSIRs.
The FSIR weights the disease status of relatives based on their degree of relatedness with the proband (Kerber, 1995). Formally, the FSIR is defined as the standardized incidence ratio (SIR) or standardized mortality ratio in epidemiology, i.e. as the ratio between the observed and expected number of cases, except that both are additionally weighted by the degree of relatedness (i.e. kinship value) between individuals in the pedigree.
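The ratio described above can be written out in a few lines of base R. The sketch below is an illustration under assumptions, not the package's implementation: `fsir_sketch` is a hypothetical helper, the kinship matrix, trait, time in strata and incidence rates are made-up toy values, and an individual's own status is excluded by zeroing the diagonal of the kinship matrix.

```r
## Hypothetical helper: kinship-weighted observed / expected cases.
fsir_sketch <- function(kin, trait, timeInStrata, lambda) {
    diag(kin) <- 0                        # ignore the individual itself
    ## Expected number of cases per individual: sum over strata of
    ## incidence rate times time spent in that stratum.
    expected <- as.vector(timeInStrata %*% lambda)
    obs_w <- as.vector(kin %*% trait)     # weighted observed cases
    exp_w <- as.vector(kin %*% expected)  # weighted expected cases
    res <- obs_w / exp_w
    res[exp_w == 0] <- NA                 # no informative relatives
    names(res) <- rownames(kin)
    res
}

## Toy data: three mutually related individuals.
ids <- c("a", "b", "c")
kin <- matrix(0.25, 3, 3, dimnames = list(ids, ids))
trait <- c(1, 0, 1)
tis <- matrix(c(50, 0, 0,  0, 60, 40), ncol = 2,
              dimnames = list(ids, c("M", "F")))
lbda <- c(M = 0.001, F = 0.002)

fsir_toy <- fsir_sketch(kin, trait, tis, lbda)
fsir_toy
```

For individual `a` the weighted observed count is 0.25 (one affected relative) and the weighted expected count is 0.25 * 0.12 + 0.25 * 0.08 = 0.05, giving a ratio of 5.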
## S4 method for signature 'FAStdIncidenceRateResults'
fsir(object, trait=NULL, lambda=NULL, timeInStrata=NULL, ...)

## S4 method for signature 'FAStdIncidenceRateResults'
lambda(object, ...)

## S4 method for signature 'FAStdIncidenceRateResults'
plotPed(object, id=NULL, family=NULL, filename=NULL,
        device="plot", only.phenotyped=FALSE, ...)

## S4 method for signature 'FAStdIncidenceRateResults'
plotRes(object, id=NULL, family=NULL, addLegend=TRUE,
        type="density", ...)

## S4 method for signature 'FAStdIncidenceRateResults'
result(object, method="BH")

## S4 method for signature 'FAStdIncidenceRateResults'
resultForId(object, id=NULL)

## S4 method for signature 'FAStdIncidenceRateResults'
runSimulation(object, nsim=50000, lambda=NULL,
              timeInStrata=NULL, strata=NULL, ...)

## S4 method for signature 'FAStdIncidenceRateResults'
timeInStrata(object)

## S4 replacement method for signature 'FAStdIncidenceRateResults'
trait(object) <- value
(in alphabetic order)
addLegend |
For |
device |
For |
family |
For |
filename |
For |
id |
For For |
lambda |
Numeric vector with the incidence rates per stratum from the
population. The length of this vector has to match the number of
columns of argument For |
method |
The multiple hypothesis testing method. All methods supported by
|
nsim |
Number of simulations. |
object |
The |
only.phenotyped |
For |
strata |
For |
timeInStrata |
For For |
trait |
For |
type |
For |
value |
For |
... |
For For |
Monte Carlo simulation and empirical p-value estimation: see details
of FAIncidenceRateResults
as the concept and calculation
is essentially identical.
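As a rough illustration of what an empirical Monte Carlo p-value means in this setting, the base R sketch below permutes the trait among individuals and counts how often the simulated statistic reaches the observed one. Everything here is an assumption for illustration: the statistic (a kinship-weighted count of affected relatives), the toy pedigree and the exact p-value formula are not taken from the package.

```r
set.seed(123)

## Toy kinship matrix: a-b and c-d are related pairs.
ids <- c("a", "b", "c", "d")
kin <- matrix(0, 4, 4, dimnames = list(ids, ids))
kin["a", "b"] <- kin["b", "a"] <- 0.25
kin["c", "d"] <- kin["d", "c"] <- 0.25
trait <- c(a = 1, b = 1, c = 0, d = 0)

## Per-individual statistic: kinship-weighted number of affected relatives.
stat <- function(tr) as.vector(kin %*% tr)
obs <- stat(trait)

## Permute the trait nsim times; each column holds one simulation.
nsim <- 1000
sims <- replicate(nsim, stat(sample(trait)))

## Empirical p-value: share of simulations at least as large as observed.
pvals <- rowMeans(sims >= obs)
names(pvals) <- ids
pvals
```

Individuals `c` and `d` have an observed statistic of 0, so every simulation is at least as large and their p-value is 1; for `a` and `b` the p-value fluctuates around 0.5 with the random seed.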
Calling the runSimulation
method on a FAStdIncidenceRateResults
object is the same as calling the
fsirTest
on a
FAData
object. In the first case the simulation
is performed using the trait information data stored internally in the
object, while in the latter case the trait information has to be
submitted to the function call.
If argument strata
is provided, stratified random sampling is
performed. See example below and the details section in
PedigreeAnalysis
for more details.
The FSIR can also be directly calculated, without
simulation, using the fsir
method of a
FAData
object.
A call to the setter methods trait<-
resets any simulation
results present in the sim
slot, thus, the object can be
re-used to perform a simulation analysis using the new trait data.
By default the Monte Carlo p-value estimation in the
runSimulation
method is quite memory demanding. For very large
pedigrees the optional argument lowMem=TRUE
might be passed to
the method which results in faster and less memory demanding
calculations. This will however disable the plotRes
method on
the resulting FAStdIncidenceRateResults
as the distribution of
FSIR from the simulation runs is no longer reported.
Note: the FSIR for singletons and individuals that do not share kinship
with at least one other phenotyped individual that has also a valid
value in argument timeInStrata
(and, if provided, strata
)
will be NA
.
Refer to the method and function description above for detailed information on the returned result object.
FAStdIncidenceRateResults
objects are created by the
fsirTest
method on a
FAData
object.
Class FAData
directly.
Number of simulations.
The result of the simulation. This slot should not be accessed
directly, use the result
method to extract result
information.
Numeric matrix specifying the time at risk in each strata
(represented by the columns) of each individual (rows). Use the
accessor method timeInStrata
to
extract this data.
Numeric vector with the incidence rates per stratum from the population.
Access various results and data stored in the object. name
can be fsir
, pvalue
, lambda
or
timeInStrata
to access the FSIR, the (raw) p-value from the
simulation analysis, the lambda or the time in strata.
Returns the FSIR values calculated by a call to
the runSimulation
method or
fsir
method on a
FAData
object. In contrast to that latter method,
which directly calculates the values, this method returns the
values from a calculation stored inside the
FAStdIncidenceRateResults
object.
The method returns a named numeric vector with the familial standardized
incidence rates, the names being the IDs of the individuals.
Singletons as well as individuals
that, after removing non-phenotyped individuals or individuals
without time at risk, do not share kinship with any other
individual in the pedigree have a value of NA
.
Plots a pedigree for one of the affected individuals in the
simulation results. The id of the selected affected individual
(specified with argument id
) is highlighted in red. The
FSIR value for each individual is drawn below
the individual's id.
See plotPed
for more details.
Plots the distribution of expected FSIR calculated for the selected individuals from Monte Carlo simulations along with the actually observed FSIR.
Returns the result from the simulation as a data.frame
with
columns:
"trait_name"
: the name of the trait.
"total_phenotyped"
: total number of phenotyped individuals in
the trait.
"total_affected"
: total number of affected individuals in the
trait.
"total_tested"
: the number of individuals in the pedigree
considered for the simulation. This corresponds to all individuals
with valid, non-NA values in trait, timeInStrata and, if provided,
strata.
"id"
: the id of the individual.
"family"
: the family id.
"fsir"
: the familial standardized incidence rate. Note that this will be
NA
for all non-phenotyped individuals and singletons in the pedigree as well as for
individuals that do not share kinship with at least one other
phenotyped individual with valid time at risk (or valid value in
parameter strata
).
"pvalue"
: the p-value for the significance of the familial
standardized incidence rate assessed by Monte Carlo simulations.
"padj"
: the p-value adjusted for multiple hypothesis
testing (with the method specified with argument method
).
The returned data.frame
is sorted by column
"pvalue"
, its row names correspond to column
"id"
.
Extracts results information for a given individual. The method
returns a list
with elements "id"
, "fsir"
,
"pvalue"
, "timeInStrata"
and "lambda"
with
the ID of the individual, the FSIR and corresponding p-value
estimated by the simulation, the row from the timeInStrata
matrix of the individual and the lambda
(incidence rates
from the population).
Performs the simulation analysis based on the pedigree and trait
information stored in the object as well as the time in strata
provided with argument timeInStrata
.
Returns a FAStdIncidenceRateResults
object with the results
from the simulation.
Set the trait information. This method will reset all simulation
results saved in the sim
slot.
Subsetting (using the [
operator) is not supported.
Johannes Rainer
Kerber, R.A. (1995) Method for calculating risk associated with family history of a disease. Genet Epidemiol, pp 291–301.
FAData
,
kinship
,
trait
,
probabilityTest
,
kinshipGroupTest
,
kinshipSumTest
,
genealogicalIndexTest
,
familialIncidenceRateTest
,
fsirTest
,
plotPed
,
estimateTimeAtRisk
##########################
##
##  Perform the simulation analysis
##
## Load the test data.
data(minnbreast)

## Subset to some families and generate a pedigree data.frame.
mbsub <- minnbreast[minnbreast$famid == 4 | minnbreast$famid == 5 |
                    minnbreast$famid == 6 | minnbreast$famid == 7 |
                    minnbreast$famid == 411, ]
PedDf <- mbsub[, c("famid", "id", "fatherid", "motherid", "sex")]
colnames(PedDf) <- c("family", "id", "father", "mother", "sex")

## Generate the FAData.
fad <- FAData(pedigree=PedDf)

## Specify the trait.
tcancer <- mbsub$cancer
names(tcancer) <- mbsub$id

## Next we have to specify lambda and the timeInStrata matrix.
## For lambda we use information from Cancer Research UK:
## New breast cancer cases in females per year and 100000: 155.3
## New breast cancer cases in males per year and 100000: 1.1
## New prostate cancer cases in females per year and 100000: 0
## New prostate cancer cases in males per year and 100000: 134.3
lbda <- c(M=(1.1+134.4)/100000, F=155.3/100000)

## Next we need the time at risk of each individual in each strata.
## For strata we use male and females, for the time at risk we use
## column "endage":
stratMat <- factor2matrix(fad$sex)

## Next we have to multiply that with the endage, since each person
## spent that time "at risk" to get cancer.
stratMat <- stratMat * mbsub$endage

## Running the simulation
fsirs <- fsirTest(fad, trait=tcancer, traitName="cancer", lambda=lbda,
                  timeInStrata=stratMat, nsim=500)

## Showing some of the results
head(result(fsirs))

## Extract the FSIR
head(fsirs$fsir)

## We can also directly access the p-values from the simulation,
## these are however the raw, unadjusted p-values.
head(fsirs$pvalue)

## Get the time in strata for each individual
head(fsirs$timeInStrata)

## Plot the pedigree for a family with significant FSIRs.
## The numbers below the IDs of the individuals represent the actual
## FSIR values.
plotPed(fsirs, family=result(fsirs)$family[1])

## Plot also the result from the simulation run.
plotRes(fsirs, id=result(fsirs)$id[1])

## Extract the data and result for an individual
resultForId(fsirs, id="16424")
These functions define sets of (optionally matched) control
individuals for one or more given individuals using pedigree data. By
default, controls from the same family as the specified
individual(s) are returned.
They are used e.g. in the genealogicalIndexTest
method.
## S4 method for signature 'data.frame'
getAll(object, id=NULL, ...)

## S4 method for signature 'data.frame'
getExternalMatched(object, id=NULL, match.using, ...)

## S4 method for signature 'data.frame'
getGenerationMatched(object, id=NULL, include.anc=0, include.off=0, ...)

## S4 method for signature 'data.frame'
getGenerationSexMatched(object, id=NULL, include.anc=0, include.off=0, ...)

## S4 method for signature 'data.frame'
getSexMatched(object, id=NULL, ...)
(in alphabetic order)
id |
A character vector with ids of individuals for whom matched control individuals (from the same family) should be defined. |
include.anc |
For |
include.off |
For |
match.using |
For |
object |
Either a |
... |
Additional arguments; not used at present. |
Refer to the method and function description above for detailed information on the returned result object.
Simply returns the ids of all individuals in the family
(i.e. individuals with the same family id in the pedigree) of the
specified ids in id
.
Returns a list with the ids (of type character
) of the
controls. The names of the list correspond to the family id.
Returns the ids of matched individuals from the same family for
the specified ids in id
.
The match.using
vector is used for the matching, i.e. the
function first extracts the values for the individuals in
id
from match.using
and returns the ids of all
individuals whose value in match.using
matches the value of
the individuals specified by id
. Individuals with a missing
value in match.using
are excluded.
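The matching logic described above can be sketched in base R as shown below. This is only an illustration under assumptions: `external_matched_sketch` is a hypothetical helper working on a made-up pedigree, and whether the query individual itself is part of the returned set is a choice of this sketch rather than documented package behaviour.

```r
## Hypothetical pedigree with two families.
ped <- data.frame(
    family = c("1", "1", "1", "1", "2", "2"),
    id = c("1", "2", "3", "4", "5", "6"),
    stringsAsFactors = FALSE
)
## Named vector used for matching; names are individual ids.
match_using <- c(`1` = "M", `2` = "F", `3` = "M", `4` = NA,
                 `5` = "M", `6` = "M")

external_matched_sketch <- function(ped, id, match.using) {
    fams <- unique(ped$family[ped$id %in% id])
    res <- lapply(fams, function(f) {
        members <- ped$id[ped$family == f]
        ## Values of the query individuals from this family.
        wanted <- match.using[intersect(id, members)]
        wanted <- unique(wanted[!is.na(wanted)])
        ## Keep family members with a matching, non-missing value.
        vals <- match.using[members]
        members[!is.na(vals) & vals %in% wanted]
    })
    names(res) <- fams
    res
}

matched <- external_matched_sketch(ped, id = "1", match.using = match_using)
matched
```

Individual `4` is dropped because its value in the matching vector is missing, and individuals from family `2` are never considered because controls are drawn from the same family only.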
Returns a list with the ids (of type character
) of the
controls. The names of the list correspond to the family id.
Returns the ids of individuals matched by the generation of the
individuals in id
. The function returns the ids of all
individuals from the same generation(s) as the individuals in
id
. The arguments include.anc
and include.off
can be used to increase the range of generations from which
individuals are selected.
Returns a list with the ids (of type character
) of the
controls. The names of the list correspond to the family id.
Same as getGenerationMatched
, but matches in addition
individuals by sex (see getSexMatched
).
Returns a list with the ids (of type character
) of the
controls. The names of the list correspond to the family id.
Returns ids of individuals from the same family matching the sex
of the individuals in id
. Individuals with sex being
NA
are excluded. The result is essentially identical to that of
getAll
if id
contains ids of male and female
individuals.
Returns a list with the ids (of type character
) of the
controls. The names of the list correspond to the family id.
Johannes Rainer.
pedigree
,
FAData
,
FAProbResults
,
FAKinGroupResults
,
FAKinSumResults
,
FAGenIndexResults
,
genealogicalIndexTest
##########################
##
##  Defining a small pedigree
##
## Load the Minnesota Breast Cancer record and subset to the
## first families.
data(minnbreast)
mbsub <- minnbreast[minnbreast$famid==4 | minnbreast$famid==5, ]
mbped <- mbsub[, c("famid", "id", "fatherid", "motherid", "sex")]

## Renaming column names
colnames(mbped) <- c("family", "id", "father", "mother", "sex")

## Get ids of individuals from the same generation as "4"
getGenerationMatched(mbped, id="4")

## Get the ids of individuals of the same sex and from the same
## generation as "4" and "22"
getGenerationSexMatched(mbped, id=c("4", "22"))

## Generate a FAData
fad <- FAData(mbped)

## Plot the pedigree so we can evaluate the results of the function
plotPed(fad, family="4")

## We're using getExternalMatched on the sex of the individuals
## in the pedigree, thus the results are identical to the
## getSexMatched function.
## Extracting the sex using $sex returns a named vector just
## as we need for getExternalMatched.
head(fad$sex)
getExternalMatched(fad, id="4", match.using=fad$sex)
getSexMatched(fad, id="4")
The kinshipPairs
function extracts pairs of individuals matching
a user-defined kinship condition (e.g. individuals with a kinship larger
than 0.0625). Such sets of paired individuals (along with paired unrelated
values) would enable a familial resemblance analysis on quantitative
traits (Ziegler 2010) (see examples below for details).
By default, kinshipPairs
returns all pairs of individuals for which the
condition
on the kinship matrix matches (e.g. all pairs of individuals with
a kinship coefficient larger than or equal to 0.25). Individuals can thus
be reported multiple times (see examples below). Parameter duplicates
can
be used to define a strategy to avoid such duplicated IDs. Supported are:
duplicates = "keep"
: the default, return all values.
duplicates = "first"
: report only the first pair of individuals for each
individual ID.
duplicates = "last"
: report only the last pair of individuals for each
individual ID.
duplicates = "random"
: randomly select one pair of individuals for
each individual ID.
For any setting other than duplicates = "keep"
each individual will
only be listed once in the resulting matrix.
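To make the idea concrete, the base R sketch below extracts pairs from a toy kinship matrix with a user-defined condition and then thins them so that no individual is listed twice. It is an illustration under assumptions only: `pairs_sketch` and `first_only` are hypothetical helpers, the kinship values are made up, and the greedy thinning is just one possible reading of `duplicates = "first"`, not the package's algorithm.

```r
## Toy kinship matrix for four individuals.
ids <- c("a", "b", "c", "d")
kin <- matrix(0, 4, 4, dimnames = list(ids, ids))
diag(kin) <- 0.5
kin["a", "b"] <- kin["b", "a"] <- 0.25
kin["a", "c"] <- kin["c", "a"] <- 0.25
kin["b", "c"] <- kin["c", "b"] <- 0.125

## All pairs whose kinship fulfils the condition; the upper triangle
## ensures each pair is reported once and self-pairs are skipped.
pairs_sketch <- function(kin, condition = function(x) x >= 0.25) {
    hit <- condition(kin) & upper.tri(kin)
    idx <- which(hit, arr.ind = TRUE)
    cbind(rownames(kin)[idx[, "row"]], colnames(kin)[idx[, "col"]])
}

## Greedy thinning (assumption): keep a pair only if neither of its
## individuals was already used by an earlier pair.
first_only <- function(prs) {
    seen <- character()
    keep <- logical(nrow(prs))
    for (i in seq_len(nrow(prs))) {
        if (!any(prs[i, ] %in% seen)) {
            keep[i] <- TRUE
            seen <- c(seen, prs[i, ])
        }
    }
    prs[keep, , drop = FALSE]
}

all_pairs <- pairs_sketch(kin)
all_pairs
first_only(all_pairs)
```

In the toy data individual `a` occurs in both pairs that meet the condition, so the thinned result keeps only the first of them.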
kinshipPairs(
  x,
  condition = function(x) x >= 0.25,
  duplicates = c("keep", "first", "last", "random"),
  id = NULL,
  family = NULL
)
x |
A |
condition |
A |
duplicates |
|
id |
optional identifiers of subsets of individuals on which the pairs
should be defined. Defaults to |
family |
optional family identifiers if pairs should only be defined for
selected families. Defaults to |
A two column matrix
with the IDs (colnames/rownames of the kinship matrix
or as defined in x$id
) of the pairs. If duplicates
is either "first"
,
"last"
or "random"
each ID is only returned once (i.e. no ID is reported
more than one time).
Johannes Rainer
Ziegler A., Koenig I. R. (2010). Familiality, Heritability, and Segregation Analysis. In A Statistical Approach to Genetic Epidemiology: With Access to E-Learning Platform by Friedrich Pahlke, Second Edition. doi:10.1002/9783527633654.ch6.
PedigreeUtils for other pedigree utility functions.
##########################
##
##  Create a new FAData object
##
## Load the Minnesota Breast Cancer record and subset to the
## first families.
data(minnbreast)
mbsub <- minnbreast[minnbreast$famid %in% 1:20, ]
mbped <- mbsub[, c("famid", "id", "fatherid", "motherid", "sex")]

## Renaming column names
colnames(mbped) <- c("family", "id", "father", "mother", "sex")

## Defining the optional argument age.
Age <- mbsub$endage
names(Age) <- mbsub$id

## Create the object
fad <- FAData(pedigree=mbped, age=Age)

## Getting all pairs of individuals with a kinship coefficient >= 0.25
## keeping all duplicates
rel_pairs <- kinshipPairs(fad)
head(rel_pairs)

## As we see, individual 1 etc. is listed multiple times.
## For an actual correlation analysis it would be better to drop duplicates.
## Below we randomly select individual pairs if they occur multiple times
rel_pairs <- kinshipPairs(fad, duplicates = "random")
head(rel_pairs)

## In addition we extract pairs of individuals that are much less related.
## For this example we consider all individuals with a kinship
## coefficient < 0.03125 (second cousin) to be *unrelated*.
unrel_pairs <- kinshipPairs(fad, duplicates = "random",
                            condition = function(z) z < 0.03125)
head(unrel_pairs)

## For a familial resemblance analysis we can now calculate the correlation
## coefficient of a quantitative trait between pairs of related individuals
## and compare that with the correlation coefficient calculated on unrelated
## individuals. For our toy example we use the participant's age, since we
## don't have any other quantitative values available.
cor_rel <- cor(age(fad)[rel_pairs[, 1]], age(fad)[rel_pairs[, 2]],
               use = "pairwise.complete.obs")
cor_unrel <- cor(age(fad)[unrel_pairs[, 1]], age(fad)[unrel_pairs[, 2]],
                 use = "pairwise.complete.obs")
cor_rel
cor_unrel

## We don't see a clear difference in the correlation, thus, the age (as
## expected) has no familial component.
Various functions to perform pedigree analyses and to investigate familial clustering of e.g. cancer cases.
binomialTest(object, trait, traitName, global = FALSE, prob = NULL,
    alternative = c("greater", "less", "two.sided"))

estimateTimeAtRisk(startDate=NULL, startDateFormat="%Y-%m-%d",
    endDate=NULL, endDateFormat="%Y-%m-%d",
    incidenceDate=NULL, incidenceDateFormat="%Y-%m-%d",
    deathDate=NULL, deathDateFormat="%Y-%m-%d",
    allowNegative=FALSE, affected=NULL, incidenceSubtract=0.5)

factor2matrix(x)

## S4 method for signature 'FAData'
familialIncidenceRate(object, trait=NULL, timeAtRisk=NULL)

## S4 method for signature 'FAData'
familialIncidenceRateTest(object, trait=NULL, nsim=50000, traitName=NULL,
    timeAtRisk=NULL, strata=NULL, ...)

## S4 method for signature 'FAData'
fsir(object, trait=NULL, lambda=NULL, timeInStrata=NULL)

## S4 method for signature 'FAData'
fsirTest(object, trait=NULL, nsim=50000, traitName=NULL, lambda=NULL,
    timeInStrata=NULL, strata=NULL, ...)

## S4 method for signature 'FAData'
genealogicalIndexTest(object, trait, nsim=50000, traitName,
    perFamilyTest=FALSE, controlSetMethod="getAll",
    rm.singletons=TRUE, strata=NULL, ...)

## S4 method for signature 'FAData'
kinshipGroupTest(object, trait, nsim=50000, traitName, strata=NULL, ...)

## S4 method for signature 'FAData'
kinshipSumTest(object, trait, nsim=50000, traitName, strata=NULL, ...)

## S4 method for signature 'FAData'
probabilityTest(object, trait, cliques, nsim=50000, traitName, ...)

sliceAge(x, slices=c(0, 40, Inf))
(in alphabetic order)
affected |
For |
allowNegative |
For |
alternative |
For |
cliques |
A named numeric or character vector or factor with the names corresponding to ids of the individuals in the pedigree. The ids will be internally matched and sub-set to the ids available in the pedigree. |
controlSetMethod |
For |
deathDate |
For |
deathDateFormat |
For |
endDate |
For |
endDateFormat |
For |
global |
For |
incidenceDate |
For |
incidenceDateFormat |
For |
incidenceSubtract |
For |
lambda |
Numeric vector with the incidence rates per stratum from the
population. The length of this vector has to match the number of
columns of argument |
nsim |
The number of simulations. |
object |
The |
perFamilyTest |
For |
prob |
For |
rm.singletons |
For |
slices |
For |
startDate |
For |
startDateFormat |
For |
strata |
For |
timeAtRisk |
A numeric vector specifying the time at risk for each
individual. The definition for this variable is taken from Kerber
(1995). See description of the method below for more information.
|
timeInStrata |
For |
trait |
A named numeric vector (values If trait is not specified, the trait information stored within the
|
traitName |
The name of the trait (optional). |
x |
For |
... |
For For |
Stratified sampling: some of the familial aggregation methods support stratified sampling for the Monte Carlo simulations. In stratified sampling, the same number of random samples is selected within each class/stratum as there are affected individuals in that stratum. For example, if 5 female and 2 male individuals are affected in the analysed trait and sex-stratified sampling is performed, each permutation randomly selects 5 females and 2 males.
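The stratified sampling described above can be sketched in base R. This is only an illustration under made-up data, not the code used by the package; the helper draw_stratified and all values are hypothetical.

```r
## Toy data (made up): sex of 20 individuals and the ids of the affected.
set.seed(123)
sex <- rep(c("F", "M"), times = c(12, 8))
names(sex) <- as.character(seq_along(sex))
affected <- c("1", "2", "3", "4", "5", "13", "14")   # 5 females, 2 males

## Number of affected individuals per stratum.
n_per_stratum <- lengths(split(affected, sex[affected]))

## Hypothetical helper for one simulation step: within each stratum draw
## as many individuals as there are affected in that stratum.
draw_stratified <- function(strata, n_per_stratum) {
    unlist(lapply(names(n_per_stratum), function(s) {
        sample(names(strata)[strata == s], size = n_per_stratum[[s]])
    }))
}

sim_ids <- draw_stratified(sex, n_per_stratum)
## The simulated set has the same composition as the affected set.
table(sex[sim_ids])
```

Each simulated set thus preserves the sex distribution of the affected individuals, so differences between observed and simulated statistics cannot be driven by that stratum alone.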
A note on singletons: for all per-individual measures, unconnected individuals within the pedigree are automatically excluded from the calculations as no kinship based statistic can be estimated for them since they do, by definition, not share kinship with any other individual in the pedigree.
Refer to the method and function description above for detailed information on the returned result object.
Evaluate whether the number of affected in a trait is higher than expected by chance using a simple binomial test. In contrast to most other methods presented here, this does not use the kinship between affected individuals, but simply performs a binomial test for each family considering the number of affected within the family, the size of the family and the global probability of being affected. The latter is by default calculated on the data set (ratio between the total number of affected in the pedigree and the total number of phenotyped individuals), but can also be specified with the prob argument.
The test is performed using the binom.test function.
The function returns a FABinTestResults object.
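The per-family binomial test described above can be sketched with base R alone. The data are made up, and the code only illustrates the idea (global probability estimated from the data, one binom.test call per family); it is not the implementation of binomialTest.

```r
## Toy data (made up): family id and affected status
## (1 = affected, 0 = unaffected, NA = not phenotyped).
family <- c(rep("A", 6), rep("B", 8), rep("C", 6))
trait  <- c(1, 1, 1, 0, 1, NA,
            0, 0, 1, 0, 0, 0, 0, 0,
            0, 1, 0, 0, NA, 0)

phenotyped <- !is.na(trait)
## Global probability of being affected, estimated from the data.
prob <- sum(trait[phenotyped]) / sum(phenotyped)

## One binomial test per family: more affected than expected by chance?
pvals <- vapply(split(trait[phenotyped], family[phenotyped]), function(z) {
    binom.test(x = sum(z), n = length(z), p = prob,
               alternative = "greater")$p.value
}, numeric(1))

## Adjust for multiple hypothesis testing (Benjamini-Hochberg).
data.frame(family = names(pvals), pvalue = pvals,
           padj = p.adjust(pvals, method = "BH"))
```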
Calculate the familial incidence rate (FIR, or FR) as defined in [Kerber 1995], formula (3). The FIR is an estimate for the risk per gene-time for each individual for a certain disease (trait) given the disease experience in the cohort. The measure considers the kinship of each individual with any affected individual in the pedigree and the time at risk for each individual.
Internally, the function first excludes individuals from the test which have a missing value (NA) either in the argument trait or in the argument timeAtRisk. Next, the thus reduced pedigree is further cleaned by removing all resulting singletons (i.e. individuals that do not share kinship with any other individual in the reduced data set).
The method returns a vector with the FIR value for each individual. Individuals that were excluded from the test as described above have an FIR of NA.
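As a rough illustration of the measure, the sketch below computes a kinship-weighted incidence rate in base R. It assumes that the FIR of individual i has the form sum_j(k_ij * d_j) / sum_j(k_ij * t_j) over all other individuals j, with k the kinship, d the disease status and t the time at risk; this form is an assumption made for the example and should be checked against Kerber (1995). The kinship matrix and all values are made up.

```r
## Toy kinship matrix (made up): 1 and 2 are the parents of the
## siblings 3 and 4.
ids <- c("1", "2", "3", "4")
kin <- matrix(c(0.50, 0.00, 0.25, 0.25,
                0.00, 0.50, 0.25, 0.25,
                0.25, 0.25, 0.50, 0.25,
                0.25, 0.25, 0.25, 0.50),
              nrow = 4, byrow = TRUE, dimnames = list(ids, ids))
trait      <- c("1" = 1, "2" = 0, "3" = 0, "4" = 1)      # disease status
timeAtRisk <- c("1" = 60, "2" = 70, "3" = 40, "4" = 35)  # e.g. in years

## Do not count the kinship of an individual with itself.
diag(kin) <- 0

## Assumed form: kinship-weighted number of cases divided by the
## kinship-weighted time at risk.
fir <- as.vector(kin %*% trait) / as.vector(kin %*% timeAtRisk)
names(fir) <- ids
fir
```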
Calculates the familial incidence rate for each individual and in addition assesses the significance of these based on Monte Carlo simulations. See FAIncidenceRateResults for more details.
The method returns a FAIncidenceRateResults object.
Calculate the familial standardized incidence rate (FSIR) as defined in [Kerber, 1995], formula (4). The FSIR weights the disease status of relatives based on their degree of relatedness with the proband [Kerber, 1995]. Formally, the FSIR is defined like the standardized incidence ratio (SIR) or standardized mortality ratio in epidemiology, i.e. as the ratio between the observed and expected number of cases, except that both are additionally weighted by the degree of relatedness (i.e. kinship value) between individuals in the pedigree.
Similar to familialIncidenceRate, the function excludes individuals with missing values in any of the arguments trait, timeInStrata (and optionally strata) and all individuals that do not share any kinship with any other individual in the pedigree after removing the above individuals.
The method returns a vector with the FSIR value for each individual. Individuals excluded as described above have an FSIR value of NA.
Calculates the familial standardized incidence rate (FSIR) for each individual and in addition assesses the significance of these based on Monte Carlo simulations. See FAStdIncidenceRateResults for more details.
The method returns a FAStdIncidenceRateResults object.
Performs the genealogical index analysis from [Hill 1980] (also known as the genealogical index of familiality or genetic index of familiality) to identify familial clustering of traits (e.g. cancers etc).
This test calculates the mean kinship among affected individuals in a pedigree along with mean kinships of equal sized random control sets drawn from the pedigree. The distribution of average kinship values among these random sets is used to estimate the probability that the observed mean kinship between the affected individuals is due to chance.
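The principle of the test can be sketched in base R on a toy kinship matrix. The helper mean_kinship, the kinship values and the way the empirical p-value is computed are assumptions made for this illustration; they are not taken from the package.

```r
set.seed(42)
## Toy pedigree (made up): two unrelated nuclear families, individuals
## 1-4 and 5-8, each with two parents and two children.
fam_kin <- matrix(c(0.50, 0.00, 0.25, 0.25,
                    0.00, 0.50, 0.25, 0.25,
                    0.25, 0.25, 0.50, 0.25,
                    0.25, 0.25, 0.25, 0.50), nrow = 4, byrow = TRUE)
ids <- as.character(1:8)
kin <- matrix(0, 8, 8, dimnames = list(ids, ids))
kin[1:4, 1:4] <- fam_kin
kin[5:8, 5:8] <- fam_kin

## Hypothetical helper: mean pairwise kinship within a set of individuals.
mean_kinship <- function(kin, ids) {
    sub <- kin[ids, ids]
    mean(sub[upper.tri(sub)])
}

affected <- c("1", "3", "4")
observed <- mean_kinship(kin, affected)

## Mean kinship of equally sized random control sets.
nsim <- 1000
simulated <- replicate(nsim, {
    mean_kinship(kin, sample(ids, length(affected)))
})

## Empirical p-value: fraction of control sets at least as closely related.
pvalue <- mean(simulated >= observed)
c(observed = observed, pvalue = pvalue)
```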
The controlSetMethod argument specifies the method used to define sets of matched control individuals in a pedigree or family.
Note that by default singletons (i.e. unconnected individuals in the pedigree) are removed from the pedigree prior to the analysis. Set rm.singletons=FALSE if you do not want them to be removed.
The method can also be performed separately for each family within the larger pedigree (perFamilyTest=TRUE) to evaluate the presence of clustered affected within each family. In this case it is also possible to use controlSetMethod="getGenerationMatched" or controlSetMethod="getGenerationSexMatched", which draw random control samples from the same generation(s) as the affected.
Stratified random sampling can be performed with the strata argument. See details for more information.
The function returns a FAGenIndexResults object.
Performs a familial aggregation test on a subset of a family. The idea behind this test is to narrow down the set of controls for each affected individual by considering only individuals that are as closely related as the most distant affected individual. This strategy takes the family structure of the cases into account more strongly and is meant to be an alternative to the kinshipSumTest method.
Initially, for an affected individual i a group C(i) is created that contains all individuals that share kinship as far as the most distantly related affected individual. This cluster can be interpreted as a circle that is centered at individual i with radius equal to the most distantly related case. Therefore, the cluster defines a narrowed, individual-specific set of individuals in which the phenotype is assumed to have been passed on. Groups consisting of the same set of affected individuals are reduced to a single group (i.e. the group with the smallest total number of individuals).
Based on this definition of groups C(i), we compute two statistics by performing Monte Carlo simulations (which optionally allow to perform stratified random sampling). During each simulation step affected cases are randomly sampled from the population.
1. The ratio test counts per group C(i) the number of times we observe a higher number of affected individuals in the simulation than in the observed case. Dividing this number by the number of simulation steps yields immediately the p-value that describes the event to observe by chance a higher number of affected individuals than in the given case.
2. The kinship test addresses the degree of relatedness within the simulated set by a counting method where we count the number of times in a simulation step there is a pair of affected individuals that are more closely related than in the observed group C(i). In case the closest degree of relatedness is equal in both the simulation step and the observed case, we look at the number of pairs found in both and count it if this number is higher in the simulation step. Again, dividing this count by the number of simulation steps readily yields a p-value.
See also the method runSimulation for FAKinGroupResults.
The function returns a FAKinGroupResults object.
Performs a test for familial aggregation based on the sum of kinship values between affected cases. This test highlights individuals that exhibit a higher than chance relationship to other affected individuals, therefore highlighting individuals within families aggregating the phenotype. To achieve this, for each affected individual the sum of kinship values to all other affected cases is computed. In a Monte Carlo simulation this is repeated with the same number of cases (and optionally stratified with the strata argument), and the resulting background distribution is used to compute p-values for the kinship sums obtained from the observed cases. See also the method runSimulation for FAKinSumResults.
The function returns a FAKinSumResults object.
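The kinship sum statistic and its simulated background can be sketched in base R as follows. The toy kinship matrix, the helper kin_sum and the pooling of all simulated sums into a single background distribution are assumptions made for this illustration, not a description of the package internals.

```r
set.seed(1)
## Toy pedigree (made up): two unrelated nuclear families, 1-4 and 5-8.
fam_kin <- matrix(c(0.50, 0.00, 0.25, 0.25,
                    0.00, 0.50, 0.25, 0.25,
                    0.25, 0.25, 0.50, 0.25,
                    0.25, 0.25, 0.25, 0.50), nrow = 4, byrow = TRUE)
ids <- as.character(1:8)
kin <- matrix(0, 8, 8, dimnames = list(ids, ids))
kin[1:4, 1:4] <- fam_kin
kin[5:8, 5:8] <- fam_kin
diag(kin) <- 0   # ignore the kinship of an individual with itself

## Hypothetical helper: per individual, the sum of kinship values to all
## other members of the set.
kin_sum <- function(kin, ids) rowSums(kin[ids, ids, drop = FALSE])

affected <- c("1", "3", "4", "6")
observed <- kin_sum(kin, affected)

## Background distribution from random sets with the same number of cases.
nsim <- 1000
background <- unlist(lapply(seq_len(nsim), function(i) {
    kin_sum(kin, sample(ids, length(affected)))
}))

## Empirical p-value for each affected individual.
pvalues <- vapply(observed, function(z) mean(background >= z), numeric(1))
cbind(kinship_sum = observed, pvalue = pvalues)
```

In this toy example individual 6 is not related to any other affected individual, so its kinship sum is 0 and it is not highlighted.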
DEPRECATED: this test will be removed in Bioconductor version 3.8 due to problems and incompatibilities of the gap package on MS Windows systems.
This is only a convenience method that calls the gap package's method pfc.sim to compute probabilities of familial clustering of phenotypes [Yu and Zelterman (2002)]. One drawback of that method is that it is limited to families with at most 22 individuals. Thus, pedigrees need to be split with specialized software such as Jenti [Falchi and Fuchsberger ea. (2008)], which defines cliques within large families that can then be used as input to this algorithm.
See also method runSimulation for FAProbResults.
The function returns a FAProbResults object.
Converts a factor into a matrix with columns corresponding to the levels and values (cell row i, column j) being either 0 or 1 depending on whether the ith element of the factor was of level j. See examples below or in FAStdIncidenceRateResults.
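The conversion can be mimicked in base R as shown below; this is a sketch of the described behaviour on a made-up factor, not the code of factor2matrix.

```r
## Made-up factor with two levels.
sex <- factor(c("M", "F", "F", "M"))

## One column per level; a cell is 1 if the element has that level.
m <- vapply(levels(sex), function(l) as.numeric(sex == l),
            numeric(length(sex)))
m
```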
Function to calculate the time at risk based on the start date of the study or the birth date of an individual (startDate) and the study's end date (endDate), the date of an incidence (e.g. date of diagnosis of a cancer, incidenceDate) or the death of the individual (deathDate). The time at risk for each individual is calculated as the minimal time period between startDate and any of endDate, incidenceDate or deathDate. Thus it is also possible to provide just the endDate along with the startDate, in which case the endDate should be the earliest time point of: end date of the study, incidence date or date of death.
For affected individuals (those for which either an incidence date is provided or the value in the optional argument affected is TRUE or bigger than 0), by default half of the time unit is subtracted. For example, an individual that has an incidence after 2 days is 1.5 days at risk. The proportion of the time unit to subtract can be specified with the argument incidenceSubtract.
The function returns a numeric vector with the time at risk in days.
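The calculation described above can be reproduced with base R date arithmetic. The sketch uses made-up dates, ignores death dates, and hard-codes the default of subtracting half a day for affected individuals; it is not the implementation of estimateTimeAtRisk.

```r
## Made-up dates for two individuals; the first has no incidence.
startDate     <- as.Date(c("2012-04-17", "1999-12-31"))
endDate       <- as.Date(c("2015-09-15", "2015-09-15"))
incidenceDate <- as.Date(c(NA, "2007-07-13"))

## Earliest of end date and incidence date for each individual.
lastDate <- pmin(endDate, incidenceDate, na.rm = TRUE)

## Time at risk in days; half a day is subtracted for affected individuals.
affected <- !is.na(incidenceDate)
riskDays <- as.numeric(lastDate - startDate) - ifelse(affected, 0.5, 0)
riskDays
```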
Generates a matrix with columns corresponding to age slices/strata defined by argument slices and rows to individuals. Each cell in a row represents the time spent by the individual in the age slice/stratum. See example below.
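How an age is split across slices can be sketched in base R. The ages are made up and the code only illustrates the described result (time spent per slice); it is not the implementation of sliceAge.

```r
## Made-up ages at incidence or end of study.
age <- c(35, 45, 72)
slices <- c(0, 40, 60, Inf)

## Lower and upper bound of each slice.
lower <- head(slices, -1)
upper <- tail(slices, -1)

## Time spent in a slice: the part of the age that falls between the
## lower and upper bound of that slice (0 if the slice is never reached).
timeInSlice <- t(vapply(age, function(a) {
    pmax(pmin(a, upper) - lower, 0)
}, numeric(length(lower))))
colnames(timeInSlice) <- paste0("(", lower, ",", upper, "]")
timeInSlice
```

The row sums of the resulting matrix equal the individuals' ages, since every year lived is assigned to exactly one slice.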
Johannes Rainer, Daniel Taliun, Christian Weichenberger.
Rainer J, Taliun D, D'Elia Y, Domingues FS and Weichenberger CX (2016) FamAgg: an R package to evaluate familial aggregation of traits in large pedigrees. Bioinformatics.
Hill, J.R. (1980) A survey of cancer sites by kinship in the Utah Mormon population. In Cairns J, Lyon JL, Skolnick M (eds): Cancer Incidence in Defined Populations. Banbury Report 4. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press, pp 299–318.
Kerber, R.A. (1995) Method for calculating risk associated with family history of a disease. Genet Epidemiol, pp 291–301.
Yu, C. and Zelterman, D. (2002) Statistical inference for familial disease clusters. Biometrics, pp 481–491
Falchi, M. and Fuchsberger, C. (2008) Jenti: an efficient tool for mining complex inbred genealogies. Bioinformatics, pp 724–726
pedigree, FAData, FAProbResults, FAKinGroupResults, FAKinSumResults, FAIncidenceRateResults
##########################
##
## Defining a small pedigree
##
## load the Minnesota Breast Cancer record and subset to the
## first families.
data(minnbreast)
mbsub <- minnbreast[minnbreast$famid==4 | minnbreast$famid==5 |
                    minnbreast$famid==14 | minnbreast$famid==8, ]
mbped <- mbsub[, c("famid", "id", "fatherid", "motherid", "sex")]
## renaming column names
colnames(mbped) <- c("family", "id", "father", "mother", "sex")
## create the FAData object
fad <- FAData(pedigree=mbped)

## We specify the cancer trait.
tcancer <- mbsub$cancer
names(tcancer) <- mbsub$id

##########################
##
## Familial Incidence Rate
##
## Calculate the FR for each individual given the affected status of
## each individual in trait cancer and the time at risk for each
## participant. We use column "endage" in the minnbreast data.frame
## that specifies the age at the last follow-up or incident cancer as a
## rather imprecise estimate for time at risk.
fr <- familialIncidenceRate(fad, trait=tcancer, timeAtRisk=mbsub$endage)
## Plot the distribution of familial rates
plot(density(fr, na.rm=TRUE))

## Perform in addition Monte Carlo simulations to assess the significance
## for the familial incidence rates.
frRes <- familialIncidenceRateTest(fad, trait=tcancer,
                                   timeAtRisk=mbsub$endage, nsim=500)
head(result(frRes))

##########################
##
## Familial Standardized Incidence Rate:
## Please see examples of FAStdIncidenceRateResults.

##########################
##
## Perform familial aggregation analyses using the genealogical index
##
gi <- genealogicalIndexTest(fad, trait=tcancer, traitName="cancer", nsim=500)
result(gi)
## A significant clustering of cancer cases was identified in the
## analyzed pedigree.
## Plotting the observed mean kinship and the distribution of mean kinship
## from the random sampling.
plotRes(gi)

##########################
##
## Perform familial aggregation analysis using the kinship sum test
##
kcr <- kinshipSumTest(fad, trait=tcancer, traitName="cancer", nsim=500)
kcr
head(result(kcr))

##########################
##
## Perform familial aggregation analysis using the kinship group test,
## stratifying by sex
##
kr <- kinshipGroupTest(fad, trait=tcancer, traitName="cancer", nsim=500,
                       strata=fad$sex)
kr
head(result(kr))

##########################
##
## Estimate the time at risk
##
## Define some birth dates and incidence dates and end date of study
bdates <- c("2012-04-17", "2014-05-29", "1999-12-31", "2002-10-10")
idates <- c(NA, NA, "2007-07-13", "2013-12-23")
edates <- rep("2015-09-15", 4)

## Estimate the time at risk. The time period is returned in days.
riskDays <- estimateTimeAtRisk(startDate=bdates, incidenceDate=idates,
                               endDate=edates)
riskDays

##########################
##
## Define the time spent in an age stratum given the individuals'
## age at incidence or end of study.
head(mbsub$endage)
## We "slice" the age in specified intervals/slices
stratAge <- sliceAge(mbsub$endage, slices=c(0, 40, 60, Inf))
head(stratAge)
## The first column lists the number of years spent in the first age
## stratum (0 < age <= 40), the second in the second stratum
## (40 < age <= 60) and the third in the last stratum (60 < age <= Inf)

## We could also stratify the risk days from above in per year strata.
sliceAge(riskDays/365, slices=c(0, 2.5, 5, 10, 20))

##########################
##
## Simple example for factor2matrix: generate a matrix for factor $sex
head(factor2matrix(fad$sex))
Utility functions to access, modify or subset pedigrees. Most of these functions can be applied to simple data.frame in pedigree format or pedigree or pedigreeList objects defined in the kinship2 package.
## S4 method for signature 'missing'
cliques(object, ...)

connectedSubgraph(graph, nodes, mode="all", all.nodes=TRUE, ifnotfound)

## S4 method for signature 'FAData'
countGenerations(object, id=NULL, direction="down", ...)

## S4 method for signature 'FAData'
estimateGenerations(object, family=NULL, ...)

## S4 method for signature 'FAData'
findFounders(object, family=NULL, id = NULL, ...)

## S4 method for signature 'FAData'
generationsFrom(object, id=NULL, ...)

## S4 method for signature 'FAData'
getAncestors(object, id=NULL, max.generations=3, ...)

## S4 method for signature 'FAData'
getChildren(object, id=NULL, max.generations=16, ...)

## S4 method for signature 'FAData'
getCommonAncestor(object, id, method="min.dist")

## S4 method for signature 'FAData'
getFounders(object, ...)

## S4 method for signature 'FAData'
getMissingMate(object, id=NULL, ...)

## S4 method for signature 'FAData'
getSiblings(object, id=NULL, ...)

## S4 method for signature 'FAData'
getSingletons(object, ...)

ped2graph(ped)

## S4 method for signature 'FAData'
removeSingletons(object, ...)

## S4 method for signature 'data.frame'
removeSingletons(object, ...)

subPedigree(ped, id=NULL, all=TRUE)

## S4 method for signature 'FAData'
shareKinship(object, id, rmKinship=0)
(in alphabetic order)
all |
For |
all.nodes |
For |
direction |
For |
family |
A character or numeric representing the family id.
For |
graph |
An |
id |
A character or numeric vector length 1 or longer specifying the
id(s) of the individual(s).
For |
ifnotfound |
For |
max.generations |
For |
method |
For |
mode |
For |
nodes |
For |
object |
For |
ped |
Either a |
rmKinship |
For |
... |
For |
Refer to the method and function description above for detailed information on the returned result object.
Count the generations up- or down the pedigree for the specified individual(s), i.e. determine the number of ancestor or offspring generations defined in the pedigree for the specified individual(s). Returns a named numeric vector, names corresponding to the individual's id, with the number of generations for each specified individual.
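Counting offspring generations can be sketched as a simple recursion over a pedigree data.frame. The pedigree and the helper count_offspring_generations are made up for this illustration and do not reflect how countGenerations is implemented.

```r
## Toy pedigree (made up): 1 and 2 are the parents of 3; 3 and 4 are the
## parents of 5; 6 is unconnected. NA marks an unknown parent.
ped <- data.frame(id     = c("1", "2", "3", "4", "5", "6"),
                  father = c(NA, NA, "1", NA, "3", NA),
                  mother = c(NA, NA, "2", NA, "4", NA),
                  stringsAsFactors = FALSE)

## Hypothetical helper: number of offspring generations below the given
## individual(s); assumes a pedigree without loops.
count_offspring_generations <- function(ped, id) {
    children <- ped$id[ped$father %in% id | ped$mother %in% id]
    if (length(children) == 0)
        return(0)
    1 + count_offspring_generations(ped, children)
}

gens <- vapply(c("1", "3", "5"), function(z) {
    count_offspring_generations(ped, z)
}, numeric(1))
gens
```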
Identifies the founder couple with the largest number of offspring generations in the pedigree. The provided pedigree object/data.frame can contain pedigrees of multiple families, thus, to identify the founder pair for a family its ID can be provided with the family parameter. Alternatively, the ID of an individual can be specified, in which case the founder pair of the (full) pedigree of the specified individual is identified. If two or more couples have the same, largest number of offspring generations, the first couple is selected.
Returns a character vector of length 2 with the ids of the founder individuals.
Returns the ids of all founders in the pedigree. A founder is an individual for which neither father nor mother is known in the pedigree.
Returns the ids of all singletons, i.e. individuals in the pedigree that are not connected to any other individual (have no parents in the pedigree and no children).
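The two definitions above translate directly into base R operations on a pedigree data.frame. The pedigree below is made up and the code is only a sketch of the definitions, not the package's implementation.

```r
## Toy pedigree (made up); NA marks an unknown parent.
ped <- data.frame(id     = c("1", "2", "3", "4", "5"),
                  father = c(NA, NA, "1", "1", NA),
                  mother = c(NA, NA, "2", "2", NA),
                  stringsAsFactors = FALSE)

## Founders: neither father nor mother is known.
founders <- ped$id[is.na(ped$father) & is.na(ped$mother)]

## Singletons: founders that are also nobody's parent.
is_parent <- ped$id %in% c(ped$father, ped$mother)
singletons <- setdiff(founders, ped$id[is_parent])

founders
singletons
```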
Identify and return the ids of ancestor generations (up to max.generations) for the specified individual(s).
Identify and return the ids of offspring generations (up to max.generations) for the specified individual(s).
Finds the closest common ancestor between specified individuals (2 or more ids are required).
Returns a character vector with the ids of the ancestors or NA if no common ancestor was found.
The function evaluates if in the sub-pedigree defined by the specified ids one or more mates (spouse) are missing and if so it returns their ids.
Get siblings for the specified id(s). Returns their ids as character, or numeric vector.
Removes all unconnected individuals (i.e. singletons) from the pedigree. Returns a data.frame with the pedigree cleaned from all singletons. Note that, due to internal sanitizing, columns "father" and "mother" in the resulting data.frame have a NA for individuals for which the father or mother is not known in the pedigree.
Finds the smallest pedigree containing all specified individuals.
Depending on the input, a data.frame, pedigree or pedigreeList.
Estimates generation levels/numbers for each, or only one, family in the object. Generation numbers are always relative to the founder couple (defined by findFounders).
Returns (always) a named list of generation numbers. The names of the list represent the family id, the names of the numeric vector of generations the id of the individuals in the family.
Determine generations starting from the specified individual. Siblings, including their mates, and all other individuals in the same generation are assigned generation 0; ancestor generations (all their parents etc) get negative generation numbers, decreasing with ancestor level, and their offspring positive numbers, increasing with each generation.
Generations are only estimated within the family of the individual. Also, if the pedigree consists of unconnected sub-pedigrees, generation numbers will only be calculated for the sub-pedigree containing the specified individual.
The function returns a named numeric vector of generation numbers, the names corresponding to the ids of the individuals in the specified individual's family. Individuals in the family that are not connected get a generation number of NA.
Finds all related individuals (individuals sharing kinship > rmKinship with the individual) for the specified individual(s) in the pedigree and returns their ids as a character vector.
Wrapper method passing all arguments to the cliques function from the igraph package.
Finds the (possibly smallest) connected subgraph of all specified nodes.
Returns an igraph object representing the subgraph of the specified nodes.
Transforms the pedigree into a (directed) graph with the direction of the edges being always from parent to child.
An igraph object.
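The parent-to-child edge structure can be illustrated with a plain edge list built in base R. The pedigree is made up and the code is only a sketch of the idea; ped2graph itself returns an igraph object.

```r
## Toy pedigree (made up); NA marks an unknown parent.
ped <- data.frame(id     = c("1", "2", "3", "4"),
                  father = c(NA, NA, "1", "1"),
                  mother = c(NA, NA, "2", "2"),
                  stringsAsFactors = FALSE)

## One directed edge per known parent, pointing from parent to child.
edges <- rbind(
    data.frame(from = ped$father, to = ped$id, stringsAsFactors = FALSE),
    data.frame(from = ped$mother, to = ped$id, stringsAsFactors = FALSE))
edges <- edges[!is.na(edges$from), ]
edges

## If the igraph package is installed, such an edge list could be turned
## into a directed graph, e.g. with:
## g <- igraph::graph_from_data_frame(edges, directed = TRUE)
```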
Johannes Rainer.
pedigree, FAData, FAProbResults, FAKinGroupResults, FAKinSumResults, PedigreeAnalysis, kinshipPairs
##########################
##
## Defining a small pedigree
##
## load the Minnesota Breast Cancer record and subset to the
## first families.
data(minnbreast)
mbsub <- minnbreast[minnbreast$famid==4 | minnbreast$famid==5, ]
mbped <- mbsub[, c("famid", "id", "fatherid", "motherid", "sex")]
## renaming column names
colnames(mbped) <- c("family", "id", "father", "mother", "sex")

## plot the pedigree for family 4 to get an overview.
switchPlotfun(method="ks2paint")
fam4 <- mbped[mbped$family==4, ]
doPlotPed(individual=fam4$id, father=fam4$father, mother=fam4$mother,
          gender=fam4$sex, device="plot")

## find the closest common ancestor between individuals 23, 3 and 8
getCommonAncestor(fam4, id=c(23, 3, 8))

## create the smallest sub-pedigree for individuals 21, 22 and 25
subPedigree(fam4, id=c(21, 22, 25))
## plot that
fam4sub <- subPedigree(fam4, id=c(21, 22, 25))
doPlotPed(individual=fam4sub$id, father=fam4sub$father, mother=fam4sub$mother,
          gender=fam4sub$sex, device="plot")

#########################
##
## Basic pedigree utils
##
## Note: the same methods can be applied to a data.frame representing
## a pedigree, or a FAData, pedigree or pedigreeList object.

## Find the founder couple for family 4
findFounders(fam4, family=4)

## Alternatively, find the founders for the pedigree in which individual 20
## is a member
findFounders(fam4, id = 20)

## Return all founders in the pedigree.
getFounders(fam4)

## Get all founders without children (i.e. singletons).
getSingletons(fam4)

## Clean the pedigree from all singletons
fam4noS <- removeSingletons(fam4)
nrow(fam4)
nrow(fam4noS)

## Count the offspring generations for individual "4"
countGenerations(fam4, id="4")

## Get the ids of all ancestors for that individual
getAncestors(fam4, id="4")

## Get the ids of the children of this individual
getChildren(fam4, id="4", max.generations=1)

## Get the ids of the complete offspring for this individual
getChildren(fam4, id="4")

## Create a FAData object from the pedigree data.frame
fad <- FAData(fam4)

## get the list of all ids sharing kinship with individuals
## 5 and 9
shareKinship(fad, id=c("5", "9"))

## Count the numbers of generations of ancestors for individual 12
countGenerations(fad, id="12", direction="up")

## Count the numbers of offspring generations for individuals 2 and 29
countGenerations(fad, id=c("2", "29"))

## Get all brothers/sisters for individual 9
getSiblings(fad, id="9")

## Determine generation levels starting from individual "9"
generationsFrom(fad, id="9")

## Estimate generations relative to the founder couple for each
## family in the submitted object, a data.frame in the example below
estimateGenerations(mbped)

#########################
##
## Graph utilities
##
## Convert the pedigree into a graph
pgraph <- ped2graph(fam4)
plot(pgraph)

## Make a subgraph containing nodes 10, 22, 12 and 14
sgraph <- connectedSubgraph(pgraph, c("10", "22", "12", "14"))
plot(sgraph)
Plot a pedigree for a family or an individual.
doPlotPed(family=NULL, individual=NULL, father=NULL, mother=NULL,
          gender=NULL, affected=NULL, is.deceased=NULL,
          is.sab.or.top=NULL, is.proband=NULL, is.adopted=NULL,
          are.twins=NULL, are.consanguineous=NULL,
          text.inside.symbol=NULL, text.beside.symbol=NULL,
          text1.below.symbol=NULL, text2.below.symbol=NULL,
          text3.below.symbol=NULL, text4.below.symbol=NULL,
          filename=NULL, device="plot", res=600, ...)

switchPlotfun(method, check=TRUE)
family |
(Optional) character or numeric vector specifying the family id. |
individual |
(Required) character or numeric vector with the ids of the individuals. |
father |
(Required) character or numeric vector with the id of the father for each individual. |
mother |
(Required) character or numeric vector with the id of the mother for each individual. |
gender |
(Required) character, factor or numeric vector specifying the
gender, with |
affected |
(Optional) numeric or logical vector specifying if the individual is
affected, |
is.deceased |
(Optional) numeric or logical vector specifying whether the individual is deceased. |
is.sab.or.top |
(Optional) numeric or logical vector specifying if the individual is the result of a spontaneous abortion or termination of pregnancy. |
is.proband |
(Optional) numeric or logical vector specifying whether the individual is declared as proband (i.e. the first affected family member coming to medical attention). |
is.adopted |
(Optional) numeric or logical vector specifying if the individual has been adopted. |
are.twins |
(Optional) character vector specifying twins in the
family. Individuals sharing the same string are recognized as
twins. The string has to start either with |
are.consanguineous |
(Optional) character vector specifying consanguineous couples. |
text.inside.symbol |
(Optional) character vector with text to place inside symbols. |
text.beside.symbol |
(Optional) character vector with text to place beside symbols. |
text1.below.symbol |
(Optional) character vector with text to place below symbols. |
text2.below.symbol |
(Optional) character vector with text to place below symbols. |
text3.below.symbol |
(Optional) character vector with text to place below symbols. |
text4.below.symbol |
(Optional) character vector with text to place below symbols. |
filename |
(Optional) file name for the plot. If not specified the result is plotted to a temporary file. |
device |
The format of the output file. Can be |
res |
(Optional) the resolution of the image if saved to a bitmap device. |
method |
The method which should be used for plotting, either
|
check |
A logical indicating whether the plotting backends (currently
applied only to |
... |
For |
All arguments passed to the doPlotPed function must have the same length (with the exception of the arguments filename, device and res), and their order has to match the order of the individuals.
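The requirement above can be sketched in base R. This is only an illustration: the toy data.frame `ped`, its column names and sex coding, and the helper `same_length_as_individual` are hypothetical and not part of the package. Taking every argument from the columns of one data.frame is a simple way to keep lengths equal and the order aligned with the individuals.

```r
## Hypothetical toy pedigree; column names and sex coding are
## placeholders chosen for illustration only.
ped <- data.frame(
    id     = c(1, 2, 3),
    father = c(NA, NA, 1),
    mother = c(NA, NA, 2),
    sex    = c("M", "F", "F")
)

## Illustrative helper (not part of FamAgg): returns TRUE if all
## supplied per-individual vectors are as long as `individual`.
same_length_as_individual <- function(individual, ...) {
    others <- list(...)
    all(lengths(others) == length(individual))
}

## Columns of the same data.frame always pass this check.
ok <- same_length_as_individual(ped$id, ped$father, ped$mother, ped$sex)
ok
```

Running such a check before plotting catches vectors that were subset or reordered independently of each other.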
The arguments of the doPlotPed function represent the input parameters supported by HaploPainter; for more information about HaploPainter refer to http://haplopainter.sourceforge.net/.
By default, doPlotPed uses the kinship2 package for plotting, but it can be configured with the switchPlotfun function to use HaploPainter instead.
HaploPainter is a Perl script/tool for pedigree plotting that is bundled in the package; it requires, however, some dependencies that might not be present on every system. Thus, the package checks on startup whether all requirements for HaploPainter are available. This check can be skipped by using check=FALSE when calling switchPlotfun. While using this argument is generally not recommended, it is useful when only writing HaploPainter input files, which does not make use of the HaploPainter plotting backend.
If HaploPainter is used, the plot can only be exported to a pdf or png device, while, if kinship2 is used, the plot can also be directly plotted and displayed (if device="plot" is specified).
HaploPainter plotting also supports device = "txt", in which case the pedigree data are exported (in the HaploPainter file format) as a tab-delimited file; no plot is created and the name of the file is returned. This can even be done without a HaploPainter executable by calling switchPlotfun("haplopaint", check=FALSE).
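The device rules described above can be summarised as a small lookup. The following is a sketch only: `supported_devices` and `device_ok` are hypothetical names that are not part of the package, the table merely restates the text above, and the method names are those used with switchPlotfun in the examples.

```r
## Illustrative lookup (not part of FamAgg) restating which devices
## the text above describes for each plotting method.
supported_devices <- list(
    ks2paint   = c("plot", "pdf", "png"),
    haplopaint = c("pdf", "png", "txt")
)

## Hypothetical helper: TRUE if `device` is listed for `method`;
## unknown methods yield FALSE.
device_ok <- function(method, device) {
    device %in% supported_devices[[method]]
}

device_ok("ks2paint", "plot")
device_ok("haplopaint", "plot")
device_ok("haplopaint", "txt")
```

A check of this kind could be run before calling doPlotPed to avoid requesting a device that the selected method does not provide.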
Also, the arguments of this function match the arguments for HaploPainter, and not all settings can be directly matched to settings in kinship2 plotting. The list below gives all arguments specific to HaploPainter and how, if at all, they are displayed or used in kinship2 plotting:
Not supported yet.
The ids of individuals marked as probands are highlighted in red.
Not supported yet.
Not supported yet.
Not supported yet.
The text will be plotted on the top left corner of the symbol of the respective individual.
The text will be plotted on the top right corner of the symbol of the respective individual.
A character string representing the plotting function to be used.
The name of the file to which the plot was exported.
Johannes Rainer.
plot.pedigree
,
plotPed,FAData-method
,
## load the Minnesota Breast Cancer record and subset to family 4.
data(minnbreast)
family <- minnbreast[minnbreast$famid==4, ]

## draw a pedigree and export it to a pdf file; the file name is
## returned by the function.
doPlotPed(family=family$famid, individual=family$id,
          father=family$fatherid, mother=family$motherid,
          gender=family$sex, device="pdf")

## switch to the plotting functions of the kinship2 package
switchPlotfun("ks2paint")

## plot the same pedigree, but display it
doPlotPed(family=family$famid, individual=family$id,
          father=family$fatherid, mother=family$motherid,
          gender=family$sex, device="plot")

## Finally, generate an input file that can be used for interactive or
## scripted HaploPainter pedigree drawing.
switchPlotfun("haplopaint", check=FALSE)
doPlotPed(family=family$famid, individual=family$id,
          father=family$fatherid, mother=family$motherid,
          gender=family$sex, device="txt", filename="haplopainter.tsv")