Title: | Pedigree Functions |
---|---|
Description: | Routines to handle family data with a Pedigree object. The initial purpose was to create correlation structures that describe family relationships such as kinship and identity-by-descent, which can be used to model family data in mixed effects models, such as in the coxme function. Also includes a tool for Pedigree drawing which is focused on producing compact layouts without intervention. Recent additions include utilities to trim the Pedigree object with various criteria, and kinship for the X chromosome. |
Authors: | Louis Le Nézet [aut, cre, ctb] , Jason Sinnwell [aut], Terry Therneau [aut], Daniel Schaid [ctb], Elizabeth Atkinson [ctb] |
Maintainer: | Louis Le Nézet <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.3.0 |
Built: | 2024-10-30 09:21:18 UTC |
Source: | https://github.com/bioc/Pedixplorer |
The Pedixplorer package for pedigree data is an updated version of the
kinship2
package.
The kinship2
package was originally
written by Terry Therneau and Jason Sinnwell.
The Pedixplorer
package is a fork of the
kinship2
package with
additional functionality and bug fixes.
For the previous version of the package, the download, NEWS, and README are available on CRAN under kinship2.
Below are listed some of the most widely used functions available in Pedixplorer:
Pedigree()
: Constructor of the Pedigree class,
given identifiers, sex, affection status(es), and special relationships
kinship()
: Calculates the kinship matrix, the
probability that an allele sampled from each of two individuals
is the same via IBD (identity by descent).
plot()
: Method to transform a Pedigree
object into a graphical plot.
Allows extra information to be included in the id under the
plot symbol.
This method uses the ped_to_plotdf()
function to
transform the Pedigree object into a data frame
of graphical elements, the same is done for the
legend with the ped_to_legdf()
function.
When done, the data frames are plotted with the
plot_fromdf()
function.
shrink()
: Shrink a Pedigree to a specific bit size,
removing non-informative members first.
bit_size()
: Utility function used in shrink()
to calculate the bit size of a Pedigree.
sampleped()
: Pedigree example data sets
with two pedigrees
minnbreast()
: Larger cohort of pedigrees
from MN breast cancer study
Maintainer: Louis Le Nézet [email protected] (ORCID) [contributor]
Authors:
Jason Sinnwell [email protected]
Terry Therneau
Other contributors:
Daniel Schaid [contributor]
Elizabeth Atkinson [contributor]
Useful links:
Report bugs at https://github.com/LouisLeNezet/Pedixplorer/issues
library(Pedixplorer)
Given a Pedigree, this function creates helper matrices that describe the layout of a plot of the Pedigree.
## S4 method for signature 'Pedigree'
align(
    obj, packed = TRUE, width = 10, align = TRUE, hints = NULL,
    missid = "NA_character_", align_parents = TRUE, force = FALSE,
    precision = 2
)
obj |
A Pedigree object |
packed |
Should the Pedigree be compressed. (i.e. allow diagonal lines connecting parents to children in order to have a smaller overall width for the plot.) |
width |
For a packed output, the minimum width of the plot, in inches. |
align |
For a packed Pedigree, align children under parents |
hints |
A Hints object or a named list containing
|
missid |
A character vector with the missing values identifiers.
All the id, dadid and momid corresponding to those values will be set
to |
align_parents |
If |
force |
If |
precision |
The number of decimal places to round the solution to. |
This is an internal routine, used almost exclusively by
ped_to_plotdf()
.
The subservient functions auto_hint()
,
alignped1()
, alignped2()
,
alignped3()
, and alignped4()
contain the bulk of the computation.
If the hints are missing the
auto_hint()
routine is called to
supply an initial guess.
If multiple families are present in the obj Pedigree, this routine is called once for each family, and the results are combined in the list returned.
For more information you can read the associated vignette:
vignette("pedigree_alignment")
.
A list with components
n
: A vector giving the number of subjects on each
horizontal level of the plot
nid
: A matrix with one row for each level, giving
the numeric id of each subject plotted.
(A value of 17
means the 17th subject in the Pedigree).
pos
: A matrix giving the horizontal position of each plot point
fam
: A matrix giving the family id of each plot point.
A value of 3
would mean that the two subjects in positions
3 and 4, in the row above, are this subject's parents.
spouse
: A matrix with values
0
= not a spouse
1
= subject plotted to the immediate right is a spouse
2
= subject plotted to the immediate right is an inbred spouse
twins
: Optional matrix which will only be present if the Pedigree
contains twins :
0
= not a twin
1
= sibling to the right is a monozygotic twin
2
= sibling to the right is a dizygotic twin
3
= sibling to the right is a twin of unknown zygosity
alignped1()
,
alignped2()
,
alignped3()
,
alignped4()
,
auto_hint()
data(sampleped)
ped <- Pedigree(sampleped)
align(ped)
First alignment routine, which creates the subtree founded on a single subject as though it were the only tree.
alignped1(idx, dadx, momx, level, horder, packed, spouselist)
idx |
Indexes of the subjects |
dadx |
Indexes of the fathers |
momx |
Indexes of the mothers |
level |
Vector of the level of each subject |
horder |
A named numeric vector with one element per subject in the Pedigree. It determines the relative horizontal order of subjects within a sibship, as well as the relative order of processing for the founder couples. (For this latter, the female founders are ordered as though they were sisters). The names of the vector should be the individual identifiers. |
packed |
Should the Pedigree be compressed. (i.e. allow diagonal lines connecting parents to children in order to have a smaller overall width for the plot.) |
spouselist |
Matrix of spouses with 4 columns:
|
In this routine the nid array consists of the final
nid array + 1/2
of the final spouse array.
Note that the spouselist matrix will only contain spouse pairs
that are not yet processed. The logic for anchoring is slightly tricky.
First, if col 4 of the spouselist matrix is 0, we anchor at the first
opportunity. Also note that if spouselist[, 3] == spouselist[, 4]
it is the husband who is the anchor (just write out the possibilities).
Create the set of 3 return structures, which will be matrices
with 1 + nspouse
columns.
If there are children then other routines will widen the result.
These two complementary lists denote the spouses plotted on the left
and on the right.
For someone with lots of spouses we try to split them evenly.
If the number of spouses is odd, then men should have more on
the right than on the left, women more on the left.
Any hints in the spouselist matrix override.
We put the undecided marriages closest to idx, then add
predetermined ones to the left and right. The majority of marriages will
be undetermined singletons, for which nleft will be 1
for
female (put my husband to the left) and 0
for male. In one bug found
by plotting canine data, lspouse could initially be empty but
length(rspouse) > 1
. This caused nleft > length(indx)
.
The fix was to not let indx be indexed beyond its length
(fix by JPS, 5/2013).
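The default left/right split can be sketched in a few lines of base R. The helper name `split_spouses` and its sex coding are hypothetical, and hints from the spouselist matrix are ignored. This only illustrates a rule consistent with the nleft values quoted above (1 for a female with a single husband, 0 for a male), not the package's internal code.

```r
# Hypothetical helper: decide how many undetermined spouses go to the
# left and to the right of a subject (spouselist hints are ignored).
split_spouses <- function(n_spouse, sex = c("male", "female")) {
    sex <- match.arg(sex)
    # With an odd count, the extra spouse goes to the left for a woman
    # and to the right for a man.
    nleft <- floor((n_spouse + (sex == "female")) / 2)
    c(nleft = nleft, nright = n_spouse - nleft)
}

split_spouses(1, "female")  # a single husband is placed on the left
split_spouses(3, "male")    # one spouse on the left, two on the right
```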
For each spouse get the list of children. If there are any we
call alignped2()
to generate their tree and
then mark the connection to their parent.
If multiple marriages have children we need to join the trees.
To finish up we need to splice together the tree made up from
all the kids, which only has data from lev + 1
down, with the data here.
There are 3 cases:
No children were found.
The tree below is wider than the tree here, in which case we add the data from this level onto theirs.
The tree below is narrower, for instance an only child.
A list containing the elements to plot the Pedigree. It contains a set of matrices along with the spouselist matrix. The latter has marriages removed as they are processed.
n
: A vector giving the number of subjects on each horizontal
level of the plot
nid
: A matrix with one row for each level, giving the numeric
id of each subject plotted.
(A value of 17
means the 17th subject in the Pedigree).
pos
: A matrix giving the horizontal position of each plot point
fam
: A matrix giving the family id of each plot point.
A value of 3
would mean that the two subjects in positions
3 and 4, in the row above, are this subject's parents.
spouselist
: Spouse matrix with anchor information
data(sampleped)
ped <- Pedigree(sampleped)
align(ped)
Second of the four co-routines which takes a collection of siblings, grows the tree for each, and appends them side by side into a single tree.
alignped2(idx, dadx, momx, level, horder, packed, spouselist)
idx |
Indexes of the subjects |
dadx |
Indexes of the fathers |
momx |
Indexes of the mothers |
level |
Vector of the level of each subject |
horder |
A named numeric vector with one element per subject in the Pedigree. It determines the relative horizontal order of subjects within a sibship, as well as the relative order of processing for the founder couples. (For this latter, the female founders are ordered as though they were sisters). The names of the vector should be the individual identifiers. |
packed |
Should the Pedigree be compressed. (i.e. allow diagonal lines connecting parents to children in order to have a smaller overall width for the plot.) |
spouselist |
Matrix of spouses with 4 columns:
|
The input arguments are the same as those to alignped1()
with the
exception that idx will be a vector. This routine does nothing
to the spouselist matrix, but needs to pass it down the tree and back
since one of the routines called by alignped2()
might change the matrix.
The code below has one non-obvious special case. Suppose that two sibs marry.
When the first sib is processed by alignped1
then both partners
(and any children) will be added to the rval structure below.
When the second sib is processed they will come back as a 1 element tree
(the marriage will no longer be on the spouselist),
which should be added onto rval.
The rule thus is to not add any 1 element tree whose value
(which must be idx[i])
is already in the rval structure for this level.
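The rule for sibs who marry each other can be illustrated with a small base R sketch. The function name `append_sib_trees` is hypothetical, and each tree is reduced to the vector of ids it places on the current level; the real routine works on the full alignment structure.

```r
# Hypothetical sketch: each element of `trees` holds the ids that one
# sibling's subtree places on the current level.
append_sib_trees <- function(trees) {
    placed <- integer(0)
    for (tree in trees) {
        # Skip a 1 element tree whose subject is already on this level
        if (length(tree) == 1 && tree %in% placed) next
        placed <- c(placed, tree)
    }
    placed
}

# Sibs 3 and 4 marry: processing 3 yields both partners, processing 4
# then comes back as a 1 element tree that must not be added again.
append_sib_trees(list(c(3L, 4L), 4L, 5L))
```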
A list containing the elements to plot the Pedigree. It contains a set of matrices along with the spouselist matrix. The latter has marriages removed as they are processed.
n
: A vector giving the number of subjects on each horizontal
level of the plot
nid
: A matrix with one row for each level, giving the numeric
id of each subject plotted.
(A value of 17
means the 17th subject in the Pedigree).
pos
: A matrix giving the horizontal position of each plot point
fam
: A matrix giving the family id of each plot point.
A value of 3
would mean that the two subjects in positions 3 and 4,
in the row above, are this subject's parents.
spouselist
: Spouse matrix with anchor information
data(sampleped)
ped <- Pedigree(sampleped)
align(ped)
Third of the four co-routines, which merges two pedigree trees that are side by side into a single object.
alignped3(alt1, alt2, packed, space = 1)
alt1 |
Alignment of the first tree |
alt2 |
Alignment of the second tree |
packed |
Should the Pedigree be compressed. (i.e. allow diagonal lines connecting parents to children in order to have a smaller overall width for the plot.) |
space |
Space between two subjects |
The primary special case is when the rightmost person in the left tree is the same as the leftmost person in the right tree; we need not plot two copies of the same person side by side. (When initializing the output structures do not worry about this, there is no harm if they are a column bigger than finally needed.) Beyond that the work is simple bookkeeping.
For the unpacked case, which is the traditional way to draw a Pedigree when we can assume the paper is infinitely wide, all parents are centered over their children. In this case we think of the two trees to be merged as solid blocks. On input they both have a left margin of 0. Compute how far over we have to slide the right tree.
Now merge the two trees. Start at the top level and work down.
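A minimal sketch of the unpacked merge, in base R. The function name `merge_blocks` and the list-of-vectors representation are assumptions made for illustration, and the shared-person special case is left out; the package works on the n, nid, pos, and fam matrices instead.

```r
# Hypothetical sketch for the unpacked case: `left` and `right` are lists
# with one numeric vector of horizontal positions per level, both starting
# from a left margin of 0.
merge_blocks <- function(left, right, space = 1) {
    # Treat the left tree as a solid block: its width is the rightmost
    # position used on any level.
    slide <- max(unlist(left)) + space
    nlev <- max(length(left), length(right))
    # Merge level by level, from the top down
    lapply(seq_len(nlev), function(i) {
        l <- if (i <= length(left)) left[[i]] else numeric(0)
        r <- if (i <= length(right)) right[[i]] + slide else numeric(0)
        c(l, r)
    })
}

left <- list(c(0, 1), c(0, 1, 2))
right <- list(c(0, 1), 0.5)
merged <- merge_blocks(left, right)
merged
```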
A list containing the elements to plot the Pedigree. It contains a set of matrices along with the spouselist matrix. The latter has marriages removed as they are processed.
n
: A vector giving the number of subjects on each horizontal
level of the plot
nid
: A matrix with one row for each level, giving the numeric
id of each subject plotted.
(A value of 17
means the 17th subject in the Pedigree).
pos
: A matrix giving the horizontal position of each plot point
fam
: A matrix giving the family id of each plot point.
A value of 3
would mean that the two subjects in positions
3 and 4, in the row above, are this subject's parents.
spouselist
: Spouse matrix with anchor information
data(sampleped)
ped <- Pedigree(sampleped)
align(ped)
Last routine, which attempts to line up children under parents and put spouses and siblings "close" to each other, to the extent possible within the constraints of page width.
alignped4(rval, spouse, level, width, align, precision = 2)
rval |
A list with components |
spouse |
A boolean matrix with one row per level representing if the subject is a spouse or not. |
level |
Vector of the level of each subject |
width |
For a packed output, the minimum width of the plot, in inches. |
align |
For a packed Pedigree, align children under parents |
precision |
The number of decimal places to round the solution to. |
The alignped4()
routine is the final step of alignment.
The current code does necessary setup and then calls the
quadprog::solve.QP()
function.
There are two important parameters for the function:
The maximum width specified. The smallest possible width is the maximum number of subjects on a line. If the user suggestion is too low it is increased to that amount plus one (to give just a little wiggle room).
The align vector of 2 alignment parameters a
and b
.
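For each set of siblings x with parents at p_1 and p_2, the alignment penalty is:
$$\frac{1}{k^a} \sum_{i=1}^{k} \left(x_i - \frac{p_1 + p_2}{2}\right)^2$$
where k is the number of siblings in the set.
Using the fact that
$$\sum_i (x_i - c)^2 = \sum_i (x_i - \mu)^2 + k (c - \mu)^2,$$
where \mu is the mean of the x_i, when a = 1 moving a sibship with k
sibs one unit to the left or
right of optimal will incur the same cost as moving one with only 1 or
two sibs out of place.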
If a = 0
then large sibships are harder to move
than small ones.
With the default value a = 1.5
, they are slightly easier
to move than small ones.
The rationale for the default is as long as the
parents are somewhere between the first and last siblings the result looks
fairly good, so we are more flexible with the spacing of a large family.
By tethering all the sibs to a single spot they tend to be kept close to
each other.
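The effect of the exponent a can be checked numerically in base R. The sketch below assumes the sibship penalty has the form (1 / k^a) * sum((x_i - (p_1 + p_2) / 2)^2); the function names `sib_penalty` and `shift_cost` are hypothetical and are not part of the package.

```r
# Sibship penalty, assuming the form (1 / k^a) * sum((x_i - (p1 + p2) / 2)^2)
sib_penalty <- function(x, p1, p2, a = 1.5) {
    k <- length(x)
    sum((x - (p1 + p2) / 2)^2) / k^a
}

# Extra cost of placing the parents one unit away from the sibship centre
shift_cost <- function(k, a) {
    x <- seq_len(k)
    centre <- mean(x)
    sib_penalty(x, centre + 1, centre + 1, a) -
        sib_penalty(x, centre, centre, a)
}

sapply(c(1, 2, 8), shift_cost, a = 1)    # identical cost for every size
sapply(c(1, 2, 8), shift_cost, a = 0)    # cost grows with sibship size
sapply(c(1, 2, 8), shift_cost, a = 1.5)  # cost shrinks with sibship size
```

Under this form the extra cost of a one unit shift is k^(1 - a), which is why a = 1 makes all sibships equally hard to move.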
The alignment penalty for spouses is $b (x_1 - x_2)^2$, which tends to
keep them together. The size of
b
controls the relative importance of
sib-parent and spouse-spouse closeness.
We start by adding in these penalties.
The total number of parameters in the alignment problem
(what we hand to quadprog) is the set of sum(n)
positions.
A work array myid keeps track of the parameter number for each position
so that it is easy to find. There is one extra penalty added at the end.
Because the penalty amount would be the same if all the final positions
were shifted by a constant, the penalty matrix will not be positive
definite; solve.QP()
does not like this.
We add a tiny amount of leftward pull to the widest line.
If there are k
subjects on a line there will
be k+1
constraints for that line. The first point must be
$\ge 0$, each subsequent one must be at least 1 unit to the right,
and the final point must be
$\le$ the max width.
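The k + 1 constraints for one line can be written out explicitly. The base R sketch below uses a hypothetical helper `line_constraints` and a row-wise form A %*% x >= b, reading the bounds as x_1 >= 0 and x_k <= max width; it does not reproduce the package's own setup code.

```r
# Hypothetical helper: constraints for one line holding k subjects,
# written as A %*% x >= b with x the k horizontal positions.
line_constraints <- function(k, max_width) {
    A <- matrix(0, nrow = k + 1, ncol = k)
    b <- numeric(k + 1)
    A[1, 1] <- 1                       # x_1 >= 0
    for (i in seq_len(k - 1)) {        # x_{i+1} - x_i >= 1
        A[i + 1, i] <- -1
        A[i + 1, i + 1] <- 1
        b[i + 1] <- 1
    }
    A[k + 1, k] <- -1                  # -x_k >= -max_width
    b[k + 1] <- -max_width
    list(A = A, b = b)
}

con <- line_constraints(3, max_width = 10)
all(con$A %*% c(0, 1, 2) >= con$b)    # a feasible layout
all(con$A %*% c(0, 0.5, 2) >= con$b)  # subjects 1 and 2 are too close
```

Note that quadprog::solve.QP() expects one constraint per column of its Amat argument, so a matrix in this row form would be transposed before being passed on.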
The updated position matrix
data(sampleped)
ped <- Pedigree(sampleped)
align(ped)
Compute an initial guess for the alignment of a Pedigree
## S4 method for signature 'Pedigree'
auto_hint(obj, hints = NULL, packed = TRUE, align = FALSE, reset = FALSE)
obj |
A Pedigree object |
hints |
A Hints object or a named list containing
|
packed |
Should the Pedigree be compressed. (i.e. allow diagonal lines connecting parents to children in order to have a smaller overall width for the plot.) |
align |
For a packed Pedigree, align children under parents |
reset |
If |
A Pedigree structure can contain a Hints object which helps to reorder the Pedigree (e.g. left-to-right order of children within family) so as to plot with minimal distortion. This routine is used to create an initial version of the hints. They can then be modified if desired.
This routine would not normally be called by a user. It moves children within families, so that marriages are on the "edge" of a set of children, closest to the spouse. For pedigrees that have only a single connection between two families this simple-minded approach works surprisingly well. For more complex structures hand-tuning of the hints may be required.
When auto_hint()
is called with a vector of numbers as the
hints argument, the values for the founder females are used to
order the founder families left to right across the plot.
The values within a sibship are used as the preliminary order of
siblings within a family; this may be changed to move one of them to the
edge so as to match up with a spouse. The actual values in the vector are
not important, only their order.
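As a small base R illustration of "only their order" matters: the two vectors below are made-up hint values, and they induce the same left-to-right ordering even though the numbers differ.

```r
# Two hint vectors with different values but the same ordering
hints_a <- c(10, 2, 7)
hints_b <- c(3, 1, 2)

order(hints_a)
order(hints_b)
identical(order(hints_a), order(hints_b))
```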
The initial Hints object.
data(sampleped)
ped <- Pedigree(sampleped[sampleped$famid == 1, ])
auto_hint(ped)
When computer time is cheap, use this routine to get a best Pedigree alignment. This routine will try all possible founder orders, and return the one with the least stress.
## S4 method for signature 'Pedigree'
best_hint(obj, wt = c(1000, 10, 1), tolerance = 0)
obj |
A Pedigree object |
wt |
A vector of three weights for the three error measures.
Default is
|
tolerance |
The maximum stress level to accept.
Default is |
The auto_hint()
routine will rearrange sibling order,
but not founder order.
This calls auto_hint()
with every possible founder
order, and finds that plot with the least "stress".
The stress is computed as a weighted sum of three error measures:
nbArcs The number of duplicate individuals in the plot
lgArcs The sum of the absolute values of the differences in the positions of duplicate individuals
lgParentsChilds The sum of the absolute values of the differences between the center of the children and the parents
If during the search, a plot is found with a stress level less than tolerance, the search is terminated.
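The weighted sum and the early stop can be sketched in base R. The function `stress` and the `candidates` list are hypothetical stand-ins: in the package the three error measures come from aligning the Pedigree under each founder order, which is not reproduced here.

```r
# Stress as a weighted sum of the three error measures
stress <- function(nb_arcs, lg_arcs, lg_parents_childs,
                   wt = c(1000, 10, 1)) {
    sum(wt * c(nb_arcs, lg_arcs, lg_parents_childs))
}

# Hypothetical search loop: `candidates` stands in for the error measures
# obtained from successive founder orders.
candidates <- list(c(1, 2.5, 4), c(0, 0, 3), c(0, 0, 0.5))
tolerance <- 5
best <- Inf
for (err in candidates) {
    s <- stress(err[1], err[2], err[3])
    if (s < best) best <- s
    if (best < tolerance) break  # good enough, stop searching
}
best
```

With the default tolerance of 0 the early stop never triggers, since a sum of non-negative measures cannot fall below 0, so every founder order is examined.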
The best Hints object out of all the permutations
data(sampleped)
ped <- Pedigree(sampleped[sampleped$famid == 1, ])
best_hint(ped)
Utility function used in the shrink()
function
to calculate the bit size of a Pedigree.
## S4 method for signature 'character_OR_integer'
bit_size(obj, momid, missid = NA_character_)

## S4 method for signature 'Pedigree'
bit_size(obj)

## S4 method for signature 'Ped'
bit_size(obj)
obj |
A Ped or Pedigree object or a vector of father identifiers |
momid |
A vector containing, for each subject, the identifier of the biological mother. |
missid |
A character vector with the missing values identifiers.
All the id, dadid and momid corresponding to those values will be set
to |
The bit size of a Pedigree is defined as:
$$2 \times NbNonFounders - NbFounders$$
Where NbNonFounders
is the number of non founders in the Pedigree
(i.e. individuals with identified parents) and
NbFounders
is the number of founders in the Pedigree
(i.e. individuals without identified parents).
A list with the following components:
bit_size The bit size of the Pedigree
nFounder The number of founders in the Pedigree
nNonFounder The number of non founders in the Pedigree
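A base R sketch of the computation, taking the bit size to be 2 * NbNonFounders - NbFounders. The function name `bit_size_sketch` is hypothetical and works on plain parent vectors; subjects with only one identified parent are not handled specially, which is a simplifying assumption.

```r
# Hypothetical re-implementation on plain vectors of parent identifiers
bit_size_sketch <- function(dadid, momid, missid = NA_character_) {
    is_missing <- function(x) is.na(x) | x %in% missid
    # Founders have no identified parents
    founder <- is_missing(dadid) & is_missing(momid)
    n_founder <- sum(founder)
    n_non_founder <- length(founder) - n_founder
    list(
        bit_size = 2 * n_non_founder - n_founder,
        nFounder = n_founder,
        nNonFounder = n_non_founder
    )
}

# Two founders and their three children
dadid <- c(NA, NA, "1", "1", "1")
momid <- c(NA, NA, "2", "2", "2")
res <- bit_size_sketch(dadid, momid)
res
```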
data(sampleped)
ped <- Pedigree(sampleped)
bit_size(ped)
This function creates a table with the affection and availability information for all individuals in a pedigree object.
family_infos_table(pedi, col_val = NA)
pedi |
A pedigree object. |
col_val |
The column name in the |
A cross table dataframe with the affection and availability information.
data(sampleped)
pedi <- Pedigree(sampleped)
pedi <- generate_colors(pedi, "num_child_tot", threshold = 2)
Pedixplorer:::family_infos_table(pedi, "num_child_tot")
Pedixplorer:::family_infos_table(pedi, "affection")
Finds one subject from among available non-parents with indicated affection status.
## S4 method for signature 'Ped'
find_avail_affected(obj, avail = NULL, affected = NULL, affstatus = NA)

## S4 method for signature 'Pedigree'
find_avail_affected(obj, avail = NULL, affected = NULL, affstatus = NA)
obj |
A Ped or Pedigree object. |
avail |
A logical vector with the availability status of the
individuals
(i.e. |
affected |
A logical vector with the affection status of the
individuals
(i.e. |
affstatus |
Affection status to search for. |
When used within shrink()
, this function is called
with the first affected indicator,
if the affected item in the Pedigree is a matrix of
multiple affected indicators.
If avail or affected is null, then the function will use the corresponding Ped accessor.
A list is returned with the following components
ped The new Ped object
newAvail Vector of availability status of trimmed individuals
idTrimmed Vector of IDs of trimmed individuals
isTrimmed logical value indicating whether Ped object has been trimmed
bit_size Bit size of the trimmed Ped
data(sampleped)
ped <- Pedigree(sampleped)
find_avail_affected(ped, affstatus = 1)
Finds subjects from among available non-parents with all affection
equal to 0
.
## S4 method for signature 'Ped'
find_avail_noninform(obj, avail = NULL, affected = NULL)

## S4 method for signature 'Pedigree'
find_avail_noninform(obj, avail = NULL, affected = NULL)
obj |
A Ped or Pedigree object. |
avail |
A logical vector with the availability status of the
individuals
(i.e. |
affected |
A logical vector with the affection status of the
individuals
(i.e. |
Identify subjects to remove from a Pedigree who are available but
non-informative (unaffected).
This is the second step to remove subjects in
shrink()
if the Pedigree does not meet
the desired bit size.
If avail or affected is null, then the function will use the corresponding Ped accessor.
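The selection rule (available, not a parent, unaffected) can be sketched on plain vectors in base R. The function name `find_noninform_sketch` is hypothetical, and a single affection indicator coded 0/1 is assumed, whereas the package works on a Ped object.

```r
# Hypothetical sketch using plain vectors rather than a Ped object
find_noninform_sketch <- function(id, dadid, momid, avail, affected) {
    # A subject is a parent if it appears as someone's father or mother
    is_parent <- id %in% c(dadid, momid)
    id[avail %in% 1 & !is_parent & affected %in% 0]
}

id <- c("1", "2", "3", "4")
dadid <- c(NA, NA, "1", "1")
momid <- c(NA, NA, "2", "2")
avail <- c(1, 1, 1, 0)
affected <- c(0, 1, 0, 0)

# Subject 1 is a parent and subject 4 is unavailable, so only 3 remains
find_noninform_sketch(id, dadid, momid, avail, affected)
```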
Vector of subject ids who can be removed by having lowest informativeness.
data(sampleped)
ped <- Pedigree(sampleped)
find_avail_noninform(ped)
Fix the sex of parents and add parents that are missing from the data. Can be used with a dataframe or with vectors holding the individuals' information.
## S4 method for signature 'character'
fix_parents(obj, dadid, momid, sex, famid = NULL, missid = NA_character_)

## S4 method for signature 'data.frame'
fix_parents(obj, del_parents = NULL, filter = NULL, missid = NA_character_)
obj |
A data.frame or a vector of the individuals identifiers. If a
dataframe is given it must contain the columns |
dadid |
A vector containing, for each subject, the identifier of the biological father. |
momid |
A vector containing, for each subject, the identifier of the biological mother. |
sex |
A character, factor or numeric vector corresponding to
the gender of the individuals. This will be transformed to an ordered factor
with the following levels:
|
famid |
A character vector with the family identifiers of the individuals. If provided, it will be joined to the individuals' identifiers, separated by an underscore. |
missid |
A character vector with the missing values identifiers.
All the id, dadid and momid corresponding to those values will be set
to |
del_parents |
Boolean defining if missing parents need to be deleted
or fixed. If |
filter |
Filtering column containing |
First look to add parents whose ids are given in momid/dadid. Second, fix
sex of parents. Last look to add second parent for children for whom only
one parent id is given.
If a famid vector is given the family id will be added to the
ids of all individuals (id
, dadid
, momid
)
separated by an underscore before proceeding.
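A base R sketch of this prefixing step. The helper name `add_famid` is hypothetical, and the "famid_id" order of the two parts is an assumption, since the text above only states that an underscore separates them.

```r
# Hypothetical helper: prefix identifiers with the family id.
# The "famid_id" order is an assumption; missing ids are kept missing.
add_famid <- function(x, famid) {
    ifelse(is.na(x), NA_character_, paste(famid, x, sep = "_"))
}

famid <- c("1", "1", "1")
id <- c("101", "102", "103")
dadid <- c(NA, NA, "101")
momid <- c(NA, NA, "102")

data.frame(
    id = add_famid(id, famid),
    dadid = add_famid(dadid, famid),
    momid = add_famid(momid, famid)
)
```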
Check for the presence of both parents' ids in the id field. If they are not both present, the behaviour depends on the del_parents parameter:
If TRUE
then use the fix_parents function and merge back the
other fields in the dataframe, then set availability to
0
for non-available parents.
If FALSE
then delete the id of missing parents.
A data.frame with id, dadid, momid, sex as columns with the relationships fixed.
Jason Sinnwell
test1char <- data.frame(
    id = paste('fam', 101:111, sep = ''),
    sex = c('male', 'female')[c(1, 2, 1, 2, 1, 1, 2, 2, 1, 2, 1)],
    father = c(
        0, 0, 'fam101', 'fam101', 'fam101', 0, 0,
        'fam106', 'fam106', 'fam106', 'fam109'
    ),
    mother = c(
        0, 0, 'fam102', 'fam102', 'fam102', 0, 0,
        'fam107', 'fam107', 'fam107', 'fam112'
    )
)
test1newmom <- with(test1char, fix_parents(
    id, father, mother, sex, missid = NA_character_
))
Pedigree(test1newmom)
Transform a given dataframe to compute the colors for the filling and the border of the individuals, based on their affection and availability status.
## S4 method for signature 'character'
generate_colors(
    obj, avail, mods_aff = NULL, is_num = FALSE, keep_full_scale = FALSE,
    colors_aff = c("yellow2", "red"),
    colors_unaff = c("white", "steelblue4"),
    colors_avail = c("green", "black"),
    colors_na = "grey"
)

## S4 method for signature 'numeric'
generate_colors(
    obj, avail, threshold = 0.5, sup_thres_aff = TRUE, is_num = TRUE,
    keep_full_scale = FALSE, breaks = 3,
    colors_aff = c("yellow2", "red"),
    colors_unaff = c("white", "steelblue4"),
    colors_avail = c("green", "black"),
    colors_na = "grey"
)

## S4 method for signature 'Pedigree'
generate_colors(
    obj, col_aff = "affected", add_to_scale = TRUE, col_avail = "avail",
    is_num = NULL, mods_aff = NULL, threshold = 0.5, sup_thres_aff = TRUE,
    keep_full_scale = FALSE, breaks = 3,
    colors_aff = c("yellow2", "red"),
    colors_unaff = c("white", "steelblue4"),
    colors_avail = c("green", "black"),
    colors_na = "grey", reset = TRUE
)
obj |
A Pedigree object or a vector containing the affection status for each individual. The affection status can be numeric or character. |
avail |
A logical vector with the availability status of the
individuals
(i.e. |
mods_aff |
Vector of modality to consider as affected in the case
where the |
is_num |
Boolean defining if the values need to be considered as numeric. |
keep_full_scale |
Boolean defining if the affection values need to
be set as a scale. If |
colors_aff |
Set of increasing colors to use for the filling of the affected individuals. |
colors_unaff |
Set of increasing colors to use for the filling of the unaffected individuals. |
colors_avail |
Set of 2 colors to use for the box's border of an
individual. The first color will be used for available individual
( |
colors_na |
Color to use for individuals with no information. |
threshold |
Numeric value separating the affected and healthy subjects
in the case where the |
sup_thres_aff |
Boolean defining if the affected individuals are above
the threshold or not.
If |
breaks |
Number of breaks to use when using full scale with numeric values. The same number of breaks will be done for values from affected individuals and unaffected individuals. |
col_aff |
A character vector with the name of the column to be used for the affection status. |
add_to_scale |
Boolean defining if the scales need to be added to the existing scales or if they need to replace the existing scales. |
col_avail |
A character vector with the name of the column to be used for the availability status. |
reset |
If |
The colors will be set using the
generate_fill() and the
generate_border()
functions respectively for
the filling and the border.
A list of two elements
The list containing the filling colors processed and their description
The list containing the border colors processed and their description
The Pedigree object with the affected
and avail
columns
processed accordingly as well as the scales
slot updated.
generate_colors(
    c("A", "B", "A", "B", NA, "A", "B", "A", "B", NA),
    c(1, 0, 1, 0, NA, 1, 0, 1, 0, NA),
    mods_aff = "A"
)
generate_colors(
    c(10, 0, 5, 7, NA, 6, 2, 1, 3, NA),
    c(1, 0, 1, 0, NA, 1, 0, 1, 0, NA),
    threshold = 3, keep_full_scale = TRUE
)
data("sampleped")
ped <- Pedigree(sampleped)
ped <- generate_colors(ped, "affected", add_to_scale = FALSE)
scales(ped)
The hints are used to specify the order of the individuals in the pedigree and to specify the order of the spouses.
You either need to provide horder or spouse in the dedicated parameters (together or separately), or inside a list.
Hints(horder, spouse)

## S4 method for signature 'list,missing_OR_NULL'
Hints(horder, spouse)

## S4 method for signature 'numeric,data.frame'
Hints(horder, spouse)

## S4 method for signature 'numeric,missing_OR_NULL'
Hints(horder, spouse)
horder |
A named numeric vector with one element per subject in the Pedigree. It determines the relative horizontal order of subjects within a sibship, as well as the relative order of processing for the founder couples. (For the latter, the female founders are ordered as though they were sisters.) The names of the vector should be the individual identifiers. |
spouse |
A data.frame with one row per hinted marriage. Usually only
a few marriages in a pedigree will need an added hint, for instance to reverse
the plot order of a husband/wife pair.
Each row contains the id of the left spouse (i.e. |
A Hints object.
horder
A named numeric vector with one element per subject in the Pedigree. It determines the relative horizontal order of subjects within a sibship, as well as the relative order of processing for the founder couples. (For the latter, the female founders are ordered as though they were sisters.)
spouse
A data.frame with one row per hinted marriage. Usually only a few marriages in a Pedigree will need an added hint, for instance to reverse the plot order of a husband/wife pair.
Each row contains the identifiers of the left spouse, the right spouse, and the anchor (i.e. 1 = left, 2 = right, 0 = either).
horder(x)
: Get the horder vector
horder(x) <- value
: Set the horder vector
spouse(x)
: Get the spouse data.frame
spouse(x) <- value
: Set the spouse data.frame
as.list(x)
: Convert a Hints object to a list
subset(x, i, keep = TRUE)
: Subset a Hints object
based on the individuals identifiers given.
i
: A vector of individuals identifiers to keep.
keep
: A logical value indicating if the individuals
should be kept or deleted.
Hints(
    list(
        horder = c("1" = 1, "2" = 2, "3" = 3),
        spouse = data.frame(
            idl = c("1", "2"),
            idr = c("2", "3"),
            anchor = c(1, 2)
        )
    )
)
Hints(
    horder = c("1" = 1, "2" = 2, "3" = 3),
    spouse = data.frame(
        idl = c("1", "2"),
        idr = c("2", "3"),
        anchor = c(1, 2)
    )
)
Hints(
    horder = c("1" = 1, "2" = 2, "3" = 3)
)
Transform identity by descent (IBD) matrix data from the form produced by external programs such as SOLAR into the compact form used by the coxme and lmekin routines.
ibd_matrix(id1, id2, ibd, idmap, diagonal)
id1 |
A character vector with the id of the first individual of each pair, or a matrix or data frame with 3 columns: id1, id2, and ibd |
id2 |
A character vector with the id of the second individual of each pair |
ibd |
the IBD value for that pair |
idmap |
an optional 2 column matrix or data frame whose first element
is the internal value (as found in |
diagonal |
optional value for the diagonal element. If present, any missing diagonal elements in the input data will be set to this value. |
The IBD matrix for a set of n subjects is an n by n symmetric matrix whose i,j element indicates, for some given genetic location, whether 0, 1/2 or 2/2 of the alleles for i and j are identical by descent. Fractional values occur if the IBD fraction must be imputed. The diagonal will be 1. Since a large fraction of the values will be zero, programs such as SOLAR return a data set containing only the non-zero elements. SOLAR will also have renumbered the subjects as seq_len(n) in such a way that families are grouped together in the matrix; a separate index file contains the mapping between this new id and the original one. The final matrix should be labeled with the original identifiers.
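As a rough illustration of the conversion described above, the following base R sketch expands a list of non-zero pairs into a labelled symmetric matrix. It builds a dense matrix for readability, whereas ibd_matrix() returns a sparse one; the object names pairs, ids and m are illustrative only and are not part of the package.

```r
# Non-zero IBD values as they might be listed by an external program
pairs <- data.frame(
    id1 = c("1", "2", "1"),
    id2 = c("2", "3", "4"),
    ibd = c(0.5, 0.16, 0.27),
    stringsAsFactors = FALSE
)

# One row and column per subject, labelled with the original identifiers
ids <- sort(unique(c(pairs$id1, pairs$id2)))
m <- matrix(0, length(ids), length(ids), dimnames = list(ids, ids))

# Fill both triangles so that the matrix is symmetric
m[cbind(pairs$id1, pairs$id2)] <- pairs$ibd
m[cbind(pairs$id2, pairs$id1)] <- pairs$ibd

# Each subject is fully IBD with themselves
diag(m) <- 1
m
```

Pairs that are not listed keep the value 0, which is why a sparse representation pays off for large studies.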
A sparse matrix of class dsCMatrix. This is the same form used for kinship matrices.
df <- data.frame(
    id1 = c("1", "2", "1"),
    id2 = c("2", "3", "4"),
    ibd = c(0.5, 0.16, 0.27)
)
ibd_matrix(df$id1, df$id2, df$ibd, diagonal = 2)
Select the ids of the informative individuals.
## S4 method for signature 'character_OR_integer'
is_informative(obj, avail, affected, informative = "AvAf")

## S4 method for signature 'Ped'
is_informative(obj, informative = "AvAf", reset = FALSE)

## S4 method for signature 'Pedigree'
is_informative(obj, col_aff = NULL, informative = "AvAf", reset = FALSE)
obj |
A character vector with the id of the individuals or a
|
avail |
A logical vector with the availability status of the
individuals
(i.e. |
affected |
A logical vector with the affection status of the
individuals
(i.e. |
informative |
Informative individuals selection can take 5 values:
|
reset |
If |
col_aff |
A character vector with the name of the column to be used for the affection status. |
Depending on the informative parameter, the function will extract the ids of the informative individuals. In the case of a numeric vector, the function will return the same vector. In the case of a boolean, the function will return the ids of the individuals if TRUE, NA otherwise. In the case of a string, the function will return the ids of the corresponding informative individuals based on the avail and affected columns.
A vector with the identifiers of the informative individuals.
The Pedigree object with its isinf slot updated.
is_informative(c("A", "B", "C", "D", "E"), informative = c("A", "B"))
is_informative(c("A", "B", "C", "D", "E"), informative = c(1, 2))
is_informative(
    c("A", "B", "C", "D", "E"),
    avail = c(1, 0, 0, 1, 1), affected = c(0, 1, 0, 1, 1),
    informative = "AvAf"
)
is_informative(
    c("A", "B", "C", "D", "E"),
    avail = c(1, 0, 0, 1, 1), affected = c(0, 1, 0, 1, 1),
    informative = "AvOrAf"
)
is_informative(
    c("A", "B", "C", "D", "E"),
    informative = c(TRUE, FALSE, TRUE, FALSE, TRUE)
)
data("sampleped")
ped <- Pedigree(sampleped)
ped <- is_informative(ped, col_aff = "affection_mods")
isinf(ped(ped))
Check which individuals are parents.
## S4 method for signature 'character_OR_integer'
is_parent(obj, dadid, momid, missid = NA_character_)

## S4 method for signature 'Ped'
is_parent(obj, missid = NA_character_)
obj |
A vector of subject identifiers or a Ped object |
dadid |
A vector containing, for each subject, the identifier of the biological father. |
momid |
A vector containing, for each subject, the identifier of the biological mother. |
missid |
A character vector with the missing values identifiers.
All the id, dadid and momid corresponding to those values will be set
to |
A logical vector of the same length as obj, TRUE if the individual is a parent and FALSE otherwise.
is_parent(c("1", "2", "3", "4"), c("3", "3", NA, NA), c("4", "4", NA, NA))
data(sampleped)
ped <- Pedigree(sampleped)
is_parent(ped(ped))
Computes the depth of each subject in the Pedigree.
## S4 method for signature 'character_OR_integer'
kindepth(obj, dadid, momid, align_parents = FALSE, force = FALSE)

## S4 method for signature 'Pedigree'
kindepth(obj, align_parents = FALSE, force = FALSE)

## S4 method for signature 'Ped'
kindepth(obj, align_parents = FALSE, force = FALSE)
obj |
A character vector with the id of the individuals or a
|
dadid |
A vector containing, for each subject, the identifier of the biological father. |
momid |
A vector containing, for each subject, the identifier of the biological mother. |
align_parents |
If |
force |
If |
Mark each person as to their depth in a Pedigree: 0 for a founder, otherwise depth = 1 + max(father's depth, mother's depth).
In the case of an inbred Pedigree a perfect alignment may not exist.
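The depth rule can be sketched in a few lines of base R, assuming the usual recurrence in which a founder has depth 0 and any other subject sits one level below the deeper of its parents. The helper simple_depth() is hypothetical and only illustrates the idea; it ignores the align_parents and force options of kindepth().

```r
# simple_depth() is an illustrative helper, not part of Pedixplorer.
# Parents coded as `missid` are treated as unknown.
simple_depth <- function(id, dadid, momid, missid = "0") {
    depth <- setNames(rep(NA_integer_, length(id)), id)
    # Founders have no known parent
    depth[dadid == missid & momid == missid] <- 0L
    while (anyNA(depth)) {
        unresolved <- sum(is.na(depth))
        for (i in which(is.na(depth))) {
            parents <- setdiff(c(dadid[i], momid[i]), missid)
            parent_depth <- depth[parents]
            # A subject is resolved once all its known parents are resolved
            if (!anyNA(parent_depth)) {
                depth[i] <- 1L + max(parent_depth)
            }
        }
        if (sum(is.na(depth)) == unresolved) {
            stop("Loop in the pedigree or parent absent from `id`")
        }
    }
    depth
}

simple_depth(
    c("A", "B", "C", "D", "E"),
    c("C", "D", "0", "0", "0"),
    c("E", "E", "0", "0", "0")
)
```

Here C, D and E are founders at depth 0, while A and B are their children at depth 1.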
An integer vector containing the depth for each subject
Terry Therneau, updated by Louis Le Nézet
kindepth(
    c("A", "B", "C", "D", "E"),
    c("C", "D", "0", "0", "0"),
    c("E", "E", "0", "0", "0")
)
data(sampleped)
ped1 <- Pedigree(sampleped[sampleped$famid == "1",])
kindepth(ped1)
Compute the kinship matrix for a set of related autosomal subjects. The function is generic, and can accept a Pedigree, a Ped or a vector as the first argument.
## S4 method for signature 'Ped'
kinship(obj, chrtype = "autosome")

## S4 method for signature 'character'
kinship(obj, dadid, momid, sex, chrtype = "autosome")

## S4 method for signature 'Pedigree'
kinship(obj, chrtype = "autosome")
obj |
A Pedigree or Ped object or a vector of subject identifiers. |
chrtype |
chromosome type. The currently supported types are 'autosome' and 'X' or 'x'. |
dadid |
A vector containing, for each subject, the identifier of the biological father. |
momid |
A vector containing, for each subject, the identifier of the biological mother. |
sex |
A character, factor or numeric vector corresponding to
the gender of the individuals. This will be transformed to an ordered factor
with the following levels:
|
The function will usually be called with a Pedigree. The call with a Ped or a vector is provided for backwards compatibility with an earlier release of the library that was less capable. Note that when it is called with a Ped or a vector, information on twins is not available to the function.
When called with a Pedigree, the routine will create a block-diagonal-symmetric sparse matrix object of class dsCMatrix. Since the [i, j] value of the result is 0 for any two unrelated individuals i and j, and a Matrix utilizes sparse representation, the resulting object is often orders of magnitude smaller than an ordinary matrix.
Two genes G1 and G2 are identical by descent (IBD) if they are both physical copies of the same ancestral gene; two genes are identical by state if they represent the same allele. So the brown eye gene that I inherited from my mother is IBD with hers; the same gene in an unrelated individual is not.
The kinship coefficient between two subjects is the probability that a randomly selected allele from a locus will be IBD between them. It is 0 between unrelated individuals. For an autosomal site and no inbreeding it will be 0.5 for an individual with themselves, 0.25 between mother and child, 0.125 between an uncle and niece, etc.
The computation is based on a recursive algorithm described in Lange, which assumes that the founder alleles are all independent.
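To make the recursion concrete, here is a minimal base R sketch for the autosomal case. It is not the package's implementation: simple_kinship() is a hypothetical helper that works on a dense matrix, ignores twins, and assumes that every subject has either both parents listed in id or none. Each child's kinship with an already processed subject is the average of its parents' kinships with that subject, and self kinship is 0.5 * (1 + kinship between the parents).

```r
# simple_kinship() is an illustrative helper, not part of Pedixplorer.
simple_kinship <- function(id, dadid, momid) {
    n <- length(id)
    kin <- matrix(0, n, n, dimnames = list(id, id))
    dad <- match(dadid, id)  # NA for founders
    mom <- match(momid, id)
    done <- logical(n)
    while (!all(done)) {
        # Subjects whose parents, if any, have already been processed
        ready <- which(
            !done & (is.na(dad) | done[dad]) & (is.na(mom) | done[mom])
        )
        if (length(ready) == 0) stop("Loop in the pedigree")
        for (i in ready) {
            prev <- which(done)
            if (is.na(dad[i])) {
                # Founder: alleles assumed independent of everyone else
                kin[i, i] <- 0.5
            } else {
                k <- 0.5 * (kin[dad[i], prev] + kin[mom[i], prev])
                kin[i, prev] <- k
                kin[prev, i] <- k
                kin[i, i] <- 0.5 * (1 + kin[dad[i], mom[i]])
            }
            done[i] <- TRUE
        }
    }
    kin
}

kin <- simple_kinship(
    c("A", "B", "C", "D", "E"),
    c("C", "D", "0", "0", "0"),
    c("E", "E", "0", "0", "0")
)
kin["A", "E"]  # parent and child
kin["A", "B"]  # half siblings sharing the mother E
```

In this small family A and B share only their mother E, so their kinship is half of the parent and child value.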
A matrix of kinship coefficients.
A matrix of kinship coefficients ordered by families present in the Pedigree object.
K Lange, Mathematical and Statistical Methods for Genetic Analysis, Springer-Verlag, New York, 1997.
kinship(
    c("A", "B", "C", "D", "E"),
    c("C", "D", "0", "0", "0"),
    c("E", "E", "0", "0", "0"),
    sex = c(1, 2, 1, 2, 1)
)
kinship(
    c("A", "B", "C", "D", "E"),
    c("C", "D", "0", "0", "0"),
    c("E", "E", "0", "0", "0"),
    sex = c(1, 2, 1, 2, 1),
    chrtype = "x"
)
data(sampleped)
ped <- Pedigree(sampleped)
kinship(ped)
Construct a family identifier from pedigree information
## S4 method for signature 'character'
make_famid(obj, dadid, momid)

## S4 method for signature 'Pedigree'
make_famid(obj)
obj |
A character vector with the id of the individuals or a
|
dadid |
A vector containing, for each subject, the identifier of the biological father. |
momid |
A vector containing, for each subject, the identifier of the biological mother. |
Create a vector of length n, giving the family 'tree' number of each subject. If the Pedigree is totally connected, then everyone will end up in tree 1, otherwise the tree numbers represent the disconnected subfamilies. Singleton subjects give a zero for family number.
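The grouping amounts to finding the connected components of the child and parent links. The base R sketch below shows one way to do this by label propagation; simple_famid() is a hypothetical helper and its numbering of the trees is an assumption, so it may not match the numbers returned by make_famid().

```r
# simple_famid() is an illustrative helper, not part of Pedixplorer.
simple_famid <- function(id, dadid, momid) {
    n <- length(id)
    dad <- match(dadid, id)  # NA when the father is not in `id`
    mom <- match(momid, id)
    label <- seq_len(n)
    repeat {
        old <- label
        for (i in seq_len(n)) {
            # A child and its known parents must share the same label
            grp <- c(i, dad[i], mom[i])
            grp <- grp[!is.na(grp)]
            label[grp] <- min(label[grp])
        }
        if (identical(label, old)) break
    }
    # Singletons get 0, the remaining trees are numbered 1, 2, ...
    size <- ave(label, label, FUN = length)
    fam <- match(label, sort(unique(label[size > 1])))
    fam[is.na(fam)] <- 0L
    fam
}

simple_famid(
    c("A", "B", "C", "D", "E", "F"),
    c("C", "D", "0", "0", "0", "0"),
    c("E", "E", "0", "0", "0", "0")
)
```

Subjects A to E are linked through the shared mother E and end up in tree 1, while F has no relatives and receives 0.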
An integer vector giving family groupings
An updated Pedigree object with the family id added and with all ids updated
make_famid(
    c("A", "B", "C", "D", "E", "F"),
    c("C", "D", "0", "0", "0", "0"),
    c("E", "E", "0", "0", "0", "0")
)
data(sampleped)
ped1 <- Pedigree(sampleped[,-1])
make_famid(ped1)
Compute the minimum distance between the informative individuals and all the others. This distance is a transformation of the maximum kinship degree between the informative individuals and all the others. This transformation is done by taking the log2 of the inverse of the maximum kinship degree.
Therefore, the minimum distance is 0 when the maximum kinship is 1 and is infinite when the maximum kinship is 0. For siblings, the kinship value is 0.5 and the minimum distance is 1. Each time the kinship degree is divided by 2, the minimum distance is increased by 1.
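The transformation itself is a one-liner. The sketch below applies it to a few made-up maximum kinship values (max_kin is an illustrative vector, not something computed from a pedigree) to show how each halving adds one unit of distance.

```r
# Illustrative maximum kinship values between a subject and the
# informative individuals
max_kin <- c(1, 0.5, 0.25, 0.125, 0)

# log2 of the inverse of the maximum kinship
min_dist <- log2(1 / max_kin)
min_dist
```

A maximum kinship of 0 gives an infinite distance, which marks subjects unrelated to any informative individual.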
## S4 method for signature 'character'
min_dist_inf(obj, dadid, momid, sex, id_inf)

## S4 method for signature 'Pedigree'
min_dist_inf(obj, reset = FALSE, ...)

## S4 method for signature 'Ped'
min_dist_inf(obj, reset = FALSE)
obj |
A character vector with the id of the individuals or a
|
... |
Additional arguments |
dadid |
A vector containing, for each subject, the identifier of the biological father. |
momid |
A vector containing, for each subject, the identifier of the biological mother. |
sex |
A character, factor or numeric vector corresponding to
the gender of the individuals. This will be transformed to an ordered factor
with the following levels:
|
id_inf |
An identifiers vector of informative individuals. |
reset |
If TRUE, the |
A vector of the minimum distance between the informative individuals and all the others, in the order of the individuals in the obj vector.
The Pedigree object with a new slot named 'kin' containing the minimum distance between each individual and the informative individuals. The isinf slot is also updated with the informative individuals.
min_dist_inf(
    c("A", "B", "C", "D", "E"),
    c("C", "D", "0", "0", "0"),
    c("E", "E", "0", "0", "0"),
    sex = c(1, 2, 1, 2, 1),
    id_inf = c("D", "E")
)
data(sampleped)
ped <- is_informative(
    Pedigree(sampleped),
    informative = "AvAf", col_aff = "affection_mods"
)
kin(ped(min_dist_inf(ped, col_aff = "affection_mods")))
Data from the Minnesota Breast Cancer Family Study. This contains extended pedigrees from 426 families, each identified by a single proband in 1945-1952, with follow up for incident breast cancer.
data(minnbreast)
A data frame with 28081 observations, one line per subject, on the following 14 variables.
id
: Subject identifier
proband
: If 1, this subject is one of the original
426 probands
fatherid
: Identifier of the father, if the father is part of
the data set; zero otherwise
motherid
: Identifier of the mother, if the mother is part of
the data set; zero otherwise
famid
: Family identifier
endage
: Age at last follow-up or incident cancer
cancer
: 1 = breast cancer (females) or prostate cancer (males), 0 = censored
yob
: Year of birth
education
: Amount of education: 1-8 years, 9-12 years, high
school graduate, vocational education beyond high school,
some college but did not graduate, college graduate,
post-graduate education, refused to answer on the questionnaire
marstat
: Marital status: married, living with someone in a
marriage-like relationship, separated
or divorced, widowed, never married, refused to answer the questionnaire
everpreg
: Ever pregnant at the time of baseline survey
parity
: Number of births
nbreast
: Number of breast biopsies
sex
: M or F
bcpc
: Part of one of the families in the breast / prostate cancer substudy: 0 = no, 1 = yes. Note that subjects who were recruited to the overall study after the date of the BP substudy are coded as zero.
The original study was conducted by Dr. Elving Anderson at the Dight Institute for Human Genetics at the University of Minnesota. From 1944 to 1952, 544 sequential breast cancer cases seen at the University Hospital were enrolled, and information was gathered on parents, siblings, offspring, aunts / uncles, and grandparents with the goal of understanding possible familial aspects of breast cancer. In 1991 the study was resurrected by Dr Tom Sellers.
Of the original 544 he excluded 58 prevalent cases, along with another 19 who had fewer than 2 living relatives at the time of Dr Anderson's survey. Of the remaining 462 families 10 had no living members, 23 could not be located and 8 refused, leaving 426 families on whom updated pedigrees were obtained.
This gave a study with 13351 males and 12699 females (5183 marry-ins). Primary questions were the relationship of early life exposures, breast density, and pharmacogenomics on incident breast cancer risk. For a subset of the families data was gathered on prostate cancer risk for male subjects via questionnaires sent to men over 40. Other than this, data items other than parentage are limited to the female subjects. In 2003 a second phase of the study was instituted. The pedigrees were further extended to the numbers found in this data set, and further data gathered by questionnaire.
Epidemiologic and genetic follow-up study of 544 Minnesota breast cancer families: design and methods. Sellers TA, Anderson VE, Potter JD, Bartow SA, Chen PL, Everson L, King RA, Kuni CC, Kushi LH, McGovern PG, et al. Genetic Epidemiology, 1995; 12(4):417-29.
Evaluation of familial clustering of breast and prostate cancer in the Minnesota Breast Cancer Family Study. Grabrick DM, Cerhan JR, Vierkant RA, Therneau TM, Cheville JC, Tindall DJ, Sellers TA. Cancer Detect Prev. 2003; 27(1):30-6.
Risk of breast cancer with oral contraceptive use in women with a family history of breast cancer. Grabrick DM, Hartmann LC, Cerhan JR, Vierkant RA, Therneau TM, Vachon CM, Olson JE, Couch FJ, Anderson KE, Pankratz VS, Sellers TA. JAMA. 2000; 284(14):1791-8.
data(minnbreast)
breastped <- Pedigree(minnbreast,
    cols_ren_ped = list(
        "indId" = "id", "fatherId" = "fatherid",
        "motherId" = "motherid", "gender" = "sex", "family" = "famid"
    ),
    missid = "0", col_aff = "cancer"
)
summary(breastped)
scales(breastped)
# plot family 8, proband is solid, slash for cancers
if (interactive()) {
    plot(breastped[famid(ped(breastped)) == "8"], aff_mark = TRUE)
}
Normalise dataframe for a Ped object
norm_ped(
    ped_df,
    na_strings = c("NA", ""),
    missid = NA_character_,
    try_num = FALSE,
    cols_used_del = FALSE
)
ped_df |
A data.frame with the individuals' information. The minimum columns required are:
The The following columns are also recognized and will be transformed with the
The values recognized for those columns are |
na_strings |
Vector of strings to be considered as NA values. |
missid |
A character vector with the missing values identifiers.
All the id, dadid and momid corresponding to those values will be set
to |
try_num |
Boolean defining if the function should try to convert all the columns to numeric. |
cols_used_del |
Boolean defining if the columns that will be used should be deleted. |
Normalise a dataframe and check for column correspondence so that it can be used as an input to create a Ped object.
Multiple tests are done and errors are checked.
Sex is calculated based on the gender column.
The steril column needs to be a boolean: either TRUE, FALSE or 'NA'.
Any individual with no 'NA' values in the available column will be considered available.
Duplicated indId will nullify the relationship of the individual.
All individuals with errors will be removed from the dataframe and transferred to the error dataframe.
A number of checks are done to ensure the dataframe is correct:
All ids (id, dadid, momid, famid) are not empty (!= "")
All id are unique (no duplicates)
All dadid and momid are unique in the id column (no duplicates)
id is not the same as dadid or momid
Either have both parents or none
All sex codes are either male, female, terminated or unknown
No parents are steril
All fathers are male
All mothers are female
A dataframe with the different variables correctly standardized and with the errors identified in the error column
df <- data.frame(
    indId = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
    fatherId = c("A", 0, 1, 3, 0, 4, 1, 0, 6, 6),
    motherId = c(0, 0, 2, 2, 0, 5, 2, 0, 8, 8),
    gender = c(1, 2, "m", "man", "f", "male", "m", "m", "f", "f"),
    available = c("A", "1", 0, NA, 1, 0, 1, 0, 1, 0),
    famid = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2),
    sterilisation = c("TRUE", "FALSE", TRUE, FALSE, 1, 0, 1, 0, 1, "TRUE"),
    vitalStatus = c("TRUE", "FALSE", TRUE, FALSE, 1, 0, 1, 0, 1, 0),
    affection = c("TRUE", "FALSE", TRUE, FALSE, 1, 0, 1, 0, 1, 0)
)
tryCatch(
    norm_ped(df),
    error = function(e) print(e)
)
Normalise a dataframe and check for column correspondence so that it can be used as an input to create a Ped object.
norm_rel(rel_df, na_strings = c("NA", ""), missid = NA_character_)
rel_df |
A data.frame with the special relationships between
individuals. See
The value relation code recognized by the function are the one defined
by the |
na_strings |
Vector of strings to be considered as NA values. |
missid |
A character vector with the missing values identifiers.
All the id, dadid and momid corresponding to those values will be set
to |
The famid column, if provided, will be merged to the ids field separated by an underscore using the upd_famid() function.
The code column will be transformed with the rel_code_to_factor() function.
Multiple tests are done and errors are checked.
A number of checks are done to ensure the dataframe is correct:
All ids (id1, id2) are not empty (!= "")
id1 and id2 are not the same
All codes are recognised as either "MZ twin", "DZ twin", "UZ twin" or "Spouse"
A dataframe with the errors identified
df <- data.frame(
    id1 = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
    id2 = c(2, 3, 4, 5, 6, 7, 8, 9, 10, 1),
    code = c("MZ twin", "DZ twin", "UZ twin", "Spouse",
        1, 2, 3, 4, "MzTwin", "sp oUse"),
    famid = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2)
)
norm_rel(df)
Compute the number of children per individual
## S4 method for signature 'character_OR_integer'
num_child(obj, dadid, momid, rel_df = NULL, missid = NA_character_)

## S4 method for signature 'Pedigree'
num_child(obj, reset = FALSE)
obj |
A character vector with the id of the individuals or a
|
dadid |
A vector containing, for each subject, the identifier of the biological father. |
momid |
A vector containing, for each subject, the identifier of the biological mother. |
rel_df |
A data.frame with the special relationships between
individuals. See
The value relation code recognized by the function are the one defined
by the |
missid |
A character vector with the missing values identifiers.
All the id, dadid and momid corresponding to those values will be set
to |
reset |
If TRUE, the |
Compute the number of direct children as well as the number of indirect children, that is, the children of the linked spouses. If a relationship dataframe is given, the indirect children are still added even when two spouses have no children together.
A dataframe with the columns num_child_dir, num_child_ind and num_child_tot giving respectively the direct, indirect and total number of children.
An updated Pedigree object with the columns num_child_dir, num_child_ind and num_child_tot added to the Pedigree ped slot.
num_child(
    obj = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"),
    dadid = c("3", "3", "6", "8", "0", "0", "0", "0", "0", "0"),
    momid = c("4", "5", "7", "9", "0", "0", "0", "0", "0", "0"),
    rel_df = data.frame(
        id1 = "10",
        id2 = "3",
        code = "Spouse"
    )
)
data(sampleped)
ped1 <- Pedigree(sampleped[sampleped$famid == "1",])
ped1 <- num_child(ped1, reset = TRUE)
summary(ped(ped1))
Get the parents of individuals.
## S4 method for signature 'character_OR_integer'
parent_of(obj, dadid, momid, id2)

## S4 method for signature 'Ped'
parent_of(obj, id2)

## S4 method for signature 'Pedigree'
parent_of(obj, id2)
obj |
A character vector with the id of the individuals or a
|
dadid |
A vector containing, for each subject, the identifier of the biological father. |
momid |
A vector containing, for each subject, the identifier of the biological mother. |
id2 |
A vector of individuals identifiers to get the parents from |
A vector of individuals identifiers corresponding to the parents of the individuals in id2
data(sampleped)
ped <- Pedigree(sampleped)
parent_of(ped, "1_121")
This function creates a shiny application to manage and visualize pedigree data using the ped_ui() and ped_server() functions.
ped_shiny(
    port = getOption("shiny.port"),
    host = getOption("shiny.host", "127.0.0.1"),
    precision = 2
)
port |
(optional) Specify the port the application should listen to. |
host |
(optional) The IPv4 address that the application should listen on. |
precision |
Number of decimals for the position of the boxes in the plot. |
The application is composed of several modules:
Data import
Data column selection
Data download
Family selection
Health selection
Informative selection
Subfamily selection
Plotting pedigree
Family information
Running Shiny Application
if (interactive()) { ped_shiny() }
Convert a Pedigree to a legend data frame for it to be plotted afterwards with plot_fromdf().
## S4 method for signature 'Pedigree'
ped_to_legdf(
    obj,
    boxh = 1,
    boxw = 1,
    cex = 1,
    adjx = 0,
    adjy = 0,
    lwd = par("lwd")
)
obj |
A Pedigree object |
boxh |
Height of the polygons elements |
boxw |
Width of the polygons elements |
cex |
Character expansion of the text |
adjx |
default=0. Controls the horizontal text adjustment of the labels in the legend. |
adjy |
default=0. Controls the vertical text adjustment of the labels in the legend. |
lwd |
default=par("lwd"). Controls the bordering line width of the elements in the legend. |
The data frame contains the following columns:
x0, y0, x1, y1
: coordinates of the elements
type
: type of the elements
fill
: fill color of the elements
border
: border color of the elements
angle
: angle of the shading of the elements
density
: density of the shading of the elements
cex
: size of the elements
label
: label of the elements
tips
: tips of the elements (used for the tooltips)
adjx
: horizontal text adjustment of the labels
adjy
: vertical text adjustment of the labels
All those columns are used by
plot_fromdf()
to plot the graph.
A list containing the legend data frame and the user coordinates.
data("sampleped")
ped <- Pedigree(sampleped)
leg_df <- ped_to_legdf(ped)
summary(leg_df$df)
plot_fromdf(leg_df$df, usr = c(-1, 15, 0, 7))
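As a small, hedged illustration of the returned structure, the sketch below inspects the legend data frame using only the columns documented above. The name of the element holding the user coordinates is not stated here, so it is looked up with names() rather than assumed.

```r
library(Pedixplorer)

data("sampleped")
ped <- Pedigree(sampleped)

# Build the legend data frame with slightly smaller text.
leg <- ped_to_legdf(ped, boxh = 1, boxw = 1, cex = 0.8)

# The data frame of graphical elements is accessed as leg$df in the
# example above; names(leg) lists what else is returned.
names(leg)

# Count the legend elements by type and look at a few documented columns.
table(leg$df$type)
head(leg$df[, c("type", "label", "fill", "border")])
```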
Convert a Pedigree to a data frame containing all the elements and their
characteristics so that they can be plotted afterwards with plot_fromdf().
## S4 method for signature 'Pedigree'
ped_to_plotdf(
    obj,
    packed = TRUE,
    width = 6,
    align = c(1.5, 2),
    align_parents = TRUE,
    force = FALSE,
    cex = 1,
    symbolsize = cex,
    pconnect = 0.5,
    branch = 0.6,
    aff_mark = TRUE,
    id_lab = "id",
    label = NULL,
    precision = 3,
    lwd = par("lwd"),
    tips = NULL,
    ...
)
obj |
A Pedigree object |
... |
Other arguments passed to |
packed |
Whether the Pedigree should be compressed (i.e. allow diagonal lines connecting parents to children in order to have a smaller overall width for the plot). |
width |
For a packed output, the minimum width of the plot, in inches. |
align |
For a packed Pedigree, align children under parents |
align_parents |
If |
force |
If |
cex |
Character expansion of the text |
symbolsize |
Size of the symbols |
pconnect |
When connecting parent to children the program will try to
make the connecting line as close to vertical as possible, subject to it
lying inside the endpoints of the line that connects the children by at
least |
branch |
Defines how much angle is used to connect various levels of nuclear families. |
aff_mark |
If |
id_lab |
The column name of the id of each individual. |
label |
If not |
precision |
The number of decimal places to round the solution to. |
lwd |
default=par("lwd"). Controls the line width of the segments, arcs and polygons. |
tips |
A character vector of the column names of the data frame to
use as tooltips. If |
The data frame contains the following columns:
x0, y0, x1, y1: coordinates of the elements
type: type of the elements
fill: fill color of the elements
border: border color of the elements
angle: angle of the shading of the elements
density: density of the shading of the elements
cex: size of the elements
label: label of the elements
tips: tips of the elements (used for the tooltips)
adjx: horizontal text adjustment of the labels
adjy: vertical text adjustment of the labels
All of these columns are used by plot_fromdf() to plot the graph.
A list containing the data frame and the user coordinates.
data(sampleped)
ped1 <- Pedigree(sampleped[sampleped$famid == 1, ])
plot_df <- ped_to_plotdf(ped1)
summary(plot_df$df)
plot_fromdf(
    plot_df$df,
    usr = plot_df$par_usr$usr,
    boxh = plot_df$par_usr$boxh,
    boxw = plot_df$par_usr$boxw
)
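To see what the packed argument changes, the following sketch builds the plotting data frame twice and compares the horizontal extent of the elements. It relies only on the documented x0 column and on the par_usr element used in the example above; the exact values depend on the layout algorithm and are not shown.

```r
library(Pedixplorer)

data(sampleped)
ped1 <- Pedigree(sampleped[sampleped$famid == 1, ])

# Same pedigree, with and without the compact ("packed") layout.
packed_df <- ped_to_plotdf(ped1, packed = TRUE)
loose_df <- ped_to_plotdf(ped1, packed = FALSE)

# Compare the horizontal extent of the elements in both layouts.
range(packed_df$df$x0, na.rm = TRUE)
range(loose_df$df$x0, na.rm = TRUE)

# par_usr holds the user coordinates and box sizes used for plotting.
str(packed_df$par_usr)
```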
S4 class to represent the identity information of the individuals in a pedigree.
You need to provide either a vector of the same size for each slot
or a data.frame with the corresponding columns.
The metadata will consist of the columns that do not correspond to the Ped slots.
## S4 method for signature 'data.frame'
Ped(obj, cols_used_init = FALSE, cols_used_del = FALSE)

## S4 method for signature 'character_OR_integer'
Ped(
    obj,
    sex,
    dadid,
    momid,
    famid = NA,
    steril = NA,
    status = NA,
    avail = NA,
    affected = NA,
    missid = NA_character_,
    useful = NA,
    isinf = NA,
    kin = NA_real_
)
obj |
A character vector with the id of the individuals or a
|
cols_used_init |
Boolean defining if the columns that will be used should be initialised to NA. |
cols_used_del |
Boolean defining if the columns that will be used should be deleted. |
sex |
A character, factor or numeric vector corresponding to
the gender of the individuals. This will be transformed to an ordered factor
with the following levels:
|
dadid |
A vector containing, for each subject, the identifier of the biological father. |
momid |
A vector containing, for each subject, the identifier of the biological mother. |
famid |
A character vector with the family identifiers of the individuals. If provided, it will be combined with the individual identifiers, separated by an underscore. |
steril |
A logical vector with the sterilisation status of the
individuals
(i.e. |
status |
A logical vector with the death status of the
individuals
(i.e. |
avail |
A logical vector with the availability status of the
individuals
(i.e. |
affected |
A logical vector with the affection status of the
individuals
(i.e. |
missid |
A character vector with the missing values identifiers.
All the id, dadid and momid corresponding to those values will be set
to |
useful |
A logical vector with the usefulness status of the
individuals (i.e. |
isinf |
A logical vector indicating if the individual is informative
or not (i.e. |
kin |
A numeric vector with minimal kinship value between the individuals and the informative individuals. |
The minimal information needed is id, dadid, momid and sex.
The other slots are used to store recognized information.
Additional columns can be added to the Ped object and will be
stored in the elementMetadata slot of the Ped object.
A Ped object.
id: A character vector with the id of the individuals.
dadid: A character vector with the id of the father of the individuals.
momid: A character vector with the id of the mother of the individuals.
sex: An ordered factor vector for the sex of the individuals (i.e. male < female < unknown < terminated).
famid: A character vector with the family identifiers of the individuals (optional).
steril: A logical vector with the sterilisation status of the individuals (i.e. FALSE = not sterilised, TRUE = sterilised, NA = unknown).
status: A logical vector with the death status of the individuals (i.e. FALSE = alive, TRUE = dead, NA = unknown).
avail: A logical vector with the availability status of the individuals (i.e. FALSE = not available, TRUE = available, NA = unknown).
affected: A logical vector with the affection status of the individuals (i.e. FALSE = not affected, TRUE = affected, NA = unknown).
useful: A logical vector with the usefulness status of the individuals (i.e. FALSE = not useful, TRUE = useful).
isinf: A logical vector indicating if the individual is informative or not (i.e. FALSE = not informative, TRUE = informative).
kin: A numeric vector with the minimal kinship value between the individuals and the informative individuals.
num_child_tot: A numeric vector with the total number of children of the individuals.
num_child_dir: A numeric vector with the number of direct children of the individuals.
num_child_ind: A numeric vector with the number of indirect children of the individuals.
elementMetadata: A DataFrame with the additional metadata columns of the Ped object.
metadata: Meta information about the pedigree.
For all the following accessors, the x parameter is a Ped object.
Each getter returns a vector of the same length as x with the values
of the corresponding slot. Each getter has a setter with the
same name, to be used as slot(x) <- value.
The value parameter is a vector of the same length as x,
except for the mcols() accessor, where value is a list
or a data.frame whose elements each have the same length as x.
id(x): Individuals' identifiers
dadid(x): Individuals' father identifiers
momid(x): Individuals' mother identifiers
famid(x): Individuals' family identifiers
sex(x): Individuals' gender
affected(x): Individuals' affection status
avail(x): Individuals' availability status
status(x): Individuals' death status
isinf(x): Individuals' informativeness status
kin(x): Individuals' kinship distance to the informative individuals
useful(x): Individuals' usefulness status
mcols(x): Individuals' metadata
summary(x): Compute the summary of a Ped object
show(x): Convert the Ped object to a data.frame and print it with its summary.
as.list(x): Convert a Ped object to a list with the metadata columns at the end.
as.data.frame(x): Convert a Ped object to a data.frame with the metadata columns at the end.
subset(x, i, del_parents = FALSE, keep = TRUE): Subset a Ped object based on the individuals' identifiers given.
    i: A vector of individuals' identifiers to keep.
    del_parents: A value indicating if the parents of the individuals should be deleted.
    keep: A logical value indicating if the individuals should be kept or deleted.
data(sampleped)
Ped(sampleped)
Ped(
    obj = c("1", "2", "3", "4", "5", "6"),
    dadid = c("4", "4", "6", "0", "0", "0"),
    momid = c("5", "5", "5", "0", "0", "0"),
    sex = c(1, 2, 3, 1, 2, 1),
    missid = "0"
)
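The getter and setter convention described above can be sketched as follows. The availability values assigned here are made up for illustration; only accessors listed in this section are used.

```r
library(Pedixplorer)

ped <- Ped(
    obj = c("1", "2", "3", "4", "5", "6"),
    dadid = c("4", "4", "6", "0", "0", "0"),
    momid = c("5", "5", "5", "0", "0", "0"),
    sex = c(1, 2, 3, 1, 2, 1),
    missid = "0"
)

# Getters return one value per individual.
id(ped)
sex(ped)

# Setters take a vector of the same length as the object
# (illustrative values, NA = unknown).
avail(ped) <- c(TRUE, TRUE, FALSE, TRUE, FALSE, NA)
avail(ped)

# Flatten the object to a data.frame to see all slots at once.
as.data.frame(ped)
```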
A pedigree is a set of individuals linked to each other in a family tree. A Pedigree object stores the information about the individuals and the special relationships between them. It can also store the information needed to plot the pedigree (i.e. scales and hints).
Main constructor of the package.
This constructor helps to create a Pedigree object from
different data.frames or from a set of vectors.
If any errors are found in the data, the function will return the data.frame with the errors of the Ped object and the Rel object.
Pedigree(obj, ...)

## S4 method for signature 'character_OR_integer'
Pedigree(
    obj,
    dadid,
    momid,
    sex,
    famid = NA,
    avail = NULL,
    affected = NULL,
    status = NULL,
    steril = NULL,
    rel_df = NULL,
    missid = NA_character_,
    col_aff = "affection",
    normalize = TRUE,
    ...
)

## S4 method for signature 'data.frame'
Pedigree(
    obj = data.frame(
        indId = character(), fatherId = character(), motherId = character(),
        gender = numeric(), family = character(), available = numeric(),
        vitalStatus = numeric(), affection = numeric(),
        sterilisation = numeric()
    ),
    rel_df = data.frame(
        id1 = character(), id2 = character(), code = numeric(),
        famid = character()
    ),
    cols_ren_ped = list(
        indId = "id", fatherId = "dadid", motherId = "momid",
        family = "famid", gender = "sex", sterilisation = "steril",
        affection = "affected", available = "avail", vitalStatus = "status"
    ),
    cols_ren_rel = list(id1 = "indId1", id2 = "indId2", famid = "family"),
    hints = list(horder = NULL, spouse = NULL),
    normalize = TRUE,
    missid = NA_character_,
    col_aff = "affection",
    na_strings = c("NA", "N/A", "None", "none", "null", "NULL"),
    ...
)
obj |
A vector of the individuals' identifiers or a data.frame
with the individuals' information.
See |
... |
Arguments passed on to |
dadid |
A vector containing, for each subject, the identifier of the biological father. |
momid |
A vector containing, for each subject, the identifier of the biological mother. |
sex |
A character, factor or numeric vector corresponding to
the gender of the individuals. This will be transformed to an ordered factor
with the following levels:
|
famid |
A character vector with the family identifiers of the individuals. If provided, it will be combined with the individual identifiers, separated by an underscore. |
avail |
A logical vector with the availability status of the
individuals
(i.e. |
affected |
A logical vector with the affection status of the
individuals
(i.e. |
status |
A logical vector with the death status of the
individuals
(i.e. |
steril |
A logical vector with the sterilisation status of the
individuals
(i.e. |
rel_df |
A data.frame with the special relationships between
individuals. See
The relation codes recognized by the function are the ones defined
by the |
missid |
A character vector with the missing values identifiers.
All the id, dadid and momid corresponding to those values will be set
to |
col_aff |
A character vector with the name of the column to be used for the affection status. |
normalize |
A logical indicating whether the data should be normalised. |
cols_ren_ped |
A named list with the columns to rename for the
pedigree dataframe. This is useful if you want to use a dataframe with
different column names. The names of the list should be the new column
names and the values should be the old column names. The default values
are to be used with |
cols_ren_rel |
A named list with the columns to rename for the relationship matrix. This is useful if you want to use a dataframe with different column names. The names of the list should be the new column names and the values should be the old column names. |
hints |
A Hints object or a named list containing |
na_strings |
Vector of strings to be considered as NA values. |
If normalize is set to TRUE, the data will be
standardized using the functions norm_ped() and norm_rel().
If a data.frame is given, the required column names depend on whether normalization is selected. If it is, the required column names are as follows; if not, the required names are those given in parentheses:
indId: the individual identifier (id)
fatherId: the identifier of the biological father (dadid)
motherId: the identifier of the biological mother (momid)
gender: the sex of the individual (sex)
family: the family identifier of the individual (famid)
sterilisation: the sterilisation status of the individual (steril)
available: the availability status of the individual (avail)
vitalStatus: the death status of the individual (status)
affection: the affection status of the individual (affected)
...: other columns that will be stored in the elementMetadata slot
The minimum columns required are:
indId / id
fatherId / dadid
motherId / momid
gender / sex
The family / famid column can also be used to specify the
family of the individuals and will be merged into the
indId / id field, separated by an underscore.
The columns sterilisation, available, vitalStatus and affection
will be transformed with the vect_to_binary()
function when the normalisation is selected.
If you do not use the normalisation, the columns will be checked to
be 0 or 1.
If affected is a data.frame, col_aff will be
overwritten by the column names of the data.frame.
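The renaming mechanism can be sketched with a toy data frame whose identifier columns do not use the expected names. The data frame, its column names (person, father, mother) and its values are hypothetical. Following the description of cols_ren_ped, the list names are taken to be the names expected by the normalisation and the list values the names present in the input; this reading is an assumption and should be checked against the package.

```r
library(Pedixplorer)

# A toy data frame; all column names and values are made up.
df <- data.frame(
    person = c("1", "2", "3", "4"),
    father = c("3", "3", "0", "0"),
    mother = c("4", "4", "0", "0"),
    gender = c(1, 2, 1, 2),
    affection = c(0, 1, 0, 0),
    available = c(1, 1, 0, 1)
)

# Map the expected names (list names) to the names present in df
# (list values); "0" marks a missing parent.
pedi <- Pedigree(
    df,
    cols_ren_ped = list(
        indId = "person", fatherId = "father", motherId = "mother"
    ),
    missid = "0"
)
summary(pedi)
```

The gender, affection and available columns already carry the names expected when normalisation is selected, so they need no entry in the list.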
A Pedigree object.
ped: A Ped object for the identity information. See Ped() for more information.
rel: A Rel object for the special relationships. See Rel() for more information.
scales: A Scales object for the filling and bordering colors used in the plot. See Scales() for more information.
hints: A Hints object for the ordering of the individuals in the plot. See Hints() for more information.
ped(x, slot): Get the value of a specific slot of the Ped object
ped(x): Get the Ped object
ped(x, slot) <- value: Set the value of a specific slot of the Ped object. Wrapper of slot(ped(x)) <- value
ped(x) <- value: Set the Ped object
mcols(x): Get the metadata of a Pedigree object. This function is a wrapper around mcols(ped(x)).
mcols(x) <- value: Set the metadata of a Pedigree object. This function is a wrapper around mcols(ped(x)) <- value.
rel(x, slot): Get the value of a specific slot of the Rel object
rel(x): Get the Rel object
rel(x, slot) <- value: Set the value of a specific slot of the Rel object. Wrapper of slot(rel(x)) <- value
rel(x) <- value: Set the Rel object
scales(x): Get the Scales object
scales(x) <- value: Set the Scales object
fill(x): Get the fill data.frame from the Scales object. Wrapper of fill(scales(x))
fill(x) <- value: Set the fill data.frame from the Scales object. Wrapper of fill(scales(x)) <- value
border(x): Get the border data.frame from the Scales object. Wrapper of border(scales(x))
border(x) <- value: Set the border data.frame from the Scales object. Wrapper of border(scales(x)) <- value
hints(x): Get the Hints object
hints(x) <- value: Set the Hints object
horder(x): Get the horder vector from the Hints object. Wrapper of horder(hints(x))
horder(x) <- value: Set the horder vector from the Hints object. Wrapper of horder(hints(x)) <- value
spouse(x): Get the spouse data.frame from the Hints object. Wrapper of spouse(hints(x)).
spouse(x) <- value: Set the spouse data.frame from the Hints object. Wrapper of spouse(hints(x)) <- value.
length(x): Get the length of a Pedigree object. Wrapper of length(ped(x)).
show(x): Print the information of the Ped and Rel object inside the Pedigree object.
summary(x): Compute the summary of the Ped and Rel object inside the Pedigree object.
as.list(x): Convert a Pedigree object to a list
subset(x, i, keep = TRUE): Subset a Pedigree object based on the individuals' identifiers given.
    i: A vector of individuals' identifiers to keep.
    del_parents: A logical value indicating if the parents of the individuals should be deleted.
    keep: A logical value indicating if the individuals should be kept or deleted.
x[i, del_parents, keep]: Subset a Pedigree object based on the individuals' identifiers given.
Pedigree()
Ped()
Rel()
Scales()
Hints()
Pedigree(
    obj = c("1", "2", "3", "4", "5", "6"),
    dadid = c("4", "4", "6", "0", "0", "0"),
    momid = c("5", "5", "5", "0", "0", "0"),
    sex = c(1, 2, 3, 1, 2, 1),
    avail = c(0, 1, 0, 1, 0, 1),
    affected = matrix(c(
        0, 1, 0, 1, 0, 1,
        1, 1, 1, 1, 1, 1
    ), ncol = 2),
    col_aff = c("aff1", "aff2"),
    missid = "0",
    rel_df = matrix(c(
        "1", "2", 2
    ), ncol = 3, byrow = TRUE)
)

data(sampleped)
Pedigree(sampleped)
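A few of the accessors listed above can be combined as in the sketch below. It assumes that the family identifiers in sampleped are stored as "1" and "2" once the Pedigree is built, and that selecting a whole family keeps every parent inside the subset; both are assumptions to verify on your own data.

```r
library(Pedixplorer)

data(sampleped)
pedi <- Pedigree(sampleped)

# Number of individuals, and a slot of the underlying Ped object.
length(pedi)
head(ped(pedi, "id"))

# Scales used for plotting: fill (affection) and border (availability).
fill(pedi)
border(pedi)

# Keep only the individuals of the second family, selected by identifier.
ids <- ped(pedi, "id")[ped(pedi, "famid") == "2"]
sub <- subset(pedi, ids)
length(sub)
```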
This function is used to create a plot from a data.frame.
If ggplot_gen = TRUE, the plot will be generated with ggplot2 and
will be returned invisibly.
plot_fromdf(
    df,
    usr = NULL,
    title = NULL,
    ggplot_gen = FALSE,
    boxw = 1,
    boxh = 1,
    add_to_existing = FALSE
)
df |
A data.frame with the following columns:
|
usr |
The user coordinates of the plot. |
title |
The title of the plot. |
ggplot_gen |
If TRUE add the segments to the ggplot object |
boxw |
Width of the polygon elements |
boxh |
Height of the polygon elements |
add_to_existing |
If |
An invisible ggplot object, and a plot on the current plotting device.
data(sampleped)
ped1 <- Pedigree(sampleped[sampleped$famid == 1, ])
lst <- ped_to_plotdf(ped1)
if (interactive()) {
    plot_fromdf(
        lst$df, lst$par_usr$usr,
        boxw = lst$par_usr$boxw,
        boxh = lst$par_usr$boxh
    )
}
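Because the ggplot object is returned invisibly, it has to be assigned to be reused. The sketch below shows one way to capture it; it assumes the ggplot2 package is installed, and the title string is an arbitrary example.

```r
library(Pedixplorer)

data(sampleped)
ped1 <- Pedigree(sampleped[sampleped$famid == 1, ])
lst <- ped_to_plotdf(ped1)

# With ggplot_gen = TRUE the ggplot object is returned invisibly,
# so it is assigned here in order to inspect or reuse it later.
p <- plot_fromdf(
    lst$df, usr = lst$par_usr$usr,
    boxw = lst$par_usr$boxw, boxh = lst$par_usr$boxh,
    title = "Family 1", ggplot_gen = TRUE
)
class(p)
```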
This function is used to plot a Pedigree object.
It is a wrapper for plot_fromdf() and ped_to_plotdf(), as well as
ped_to_legdf() if legend = TRUE.
## S4 method for signature 'Pedigree,missing'
plot(
    x,
    aff_mark = TRUE,
    id_lab = "id",
    label = NULL,
    ggplot_gen = FALSE,
    cex = 1,
    symbolsize = 1,
    branch = 0.6,
    packed = TRUE,
    align = c(1.5, 2),
    align_parents = TRUE,
    force = FALSE,
    width = 6,
    title = NULL,
    subreg = NULL,
    pconnect = 0.5,
    fam_to_plot = 1,
    legend = FALSE,
    leg_cex = 0.8,
    leg_symbolsize = 0.5,
    leg_loc = NULL,
    leg_adjx = 0,
    leg_adjy = 0,
    precision = 2,
    lwd = par("lwd"),
    ped_par = list(),
    leg_par = list(),
    tips = NULL
)
x |
A Pedigree object. |
aff_mark |
If |
id_lab |
The column name of the id of each individual. |
label |
If not |
ggplot_gen |
If TRUE add the segments to the ggplot object |
cex |
Character expansion of the text |
symbolsize |
Size of the symbols |
branch |
Defines how much angle is used to connect various levels of nuclear families. |
packed |
Whether the Pedigree should be compressed (i.e. allow diagonal lines connecting parents to children in order to have a smaller overall width for the plot). |
align |
For a packed Pedigree, align children under parents |
align_parents |
If |
force |
If |
width |
For a packed output, the minimum width of the plot, in inches. |
title |
The title of the plot. |
subreg |
A 4-element vector for (min x, max x, min depth, max depth),
used to edit away portions of the plot coordinates returned by
|
pconnect |
When connecting parent to children the program will try to
make the connecting line as close to vertical as possible, subject to it
lying inside the endpoints of the line that connects the children by at
least |
fam_to_plot |
default=1. If the Pedigree contains multiple families,
this parameter can be used to select which family to plot.
It can be a numeric value or a character value. If numeric, it is the
index of the family to plot returned by |
legend |
default=FALSE. If TRUE, a legend will be added to the plot. |
leg_cex |
default=0.8. Controls the size of the legend text. |
leg_symbolsize |
default=0.5. Controls the size of the legend symbols. |
leg_loc |
default=NULL. If NULL, the legend will be placed in the upper right corner of the plot. Otherwise, a 4-element vector of the form (x0, x1, y0, y1) can be used to specify the location of the legend. The legend will be fitted to the specified location and might be distorted if its aspect ratio differs from that of the specified location. |
leg_adjx |
default=0. Controls the horizontal labels adjustment of the legend. |
leg_adjy |
default=0. Controls the vertical labels adjustment of the legend. |
precision |
The number of decimal places to round the solution to. |
lwd |
default=par("lwd"). Controls the line width of the segments, arcs and polygons. |
ped_par |
default=list(). A list of parameters to use as graphical parameters for the main plot. |
leg_par |
default=list(). A list of parameters to use as graphical parameters for the legend. |
tips |
A character vector of the column names of the data frame to
use as tooltips. If |
Two important parameters control the look of the result. One is the user-specified maximum width. The smallest possible width is the maximum number of subjects on a line; if the user's suggestion is too low, it is increased to 1 + that amount (to give just a little wiggle room).
To make a Pedigree where all children are centered under their parents, simply make the width large enough; however, the symbols may get very small.
The second is align, a vector of 2 alignment parameters a and b.
For each set of siblings at a set of locations x and with parents at
p=c(p1,p2), the alignment penalty is
$$\frac{1}{k^a} \sum_{i=1}^{k} \left(x_i - \frac{p_1 + p_2}{2}\right)^2$$
where k is the number of siblings in the set.
When a = 1, moving a sibship with k sibs one unit to the
left or right of optimal will incur the same cost as moving one with
only one or two sibs out of place.
If a = 0, then large sibships are harder to move than small ones;
with the default value a = 1.5 they are slightly easier to move
than small ones. The rationale for the default is that, as long as the parents
are somewhere between the first and last siblings, the result looks fairly
good, so we are more flexible with the spacing of a large family.
By tethering all the sibs to a single spot they are kept close to each other.
The alignment penalty for spouses is $b(x_1 - x_2)^2$,
which tends to keep them together. The size of b controls the relative
importance of sib-parent and spouse-spouse closeness.
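The effect of a described above can be checked numerically. The sketch below assumes that the sibling penalty has the form sum((x - mean(p))^2) / k^a and the spouse penalty the form b * (x1 - x2)^2, as in the kinship2 documentation. The helper names sib_penalty, spouse_penalty and shift_cost are hypothetical and not part of the package.

```r
# Sibling alignment penalty, assuming the form
# sum((x - mean(p))^2) / k^a (an assumption, see the text above).
sib_penalty <- function(x, p, a = 1.5) {
    k <- length(x)
    sum((x - mean(p))^2) / k^a
}

# Spouse penalty, assuming the form b * (x1 - x2)^2.
spouse_penalty <- function(x1, x2, b = 2) {
    b * (x1 - x2)^2
}

# Extra cost of shifting a whole sibship one unit to the right,
# starting from siblings centred under parents located at 0.
shift_cost <- function(k, a) {
    x <- seq_len(k) - mean(seq_len(k))  # centred sibling positions
    sib_penalty(x + 1, p = c(0, 0), a = a) -
        sib_penalty(x, p = c(0, 0), a = a)
}

# a = 1: the cost does not depend on the sibship size.
sapply(c(1, 2, 6), shift_cost, a = 1)
# a = 0: larger sibships cost more to move.
sapply(c(1, 2, 6), shift_cost, a = 0)
# a = 1.5 (default): larger sibships are slightly cheaper to move.
sapply(c(1, 2, 6), shift_cost, a = 1.5)
```

Under these assumptions the cost of a one-unit shift works out to k^(1 - a), which matches the three cases described in the text.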
An invisible list containing:
df: the data.frame used to plot the Pedigree
par_usr: the user coordinates used to plot the Pedigree
ggplot: the ggplot object if ggplot_gen = TRUE
Creates a plot on the current plotting device.
data(sampleped)
pedAll <- Pedigree(sampleped)
if (interactive()) {
    plot(pedAll)
}
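As a hedged variation on the example above, the call below plots the second family with a legend and keeps the invisibly returned list. The title and the legend text size are arbitrary illustrative choices.

```r
library(Pedixplorer)

data(sampleped)
pedAll <- Pedigree(sampleped)

if (interactive()) {
    # Plot the second family with a legend; the list is returned
    # invisibly, so it is assigned to inspect the plotting data.
    res <- plot(
        pedAll, fam_to_plot = 2, legend = TRUE,
        leg_cex = 0.6, title = "Family 2"
    )
    names(res)
    head(res$df)
}
```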
S4 class to represent the special relationships in a Pedigree.
You need to provide either a vector of the same size for each slot
or a data.frame with the corresponding columns.
## S4 method for signature 'data.frame'
Rel(obj)

## S4 method for signature 'character_OR_integer'
Rel(obj, id2, code, famid = NA_character_)
obj |
A character vector with the id of the first individuals
of each pairs or a |
id2 |
A character vector with the id of the second individual of each pair |
code |
A character, factor or numeric vector corresponding to the relation code of the individuals:
|
famid |
A character vector with the family identifiers of the individuals. If provided, it will be combined with the individual identifiers, separated by an underscore. |
A Rel object is a list of special relationships
between individuals in the pedigree.
It is used to create a Pedigree object.
The minimal information needed is id1, id2 and code.
If a famid is provided, the individuals' ids will be combined
with the famid to ensure the uniqueness of the ids.
A Rel object.
id1: A character vector with the id of the first individual.
id2: A character vector with the id of the second individual.
code: An ordered factor vector with the code of the special relationship (i.e. MZ twin < DZ twin < UZ twin < Spouse).
famid: A character vector with the famid of the individuals.
For all the following accessors, the x parameter is a Rel object.
Each getter returns a vector of the same length as x with the values
of the corresponding slot.
code(x): Relationships' code
id1(x): Relationships' first individuals' identifier
id2(x): Relationships' second individuals' identifier
famid(x): Relationships' individuals' family identifier
famid(x) <- value: Set the relationships' individuals' family identifier
    value: A character or integer vector of the same length as x with the family identifiers
summary(x): Compute the summary of a Rel object
show(x): Convert the Rel object to a data.frame and print it with its summary.
as.list(x): Convert a Rel object to a list
as.data.frame(x): Convert a Rel object to a data.frame
subset(x, i, keep = TRUE): Subset a Rel object based on the individuals' identifiers given.
    i: A vector of individuals' identifiers to keep.
    keep: A logical value indicating if the individuals should be kept or deleted.
rel_df <- data.frame(
    id1 = c("1", "2", "3"),
    id2 = c("2", "3", "4"),
    code = c(1, 2, 3)
)
Rel(rel_df)
Rel(
    obj = c("1", "2", "3"),
    id2 = c("2", "3", "4"),
    code = c(1, 2, 3)
)
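The role of famid can be sketched with a small Rel object built from vectors. The family label "A" and the identifiers are made up; how exactly the identifiers are combined with the family label should be checked in the printed output rather than taken from this sketch.

```r
library(Pedixplorer)

# Two relationships in a hypothetical family "A":
# code 1 (monozygotic twins) and code 4 (spouses).
rel <- Rel(
    obj = c("1", "3"),
    id2 = c("2", "4"),
    code = c(1, 4),
    famid = c("A", "A")
)

# Inspect the slots through the documented accessors.
id1(rel)
id2(rel)
code(rel)
famid(rel)
as.data.frame(rel)
```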
Small set of related individuals for testing purposes.
data("relped")
The dataframe is composed of 4 columns:
id1: the first individual identifier,
id2: the second individual identifier,
code: the relationship between the two individuals,
famid: the family identifier.
The relationship codes are:
1 for Monozygotic twin
2 for Dizygotic twin
3 for Twin of unknown zygosity
4 for Spouse relationship
This is a small fictitious data set of relationships that accompanies the sampleped data set. The aim was to create a data set with a variety of relationships. There are 8 relationships of 4 different types.
data("relped")
data("sampleped")
pedi <- Pedigree(sampleped, relped)
summary(pedi)
if (interactive()) {
    plot(pedi)
}
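A quick way to check the content of relped against the description above is to tabulate its documented columns. This is only an exploratory sketch and no output is shown.

```r
library(Pedixplorer)

data("relped")

# Number of relationships of each type (codes 1 to 4, see above).
table(relped$code)

# Relationships recorded within each family.
split(relped[, c("id1", "id2", "code")], relped$famid)
```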
Small sample pedigree data set for testing purposes.
data("sampleped")
A data frame with 55 observations, one line per subject, on the following 8 variables.
famid: Family identifier
id: Subject identifier
dadid: Identifier of the father, if the father is part of the data set; zero otherwise
momid: Identifier of the mother, if the mother is part of the data set; zero otherwise
sex: 1 for male or 2 for female
affected: 1 or 0
avail: 1 or 0
num: Numerical test variable from 0 to 6 randomly distributed
This is a small fictitious pedigree data set, with 55 individuals in 2 families. The aim was to create a data set with a variety of pedigree structures.
data("sampleped")
pedi <- Pedigree(sampleped)
summary(pedi)
if (interactive()) {
    plot(pedi)
}
A Scales object is a list of two data.frames. The first one represents the affection status of the individuals and therefore the filling of their symbols in the pedigree plot. The second one represents the availability status of the individuals and therefore the border color of their symbols in the pedigree plot.
You need to provide both fill and border in the dedicated parameters.
However, this is usually done using the generate_colors() function
with a Pedigree object.
Scales(fill, border)

## S4 method for signature 'data.frame,data.frame'
Scales(fill, border)
fill |
A data.frame with the information for the affection status. The columns needed are:
|
border |
A data.frame with the information for the availability status. The columns needed are:
|
A Scales object.
fill
A data.frame with the information on the affection status. The columns needed are:
'order': the order in which the affections are used
'column_values': name of the column containing the raw values in the Ped object
'column_mods': name of the column containing the mods of the transformed values in the Ped object
'mods': all the different mods
'labels': the label corresponding to each mod
'affected': a logical value indicating whether the mod corresponds to an affected individual
'fill': the color to use for this mod
'density': the density of the shading
'angle': the angle of the shading
border
A data.frame with the information on the availability status. The columns needed are:
'column_values': name of the column containing the raw values in the Ped object
'column_mods': name of the column containing the mods of the transformed values in the Ped object
'mods': all the different mods
'labels': the label corresponding to each mod
'border': the color to use for this mod
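The column requirements listed above can be checked in base R before the two data frames are passed to Scales(). This is only a sketch and not part of Pedixplorer: the helper missing_columns() and the vectors fill_cols and border_cols are hypothetical names, built solely from the column lists above.

```r
# Column names taken from the 'fill' and 'border' lists above
fill_cols <- c(
    "order", "column_values", "column_mods", "mods",
    "labels", "affected", "fill", "density", "angle"
)
border_cols <- c("column_values", "column_mods", "mods", "labels", "border")

# Hypothetical helper: names of required columns absent from a data.frame
missing_columns <- function(df, required) {
    setdiff(required, colnames(df))
}

fill_df <- data.frame(
    order = 1,
    column_values = "affected",
    column_mods = "affected_mods",
    mods = c(0, 1),
    labels = c("unaffected", "affected"),
    affected = c(FALSE, TRUE),
    fill = c("white", "red"),
    density = c(NA, 20),
    angle = c(NA, 45)
)
border_df <- data.frame(
    column_values = "avail",
    column_mods = "avail_mods",
    mods = c(0, 1),
    labels = c("not available", "available"),
    border = c("black", "blue")
)

# Both calls return character(0) when every required column is present
missing_columns(fill_df, fill_cols)
missing_columns(border_df, border_cols)
```

An empty result for both calls means the data frames carry every column named above; it does not validate the contents of those columns.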
fill(x)
: Get the fill data.frame
fill(x) <- value
: Set the fill data.frame
border(x)
: Get the border data.frame
border(x) <- value
: Set the border data.frame
from the Scales object.
as.list(x)
: Convert a Scales object to a list
Scales(
    fill = data.frame(
        order = 1,
        column_values = "affected",
        column_mods = "affected_mods",
        mods = c(0, 1),
        labels = c("unaffected", "affected"),
        affected = c(FALSE, TRUE),
        fill = c("white", "red"),
        density = c(NA, 20),
        angle = c(NA, 45)
    ),
    border = data.frame(
        column_values = "avail",
        column_mods = "avail_mods",
        mods = c(0, 1),
        labels = c("not available", "available"),
        border = c("black", "blue")
    )
)
Shrink a Pedigree object to a specified bit size, giving priority to trimming uninformative subjects. This is useful for condensing a Pedigree to a minimally informative size when downstream algorithms or tests are limited by the size of the Pedigree.
If avail or affected are NULL
, they are extracted
with their corresponding accessors from the Ped object.
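This page does not define "bit size". The sketch below assumes the definition used by the kinship2 predecessor of this package, namely twice the number of non-founders minus the number of founders; treat that formula as an assumption to check against the package's own bit size function. The name bit_size_sketch() is hypothetical, and missing parents are coded as "0" as in the sampleped data description.

```r
# Hypothetical helper, assuming bit size = 2 * non-founders - founders
bit_size_sketch <- function(dadid, momid, missid = "0") {
    # A founder has neither parent present in the data set
    is_founder <- dadid %in% missid & momid %in% missid
    n_founder <- sum(is_founder)
    n_nonfounder <- sum(!is_founder)
    2 * n_nonfounder - n_founder
}

# Two founders (ids 1 and 2) with three children
dadid <- c("0", "0", "1", "1", "1")
momid <- c("0", "0", "2", "2", "2")
bit_size_sketch(dadid, momid)
```

Under this assumed definition, removing a non-founder lowers the value by two, which is why trimming terminal subjects reduces the bit size quickly.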
## S4 method for signature 'Pedigree'
shrink(obj, avail = NULL, affected = NULL, max_bits = 16)

## S4 method for signature 'Ped'
shrink(obj, avail = NULL, affected = NULL, max_bits = 16)
obj |
A Pedigree or Ped object. |
avail |
A logical vector with the availability status of the
individuals
(i.e. |
affected |
A logical vector with the affection status of the
individuals
(i.e. |
max_bits |
Optional, the bit size for which to shrink the Pedigree |
Iteratively removes subjects from the Pedigree. The random removal of members was previously controlled by a seed argument; that argument has been removed, so users must control randomness outside the function. First, uninformative subjects are removed, i.e., unavailable (not genotyped) subjects with no available descendants. Next, available terminal subjects with unknown phenotype are removed if both of their parents are available. Last, the Pedigree is shrunk iteratively by preferentially removing individuals in the following order (chosen at random when several share the same status):
Subjects with unknown affected status
Unaffected subjects
Affected subjects
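The removal priority in the last step can be illustrated in base R. This is a minimal sketch of the ordering described above, not the code used by shrink(): the function pick_next_to_trim() is hypothetical, and unknown affected status is assumed to be coded as NA.

```r
# Hypothetical helper: choose one id to remove according to the priority
# unknown (NA) first, then unaffected (FALSE), then affected (TRUE)
pick_next_to_trim <- function(id, affected) {
    stopifnot(length(id) == length(affected), length(id) > 0)
    priority <- ifelse(is.na(affected), 1L, ifelse(affected, 3L, 2L))
    candidates <- id[priority == min(priority)]
    # Ties are broken at random; indexing avoids sample() on a length-one vector
    candidates[sample.int(length(candidates), 1)]
}

# Randomness is controlled by the caller, outside the function
set.seed(42)
pick_next_to_trim(
    id = c("a", "b", "c", "d"),
    affected = c(TRUE, NA, FALSE, NA)
)
```

Because two subjects have unknown status here, the result is one of "b" or "d", and calling set.seed() beforehand makes the choice reproducible.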
A list containing the following elements:
pedObj: Pedigree object after trimming
id_trim: Vector of ids trimmed from Pedigree
id_lst: List of ids trimmed by category
bit_size: Vector of bit sizes after each trimming step
avail: Vector of availability status after trimming
pedSizeOriginal: Number of subjects in original Pedigree
pedSizeIntermed: Number of subjects after initial trimming
pedSizeFinal: Number of subjects after final trimming
Original by Dan Schaid, updated by Jason Sinnwell and Louis Le Nézet
data(sampleped)
ped1 <- Pedigree(sampleped[sampleped$famid == '1',])
shrink(ped1, max_bits = 12)
Update the family prefix in the individuals' identifiers. Individual identifiers are constructed as follows: famid_id. To update the family prefix, each id is therefore split at the first underscore and the first part is overwritten by famid.
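The split-and-overwrite rule described above can be sketched in base R. The function replace_family_prefix() is a hypothetical illustration, not the implementation of upd_famid(); in particular, returning the bare id when famid is NA is an assumption of this sketch.

```r
# Hypothetical helper illustrating the famid_id convention
replace_family_prefix <- function(ids, famid) {
    stopifnot(length(ids) == length(famid))
    # Drop everything up to and including the first underscore, if present
    bare <- sub("^[^_]*_", "", ids)
    # Prepend the new family id; keep the bare id when famid is NA (assumption)
    ifelse(is.na(famid), bare, paste(famid, bare, sep = "_"))
}

replace_family_prefix(c("1", "2", "B_3"), c("A", "B", "A"))
# "A_1" "B_2" "A_3"
```

Only the first underscore is treated as the separator, so an id whose own part contains further underscores keeps them after the prefix is replaced.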
## S4 method for signature 'character,ANY'
upd_famid(obj, famid, missid = NA_character_)

## S4 method for signature 'Ped,character_OR_integer'
upd_famid(obj, famid)

## S4 method for signature 'Ped,missing'
upd_famid(obj)

## S4 method for signature 'Rel,character_OR_integer'
upd_famid(obj, famid)

## S4 method for signature 'Rel,missing'
upd_famid(obj)

## S4 method for signature 'Pedigree,character_OR_integer'
upd_famid(obj, famid)

## S4 method for signature 'Pedigree,missing'
upd_famid(obj)
obj |
Ped or Pedigree object or a character vector of individual ids |
famid |
A character vector with the family identifiers of the individuals. If provided, it is prepended to the individual identifiers, separated by an underscore. |
missid |
A character vector with the missing values identifiers.
All the id, dadid and momid corresponding to those values will be set
to |
If famid is missing, then the famid()
function will be called
on the object.
A character vector of individual ids with family prefix updated
upd_famid(c("1", "2", "B_3"), c("A", "B", "A"))
upd_famid(c("1", "B_2", "C_3", "4"), c("A", NA, "A", NA))

data(sampleped)
ped1 <- Pedigree(sampleped[,-1])
id(ped(ped1))
new_fam <- make_famid(id(ped(ped1)), dadid(ped(ped1)), momid(ped(ped1)))
id(ped(upd_famid(ped1, new_fam)))

data(sampleped)
ped1 <- Pedigree(sampleped[,-1])
make_famid(ped1)
Compute the usefulness of individuals
## S4 method for signature 'character'
useful_inds(
    obj,
    dadid,
    momid,
    avail,
    affected,
    num_child_tot,
    id_inf,
    keep_infos = FALSE
)

## S4 method for signature 'Pedigree'
useful_inds(
    obj,
    informative = "AvAf",
    keep_infos = FALSE,
    reset = FALSE,
    max_dist = NULL
)

## S4 method for signature 'Ped'
useful_inds(
    obj,
    informative = "AvAf",
    keep_infos = FALSE,
    reset = FALSE,
    max_dist = NULL
)
obj |
A character vector with the id of the individuals or a
|
dadid |
A vector containing, for each subject, the identifier of the biological father. |
momid |
A vector containing, for each subject, the identifier of the biological mother. |
avail |
A logical vector with the availability status of the
individuals
(i.e. |
affected |
A logical vector with the affection status of the
individuals
(i.e. |
num_child_tot |
A numeric vector with the number of children of each individual |
id_inf |
A vector with the identifiers of the informative individuals. |
keep_infos |
Boolean indicating whether to keep parents who are available but have unknown affection status, or the reverse (known affection status but unavailable) |
informative |
Informative individuals selection can take 5 values:
|
reset |
Boolean to indicate if the |
max_dist |
The maximum distance to informative individuals |
Check for the informativeness of the individuals based on the
informative parameter given, the number of children and the usefulness
of their parents. A useful
slot is added to the Ped object with the
usefulness of the individual.
A vector of useful individuals identifiers
The Pedigree or Ped object with the slot 'useful' containing TRUE
for
useful individuals and FALSE
otherwise.
data(sampleped)
ped1 <- Pedigree(sampleped[sampleped$famid == "1",])
ped(useful_inds(ped1, informative = "AvAf"))