tracts.indiv.Indiv
- class Indiv(Ls=None, label='POP', fname=None, labs=('_A', '_B'), selectchrom=None, chroms=None, allosomes=None, name=None)
Bases: object
The class of diploid individuals. An individual is thought of as a list of pairs of chromosomes. Equivalently, a diploid individual is a pair of haploid individuals. Thus, instances of this class can be constructed from a pair of instances of the haploid class, as well as directly from a sequence of Chropair instances.
The interface for loading individuals from files uses the haploid-oriented approach, since an individual .bed file describes only one haplotype. The loading process has two steps: (i) load a haploid individual for each haplotype, and (ii) combine the haploid individuals into a diploid individual.
- Ls
The lengths of the chromosomes.
- Type:
list of float
- chroms
The chromosome pairs that make up this individual. See the documentation for Chropair.
- Type:
list of tracts.chromosome.Chropair
- name
An optional name for the individual.
- Type:
str
- allosomes
A dictionary mapping chromosome labels to chromosome objects, for chromosomes that are treated as allosomes.
- Type:
dict
- __init__(Ls=None, label='POP', fname=None, labs=('_A', '_B'), selectchrom=None, chroms=None, allosomes=None, name=None)
Constructs a diploid individual. There are several ways to build individuals: from files, from existing data, or programmatically. The most straightforward way is from existing data, by supplying only the Ls and chroms arguments.
- Parameters:
Ls (list[float] | None) – Default is None. The lengths of the chromosomes, in the order in which they appear in chroms.
chroms (list[Chropair] | None) – Default is None. The chromosome pairs that make up this individual. See the documentation for Chropair.
label (str) – Default is POP. The label to use for building single-tract chromosomes when no other data is given to build this individual.
fname (str | None) – Paths are generated by concatenating the first component of fname, each label from labs in turn, and the second component of fname.
labs (tuple[str, str]) – Default is ("_A", "_B"). The labels used to identify maternal and paternal haplotypes in the paths leading to .bed files.
selectchrom (list[int | str] | None) – Default is None. Selects which chromosomes to load. The default value of None selects all chromosomes.
name (str | None) – Default is None. An identifier for this individual.
Notes
If Ls is given but chroms is not, then chromosomes each consisting of a single tract will be created with the label label and lengths drawn from Ls. If the fname argument is given, the constructor will perform path manipulation involving the components of fname and labs to generate file names that are commonly used when dealing with .bed files. The facilities in this constructor for loading individuals from files are deprecated. It is recommended to instead use the static methods from_files() or from_haploids().
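The path construction described for fname can be sketched as follows. This is an illustration only, not the library's code: it assumes fname is a two-component sequence (prefix, suffix), and the helper name build_haplotype_paths and the file names are hypothetical.

```python
def build_haplotype_paths(fname, labs=("_A", "_B")):
    """Return one path per haplotype label: prefix + label + suffix.

    Hypothetical helper; assumes `fname` is a (prefix, suffix) pair.
    """
    prefix, suffix = fname
    return [prefix + lab + suffix for lab in labs]


# Hypothetical file names, for illustration only.
paths = build_haplotype_paths(("data/NA12345", ".bed"))
print(paths)
```

Since the file-loading facilities of the constructor are deprecated, a list of paths built this way would more naturally be passed to from_files().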
- ancestryAmt(ancestry)
Calculates the total length of the genome in segments of the given ancestry.
- Parameters:
ancestry (str) – The ancestry for which to calculate the total length.
- Returns:
The total length of the genome in segments of the given ancestry.
- Return type:
float
- ancestryProps(ancestries, allosome_label=False, cutoff=0.0)
Calculates the proportion of the genome represented by the given ancestries.
- Parameters:
ancestries (list) – A list of ancestries for which to calculate the proportions.
allosome_label (str | bool) – An optional label for the allosome chromosome to consider when calculating ancestry proportions. If False (the default), allosomes will not be considered when calculating ancestry proportions. If a string is provided, only the chromosome with the corresponding label in self.allosomes will be considered when calculating ancestry proportions. If the specified label is not present in self.allosomes, a warning will be logged and no chromosomes will be considered as allosomes.
cutoff (float) – An optional cutoff value for tract lengths. Only tracts with lengths greater than this cutoff will be considered when calculating ancestry proportions. The default value is 0, meaning that all tracts will be considered regardless of their length.
- Returns:
A list of proportions corresponding to the input list of ancestries, where each proportion represents the fraction of the genome that is represented by segments of the corresponding ancestry. The order of the proportions corresponds to the order of the input ancestries.
- Return type:
list of float
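The calculation described above can be illustrated without the library. The sketch below is an assumption-laden stand-in: tracts are plain (start, end, label) tuples, the function names are hypothetical, and proportions are normalized by the total length across the listed ancestries, which may differ from what the library does.

```python
def ancestry_amount(tracts, ancestry, cutoff=0.0):
    """Sum the lengths of tracts labelled `ancestry` that are longer than `cutoff`."""
    return sum(end - start for start, end, label in tracts
               if label == ancestry and (end - start) > cutoff)


def ancestry_props(tracts, ancestries, cutoff=0.0):
    """Return one proportion per ancestry, in the order given.

    Assumption: normalize by the summed amount over the listed ancestries.
    """
    amounts = [ancestry_amount(tracts, anc, cutoff) for anc in ancestries]
    total = sum(amounts)
    if total == 0:
        return [0.0] * len(amounts)  # nothing to normalize by
    return [amt / total for amt in amounts]


# Stand-in genome: (start, end, label) tuples, lengths in arbitrary units.
genome = [(0.0, 0.5, "CEU"), (0.5, 2.0, "YRI"), (0.0, 1.0, "CEU"), (1.0, 2.0, "YRI")]
props = ancestry_props(genome, ["CEU", "YRI"])
print(props)
```

Raising cutoff drops short tracts before the amounts are summed, so the proportions shift toward ancestries carried on longer tracts.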
- ancestryPropsByChrom(ancestries)
Calculates the proportion of the genome represented by the given ancestries, separately for each chromosome.
- Parameters:
ancestries (list[str]) – A list of ancestries for which to calculate the proportions.
- Returns:
A list of lists of proportions, where the outer list corresponds to the input list of ancestries, and the inner lists correspond to the chromosomes of the individual. Each inner list contains the proportions of the corresponding chromosome that are represented by segments of the corresponding ancestry. The order of the outer list corresponds to the order of the input ancestries, and the order of the inner lists corresponds to the order of the chromosomes in the individual’s chroms attribute.
- Return type:
list of list of float
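The nesting of the return value (outer list by ancestry, inner list by chromosome) is easy to get backwards, so a small stand-in may help. It assumes each chromosome is a flat list of (start, end, label) tuples and that each proportion is taken relative to that chromosome's total tract length. Both are assumptions, and props_by_chrom is a hypothetical name.

```python
def props_by_chrom(chroms, ancestries):
    """Return result[i][j]: proportion of ancestry i on chromosome j."""
    result = []
    for anc in ancestries:                      # outer list: one entry per ancestry
        per_chrom = []
        for tracts in chroms:                   # inner list: one entry per chromosome
            total = sum(end - start for start, end, _ in tracts)
            amount = sum(end - start for start, end, label in tracts if label == anc)
            per_chrom.append(amount / total)
        result.append(per_chrom)
    return result


chroms = [
    [(0.0, 1.0, "CEU"), (1.0, 2.0, "YRI")],   # chromosome 1
    [(0.0, 0.5, "CEU"), (0.5, 2.0, "YRI")],   # chromosome 2
]
by_chrom = props_by_chrom(chroms, ["CEU", "YRI"])
# by_chrom[0] holds the CEU proportion on each chromosome; by_chrom[1] the YRI ones.
print(by_chrom)
```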
- applychrom(func)
Apply the function func to each chromosome of the individual.
- Parameters:
func (callable) – A function that takes a chromosome as input and returns a value. This function will be applied to each chromosome of the individual, and the results will be collected into a list.
- Returns:
A list containing the results of applying func to each chromosome of the individual. The order of the results corresponds to the order of the chromosomes in the individual’s chroms attribute.
- Return type:
list
- create_gamete()
Creates a haploid gamete from the individual.
- Returns:
A haploid genome representing a gamete produced by this individual. The gamete is generated by recombining the chromosome pairs of this individual, and then taking one chromosome from each pair.
- Return type:
- flat_imap(f)
Lazily map a function over the full underlying structure of this individual.
- Parameters:
f (callable) – A function that takes three parameters: chrom, the chromosome pair containing the tract; copy, the chromosome containing the tract; and tract, the tract itself. This function will be applied to each tract in the individual’s genome, and the results will be collected into a list. The order of the results corresponds to the order of the tracts in the individual’s genome, as determined by iterating through the chromosomes, then the copies within each chromosome, and then the tracts within each copy.
- Returns:
A list containing the results of applying f to each tract in the individual’s genome.
- Return type:
list
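The traversal order and the three-argument callback can be sketched with nested lists standing in for the real objects. This is not the library's implementation: the standalone flat_imap function below is hypothetical, represents tracts as (label, start, end) tuples, and yields results one at a time, so it has to be wrapped in list() to materialize them.

```python
def flat_imap(chroms, f):
    """Lazily yield f(chrom, copy, tract) for every tract, in nesting order."""
    for chrom in chroms:          # each chromosome pair
        for copy in chrom:        # each of the two copies in the pair
            for tract in copy:    # each tract on that copy
                yield f(chrom, copy, tract)


# Stand-in data: a pair is a list of two copies; a copy is a list of tracts.
pair1 = [[("CEU", 0.0, 1.0)], [("YRI", 0.0, 0.4), ("CEU", 0.4, 1.0)]]
pair2 = [[("YRI", 0.0, 2.0)], [("YRI", 0.0, 2.0)]]

# Collect the label of every tract, in traversal order.
labels = list(flat_imap([pair1, pair2], lambda chrom, copy, tract: tract[0]))
print(labels)
```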
- static from_files(paths, selectchrom=None, name=None, allosomes=None)
Constructs a diploid individual from two files, which describe the individual’s haplotypes.
- Parameters:
paths (list[str]) – A list of two file paths, each describing one haplotype of the individual. The files should be tab-delimited text files with columns: chrom, start, end, label, and optionally others. The first line may be a header, which will be automatically skipped if it contains the expected column names.
selectchrom (list[int | str] | None) – An optional list of chromosome labels to select from the files. If not provided, all chromosomes will be selected. Chromosome labels should be integers or strings corresponding to the chromosome numbers in the files (e.g., “1” for chromosome 1). Chromosome identifiers that cannot be converted to integers will be ignored, and the corresponding chromosomes will not be selected.
name (str | None) – An optional name for the individual. If not provided, the name will be set to the name of the first haploid individual loaded from the files.
allosomes (list[str] | None) – An optional list of chromosome labels that should be treated as allosomes. If not provided, no chromosomes will be treated as allosomes. Chromosome labels should be strings corresponding to the chromosome identifiers in the files (e.g., “X” for the X chromosome). Chromosome labels that are not present in the files will be ignored, and no chromosomes will be treated as allosomes.
- Returns:
An instance of the Indiv class representing the combined diploid individual loaded from the files.
- Return type:
Indiv
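To make the expected file layout concrete, the sketch below parses a small in-memory file with the four required columns and skips a header line. It is a stand-alone illustration rather than the parser used by from_files(): the header-detection rule, the integer coordinates, the sample rows, and the name read_tracts are all assumptions.

```python
import csv
import io

EXPECTED = ("chrom", "start", "end", "label")


def read_tracts(handle):
    """Parse tab-delimited rows into (chrom, start, end, label) tuples."""
    rows = []
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        if tuple(f.strip().lower() for f in fields[:4]) == EXPECTED:
            continue  # skip a header line naming the expected columns
        chrom, start, end, label = fields[:4]  # extra columns are ignored
        rows.append((chrom, int(start), int(end), label))
    return rows


# Hypothetical file contents for one haplotype.
sample = io.StringIO(
    "chrom\tstart\tend\tlabel\n"
    "chr1\t0\t1500000\tCEU\n"
    "chr1\t1500000\t4000000\tYRI\n"
)
rows = read_tracts(sample)
print(rows)
```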
- static from_haploids(haps, name=None, allosome_labels=None)
Construct a diploid individual from a list of two haploid individuals.
- Parameters:
haps (list[Haploid]) – A list of two haploid individuals to combine into a diploid individual.
name (str) – An optional name for the individual. If not provided, the name will be set to the name of the first haploid individual.
allosome_labels (list[str] | None) – An optional list of chromosome labels that should be treated as allosomes. If not provided, no chromosomes will be treated as allosomes. Chromosome labels should be strings corresponding to the chromosome identifiers in the haploid individuals (e.g., “X” for the X chromosome). Chromosome labels that are not present in the haploid individuals will be ignored, and no chromosomes will be treated as allosomes.
- Returns:
An instance of the Indiv class representing the combined diploid individual.
- Return type:
Indiv
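Conceptually, combining two haploids amounts to pairing their chromosomes position by position, which matches the description of an individual as a list of chromosome pairs. The sketch below shows that idea with plain lists. The name combine_haploids and the string stand-ins for chromosomes are hypothetical, and the library's own method may validate or organize the data differently.

```python
def combine_haploids(haps):
    """Pair up the chromosomes of two haploids, position by position."""
    if len(haps) != 2:
        raise ValueError("expected exactly two haploid individuals")
    first, second = haps
    if len(first) != len(second):
        raise ValueError("haploids must carry the same number of chromosomes")
    return list(zip(first, second))


# Stand-in haploids: each is a list of chromosomes (plain strings here).
hap_a = ["chr1 copy A", "chr2 copy A"]
hap_b = ["chr1 copy B", "chr2 copy B"]
pairs = combine_haploids([hap_a, hap_b])
print(pairs)
```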
- iflatten(allosome_label=False)
Lazily flatten this individual to the tract level.
- Parameters:
allosome_label (str | bool | None) – An optional label for the allosome chromosome to consider when flattening. If False (the default), allosomes will not be considered when flattening. If a string is provided, only the chromosome with the corresponding label in self.allosomes will be considered when flattening. If the specified label is not present in self.allosomes, a warning will be logged and no chromosomes will be considered as allosomes.
- plot(colordict, win=None)
Plots an individual.
- Parameters:
colordict (dict) – A dictionary mapping population labels to colors, used to determine the color of each tract when plotting. E.g.: colordict = {"CEU": "r", "YRI": "b"}.
win (Tk) – An optional Tkinter window to plot on. If not provided, a new window will be created for this plot. If provided, the plot will be drawn on the given window. This can be used to plot multiple individuals on the same window, or to integrate the plot of this individual into a larger Tkinter application.
- Returns:
The Tkinter window on which the plot was drawn.
- Return type:
tk.Tk
- unnamed_counter = 0