tracts.population

Functions

collect_pop(flatdat)

Organizes a list of tracts into a dictionary keyed on ancestry labels.

preprocess_color_dict(colordict, dat)

Classes

Population([list_indivs, names, fname, ...])

class tracts.population.Population(list_indivs=None, names=None, fname=None, labs=('_A', '_B'), selectchrom=None, allosomes=[], ignore_length_consistency=False, filenames_by_individual=None, male_list=None)

Bases: object

__init__(list_indivs=None, names=None, fname=None, labs=('_A', '_B'), selectchrom=None, allosomes=[], ignore_length_consistency=False, filenames_by_individual=None, male_list=None)

Constructs a population of diploid individuals. A population is essentially a simple list of indiv objects.

There are two ways to build populations, either from a dataset stored in files or from a list of individuals. The facilities for loading populations from files present in this constructor are deprecated. It is advised to instead load a list of individuals, using indiv.from_file, and to then pass that list to this constructor.

The population can be initialized by providing it with a list of “individual” objects, or a file format fname and a list of names. If reading from a file, fname should be a tuple with the start, middle, and end of the file names, where an individual file is specified by start–Indiv–Middle–_A–End. Otherwise, provide a list of individuals. Distinguishing labels for maternal and paternal chromosomes are given in labs.

ancestry_at_pos(select_chrom=0, pos=0, cutoff=0.0)

Finds the ancestry proportions at a specific position. The cutoff is used to look only at tracts that extend beyond a given position.

ancestry_per_pos(select_chrom=0, npts=50, cutoff=0.0)

Prepares the ancestry per position across a chromosome.

applychrom(func, indlist=None)

Apply func to chromosomes. If no indlist is supplied, apply to all individuals.

bootinds(seed)

Returns a bootstrapped list of individuals in the population. Use with get_global_tractlengths (indlist=…) to get a bootstrapped sample.
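The idea of a seeded bootstrap over individuals can be illustrated with a minimal, library-independent sketch. The function name bootstrap_individuals and the string stand-ins for individuals are hypothetical, and resampling with replacement to the original population size is an assumption about the scheme, not a statement of how bootinds is implemented.

```python
import random


def bootstrap_individuals(individuals, seed):
    """Resample individuals with replacement, reproducibly for a given seed.

    Hypothetical helper: assumes the bootstrap draws as many individuals
    as the population contains.
    """
    rng = random.Random(seed)  # private generator, leaves global state alone
    return rng.choices(individuals, k=len(individuals))


# Strings stand in for individual objects in this sketch.
inds = ["ind1", "ind2", "ind3", "ind4"]
sample_a = bootstrap_individuals(inds, seed=1)
sample_b = bootstrap_individuals(inds, seed=1)
```

Using the same seed twice yields the same resampled list, which makes a bootstrap replicate reproducible.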

static calculate_allosome_lengths(indivs, allosome_labels)
calculate_allosome_proportions(population_labels, allosome_label, cutoff=0.0)
calculate_ancestry_proportions(population_labels, cutoff=0.0)
static calculate_num_sexes(indivs, allosome_labels)
flatpop(ls=None)

Returns a flattened version of a population-wide list at the tract level, discarding the start and end information of each tract.

getMeansByChrom(ancestries)

Gets the ancestry proportions in each individual of the population for each chromosome.

get_global_allosome_tractlengths(allosome, npts=50, tol=0.01, indlist=None, exclude_tracts_below_cM=0)

Returns the allosomal tractlength histogram in males and the allosomal tractlength histogram in females.

Return type:

tuple[ndarray, dict[SexType, dict[str, ndarray]]]

get_global_tractlength_table(lenbound)

Calculates the fraction of the genome covered by ancestry tracts of different lengths, specified by lenbound (which must be sorted).
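As a rough illustration of binning coverage by tract length, the sketch below assigns each tract to a length class delimited by sorted bounds and reports the fraction of total tract length in each class. The function name coverage_by_length, the use of plain floats for tract lengths, and the choice of total tract length as the denominator are all assumptions made for this example; the method's actual output layout may differ.

```python
from bisect import bisect_right


def coverage_by_length(lengths, lenbound):
    """Fraction of total tract length falling in each length class.

    Hypothetical helper: class i holds tracts with
    lenbound[i-1] <= length < lenbound[i]; the last class is open-ended.
    """
    if list(lenbound) != sorted(lenbound):
        raise ValueError("lenbound must be sorted")
    total = sum(lengths)
    covered = [0.0] * (len(lenbound) + 1)
    for length in lengths:
        # bisect_right finds the length class in O(log n); needs sorted bounds
        covered[bisect_right(lenbound, length)] += length
    return [c / total for c in covered]


fractions = coverage_by_length([0.1, 0.3, 0.6], lenbound=[0.2, 0.5])
```

The binary search is the reason the bounds have to be sorted in this sketch.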

get_global_tractlengths(npts=50, tol=0.01, indlist=None, split_count=1, exclude_tracts_below_cM=0)
Parameters:
  • tol (float, default 0.01) – The tolerance for full chromosomes.

  • npts (int, default 50) – The number of bins for the histogram.

  • indlist (list, default None) – The individuals for which we want the tractlength. To bootstrap over individuals, provide a bootstrapped list of individuals.

  • split_count (int, default 1)

  • exclude_tracts_below_cM (float, default 0)

Return type:

tuple[ndarray, dict[str, ndarray]]

Returns:

A tuple (bins, dat), where

  • bins – The bins for the histogram.

  • dat – A dictionary with ancestry labels as keys and a histogram of tract lengths as values.

Notes

Sometimes there are small issues at the edges of the chromosomes. If a segment is within tol Morgans of the full chromosome, it counts as a full chromosome. Note that we return an extra bin for complete chromosomes, so that we have one more data point than we have bins.
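The tolerance rule and the extra full-chromosome bin can be pictured with a small standalone sketch. The function name histogram_with_full_bin, the evenly spaced bins, and the single chromosome length are assumptions made for illustration; this is not the library's implementation.

```python
def histogram_with_full_bin(lengths, chrom_length, n_bins=50, tol=0.01):
    """Count tracts per length bin, plus one extra slot for full chromosomes.

    Hypothetical helper: bins are evenly spaced over [0, chrom_length), and a
    tract within `tol` Morgans of chrom_length is counted as a full chromosome.
    """
    width = chrom_length / n_bins
    counts = [0] * (n_bins + 1)  # final slot holds full-chromosome tracts
    for length in lengths:
        if length >= chrom_length - tol:
            counts[n_bins] += 1
        else:
            # clamp so rounding at the upper edge cannot overflow the regular bins
            counts[min(int(length / width), n_bins - 1)] += 1
    return counts


counts = histogram_with_full_bin(
    [0.05, 0.15, 0.995, 1.0], chrom_length=1.0, n_bins=10
)
```

Here the tracts of length 0.995 and 1.0 both land in the extra slot, so the result has n_bins + 1 entries.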

get_mean_ancestry_proportions(ancestries)

Gets the mean ancestry proportion averaged across individuals in the population.

get_means(ancestries)

Gets the mean ancestry proportion (only among ancestries in ancestries) for all individuals.

get_meanvar(ancestries)
get_variance(ancestries)

ancestries is a set of ancestry labels. Calculates the total variance in ancestry proportions, the genealogy variance, and the assortment variance (which corresponds to the mean uncertainty about the proportion of genealogical ancestors, given observed ancestry patterns). Note that all ancestries not listed are considered uncalled. For example, calling the function with a single ancestry leads to no variance (and some 0/0 errors).

iflatten(indivs=None)

Flattens a list of individuals to the tract level. If the list of individuals “indivs” is None, then the complete list of individuals contained in this population is flattened. The result is a generator.
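Flattening to the tract level with a generator can be sketched as follows. The nested-list layout (individual, then chromosome pair, then haplotype, then tract tuples) and the name iter_tracts are hypothetical stand-ins chosen for this example; the library's own individual and tract objects are not used here.

```python
def iter_tracts(individuals):
    """Lazily yield every tract from a nested population structure.

    Hypothetical layout: individual -> chromosome pairs -> two haplotypes
    -> list of (label, start, end) tracts.
    """
    for indiv in individuals:
        for chrom_pair in indiv:
            for haplotype in chrom_pair:
                yield from haplotype


pop = [
    # individual 1, one chromosome pair
    [([("A", 0.0, 0.4), ("B", 0.4, 1.0)], [("A", 0.0, 1.0)])],
    # individual 2, one chromosome pair
    [([("B", 0.0, 1.0)], [("A", 0.0, 0.7), ("B", 0.7, 1.0)])],
]
tract_iter = iter_tracts(pop)
all_tracts = list(tract_iter)
```

Because the result is a generator, it can be consumed only once; wrap it in list(...) if the tracts need to be traversed repeatedly.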

list_chromosome(chronum)

Collects the chromosomes with the given number across the whole population.

merge_ancestries(ancestries, newlabel)

Treats ancestries in label list “ancestries” as a single population with label “newlabel”. Adjacent tracts of the new ancestry are merged.
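The relabel-then-merge behaviour can be illustrated on a single haplotype with a short sketch. The (label, start, end) tuple representation and the function name merge_labels are hypothetical; the sketch only shows the idea of fusing adjacent tracts that end up with the new label.

```python
def merge_labels(tracts, ancestries, new_label):
    """Relabel tracts whose label is in `ancestries`, fusing adjacent ones.

    Hypothetical helper: `tracts` is an ordered list of (label, start, end)
    tuples along one chromosome copy.
    """
    merged = []
    for label, start, end in tracts:
        if label in ancestries:
            label = new_label
        previous = merged[-1] if merged else None
        if (
            previous is not None
            and label == new_label
            and previous[0] == new_label
            and previous[2] == start  # tracts touch end to start
        ):
            merged[-1] = (new_label, previous[1], end)  # extend previous tract
        else:
            merged.append((label, start, end))
    return merged


haplotype = [("A", 0.0, 0.2), ("B", 0.2, 0.5), ("C", 0.5, 1.0)]
result = merge_labels(haplotype, ancestries={"A", "B"}, new_label="AB")
```

In this example the A and B tracts become one AB tract spanning 0.0 to 0.5, and the C tract is left untouched.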

new_indiv()
newgen()

Build a new generation from this population.

plot(colordict)
plot_all_ancestries(npts=50, colordict=None, startfig=0, cutoff=0)
plot_ancestries(chrom=0, npts=50, colordict=None, cutoff=0.0)
plot_chromosome(i, colordict, win=None)

Plots a single chromosome across individuals.

plot_global_tractlengths(colordict, npts=50, legend=True)
plot_indiv()
plot_next()
plot_previous()
save()
set_males(male_list, allosome_label='X')

Sets the list of males for each individual

smooth_unknowns(allosome_labels='X')
split_by_props(count)

Splits this population into groups according to their ancestry proportions. The individuals are sorted in ascending order of their ancestry named “anc”.

tractlength_histogram(tracts_by_population, npts=50, tol=0.01, exclude_tracts_below_cM=0, maxLen=None)
Return type:

tuple[ndarray, dict[str, ndarray]]

tracts.population.collect_pop(flatdat)

Organizes a list of tracts into a dictionary keyed on ancestry labels.
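Grouping a flat list of tracts by ancestry label can be sketched as below. The (label, length) tuple representation and the name collect_by_ancestry are assumptions made for this illustration; the tract objects that collect_pop actually receives are not modelled here.

```python
from collections import defaultdict


def collect_by_ancestry(flat_tracts):
    """Group (label, length) pairs into a dict mapping label -> list of lengths.

    Hypothetical helper illustrating a dictionary keyed on ancestry labels.
    """
    grouped = defaultdict(list)
    for label, length in flat_tracts:
        grouped[label].append(length)
    return dict(grouped)  # plain dict, so missing labels raise KeyError


flat = [("EUR", 0.12), ("AFR", 0.30), ("EUR", 0.05)]
by_label = collect_by_ancestry(flat)
```

Each key then holds the tract lengths for one ancestry, which is a convenient shape for building per-ancestry histograms.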

tracts.population.preprocess_color_dict(colordict, dat)