tracts.phase_type.monoecious.PhTMonoecious
- class PhTMonoecious(migration_matrix, rho=1)
Bases: PhaseTypeDistribution
A subclass of PhaseTypeDistribution providing the specific Phase-Type tools for the Monoecious Markov approximation.
- migration_matrix
The migration matrix given as input without contributions at generations 0 and 1.
- Type:
npt.ArrayLike
- num_populations
The number of populations considered in the demographic model.
- Type:
int
- num_generations
The number of generations considered in the demographic model.
- Type:
int
- t0_proportions
The total contribution from each ancestral population.
- Type:
npt.ArrayLike
- full_transition_matrix
The intensity matrix \(\mathbf{S}^M\) of the Monoecious Markov Model.
- Type:
npt.ArrayLike
- equilibrium_distribution
The equilibrium distribution of the Monoecious Markov Model.
- Type:
npt.ArrayLike
- alpha_list
A list containing, for each ancestral population, the initial state of the population-specific Phase-Type distribution.
- Type:
list
- transition_matrices
A list containing, for each ancestral population, the submatrix of full_transition_matrix corresponding to transitions within the population. It is used to compute the population-specific distribution of tract lengths.
- Type:
list
- S0_list
A list containing the sum across columns of every transition matrix in transition_matrices.
- Type:
list
- inverse_S0_list
A list containing the sum across columns of the inverse of every transition matrix in transition_matrices.
- Type:
list
- Parameters:
migration_matrix (npt.ArrayLike) – An array containing the migration proportions from a discrete number of populations over the last generations. Each row is a time, each column is a population. Row zero corresponds to the current generation. The last row (migration_matrix[-1,:]) holds the migration proportions of the founding generation and should sum to 1.
rho (float) – The recombination rate.
Notes
Non-listed attributes are for internal use only.
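The layout expected for migration_matrix can be illustrated with a small NumPy sketch. The helper check_migration_matrix and the example values are hypothetical and not part of tracts; they only encode the conventions stated in the parameter description (rows are generations, columns are populations, the last row sums to 1).

```python
import numpy as np


def check_migration_matrix(migration_matrix, tol=1e-9):
    """Hypothetical helper: validate the layout described in the parameter docs."""
    m = np.asarray(migration_matrix, dtype=float)
    if m.ndim != 2:
        raise ValueError("expected a 2-D array: rows are generations, columns are populations")
    if abs(m[-1].sum() - 1.0) > tol:
        raise ValueError("the founding generation (last row) must sum to 1")
    return m


# Two populations; the founding event is the last row (index 4).
# A later 10% pulse from population 1 is placed at row 2 (illustrative values).
mig = np.zeros((5, 2))
mig[-1] = [0.7, 0.3]
mig[2] = [0.0, 0.1]
mig = check_migration_matrix(mig)
print(mig)

# With tracts installed, the matrix would then be passed to the constructor:
# from tracts.phase_type.monoecious import PhTMonoecious
# model = PhTMonoecious(mig, rho=1)
```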
- PhT_CDF(x, population_number, s1=None)
Computes a Phase-type CDF at a given point \(x\) in \((0, \infty)\). The Phase-type parameters (initial state, transition matrix) are taken from a PhTMonoecious object together with the specification of a population of interest.
- Parameters:
x (float) – A point in \((0, \infty)\) where the CDF is evaluated.
population_number (int) – The population of interest whose tract length distribution has to be computed. An integer from 0 to the number of populations - 1.
s1 – Not used in the Monoecious model.
- Returns:
The CDF value at \(x\).
- Return type:
npt.ArrayLike
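For orientation, the textbook Phase-type CDF with initial state \(\alpha\) and transition submatrix \(S\) is \(F(x) = 1 - \alpha e^{Sx} \mathbf{1}\). The sketch below evaluates that formula with NumPy only. It is an assumption that PhT_CDF uses exactly this expression internally; expm_taylor and phase_type_cdf are hypothetical helpers, and the Erlang example values are illustrative.

```python
import numpy as np


def expm_taylor(A, terms=30):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    A = np.asarray(A, dtype=float)
    norm = np.linalg.norm(A, ord=np.inf)
    # Halve A until its norm is at most 0.5 so the series converges quickly.
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    A_scaled = A / (2 ** squarings)
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms + 1):
        term = term @ A_scaled / k
        result = result + term
    # Undo the scaling: exp(A) = exp(A / 2^s) ^ (2^s).
    for _ in range(squarings):
        result = result @ result
    return result


def phase_type_cdf(x, alpha, S):
    """Textbook Phase-type CDF: 1 - alpha . exp(S x) . 1."""
    alpha = np.asarray(alpha, dtype=float)
    S = np.asarray(S, dtype=float)
    ones = np.ones(S.shape[0])
    return 1.0 - alpha @ expm_taylor(S * x) @ ones


# Erlang distribution with 2 phases and rate 2, written as a Phase-type model.
alpha = np.array([1.0, 0.0])
S = np.array([[-2.0, 2.0],
              [0.0, -2.0]])
cdf_at_one = phase_type_cdf(1.0, alpha, S)
print(cdf_at_one)
```

For this Erlang example the closed form is \(1 - e^{-2x}(1 + 2x)\), which gives a quick check of the matrix expression.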
- PhT_CDF_windowed(S, alpha, S0_inv, bins, L, pop_number, s1=None, exp_Sx_per_bin=None)
Computes a Phase-type CDF on a finite chromosome of length \(L\) and evaluates it on a point grid. The Phase-type parameters (initial state, transition matrix) belong to a PhTMonoecious object (together with the specification of a population of interest) but are also passed directly as inputs.
- Parameters:
S (npt.ArrayLike) – The transition submatrix.
alpha (npt.ArrayLike) – The initial state of the Phase-type distribution.
S0_inv (npt.ArrayLike) – The sum across columns of the inverse of the transition submatrix.
bins (npt.ArrayLike) – A point grid on \((0, L)\) where the CDF has to be evaluated.
L (float) – The length of the finite chromosome.
pop_number (int) – The population of interest whose tract length distribution has to be computed. An integer from 0 to the number of populations - 1, corresponding to the column of the migration matrix.
s1 (float|None) – Not used in the Monoecious model.
exp_Sx_per_bin (npt.ArrayLike) – The precomputed values of \(e^{Sx}\) for every \(x\) in bins. Used internally to speed up computation.
- Return type:
tuple[ndarray, float, float, float]
- Returns:
npt.ArrayLike – The CDF evaluated on bins.
float – The tract length expectation of the corresponding model considering an infinite chromosome.
float – The normalization factor \(Z\) of the corresponding model.
float – The tract length expectation on the finite chromosome of the corresponding model.
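The S0_inv input and the infinite-chromosome expectation can be related through the standard Phase-type mean, \(-\alpha S^{-1} \mathbf{1}\). The sketch below assumes that "sum across columns" means summing each row (NumPy axis=1), so that S0_inv equals \(S^{-1}\mathbf{1}\); whether the library computes its expectation exactly this way is also an assumption, and the matrices are illustrative. The normalization factor \(Z\) and the finite-chromosome expectation are not reproduced here.

```python
import numpy as np

# Erlang distribution with 2 phases and rate 2 (illustrative values).
S = np.array([[-2.0, 2.0],
              [0.0, -2.0]])
alpha = np.array([1.0, 0.0])

# Row sums of S; their negatives are the exit rates out of each transient state.
S0 = S.sum(axis=1)

# Row sums of the inverse of S, i.e. S^{-1} applied to a vector of ones.
S0_inv = np.linalg.inv(S).sum(axis=1)

# Standard Phase-type mean on an infinite chromosome: -alpha . S^{-1} . 1
mean_tract_length = -(alpha @ S0_inv)
print(S0, S0_inv, mean_tract_length)
```

An Erlang distribution with 2 phases and rate 2 has mean 1, which matches the value computed from S0_inv.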
- PhT_density(x, population_number, s1=None)
Computes a Phase-type density at a given point \(x\) in \((0, \infty)\). The Phase-type parameters (initial state, transition matrix) are taken from a PhTMonoecious object together with the specification of a population of interest.
- Parameters:
x (float) – A point in \((0, \infty)\) where the density function is evaluated.
population_number (int) – The population of interest whose tract length distribution has to be computed. An integer from 0 to the number of populations - 1.
s1 – Not used in the Monoecious model.
- Returns:
The density value at \(x\).
- Return type:
float
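The textbook Phase-type density is \(f(x) = \alpha e^{Sx} \mathbf{s}_0\), where \(\mathbf{s}_0 = -S\mathbf{1}\) is the vector of exit rates. The following NumPy sketch evaluates it on an illustrative Erlang example. It assumes PhT_density follows this standard form; expm_taylor and phase_type_density are hypothetical helpers, not tracts functions.

```python
import numpy as np


def expm_taylor(A, terms=30):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    A = np.asarray(A, dtype=float)
    norm = np.linalg.norm(A, ord=np.inf)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    A_scaled = A / (2 ** squarings)
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms + 1):
        term = term @ A_scaled / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result


def phase_type_density(x, alpha, S):
    """Textbook Phase-type density: alpha . exp(S x) . s0, with s0 = -S . 1."""
    alpha = np.asarray(alpha, dtype=float)
    S = np.asarray(S, dtype=float)
    exit_rates = -S.sum(axis=1)
    return alpha @ expm_taylor(S * x) @ exit_rates


# Erlang distribution with 2 phases and rate 2; closed-form density is 4 x exp(-2 x).
alpha = np.array([1.0, 0.0])
S = np.array([[-2.0, 2.0],
              [0.0, -2.0]])
density_at_one = phase_type_density(1.0, alpha, S)
print(density_at_one)
```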
- PhT_density_windowed(population_number, S, alpha, S0_inv, bins, L, s1=None, exp_Sx_per_bin=None)
Computes a Phase-type density on a finite chromosome of length \(L\) and evaluates it on a point grid. The Phase-type parameters (initial state, transition matrix) belong to a PhTMonoecious object (together with the specification of a population of interest) but are also passed directly as inputs.
- Parameters:
population_number (int) – The population of interest whose tract length distribution has to be computed. An integer from 0 to the number of populations - 1.
S (npt.ArrayLike) – The transition submatrix.
alpha (npt.ArrayLike) – The initial state of the Phase-type distribution.
S0_inv (npt.ArrayLike) – The sum across columns of the inverse of the transition submatrix.
bins (npt.ArrayLike) – A point grid on \((0, L)\) where the density has to be evaluated.
L (float) – The length of the finite chromosome.
s1 (float, default None) – Not used in the Monoecious model.
exp_Sx_per_bin (npt.ArrayLike) – The precomputed values of \(e^{Sx}\) for every \(x\) in bins. Used internally to speed up computation.
- Returns:
npt.ArrayLike – The corrected bins grid as described in Notes.
npt.ArrayLike – The density evaluated on bins.
float – The tract length expectation of the corresponding model.
Notes
The code truncates bins to the interval \([0, L]\) and adds the point \(L\) if it is not included in bins. This is done because the density is defined on the finite chromosome \([0, L]\) as a mixture of a continuous density on \([0, L)\) and a Dirac measure at \(L\). Consequently, the function returns the transformed grid as its first value; it can be used as the x-axis when plotting the density.
Do not call this function directly. To get a Phase-type density on a finite chromosome, use tractlength_histogram_windowed() with density=True.
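The grid correction described in the Notes can be sketched as follows. The function correct_bins is a hypothetical stand-in written from the description only (truncate to \([0, L]\), append \(L\) if missing); the library's own implementation may differ in details such as tolerance handling or sorting.

```python
import numpy as np


def correct_bins(bins, L):
    """Hypothetical helper: keep grid points in [0, L] and make sure L itself is present."""
    bins = np.asarray(bins, dtype=float)
    kept = bins[(bins >= 0.0) & (bins <= L)]
    # Append the chromosome end, where the Dirac mass sits, if it is not already a grid point.
    if not np.any(np.isclose(kept, L)):
        kept = np.append(kept, L)
    return kept


grid = correct_bins([0.0, 0.5, 1.0, 1.5, 2.5], L=2.0)
print(grid)
```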
- __init__(migration_matrix, rho=1)
Initializes the PhTMonoecious object by constructing the transition matrix and the initial state of the Phase-Type distribution.
- distribution_scaling_factor(population_number)
Computes the scaling factor to transform the CDF values into counts.
- Parameters:
population_number (int) – The index of the population of interest whose tract length distribution has to be computed. An integer from 0 to the number of populations - 1, corresponding to the column of the migration matrix.
- Returns:
The scaling factor to transform the CDF values into counts.
- Return type:
float
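One way to read "transform the CDF values into counts" is that the expected count in each bin is the scaling factor times the CDF increment over that bin. The sketch below illustrates that reading with a toy exponential CDF. The helper cdf_to_counts, the value 200.0, and the assumption that the factor is applied multiplicatively to CDF differences are all hypothetical and not taken from the library.

```python
import numpy as np


def cdf_to_counts(cdf_values, scaling_factor):
    """Hypothetical helper: expected count per bin from scaled CDF increments."""
    cdf_values = np.asarray(cdf_values, dtype=float)
    return scaling_factor * np.diff(cdf_values)


bins = np.linspace(0.0, 2.0, 5)
cdf_values = 1.0 - np.exp(-1.5 * bins)  # toy exponential CDF, stands in for PhT_CDF output
scaling_factor = 200.0                   # hypothetical value
counts = cdf_to_counts(cdf_values, scaling_factor)
print(counts)
```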
- get_equilibrium_distribution()
Computes the equilibrium distribution of the Monoecious Phase-type model.
- Returns:
The equilibrium distribution of the Monoecious Phase-type model.
- Return type:
npt.ArrayLike
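For a generic continuous-time Markov chain with intensity matrix \(Q\), the equilibrium distribution \(\pi\) solves \(\pi Q = 0\) with \(\sum_i \pi_i = 1\). The sketch below solves that system by least squares on a two-state toy chain. It is an assumption that get_equilibrium_distribution() corresponds to this generic definition; stationary_distribution and the matrix Q are hypothetical.

```python
import numpy as np


def stationary_distribution(Q):
    """Solve pi Q = 0 subject to sum(pi) = 1 for an intensity matrix Q."""
    Q = np.asarray(Q, dtype=float)
    n = Q.shape[0]
    # Stack the balance equations Q^T pi = 0 with the normalization constraint.
    A = np.vstack([Q.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi


# Two-state chain: leave state 0 at rate 1, leave state 1 at rate 2.
Q = np.array([[-1.0, 1.0],
              [2.0, -2.0]])
pi = stationary_distribution(Q)
print(pi)
```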
- get_transition_matrix()
Computes the transition matrix of the Monoecious Phase-type model.
- Returns:
The transition matrix of the Monoecious Phase-type model. Each entry \((i,j)\) corresponds to the transition rate from state \(i\) to state \(j\).
- Return type:
npt.ArrayLike
- tract_length_histogram_multi_windowed(population_number, bins, chrom_lengths)
Calculates the tract length histogram on multiple chromosomes of different lengths.
- Parameters:
population_number (int) – The index of the population of interest whose tract length distribution has to be computed. An integer from 0 to the number of populations - 1, corresponding to the column of the migration matrix.
bins (npt.ArrayLike) – A point grid where the histogram has to be computed. The same grid is used for all chromosomes, and should be defined on the interval (0, max(chrom_lengths)).
chrom_lengths (npt.ArrayLike) – A list of chromosome lengths.
- Returns:
The histogram values on the intervals defined by bins, summed across all chromosomes.
- Return type:
npt.ArrayLike
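The idea of one shared grid with per-chromosome histograms summed together can be sketched with a toy model. Here toy_windowed_histogram is a hypothetical stand-in for the per-chromosome computation: it uses an exponential tract length whose mass beyond \(L\) is moved to \(L\), which is not the library's Phase-type model. Only the summation pattern is meant to carry over.

```python
import numpy as np


def toy_windowed_histogram(bins, L, rate=1.0):
    """Hypothetical stand-in: mass per bin for an exponential tract length capped at L."""
    bins = np.asarray(bins, dtype=float)
    # CDF of the capped length: exponential below L, jumps to 1 at L.
    cdf = np.where(bins >= L, 1.0, 1.0 - np.exp(-rate * bins))
    return np.diff(cdf)


def multi_windowed_histogram(bins, chrom_lengths, rate=1.0):
    """Sum the per-chromosome histograms computed on the same grid."""
    total = np.zeros(len(bins) - 1)
    for L in chrom_lengths:
        total += toy_windowed_histogram(bins, L, rate)
    return total


bins = np.linspace(0.0, 3.0, 7)       # shared grid up to max(chrom_lengths)
chrom_lengths = [1.0, 2.0, 3.0]
hist = multi_windowed_histogram(bins, chrom_lengths)
print(hist)
```

Bins lying entirely beyond a given chromosome's length receive no contribution from that chromosome, which is why a single grid covering the longest chromosome suffices.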
- tractlength_histogram_windowed(population_number, bins, L, exp_Sx_per_bin=None, density=False, freq=False)
Calculates the tract length histogram or density function on a finite chromosome, using the Monoecious (M) admixture model.
- Parameters:
population_number (int) – The index of the population of interest whose tract length distribution has to be computed. An integer from 0 to the number of populations - 1, corresponding to the column of the migration matrix.
bins (npt.ArrayLike) – A point grid where the CDF or density has to be computed.
L (float) – The length of the finite chromosome.
exp_Sx_per_bin (npt.ArrayLike) – The precomputed values of \(e^{Sx}\) for every \(x\) in bins. Used internally to speed up computation.
density (bool, default False) – If True, computes the PhT density values on the grid. Otherwise, returns the histogram values on the grid.
freq (bool, default False) – If density is True, whether to return the density on the frequency scale.
- Return type:
tuple[npt.ArrayLike, npt.ArrayLike, float]
- Returns:
npt.ArrayLike – If density is True, the corrected bins grid as described in the Notes of PhT_density_windowed(). Else, the bins given as input.
npt.ArrayLike – If density is True, the Phase-type density evaluated on the corrected bins grid. Returned on the frequency scale if freq = True. If density is False, the histogram values on the intervals defined by bins.
float – The tract length expectation of the corresponding model.