tracts.chromosome.Chrom

class Chrom(ls=None, label='POP', tracts=None)

Bases: object

A chromosome wraps a list of tracts, which together form a partition of it. The chromosome has a finite, immutable length.

tracts

The list of tracts that span this chromosome.

Type:

list of tract objects

len

The length of this chromosome, in Morgans.

Type:

int

start

The starting point of this chromosome, in Morgans. This is set to the starting point of the first known tract, which may be greater than zero if the chromosome starts with a segment of unknown ancestry.

Type:

int

end

The ending point of this chromosome, in Morgans. This is set to the ending point of the last known tract, which may be less than the chromosome’s length if it ends with a segment of unknown ancestry.

Type:

int
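As a rough illustration of how start and end relate to the tract list, the sketch below derives both values from the first and last tracts whose labels are not in a set of unknown labels. ToyTract and known_span are hypothetical stand-ins written for this example; they are not part of the tracts package, and the positions used are made-up values.

```python
from collections import namedtuple

# Hypothetical stand-in for a tract: an interval (in Morgans) with a label.
ToyTract = namedtuple("ToyTract", ["start", "end", "label"])


def known_span(tracts, unknown_labels):
    """Return (start, end) of the region bounded by known-ancestry tracts."""
    known = [t for t in tracts if t.label not in unknown_labels]
    if not known:
        raise ValueError("no tract of known ancestry")
    return known[0].start, known[-1].end


toy_tracts = [
    ToyTract(0.0, 0.25, "UNKNOWN"),  # leading segment of unknown ancestry
    ToyTract(0.25, 1.5, "A"),
    ToyTract(1.5, 2.0, "UNKNOWN"),   # trailing segment of unknown ancestry
]
span = known_span(toy_tracts, {"UNKNOWN"})
```

Here the span starts after zero and ends before the full length of 2.0, matching the two cases described for start and end.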

unknown_labels

The set of labels that are considered to correspond to unknown ancestry. This is used by the smooth_unknown() method to identify which segments to remove.

Type:

set of strings

__init__(ls=None, label='POP', tracts=None)

Constructor.

Parameters:
  • ls (int) – The length of this chromosome, in Morgans.

  • label (str) – An identifier categorizing this chromosome.

  • tracts (list[Tract]) – The list of tracts that span this chromosome. If None is given, then a single, unlabeled tract is created to span the whole chromosome, according to the length ls.

extract(start, end)

Extracts a segment from the chromosome.

Parameters:
  • start (int) – The starting point of the desired segment to extract.

  • end (int) – The ending point of the desired segment to extract.

Returns:

A list of tract objects that span the desired interval.

Return type:

list

Notes

Uses the goto() method of this class to identify the starting and ending points of the segment, so if those positions are invalid, goto() will raise a ValueError.
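The lookup-then-slice idea described in the note can be sketched as follows. This is a minimal illustration, not the library's implementation: ToyTract, toy_goto and toy_extract are hypothetical names, and clipping the outer tracts to the requested interval is an assumption made for the example.

```python
from collections import namedtuple

# Hypothetical stand-in for a tract: an interval (in Morgans) with a label.
ToyTract = namedtuple("ToyTract", ["start", "end", "label"])


def toy_goto(tracts, pos):
    """Index of the first tract whose closed interval contains pos."""
    for i, t in enumerate(tracts):
        if t.start <= pos <= t.end:
            return i
    raise ValueError(f"position {pos} is outside the chromosome")


def toy_extract(tracts, start, end):
    """Tracts covering [start, end], with the two outer tracts clipped."""
    if start > end:
        raise ValueError("start must not exceed end")
    first = toy_goto(tracts, start)  # raises ValueError if start is invalid
    last = toy_goto(tracts, end)     # raises ValueError if end is invalid
    piece = list(tracts[first:last + 1])
    # Assumption: trim the outer tracts so the result spans exactly [start, end].
    piece[0] = piece[0]._replace(start=start)
    piece[-1] = piece[-1]._replace(end=end)
    return piece


toy_tracts = [
    ToyTract(0.0, 0.5, "A"),
    ToyTract(0.5, 1.25, "B"),
    ToyTract(1.25, 2.0, "A"),
]
segment = toy_extract(toy_tracts, 0.25, 1.5)
```

In this sketch an out-of-range position surfaces as the ValueError raised by the lookup helper, mirroring the behavior described for goto().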

goto(pos)

Finds the first tract containing a given position, in Morgans, and returns its index in the underlying list.

Parameters:

pos (int) – The position, in Morgans, to find.

Returns:

The index of the first tract containing the given position.

Return type:

int
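One way to picture this lookup is a binary search over tract end points, which works because the tracts partition the chromosome in order. The sketch below is only an illustration under that assumption; ToyTract and toy_goto are hypothetical names, and the library may well use a different search and different boundary handling.

```python
from bisect import bisect_left
from collections import namedtuple

# Hypothetical stand-in for a tract: an interval (in Morgans) with a label.
ToyTract = namedtuple("ToyTract", ["start", "end", "label"])


def toy_goto(tracts, pos):
    """Index of the first tract containing pos, assuming sorted, contiguous tracts."""
    if not tracts or pos < tracts[0].start or pos > tracts[-1].end:
        raise ValueError(f"position {pos} is outside the chromosome")
    ends = [t.end for t in tracts]
    # The first tract whose end is >= pos is the first one that contains pos.
    return bisect_left(ends, pos)


toy_tracts = [
    ToyTract(0.0, 0.5, "A"),
    ToyTract(0.5, 1.25, "B"),
    ToyTract(1.25, 2.0, "A"),
]
inside = toy_goto(toy_tracts, 0.75)
boundary = toy_goto(toy_tracts, 0.5)
```

A position that falls exactly on a boundary belongs to two tracts here, and the sketch resolves it to the earlier one, in keeping with "first tract containing the given position".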

is_equal(chrom)

Check if two chromosomes are equal, in terms of their tracts.

Parameters:

chrom (Chrom) – The chromosome to compare to.

Returns:

True if the two chromosomes have the same tracts, False otherwise.

Return type:

bool

merge_ancestries(ancestries, newlabel)

Merges segments that are contiguous and either have the same ancestry or are labeled as belonging to a specified list. The label of each tract in the chromosome’s inner list is checked against the labels listed in ancestries. If a match is found, the tract is relabeled to newlabel. This batch relabeling allows several technically different ancestries to be treated as equivalent by assigning them the same label. The resulting list is then smoothed to combine adjacent tracts with identical labels. This new list replaces the original tracts list.

Parameters:
  • ancestries (list) – The ancestries to merge.

  • newlabel (str) – The identifier for the new ancestry to assign to the matching tracts.
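The relabel-then-smooth sequence described above can be sketched in a few lines. ToyTract and toy_merge_ancestries are hypothetical names used only for illustration, the ancestry labels are made up, and the sketch returns a new list rather than replacing a list in place as the method does.

```python
from collections import namedtuple

# Hypothetical stand-in for a tract: an interval (in Morgans) with a label.
ToyTract = namedtuple("ToyTract", ["start", "end", "label"])


def toy_merge_ancestries(tracts, ancestries, newlabel):
    """Relabel tracts whose label is in ancestries, then fuse equal neighbors."""
    relabeled = [
        t._replace(label=newlabel) if t.label in ancestries else t
        for t in tracts
    ]
    merged = []
    for t in relabeled:
        if merged and merged[-1].label == t.label and merged[-1].end == t.start:
            # Contiguous and identically labeled: extend the previous tract.
            merged[-1] = merged[-1]._replace(end=t.end)
        else:
            merged.append(t)
    return merged


toy_tracts = [
    ToyTract(0.0, 0.5, "EUR"),
    ToyTract(0.5, 1.0, "ASN"),
    ToyTract(1.0, 1.5, "AFR"),
    ToyTract(1.5, 2.0, "AFR"),
]
merged = toy_merge_ancestries(toy_tracts, ["EUR", "ASN"], "NONAFR")
```

Note that the two adjacent "AFR" tracts are fused as well, since smoothing combines any neighbors with identical labels, not only the relabeled ones.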

plot(canvas, colordict, height=0, chrwidth=0.1)

Plots this chromosome on the provided canvas.

Parameters:
  • canvas (Canvas) – The canvas to plot on.

  • colordict (dict) – A dictionary mapping tract labels to colors, used to determine the color of each tract when plotting.

  • height (float) – The height at which to plot this chromosome. This is used to stack multiple chromosomes on top of each other when plotting a population.

  • chrwidth (float) – The width of the chromosome when plotting. This is used to stack the two copies of a chromosome pair on top of each other when plotting a population.

smooth_unknown()

Removes segments of unknown ancestry. Unknown segments at the beginning and end of the chromosome are removed. Internal unknown segments are also removed, and the neighboring segments are extended to occupy the space previously assigned to them.
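A minimal sketch of this smoothing is given below. The text does not say how the space of an internal unknown segment is divided between its neighbors, so splitting it at the midpoint is an assumption of this example. ToyTract and toy_smooth_unknown are hypothetical names, and the sketch returns a new list instead of modifying a chromosome.

```python
from collections import namedtuple

# Hypothetical stand-in for a tract: an interval (in Morgans) with a label.
ToyTract = namedtuple("ToyTract", ["start", "end", "label"])


def toy_smooth_unknown(tracts, unknown_labels):
    """Drop unknown tracts; neighbors absorb internal gaps (midpoint split assumed)."""
    work = list(tracts)
    # Unknown segments at either end are simply removed.
    while work and work[0].label in unknown_labels:
        work.pop(0)
    while work and work[-1].label in unknown_labels:
        work.pop()

    smoothed = []
    i = 0
    while i < len(work):
        if work[i].label not in unknown_labels:
            smoothed.append(work[i])
            i += 1
            continue
        # Find the end of this run of unknown tracts; a known tract always follows.
        j = i
        while work[j].label in unknown_labels:
            j += 1
        midpoint = (work[i].start + work[j - 1].end) / 2
        smoothed[-1] = smoothed[-1]._replace(end=midpoint)  # left neighbor grows
        work[j] = work[j]._replace(start=midpoint)          # right neighbor grows
        i = j
    return smoothed


toy_tracts = [
    ToyTract(0.0, 0.25, "UNKNOWN"),
    ToyTract(0.25, 1.0, "A"),
    ToyTract(1.0, 1.5, "UNKNOWN"),
    ToyTract(1.5, 2.0, "B"),
    ToyTract(2.0, 2.5, "UNKNOWN"),
]
smoothed = toy_smooth_unknown(toy_tracts, {"UNKNOWN"})
```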

tractlengths()

Gets the list of tract lengths. Ensure that the tracts have been properly smoothed before calling this method.

Returns:

A list of tuples, where each tuple contains the ancestry label of a tract, the length of the tract, and the length of the chromosome.

Return type:

list of tuples
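The shape of the returned list can be illustrated with a small sketch. ToyTract and toy_tractlengths are hypothetical names, and passing the chromosome length as an explicit argument is an assumption of this example; the method itself takes no arguments.

```python
from collections import namedtuple

# Hypothetical stand-in for a tract: an interval (in Morgans) with a label.
ToyTract = namedtuple("ToyTract", ["start", "end", "label"])


def toy_tractlengths(tracts, chrom_len):
    """One (label, tract length, chromosome length) tuple per tract."""
    return [(t.label, t.end - t.start, chrom_len) for t in tracts]


toy_tracts = [
    ToyTract(0.0, 0.5, "A"),
    ToyTract(0.5, 2.0, "B"),
]
lengths = toy_tractlengths(toy_tracts, 2.0)
```

Each tuple carries the chromosome length alongside the tract length, so a tract can be related to the chromosome it came from without a separate lookup.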