fastcxt

Figure Gallery

Publication-quality figures from the comprehensive Ag1000G Anopheles gambiae analysis. All plots are auto-generated from inference results across 16 African populations, 5 chromosome arms, 3 karyotype groups, and both intra- and inter-individual pair types.

Live results

These figures update automatically as inference completes. Sections marked PENDING will populate as more populations finish processing. Re-run generate_all_figures.py after syncing new results.

Overview

Cross-population summary of mean coalescence times across all karyotype groups and pair types.

Overview heatmap — mean TMRCA across populations and groups

Inversion Signals

Intra-individual coalescence along chromosome 2L reveals the In(2L)a inversion signature. Heterozygous individuals show deep coalescence inside the inversion where recombination is suppressed between standard and inverted arrangements. Homozygotes recombine freely within their arrangement.

Burkina Faso 2L inversion signal by karyotype

Burkina Faso — three karyotype groups overlaid. The red shaded region marks In(2L)a (20.5–42.2 Mb). Accessibility track at bottom shows data quality.

Cameroon 2L inversion signal by karyotype

Cameroon, same three karyotype groups. The inversion signature is similar to the one in Burkina Faso.

Central African Republic 2L inversion signal by karyotype

Central African Republic — 2L inversion signal (2L data only, 2R in progress).

Other populations

PENDING — will populate as inference completes for the remaining 13 populations.

Outlier Skylines

Candidate genomic regions with extreme coalescence times, filtered by accessibility to avoid false positives from missing data. Gene annotations from VectorBase AgamP4 are overlaid using adjustText for non-overlapping labels.

Outlier skyline — 2L heterozygous intra

Burkina Faso, 2L, 2La-heterozygous, intra-individual. Blue = young credible outliers (accessible regions only), red = old credible, gray x = suspect (low accessibility). Bottom panels: Z-score and accessibility fraction.

Outlier skyline — 2L hom inverted intra

Same as above, for the homozygous-inverted group. Note the deep dips near Rdl (28.5 Mb) and CDSB21, which are potential selection signatures within the inverted arrangement.

Outlier skyline — Cameroon 2L heterozygous intra

Cameroon, 2L, 2La-heterozygous, intra-individual.

Outlier skyline — Cameroon 2R hom standard intra

Cameroon, 2R, 2Rb-hom-standard, intra-individual.
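The outlier labelling used in these skylines (young credible, old credible, suspect) can be sketched as follows. This is a minimal illustration, not the gallery's actual code: it assumes Z-scores are computed on log10 TMRCA per block, and the thresholds `z_cut` and `min_access` are hypothetical values chosen for the example.

```python
import math
import statistics

def classify_blocks(tmrca, accessibility, z_cut=2.0, min_access=0.5):
    """Label each block 'normal', 'young', 'old', or 'suspect'.

    z_cut and min_access are illustrative thresholds (assumptions),
    not values taken from the fastcxt pipeline.
    """
    log_t = [math.log10(t) for t in tmrca]
    mu = statistics.fmean(log_t)
    sd = statistics.stdev(log_t)
    labels = []
    for lt, acc in zip(log_t, accessibility):
        z = (lt - mu) / sd
        if abs(z) < z_cut:
            labels.append("normal")
        elif acc < min_access:
            # Extreme value, but too little accessible sequence to trust it.
            labels.append("suspect")
        elif z < 0:
            labels.append("young")
        else:
            labels.append("old")
    return labels

# Toy data: 20 typical blocks, then one young, one old, and one
# old-looking block with low accessibility.
tmrca = [1e4] * 20 + [1e2, 1e6, 1e6]
accessibility = [0.9] * 20 + [0.9, 0.9, 0.1]
labels = classify_blocks(tmrca, accessibility)
```

Checking accessibility only for blocks that already look extreme keeps well-covered, unremarkable blocks out of the "suspect" class.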

Karyotype Comparisons

Box plots comparing block-level coalescence distributions across karyotype groups and pair types. Heterozygous intra pairs show elevated TMRCA on inversion-bearing chromosomes (2L, 2R), while 3L/3R show uniform patterns.

Burkina Faso karyotype comparison boxplots

Chromosome-Wide Profiles

TMRCA profiles across all chromosome arms with per-arm accessibility tracks.

Burkina Faso all arms profile

Each column is one chromosome arm (2L, 2R, 3L, 3R, X). Karyotype groups overlaid by color. Red shading on 2L marks In(2L)a.

Density Distributions

KDE density overlays showing the distribution of block-level coalescence times per karyotype group for each arm.

Burkina Faso density overlay

Density grid: all populations x groups

Demographic Inference

Inverse instantaneous coalescence rates (IICR), a proxy for effective population size Ne(t), estimated from the TMRCA distribution and compared to the stdpopsim A. gambiae Gabon reference model (GabonAg1000G_1A17).

Burkina Faso 2L IICR by karyotype

Burkina Faso 2L: IICR by karyotype. The heterozygous curve (orange) is inflated at intermediate times due to deep coalescence inside the inversion. Dashed black = stdpopsim Gabon reference.

Burkina Faso IICR all arms

All chromosome arms overlaid. 3L and 3R (no inversions) track the stdpopsim reference more closely, providing a cleaner demographic signal.

Cameroon IICR all arms

Cameroon — all chromosome arms overlaid.

Cross-population IICR comparison 3L

Cross-population IICR comparison on 3L (no inversions) — 9 completed populations overlaid.
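As a rough illustration of how an IICR curve can be derived from a TMRCA distribution, the sketch below uses the standard definition IICR(t) = (1 - F(t)) / f(t), where F and f are the CDF and density of pairwise coalescence times. The binning scheme and the function name `iicr_from_tmrca` are assumptions for this example, and the fastcxt estimator may differ.

```python
import bisect
import random

def iicr_from_tmrca(samples, edges):
    """Histogram estimate of IICR(t) = (1 - F(t)) / f(t), one value per bin.

    Assumes edges are sorted and edges[0] is at or below every sample.
    The result is in the same time units as the samples.
    """
    n = len(samples)
    counts = [0] * (len(edges) - 1)
    for t in samples:
        i = bisect.bisect_right(edges, t) - 1
        if 0 <= i < len(counts):
            counts[i] += 1
    iicr, seen = [], 0
    for i, c in enumerate(counts):
        width = edges[i + 1] - edges[i]
        density = c / (n * width)
        # Survival evaluated at the bin midpoint: half the bin's
        # samples are treated as already coalesced.
        survival = (n - seen - c / 2) / n
        iicr.append(survival / density if c else float("nan"))
        seen += c
    return iicr

# Sanity check on simulated data: exponential coalescence times with
# mean 1000 correspond to a constant IICR of about 1000.
rng = random.Random(1)
samples = [rng.expovariate(1 / 1000) for _ in range(20000)]
edges = [200 * k for k in range(11)]
curve = iicr_from_tmrca(samples, edges)
```

A flat curve for exponential input is the expected behaviour under a constant-size panmictic model. Departures from flatness in real data reflect size changes, structure, or (as on 2L) inversion effects.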

Geographic Maps

Coalescence patterns projected onto Africa using population coordinates from the Ag3 metadata.

Geographic bubble map

Bubble size = sample count, color = mean log-TMRCA. Three panels for hom standard, heterozygous, and hom inverted karyotypes on 2L.

Geographic sparkline map

TMRCA profiles embedded as sparklines at each population’s location. Orange lines show coalescence along 2L, red shading = In(2L)a region.

Inversion effect map

How much deeper coalescence is inside In(2L)a than outside it, per population. Positive values mean coalescence is deeper inside the inversion.
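One simple way to quantify an inside-versus-outside contrast like the one mapped above is the difference in mean log10 TMRCA between blocks inside and outside the inversion. The sketch below assumes that statistic. The gallery does not state the exact formula, and only the In(2L)a bounds (20.5–42.2 Mb) come from this page.

```python
import math
import statistics

# In(2L)a bounds as quoted in the Inversion Signals captions.
INV_START_MB, INV_END_MB = 20.5, 42.2

def inversion_effect(positions_mb, tmrca):
    """Mean log10 TMRCA inside the inversion minus the mean outside.

    Difference of log means is an assumed statistic for illustration.
    """
    inside, outside = [], []
    for pos, t in zip(positions_mb, tmrca):
        target = inside if INV_START_MB <= pos <= INV_END_MB else outside
        target.append(math.log10(t))
    return statistics.fmean(inside) - statistics.fmean(outside)

# Toy profile: tenfold deeper coalescence inside the inversion.
positions = [5.0, 10.0, 25.0, 30.0, 45.0]
tmrca = [1e4, 1e4, 1e5, 1e5, 1e4]
effect = inversion_effect(positions, tmrca)
```

On the log10 scale an effect of 1.0 corresponds to a tenfold deeper mean coalescence time inside the inversion.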

Population Structure

PCA and clustering of populations based on their coalescence profiles across all chromosome arms and karyotype groups.

PCA of coalescence profiles

PCA on all features (arms x karyotype groups x pair types). PC1 (66% of variance) separates East Africa (Kenya) from West/Central. Blue = West, purple = Central, red = East Africa.

PCA projected onto Africa map

PC1 projected onto geography. A smooth west-to-east gradient in coalescence profiles mirrors geographic distance.
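To make the PC1 percentage concrete, the sketch below computes the first principal component and its share of total variance for a small populations-by-features matrix. It uses power iteration in pure Python so that it runs without dependencies, whereas a real analysis would more likely use NumPy or scikit-learn. The feature matrix is made up for illustration.

```python
import math

def pc1(rows, n_iter=200):
    """Return (scores, explained_fraction) for the first principal component."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    X = [[r[j] - means[j] for j in range(m)] for r in rows]  # column-centred
    total_var = sum(x * x for r in X for x in r)
    # Asymmetric start vector, so it is unlikely to be orthogonal to PC1.
    v = [float(j + 1) for j in range(m)]
    for _ in range(n_iter):
        scores = [sum(r[j] * v[j] for j in range(m)) for r in X]            # X v
        w = [sum(X[i][j] * scores[i] for i in range(n)) for j in range(m)]  # X^T X v
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("start vector is orthogonal to the data")
        v = [x / norm for x in w]
    scores = [sum(r[j] * v[j] for j in range(m)) for r in X]
    return scores, sum(s * s for s in scores) / total_var

# Hypothetical feature matrix: rows are populations, columns are
# mean log-TMRCA features. The last row is the divergent population.
features = [
    [1.0, 2.0, 0.5],
    [1.2, 1.9, 0.4],
    [0.9, 2.1, 0.6],
    [3.0, 4.0, 0.5],
]
scores, explained = pc1(features)
```

The sign of the scores is arbitrary, so only relative positions along PC1 are meaningful.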

Isolation by distance

Geographic distance vs coalescence profile distance (r = 0.318, p = 0.009). Dots = within-region pairs, crosses = between-region pairs.
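An isolation-by-distance comparison of this kind is commonly tested with a Mantel permutation test. The sketch below assumes that approach, along with great-circle (haversine) geographic distances and Euclidean distances between coalescence profiles. The source does not say which test produced the reported r and p, and the coordinates and profiles here are toy values.

```python
import math
import random

def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = (math.radians(v) for v in (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel(coords, profiles, n_perm=999, seed=0):
    """Correlate geographic and profile distances; one-sided permutation p."""
    n = len(coords)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    geo = [haversine_km(coords[i], coords[j]) for i, j in pairs]
    dist = [[math.dist(p, q) for q in profiles] for p in profiles]
    r_obs = pearson(geo, [dist[i][j] for i, j in pairs])
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        # Permute population labels of the profile matrix only.
        perm = list(range(n))
        rng.shuffle(perm)
        r = pearson(geo, [dist[perm[i]][perm[j]] for i, j in pairs])
        hits += r >= r_obs
    return r_obs, (hits + 1) / (n_perm + 1)

# Toy example: six sites along the equator with a linear profile gradient.
coords = [(0.0, 10.0 * k) for k in range(6)]
profiles = [[float(k)] for k in range(6)]
r, p = mantel(coords, profiles)
```

Permuting labels rather than individual distances preserves the dependence among pairs that share a population, which is why a plain correlation p-value would be too optimistic here.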

Hierarchical clustering dendrogram

Average-linkage clustering. West African populations form a tight cluster, Central African populations group together, and Kenya is an outgroup.

High-Resolution Structure

PCA and UMAP using per-block TMRCA profiles across 950 genomic blocks (3L + 3R), capturing fine-scale coalescence variation rather than just population means.

High-resolution PCA

PCA on block-resolution TMRCA profiles. PC1 (74%) captures the dominant west-to-east differentiation axis.

UMAP of TMRCA profiles

UMAP embedding reveals finer population clustering: West African populations separate from Central, with Kenya and Gambia as outliers.

Per-pair PCA

Per-pair PCA on chr3L (375 individual haplotype pairs). Each dot is one within-individual pair, colored by population. Populations form overlapping but distinguishable clouds.

PC1 loadings along the genome

PC1 loadings across 3L and 3R with gene annotations (AGAP IDs) at peak positions. Bottom track shows raw data accessibility (missingness). Peaks indicate candidate regions for geographically varying selection.

PC1 loadings all 5 chromosome arms

PC1 loadings across all 5 chromosome arms (2L, 2R, 3L, 3R, X) with gene annotations and accessibility track. The full ~230 Mb genome in one view.

Gene Region Zooms

High-resolution TMRCA profiles (200 bp windows) at candidate gene regions, comparing all populations. Bottom track shows coefficient of variation across populations — peaks indicate regions of geographically varying coalescence.

Gene zoom multi-panel

Six candidate gene regions: para/Vgsc (pyrethroid resistance), Rdl (dieldrin resistance), dpr2 (cell adhesion), Tep1 (immunity), CuSOD3 (oxidative stress), CYP9K1 (metabolic resistance). Each panel shows median TMRCA per population (blue = West, green = Central, red = East Africa).

Vgsc gene zoom

para/Vgsc locus on 2L — voltage-gated sodium channel conferring pyrethroid resistance. Gene bodies annotated. Bottom: CV across populations peaks near the gene, indicating population-specific selection signatures.

dpr2 gene zoom

dpr2 locus on 3R — a top PC1 loading peak. Population-specific TMRCA variation visible around the gene.
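The coefficient-of-variation track shown under these zooms can be computed per window as the standard deviation across populations divided by the mean. The sketch below assumes the sample standard deviation and a dict of per-population window values. Both the data layout and the function name `cv_track` are illustrative.

```python
import statistics

def cv_track(tmrca_by_pop):
    """Per-window coefficient of variation across populations.

    tmrca_by_pop maps a population name to its list of per-window
    TMRCA values. All lists must have the same length.
    """
    windows = zip(*tmrca_by_pop.values())
    return [statistics.stdev(w) / statistics.fmean(w) for w in windows]

# Toy data: populations agree in windows 0 and 2 and differ in window 1.
tmrca_by_pop = {
    "A": [100.0, 100.0, 100.0],
    "B": [100.0, 200.0, 100.0],
    "C": [100.0, 300.0, 100.0],
}
cv = cv_track(tmrca_by_pop)
```

Dividing by the mean makes windows with very different absolute TMRCA comparable, so a peak marks disagreement between populations rather than simply old coalescence.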

Statistical Enrichment

Permutation testing and functional enrichment analysis of TMRCA outlier regions and population differentiation peaks.

Permutation test

Resistance loci are significantly enriched among TMRCA outlier blocks (p < 0.0001, 10,000 permutations). Across all populations, 21 overlaps were observed against 4.9 expected by chance. Individual population tests show the strongest signal in Ghana (14.8-fold, p = 0.0005) and DRC (19.8-fold, p = 0.003).
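A permutation test of this type can be sketched as follows: count the outlier blocks that overlap resistance loci, then redraw the same number of outlier blocks at random many times to build a null distribution. Uniform random placement of outliers is an assumption of this example. The actual null model (for instance, one that preserves accessibility or chromosome arm) is not described on this page, and the block indices are made up.

```python
import random

def enrichment_test(n_blocks, outlier_idx, locus_idx, n_perm=10_000, seed=0):
    """Return (observed overlaps, expected overlaps, one-sided p-value)."""
    outliers, loci = set(outlier_idx), set(locus_idx)
    observed = len(outliers & loci)
    rng = random.Random(seed)
    null = []
    for _ in range(n_perm):
        # Redraw the same number of outlier blocks uniformly at random.
        shuffled = rng.sample(range(n_blocks), len(outliers))
        null.append(sum(1 for b in shuffled if b in loci))
    expected = sum(null) / n_perm
    # Add-one correction keeps the p-value strictly above zero.
    p = (sum(k >= observed for k in null) + 1) / (n_perm + 1)
    return observed, expected, p

# Toy setup: 1000 blocks, 50 outliers, 20 resistance-locus blocks,
# 10 of which coincide with outliers.
outlier_idx = range(50)
locus_idx = list(range(10)) + list(range(500, 510))
observed, expected, p = enrichment_test(1000, outlier_idx, locus_idx)
fold = observed / expected
```

With the add-one correction, the smallest attainable p-value for 10,000 permutations is about 0.0001, which is why results at that resolution are reported as upper bounds.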

Functional enrichment

Functional categories enriched at PC1 loading peaks (population differentiation drivers). Ribosomal genes are significantly enriched (1.8x, p = 0.035). Heat shock/chaperone and calcium signaling genes show trends (p < 0.1).

Annotation status

44% of genes at population differentiation peaks are uncharacterized — candidates for functional follow-up.

Progress & Pending Figures

Completed populations (12/16): Burkina Faso, Cameroon, Central African Republic, Democratic Republic of the Congo, Equatorial Guinea, Gabon, Gambia, Ghana, Guinea, Guinea-Bissau, Kenya, Mali (all 5 arms each).

Currently running: Mayotte on poppy (4 populations remaining).

The following will be generated as inference completes:

  • PENDING: Per-population inversion signals (4 remaining)
  • PENDING: Full cross-population IICR comparison (all 16 overlaid)
  • PENDING: Geographic maps with all populations filled in
  • PENDING: Full density grid with all populations (16 rows x 9 cols)
Copyright © 2025–2026, Kevin Korfmann