API Tutorial
************

Aldy can be used as a standalone application or as a Python module.

Here is a quick overview of the functions that Aldy provides.
Detailed API documentation is available in :ref:`aldy_api`.


One-step genotyping API
=======================

You can get the genotypes of a SAM/BAM with a single call::

  import aldy.genotype

  result = aldy.genotype.genotype(
    "cyp2d6", "/path/to/sample.bam", profile_name="wgs", output_file=None
  )
  print(result)

Please check :py:class:`aldy.solution.SolvedAllele` and :py:mod:`aldy.genotype` for more details.


Loading a gene
==============

We always start by loading a gene database for a gene that is to be genotyped::

  import aldy.gene
  gene = aldy.gene.Gene("path/to/gene.yaml", genome="hg38")


Alternatively, we can use genes shipped with Aldy::

  import aldy.gene
  import aldy.common

  gene_path = aldy.common.script_path("aldy.resources.genes/cyp2d6.yml")
  gene = aldy.gene.Gene(gene_path)  # for CYP2D6


Loading a SAM/BAM or VCF file
=============================

Before working with alignment files, it is necessary to load a profile as follows::

  import aldy.profile
  profile = aldy.profile.Profile.load(gene, "illumina")

If using VCF files, construct the profile as follows::

  profile = Profile("user_provided", cn_solution=["1", "1"])

Various model parameters can be passed to the :py:meth:`aldy.profile.Profile.load` method.
One such parameter is a custom copy-number neutral region that can be passed via the ``cn_region`` parameter
(by default, the *CYP2D8* region is used).
Please consult :py:class:`aldy.profile.Profile` for details about the other parameters.

A SAM/BAM/CRAM or a VCF file can be loaded as follows::

  import aldy.sam
  sample = sample = aldy.sam.Sample(gene, profile, path="my/sample.bam")

Pass a path to the reference genome (``reference`` parameter) when using a CRAM file.

Detecting copy number configurations and fusions
================================================

To detect copy number configurations, run::

  import aldy.cn
  cn_sols = aldy.cn.estimate_cn(gene, profile, sample.coverage, solver="cbc")

You can also use ``solver="gurobi"`` if you have Gurobi installed.

The result is a list of :py:class:`aldy.solutions.CNSolution` objects described in API.


Calling the major and minor star-alleles
========================================

Once you get copy number solutions, you can call valid major star-alleles for each copy number solution::

  import aldy.major

  major_sols = [sol
                for cn_sol in cn_sols
                for sol in aldy.major.estimate_major(gene, sample.coverage, cn_sol, "cbc")]
  # Get the best major star-allele score
  min_score = min(major_sols, key=lambda m: m.score).score
  # Take the best major star-allele calls
  major_sols = sorted([m for m in major_sols if abs(m.score - min_score) < 1e-3],
                     key=lambda m: m.score)

You are pretty much set if you need only major star-alleles.
However, if you have multiple equally likely major star-allele calls, or if you need
to get the complete star-allele decomposition, then do the following::

  import aldy.minor

  minor_sols = aldy.minor.estimate_minor(gene, sample.coverage, major_sols, "cbc")
  # Get the best minor star-allele score
  min_score = min(minor_sols, key=lambda m: m.score).score
  # Get the best minor (and major) star-allele
  minor_sols = [m for m in minor_sols if abs(m.score - min_score) < 1e-3]

``minor_sols`` will contain a list of :py:class:`aldy.solutions.MinorSolution` objects
that point to the optimal major star-alleles and the corresponding copy numbers.

Finally, if you want to get a nice diplotype (e.g., \*1/\*2+\*3), just type::

  minor_solution.get_major_diplotype()
  minor_solution.get_minor_diplotype()  # if interested in minor star-alleles

A more detailed explanation of these functions is available in the :ref:`aldy_api`.