API

pydamage.main.pydamage_analyze(bam, wlen=30, minlen=0, threshold=0.5, show_al=False, process=1, outdir='', plot=False, verbose=False, force=False, group=False, rescale=False, subsample=None, no_ga=False)

Runs the PyDamage analysis separately for each reference, or on the entire file as a single reference when group is True.

Parameters:
  • bam (str) – Path to alignment (sam/bam/cram) file

  • minlen (int) – minimum reference sequence length threshold

  • threshold (float) – Predicted accuracy threshold

  • wlen (int) – window length

  • show_al (bool) – print alignments representations

  • process (int) – Number of processes for parallel computing

  • outdir (str) – Path to output directory

  • verbose (bool) – verbose mode

  • force (bool) – force overwriting of results directory

  • group (bool) – Use entire BAM file as single reference for analysis

  • plot (bool) – Write damage fitting plots to disk

  • rescale (bool) – Rescale base quality scores using the PyDamage damage model

  • subsample (float) – Subsample a fraction of the reads for damage modelling

  • no_ga (bool) – Do not use G->A transitions

Returns:

pandas DataFrame containing the PyDamage results

Return type:

pd.DataFrame
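
Example:

The following is a minimal sketch of how the function might be called from Python, based only on the signature documented above. The helper names build_analyze_kwargs and run_analysis are hypothetical and are not part of PyDamage. The range check on subsample is an assumption drawn from its description as "a fraction of the reads". The pydamage import is deferred so the helpers can be loaded even where the package is not installed.

```python
from typing import Any, Dict, Optional


def build_analyze_kwargs(
    bam: str,
    outdir: str = "",
    process: int = 1,
    subsample: Optional[float] = None,
    **overrides: Any,
) -> Dict[str, Any]:
    """Collect keyword arguments for pydamage_analyze (hypothetical helper)."""
    # Assumption: "a fraction of the reads" means a value in (0, 1].
    if subsample is not None and not 0 < subsample <= 1:
        raise ValueError("subsample should be a fraction in (0, 1]")
    # Assumption: at least one worker process is needed.
    if process < 1:
        raise ValueError("process should be at least 1")

    kwargs: Dict[str, Any] = {
        "bam": bam,
        "outdir": outdir,
        "process": process,
        "subsample": subsample,
    }
    # Any other documented parameter (wlen, minlen, threshold, plot, ...)
    # can be passed through unchanged.
    kwargs.update(overrides)
    return kwargs


def run_analysis(**kwargs: Any):
    """Call pydamage_analyze and return its pandas DataFrame of results."""
    # Imported lazily so this module loads without PyDamage installed.
    from pydamage.main import pydamage_analyze

    return pydamage_analyze(**kwargs)
```

With PyDamage installed and an alignment file available, run_analysis(**build_analyze_kwargs("aligned.bam", outdir="pydamage_results", process=4, force=True)) would be expected to return the results DataFrame described above. The file and directory names here are placeholders.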