Alignment module
This module contains rules for carrying out sequence alignment, calculating summary statistics of the results, and generating plots.
Rules
- Rule align
Use MAFFT to align the combined sequence file against the project reference.
- Input:
original – the combined sequence file generated from the combine rule
reference – the project reference sequence, provided during McCoy project creation
- Config:
align.mafft – a list of command-line arguments passed directly to MAFFT (default: ['--6merpair', '--keeplength', '--addfragments'])
align.threads – the number of threads (cores) to use for a single MAFFT call (default: 4)
align.resources – the resources to request when submitting to a cluster (default: {'runtime': 10, 'mem_mb': 8000})
- Output:
the aligned version of the original input file
- Params:
the command-line arguments passed to MAFFT (set in the align.mafft config entry)
- Threads:
set to align.threads from the config file if present; otherwise set by the number of cores available to the workflow (up to threads_max)
- Resources:
set to align.resources in the project config, if present
- Conda:
channels:
  - bioconda
  - conda-forge
dependencies:
  - mafft==7.471
  - seqkit==2.1.0
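The way the Config, Params, and Threads entries of the align rule combine can be sketched in Python. This is only an illustration, not the rule's actual implementation: the helper names `resolve_threads` and `build_mafft_command`, the file names, and the `threads_max` value are hypothetical, and the real rule may assemble its command differently. The sketch assumes MAFFT's `--thread` option and assumes that, with `--addfragments` last in the argument list, the combined sequence file follows it and the reference comes after.

```python
from typing import Any, Dict, List

# Default taken from the align.mafft entry documented above.
DEFAULT_MAFFT_ARGS = ["--6merpair", "--keeplength", "--addfragments"]


def resolve_threads(config: Dict[str, Any], available_cores: int, threads_max: int) -> int:
    """Use align.threads if present, else the available cores capped at threads_max."""
    align_cfg = config.get("align", {})
    if "threads" in align_cfg:
        return int(align_cfg["threads"])
    return min(available_cores, threads_max)


def build_mafft_command(
    config: Dict[str, Any],
    original: str,
    reference: str,
    available_cores: int = 1,
    threads_max: int = 4,  # hypothetical cap, for illustration only
) -> List[str]:
    """Assemble a MAFFT argument list (hypothetical helper, not McCoy's code)."""
    align_cfg = config.get("align", {})
    mafft_args = list(align_cfg.get("mafft", DEFAULT_MAFFT_ARGS))
    threads = resolve_threads(config, available_cores, threads_max)
    # Assumed ordering: options, then the sequences to add, then the reference.
    return ["mafft", "--thread", str(threads), *mafft_args, original, reference]


if __name__ == "__main__":
    cmd = build_mafft_command(
        {"align": {"threads": 2}},
        original="combined.fasta",    # hypothetical file name
        reference="reference.fasta",  # hypothetical file name
        available_cores=8,
    )
    print(" ".join(cmd))
```

Returning the command as a list rather than a single string keeps each argument separate, which makes it straightforward to pass to `subprocess.run` without shell quoting.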
- Rule alignment_stats
- Conda:
channels:
  - jlsteenwyk
dependencies:
  - phykit
  - jlsteenwyk-biokit
- Rule pairwise_identity_histogram
- Conda:
channels:
  - conda-forge
dependencies:
  - python=3.9
  - numpy
  - typer
  - pandas
  - plotly
  - pip
  - pip:
    - kaleido