FASTA
Prebuilt FASTA Data
Load yeast test files (sourced from YeastGenome)
bio_test_artifacts.prebuilt.fasta_yeast_chr01()
Make a FASTA file containing yeast chromosome 1 (chr1).
- Returns
An absolute path to the FASTA file and a list [(header, sequence)]
- Return type
str, list(tuple(str, str))
bio_test_artifacts.prebuilt.fasta_protein_yeast_chr01()
Make a FASTA file containing the first two protein sequences located on yeast chromosome 1 (chr1).
- Returns
A path to the FASTA file and a list [(header0, sequence0), (header1, sequence1)]
- Return type
str, list(tuple(str, str))
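Both prebuilt functions return a file path together with a list of (header, sequence) tuples. As a minimal sketch of how such a list relates to the contents of a FASTA file, the standard-library-only example below parses a small file into the same shape. The helper name `read_fasta` is hypothetical and not part of bio_test_artifacts, and stripping the leading `>` from each header is an assumption, since the format of the returned headers is not stated here.

```python
import os
import tempfile


def read_fasta(path):
    """Parse a FASTA file into a list of (header, sequence) tuples.

    Hypothetical helper for illustration; not part of bio_test_artifacts.
    """
    records = []
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if there was one
                if header is not None:
                    records.append((header, "".join(chunks)))
                # Assumption: headers are stored without the leading ">"
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)  # a sequence may wrap over several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Write a tiny two-record file to demonstrate the round trip
with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
    tmp.write(">rec0\nATGC\nGGCC\n>rec1\nTTAA\n")
    demo_path = tmp.name

demo_records = read_fasta(demo_path)
os.remove(demo_path)
```

A parser like this can be handy in a test that wants to compare a tool's output file against the record list returned alongside the path.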
Synthesized FASTA Data
Synthesize a FASTA file
bio_test_artifacts.generate.fasta_generate(num_records=20, record_length=100, random_seed=42, target_file=None, alphabet='ATGC', probabilities=None)
Generate a random FASTA file.
- Parameters
num_records (int, optional) – Number of records to include in the file. Defaults to 20.
record_length (int, optional) – Length, in bases, of each record in the file. Defaults to 100.
random_seed (int, optional) – Seed for the random number generator. Defaults to 42.
target_file (str, optional) – Path of the file to write. If not set, a temporary file is created in $TMP.
alphabet (str, optional) – String of characters from which each record's sequence is drawn. Defaults to 'ATGC'.
probabilities (tuple(float), optional) – An iterable of probabilities, one for each character in the alphabet string. Defaults to balanced (equal) probabilities.
- Returns
An absolute path to the generated FASTA file and a list of (header, seq) tuples
- Return type
str, list(tuple(str, str))
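To make the parameters above concrete, here is a rough sketch of a generator with the same signature, written with the standard library only. It is not the bio_test_artifacts implementation: the function name `sketch_fasta_generate`, the `record_{i}` header naming, the `.fasta` suffix, and the use of `tempfile.mkstemp` for the fallback file are all assumptions made for illustration.

```python
import os
import random
import tempfile


def sketch_fasta_generate(num_records=20, record_length=100, random_seed=42,
                          target_file=None, alphabet="ATGC", probabilities=None):
    """Illustrative stand-in that mirrors the documented parameters.

    Not the bio_test_artifacts implementation; header naming is an assumption.
    """
    rng = random.Random(random_seed)  # local RNG so a given seed is repeatable
    if target_file is None:
        # Assumption: fall back to a temp file in the system temp directory
        fd, target_file = tempfile.mkstemp(suffix=".fasta")
        os.close(fd)
    records = []
    for i in range(num_records):
        # weights=None makes random.choices sample uniformly (balanced default)
        seq = "".join(rng.choices(alphabet, weights=probabilities, k=record_length))
        records.append((f"record_{i}", seq))
    with open(target_file, "w") as handle:
        for header, seq in records:
            handle.write(f">{header}\n{seq}\n")
    return os.path.abspath(target_file), records


path, records = sketch_fasta_generate(num_records=3, record_length=10)
```

Seeding a local `random.Random` instance, rather than the module-level generator, keeps the output repeatable for a given seed without disturbing other code that relies on the global random state.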