Utilities
The utilities module holds general methods for file naming, handling OS differences, chemical formula parsing, etc.
- class sasmol.utilities.Copy_Using_Mask
Bases: object
- classmethod from_sasmol(class_instance, mask, **kwargs)
- Parameters
class_instance – system object
mask – integer list of the atom indices to copy
kwargs – optional future keyword arguments
- Returns
new system object
- Return type
molecule
Examples
>>> import sasmol.system as system
>>> import sasmol.utilities as utilities
>>> molecule = system.Molecule('hiv1_gag.pdb')
>>> molecule.name()[0]
'N'
>>> molecule.name()[4]
'CA'
>>> molecule.name()[14]
'HB1'
>>> mask = [0, 4, 14]
>>> new_molecule = utilities.Copy_Using_Mask.from_sasmol(molecule, mask)
>>> new_molecule.name()
['N', 'CA', 'HB1']
>>> new_molecule.mass()
array([ 14.00672 , 12.01078 , 1.007947])
Note
- if more attributes are added in system.Atom(), the key lists used by this method need to be updated
- currently only list_keys and numpy_keys are returned
- int_keys would have to be recalculated based on mask
- short_keys would have to be re-initialized (init_children)
- may want to consider passing a list of specific attributes to extract if memory is an issue
- copies all frames (untested)
- open question: why can't this method be in subset?
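The idea of copying per-atom attributes through an index mask can be sketched in plain Python. This is a minimal illustration only, not sasmol's implementation: the function `copy_using_mask` and the attribute dictionary are hypothetical, and the sketch assumes each attribute is a simple list with one entry per atom.

```python
def copy_using_mask(attributes, mask):
    """Return a new dict whose per-atom lists keep only the indices in mask.

    Hypothetical helper for illustration; it does not use or mirror sasmol's API.
    """
    return {key: [values[i] for i in mask] for key, values in attributes.items()}


# Hypothetical per-atom attributes, one list entry per atom.
atoms = {
    "name": ["N", "H1", "H2", "H3", "CA"],
    "element": ["N", "H", "H", "H", "C"],
}

subset = copy_using_mask(atoms, [0, 4])
print(subset["name"])     # ['N', 'CA']
print(subset["element"])  # ['N', 'C']
```

Because every attribute is indexed with the same mask, the copied lists stay aligned with one another, which is the property the notes above rely on when they distinguish keys that can be copied directly from keys that must be recalculated.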
- sasmol.utilities.parse_fasta(fasta_sequence, **kwargs)
Converts a fasta_sequence object into a list of strings, one for each valid sequence in the initial object.
The format is based on the NCBI FASTA format convention: https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=BlastHelp
- Parameters
fasta_sequence – list with formatted fasta input
kwargs – optional future arguments
- Returns
all_sequences – a list containing sequences without comments, spaces, carriage returns, numbers, or termination flags; or an error indicating an empty line in the file
Examples
>>> import sasmol.utilities as utilities
>>> fasta_sequence = open('test_fasta.txt', 'r').readlines()
>>> all_sequences = utilities.parse_fasta(fasta_sequence)
>>> print(all_sequences)
Note
1) lines beginning with > or ; are treated as comments and passed
2) spaces are ignored
3) * are ignored (should be a termination)
4) numbers are ignored so you can have numbering at the beginning of a line
5) - are processed
6) comment lines are NOT required in the input
7) comment lines cause a new sequence to be started
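The rules listed above can be sketched as a small standalone parser. This is a hypothetical re-implementation for illustration, not the code in sasmol: the name `parse_fasta_sketch` is invented, the termination flag is assumed to be `*` as in the common FASTA convention, and raising `ValueError` on an empty line is an assumption about how the error is reported.

```python
def parse_fasta_sketch(lines):
    """Apply the parsing rules above to a list of lines (illustrative only)."""
    sequences = []
    current = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            # Assumption: an empty line is reported as an error.
            raise ValueError("empty line in FASTA input")
        if stripped[0] in ">;":
            # A comment line closes the sequence collected so far, if any.
            if current:
                sequences.append("".join(current))
                current = []
            continue
        # Keep only letters and '-'. This drops spaces, digits, and '*',
        # and in this sketch it also drops any other character.
        current.extend(ch for ch in stripped if ch.isalpha() or ch == "-")
    if current:
        sequences.append("".join(current))
    return sequences


example = [">first\n", "1 MKV LA*\n", ";second\n", "GA-T\n"]
print(parse_fasta_sketch(example))  # ['MKVLA', 'GA-T']
```

Input without any comment lines yields a single sequence, since only a comment line starts a new one.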