Utilities

The utilities module holds general methods for file naming, handling OS differences, chemical formula parsing, etc.

class sasmol.utilities.Copy_Using_Mask[source]

Bases: object

classmethod from_sasmol(class_instance, mask, **kwargs)[source]
Parameters
  • class_instance – system object

  • mask – integer list

  • kwargs – optional future keyword arguments

Returns

new system object

Return type

molecule

Examples

>>> import sasmol.system as system
>>> import sasmol.utilities as utilities
>>> molecule = system.Molecule('hiv1_gag.pdb')
>>> molecule.name()[0]
'N'
>>> molecule.name()[4]
'CA'
>>> molecule.name()[14]
'HB1'
>>> mask = [0, 4, 14]
>>> new_molecule = utilities.Copy_Using_Mask.from_sasmol(molecule, mask)
>>> new_molecule.name()
['N', 'CA', 'HB1']
>>> new_molecule.mass()
array([ 14.00672 ,  12.01078 ,   1.007947])

Note

  • If more attributes are added to system.Atom(), the key lists in this method need to be updated.

  • Currently only list_keys and numpy_keys are returned.

  • int_keys would have to be recalculated based on the mask.

  • short_keys would have to be re-initialized (init_children).

  • Consider passing a list of specific attributes to extract if memory is an issue.

  • All frames are copied (untested).

  • Open question: why can’t this method be in subset?
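
The mask-based copy described in the note can be illustrated with a minimal sketch. It is not the sasmol implementation: ToyMolecule, LIST_KEYS, NUMPY_KEYS, and copy_using_mask are hypothetical names, and the attribute names are assumptions. Following the example above, the mask is treated as a list of atom indices, and only list-valued and numpy-valued attributes are copied.

```python
import numpy as np


class ToyMolecule:
    """Hypothetical stand-in for a system object; not the sasmol class."""

    def __init__(self, name, mass):
        self.name = list(name)                      # list-valued attribute
        self.mass = np.asarray(mass, dtype=float)   # numpy-valued attribute


# Assumed attribute groupings, mirroring the list_keys / numpy_keys idea.
LIST_KEYS = ["name"]
NUMPY_KEYS = ["mass"]


def copy_using_mask(source, mask):
    """Return a new ToyMolecule holding only the atoms indexed by mask."""
    new = ToyMolecule.__new__(ToyMolecule)  # create the object without calling __init__
    for key in LIST_KEYS:
        values = getattr(source, key)
        setattr(new, key, [values[i] for i in mask])
    for key in NUMPY_KEYS:
        # Indexing with a list of integers already returns a new array.
        setattr(new, key, getattr(source, key)[mask])
    return new


molecule = ToyMolecule(["N", "H", "CA"], [14.0, 1.0, 12.0])
subset = copy_using_mask(molecule, [0, 2])
print(subset.name, subset.mass)
```

Adding a new attribute to the toy class would require adding its name to one of the two key lists, which is the maintenance cost the note warns about.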

class sasmol.utilities.Element(symbol, name, atomicnumber, molweight)[source]

Bases: object

addsyms(weight, result)[source]
getweight()[source]
class sasmol.utilities.ElementSequence(*seq)[source]

Bases: object

addsyms(weight, result)[source]
append(thing)[source]
displaysyms(sym2elt)[source]
getweight()[source]
set_count(n)[source]
class sasmol.utilities.Tokenizer(input)[source]

Bases: object

error(msg)[source]
gettoken()[source]
sasmol.utilities.build_dict(s)[source]
sasmol.utilities.check_integrity(obj, fast_check=False, warn=True)[source]
sasmol.utilities.duplicate_molecule(molecule, number_of_duplicates)[source]
sasmol.utilities.find_unique(this_list)[source]
sasmol.utilities.get_chemical_formula(formula_string)[source]
sasmol.utilities.parse(s, sym2elt)[source]
sasmol.utilities.parse_fasta(fasta_sequence, **kwargs)[source]

Converts a fasta_sequence object into a list of strings, one for each valid sequence in the initial object.

The format is based on the NCBI FASTA format convention: https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=BlastHelp

Parameters
  • fasta_sequence – list with formatted fasta input

  • kwargs – optional future arguments

Returns

all_sequences: a list containing sequences without comments, spaces, carriage returns, numbers, or termination flags; or an error indicating an empty line in the file

Examples

>>> import sasmol.utilities as utilities
>>> fasta_sequence = open('test_fasta.txt', 'r').readlines()
>>> all_sequences = utilities.parse_fasta(fasta_sequence)
>>> print(all_sequences)

Notes

  1. Lines beginning with > or ; are treated as comments and skipped.

  2. Spaces are ignored.

  3. * characters are ignored (should be a termination).

  4. Numbers are ignored, so you can have numbering at the beginning of a line.

  5. Newline characters (\n) are processed.

  6. Comment lines are NOT required in the input.

  7. Comment lines cause a new sequence to be started.
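
The rules above can be sketched in a few lines of Python. This is an illustrative re-implementation under stated assumptions, not the sasmol source: parse_fasta_sketch is a hypothetical name, and raising ValueError on an empty line is an assumption about how the empty-line error is reported.

```python
def parse_fasta_sketch(lines):
    """Hypothetical sketch of the listed rules; not sasmol's parse_fasta."""
    sequences = []
    current = []  # cleaned pieces of the sequence being collected
    for line in lines:
        stripped = line.strip()  # removes the trailing newline
        if not stripped:
            # Assumption: an empty line is reported by raising an exception.
            raise ValueError("empty line in FASTA input")
        if stripped[0] in ">;":
            # A comment line closes the sequence collected so far, if any.
            if current:
                sequences.append("".join(current))
                current = []
            continue
        # Drop spaces, digits, and the '*' termination flag.
        cleaned = "".join(
            ch for ch in stripped
            if not (ch.isspace() or ch.isdigit() or ch == "*")
        )
        if cleaned:
            current.append(cleaned)
    if current:
        sequences.append("".join(current))
    return sequences


example = [">seq1\n", "1 MKV LAA\n", "GG*\n", ";second\n", "ACD\n"]
print(parse_fasta_sketch(example))
```

Because comment lines only act as separators, input with no comment lines at all yields a single sequence.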

sasmol.utilities.parse_sequence(sym2elt)[source]