gbseq Class Reference

#include <gbseq.h>

Inheritance diagram for gbseq:

gbdnaseq gbprtseq


Public Member Functions

 gbseq ()
virtual ~gbseq ()
virtual bool read (istream &ins)
virtual void writeAce (ostream &ous, ostream &sub, ostream &snp)=0
void writeSpecies (ostream &ous)
int getLength () const
int getSeqlen () const
string getDate () const
void clear ()
string getKey () const
string getLocusName () const
string getSegmentLocus () const
string getType () const
string getMolType () const
string getOrgAcronym () const
string getOrganism () const
bool hasFeature (const string &f, int b, int e) const
bool isSegment () const

Static Public Member Functions

static void init ()
static void loadOrgmap (const string &file)
static void dumpOrgmap (const string &file)

Static Public Attributes

static const int VAL_START = 12

Protected Attributes

string locus
int seqLength
string definition
vector< string > accession
string version [2]
string dbsource
string keywords
int seg
int segtotal
string source
string organism
string taxonomy
vector< Ref * > references
string comment
vector< feature > features
string sequence
string line
string orgacronym

Static Protected Attributes

static map< string, string > orgmap = map<string,string>()
static set< string > orgacronyms = set<string>()
static set< string > taxons = set<string>()


Constructor & Destructor Documentation

gbseq::gbseq (  )  [inline]

References line, and sequence.

virtual gbseq::~gbseq (  )  [inline, virtual]

References references.


Member Function Documentation

bool gbseq::read ( istream &  ins  )  [virtual]

virtual void gbseq::writeAce ( ostream &  ous,
ostream &  sub,
ostream &  snp 
) [pure virtual]

Implemented in gbprtseq, and gbdnaseq.

Referenced by processOneFile().

void gbseq::writeSpecies ( ostream &  ous  ) 

Writes the actual information about a species only when that species is not already in the orgmap.

References features, getKey(), organism, split(), taxonomy, and taxons.

Referenced by gbdnaseq::writeAce(), and gbprtseq::writeAce().

int gbseq::getLength (  )  const

Gets the length information from the LOCUS line. The sequence may have a different length if the original source GB file is corrupted.

References locus, and seqLength.

Referenced by gbdnaseq::read(), gbprtseq::read(), gbdnaseq::writeAce(), gbprtseq::writeAce(), feature::writeAceProtein(), feature::writemRNA(), and feature::writeRNA().
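
As an illustration of reading the length from a LOCUS line, the sketch below tokenises the line and takes the number that precedes the "bp" (nucleotide) or "aa" (protein) unit. It is an assumption about how such parsing could be done, not the code behind getLength(); the function name lengthFromLocus is hypothetical, and it is not known whether the locus member stores the whole line or only the part after the LOCUS keyword, so the sketch accepts both.

```cpp
#include <sstream>
#include <string>

// Returns the length stated on a GenBank LOCUS line, or -1 if none is found.
// Hypothetical helper; works whether or not the leading "LOCUS" keyword is present.
int lengthFromLocus(const std::string& locus) {
    std::istringstream in(locus);
    std::string prev, tok;
    while (in >> tok) {
        bool isUnit = (tok == "bp" || tok == "aa");
        bool prevIsNumber = !prev.empty() &&
            prev.find_first_not_of("0123456789") == std::string::npos;
        if (isUnit && prevIsNumber) {
            return std::stoi(prev);  // the token before the unit is the length
        }
        prev = tok;
    }
    return -1;  // no "<number> bp" or "<number> aa" pair on the line
}
```

A reader could compare this value with the size of the sequence string to detect a corrupted record, which is the check the description alludes to.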

int gbseq::getSeqlen (  )  const [inline]

Returns the actual sequence length. This number is the same as getLength(), because the two values are checked for equality during the reading stage.

References sequence.

string gbseq::getDate (  )  const [inline]

References locus.

Referenced by gbdnaseq::writeAce(), and gbprtseq::writeAce().

void gbseq::clear (  ) 

Reimplemented in gbdnaseq.

References comment, dbsource, features, orgacronym, references, seg, segtotal, seqLength, and sequence.

Referenced by read().

string gbseq::getKey (  )  const [inline]

string gbseq::getLocusName (  )  const [inline]

string gbseq::getSegmentLocus (  )  const

References getLocusName().

Referenced by gbdnaseq::writeAce().

string gbseq::getType (  )  const [inline]

Returns the type of the molecule (DNA, RNA). This information comes from the LOCUS line.

References locus, and trim().

Referenced by gbdnaseq::writeAce(), feature::writeGene(), feature::writemRNA(), and feature::writeRNA().

string gbseq::getMolType (  )  const

The information comes from the first source annotation that covers the whole sequence. Some sequences describe an integration site and will not have a first source feature describing the whole sequence.

Gets the mol_type from the first source feature.

References features.

Referenced by gbdnaseq::writeAce(), feature::writeCDS(), and feature::writeGene().
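
The lookup described for getMolType() might look like the sketch below. The Feature struct is a simplified, hypothetical stand-in for the real feature class, which is not documented on this page, and returning an empty string when the first source feature does not span the whole sequence is an assumption.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified feature record; not the real feature class.
struct Feature {
    std::string key;                                // e.g. "source", "CDS"
    int begin;                                      // 1-based start
    int end;                                        // 1-based inclusive end
    std::map<std::string, std::string> qualifiers;  // e.g. mol_type -> genomic DNA
};

// Returns the mol_type qualifier of the first source feature, provided that
// feature covers positions 1..seqLength; otherwise returns an empty string.
std::string molTypeOf(const std::vector<Feature>& features, int seqLength) {
    for (const Feature& f : features) {
        if (f.key != "source") continue;
        if (f.begin == 1 && f.end == seqLength) {
            auto q = f.qualifiers.find("mol_type");
            if (q != f.qualifiers.end()) return q->second;
        }
        break;  // only the first source feature is consulted
    }
    return "";
}
```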

string gbseq::getOrgAcronym (  )  const

First looks into the cached member orgacronym, then into the orgmap. If both fail, makes an acronym, starting with 2 letters; if that acronym is not unique, makes one with more letters.

This method should rely on externally defined acronyms.

References length, orgacronym, orgacronyms, organism, orgmap, and split().

Referenced by feature::outgeneline(), feature::subCDS(), feature::writeAceProtein(), feature::writeGene(), and feature::writeProtein().
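
The fallback described for getOrgAcronym() (try two letters, then add letters until the acronym is unique) can be sketched like this. The exact letter-picking rule is not documented, so the sketch assumes the first letter of the genus followed by letters of the species epithet. The names acronymFor, knownOrgs and usedAcronyms are hypothetical stand-ins for the method and the static orgmap and orgacronyms members; the per-object cache is omitted.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical stand-ins for gbseq::orgmap and gbseq::orgacronyms.
static std::map<std::string, std::string> knownOrgs;
static std::set<std::string> usedAcronyms;

// Returns a unique acronym for an organism name such as "Homo sapiens".
std::string acronymFor(const std::string& organism) {
    // 1. Use the table entry if the organism is already known.
    auto hit = knownOrgs.find(organism);
    if (hit != knownOrgs.end()) return hit->second;

    // 2. Build a pool of letters: genus initial plus the species epithet
    //    (assumed rule), or the genus alone for one-word names.
    std::istringstream in(organism);
    std::string genus, species;
    in >> genus >> species;
    if (genus.empty()) return "";
    std::string pool = species.empty() ? genus : genus.substr(0, 1) + species;

    // 3. Try 2 letters, then 3, ... until the candidate is unused.
    for (std::size_t n = 2; n <= pool.size(); ++n) {
        std::string cand = pool.substr(0, n);
        if (usedAcronyms.insert(cand).second) {
            knownOrgs[organism] = cand;
            return cand;
        }
    }
    // 4. Last resort: append a counter to the full pool.
    for (int i = 2;; ++i) {
        std::string cand = pool + std::to_string(i);
        if (usedAcronyms.insert(cand).second) {
            knownOrgs[organism] = cand;
            return cand;
        }
    }
}
```

Under these assumptions "Homo sapiens" receives "Hs", and a later "Hylobates syndactylus" receives "Hsy" because "Hs" is already taken.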

string gbseq::getOrganism (  )  const [inline]

References organism.

Referenced by feature::writeProtein().

bool gbseq::hasFeature ( const string &  f,
int  b,
int  e 
) const

References features.

Referenced by feature::writeRNA().

bool gbseq::isSegment (  )  const [inline]

References seg.

Referenced by gbdnaseq::writeAce().

void gbseq::init (  )  [static]

Initializes a few commonly used organism maps.

References orgacronyms, and orgmap.

Referenced by main().

void gbseq::loadOrgmap ( const string &  file  )  [static]

Loads orgmap and orgacronyms.

References ifstream(), line, orgacronyms, orgmap, and split().

Referenced by main().

void gbseq::dumpOrgmap ( const string &  file  )  [static]

Writes orgmap to file.

References orgmap.

Referenced by main().


Member Data Documentation

const int gbseq::VAL_START = 12 [static]

string gbseq::locus [protected]

int gbseq::seqLength [mutable, protected]

Referenced by clear(), and getLength().

string gbseq::definition [protected]

vector<string> gbseq::accession [protected]

string gbseq::version[2] [protected]

string gbseq::dbsource [protected]

Referenced by clear(), and read().

string gbseq::keywords [protected]

int gbseq::seg [protected]

Referenced by clear(), isSegment(), and read().

int gbseq::segtotal [protected]

Referenced by clear(), and read().

string gbseq::source [protected]

Referenced by read(), and gbdnaseq::writeAce().

string gbseq::organism [protected]

string gbseq::taxonomy [protected]

Referenced by read(), and writeSpecies().

vector<Ref*> gbseq::references [protected]

string gbseq::comment [protected]

vector<feature> gbseq::features [protected]

string gbseq::sequence [protected]

string gbseq::line [protected]

string gbseq::orgacronym [mutable, protected]

Referenced by clear(), and getOrgAcronym().

map< string, string > gbseq::orgmap = map<string,string>() [static, protected]

Organism info is an external resource, and the keys made should be unique within the scope. For example, across all vertebrates the acronym should be unique, so that when we name genes we can distinguish the same gene from different species. This table should be read and dumped before each parsing, except for single-organism genome annotations.

Referenced by dumpOrgmap(), getOrgAcronym(), init(), and loadOrgmap().

set< string > gbseq::orgacronyms = set<string>() [static, protected]

Referenced by getOrgAcronym(), init(), and loadOrgmap().

set< string > gbseq::taxons = set<string>() [static, protected]

Referenced by writeSpecies().


The documentation for this class was generated from the following files:

Generated on Wed Aug 10 11:57:10 2011 for Softwares from Orpara by  doxygen 1.5.6