mRNAModel Class Reference

#include <RNAModel.h>

Inheritance diagram for mRNAModel (image not reproduced). Classes shown: RNAModel, Noschain, Range, ESTAssembly, JGIModel, mRNAModelUpdate, ESTAssemblyid.

Public Member Functions

 mRNAModel ()
 mRNAModel (const Noschain &seg, const string &chrom, const string &gs)
 mRNAModel (const Noschain &seg, int gcdsb_, int gcdse_, const string &chrom, const string &gs, int fr=0)
 mRNAModel (const string &exstr, const string &gname, const string &gs, int oi, int gcb, int gce, int fr=0)
 mRNAModel (const string &exstr, const string &gname, const string &gs, int oi, int gcb, int gce, int gene_id, int fr=0)
 mRNAModel (const string &exstarts, const string &exends, char strand, int gcb, int gce, const string &gi, const string &genomic) throw (PointOutChain, Badinput, exception)
 mRNAModel (const string &exstarts, const string &exends, char strand, int gcb, int gce, const string &gi, const string &genomic, int oo) throw (PointOutChain, Badinput)
 mRNAModel (const mRNAModel &mm)
char guessStrand () const
mRNAModel & operator= (const mRNAModel &mm)
bool samePeptide (const mRNAModel &mm) const
bool sameGene (const mRNAModel &mod) const
Noschain FivePrimeUTR () const
Noschain ThreePrimeUTR () const
int num5NoncodingExons () const
int num3NoncodingExons () const
Noschain CDSChain () const throw (PointOutChain)
Range CDSRange () const
pair< int, int > genomicCDSBound () const
Range RNACDSRange () const
pair< int, int > RNACDSBound () const
int CDSLength () const
int genomicCDSLength () const
int genomicCDSEnd () const
int genomicCDSBegin () const
const string & proteinSequence () const
int proteinLength () const
int proteinLengthNoTail () const
int getFrame () const
string CDSSeq () const
string CDSSequence () const
char CDSDirection () const
double CDSFraction () const
double CDSFractionRNA () const
double CDSFractionGenomic () const
int FivePrimeUTRLength () const
int ThreePrimeUTRLength () const
int UTRLength () const
void UTR3Sequence (string &seq) const
string UTR3Sequence () const
bool hasStart () const
bool hasStop () const
bool complete () const
int numberOfInternalStops () const
const string & getProtein () const
void trimAfterPoint (const int p) throw (PointOutChain)
void trimBeforePoint (const int p) throw (PointOutChain)
void trimBeforePoint (const int gp, const int rp) throw (PointOutChain)
void setProtein (const string &pseq)
void setRNACDS (int bb, int ee) throw (OutsideGenomicSequence)
void resetRNACDS (int bb, int ee) throw (OutsideGenomicSequence)
void setGenomicCDS (int gb, int ge) throw (OutsideGenomicSequence, InvalidModel)
void setGenomicCDS (pair< int, int > gcr)
bool growCDS3Prime (int len) throw (OutsideGenomicSequence)
bool trimCDSTail ()
bool trimCDSStop ()
void reset ()
void resetProtein ()
void setLongestCDSAndProtein ()
bool append (mRNAModel &mod, int &comment)
string JGIModelRow (const char sep='\t') const
ostream & printJGIModelRow (ostream &ous, const char sep='\t') const
string JGITranscriptRow (const char sep='\t') const
ostream & printJGITranscriptRow (ostream &ous, const char sep='\t') const
ostream & printJGITranscriptRowNoId (ostream &ous, const char sep='\t') const
string JGIProteinRow (char sep='\t') const
string sfCDSGenomic () const
Noschain CDSOnewayChain () const
void CDSOnewayChain (Noschain &ch) const
string sfCDSTranscript () const
string sfExonsProtein () const
ostream & show (ostream &ous) const
bool genuine () const
bool semiGenuine () const
bool isStar () const
modeltype objtype () const
bool valid () const
ostream & print (ostream &ous) const
mRNAModel & reverse (int newRNACDSB, int newRNACDSE)
string name (const string &prefix)
void writetab (ostream &mod, ostream &ex, ostream &track, ostream &orna, ostream &oprt, char sep='\t') const
ostream & writeModelTable (ostream &ous, char sep='\t') const

Static Public Member Functions

static void setShortestModel (int len)

Static Public Attributes

static const char modelheader [] = "modid\tgeneid\tgenomicId\tbegin\tend\tgenomicCDSb\tgenomicCDSe\tnumberOfExons\tsumexonLength\texonstring\tCDSb\tCDSe\tRNAseq\tpepSeq\tframe"
static const char jgiModelCol [] = "id\tchrom\tstrand\tstart\tend\tcdsStart\tcdsEnd\tsfCount\tsfStarts\tsfEnds"
static const char jgiTranscriptCol [] = "transcriptId\tlengthGenomic\tlengthTranscript\tlengthCDS\tsfExonsGenomic\tsfCDSGenomic\tsfExonsTranscript\tsfCDSTranscript\tseqGenomic\tseqTranscript\tseqCDS"
static const char jgiProteinCol [] = "proteinId\ttranscriptId\tlengtht\tsfExons\tseq"
static int shortestpep = 30
static int shortestmodel = 90
static int utrlen5max = 2000
static int utrlen3max = 1900

Protected Attributes

int cdsb
int cdse
int gcdsb
int gcdse
string pep
int frame

Static Protected Attributes

static char header [350] = ""


Detailed Description

Models an mRNA on a genomic sequence. This class adds CDS information to RNAModel. If CDS information is provided, the class uses it; otherwise it computes the CDS from the spliced mRNA product. Partial mRNA models are allowed.

Constructor & Destructor Documentation

mRNAModel::mRNAModel (  )  [inline]

mRNAModel::mRNAModel ( const Noschain &  seg,
const string &  chrom,
const string &  gs 
) [explicit]

Guesses the CDS range as the longest ORF. The result is more or less an EST model; this constructor exists mainly to support ESTModel. The frame is set to 0. The chosen ORF may not be the correct one. There are about 10 such models in one fungal genome.

References RNAModel::reset(), and setLongestCDSAndProtein().

mRNAModel::mRNAModel ( const Noschain &  seg,
int  gcdsb_,
int  gcdse_,
const string &  chrom,
const string &  gs,
int  fr = 0 
) [explicit]

Parameters:
gcdsb_ Genomic CDS start position.
gcdse_ Genomic CDS end position (last base of the stop codon).
fr CDS frame: 0, 1, or 2.
gs Genomic sequence.
chrom Genomic identifier. This is mainly used by the breakup method in the ESTAssembly class.
This constructor does not compute the frame from gcdsb_ and gcdse_; the supplied frame is considered final.

References cdsb, cdse, frame, gcdsb, gcdse, pep, RNAModel::rna, RNAModel::RNAIndex(), Noschain::show(), and translate().

mRNAModel::mRNAModel ( const string &  exstr,
const string &  gname,
const string &  gs,
int  oi,
int  gcb,
int  gce,
int  fr = 0 
)

Used to read models (stored objects) from files. All model information is given: from the genomic CDS range, the RNA CDS range is computed, and the peptide sequence is computed as well. The frame defaults to 0, which is useful for JGI models, where the frame is not recorded.

References cdsb, cdse, frame, gcdsb, gcdse, pep, RNAModel::rna, RNAModel::RNAIndex(), and translate().

mRNAModel::mRNAModel ( const string &  exstr,
const string &  gname,
const string &  gs,
int  oi,
int  gcb,
int  gce,
int  gene_id,
int  fr = 0 
)

mRNAModel::mRNAModel ( const string &  exstarts,
const string &  exends,
char  strand,
int  gcb,
int  gce,
const string &  gi,
const string &  genomic 
) throw (PointOutChain, Badinput, exception)

This is the JGI input format, where strand indicates the direction and all coordinates are listed from small to large.

Parameters:
gi Genomic identifier, such as scaffold_1.
genomic Genomic sequence.

References cdsb, cdse, gcdsb, gcdse, guessStrand(), pep, RNAModel::reverse(), RNAModel::rna, RNAModel::RNAIndex(), and translate().
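
The JGI-style input described above can be sketched as follows. This is only an illustration under stated assumptions: the exstarts and exends strings are taken to be comma-separated integers (the separator is not stated in this reference), and a minus-strand exon is represented with begin greater than end. splitInts and parseJGIExons are hypothetical helpers, not members of mRNAModel.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: split a comma-separated list of integers.
// The comma separator is an assumption, not taken from the class.
std::vector<int> splitInts(const std::string &s) {
    std::vector<int> out;
    std::istringstream in(s);
    std::string tok;
    while (std::getline(in, tok, ',')) {
        if (!tok.empty()) out.push_back(std::stoi(tok));
    }
    return out;
}

// Hypothetical helper: build exon (begin, end) pairs in transcript order.
// Input coordinates run from small to large regardless of strand.
// For '-', the exon order is reversed and each pair is flipped so that
// begin > end (assumed convention for a minus-strand range).
std::vector<std::pair<int, int> > parseJGIExons(const std::string &exstarts,
                                                const std::string &exends,
                                                char strand) {
    std::vector<int> b = splitInts(exstarts);
    std::vector<int> e = splitInts(exends);
    if (b.size() != e.size())
        throw std::invalid_argument("exon start/end count mismatch");
    std::vector<std::pair<int, int> > exons;
    for (std::size_t i = 0; i < b.size(); ++i)
        exons.push_back(std::make_pair(b[i], e[i]));
    if (strand == '-') {
        std::reverse(exons.begin(), exons.end());
        for (std::size_t i = 0; i < exons.size(); ++i)
            std::swap(exons[i].first, exons[i].second);
    }
    return exons;
}
```

With starts "100,300" and ends "200,400", the plus strand yields (100,200),(300,400), while the minus strand yields (400,300),(200,100).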

mRNAModel::mRNAModel ( const string &  exstarts,
const string &  exends,
char  strand,
int  gcb,
int  gce,
const string &  gi,
const string &  genomic,
int  oo 
) throw (PointOutChain, Badinput)

Same input format, with an extra oid assigned from an external key.

References cdsb, cdse, gcdsb, gcdse, guessStrand(), pep, RNAModel::reverse(), RNAModel::rna, RNAModel::RNAIndex(), and translate().

mRNAModel::mRNAModel ( const mRNAModel &  mm  )  [inline]

copy constructor of the same type


Member Function Documentation

char mRNAModel::guessStrand (  )  const

Uses the intron boundaries to determine the strand of the model when the strand is not given or is given in an invalid format (anything other than + or -). The strand of single-exon models cannot be guessed.

Returns:
+, -, or ?. This is a helper function for the constructor.

References Noschain::exons, RNAModel::gseq, Noschain::numberOfRanges(), and str2upper().

Referenced by mRNAModel().
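
One way to guess a strand from intron boundaries is sketched below. It assumes the canonical GT...AG splice signal, which appears as CT...AC on the plus-strand sequence when the gene lies on the minus strand; the rules used by the actual guessStrand() are not shown in this reference. guessStrandSketch is a hypothetical free function, and exons are assumed to be 1-based inclusive (begin, end) pairs in ascending genomic order.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: upper-case copy of a short string.
static std::string upperCopy(std::string s) {
    for (std::size_t i = 0; i < s.size(); ++i)
        s[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[i])));
    return s;
}

// Hypothetical sketch: each intron votes for '+' (GT...AG) or '-' (CT...AC).
// Returns '?' when there is no intron or the votes are tied.
char guessStrandSketch(const std::string &genomic,
                       const std::vector<std::pair<int, int> > &exons) {
    int plus = 0, minus = 0;
    for (std::size_t i = 0; i + 1 < exons.size(); ++i) {
        int intronBegin = exons[i].second + 1;   // first intron base, 1-based
        int intronEnd = exons[i + 1].first - 1;  // last intron base, 1-based
        if (intronBegin < 1 || intronEnd > static_cast<int>(genomic.size()) ||
            intronEnd - intronBegin + 1 < 4)
            continue;                            // too short or out of range
        std::string left = upperCopy(genomic.substr(intronBegin - 1, 2));
        std::string right = upperCopy(genomic.substr(intronEnd - 2, 2));
        if (left == "GT" && right == "AG") ++plus;
        else if (left == "CT" && right == "AC") ++minus;
    }
    if (plus > minus) return '+';
    if (minus > plus) return '-';
    return '?';
}
```

A single-exon model has no intron to inspect, so the sketch returns '?', which matches the limitation stated above.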

mRNAModel & mRNAModel::operator= ( const mRNAModel &  mm  ) 

bool mRNAModel::samePeptide ( const mRNAModel &  mm  )  const [inline]

Does not overload the base class operator==.

References pep.

bool mRNAModel::sameGene ( const mRNAModel &  mod  )  const

Noschain mRNAModel::FivePrimeUTR (  )  const [inline]

Noschain mRNAModel::ThreePrimeUTR (  )  const [inline]

int mRNAModel::num5NoncodingExons (  )  const [inline]

int mRNAModel::num3NoncodingExons (  )  const [inline]

Noschain mRNAModel::CDSChain (  )  const throw (PointOutChain)

Range mRNAModel::CDSRange (  )  const [inline]

The genomic Range of the CDS, as opposed to RNACDSRange(), which is not very useful.

References gcdsb, gcdse, and Range::Range().

Referenced by append(), incompatible(), updateCompatible(), and JGIModel::valid().

pair<int,int> mRNAModel::genomicCDSBound (  )  const [inline]

References gcdsb, and gcdse.

Range mRNAModel::RNACDSRange (  )  const [inline]

References cdsb, cdse, and Range::Range().

Referenced by ESTAssemblyid::breakup(), and readESTModel().

pair<int,int> mRNAModel::RNACDSBound (  )  const [inline]

References cdsb, and cdse.

int mRNAModel::CDSLength (  )  const [inline]

int mRNAModel::genomicCDSLength (  )  const [inline]

References gcdsb, and gcdse.

Referenced by append(), and sameGene().

int mRNAModel::genomicCDSEnd (  )  const [inline]

References gcdse.

Referenced by checkstop().

int mRNAModel::genomicCDSBegin (  )  const [inline]

References gcdsb.

const string& mRNAModel::proteinSequence (  )  const [inline]

References pep.

Referenced by checkBadStopIndex(), and writeResult().

int mRNAModel::proteinLength (  )  const [inline]

Returns a protein length one greater than the known amino acid sequence if the CDS is partial at the 3' end. For example, if only 2 of the 3 bases of the last codon are known, the protein will be ...GADETAL2. To exclude the partial tail, use proteinLengthNoTail().

References pep.

Referenced by checkBadStopIndex(), incompatible(), and updateCompatible().

int mRNAModel::proteinLengthNoTail (  )  const

not including the partial AA if present

References pep.

Referenced by checkBadStopIndex().
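
The difference between the two length functions can be sketched on a plain string. This assumes, based only on the ...GADETAL2 example above, that a partial last codon is stored as a single trailing digit in the peptide; the function names are hypothetical stand-ins for the members.

```cpp
#include <cctype>
#include <string>

// Hypothetical stand-in for proteinLength(): counts every stored character,
// including a trailing partial-codon marker if there is one.
int proteinLengthSketch(const std::string &pep) {
    return static_cast<int>(pep.size());
}

// Hypothetical stand-in for proteinLengthNoTail(): drops one trailing digit,
// assumed here to be the partial-codon marker.
int proteinLengthNoTailSketch(const std::string &pep) {
    if (!pep.empty() && std::isdigit(static_cast<unsigned char>(pep[pep.size() - 1])))
        return static_cast<int>(pep.size()) - 1;
    return static_cast<int>(pep.size());
}
```

For "GADETAL2" the first function gives 8 and the second gives 7; for a peptide without a partial tail both agree.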

int mRNAModel::getFrame (  )  const [inline]

References frame.

Referenced by readESTModel().

string mRNAModel::CDSSeq (  )  const [inline]

shorter name for CDSSequence()

References cdsb, cdse, and RNAModel::rna.

Referenced by JGIModel::valid().

string mRNAModel::CDSSequence (  )  const

Returns the underlying CDS ORF. Performs error checking intended for the debugging stage.

References cdsb, cdse, RNAModel::rna, show(), and string().

Referenced by checkstop(), JGITranscriptRow(), printJGITranscriptRow(), printJGITranscriptRowNoId(), trimCDSStop(), and trimCDSTail().

char mRNAModel::CDSDirection (  )  const [inline]

References gcdsb, and gcdse.

Referenced by JGIModel::valid().

double mRNAModel::CDSFraction (  )  const [inline]

Fraction of the CDS in the genomic sequence. See also CDSFractionRNA() and CDSFractionGenomic().

References CDSLength(), and Range::length().

Referenced by isStar().

double mRNAModel::CDSFractionRNA (  )  const [inline]

Fraction of the CDS in the RNA. This is useful in judging the quality of a model.

References CDSLength(), and RNAModel::exonLength().

Referenced by append(), genuine(), ESTAssembly::isChimera(), and updatedWorse().

double mRNAModel::CDSFractionGenomic (  )  const [inline]

Fraction of the CDS relative to the genomic length. Not very useful because it depends on the average size of introns.

References CDSLength(), and Range::length().

int mRNAModel::FivePrimeUTRLength (  )  const [inline]

Non-coding length excluding introns. This is the part of the RNA before the CDS begins.

References cdsb.

Referenced by genuine(), ESTAssembly::isChimera(), isStar(), ESTAssembly::prune5PrimeUTR(), semiGenuine(), ESTAssembly::shouldBreakPrefix(), updatedWorse(), and UTRLength().

int mRNAModel::ThreePrimeUTRLength (  )  const [inline]

int mRNAModel::UTRLength (  )  const [inline]

void mRNAModel::UTR3Sequence ( string &  seq  )  const [inline]

set seq to the 3'-UTR sequence of the mature mRNA

References cdse, and RNAModel::rna.

string mRNAModel::UTR3Sequence (  )  const [inline]

References cdse, and RNAModel::rna.

bool mRNAModel::hasStart (  )  const [inline]

CDS divided by the genomic Ranges

References pep.

Referenced by complete(), ESTAssembly::shouldBreakPrefix(), and JGIModel::toSQLString().

bool mRNAModel::hasStop (  )  const [inline]

bool mRNAModel::complete (  )  const [inline]

the ORF is complete

References hasStart(), and hasStop().

Referenced by append(), checkBadStopIndex(), genuine(), isStar(), and semiGenuine().

int mRNAModel::numberOfInternalStops (  )  const

Compares the translated product, which is 3x faster than comparing RNA sequences. However, the models may have different exon structures, so both the exon structure and the translation product have to be compared; the translation product alone is not sufficient. Should not overload! Counts the * characters inside the protein sequence.

References pep.

Referenced by append(), readJGIModel(), updatedWorse(), and JGIModel::valid().
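
Counting internal stops can be sketched as below. The sketch assumes that a * in the last position is the normal terminal stop and is therefore not counted; whether the real method treats the last position this way is not stated here. countInternalStops is a hypothetical free function.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch: count '*' characters that are not the final character.
int countInternalStops(const std::string &pep) {
    int n = 0;
    for (std::size_t i = 0; i + 1 < pep.size(); ++i) {
        if (pep[i] == '*') ++n;
    }
    return n;
}
```

Under this assumption "MKT*" has no internal stop, while "MK*T*" has one.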

const string& mRNAModel::getProtein (  )  const [inline]

reader function

References pep.

Referenced by bestORF(), readESTModel(), and ESTAssembly::write().

void mRNAModel::trimAfterPoint ( const int  p  )  throw (PointOutChain)

This should not be defined; it is defined here to check exceptions. After the debug stage this function should be removed. RNAModel::trimAfterPoint() is sufficient.

Reimplemented from RNAModel.

References cdse, RNAModel::RNAIndex(), and RNAModel::trimAfterPoint().

Referenced by ESTAssembly::budMinusSuffixModel(), ESTAssembly::budPlusSuffixModel(), and ESTAssembly::prune3PrimeUTR().

void mRNAModel::trimBeforePoint ( const int  p  )  throw (PointOutChain)

void mRNAModel::trimBeforePoint ( const int  gp,
const int  rp 
) throw (PointOutChain)

void mRNAModel::setProtein ( const string &  pseq  )  [inline]

void mRNAModel::setRNACDS ( int  bb,
int  ee 
) throw (OutsideGenomicSequence)

Sets the RNA CDS range and the genomic CDS range. Does not set the peptide sequence; the caller must call resetProtein() to update it. This operation only changes the bounds and cannot reverse the direction of the CDS. The frame is derived from bb.

References cdsb, cdse, frame, gcdsb, gcdse, RNAModel::genomicIndex(), itos(), RNAModel::rna, and OutsideGenomicSequence::what().

Referenced by resetRNACDS().
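
Setting the genomic CDS bounds from RNA positions requires mapping an RNA index to a genomic index through the exon chain. The sketch below shows one way to do that mapping. It is an assumption-laden illustration: exons are (begin, end) pairs in transcript order with 1-based inclusive coordinates, begin greater than end marks the minus strand, and 0 is returned for a position outside the transcript, where the real class throws OutsideGenomicSequence. genomicIndexSketch is a hypothetical name.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch: map a 1-based RNA position to a genomic position.
int genomicIndexSketch(const std::vector<std::pair<int, int> > &exons, int rnaPos) {
    if (rnaPos < 1) return 0;
    int remaining = rnaPos;  // bases still to walk along the transcript
    for (std::size_t i = 0; i < exons.size(); ++i) {
        int step = exons[i].first <= exons[i].second ? 1 : -1;
        int len = (exons[i].second - exons[i].first) * step + 1;
        if (remaining <= len)
            return exons[i].first + step * (remaining - 1);
        remaining -= len;    // skip this exon and the intron after it
    }
    return 0;                // position lies beyond the last exon
}
```

For exons (101,110) and (201,220), RNA position 10 maps to 110 and position 11 jumps across the intron to 201.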

void mRNAModel::resetRNACDS ( int  bb,
int  ee 
) throw (OutsideGenomicSequence) [inline]

Only changes the range of the RNA CDS. bb must be less than ee; otherwise, it will crash.

References resetProtein(), and setRNACDS().

Referenced by ESTAssemblyid::breakup(), and reverse().

void mRNAModel::setGenomicCDS ( int  gb,
int  ge 
) throw (OutsideGenomicSequence, InvalidModel)

Sets the genomic CDS range, then sets the RNA CDS range. Does not reset the peptide sequence. This is to make the operation more automatic.

References Range::begin(), cdsb, cdse, Range::direction(), Range::end(), gcdsb, gcdse, RNAModel::gseq, RNAModel::RNAIndex(), and show().

Referenced by setGenomicCDS().

void mRNAModel::setGenomicCDS ( pair< int, int >  gcr  )  [inline]

References cdsb, cdse, and setGenomicCDS().

bool mRNAModel::growCDS3Prime ( int  len  )  throw (OutsideGenomicSequence)

This method is more efficient than the set*CDS() methods since it only deals with one end. len is assumed to be > 0. If the CDS grows outside the genomic sequence, an exception is thrown.

Returns:
true on success, false if running out of genomic sequence. Use this method only for short extensions and not in a loop; it is a very expensive operation.

References cdse, delta(), Range::direction(), Range::end(), gcdse, RNAModel::genomicIndex(), RNAModel::genomicLength(), Noschain::growEnd(), itos(), resetProtein(), RNAModel::resetRNA(), RNAModel::rna, RNAModel::RNALength(), and OutsideGenomicSequence::what().

Referenced by checkBadStopIndex().

bool mRNAModel::trimCDSTail (  ) 

Returns true on success, false on failure. When the protein has the form AA*L, use this method to remove the amino acids after the stop codon.

References cdse, CDSSequence(), delta(), frame, gcdse, RNAModel::genomicIndex(), pep, resetProtein(), RNAModel::resetRNA(), and subseqIsStop().

Referenced by checkBadStopIndex().

bool mRNAModel::trimCDSStop (  ) 

Example: AKPNE***. fgenesh1_kg makes such models!

References cdse, CDSSequence(), gcdse, RNAModel::genomicIndex(), resetProtein(), RNAModel::resetRNA(), and subseqIsStop().

Referenced by checkBadStopIndex().
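
The effect of trimCDSTail() and trimCDSStop() on the peptide can be sketched with a plain string. The real methods also adjust cdse and gcdse and regenerate the protein; this hypothetical helper only cuts the string after the first * and reports how many characters were removed, on the assumption that each removed character corresponds to one codon (3 bases) of CDS.

```cpp
#include <string>

// Hypothetical sketch: keep everything up to and including the first '*'.
// Returns the number of characters removed from the end of pep.
int trimAfterFirstStop(std::string &pep) {
    std::string::size_type stop = pep.find('*');
    if (stop == std::string::npos || stop + 1 == pep.size())
        return 0;  // no stop at all, or the stop is already the last character
    int removed = static_cast<int>(pep.size() - stop - 1);
    pep.erase(stop + 1);
    return removed;
}
```

"MKAA*L" becomes "MKAA*" with one character removed, and "AKPNE***" becomes "AKPNE*" with two removed.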

void mRNAModel::reset (  )  [virtual]

Resets the RNA sequence and recomputes the peptide sequence. This is an expensive operation.

Reimplemented from RNAModel.

References cdsb, cdse, frame, gcdsb, gcdse, RNAModel::genomicIndex(), longestORFPlus(), Noschain::numberOfRanges(), pep, RNAModel::resetRNA(), Noschain::reverse(), reverseComplement(), RNAModel::rna, and Noschain::show().

Referenced by ESTAssembly::fixIntronBound().

void mRNAModel::resetProtein (  )  [inline]

Simply regenerates the protein sequence from this object's RNA sequence according to the [cdsb+frame, cdse] information. Does not find the optimal ORF.

References cdsb, cdse, frame, pep, RNAModel::rna, and translate().

Referenced by growCDS3Prime(), resetRNACDS(), trimCDSStop(), and trimCDSTail().
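
Translating the window [cdsb+frame, cdse] can be sketched as below. The sketch assumes 1-based inclusive cdsb and cdse, an upper-case sequence, and the standard genetic code; leftover bases that do not fill a codon are ignored here, whereas the class appears to mark a partial last codon in the peptide. translateCodon and translateCDS are hypothetical names and do not reproduce the library's translate().

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: translate one codon starting at 0-based pos using the
// standard genetic code laid out in T, C, A, G base order.
char translateCodon(const std::string &rna, std::size_t pos) {
    static const std::string bases = "TCAG";
    static const std::string aas =
        "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";
    int idx = 0;
    for (int k = 0; k < 3; ++k) {
        char c = rna[pos + k];
        if (c == 'U') c = 'T';                   // accept RNA or DNA letters
        std::string::size_type b = bases.find(c);
        if (b == std::string::npos) return 'X';  // ambiguous or unknown base
        idx = idx * 4 + static_cast<int>(b);
    }
    return aas[idx];
}

// Hypothetical sketch of the window translation described above.
std::string translateCDS(const std::string &rna, int cdsb, int cdse, int frame) {
    std::string pep;
    if (cdsb < 1 || cdse > static_cast<int>(rna.size()) || frame < 0) return pep;
    // Start at 0-based index cdsb-1+frame; cdse is the exclusive 0-based end.
    for (int i = cdsb - 1 + frame; i + 3 <= cdse; i += 3)
        pep += translateCodon(rna, static_cast<std::size_t>(i));
    return pep;
}
```

With the sequence CATGGCTTAA, cdsb 1, cdse 10 and frame 1, translation starts at the second base and yields MA*.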

void mRNAModel::setLongestCDSAndProtein (  ) 

Finds the longest ORF; for a single-exon model, the reverse strand is searched as well. Then sets the CDS range and the protein.

Assumes the RNA has been reset.

References cdsb, cdse, frame, gcdsb, gcdse, RNAModel::genomicIndex(), longestORFPlus(), Noschain::numberOfRanges(), pep, Noschain::reverse(), reverseComplement(), RNAModel::rna, and Noschain::show().

Referenced by append(), mRNAModel(), and readJGIModel().
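
A forward-strand longest-ORF search can be sketched as follows. This is a simplified illustration, not the library's longestORFPlus(): it only considers complete ORFs that start with ATG and end at the first in-frame stop, it expects upper-case DNA letters, and it does not search the reverse strand or handle partial ORFs. The function names are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: is the codon at 0-based position i a stop codon?
bool isStopCodon(const std::string &s, std::size_t i) {
    return s.compare(i, 3, "TAA") == 0 || s.compare(i, 3, "TAG") == 0 ||
           s.compare(i, 3, "TGA") == 0;
}

// Hypothetical sketch: returns (0-based begin, length in bases including the
// stop codon) of the longest complete ORF, or (0, 0) if none is found.
std::pair<std::size_t, std::size_t> longestORFSketch(const std::string &rna) {
    std::pair<std::size_t, std::size_t> best(0, 0);
    for (std::size_t start = 0; start + 3 <= rna.size(); ++start) {
        if (rna.compare(start, 3, "ATG") != 0) continue;
        for (std::size_t i = start + 3; i + 3 <= rna.size(); i += 3) {
            if (isStopCodon(rna, i)) {
                std::size_t len = i + 3 - start;
                if (len > best.second) best = std::make_pair(start, len);
                break;  // an ORF ends at its first in-frame stop
            }
        }
    }
    return best;
}
```

Scanning every ATG covers all three frames, because the start position itself fixes the frame of the codons that follow.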

bool mRNAModel::append ( mRNAModel &  mod,
int &  comment 
)

mRNAModel version: uses RNAModel::append(), then resets the protein.

Parameters:
comment 0 for regular, 1 for frame-shift
Returns false if the resulting model has a shorter CDS than the parent model and the parent model has an internal stop. Some aggressive algorithms could build models over fake introns.

References RNAModel::append(), CDSFractionRNA(), CDSLength(), CDSRange(), complete(), RNAModel::geneId(), genomicCDSLength(), Range::length(), max, numberOfInternalStops(), Noschain::numberOfRanges(), Range::overlap(), Range::sameDirection(), RNAModel::setGeneId(), setLongestCDSAndProtein(), and show().

Referenced by mRNAModelUpdate::append().

string mRNAModel::JGIModelRow ( const char  sep = '\t'  )  const

Outputs one row of the JGI Model schema data dump, excluding the useless columns. The columns written are (id, chrom, strand, start, end, cdsStart, cdsEnd, sfCount, sfStarts, sfEnds). Note that the name column will be produced by the derived class, since it is usually class-specific.

References Range::begin(), Range::direction(), Range::end(), gcdsb, gcdse, RNAModel::getOid(), RNAModel::gid, Noschain::numberOfRanges(), and Noschain::startEnd().

Referenced by ESTAssemblyid::writetab(), ESTAssembly::writetab(), mRNAModelUpdate::writetab(), and writetab().

ostream & mRNAModel::printJGIModelRow ( ostream &  ous,
const char  sep = '\t' 
) const

string mRNAModel::JGITranscriptRow ( const char  sep = '\t'  )  const

NP = not produced. Full schema: transcriptId, locusId (NP), name (NP), description (NP), status (NP), type (NP), lengthGenomic, lengthTranscript, lengthCDS, sfExonsGenomic, sfCDSGenomic, sfExonsTranscript, sfCDSTranscript, seqGenomic, seqTranscript, seqCDS, annotatorId (NP), annotatable (NP), creationDate (NP). transcriptId will be the same as the model id. This is the worst creation (crime) by any programmer!

name and description are deferred to derived classes. The columns written are (transcriptId, lengthGenomic, lengthTranscript, lengthCDS, sfExonsGenomic, sfCDSGenomic, sfExonsTranscript, sfCDSTranscript, seqGenomic, seqTranscript, seqCDS).

References CDSLength(), CDSSequence(), RNAModel::getOid(), Range::length(), RNAModel::rna, RNAModel::RNASequence(), RNAModel::seqGenomic(), sfCDSGenomic(), sfCDSTranscript(), Noschain::sfExonsGenomic(), and Noschain::sfExonsTranscript().

Referenced by ESTAssemblyid::writetab(), ESTAssembly::writetab(), mRNAModelUpdate::writetab(), and writetab().

ostream & mRNAModel::printJGITranscriptRow ( ostream &  ous,
const char  sep = '\t' 
) const

ostream & mRNAModel::printJGITranscriptRowNoId ( ostream &  ous,
const char  sep = '\t' 
) const

string mRNAModel::JGIProteinRow ( char  sep = '\t'  )  const

proteinId, transcriptId, name (NP), description (NP), length, sfExons, seq, annotatable (NP)

References RNAModel::getOid(), pep, and sfExonsProtein().

Referenced by ESTAssemblyid::writetab(), ESTAssembly::writetab(), mRNAModelUpdate::writetab(), and writetab().

string mRNAModel::sfCDSGenomic (  )  const

Noschain mRNAModel::CDSOnewayChain (  )  const

one way chain in transcript coordinate system

References cdsb, CDSChain(), Noschain::exonOnlyChain(), and Noschain::onewayChain().

void mRNAModel::CDSOnewayChain ( Noschain &  ch  )  const

string mRNAModel::sfCDSTranscript (  )  const [inline]

string mRNAModel::sfExonsProtein (  )  const

ostream & mRNAModel::show ( ostream &  ous  )  const [virtual]

bool mRNAModel::genuine (  )  const [inline]

A reasonable full-length model, with no restriction on the number of UTR exons. This will promote models that are likely to have sequence errors in the genomic sequence. This is for testing protein models. There could be non-coding RNAs, which should be tested with genuine non-coding models.

Reimplemented in ESTAssembly.

References CDSFractionRNA(), CDSLength(), complete(), FivePrimeUTRLength(), num3NoncodingExons(), num5NoncodingExons(), pep, and ThreePrimeUTRLength().

Referenced by ESTAssembly::genuine().

bool mRNAModel::semiGenuine (  )  const [inline]

bool mRNAModel::isStar (  )  const [inline]

Good-quality model: UTR < 600 nt, CDS length > 330 nt, and number of UTR exons < 2 or CDS fraction > 65%.

Reimplemented in ESTAssembly.

References CDSFraction(), CDSLength(), complete(), FivePrimeUTRLength(), num3NoncodingExons(), num5NoncodingExons(), ThreePrimeUTRLength(), utrlen3max, and utrlen5max.

Referenced by ESTAssembly::isStar().
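
The one-line description leaves the grouping of the conditions open, and the member list suggests that the real method reads its UTR limits from utrlen5max and utrlen3max. The sketch below picks one reading: a complete ORF, each UTR under a limit (600 nt by default, taken from the description), a CDS longer than 330 nt, and either fewer than two non-coding exons or a CDS fraction above 65%. That reading, the struct, and all names are assumptions, not the actual implementation.

```cpp
// Hypothetical plain-data summary of the quantities the rule looks at.
struct StarInput {
    bool complete;       // ORF has both start and stop
    int utr5;            // 5' UTR length in nt
    int utr3;            // 3' UTR length in nt
    int cdsLength;       // CDS length in nt
    int noncodingExons;  // number of non-coding (UTR-only) exons
    double cdsFraction;  // CDS fraction, 0..1
};

// Hypothetical sketch of one possible reading of the quality rule.
bool isStarSketch(const StarInput &m, int utrMax = 600) {
    if (!m.complete) return false;
    if (m.utr5 >= utrMax || m.utr3 >= utrMax) return false;
    if (m.cdsLength <= 330) return false;
    return m.noncodingExons < 2 || m.cdsFraction > 0.65;
}
```

Passing the limit as a parameter keeps the sketch usable with other thresholds, such as the static defaults documented under Member Data.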

modeltype mRNAModel::objtype (  )  const [inline, virtual]

Reimplemented from RNAModel.

Reimplemented in JGIModel, and ESTAssembly.

bool mRNAModel::valid (  )  const [virtual]

Checks the following:
1. CDS genomic range outside model
2. CDS length < 5
3. RNA length should agree with exonLength

Reimplemented from RNAModel.

References Range::begin(), cdsb, cdse, Range::direction(), Range::end(), gcdsb, gcdse, RNAModel::rna, and RNAModel::valid().

Referenced by ESTAssembly::prune3PrimeUTR(), and ESTAssembly::prune5PrimeUTR().
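
The three checks can be sketched on a plain struct as below. Field names mirror the protected attributes where possible, but the struct itself, the inclusive 1-based coordinates, and the exact comparison operators are assumptions; the real method also calls RNAModel::valid().

```cpp
#include <algorithm>
#include <string>

// Hypothetical plain-data stand-in for the parts of a model that are checked.
struct ModelSketch {
    int begin, end;    // genomic bounds of the model (either orientation)
    int gcdsb, gcdse;  // genomic CDS bounds
    int cdsb, cdse;    // RNA CDS bounds, 1-based inclusive
    int exonLength;    // sum of exon lengths
    std::string rna;   // spliced RNA sequence
};

// Hypothetical sketch: a model is valid when none of the three problems occurs.
bool validSketch(const ModelSketch &m) {
    int lo = std::min(m.begin, m.end), hi = std::max(m.begin, m.end);
    // 1. the genomic CDS range must lie inside the model
    bool cdsInside = std::min(m.gcdsb, m.gcdse) >= lo &&
                     std::max(m.gcdsb, m.gcdse) <= hi;
    // 2. the CDS must be at least 5 bases long
    bool cdsLongEnough = (m.cdse - m.cdsb + 1) >= 5;
    // 3. the RNA length must agree with the summed exon length
    bool rnaAgrees = static_cast<int>(m.rna.size()) == m.exonLength;
    return cdsInside && cdsLongEnough && rnaAgrees;
}
```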

ostream & mRNAModel::print ( ostream &  ous  )  const [virtual]

for operator<<

Reimplemented from RNAModel.

Reimplemented in JGIModel, and ESTAssembly.

References cdsb, cdse, CDSLength(), pep, and RNAModel::print().

Referenced by ESTAssembly::print(), and JGIModel::print().

mRNAModel & mRNAModel::reverse ( int  newRNACDSB,
int  newRNACDSE 
)

Prevents reversing of multi-exon models; it will crash if you try. This protects the programmer from making mistakes. The CDS range, pep, and frame all need to be reset.

Reimplemented in ESTAssembly.

References Noschain::numberOfRanges(), resetRNACDS(), and RNAModel::reverse().

string mRNAModel::name ( const string &  prefix  )  [inline]

References RNAModel::getOid(), and itos().

static void mRNAModel::setShortestModel ( int  len  )  [inline, static]

References shortestmodel.

Referenced by main().

void mRNAModel::writetab ( ostream &  mod,
ostream &  ex,
ostream &  track,
ostream &  orna,
ostream &  oprt,
char  sep = '\t' 
) const

ostream & mRNAModel::writeModelTable ( ostream &  ous,
char  sep = '\t' 
) const

Writes the following columns: (modid geneid genomicId begin end genomicCDSb genomicCDSe numberOfExons sumexonLength exonstring CDSb CDSe RNAseq pepSeq frame)

Produces one row of tab-delimited text for loading into a database. Derived classes should use this method, but that would require changing the order of the columns, so copy-pasting is used at this point.

Reimplemented in mRNAModelUpdate.

References Range::begin(), cdsb, cdse, Range::end(), RNAModel::exonLength(), frame, gcdsb, gcdse, RNAModel::geneId(), RNAModel::genomicId(), RNAModel::getOid(), Noschain::numberOfRanges(), pep, RNAModel::rna, and Noschain::toString().

Referenced by writetab().


Member Data Documentation

const char mRNAModel::modelheader = "modid\tgeneid\tgenomicId\tbegin\tend\tgenomicCDSb\tgenomicCDSe\tnumberOfExons\tsumexonLength\texonstring\tCDSb\tCDSe\tRNAseq\tpepSeq\tframe" [static]

Reimplemented in mRNAModelUpdate, and ESTAssembly.

Referenced by main().

const char mRNAModel::jgiModelCol = "id\tchrom\tstrand\tstart\tend\tcdsStart\tcdsEnd\tsfCount\tsfStarts\tsfEnds" [static]

const char mRNAModel::jgiTranscriptCol = "transcriptId\tlengthGenomic\tlengthTranscript\tlengthCDS\tsfExonsGenomic\tsfCDSGenomic\tsfExonsTranscript\tsfCDSTranscript\tseqGenomic\tseqTranscript\tseqCDS" [static]

const char mRNAModel::jgiProteinCol = "proteinId\ttranscriptId\tlengtht\tsfExons\tseq" [static]

int mRNAModel::shortestpep = 30 [static]

The current default is 30. This is the size of the smallest proteins identified in fungi.

Referenced by JGIModel::valid().

int mRNAModel::shortestmodel = 90 [static]

Currently set to 90. This is smaller than 120 nt, the size of the smallest mRNA known in fungi, so no useful information is discarded.

Referenced by JGIModel::JGIModel(), setShortestModel(), and JGIModel::valid().

int mRNAModel::utrlen5max = 2000 [static]

Upper limit on UTR length that covers 95% of genes, for both the 3' and 5' UTR. Defaults: 5' 2000, 3' 1900.

Referenced by isStar(), and semiGenuine().

int mRNAModel::utrlen3max = 1900 [static]

Referenced by isStar(), and semiGenuine().

int mRNAModel::cdsb [protected]

int mRNAModel::cdse [protected]

int mRNAModel::gcdsb [protected]

int mRNAModel::gcdse [protected]

string mRNAModel::pep [protected]

int mRNAModel::frame [protected]

Marks which translation frame the peptide is derived from. There are three frames: 0, 1, and 2. This is only useful when the 5' end of the RNA is partial. In this case, the CDS starts from 1, but translation could start from 0, 1, or 2. In other cases, when the CDS begin > 2, the frame is 0. This frame is the frame in which to start translation after taking the subsequence [cdsb,cdse]; it is different from the reading frame that cdsb is in, which can be computed by (cdsb-1)%3.

In other words, this is only a translation instruction, not the absolute frame of the CDS with regard to the whole mRNA.

Referenced by ESTAssemblyid::breakup(), ESTAssembly::budMinusSuffixModel(), ESTAssembly::budPlusPrefixModel(), getFrame(), mRNAModel(), operator=(), reset(), resetProtein(), ESTAssembly::setCDSInfo(), setLongestCDSAndProtein(), setRNACDS(), show(), JGIModel::toSQLString(), trimCDSTail(), JGIModel::valid(), ESTAssembly::writeModel(), mRNAModelUpdate::writeModelTable(), and writeModelTable().
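
The distinction between the two kinds of frame can be sketched with two small functions. This assumes 1-based cdsb and reads the reading-frame expression in the description as (cdsb-1) modulo 3; both function names are hypothetical.

```cpp
// Hypothetical sketch: reading frame of the CDS start relative to the whole
// mRNA, assuming a 1-based cdsb.
int readingFrameOfCdsb(int cdsb) {
    return (cdsb - 1) % 3;
}

// Hypothetical sketch: 1-based RNA position where translation actually starts,
// given the translation-instruction frame stored in the model.
int firstTranslatedBase(int cdsb, int frame) {
    return cdsb + frame;
}
```

For a 5'-partial model with cdsb 1 and frame 2, the CDS begins at base 1 in reading frame 0, but translation starts at base 3.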

char mRNAModel::header = "" [static, protected]


The documentation for this class was generated from the following files:

Generated on Wed Aug 10 11:57:13 2011 for Softwares from Orpara by doxygen 1.5.6