Noschain Class Reference

#include <GenModel.h>

Inheritance diagram for Noschain:

[Inheritance diagram; classes shown: Range, Alnchain, ESTModel, GenModel, mRNAModelLight, RNAModel, Alnchainid, mRNAModelCaiwe, mRNAModel, ESTAssembly, JGIModel, mRNAModelUpdate, ESTAssemblyid]

List of all members.

Public Member Functions

 Noschain ()
 Noschain (const string &exstarts, const string &exends, char strand) throw (Badinput)
 Noschain (const string &exstr) throw (Badinput)
 Noschain (const Noschain &eg)
void fixAndBuild (const vector< int > &bb, const vector< int > &ee, char strand)
 ~Noschain ()
Noschain & operator= (const Noschain &eg)
Noschain & operator+= (const Range &r)
Noschain operator+ (int c) const
Noschain operator- (int c) const
Noschain & operator-= (int c)
Noschain & operator+= (int c)
Range & operator[] (int idx)
const Range & operator[] (int idx) const
Range firstExon () const
Range lastExon () const
bool operator== (const Noschain &nc) const
bool operator!= (const Noschain &nc) const
bool operator< (const Noschain &nc) const
int compareByDirection (const Noschain &nc) const
Noschain commonExons (const Noschain &eg) const
Noschain commonIntrons (const Noschain &nc) const
bool isNull () const
bool empty () const
void clear ()
void setNull ()
int exonLength () const
int intronLength () const
Range maxIntron () const
int maxIntronLength () const
Range minIntron () const
int minIntronLength () const
vector< Range > introns () const
bool containIntron (int p1, int p2) const
Range outerRange () const
vector< Range > intronsInside (const int len) const
int numberOfRanges () const
Noschain subchainBeforePoint (int p) const throw (PointOutChain)
Noschain subchainAfterPoint (int p) const throw (PointOutChain)
void trimAfterPoint (const int p) throw (PointOutChain)
void trimBeforePoint (const int p) throw (PointOutChain)
Noschain subchain (const Range &r) const throw (PointOutChain)
Noschain subchain (int bg, int ed) const throw (PointOutChain)
const vector< Range > & getExons () const
void setExons (const vector< Range > &ex)
void setChain (const Noschain &chain)
void updateOuterRange ()
void set (const Range &r) throw (Badinput)
void setEnd (int ne) throw (PointOutChain)
void setBegin (int nb)
void growEnd (int icr)
int advancePosOnExon (int pos, unsigned int d) const
int retreatPosOnExon (int pos, unsigned int d) const
int exonIndex (int p) const
int exonIndex (const Range &r) const
bool insideExon (int p) const
bool insideExon (const Range &r, bool samedir=false) const
virtual ostream & show (ostream &ous=cout) const
virtual ostream & print (ostream &ous) const
string toString () const
pair< string, string > startEnd (char sep=',') const
string jgiformat (char sep='\t') const
string sfExonsGenomic () const
string sfExonsTranscript () const
Noschain onewayChain () const
void onewayChain (Noschain &owc) const
void exonOnlyChain (Noschain &eoc) const
Noschain exonOnlyChain () const
string asDelimitedString (const char sep=':', const char rsep[]=",") const
string asDelimitedString (const char sep=':', const char rsep=',') const
bool exonContain (const Noschain &nc, bool samedirection=false) const
int compareChainAndFix (Noschain &nc, int &edit)
bool exonOverlap (const Noschain &nc) const
int exonOverlapLength (const Noschain &nc) const
int exonIntersectLength (const Noschain &nc) const
pair< float, float > exonIntersectFraction (const Noschain &nc) const
bool append (Noschain &nc)
void extendEnd (const Noschain &nc)
Noschain & reverse ()
void subsequence (string &sub, const string &gs) const throw (Badinput)
bool isDangle (const int intronlen=900, const int exonlen=27) const
bool directionAgree () const

Protected Attributes

vector< Range > exons

Friends

Noschain operator+ (int c, const Noschain &chain)
Noschain operator- (int c, const Noschain &chain)
ostream & operator<< (ostream &ous, const Noschain &nc)


Detailed Description

This class mainly models a gene model, with exons as the dominant data structure.

A chain of non-overlapping, sorted Range objects. The chain can be in either the + or the - direction.


Constructor & Destructor Documentation

Noschain::Noschain (  )  [inline]

Noschain::Noschain ( const string &  exstarts,
const string &  exends,
char  strand 
) throw (Badinput)

Builds a chain from two columns of the JGI model track, sfStarts and sfEnds, for example:

sfStarts: 949,1166,1734,2619,3101,4104,5056,5763,6351,6808,8040,
sfEnds: 1050,1442,2218,2833,3168,4848,5423,5824,6540,7539,8518,

These numbers are 1-based indices.

References allLessVector(), checkCoordinates(), nondecreasingVector(), nonincreasingVector(), Range::set(), and string2Vector().
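The pairing of the two columns can be sketched as follows. This is a minimal, standalone illustration and not the library's implementation: SimpleRange, parseIntList() and buildExons() are hypothetical names, and the convention that a - strand chain lists its exons 5'->3' with begin greater than end is an assumption taken from the storage format described for the string constructor.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the library's Range: b > e means minus strand.
struct SimpleRange { int b; int e; };

// Split a comma-separated list such as "949,1166," into integers.
// The trailing comma found in the JGI columns is tolerated.
std::vector<int> parseIntList(const std::string &s) {
    std::vector<int> out;
    std::istringstream in(s);
    std::string tok;
    while (std::getline(in, tok, ',')) {
        if (!tok.empty()) out.push_back(std::stoi(tok));
    }
    return out;
}

// Pair each start with its end. For '-' the exons are emitted in
// 5'->3' order: last genomic exon first, each with begin > end.
std::vector<SimpleRange> buildExons(const std::string &starts,
                                    const std::string &ends, char strand) {
    std::vector<int> bb = parseIntList(starts);
    std::vector<int> ee = parseIntList(ends);
    assert(bb.size() == ee.size()); // the real class reports bad input instead
    std::vector<SimpleRange> exons;
    for (std::size_t i = 0; i < bb.size(); ++i) exons.push_back({bb[i], ee[i]});
    if (strand == '-') {
        std::vector<SimpleRange> rev;
        for (std::size_t i = exons.size(); i-- > 0;)
            rev.push_back({exons[i].e, exons[i].b});
        exons = rev;
    }
    return exons;
}
```

Calling buildExons("949,1166,", "1050,1442,", '-') in this sketch yields the exons 1442-1166 and 1050-949.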

Noschain::Noschain ( const string &  exstr  )  throw (Badinput)

This constructor accepts two input formats. The first format is: 5462578->5462635 5462918->5463000. The second format is: 15468-15365,15315-15270,15205-15157,15101-13481. The second is the storage format of this set of programs.

References E, and Range::set().
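Both input formats can be read with one pass, because their separators never contain digits. The sketch below only illustrates that idea under stated assumptions: SimpleRange and parseChain() are hypothetical names, coordinates are assumed to be non-negative, and the real constructor may well parse the two formats separately.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the library's Range.
struct SimpleRange { int b; int e; };

// Collect every run of digits; consecutive pairs become one range.
// Handles "5462578->5462635 5462918->5463000" as well as
// "15468-15365,15315-15270" (non-negative coordinates assumed).
std::vector<SimpleRange> parseChain(const std::string &text) {
    std::vector<int> nums;
    std::size_t i = 0;
    while (i < text.size()) {
        if (text[i] >= '0' && text[i] <= '9') {
            std::size_t j = i;
            while (j < text.size() && text[j] >= '0' && text[j] <= '9') ++j;
            nums.push_back(std::stoi(text.substr(i, j - i)));
            i = j;
        } else {
            ++i; // skip "->", "-", "," and spaces
        }
    }
    assert(nums.size() % 2 == 0); // every begin needs an end
    std::vector<SimpleRange> exons;
    for (std::size_t k = 0; k + 1 < nums.size(); k += 2)
        exons.push_back({nums[k], nums[k + 1]});
    return exons;
}
```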

Noschain::Noschain ( const Noschain &  eg  ) 

copy constructor

References exons, numberOfRanges(), and Range::set().

Noschain::~Noschain (  )  [inline]


Member Function Documentation

void Noschain::fixAndBuild ( const vector< int > &  bb,
const vector< int > &  ee,
char  strand 
)

Constructor helper for bad inputs. This makes the class very expensive, because every input is checked for possible errors.

Identifies the Genewise format, but only on the - strand: Bn-->En, Bn-1-->En-1, ...

References exons, Range::Range(), and Range::set().

Noschain & Noschain::operator= ( const Noschain &  eg  ) 

Noschain & Noschain::operator+= ( const Range &  r  ) 

Adds one Range at a time. This can be used to construct a long chain.

References Range::begin(), Range::end(), exons, Range::setBegin(), and Range::setEnd().

Noschain Noschain::operator+ ( int  c  )  const

scalar shifting of all ranges

Reimplemented from Range.

References exons.

Noschain Noschain::operator- ( int  c  )  const

shift to the left

Reimplemented from Range.

References exons.

Noschain & Noschain::operator-= ( int  c  ) 

Reimplemented from Range.

References Range::begin(), exons, and Range::set().

Noschain & Noschain::operator+= ( int  c  ) 

Reimplemented from Range.

References Range::begin(), exons, and Range::set().

Range& Noschain::operator[] ( int  idx  )  [inline]

Returns the exon at index idx as a Range. The index is 0-based.

References exons.

const Range& Noschain::operator[] ( int  idx  )  const [inline]

Returns a const reference to the underlying exon.

References exons.

Range Noschain::firstExon (  )  const [inline]

References exons.

Referenced by exonOverlap().

Range Noschain::lastExon (  )  const [inline]

References exons.

Referenced by exonOverlap(), tilingMinusLength(), and tilingPlusLength().

bool Noschain::operator== ( const Noschain &  nc  )  const [inline]

all the exons are identical

References exons.

Referenced by GenModel::operator==(), RNAModel::sameExons(), and GenModel::sameExons().

bool Noschain::operator!= ( const Noschain &  nc  )  const [inline]

References exons.

bool Noschain::operator< ( const Noschain &  nc  )  const

Ordering is directionless: first by outer range using "<"; if the outer ranges are identical, then by direction (+ before -); after that, each component exon is compared with "<". This operator is essential for associative containers, where different objects should not be discarded. For example, if you insert the two objects 1-->2 5-->7 9-->50 and 1-->2 5-->50 into a set container, the container will hold only one object if only the Range less operator is used.

For example, 30->1076 is less than 1076->573 507->30.

References Range::copyReverse(), Range::direction(), exons, Range::operator<(), and outerRange().
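The three-level ordering described above can be sketched as a comparator for std::set. This is an illustration under assumptions, not the library's code: SimpleRange, Chain and ChainLess are hypothetical names, exons are stored 5'->3', and the tie-break on exon count when one chain is a prefix of the other is a choice made for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-ins: a chain is a vector of exons in 5'->3' order.
struct SimpleRange { int b; int e; };
typedef std::vector<SimpleRange> Chain;

// Directionless outer range as (smaller end, larger end).
std::pair<int, int> outer(const Chain &c) {
    assert(!c.empty());
    return std::make_pair(std::min(c.front().b, c.back().e),
                          std::max(c.front().b, c.back().e));
}

bool isPlus(const Chain &c) { return c.front().b <= c.back().e; }

struct ChainLess {
    bool operator()(const Chain &x, const Chain &y) const {
        if (outer(x) != outer(y)) return outer(x) < outer(y); // 1. outer range
        if (isPlus(x) != isPlus(y)) return isPlus(x);         // 2. + before -
        std::size_t n = std::min(x.size(), y.size());
        for (std::size_t i = 0; i < n; ++i) {                 // 3. exon by exon
            std::pair<int, int> xi(x[i].b, x[i].e), yi(y[i].b, y[i].e);
            if (xi != yi) return xi < yi;
        }
        return x.size() < y.size(); // assumed tie-break: fewer exons first
    }
};
```

With this comparator, a std::set<Chain, ChainLess> keeps both 1-->2 5-->7 9-->50 and 1-->2 5-->50, although their outer ranges are identical.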

int Noschain::compareByDirection ( const Noschain &  nc  )  const

Overrides the Range method and adds a finer-scale comparison.

References Range::compareByDirection(), exons, and numberOfRanges().

Referenced by lessByChainDirection::operator()(), lessByChainDirectionPtr::operator()(), and testOperator().

Noschain Noschain::commonExons ( const Noschain &  eg  )  const

Returns the common exons as a Noschain object. Currently this function returns common exons only if both chains are in the same direction; otherwise it returns an empty object.

References exons, Range::greaterByDirection(), I, Range::lessByDirection(), numberOfRanges(), Range::overlap(), and Range::sameDirection().

Noschain Noschain::commonIntrons ( const Noschain &  nc  )  const

It is easier to have common introns than to have common exons.

Left to right scan, skipping non common regions.

References Range::begin(), Range::direction(), Range::end(), exons, I, numberOfRanges(), Range::Range(), and Range::sameDirection().

Referenced by mRNAModelLight::sameGene(), mRNAModel::sameGene(), and testIntersect().

bool Noschain::isNull (  )  const [inline, virtual]

Operations that return information about this object. Returns true if there is no Range in the chain.

Reimplemented from Range.

References exons, and Range::isNull().

Referenced by onewayChain(), operator=(), print(), RNAModel::RNAModel(), and show().

bool Noschain::empty (  )  const [inline]

An empty range is one whose begin and end are both at the origin.

Reimplemented from Range.

References exons.

void Noschain::clear (  )  [inline]

References exons, and Range::setNull().

void Noschain::setNull (  )  [inline, virtual]

Makes everything empty.

Reimplemented from Range.

References exons, and Range::setNull().

Referenced by onewayChain(), and operator=().

int Noschain::exonLength (  )  const

int Noschain::intronLength (  )  const

Returns:
the total intron length, i.e. the sum of all intron lengths

References exons, and numberOfRanges().

Referenced by isChimera(), print(), and show().

Range Noschain::maxIntron (  )  const

int Noschain::maxIntronLength (  )  const [inline]

Range Noschain::minIntron (  )  const

int Noschain::minIntronLength (  )  const [inline]

References Range::length(), and minIntron().

Referenced by readandstoreSam().

vector< Range > Noschain::introns (  )  const

Returns the introns as a vector of Range, not as a Noschain. A vector of Range is a more basic type than Noschain.

References Range::begin(), Range::direction(), Range::end(), exons, numberOfRanges(), and Range::Range().
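Deriving introns from neighbouring exons can be sketched as below. The names SimpleRange, intronsOf() and totalIntronLength() are hypothetical, and the sketch assumes inclusive coordinates, exons stored 5'->3', and at least one intron base between neighbouring exons.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for the library's Range: b > e means minus strand.
struct SimpleRange { int b; int e; };

// Each intron fills the gap between the end of one exon and the
// begin of the next, and inherits the direction of the chain.
std::vector<SimpleRange> intronsOf(const std::vector<SimpleRange> &exons) {
    std::vector<SimpleRange> out;
    for (std::size_t i = 0; i + 1 < exons.size(); ++i) {
        int gapFrom = exons[i].e, gapTo = exons[i + 1].b;
        assert(gapFrom != gapTo); // neighbouring exons must not share a base
        int step = gapFrom < gapTo ? 1 : -1;
        out.push_back({gapFrom + step, gapTo - step});
    }
    return out;
}

// Sum of all intron lengths, counting both end points of each intron.
int totalIntronLength(const std::vector<SimpleRange> &exons) {
    int total = 0;
    std::vector<SimpleRange> in = intronsOf(exons);
    for (std::size_t i = 0; i < in.size(); ++i)
        total += std::abs(in[i].e - in[i].b) + 1;
    return total;
}
```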

bool Noschain::containIntron ( int  p1,
int  p2 
) const [inline]

Returns true if there is an intron between p1 and p2. The order of p1 and p2 is not important: p1 can be before or after p2, but both must be inside exons.

References exonIndex().

Range Noschain::outerRange (  )  const [inline]

Returns the outer Range, which is the same as the parent class Range. The same object may also be obtained through pointer casting.

References Range::b, Range::e, and Range::Range().

Referenced by clustergene(), operator<(), and writeModGeneRow().

vector< Range > Noschain::intronsInside ( const int  len  )  const

Takes the introns that lie inside the chain once len has been removed from both ends:

|--len--|====== Pick introns inside this region ======|--len--|

Both ends are shrunk by len if the remaining gene model is still long enough. Otherwise an empty vector is returned.

References Range::begin(), Range::direction(), E, Range::end(), exons, Range::length(), numberOfRanges(), and Range::Range().

Referenced by checkUTRChimera().

int Noschain::numberOfRanges (  )  const [inline]

References exons.

Referenced by mRNAModelUpdate::addESTCover(), adskip(), alternativeAcceptor(), alternativeDonor(), mRNAModel::append(), append(), Coverdepth::append(), appendChainSingle2Many(), Graph::assembleOneRound(), Coverdepth::assign(), astypes(), bestORF(), bothADAA(), ESTAssembly::breakPrefixModel(), ESTAssembly::breakSuffixModel(), ESTAssemblyid::breakup(), buildGeneCluster(), CEorAA(), ESTAssembly::checkIntronBound(), checkUTRChimera(), cmpEditMinusChainFirstFront(), cmpEditPlusChainFirstFront(), cmpEditPlusChainIdenticalFront(), commonExons(), commonIntrons(), compareByDirection(), compareChainAndFix(), compareMinusChainFirstBefore(), compareMinusChainIdenticalRight(), compareMultipleWithSingle(), comparePlusChainFirstBefore(), comparePlusChainIdenticalLeft(), countNInIntrons(), Coverdepth::Coverdepth(), directionAgree(), ESTModel::ESTModel(), exonContain(), exonIntersectLength(), exonOnlyChain(), exonOverlap(), exonOverlapLength(), exonOverlapLengthManyToOne(), extendEnd(), ESTAssembly::findDip(), ESTAssembly::fixIntronBound(), mRNAModel::guessStrand(), incompatible(), RNAModel::intronBound(), intronLength(), introns(), intronsInside(), ESTAssembly::isChimera(), isChimera(), jgiformat(), mRNAModel::JGIModelRow(), makeExonLength(), maxIntron(), minIntron(), mutualExclusive(), Noschain(), mRNAModel::num3NoncodingExons(), GenModel::num3NoncodingExons(), mRNAModel::num5NoncodingExons(), GenModel::num5NoncodingExons(), operator=(), pickGoodModel(), print(), GenModel::printAllmod(), mRNAModel::printJGIModelRow(), readandstoreSam(), removeChimeraModels(), mRNAModel::reset(), retainIntron(), mRNAModel::reverse(), mRNAModel::sameGene(), ESTAssembly::setCDSInfo(), mRNAModel::setLongestCDSAndProtein(), mRNAModel::sfCDSGenomic(), mRNAModel::sfExonsProtein(), ESTAssembly::shouldBreakPrefix(), ESTAssembly::shouldBreakSuffix(), skipExon(), skipExonVariant(), stopsInIntrons(), Gmapres::storeChain(), subchain(), subchainAfterPoint(), subchainBeforePoint(), tilingMinus(), 
tilingMinusLength(), tilingPlus(), tilingPlusLength(), GenModel::toJGIModel(), JGIModel::toJGIString(), updateCompatible(), JGIModel::valid(), ESTAssembly::write(), ESTModel::write(), ESTCombined::write(), RNAModel::writeExon(), ESTAssembly::writeModel(), mRNAModelUpdate::writeModelTable(), and mRNAModel::writeModelTable().

Noschain Noschain::subchainBeforePoint ( int  p  )  const throw (PointOutChain)

If the chain is in the + direction, returns the part to the left of p; if it is in the - direction, the part to the right. The point p itself is not included (exclusive).

References Range::begin(), Range::contain(), Range::direction(), Range::end(), exons, numberOfRanges(), Range::Range(), and show().

Referenced by mRNAModel::FivePrimeUTR(), GenModel::FivePrimeUTR(), GenModel::FivePrimeUTRLength(), and testsubchain().

Noschain Noschain::subchainAfterPoint ( int  p  )  const throw (PointOutChain)

Opposite of subchainBeforePoint(). The point must be inside an exon. If it is outside the top (outer) range, an exception is thrown; if it is not inside an exon, an exception is also thrown. A version that accepts an input point inside introns may be implemented.

The returned chain starts at (p+1); when p is at the end of an exon, the chain starts in the next exon. For - chains, it returns p-1 and below. So "after" refers to the 5'->3' direction regardless of the direction of the chain.

References Range::begin(), Range::contain(), Range::direction(), Range::end(), exons, numberOfRanges(), and Range::Range().

Referenced by testsubchain(), mRNAModel::ThreePrimeUTR(), GenModel::ThreePrimeUTR(), and GenModel::ThreePrimeUTRLength().

void Noschain::trimAfterPoint ( const int  p  )  throw (PointOutChain)

Requires p to be inside an exon; p can be the end of an exon. Everything after p is discarded and p is retained. On the - strand this removes the part to the left of p.

Reimplemented in RNAModel, and mRNAModel.

References exonIndex(), exons, and setEnd().

Referenced by RNAModel::trimAfterPoint().

void Noschain::trimBeforePoint ( const int  p  )  throw (PointOutChain)

Retains everything before p. The direction is dictated by the 5' to 3' direction: for - chains it runs from large to small, for + chains from small to large.

Reimplemented in RNAModel, and mRNAModel.

References Range::end(), exonIndex(), exons, itos(), Range::Range(), Range::setBegin(), and setBegin().

Referenced by mRNAModel::trimBeforePoint(), and RNAModel::trimBeforePoint().

Noschain Noschain::subchain ( const Range &  r  )  const throw (PointOutChain)

Noschain Noschain::subchain ( int  bg,
int  ed 
) const throw (PointOutChain) [inline]

References Range::Range(), and subchain().

const vector<Range>& Noschain::getExons (  )  const [inline]

References exons.

Referenced by GenModel::CDSSeq().

void Noschain::setExons ( const vector< Range > &  ex  )  [inline]

References exons, and updateOuterRange().

Referenced by setChain().

void Noschain::setChain ( const Noschain &  chain  )  [inline]

Simply replaces the underlying exons with those of the new chain and also updates the outer range. This method implicitly calls setExons().

References exons, and setExons().

Referenced by ESTAssembly::budMinusPrefixModel(), ESTAssembly::budMinusSuffixModel(), ESTAssembly::budPlusPrefixModel(), ESTAssembly::budPlusSuffixModel(), ESTAssembly::prune3PrimeUTR(), and ESTAssembly::prune5PrimeUTR().

void Noschain::updateOuterRange (  )  [inline]

only sets the parent Range

References Range::begin(), exons, and Range::set().

Referenced by setExons().

void Noschain::set ( const Range &  r  )  throw (Badinput)

Sets the outer range with Range::set(). This function first does a check: if the chain is multi-exon and r is in a different direction, it throws a Badinput exception.

Reimplemented from Range.

References Range::set().

Referenced by exonOnlyChain().

void Noschain::setEnd ( int  ne  )  throw (PointOutChain)

Extends the parent method to set both the chain end and the parent end. Only the first and last exons may be modified; the operation is kept deliberately narrow so that errors can be caught.

Reimplemented from Range.

Reimplemented in GenModel.

References Range::direction(), exons, and Range::setEnd().

Referenced by append(), appendChainSingle2Many(), Alnchainid::averageWith(), cmpEditMinusChainFirstFront(), cmpEditPlusChainFirstFront(), cmpEditPlusChainIdenticalFront(), compareMinusChainFirstBefore(), compareMinusChainIdenticalRight(), compareMultipleWithSingle(), comparePlusChainFirstBefore(), comparePlusChainIdenticalLeft(), extendEnd(), GenModel::GenModel(), JGIModel::JGIModel(), GenModel::setEnd(), trimAfterPoint(), and JGIModel::valid().

void Noschain::setBegin ( int  nb  ) 

void Noschain::growEnd ( int  icr  ) 

Simply extends the parent end and the exon end using Range::growEnd(). The chain has no concept of limits (the domain is [-inf, +inf]), so this operation always makes sense.

Reimplemented from Range.

Reimplemented in GenModel, and JGIModel.

References exons, and Range::growEnd().

Referenced by mRNAModel::growCDS3Prime(), JGIModel::growEnd(), and GenModel::growEnd().

int Noschain::advancePosOnExon ( int  pos,
unsigned int  d 
) const

Use this method to move an index inside exon space, such that if it is at the end of an exon, an increment by 1 jumps to the start of the next exon. No range checking is done to see whether the result is outside this object; the method simply returns the position moved by d units in the direction of this object.

References Range::direction(), Range::end(), exonIndex(), and exons.

Referenced by ESTAssembly::prune3PrimeUTR(), and testAdvance().
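Walking a position through exon space can be sketched as follows. SimpleRange, contains() and advanceOnExons() are hypothetical names, and the behaviour past the last exon (keep counting in the chain direction) is an assumption based on the remark that no range checking is done.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for the library's Range: b > e means minus strand.
struct SimpleRange { int b; int e; };

bool contains(const SimpleRange &r, int p) {
    return std::min(r.b, r.e) <= p && p <= std::max(r.b, r.e);
}

// Move pos by d exonic bases in the direction of the chain,
// hopping over introns. Exons are stored in 5'->3' order.
int advanceOnExons(const std::vector<SimpleRange> &exons, int pos, unsigned int d) {
    std::size_t i = 0;
    while (i < exons.size() && !contains(exons[i], pos)) ++i;
    assert(i < exons.size()); // pos must start inside an exon
    int left = static_cast<int>(d);
    while (left > 0) {
        int step = exons[i].b <= exons[i].e ? 1 : -1;
        int room = std::abs(exons[i].e - pos); // bases left in this exon
        if (left <= room || i + 1 == exons.size())
            return pos + step * left; // may leave the chain: no range check
        left -= room + 1;             // use up this exon and cross the intron
        ++i;
        pos = exons[i].b;
    }
    return pos;
}
```

In this sketch, advancing position 10 by 1 on the chain 1-10, 21-30 lands on 21, the first base of the next exon.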

int Noschain::retreatPosOnExon ( int  pos,
unsigned int  d 
) const

not implemented yet

References Range::begin(), Range::direction(), Range::end(), exonIndex(), and exons.

Referenced by ESTAssembly::prune5PrimeUTR(), and testAdvance().

int Noschain::exonIndex ( int  p  )  const

Returns the index of the exon in this chain that contains p, or -1 if p is not inside any exon.

Parameters:
p is usually a 1-based position in the sequence.

References Range::contain(), and exons.

Referenced by advancePosOnExon(), appendChainSingle2Many(), compareMultipleWithSingle(), containIntron(), exonOverlapLengthManyToOne(), extendEnd(), insideExon(), retreatPosOnExon(), tilingMinus(), tilingMinusLength(), tilingPlus(), tilingPlusLength(), trimAfterPoint(), trimBeforePoint(), JGIModel::valid(), and GenModel::valid().

int Noschain::exonIndex ( const Range &  r  )  const

References exons, and Range::overlap().

bool Noschain::insideExon ( int  p  )  const [inline]

testing a point inside the range of one exon

References exonIndex().

Referenced by exonContain(), GenModel::GenModel(), and JGIModel::JGIModel().

bool Noschain::insideExon ( const Range &  r,
bool  samedir = false 
) const

Tests whether r is inside one of the exons, by default regardless of the direction of r.

Parameters:
samedir the default is false, in which case it does not matter whether r is in the same direction as the chain. If set to true, this function returns false when r is in the opposite direction.

References Range::contain(), and exons.

ostream & Noschain::show ( ostream &  ous = cout  )  const [virtual]

ostream & Noschain::print ( ostream &  ous  )  const [virtual]

This is mostly a utility function for implementing operator<< so that it works properly with derived classes. The output is still in the human-readable format.

Reimplemented in Alnchain, Alnchainid, RNAModel, mRNAModel, JGIModel, and ESTAssembly.

References Range::direction(), exonLength(), exons, intronLength(), isNull(), and numberOfRanges().

Referenced by RNAModel::print(), and Alnchain::print().

string Noschain::toString (  )  const

Outputs the exons in the format b1-e1,b2-e2,... This format can be fed into one of the constructors. It is a possible storage format in files for future debugging.

References exons.

Referenced by ESTAssembly::ESTAssembly(), ESTModel::ESTModel(), JGIModel::toSQLString(), GenModel::toSQLString(), JGIModel::toString(), GenModel::toString(), Alnchainid::toString(), Alnchain::toString(), ESTAssembly::write(), ESTModel::write(), ESTCombined::write(), ESTAssembly::writeModel(), mRNAModelUpdate::writeModelTable(), and mRNAModel::writeModelTable().

pair< string, string > Noschain::startEnd ( char  sep = ','  )  const

Outputs the starts and ends as a pair of strings. The ranges are always ordered from small to large. This is mainly for output to the JGI sfStarts and sfEnds columns.

References Range::direction(), and exons.

Referenced by jgiformat(), mRNAModel::JGIModelRow(), GenModel::printAllmod(), mRNAModel::printJGIModelRow(), GenModel::toJGIModel(), and JGIModel::toJGIString().

string Noschain::jgiformat ( char  sep = '\t'  )  const

Outputs the strand, start, end, sfCount, sfStarts, and sfEnds columns for data dumps and table uploads. Only the chrom column is still needed.

References Range::begin(), Range::direction(), Range::end(), numberOfRanges(), and startEnd().

Referenced by ESTAssembly::write(), ESTModel::write(), and ESTCombined::write().

string Noschain::sfExonsGenomic (  )  const

string Noschain::sfExonsTranscript (  )  const

Noschain Noschain::onewayChain (  )  const

This chain is the same as the parent chain, except that it is derived from the parent chain by setting begin() to 1. It is always in the 5' to 3' direction, as in a DNA or RNA sequence.

References Range::begin(), and Range::direction().

Referenced by mRNAModel::CDSOnewayChain(), mRNAModel::sfCDSTranscript(), mRNAModel::sfExonsProtein(), and sfExonsTranscript().

void Noschain::onewayChain ( Noschain &  owc  )  const

More efficient version that avoids returning a large object.

References Range::begin(), Range::direction(), isNull(), and setNull().

void Noschain::exonOnlyChain ( Noschain &  eoc  )  const

Produces an exon-only chain with no introns. This is used for the sfExonsTranscript and sfCDSTranscript columns of the JGI output. The coordinate system is that of the RNA, not the genomic DNA. This is the more efficient version. This method should be called after onewayChain() to produce the format of the sfExonsTranscript column. It preserves the first starting point, begin().

References Range::end(), Range::length(), numberOfRanges(), and set().

Referenced by mRNAModel::CDSOnewayChain(), mRNAModel::sfExonsProtein(), and sfExonsTranscript().
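The two-step conversion from genomic to transcript coordinates (one-way chain first, then removal of introns) can be sketched as follows. SimpleRange, Chain, oneway() and exonOnly() are hypothetical names; the sketch assumes inclusive coordinates and exons stored 5'->3'.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins: a chain is a vector of exons in 5'->3' order.
struct SimpleRange { int b; int e; };
typedef std::vector<SimpleRange> Chain;

// Re-express the chain so that it begins at 1 and always runs upward.
Chain oneway(const Chain &c) {
    Chain out;
    if (c.empty()) return out;
    int origin = c.front().b;
    bool plus = c.front().b <= c.back().e;
    for (std::size_t i = 0; i < c.size(); ++i) {
        int b = plus ? c[i].b - origin + 1 : origin - c[i].b + 1;
        int e = plus ? c[i].e - origin + 1 : origin - c[i].e + 1;
        out.push_back({b, e});
    }
    return out;
}

// Close up the introns: each exon keeps its length and starts right
// after the previous one. The first begin is preserved.
Chain exonOnly(const Chain &ow) {
    Chain out;
    if (ow.empty()) return out;
    int next = ow.front().b;
    for (std::size_t i = 0; i < ow.size(); ++i) {
        assert(ow[i].b <= ow[i].e); // expects a one-way chain
        int len = ow[i].e - ow[i].b + 1;
        out.push_back({next, next + len - 1});
        next += len;
    }
    return out;
}
```

For the - chain 1442-1166, 1050-949 the sketch gives the one-way chain 1-277, 393-494 and then the exon-only chain 1-277, 278-379.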

Noschain Noschain::exonOnlyChain (  )  const

Same as above, but returns an object.

References Range::end(), exons, Range::length(), numberOfRanges(), and Range::Range().

Referenced by mRNAModel::sfCDSTranscript().

string Noschain::asDelimitedString ( const char  sep = ':',
const char  rsep[] = "," 
) const

Produces the string for the sfExonsTranscript column.

Parameters:
sep delimiter between ranges.
rsep delimiter within ranges

References exons.

Referenced by mRNAModel::sfCDSGenomic(), mRNAModel::sfExonsProtein(), and sfExonsTranscript().

string Noschain::asDelimitedString ( const char  sep = ':',
const char  rsep = ',' 
) const

References exons.

bool Noschain::exonContain ( const Noschain &  nc,
bool  samedirection = false 
) const

This chain contains nc in terms of all exons. To deal with the fuzzy ends of some sequence alignments, a similar method will be implemented.

If nc has only one exon, the direction of nc is not important when samedirection is set to false, which is the default.

This function is for EST assembly, so the identical case is not considered containment; it belongs to the overlap case.

A PointOutChain exception translates to a NO answer. This function should not throw exceptions.

References Range::contain(), exons, insideExon(), numberOfRanges(), show(), and subchain().

Referenced by testOverlap(), updateOnePredicted(), and updateOneUpdated().

int Noschain::compareChainAndFix ( Noschain &  nc,
int &  edit 
)

This function will also alter the Noschain to fix the ends of this object or nc.

A ======<------======<-------========<-------====
B =======<-------========<-------========<-------====
| trim from here

Returns one of [1, 0, 2, -2]:
1: exon compatible overlapping
2: exon contain
-2: exon contained by nc
0: no meaningful exon overlap
Do not use this method if you don't want to modify the object.

Parameters:
edit indicator of the editing action applied, one of [0, 1, 2, 3]: 0, no edit; 1, edited the first object (this object); 2, edited the second object (nc); 3, edited both the first and the second object. A const version of this method still needs to be written.

all the following are dealing with chains both

References Range::begin(), cmpEditMinusChainFirstFront(), cmpEditPlusChainFirstFront(), cmpEditPlusChainIdenticalFront(), compareMinusChainIdenticalRight(), compareMultipleWithSingle(), Range::contain(), Range::direction(), numberOfRanges(), Range::overlay(), reverseCompareResult(), and reverseEdit().

Referenced by Graph::assembleOneRound(), main(), and testOverlap().

bool Noschain::exonOverlap ( const Noschain &  nc  )  const

Tests overlap with another chain.

If the outer Ranges do not overlay, there is no overlap. Otherwise:

1. Both have one exon: a. if they overlay, then true; b. else false.

2. One has one exon and the other has multiple exons: == ==--->== regardless of direction. This is used for assembling gene model fragments. The exon-contain situation is not considered.

3. Both have multiple exons: ====---====---=== ===---=====---==== Strictly, all overlapping introns have to be identical. When two chains have different introns, such as 807669->807725 807838->808143 808327->808572 808625->809110 and 807547->807725 808327->808473, the two chains should not have exonOverlap. This is the constant version of compareChainAndFix().

References Range::begin(), Range::direction(), Range::end(), firstExon(), Range::largerEnd(), lastExon(), numberOfRanges(), Range::overlay(), Range::smallerEnd(), tilingMinus(), and tilingPlus().

int Noschain::exonOverlapLength ( const Noschain &  nc  )  const

Exactly the same as exonOverlap(), with extra return information about the length of the overlap. This is useful for making decisions such as model update by EST. This version is more useful. It requires identical introns between the two chains; another function where introns don't have to be identical should be implemented.

Requires identical intron structure. This method is used to test the condition for appending another model to this one.

To look at the actual total overlap of exons regardless of intron structure, use exonIntersectLength().

References Range::begin(), Range::direction(), exonOverlapLengthManyToOne(), numberOfRanges(), Range::overlay(), tilingMinusLength(), and tilingPlusLength().

Referenced by mRNAModelLight::sameGene(), testOverlap(), updateOnePredicted(), and updateOneUpdated().

int Noschain::exonIntersectLength ( const Noschain &  nc  )  const

Even if two chains have different intron structures, they can have an intersect length:

======----===---======--------------===
======--=====---======----====------=

This function is useful for detecting models from the same gene.

References Range::direction(), exons, numberOfRanges(), and Range::overlap().

Referenced by exonIntersectFraction(), and testIntersect().
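A direction-independent intersect length can be sketched by summing the pairwise overlaps of exons. SimpleRange, Chain, overlapLength(), exonLength() and intersectLength() are hypothetical names, and the quadratic double loop is a simplification chosen for clarity; it relies on the exons within each chain not overlapping each other.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical stand-ins: a chain is a vector of non-overlapping exons.
struct SimpleRange { int b; int e; };
typedef std::vector<SimpleRange> Chain;

// Number of shared bases between two ranges, ignoring direction.
int overlapLength(const SimpleRange &x, const SimpleRange &y) {
    int lo = std::max(std::min(x.b, x.e), std::min(y.b, y.e));
    int hi = std::min(std::max(x.b, x.e), std::max(y.b, y.e));
    return hi >= lo ? hi - lo + 1 : 0;
}

// Total exonic length of a chain.
int exonLength(const Chain &c) {
    int total = 0;
    for (std::size_t i = 0; i < c.size(); ++i)
        total += std::abs(c[i].e - c[i].b) + 1;
    return total;
}

// Sum of exon-to-exon overlaps, regardless of intron structure.
int intersectLength(const Chain &a, const Chain &b) {
    int total = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < b.size(); ++j)
            total += overlapLength(a[i], b[j]);
    // Sanity check: the intersection cannot exceed the shorter chain.
    assert(total <= std::min(exonLength(a), exonLength(b)));
    return total;
}
```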

pair< float, float > Noschain::exonIntersectFraction ( const Noschain &  nc  )  const

the first one is the smaller, and the second one is the larger

References exonIntersectLength(), and exonLength().

Referenced by mRNAModel::sameGene().

bool Noschain::append ( Noschain &  nc  ) 

This operation should only be applied to chains that are compatible. It is used for the assembly of EST fragments into larger regions on the genomic sequence. The operation grows the chain by merging the two chains, relying on the overlapping region.

This may change nc, for example by reversing its direction. It is an extension of the compareChainAndFix() function that goes on to finish the merging. Sometimes we only need to know the relationship; sometimes we need to merge the segments. Instead of throwing exceptions, this function returns true for success and false for failure, which keeps production and debugging separate. This function is mainly used by the assembler; mRNAModelUpdate also uses it.

Reimplemented in RNAModel.

References appendChainSingle2Many(), Range::begin(), Range::direction(), E, Range::end(), extendEnd(), max, min, numberOfRanges(), reverse(), setBegin(), setEnd(), and show().

Referenced by RNAModel::append(), Graph::assembleOneRound(), and testOverlap().

void Noschain::extendEnd ( const Noschain &  nc  ) 

Used by append() to extend the end with a compatible chain.

this: ====----====---====
nc: ===---======---====

References Range::end(), exonIndex(), exons, numberOfRanges(), Range::setEnd(), and setEnd().

Referenced by append().

Noschain & Noschain::reverse (  ) 

void Noschain::subsequence ( string &  sub,
const string &  gs 
) const throw (Badinput)

Parameters:
gs the input genomic sequence, passed as a reference. The coordinates of the exons should refer to this genomic sequence.
sub the output sequence, which is the splicing product according to this object's intron structure. A reference is used instead of returning a large object, to make the performance more tolerable.

References Range::begin(), Range::direction(), Range::end(), exons, itos(), Range::largerEnd(), Range::length(), and reverseComplementInPlace().

Referenced by ESTModel::ESTModel(), RNAModel::reset(), RNAModel::resetRNA(), and RNAModel::RNAModel().
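Splicing the exons out of a genomic sequence can be sketched as follows. SimpleRange, reverseComplement() and spliceExons() are hypothetical names; the sketch assumes 1-based inclusive coordinates, exons stored 5'->3', and uses assert where the real method is documented to throw Badinput.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the library's Range: b > e means minus strand.
struct SimpleRange { int b; int e; };

// Reverse complement of an upper-case DNA string; unknown bases become N.
std::string reverseComplement(const std::string &s) {
    std::string out;
    for (std::string::const_reverse_iterator it = s.rbegin(); it != s.rend(); ++it) {
        switch (*it) {
            case 'A': out += 'T'; break;
            case 'C': out += 'G'; break;
            case 'G': out += 'C'; break;
            case 'T': out += 'A'; break;
            default:  out += 'N';
        }
    }
    return out;
}

// Concatenate the exon pieces of gs into sub, in chain order.
void spliceExons(std::string &sub, const std::string &gs,
                 const std::vector<SimpleRange> &exons) {
    sub.clear();
    for (std::size_t i = 0; i < exons.size(); ++i) {
        int lo = std::min(exons[i].b, exons[i].e);
        int hi = std::max(exons[i].b, exons[i].e);
        assert(lo >= 1 && hi <= static_cast<int>(gs.size()));
        std::string piece = gs.substr(lo - 1, hi - lo + 1);
        sub += exons[i].b > exons[i].e ? reverseComplement(piece) : piece;
    }
}
```

Writing into a caller-supplied string mirrors the documented signature, where sub is an output parameter.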

bool Noschain::isDangle ( const int  intronlen = 900,
const int  exonlen = 27 
) const

Describes a situation where the end exon is very short and the end intron is very large, such as:

=----------------=====--====-------------------=

Normal introns could be 55 nt in fungi.

References Range::begin(), Range::end(), exons, and Range::length().

Referenced by readandstoreSam(), and Gmapres::storeChain().
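One possible reading of the dangle test is sketched below. SimpleRange, length() and looksDangling() are hypothetical names, and the exact comparisons (strictly shorter than exonlen, strictly longer than intronlen, checked at either end of the chain) are assumptions; the documentation only states the default thresholds of 900 and 27.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for the library's Range: b > e means minus strand.
struct SimpleRange { int b; int e; };

int length(const SimpleRange &r) { return std::abs(r.e - r.b) + 1; }

// True if a terminal exon is short and its neighbouring intron is long.
bool looksDangling(const std::vector<SimpleRange> &exons,
                   int intronlen = 900, int exonlen = 27) {
    assert(intronlen >= 0 && exonlen >= 0);
    std::size_t n = exons.size();
    if (n < 2) return false; // a single exon has no end intron
    int firstIntron = std::abs(exons[1].b - exons[0].e) - 1;
    int lastIntron = std::abs(exons[n - 1].b - exons[n - 2].e) - 1;
    bool head = length(exons[0]) < exonlen && firstIntron > intronlen;
    bool tail = length(exons[n - 1]) < exonlen && lastIntron > intronlen;
    return head || tail;
}
```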

bool Noschain::directionAgree (  )  const

Simple check of whether the outer range direction agrees with the directions of all the exons. This is not required for a chain: a chain can have subranges of different directions, but in most biological objects, such as exons, they are in the same direction.

References Range::direction(), exons, numberOfRanges(), and show().

Referenced by Graphid::assemble(), and RNAModel::valid().


Friends And Related Function Documentation

Noschain operator+ ( int  c,
const Noschain &  chain 
) [friend]

Noschain operator- ( int  c,
const Noschain &  chain 
) [friend]

Each Range of the chain becomes (c-exons[i])

ostream& operator<< ( ostream &  ous,
const Noschain &  nc 
) [friend]

User-friendly, human-readable debug format. To make this work for derived classes, it must use the virtual function print().


Member Data Documentation

vector<Range> Noschain::exons [protected]


The documentation for this class was generated from the following files:

Generated on Wed Aug 10 11:57:13 2011 for Softwares from Orpara by  doxygen 1.5.6