[Mesquitelist] sampling and resampling

James Schulte jschulte at clarkson.edu
Tue Dec 9 08:11:46 PST 2008


Hello Mesquite Users,

I have a general inquiry as well as perhaps a specific request for
future versions of Mesquite.

I have a multigene dataset of 7 nuclear genes, all of which are
longer than 1200 bp and a few of which are between 2 and 3 kb.  I
would like to test alternative data-sampling strategies for large
projects based on multiple nuclear genes.  That is, is it better to
sequence many (>10) smaller gene fragments (under ~700 bp each) or
fewer (<10) larger gene fragments (over ~1200 bp each) to reconstruct
a robustly supported tree for a particular group?

What I would like to do is take my dataset of 7 genes and ~11500 bp
and rarefy it to make several new datasets composed of differently
sized rarefied "genes".  For example, I would like to make 100 new
datasets, each 10000 bp long and made up of twenty 500 bp
"fragments/gene regions" drawn from the larger dataset, or 100
datasets of 10 kb that are each made up of five 2000 bp
"fragments/gene regions".  At the moment, I can make a single set of
100 datasets with a "gene" of a particular size using the rarefy
option and batch architecture.  What I haven't been able to figure
out, and I'm not sure Mesquite can do this yet, is whether I can
easily concatenate those 100 datasets with 100 other datasets of
another "gene".
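
For illustration, the sampling and concatenation described above can
be sketched outside Mesquite in a few lines of Python.  This is only a
rough sketch under stated assumptions, not a Mesquite feature: the
alignment is assumed to be held as a dictionary mapping taxon name to
an aligned sequence string, fragments are drawn as contiguous windows
at random start positions (so they may overlap and may span gene
boundaries in a concatenated matrix), and the names sample_fragments,
make_replicates and concatenate_pairwise are hypothetical.

```python
import random


def sample_fragments(alignment, n_fragments, frag_len, rng):
    """Concatenate n_fragments random contiguous windows of frag_len columns."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    total = lengths.pop()
    if frag_len > total:
        raise ValueError("fragment is longer than the alignment")
    # Assumption: windows are drawn independently, so they may overlap.
    starts = [rng.randrange(total - frag_len + 1) for _ in range(n_fragments)]
    # The same columns are taken from every taxon to keep the alignment intact.
    return {
        taxon: "".join(seq[s:s + frag_len] for s in starts)
        for taxon, seq in alignment.items()
    }


def make_replicates(alignment, n_fragments, frag_len, n_reps=100, seed=0):
    """Build n_reps resampled datasets with a reproducible random seed."""
    rng = random.Random(seed)
    return [
        sample_fragments(alignment, n_fragments, frag_len, rng)
        for _ in range(n_reps)
    ]


def concatenate_pairwise(reps_a, reps_b):
    """Join replicate i of one set to replicate i of another, taxon by taxon."""
    if len(reps_a) != len(reps_b):
        raise ValueError("replicate sets must have the same size")
    return [
        {taxon: a[taxon] + b[taxon] for taxon in a}
        for a, b in zip(reps_a, reps_b)
    ]


# Toy stand-in for the real ~11500 bp matrix (random bases, made-up taxa).
toy_rng = random.Random(1)
toy = {
    name: "".join(toy_rng.choice("ACGT") for _ in range(11500))
    for name in ("taxonA", "taxonB", "taxonC")
}

many_small = make_replicates(toy, n_fragments=20, frag_len=500)   # 20 x 500 bp
few_large = make_replicates(toy, n_fragments=5, frag_len=2000)    # 5 x 2000 bp

# Concatenating two separately generated sets of 100 single-"gene" datasets.
gene_a = make_replicates(toy, n_fragments=1, frag_len=500, seed=1)
gene_b = make_replicates(toy, n_fragments=1, frag_len=500, seed=2)
combined = concatenate_pairwise(gene_a, gene_b)
```

Each element of many_small and few_large is one 10000 bp dataset;
writing them back out to NEXUS for analysis is left aside here.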

I hope this is making at least a little bit of sense and I very much
appreciate any advice anyone may have.

Best regards,
Jim


James A. Schulte, II
Department of Biology
177 Clarkson Science Center, MRC 5805
8 Clarkson Avenue
Clarkson University
Potsdam, NY  13699-5805
Phone:  315-268-4401
Fax:  315-268-7118


