[Mesquitelist] sampling and resampling
James Schulte
jschulte at clarkson.edu
Tue Dec 9 08:11:46 PST 2008
Hello Mesquite Users,
I have a general inquiry as well as perhaps a specific request for
future versions of Mesquite.
I have a multigene dataset with 7 nuclear genes, all of which are
over 1200 bp and a few of which are 2-3 kb long. I would like to
test alternative data sampling strategies for large projects based on
multiple nuclear genes. That is, is it better to sequence many (>10)
shorter gene fragments (under ~700 bp each) or fewer (<10) longer gene
fragments (over ~1200 bp each) to reconstruct a robustly supported
tree for a particular group?
To do this, I would like to take my dataset of 7 genes and ~11500
bp and rarefy it to make several new datasets composed of
differently sized rarefied "genes". For example, I would like to
make 100 new datasets of 10000 bp, each made up of twenty 500 bp
"fragments/gene regions" drawn from the larger dataset, or 100
datasets of 10 kb, each made up of five 2000 bp "fragments/gene
regions". At the moment, I can make a single set of 100 datasets with
a "gene" of a particular size using the rarefy option and the batch
architecture. What I haven't been able to figure out, and I'm not
sure Mesquite can do this yet, is whether I can easily concatenate
those 100 datasets with 100 other datasets of another "gene".
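To make the request concrete, the scheme I have in mind could be
sketched outside Mesquite in plain Python. This is only an
illustration, not a Mesquite feature: the function names
(draw_fragments, build_dataset, make_replicates), the alignment stored
as a dict of taxon name to aligned sequence, and the (start, end) gene
boundaries are all assumptions made for the example. Fragments are
drawn with replacement, so they may overlap, and each fragment is kept
inside a single gene.

```python
import random


def draw_fragments(gene_bounds, frag_len, n_frags, rng):
    """Draw n_frags column ranges of length frag_len, each within one gene.

    gene_bounds is a list of (start, end) column ranges, end exclusive.
    Fragments are sampled with replacement and may overlap (an assumption).
    """
    eligible = [(s, e) for s, e in gene_bounds if e - s >= frag_len]
    if not eligible:
        raise ValueError("no gene is long enough for this fragment length")
    fragments = []
    for _ in range(n_frags):
        s, e = rng.choice(eligible)
        start = rng.randint(s, e - frag_len)  # randint is inclusive at both ends
        fragments.append((start, start + frag_len))
    return fragments


def build_dataset(alignment, fragments):
    """Concatenate the same column ranges for every taxon."""
    return {
        taxon: "".join(seq[a:b] for a, b in fragments)
        for taxon, seq in alignment.items()
    }


def make_replicates(alignment, gene_bounds, frag_len, n_frags, n_reps, seed=0):
    """Build n_reps resampled datasets of n_frags * frag_len columns each."""
    rng = random.Random(seed)
    return [
        build_dataset(alignment, draw_fragments(gene_bounds, frag_len, n_frags, rng))
        for _ in range(n_reps)
    ]
```

With a real alignment loaded into such a dict, a call like
make_replicates(alignment, gene_bounds, 500, 20, 100) would correspond
to the "twenty 500 bp fragments" case, and (2000, 5, 100) to the "five
2000 bp fragments" case; the latter can only draw from genes of at
least 2000 bp.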
I hope this is making at least a little bit of sense and I very much
appreciate any advice anyone may have.
Best regards,
Jim
James A. Schulte, II
Department of Biology
177 Clarkson Science Center, MRC 5805
8 Clarkson Avenue
Clarkson University
Potsdam, NY 13699-5805
Phone: 315-268-4401
Fax: 315-268-7118