<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Tea leaves and population substructure</title>
	<atom:link href="http://blogs.discovermagazine.com/gnxp/2011/02/tea-leaves-and-population-substructure/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.discovermagazine.com/gnxp/2011/02/tea-leaves-and-population-substructure/</link>
	<description></description>
	<lastBuildDate>Fri, 24 May 2013 00:04:00 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4.2</generator>
	<item>
		<title>By: My genotyping results, plus a brief introduction to population genetics &#124; Opening Delinda&#039;s Box</title>
		<link>http://blogs.discovermagazine.com/gnxp/2011/02/tea-leaves-and-population-substructure/#comment-30840</link>
		<dc:creator>My genotyping results, plus a brief introduction to population genetics &#124; Opening Delinda&#039;s Box</dc:creator>
		<pubDate>Sat, 26 Feb 2011 20:57:16 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.discovermagazine.com/gnxp/?p=9999#comment-30840</guid>
		<description>[...] with this kind of analysis is that it is often hard to determine which K&#8217;s are meaningful. Razib has a good caveat on the limitations of ADMIXTURE here. Anyhow, my data is at the very bottom. As you can see, when K=2 I am overwhelmingly European with [...] </description>
		<content:encoded><![CDATA[<p>[...] with this kind of analysis is that it is often hard to determine which K&#8217;s are meaningful. Razib has a good caveat on the limitations of ADMIXTURE here. Anyhow, my data is at the very bottom. As you can see, when K=2 I am overwhelmingly European with [...] </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ohwilleke</title>
		<link>http://blogs.discovermagazine.com/gnxp/2011/02/tea-leaves-and-population-substructure/#comment-30839</link>
		<dc:creator>ohwilleke</dc:creator>
		<pubDate>Wed, 23 Feb 2011 00:57:14 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.discovermagazine.com/gnxp/?p=9999#comment-30839</guid>
		<description>It is a matter of using the right tools for the right purposes.  Admixture comes into the analysis with a preconceived bias based on an arbitrarily set K number and an admixture of multiple ancestral population eigenvector model.

When you know how many ancestral populations there are with some accuracy, the program fits the data to the model without computational agony for the user.  But, this program is not designed to figure out how many clusters there are in a set.

There are other statistical tools that simply look for clusters.  PCA combined with eyeballing the data is pretty good, although its limitation, which computer programs can overcome, is that people have a very hard time seeing more than three or four dimensions at once (motion and colors and 3D can get you to five), while computers can see in many dimensions at once.  When you have good reason to believe that a tree-like model is appropriate, there are some very good statistical computer programs that create tree-like clusters of data in a phylogenetic relationship.

Neither admixture nor cluster analysis tells you anything about how closely related the clusters are to each other in an absolute sense as opposed to relative to each other.  Tools like Fst measure that aspect.

There is also nothing wrong with going into statistical analysis with strong Bayesian priors about how you expect the data to come out IF YOUR PRIORS ARE ACCURATE.  A lot of the time in anthropology, your priors may actually be more accurate than your main data set.  You may know exactly how many ancestral populations there are and when they came along, but not what they looked like genetically.

Indeed, statistics are at their most powerful when you ask them simple questions.  For example, the statistics of hypothesis testing, where one compares a small number of possibilities for likelihood (e.g. did modern European populations derive predominantly from hunter-gatherer populations, predominantly from LBK agriculturalists or predominantly from some other source) can have much more power at resolving a question in a way that supersedes your biases about the choices than when you ask them open-ended questions without clear choices.

One problem with the statistics of dating divergence dates from genetic mutation rates is that the priors that are used to calibrate the dating aren&#039;t very good themselves.</description>
		<content:encoded><![CDATA[<p>It is a matter of using the right tools for the right purposes.  Admixture comes into the analysis with a preconceived bias based on an arbitrarily set K number and an admixture of multiple ancestral population eigenvector model.</p>
<p>When you know how many ancestral populations there are with some accuracy, the program fits the data to the model without computational agony for the user.  But, this program is not designed to figure out how many clusters there are in a set.</p>
<p>There are other statistical tools that simply look for clusters.  PCA combined with eyeballing the data is pretty good, although its limitation, which computer programs can overcome, is that people have a very hard time seeing more than three or four dimensions at once (motion and colors and 3D can get you to five), while computers can see in many dimensions at once.  When you have good reason to believe that a tree-like model is appropriate, there are some very good statistical computer programs that create tree-like clusters of data in a phylogenetic relationship.</p>
<p>Neither admixture nor cluster analysis tells you anything about how closely related the clusters are to each other in an absolute sense as opposed to relative to each other.  Tools like Fst measure that aspect.</p>
<p>There is also nothing wrong with going into statistical analysis with strong Bayesian priors about how you expect the data to come out IF YOUR PRIORS ARE ACCURATE.  A lot of the time in anthropology, your priors may actually be more accurate than your main data set.  You may know exactly how many ancestral populations there are and when they came along, but not what they looked like genetically.</p>
<p>Indeed, statistics are at their most powerful when you ask them simple questions.  For example, the statistics of hypothesis testing, where one compares a small number of possibilities for likelihood (e.g. did modern European populations derive predominantly from hunter-gatherer populations, predominantly from LBK agriculturalists or predominantly from some other source) can have much more power at resolving a question in a way that supersedes your biases about the choices than when you ask them open-ended questions without clear choices.</p>
<p>One problem with the statistics of dating divergence dates from genetic mutation rates is that the priors that are used to calibrate the dating aren&#8217;t very good themselves.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: John Roth</title>
		<link>http://blogs.discovermagazine.com/gnxp/2011/02/tea-leaves-and-population-substructure/#comment-30838</link>
		<dc:creator>John Roth</dc:creator>
		<pubDate>Wed, 23 Feb 2011 00:27:32 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.discovermagazine.com/gnxp/?p=9999#comment-30838</guid>
		<description>I have to agree with Peter Ellis. I find these plots difficult to compare at higher Ks, partly because the colors keep changing, partly because the assignment randomly flips top to bottom, and partly because of the apparently random occurrence of vertical white lines. The first two have been a problem with just about every series of these plots I&#039;ve seen, and tend to be extremely off-putting.

As far as the analysis is concerned, in the first series, the pattern of three distinguishable populations is fairly obvious in the first chart and continues in the others. The second group isn&#039;t quite as clear. If the higher Ks are trying to tell me anything, it&#039;s not at all obvious what that should be. On the other hand, I&#039;ve long been an advocate of the viewpoint that staring at anything for too long will show you patterns that simply aren&#039;t there.</description>
		<content:encoded><![CDATA[<p>I have to agree with Peter Ellis. I find these plots difficult to compare at higher Ks, partly because the colors keep changing, partly because the assignment randomly flips top to bottom, and partly because of the apparently random occurrence of vertical white lines. The first two have been a problem with just about every series of these plots I&#8217;ve seen, and tend to be extremely off-putting.</p>
<p>As far as the analysis is concerned, in the first series, the pattern of three distinguishable populations is fairly obvious in the first chart and continues in the others. The second group isn&#8217;t quite as clear. If the higher Ks are trying to tell me anything, it&#8217;s not at all obvious what that should be. On the other hand, I&#8217;ve long been an advocate of the viewpoint that staring at anything for too long will show you patterns that simply aren&#8217;t there.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Markk</title>
		<link>http://blogs.discovermagazine.com/gnxp/2011/02/tea-leaves-and-population-substructure/#comment-30837</link>
		<dc:creator>Markk</dc:creator>
		<pubDate>Tue, 22 Feb 2011 14:35:10 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.discovermagazine.com/gnxp/?p=9999#comment-30837</guid>
		<description>Andrew Gelman had an interesting post a while back about what people take away. Someone recommended an exercise for a statistics class where several graphs are presented with no description on them. The students are asked to write the labels and a description of what the graph told them. This could be a similar example. We all can take away different things.

In a larger sense isn&#039;t this why we have real statistical tests? One has to =beforehand= decide what they are looking for and come up with some number for a test where they could say &quot;I found it!&quot; or &quot;It isn&#039;t there&quot; or &quot;I can&#039;t tell&quot;. What you are doing by playing is absolutely necessary to get the &quot;some number&quot; I mentioned above. In the end, in most cases, that number and test are really based on playing around.</description>
		<content:encoded><![CDATA[<p>Andrew Gelman had an interesting post a while back about what people take away. Someone recommended an exercise for a statistics class where several graphs are presented with no description on them. The students are asked to write the labels and a description of what the graph told them. This could be a similar example. We all can take away different things.</p>
<p>In a larger sense isn&#8217;t this why we have real statistical tests? One has to =beforehand= decide what they are looking for and come up with some number for a test where they could say &#8220;I found it!&#8221; or &#8220;It isn&#8217;t there&#8221; or &#8220;I can&#8217;t tell&#8221;. What you are doing by playing is absolutely necessary to get the &#8220;some number&#8221; I mentioned above. In the end, in most cases, that number and test are really based on playing around.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: John Emerson</title>
		<link>http://blogs.discovermagazine.com/gnxp/2011/02/tea-leaves-and-population-substructure/#comment-30836</link>
		<dc:creator>John Emerson</dc:creator>
		<pubDate>Tue, 22 Feb 2011 13:41:23 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.discovermagazine.com/gnxp/?p=9999#comment-30836</guid>
		<description>I think that there&#039;s a warning in the recent collapse of macroeconomics only a few years after people were starting to claim that the basic problems had been solved. This was also a case when personal policy preferences contaminated the scientific process; economists denied that this criticism was valid, since all explicitly expressed aspects of these theories were neutral to policy.  It was like a game where you could make your biases into pure science if you could manifest them entirely in scientific, neutral language.

I was suspicious all along, but since I didn&#039;t have an inside understanding of the science my opinions were worthless. I did guess right though. There was a pattern whereby increasing mathematical virtuosity made the science intelligible to increasingly fewer people. A similar pattern was seen in the business world, where unemployed mathematical physicists were hired by finance to produce increasingly sophisticated financial instruments which no one could understand. It turns out that these instruments were booby-trapped, as we now see. It really happened twice, just recently and in 1998 with Long Term Capital Management, which had two Nobelists on the board of directors.

But it&#039;s not really that if political preferences were finally excluded econ would be science. Political preferences can&#039;t be excluded, after 75 years of trying, and it&#039;s always going to be political, since econ is an applied science.</description>
		<content:encoded><![CDATA[<p>I think that there&#8217;s a warning in the recent collapse of macroeconomics only a few years after people were starting to claim that the basic problems had been solved. This was also a case when personal policy preferences contaminated the scientific process; economists denied that this criticism was valid, since all explicitly expressed aspects of these theories were neutral to policy.  It was like a game where you could make your biases into pure science if you could manifest them entirely in scientific, neutral language.</p>
<p>I was suspicious all along, but since I didn&#8217;t have an inside understanding of the science my opinions were worthless. I did guess right though. There was a pattern whereby increasing mathematical virtuosity made the science intelligible to increasingly fewer people. A similar pattern was seen in the business world, where unemployed mathematical physicists were hired by finance to produce increasingly sophisticated financial instruments which no one could understand. It turns out that these instruments were booby-trapped, as we now see. It really happened twice, just recently and in 1998 with Long Term Capital Management, which had two Nobelists on the board of directors.</p>
<p>But it&#8217;s not really that if political preferences were finally excluded econ would be science. Political preferences can&#8217;t be excluded, after 75 years of trying, and it&#8217;s always going to be political, since econ is an applied science.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Peter Ellis</title>
		<link>http://blogs.discovermagazine.com/gnxp/2011/02/tea-leaves-and-population-substructure/#comment-30835</link>
		<dc:creator>Peter Ellis</dc:creator>
		<pubDate>Tue, 22 Feb 2011 13:34:31 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.discovermagazine.com/gnxp/?p=9999#comment-30835</guid>
		<description>I find these plots remarkably annoying to dig through because the colours keep changing.  In group A, for example, you have red/blue components in the k=2 plot.  In the k=4 plot, a component that&#039;s essentially identical to the previous &quot;red&quot; component is now coloured green, while a component essentially identical to the previous &quot;blue&quot; component is now red.  Reassigning the colours of the various components would make things a lot clearer.</description>
		<content:encoded><![CDATA[<p>I find these plots remarkably annoying to dig through because the colours keep changing.  In group A, for example, you have red/blue components in the k=2 plot.  In the k=4 plot, a component that&#8217;s essentially identical to the previous &#8220;red&#8221; component is now coloured green, while a component essentially identical to the previous &#8220;blue&#8221; component is now red.  Reassigning the colours of the various components would make things a lot clearer.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
