<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: You can&#039;t scoop someone else&#039;s brain</title>
	<atom:link href="http://blogs.discovermagazine.com/gnxp/2012/07/you-cant-scoop-someone-elses-brain/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.discovermagazine.com/gnxp/2012/07/you-cant-scoop-someone-elses-brain/</link>
	<description></description>
	<lastBuildDate>Fri, 24 May 2013 07:43:00 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4.2</generator>
	<item>
		<title>By: Chad</title>
		<link>http://blogs.discovermagazine.com/gnxp/2012/07/you-cant-scoop-someone-elses-brain/#comment-44297</link>
		<dc:creator>Chad</dc:creator>
		<pubDate>Tue, 24 Jul 2012 02:36:32 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.discovermagazine.com/gnxp/?p=17441#comment-44297</guid>
		<description>Your comments on &quot;trust&quot; and the liberation of genomic data sets remind me of one of my pet peeves when reviewing any paper for publication (Data Not Shown).

This instantly makes me suspicious that the authors are trying to pull a fast one on me. It&#039;s also interesting what often goes unshown: for instance, a qPCR validation of a microarray experiment. Why are they unable to take 10 minutes to make a bar graph or Excel sheet to slip into the paper as supplemental data? Or, just recently, a paper using RNA-seq data: the authors claim a certain number of differentially expressed genes, but then present only the subset that supports their hypothesis. Meanwhile, they also make brief reference to certain classes of genes, but do not show the data...

While I&#039;m on this rant (forgive me), let me add another pet peeve: vagueness in the methodology, in particular vagueness in the informatics. I cannot tell you how many sequencing papers I have read that briefly say &quot;reads were mapped with bowtie/tophat/bwa/etc&quot; and that&#039;s it. No mention is made of the parameters used, any of which could alter the data in subtle ways. Unfortunately, a lot of bench scientists venturing into informatics simply cut out these details, considering them minor. Even the bioinformaticians are guilty of it at times. How could I reproduce their results or catch a mistake without knowing the parameters and commands used in the analysis? At least the statistical genetics papers actually have equations, so you know (if you understand them) how they are calculating their results.

Fortunately, a lot of journals now require that any new genomics data be deposited in public databases (GEO, SRA, etc.), which is a good first step, but it&#039;s not enough. Typically this means the researchers dump just the raw data, which takes days, even weeks, to reanalyze. And of course, if you do not know the exact informatics methods used, the results can be hard to reproduce exactly (almost impossible if they don&#039;t tell you the version numbers of the programs).

The next step in openness will be not only making the data publicly available, but also setting accepted guidelines for the reporting of results and methodologies so that they are clear and open to everyone.</description>
		<content:encoded><![CDATA[<p>Your comments on &#8220;trust&#8221; and the liberation of genomic data sets remind me of one of my pet peeves when reviewing any paper for publication (Data Not Shown).</p>
<p>This instantly makes me suspicious that the authors are trying to pull a fast one on me. It&#8217;s also interesting what often goes unshown: for instance, a qPCR validation of a microarray experiment. Why are they unable to take 10 minutes to make a bar graph or Excel sheet to slip into the paper as supplemental data? Or, just recently, a paper using RNA-seq data: the authors claim a certain number of differentially expressed genes, but then present only the subset that supports their hypothesis. Meanwhile, they also make brief reference to certain classes of genes, but do not show the data&#8230;</p>
<p>While I&#8217;m on this rant (forgive me), let me add another pet peeve: vagueness in the methodology, in particular vagueness in the informatics. I cannot tell you how many sequencing papers I have read that briefly say &#8220;reads were mapped with bowtie/tophat/bwa/etc&#8221; and that&#8217;s it. No mention is made of the parameters used, any of which could alter the data in subtle ways. Unfortunately, a lot of bench scientists venturing into informatics simply cut out these details, considering them minor. Even the bioinformaticians are guilty of it at times. How could I reproduce their results or catch a mistake without knowing the parameters and commands used in the analysis? At least the statistical genetics papers actually have equations, so you know (if you understand them) how they are calculating their results.</p>
<p>Fortunately, a lot of journals now require that any new genomics data be deposited in public databases (GEO, SRA, etc.), which is a good first step, but it&#8217;s not enough. Typically this means the researchers dump just the raw data, which takes days, even weeks, to reanalyze. And of course, if you do not know the exact informatics methods used, the results can be hard to reproduce exactly (almost impossible if they don&#8217;t tell you the version numbers of the programs).</p>
<p>The next step in openness will be not only making the data publicly available, but also setting accepted guidelines for the reporting of results and methodologies so that they are clear and open to everyone.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
