List:       bioconductor
Subject:    [BioC] FW: [BiO News] Gene name errors introduced by Excel
From:       "Lapointe, David" <David.Lapointe () umassmed ! edu>
Date:       2004-07-30 14:11:10
Message-ID: D07C647C65ED9D4B81D0343D08A26AC90332DFCE () edunivmail01 ! ad ! umassmed ! edu



-----Original Message-----
From: News@bioinformatics.org [mailto:News@bioinformatics.org] 
Sent: Thursday, July 29, 2004 2:34 PM
To: Lapointe, David
Subject: [BiO News] Gene name errors introduced by Excel


A research article at BMC Bioinformatics:

BACKGROUND:
When processing microarray data sets, we recently noticed that some gene
names were being changed inadvertently to non-gene names.

RESULTS:
A little detective work traced the problem to default date format
conversions and floating-point format conversions in the very useful
Excel program package. The date conversions affect at least 30 gene
names; the floating-point conversions affect at least 2,000 if Riken
identifiers are included. These conversions are irreversible; the
original gene names cannot be recovered.
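
As a rough illustration of the two conversion classes described above, the
sketch below flags identifiers that Excel is likely to reinterpret before a
gene list is opened in a spreadsheet. It is not one of the scripts provided
with the article. The function name `excel_risk`, the month-prefix list, and
both patterns are assumptions made for this example, and they probably do not
cover every affected identifier.

```python
import re

# Month-like prefixes that Excel may read as the start of a date.
# This list is an assumption for illustration, not taken from the article.
MONTH_PREFIXES = (
    "MARCH", "SEPT", "JAN", "FEB", "MAR", "APR", "MAY", "JUN",
    "JUL", "AUG", "SEP", "OCT", "NOV", "DEC",
)

# A month-like prefix followed by a one- or two-digit number, e.g. SEPT2 or DEC1.
DATE_LIKE = re.compile(
    r"^(?:%s)-?\d{1,2}$" % "|".join(MONTH_PREFIXES), re.IGNORECASE
)

# Digits, a single E, then digits, e.g. the Riken-style identifier 2310009E13,
# which a spreadsheet can read as a number in scientific notation.
FLOAT_LIKE = re.compile(r"^\d+E\d+$", re.IGNORECASE)


def excel_risk(identifier):
    """Return 'date', 'float', or None for a single identifier string."""
    name = identifier.strip()
    if DATE_LIKE.match(name):
        return "date"
    if FLOAT_LIKE.match(name):
        return "float"
    return None


if __name__ == "__main__":
    for name in ["SEPT2", "DEC1", "2310009E13", "TP53"]:
        print(name, excel_risk(name))
```

Running a check like this over a gene list first shows which entries need
protecting, for example by importing the affected column as text rather than
in the default general format.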

CONCLUSIONS:
Users of Excel for analyses involving gene names should be aware of this
problem, which can cause genes, including medically important ones, to
be lost from view and which has contaminated even carefully curated
public databases. We provide work-arounds and scripts for circumventing
the problem.

URL:
http://www.biomedcentral.com/1471-2105/5/80


--
Comments?  Post your replies to
http://bioinformatics.org/forums/forum.php?forum_id=2763

