proper use of GOSemSim
One day I was looking for R packages that can analyze protein-protein interactions (PPI), and after some searching I found the ppiPre package on CRAN.
The functionality of this package is not impressive, and I already knew of some related work, including http://intscore.molgen.mpg.de/. The authors of that web server contacted me about the usage of GOSemSim while they were developing it.
What made me curious is that ppiPre can calculate GO semantic similarity and supports 20 species, exactly like GOSemSim. I opened the source tarball and was surprised to find that its code for semantic similarity calculation was copied wholesale from GOSemSim.
GOSemSim was first released in 2008 in Bioconductor 2.4 (the devel version at that time) and was published in Bioinformatics in 2010. After comparing the sources, I found that the code in ppiPre was copied from GOSemSim version 1.6.8, which was released in 2010 with Bioconductor 2.6.
The Wang method defined in the GOKEGGSims.r file of ppiPre is:
119 WangMethod
It is identical to the one I defined in GOSemSim 1.6.8:
196 ygcWangMethod
The information content based method in ppiPre:
495 GetLatestCommonAncestor
It is also identical to the one in GOSemSim 1.6.8:
280 `ygcInfoContentMethod`
Let’s look at some helper functions in ppiPre:
477 rebuildICdata
Again, it is identical to GOSemSim 1.6.8:
390 rebuildICdata
Let’s look at the internal function TCSSComputeIC in ppiPre:
410 TCSSComputeIC
and ygcCompute_Information_Content in GOSemSim 1.6.8:
326 ygcCompute_Information_Content
Another helper function GetGOMap in ppiPre:
308 GetGOMap
My ygcGetGOMap in GOSemSim 1.6.8:
100 ygcGetGOMap
There are many other small helper functions that are identical. ppiPre copies most of the source code of GOSemSim. There are 862 lines in GOKEGGSims.r, and only the following function, which deals with KEGG, is unrelated to GOSemSim.
10 KEGGSim
This function is only 12 lines long, and it calculates the similarity by dividing the intersection by the total sum. The other lines in GOKEGGSims.r, more than 800 of them, were copied from GOSemSim. The other source files in ppiPre have fewer than 450 lines in total. So about 2/3 of ppiPre was copied from GOSemSim.
The authors of ppiPre changed the function names and pretended the code was their original work. They just copied and pasted, and took credit for months of GOSemSim development. This really sucks.
After I found this issue, I added a "proper use of GOSemSim" statement to its GitHub page:
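To give a sense of how small that kind of calculation is, here is a minimal sketch of an overlap score between two sets of pathway IDs. It is not the ppiPre code: the function name `overlap_similarity` is hypothetical, and I am assuming the "total sum" is the size of the union of the two sets (the Jaccard index).

```r
# Hypothetical helper, not taken from ppiPre or GOSemSim.
# Assumption: the "total" in the denominator is the union of both sets.
overlap_similarity <- function(pathways1, pathways2) {
  shared <- intersect(pathways1, pathways2)  # IDs present in both sets
  total  <- union(pathways1, pathways2)      # distinct IDs across both sets
  if (length(total) == 0) {
    return(NA_real_)                         # undefined when both sets are empty
  }
  length(shared) / length(total)
}

# One shared pathway out of three distinct ones gives 1/3.
overlap_similarity(c("hsa04110", "hsa04115"), c("hsa04115", "hsa04210"))
```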
I am very glad that many people find GOSemSim useful and that GOSemSim has been cited 114 times (according to Google Scholar, Aug 2014). Two R packages, BiSEp and tRanslatome, depend on GOSemSim, and three R packages, clusterProfiler, DOSE and Rcpi, import GOSemSim.
The SemDist package copies some of the source code from GOSemSim, with acknowledgement in its source code and documentation. The ppiPre package copies much of the source code from GOSemSim without any acknowledgement in its source code or documentation, and its authors did not cite GOSemSim in their publication. This violates the terms of the open source license.
For R developers: if you find the functions provided by GOSemSim useful, please depend on or import GOSemSim. If you would like to copy and paste source code, you should acknowledge within the source code that it was copied or derived from GOSemSim, authored by Guangchuang Yu [email protected], add GOSemSim to the Suggests field, cite it in your vignettes, and include the following reference in the man files of the functions that were copied or derived from GOSemSim:
\references{
  Yu et al. (2010) GOSemSim: an R package for measuring semantic similarity among GO terms and gene products
  \emph{Bioinformatics} (Oxford, England), 26:7 976--978, April 2010. ISSN 1367-4803
  \url{http://bioinformatics.oxfordjournals.org/cgi/content/abstract/26/7/976}
  PMID: 20179076
}
You are welcome to use GOSemSim in any way you like, but please cite it and give it proper credit. I hope you can understand.
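As a rough illustration of the "depend on or import" route, a downstream package could wrap GOSemSim instead of copying it. This is only a sketch: the wrapper name `my_go_similarity` is hypothetical, it assumes the package lists GOSemSim under Imports in its DESCRIPTION file and uses roxygen2 for documentation, and it passes its arguments straight through because the arguments of `goSim` differ between GOSemSim versions.

```r
#' GO term similarity, delegated to GOSemSim
#'
#' Hypothetical thin wrapper: all of the computation is done by GOSemSim,
#' so credit is given in the documentation instead of copying the code.
#'
#' @param ... Arguments passed on unchanged to GOSemSim's goSim().
#' @references Yu et al. (2010) GOSemSim: an R package for measuring
#'   semantic similarity among GO terms and gene products.
#'   \emph{Bioinformatics}, 26:7 976--978. PMID: 20179076
#' @export
my_go_similarity <- function(...) {
  # Requires GOSemSim to be installed and declared in DESCRIPTION (Imports).
  GOSemSim::goSim(...)
}
```

With this layout the generated man page carries the reference, and users always get the current GOSemSim implementation.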