[Rd] [RFC] A case for freezing CRAN
Hervé Pagès
hpages at fhcrc.org
Fri Mar 21 00:28:15 CET 2014
On 03/20/2014 03:29 PM, Uwe Ligges wrote:
>
>
> On 20.03.2014 23:23, Hervé Pagès wrote:
>>
>>
>> On 03/20/2014 01:28 PM, Ted Byers wrote:
>>> On Thu, Mar 20, 2014 at 3:14 PM, Hervé Pagès <hpages at fhcrc.org> wrote:
>>>
>>> On 03/20/2014 03:52 AM, Duncan Murdoch wrote:
>>>
>>> On 14-03-20 2:15 AM, Dan Tenenbaum wrote:
>>>
>>>
>>>
>>> ----- Original Message -----
>>>
>>> From: "David Winsemius" <dwinsemius at comcast.net>
>>> To: "Jeroen Ooms" <jeroen.ooms at stat.ucla.edu>
>>> Cc: "r-devel" <r-devel at r-project.org>
>>> Sent: Wednesday, March 19, 2014 11:03:32 PM
>>> Subject: Re: [Rd] [RFC] A case for freezing CRAN
>>>
>>>
>>> On Mar 19, 2014, at 7:45 PM, Jeroen Ooms wrote:
>>>
>>> On Wed, Mar 19, 2014 at 6:55 PM, Michael Weylandt
>>> <michael.weylandt at gmail.com> wrote:
>>>
>>> Reading this thread again, is it a fair summary of your position to
>>> say "reproducibility by default is more important than giving users
>>> access to the newest bug fixes and features by default?" It's
>>> certainly arguable, but I'm not sure I'm convinced: I'd imagine that
>>> the ratio of new work being done vs reproductions is rather high and
>>> the current setup optimizes for that already.
>>>
>>>
>>> I think that separating development from released branches can give
>>> us both reliability/reproducibility (stable branch) as well as new
>>> features (unstable branch). The user gets to pick (and you can pick
>>> both!). The same is true for r-base: when using a 'released' version
>>> you get 'stable' base packages that are up to 12 months old. If you
>>> want to have the latest stuff you download a nightly build of
>>> r-devel. For regular users and reproducible research it is
>>> recommended to use the stable branch. However if you are a developer
>>> (e.g. package author) you might want to develop/test/check your work
>>> with the latest r-devel.
>>>
>>> I think that extending the R release cycle to CRAN would result both
>>> in more stable released versions of R, as well as more freedom for
>>> package authors to implement rigorous changes in the unstable branch.
>>> When writing a script that is part of a production pipeline, or a
>>> Sweave paper that should be reproducible 10 years from now, or a book
>>> on using R, you use the stable version of R, which is guaranteed to
>>> behave the same over time. However when developing packages that
>>> should be compatible with the upcoming release of R, you use r-devel,
>>> which has the latest versions of other CRAN and base packages.
>>>
>>>
>>>
>>> As I remember ... The example demonstrating the need for this was an
>>> XML package that caused an extract from a website where the headers
>>> were misinterpreted as data in one version of pkg:XML and not in
>>> another. That seems fairly unconvincing. Data cleaning and
>>> validation is a basic task of data analysis. It also seems excessive
>>> to assert that it is the responsibility of CRAN to maintain a synced
>>> binary archive that will be available in ten years.
>>>
>>>
>>>
>>> CRAN already does this: the bin/windows/contrib directory has
>>> subdirectories going back to 1.7, with packages dated October 2004.
>>> I don't see why it is burdensome to continue to archive these. It
>>> would be nice if source versions had a similar archive.
>>>
>>>
>>> The bin/windows/contrib directories are updated every day for active
>>> R versions. It's only when Uwe decides that a version is no longer
>>> worth active support that he stops doing updates, and it "freezes".
>>> A consequence of this is that the snapshots preserved in those older
>>> directories are unlikely to match what someone who keeps up to date
>>> with R releases is using. Their purpose is to make sure that those
>>> older versions aren't completely useless, but they aren't what
>>> Jeroen was asking for.
>>>
>>>
>>> But it is almost completely useless from a reproducibility point of
>>> view to get random package versions. For example if some people try
>>> to use R-2.13.2 today to reproduce an analysis that was published
>>> 2 years ago, they'll get Matrix 1.0-4 on Windows, Matrix 1.0-3 on
>>> Mac, and Matrix 1.1-2-2 on Unix.
>
> Not true, since Matrix 1.1-2-2 has
>
> Depends: R (>= 2.15.2)
OK. So that means Matrix is not available today for R-2.13.2 users on
Linux:
> "Matrix" %in% rownames(available.packages()[ , ])
[1] FALSE
However since Matrix is a recommended package, it's included in
the official R-2.13.2 source tarball so it gets installed when I
install R:
> installed.packages()["Matrix", "Version", drop=FALSE]
Version
Matrix "0.9996875-3"
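The gatekeeping Uwe points to is just the Depends: R (>= x.y.z) field in a
package's DESCRIPTION file. A minimal sketch in R of that compatibility
check (the function name is illustrative, not an existing API, and only the
common ">=" form is handled):

```r
## Sketch: does a package's declared R dependency (e.g. "R (>= 2.15.2)")
## allow the running R version? Only the ">=" form is handled here.
r_dep_ok <- function(dep_string, r_version = getRversion()) {
  ## extract the version number, e.g. "2.15.2"
  m <- regmatches(dep_string,
                  regexpr("[0-9]+(\\.[0-9]+)+(-[0-9]+)?", dep_string))
  if (length(m) == 0) return(TRUE)  # no versioned R dependency declared
  r_version >= package_version(m)
}

r_dep_ok("R (>= 2.15.2)", package_version("2.13.2"))  # FALSE: don't serve it
r_dep_ok("R (>= 2.15.2)", package_version("3.0.3"))   # TRUE
```

This is exactly the filtering that makes Matrix 1.1-2-2 invisible to an
R-2.13.2 user, and it only works when the maintainer declares the
dependency.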
As I mentioned earlier, the Matrix package was just an example. In the
case of a non-recommended package, it will either be:
- unavailable by default (if the source package was removed or if
the package maintainer consciously used the R >= x.y.z feature,
e.g. the ape package),
- or available but incompatible (e.g. bcrm is broken with R-2.13.2
on Linux),
- or available and compatible, but with a very different version
than the version that was available 2 years ago (e.g. BaSTA),
- or available and at the exact same version as 2 years ago (bingo!)
This is a very painful experience for anybody trying to install and
use R-2.13.2 today to reproduce 2-year-old results. Things could be
improved a lot with very small changes.
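For what it's worth, the exact historical versions do survive in the CRAN
source archive, so they can at least be installed by hand. A sketch, using
CRAN's actual Archive URL layout (Matrix 1.0-1 being the version cited in
this thread as current at publication time):

```r
## Sketch: building the URL of a historical source tarball in the CRAN
## archive, which keeps every released version of every package.
archive_url <- function(pkg, version,
                        cran = "https://cran.r-project.org") {
  sprintf("%s/src/contrib/Archive/%s/%s_%s.tar.gz", cran, pkg, pkg, version)
}

archive_url("Matrix", "1.0-1")
## install.packages(archive_url("Matrix", "1.0-1"),
##                  repos = NULL, type = "source")
```

Manual per-package installs like this are the workaround today; the point
of the proposal is that nobody should have to do this by hand.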
Cheers,
H.
> sessionInfo()
R version 2.13.2 (2011-09-30)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_2.13.2
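A sessionInfo() transcript like the one above is exactly the state worth
capturing alongside published results; paired with the CRAN archive it
makes an analysis re-installable. A minimal sketch (the function name is
illustrative, not an existing API):

```r
## Sketch: recording the package versions in use at analysis time so the
## exact set can be looked up in the CRAN archive years later.
snapshot_versions <- function() {
  ip <- installed.packages()[, "Version"]
  data.frame(Package = names(ip), Version = unname(ip),
             stringsAsFactors = FALSE)
}

## write.csv(snapshot_versions(), "pkg-versions.csv", row.names = FALSE)
```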
>
>
> Best,
> Uwe Ligges
>
>
>>> And none of them of course is what was used by the authors of the
>>> paper (they used Matrix 1.0-1, which is what was current when they
>>> ran their analysis).
>>>
>>> Initially this discussion brought back nightmares of DLL hell on
>>> Windows. Those as ancient as I will remember that well. But now, the
>>> focus seems to be on reproducibility, but with what strikes me as a
>>> seriously flawed notion of what reproducibility means.
>>>
>>> Herve Pages mentions the risk of irreproducibility across three minor
>>> revisions of version 1.0 of Matrix.
>>
>> If you use R-2.13.2, you get Matrix 1.1-2-2 on Linux. AFAIK this is
>> the most recent version of Matrix, aimed to be compatible with the
>> most current version of R (i.e. R 3.0.3). However, it has never been
>> tested with R-2.13.2. I'm not saying that it should be; that would be
>> a big waste of resources of course. All I'm saying is that it doesn't
>> make sense to serve by default a version that is known to be
>> incompatible with the version of R being used. It's very likely to
>> not even install properly.
>>
>> For the apparently small differences between the versions you get on
>> Windows and Mac, the Matrix package was just an example. With other
>> packages you get (again if you use R-2.13.2):
>>
>>              src       win    mac
>> abc          1.8       1.5    1.4
>> ape          3.1-1     3.0-1  2.8
>> BaSTA        1.9.3     1.1    1.0
>> bcrm         0.4.3     0.2    0.1
>> BMA          3.16.2.3  3.15   3.14.1
>> Boruta       3.0.0     1.6    1.5
>> ...
>>
>> Are the differences big enough?
>>
>> Also note that back in October 2011, people using R-2.13.2 would get
>> e.g. ape 2.7-3 on Linux, Windows and Mac. Wouldn't it make sense that
>> people using R-2.13.2 today get the same? Why would anybody use
>> R-2.13.2 today if it's not to run again some code that was written
>> and used two years ago to obtain some important results?
>>
>> Cheers,
>> H.
>>
>>
>>> My gut reaction would be that if
>>> the results are not reproducible across such minor revisions of one
>>> library, they are probably just so much BS. I am trained in
>>> mathematical ecology, with more than a couple decades of post-doc
>>> experience working with risk assessment in the private sector. When I
>>> need to do an analysis, I will repeat it myself in multiple products, as
>>> well as C++ or FORTRAN code I have hand-crafted myself (and when I wrote
>>> number crunching code myself, I would do so in multiple programming
>>> languages - C++, Java, FORTRAN, applying rigorous QA procedures to each
>>> program/library I developed). Back when I was a grad student, I would
>>> not even show the results to my supervisor, let alone try to publish
>>> them, unless the results were reproducible across ALL the tools I used.
>>> If there was a discrepancy, I would debug that before discussing them
>>> with anyone. Surely, it is the responsibility of the journals' editors
>>> and reviewers to apply a similar practice.
>>>
>>> The concept of reproducibility used to this point in this discussion
>>> might be adequate from a programmer's perspective (except in my lab),
>>> but it is wholly inadequate from a scientist's perspective. I
>>> maintain that if
>>> you have the original data, and repeat the analysis using the latest
>>> version of R and the available, relevant packages, the original results
>>> are probably due to a bug either in the R script or in R or the packages
>>> used IF the results obtained using the latest versions of these are not
>>> consistent with the originally reported results. Therefore, of the
>>> concerns I see raised in this discussion, the principal one is that
>>> package developers fail to pay sufficient attention to
>>> backwards compatibility: a new version ought not break any code that
>>> executes fine using previous versions. That is not a trivial task, and
>>> may require contributors obtaining the assistance of a software
>>> engineer. I am sure anyone in this list who programs in C++ knows how
>>> the ANSI committees handle change management. Introduction of new
>>> features is something that is largely irrelevant for backwards
>>> compatibility (but there are exceptions), but features to be removed
>>> are handled by declaring them deprecated, and leaving them in that
>>> condition for years. That tells anyone using the language that they
>>> ought to plan to adapt their code to work when the deprecated feature is
>>> finally removed.
>>>
>>> I am responsible for maintaining code (involving distributed computing)
>>> to which many companies integrate their systems, and I am careful to
>>> ensure that no change I make breaks their integration into my system,
>>> even though I often have to add new features. And I don't add features
>>> lightly, and have yet to remove features. When that eventually happens,
>>> the old feature will be deprecated, so that the other companies have
>>> plenty of time to adapt their integration code. I do not know whether
>>> CRAN ought to have any responsibility for this sort of change
>>> management, or if they have assumed some responsibility for some of it,
>>> but I would argue that the package developers have the primary
>>> responsibility for doing this right.
>>>
>>> Just my $0.05 (the penny no longer exists in Canada)
>>>
>>> Cheers
>>>
>>> Ted
>>> R.E. (Ted) Byers, Ph.D., Ed.D.
>>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319