[Rd] [RFC] A case for freezing CRAN

Hervé Pagès hpages at fhcrc.org
Fri Mar 21 00:28:15 CET 2014


On 03/20/2014 03:29 PM, Uwe Ligges wrote:
>
>
> On 20.03.2014 23:23, Hervé Pagès wrote:
>>
>>
>> On 03/20/2014 01:28 PM, Ted Byers wrote:
>>> On Thu, Mar 20, 2014 at 3:14 PM, Hervé Pagès <hpages at fhcrc.org> wrote:
>>>
>>>     On 03/20/2014 03:52 AM, Duncan Murdoch wrote:
>>>
>>>         On 14-03-20 2:15 AM, Dan Tenenbaum wrote:
>>>
>>>
>>>
>>>             ----- Original Message -----
>>>
>>>                 From: "David Winsemius" <dwinsemius at comcast.net>
>>>                 To: "Jeroen Ooms" <jeroen.ooms at stat.ucla.edu>
>>>                 Cc: "r-devel" <r-devel at r-project.org>
>>>                 Sent: Wednesday, March 19, 2014 11:03:32 PM
>>>                 Subject: Re: [Rd] [RFC] A case for freezing CRAN
>>>
>>>
>>>                 On Mar 19, 2014, at 7:45 PM, Jeroen Ooms wrote:
>>>
>>>                     On Wed, Mar 19, 2014 at 6:55 PM, Michael Weylandt
>>>                     <michael.weylandt at gmail.com> wrote:
>>>
>>>                         Reading this thread again, is it a fair summary
>>>                         of your position to say "reproducibility by
>>>                         default is more important than giving users
>>>                         access to the newest bug fixes and features by
>>>                         default?" It's certainly arguable, but I'm not
>>>                         sure I'm convinced: I'd imagine that the ratio of
>>>                         new work being done vs reproductions is rather
>>>                         high and the current setup optimizes for that
>>>                         already.
>>>
>>>
>>>                     I think that separating development from released
>>>                     branches can give us both reliability/reproducibility
>>>                     (stable branch) as well as new features (unstable
>>>                     branch). The user gets to pick (and you can pick
>>>                     both!). The same is true for r-base: when using a
>>>                     'released' version you get 'stable' base packages
>>>                     that are up to 12 months old. If you want to have the
>>>                     latest stuff you download a nightly build of r-devel.
>>>                     For regular users and reproducible research it is
>>>                     recommended to use the stable branch. However if you
>>>                     are a developer (e.g. package author) you might want
>>>                     to develop/test/check your work with the latest
>>>                     r-devel.
>>>
>>>                     I think that extending the R release cycle to CRAN
>>>                     would result both in more stable released versions of
>>>                     R, as well as more freedom for package authors to
>>>                     implement rigorous change in the unstable branch.
>>>                     When writing a script that is part of a production
>>>                     pipeline, or a sweave paper that should be
>>>                     reproducible 10 years from now, or a book on using R,
>>>                     you use a stable version of R, which is guaranteed to
>>>                     behave the same over time. However when developing
>>>                     packages that should be compatible with the upcoming
>>>                     release of R, you use r-devel which has the latest
>>>                     versions of other CRAN and base packages.
>>>
>>>
>>>
>>>                 As I remember ... The example demonstrating the need for
>>>                 this was an XML package that caused an extract from a
>>>                 website where the headers were misinterpreted as data in
>>>                 one version of pkg:XML and not in another. That seems
>>>                 fairly unconvincing. Data cleaning and validation is a
>>>                 basic task of data analysis. It also seems excessive to
>>>                 assert that it is the responsibility of CRAN to maintain
>>>                 a synced binary archive that will be available in ten
>>>                 years.
>>>
>>>
>>>
>>>             CRAN already does this, the bin/windows/contrib directory has
>>>             subdirectories going back to 1.7, with packages dated
>>>             October 2004. I don't see why it is burdensome to continue
>>>             to archive these. It would be nice if source versions had a
>>>             similar archive.
>>>
>>>
>>>         The bin/windows/contrib directories are updated every day for
>>>         active R versions.  It's only when Uwe decides that a version is
>>>         no longer worth active support that he stops doing updates, and
>>>         it "freezes".  A consequence of this is that the snapshots
>>>         preserved in those older directories are unlikely to match what
>>>         someone who keeps up to date with R releases is using.  Their
>>>         purpose is to make sure that those older versions aren't
>>>         completely useless, but they aren't what Jeroen was asking for.
>>>
>>>
>>>     But it is almost completely useless from a reproducibility point of
>>>     view to get random package versions. For example if some people try
>>>     to use R-2.13.2 today to reproduce an analysis that was published
>>>     2 years ago, they'll get Matrix 1.0-4 on Windows, Matrix 1.0-3 on
>>>     Mac, and Matrix 1.1-2-2 on Unix.
>
> Not true, since Matrix 1.1-2-2 has
>
> Depends:     R (≥ 2.15.2)

OK. So that means Matrix is not available today for R-2.13.2 users on
Linux:

   > "Matrix" %in% rownames(available.packages()[ , ])
   [1] FALSE
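
A note on why this returns FALSE: as far as I know, available.packages()
applies an "R_version" filter by default, so a package whose DESCRIPTION
declares a newer R than the one running is dropped from the listing. A
minimal sketch of that kind of check follows. The helper name
r_dep_satisfied() is hypothetical (it is not part of R), and it only
understands the "R (>= x.y.z)" form of the Depends field:

```r
# Hypothetical helper, for illustration only (not part of R or CRAN).
# Returns TRUE if a Depends field admits the given R version.
# Only the "R (>= x.y.z)" form is handled; anything else is treated as
# "no constraint on R".
r_dep_satisfied <- function(depends, r_version) {
  # "R" must start the field or follow a comma/space, so that a package
  # name merely ending in "R" is not mistaken for the R constraint.
  pattern <- "(^|[,[:space:]])R[[:space:]]*\\(>=[[:space:]]*([0-9.]+)\\)"
  m <- regmatches(depends, regexec(pattern, depends))[[1]]
  if (length(m) == 0L) return(TRUE)   # no R constraint found
  # compareVersion() returns -1, 0 or 1; m[3] is the required version
  utils::compareVersion(r_version, m[3]) >= 0
}

r_dep_satisfied("R (>= 2.15.2)", "2.13.2")   # FALSE
r_dep_satisfied("R (>= 2.15.2)", "3.0.3")    # TRUE
```

With the constraint quoted above for Matrix 1.1-2-2, the check fails for
R-2.13.2, which is consistent with Matrix not being listed.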

However since Matrix is a recommended package, it's included in
the official R-2.13.2 source tarball so it gets installed when I
install R:

   > installed.packages()["Matrix", "Version", drop=FALSE]
          Version
   Matrix "0.9996875-3"

As I mentioned earlier, the Matrix package was just an example. In the
case of a non-recommended package, it will either be:
   - unavailable by default (if the source package was removed or if
     the package maintainer consciously used the R >= x.y.z feature,
     e.g. the ape package),
   - or available but incompatible (e.g. bcrm is broken with R-2.13.2
     on Linux),
   - or available and compatible, but with a very different version
     than the version that was available 2 years ago (e.g. BaSTA),
   - or available and at the exact same version as 2 years ago (bingo!)
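
The four outcomes above can be sketched as a small decision function.
This is only an illustration under stated assumptions: classify_pkg()
and its arguments are hypothetical names, the caller has to supply the
version served today, the version that was current at the time of the
analysis, and whether the served version actually installs and loads
(that last part cannot be read off the repository index):

```r
# Hypothetical helper, not part of R or any CRAN tooling.
# available: version served today for this R version, or NA if none
# original:  version that was current when the analysis was run
# works:     TRUE/FALSE, result of actually installing and loading the
#            served version (determined by the caller)
classify_pkg <- function(available, original, works = TRUE) {
  if (is.na(available)) return("unavailable")
  if (!works) return("available but incompatible")
  if (utils::compareVersion(available, original) != 0)
    return("available, different version")
  "available, same version"
}

classify_pkg(NA, "2.7-3")        # "unavailable"
classify_pkg("3.0-1", "2.7-3")   # "available, different version"
classify_pkg("2.7-3", "2.7-3")   # "available, same version"
```

The version strings in the calls are the ape versions quoted elsewhere in
this thread and serve only as sample inputs.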

This is a very painful experience for anybody trying to install and
use R-2.13.2 today to reproduce 2-year-old results. Things could be
improved a lot with very few changes.

Cheers,
H.

   > sessionInfo()
   R version 2.13.2 (2011-09-30)
   Platform: x86_64-unknown-linux-gnu (64-bit)

   locale:
    [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
    [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
    [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
    [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
    [9] LC_ADDRESS=C               LC_TELEPHONE=C
   [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

   attached base packages:
   [1] stats     graphics  grDevices utils     datasets  methods   base

   loaded via a namespace (and not attached):
   [1] tools_2.13.2

>
>
> Best,
> Uwe Ligges
>
>
>>>     And none of them of course is what was used by the authors of the
>>>     paper (they used Matrix 1.0-1, which is what was current when they
>>>     ran their analysis).
>>>
>>> Initially this discussion brought back nightmares of DLL hell on
>>> Windows.  Those as ancient as I will remember that well.  But now, the
>>> focus seems to be on reproducibility, but with what strikes me as a
>>> seriously flawed notion of what reproducibility means.
>>>
>>> Herve Pages mentions the risk of irreproducibility across three minor
>>> revisions of version 1.0 of Matrix.
>>
>> If you use R-2.13.2, you get Matrix 1.1-2-2 on Linux. AFAIK this is
>> the most recent version of Matrix, aimed to be compatible with the most
>> current version of R (i.e. R 3.0.3). However, it has never been tested
>> with R-2.13.2. I'm not saying that it should, that would be a big waste
>> of resources of course. All I'm saying is that it doesn't make sense to
>> serve by default a version that is known to be incompatible with the
>> version of R being used. It's very likely to not even install properly.
>>
>> For the apparently small differences between the versions you get on
>> Windows and Mac, the Matrix package was just an example. With other
>> packages you get (again if you use R-2.13.2):
>>
>>                src   win    mac
>>    abc         1.8   1.5    1.4
>>    ape       3.1-1 3.0-1    2.8
>>    BaSTA     1.9.3   1.1    1.0
>>    bcrm      0.4.3   0.2    0.1
>>    BMA    3.16.2.3  3.15 3.14.1
>>    Boruta    3.0.0   1.6    1.5
>>    ...
>>
>> Are the differences big enough?
>>
>> Also note that back in October 2011, people using R-2.13.2 would get
>> e.g. ape 2.7-3 on Linux, Windows and Mac. Wouldn't it make sense that
>> people using R-2.13.2 today get the same? Why would anybody use
>> R-2.13.2 today if it's not to run again some code that was written
>> and used two years ago to obtain some important results?
>>
>> Cheers,
>> H.
>>
>>
>>> My gut reaction would be that if
>>> the results are not reproducible across such minor revisions of one
>>> library, they are probably just so much BS.  I am trained in
>>> mathematical ecology, with more than a couple decades of post-doc
>>> experience working with risk assessment in the private sector.  When I
>>> need to do an analysis, I will repeat it myself in multiple products, as
>>> well as C++ or FORTRAN code I have hand-crafted myself (and when I wrote
>>> number crunching code myself, I would do so in multiple programming
>>> languages - C++, Java, FORTRAN, applying rigorous QA procedures to each
>>> program/library I developed).  Back when I was a grad student, I would
>>> not even show the results to my supervisor, let alone try to publish
>>> them, unless the results were reproducible across ALL the tools I used.
>>> If there was a discrepancy, I would debug that before discussing them
>>> with anyone.  Surely, it is the responsibility of the journals' editors
>>> and reviewers to apply a similar practice.
>>>
>>> The concept of reproducibility used to this point in this discussion
>>> might be adequate from a programmer's perspective (except in my lab), but it
>>> is wholly inadequate from a scientist's perspective.  I maintain that if
>>> you have the original data, and repeat the analysis using the latest
>>> version of R and the available, relevant packages, the original results
>>> are probably due to a bug either in the R script or in R or the packages
>>> used IF the results obtained using the latest versions of these are not
>>> consistent with the originally reported results.  Therefore, of the
>>> concerns I see raised in this discussion, the principal one of concern
>>> is that of package developers who fail to pay sufficient attention to
>>> backwards compatibility: a new version ought not break any code that
>>> executes fine using previous versions.  That is not a trivial task, and
>>> may require contributors obtaining the assistance of a software
>>> engineer.  I am sure anyone in this list who programs in C++ knows how
>>> the ANSI committees handle change management.  Introduction of new
>>> features is something that is largely irrelevant for backwards
>>> compatibility (but there are exceptions), but features to be removed
>>> are handled by declaring them deprecated, and leaving them in that
>>> condition for years.  That tells anyone using the language that they
>>> ought to plan to adapt their code to work when the deprecated feature is
>>> finally removed.
>>>
>>> I am responsible for maintaining code (involving distributed computing)
>>> to which many companies integrate their systems, and I am careful to
>>> ensure that no change I make breaks their integration into my system,
>>> even though I often have to add new features.  And I don't add features
>>> lightly, and have yet to remove features.  When that eventually happens,
>>> the old feature will be deprecated, so that the other companies have
>>> plenty of time to adapt their integration code.  I do not know whether
>>> CRAN ought to have any responsibility for this sort of change
>>> management, or if they have assumed some responsibility for some of it,
>>> but I would argue that the package developers have the primary
>>> responsibility for doing this right.
>>>
>>> Just my $0.05 (the penny no longer exists in Canada)
>>>
>>> Cheers
>>>
>>> Ted
>>> R.E. (Ted) Byers, Ph.D., Ed.D.
>>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


