[BioC] flowCore: inverse logicle transformation of flow cytometry data
Nishant Gopalakrishnan
ngopalak at fhcrc.org
Thu Oct 8 08:47:36 CEST 2009
Hi Josef,
Thanks a lot for the detailed email and for the information on the
additional parameters introduced in the logicle transformation.
1) Regarding the problems with low and negative values, the
transformation for negative values was implemented incorrectly in the
earlier versions of flowCore. After the detailed email from Parks about the
typo in the manuscript and the discussion we had in Jan 2009, I had modified
the transformation to correct this issue. I do remember comparing the
results generated by flowCore's transformation to those generated using your
Java implementation.
Now that you mention there are issues with the implementation of flowCore's
logicle transform, it would be great if you could provide more information
on what exactly is incorrect so that the issue can be corrected.
Have the results/software used for the Gating-ML standard's unit tests for
the logicle transformation been verified by Parks and Moore, so that
everyone has a gold standard confirmed as correct by the original authors
of the transformation?
Also, what is the level of compliance with the standard's definition of this
transformation among the currently available flow cytometry software?
2) The inverse of the logicle was easier to implement with a direct
call to the biexponential function and is available in flowCore (1.11.22).
3) Future updates will be made to flowCore to
a. Add the new parameter "A" defined for the transform. Addition of
this parameter to the biexponential function should not be difficult as this
parameter just gets added to the old parameter w wherever it exists in the
new definition of the transformation.
b. Move all the inputs to the decade scale, since that has been
established as the standard, and update the man pages accordingly.
4) Default values
a. Thank you for the suggestions for providing an option for a
data-driven selection of the transformation parameters. I will look into
providing this option. However, this option might cause problems for
users who want to use an inverse transform to get back to the original
scale.
b. You are correct regarding the man page; it should have said d is the
breadth of the display in decades.
Thanks
Nishant
-----Original Message-----
From: bioconductor-bounces at stat.math.ethz.ch
[mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Josef Spidlen
Sent: Wednesday, October 07, 2009 4:59 PM
To: BioConductor
Subject: Re: [BioC] flowCore: inverse logicle transformation of flow
cytometry data
Hi Nishant, Chao-Jen, Pyne, et al.
I thought I would add a few notes on the Logicle transformation and its
implementation in BioConductor/flowCore. Please do not take the email
the wrong way, I am definitely not trying to complain about flowCore's
implementation; just trying to bring some light into the Logicle issues.
The implementation of Logicle is tricky, both because the
transformation is quite complicated and because the original documentation
is targeted at readers with a substantial mathematical background.
Nishant, Florian, Byron and others did a great job when they opened the
can of worms and implemented the transformation in flowCore.
1) I remember that there were minor issues with the Logicle
transformation, especially when applied on very low and negative values.
I believe flowCore's implementation wasn't actually a monotone function
around 0 and it probably relates to a typo in the Parks et al. Logicle
manuscript. Nishant, we had some email discussion on this topic around
January 27, 2009. Recently, I talked to Dave Parks and Wayne Moore and
they told me that flowCore's implementation of Logicle is still broken
(no further details, not sure about validity of this statement and not
sure if this is related to the original issue). Having said that, please
note that these issues are minor and would not really affect typical
users using flowCore/Logicle as part of their analysis pipeline.
2) Quoting Chao-Jen Wong:
> Since the logicle transformation is an one-to-one and onto function,
> it is possible to implement an inverse function. It is, however, not
> straightforward...
I believe Nishant has solved this by now but note that the inverse
should actually be easier than the Logicle itself. This is because the
Logicle transformation is defined as logicle(x)=root(S(y)-x), i.e.,
Logicle is defined as the inverse of S, where S is a "simple" function.
Eventually, you can just use S as the inverse of Logicle.
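To illustrate the point in 2), here is a minimal Python sketch of the relationship: S is evaluated directly, while Logicle is obtained by root-finding on S(y) - x with a simple bisection. The coefficients of S below are placeholders chosen only so the example runs; they are not the constants from the manuscript, from Gating-ML, or from flowCore, and the function names are made up for this example.

```python
import math

# Placeholder coefficients for illustration only (NOT the real Logicle constants).
A_COEF, B_COEF, C_COEF, D_COEF, F_COEF = 1.0, 2.0, 1.0, 2.0, 0.0


def s(y):
    """Biexponential-style S(y) = a*e^(b*y) - c*e^(-d*y) + f.

    With positive a, b, c, d this is strictly increasing, so it has an inverse.
    """
    return A_COEF * math.exp(B_COEF * y) - C_COEF * math.exp(-D_COEF * y) + F_COEF


def logicle_by_bisection(x, lo=-10.0, hi=10.0, tol=1e-12):
    """Return the y in [lo, hi] with s(y) = x, found by plain bisection."""
    if not (s(lo) <= x <= s(hi)):
        raise ValueError("x is outside the range covered by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if s(mid) < x:
            lo = mid  # root lies in the upper half
        else:
            hi = mid  # root lies in the lower half
    return (lo + hi) / 2.0


# The "inverse Logicle" needs no root-finding at all: it is just s().
y = logicle_by_bisection(1000.0)
x_back = s(y)
```

The forward direction pays for a root search on every call, whereas the inverse direction is a single evaluation of S, which is why the inverse is the easier of the two to implement.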
3) Recently, there have been some developments related to Logicle that
flowCore may want to consider/support/implement:
- Since 2005, there is a patent on the Logicle implementation owned by
Stanford. However, early this year, Stanford decided not to collect
royalties on it anymore and it became free to be used by anyone.
- As a result, Logicle has been incorporated in the latest version of
Gating-ML (an ISAC standard for describing gates and data
transformations in XML). This has been done in collaboration with the
authors of the transformation, who decided to tweak the transformation a
little bit:
a) In the original manuscript, Parks et al. show two different
parameterizations of the Logicle function, based on the natural logarithm
(base e) and the decadic logarithm (base 10) respectively. The recent
conclusion is that the decadic (base 10) version is better (i.e.,
easier) for the end user and should be used. Essentially, the
transformation function is the same as long as you adjust the parameters
accordingly. In the manuscript, the parameters are lower case for
natural logarithm parameterization and upper case for the base 10
logarithm. The 'm' and 'w' are the two affected parameters where
m=M*ln(10) and w=W*ln(10).** The "upper case" (i.e., decadic version) is
better for the end user since M and W are expressed in normal decades,
i.e., base 10 log units; M is the total plot width and W is the
linearization width. Therefore, let's say the user wants the result to
be a 4.5 decades plot, so they use M=4.5 rather than having to use m=10.36.
The implementation in flowCore seems to be based on the natural
logarithm but its parameterization is mixed ('w' seems to be in natural
logarithm units, while 'd' seems to be in decades, i.e., base 10 log units).
Eventually, flowCore could switch to the decadic logarithm
implementation and harmonize the parameterization (and maybe use the
same constants as in the paper?).
** I believe that flowCore calls 'd' what is called 'M' in the
manuscript and 'r' what is called 'T' in the manuscript (Parks et al.,
Cytometry, 69A: 541-551; 2006).
b) The authors of Logicle added one additional parameter to the Logicle
function: 'A' - the additional negative display range in asymptotic
decades (usually 0 or a negative value). Setting it to 0 produces the
"original" Logicle. In cases where low data values are dominated by
statistical variation but the values are constrained to be non-negative
(as seen in peak detected flow cytometry data), a Logicle plot with A =
-W would include data zero and be near-linear at low data values thereby
avoiding problems associated with log scales at the low end.
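The unit conversion described in a) above, m=M*ln(10) and w=W*ln(10), can be sketched in a few lines of Python. The function names are hypothetical and exist only for this example. The width check is an assumption modelled on the if (w > d) test mentioned later in this email; the point it illustrates is that both widths should be compared in the same units before either is converted.

```python
import math

LN10 = math.log(10.0)


def decades_to_natural(value_in_decades):
    """Convert a width in decades (base 10 log units) to natural log units."""
    return value_in_decades * LN10


def natural_to_decades(value_in_natural_units):
    """Convert a width in natural log units back to decades."""
    return value_in_natural_units / LN10


def to_natural_params(M, W):
    """Take M and W in decades, validate them, and return (m, w).

    Assumption: the check compares W against M while both are still in
    decades, so the comparison never mixes units.
    """
    if W < 0 or W > M:
        raise ValueError("W must satisfy 0 <= W <= M (both in decades)")
    return decades_to_natural(M), decades_to_natural(W)


m, w = to_natural_params(4.5, 0.5)
m_rounded = round(m, 2)  # the 4.5 decade plot from the text, in natural log units
```

Keeping the user-facing parameters in decades and converting once, at the boundary, avoids the mixed parameterization described above.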
4) If you decide to adjust the implementation of Logicle in flowCore, a
consistent description with (hopefully) all necessary details is
included in the latest Gating-ML specification, which can be downloaded
from http://flowcyt.sourceforge.net/gating/latest.full.zip.
5) The latest Gating-ML specification also includes compliance tests,
which include the Logicle transformation. This may eventually help you
adjust/debug the implementation of Logicle (as well as its inverse
function) in flowCore.
6) I have some Java code that implements the updated Logicle.
Specifically, I have the Logicle(T, M, W, A) class that allows you to
create and apply the Logicle transformation; and I also have some code
that calculates default values for T, M, W, and A based on the contents
of an FCS file. Please let me know if you would like me to share these.
I am not suggesting that you would reuse the implementation directly
since it is quite naive and relatively slow (using a simple bisection
method as a root-finding algorithm every time you call it); however, it
may have some value in clarifying potential ambiguities related to that
function. The crucial part is the updated S function that now includes
the parameter 'A' and works with decadic parameterization (see Gating-ML
for details). However, since flowCore's internal implementation seems to
be based on the "bi-exponential" like parameterization, i.e., a*e^(b*x)
- c*e^(-d*x) + f, it may involve some effort to convert this correctly.
7) A minor note on flowCore's defaults for the logicle transformation:
r = 262144; This works for data from BD's newer instruments since their
range is 2^18; however, there are a lot of other FCS files with a different
max (e.g., 10^4), where 262144 is not a good choice. An option would be
to have r=NULL as the default and adjust it based on the data that the
transformation is supposed to be applied to.
d = 5; Parks et al. suggest using 4.5 but this does not really
make a difference. More importantly, there seems to be a typo in the
documentation of the function saying that d is the breadth of the display
in natural logarithm units. The code includes d <- d * log(10) and
therefore, the documentation should probably say that d is the breadth of
the display in decades (i.e., decadic logarithm units). Also, Nishant,
shouldn't the if (w > d) stop(...) in the logicleTransform function go
after the d <- d * log(10)?
w = 0; This does not perform very well if you have low and negative
values to look at. Alternatively, you could have w=NULL as default and
create the real value based on the data set. A recommended way to
specify W to match particular data is to select a value 'z'
approximating the most negative data value that must be included and
calculate W as: W = (M - log(T/abs(z)))/2. Setting 'z' at the fifth
percentile of events that are below zero will yield an appropriate
display in most cases.
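The data-driven choice of W described above can be sketched in Python as follows. This is only an illustration under stated assumptions: log is taken to be base 10 (consistent with M and W being in decades), the fifth percentile uses a simple nearest-rank rule, and the result is clamped at 0 when there are no negative events or the formula goes negative. The function names are hypothetical and this is not flowCore code.

```python
import math


def negative_percentile(values, fraction=0.05):
    """Nearest-rank percentile of the strictly negative values, or None.

    Values are sorted ascending, so a small fraction picks a value near
    the most negative end of the data.
    """
    negatives = sorted(v for v in values if v < 0)
    if not negatives:
        return None
    rank = max(1, math.ceil(fraction * len(negatives)))
    return negatives[rank - 1]


def estimate_w(values, T, M):
    """Estimate W (in decades) as W = (M - log10(T / abs(z))) / 2.

    z is the fifth percentile of the events below zero. Assumption: fall
    back to W = 0 when no events are negative or the formula is negative.
    """
    z = negative_percentile(values)
    if z is None:
        return 0.0
    w = (M - math.log10(T / abs(z))) / 2.0
    return max(w, 0.0)


# Example: one negative event at -100 with T = 262144 and M = 4.5.
w_example = estimate_w([-100.0, 5.0, 10.0, 2000.0], T=262144.0, M=4.5)
```

For that example input, z is -100 and W comes out at roughly 0.54 decades. As noted in the reply, parameters chosen from the data this way need to be recorded if an inverse transform is to be applied later.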
Please let me know if I could do anything to help or clarify things further.
Cheers,
Josef
--
Josef Spidlen, Ph.D.
Research Associate, Terry Fox Laboratory, BC Cancer Agency
675 West 10th Avenue, V5Z 1L3 Vancouver, BC, Canada
Tel: +1 (604) 675-8000 x 7755