[R] Median of streaming data
    Rolf Turner 
    r.turner at auckland.ac.nz
       
    Wed Sep 24 08:43:34 CEST 2014
    
    
  
On 24/09/14 17:31, Mohan Radhakrishnan wrote:
> Hi,
>
> I have streaming data (1 TB) that can't fit in memory. Is there a
> way for me to find the median of these streaming integers, assuming I can
> fit only a small part in memory? This is about the statistical approach to
> find the median of a large number of values when I can inspect only a part
> of them due to memory constraints.
You cannot, I'm pretty sure, calculate the exact median recursively, i.e.
by updating a running value one observation at a time in fixed memory.
However, there are "approximate" recursive median algorithms which provide
an estimate of location that has the same asymptotic properties as the
median.
See:

* U. Holst, Recursive estimators of location, Commun. Statist. Theory
Meth., vol. 16, 1987, pp. 2201--2226.

and

* Murray A. Cameron and T. Rolf Turner, Recursive location and scale
estimators, Commun. Statist. Theory Meth., vol. 22, 1993, pp. 2503--2515.
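
To give a flavour of what "recursive" means here, the following is a minimal sketch of a generic stochastic-approximation update: nudge the running estimate up or down by a shrinking step, depending on which side of it the new observation falls. This is not claimed to be the estimator from either paper above. The function name `recursive_median`, the `step` parameter and the choice of step size `step / n` are illustrative assumptions, and it is written in Python purely for illustration.

```python
import random


def recursive_median(stream, step=1.0):
    """Approximate the median of an iterable in O(1) memory.

    Illustrative sketch only (an assumption, not the method of the
    cited papers): the estimate moves by step / n towards each new
    observation, so only one number is ever held in memory.
    """
    estimate = None
    for n, x in enumerate(stream, start=1):
        if estimate is None:
            # Start from the first observation.
            estimate = float(x)
            continue
        if x > estimate:
            estimate += step / n
        elif x < estimate:
            estimate -= step / n
        # If x equals the estimate, leave it unchanged.
    return estimate


if __name__ == "__main__":
    # Hypothetical demo: a generator stands in for the data stream,
    # so the values are never stored all at once.
    rng = random.Random(0)
    values = (rng.gauss(10.0, 2.0) for _ in range(200000))
    print(recursive_median(values, step=5.0))
```

How well this works depends heavily on `step` relative to the spread of the data: too small and the estimate crawls towards the median, too large and it is noisy early on. Choosing that scaling sensibly is the sort of issue the papers above deal with.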
cheers,
Rolf Turner
-- 
Rolf Turner
Technical Editor ANZJS
    
    