[R] Imputation method on binary data
(Ted Harding)
Ted.Harding at manchester.ac.uk
Wed Oct 24 11:59:05 CEST 2007
On 24-Oct-07 08:51:30, sigalit mangut-leiba wrote:
> hello,
> I want to do a single Imputation method on binary data set.
> Is it possible to use imp.cat from CAT package?
> I have a problem defining "theta" when data is binary.
> Do you know any references on the subject?
> Thank you,
> Sigalit.
There should be no problem in principle. However, you must
make sure to do things in the correct sequence.
1. Make sure your data are in a matrix, not a dataframe,
where the rows correspond to cases and the columns
correspond to variables. (In passing: you must of
course have at least two variables). If your data are
in a dataframe Y in the first instance, first do
X <- as.matrix(Y)
2. Make sure that the binary outcomes are coded as 1 and 2,
not 0 and 1. Missing values should be coded as NA. If
the outcomes are 0 and 1, then do
X <- X+1
3. Now do the preliminary step, which creates a specially
formatted representation of the data, missing value
patterns, etc. (and of course first load the 'cat' package):
library(cat)
s <- prelim.cat(X)
where X is your data matrix as in (1) and (2).
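4. Next, the maximum-likelihood estimation. The result is the
array of estimated cell probabilities, which is what the
later functions expect as their "theta" argument: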
t <- em.cat(s)
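5. Now you can embark on as many imputations as you wish,
alternating between da.cat() and imp.cat(). The package's
random number seed must be set once with rngseed() before
either of these is called (any large positive integer will do):
rngseed(1234567)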
t <- da.cat(s,t)
imp.1 <- imp.cat(s,t)
t <- da.cat(s,t)
imp.2 <- imp.cat(s,t)
...
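
Steps 1 to 5 can be put together as in the following sketch. The
data frame Y is a made-up toy example, and the seed values and the
number of da.cat() steps are arbitrary choices for illustration, not
recommendations. The name "theta" is used here instead of "t" only to
match the argument name in ?imp.cat.

```r
library(cat)

# Hypothetical toy data: three binary variables coded 0/1, some values missing.
set.seed(42)
Y <- data.frame(a = rbinom(50, 1, 0.5),
                b = rbinom(50, 1, 0.4),
                c = rbinom(50, 1, 0.6))
Y$a[c(3, 17)] <- NA
Y$b[c(8, 25, 40)] <- NA

# Steps 1-2: convert to a matrix and recode 0/1 as 1/2 (NA stays NA).
X <- as.matrix(Y) + 1

# Step 3: preliminary bookkeeping on the data and missingness patterns.
s <- prelim.cat(X)

# Step 4: ML estimate of the cell probabilities under the saturated model.
theta <- em.cat(s, showits = FALSE)

# Step 5: seed the package's own random number generator, then alternate
# data augmentation and imputation.
rngseed(1234567)
theta <- da.cat(s, theta, steps = 50)
imp.1 <- imp.cat(s, theta)
theta <- da.cat(s, theta, steps = 50)
imp.2 <- imp.cat(s, theta)

# Recode an imputed dataset back to 0/1 if desired.
imp.1.01 <- imp.1 - 1
```

For a single imputation, as in the original question, it should be
enough to stop after imp.1; the "theta" passed to imp.cat() is simply
whatever em.cat() or da.cat() returned.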
More details can be found in the documentation for the
functions listed by
library(help=cat)
e.g. ?prelim.cat, ?em.cat, ?da.cat, ?imp.cat, etc.
The above is for a straightforward imputation of missing
values in a dataset of categorical variables, and does not
incorporate any special modelling considerations. See also:
?ecm.cat and other functions.
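
As a rough sketch of what ecm.cat() adds: it fits a restricted
loglinear model instead of the saturated one, with the model given by
a "margins" vector in which zeros separate the margins to be fitted.
The toy data and the particular model below (all two-way associations
among three variables, no three-way association) are assumptions made
for illustration only; check ?ecm.cat before relying on the details.

```r
library(cat)

# Hypothetical toy data: three binary variables coded 1/2, some values missing.
set.seed(1)
X <- cbind(a = rbinom(60, 1, 0.5),
           b = rbinom(60, 1, 0.5),
           c = rbinom(60, 1, 0.5)) + 1
X[c(2, 11, 30), "a"] <- NA
X[c(5, 44), "c"] <- NA

s <- prelim.cat(X)

# Margins (1,2), (2,3) and (1,3), separated by zeros: the model with
# all two-way associations but no three-way association.
margins <- c(1, 2, 0, 2, 3, 0, 1, 3)
theta.ecm <- ecm.cat(s, margins, showits = FALSE)

# Impute once under the fitted loglinear model.
rngseed(7654321)
imp.ecm <- imp.cat(s, theta.ecm)
```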
Hoping this helps,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 24-Oct-07 Time: 10:59:01
------------------------------ XFMail ------------------------------