[R] Problem with handling of attributes in xmlToList in XML package
    santiago gil 
    sg.ccnr at gmail.com
       
    Sun Apr 14 20:09:12 CEST 2013
    
    
  
Hello all,
I have a problem with the way attributes are dealt with in the
function xmlToList(), and I haven't been able to figure it out for
days now.
Say I have a document (produced by nmap) like this:
> mydoc <- '<host starttime="1365204834" endtime="1365205860"><status state="up" reason="echo-reply" reason_ttl="127"/>
    <address addr="XXX.XXX.XXX.XXX" addrtype="ipv4"/>
    <ports><port protocol="tcp" portid="135"><state state="open"
reason="syn-ack" reason_ttl="127"/><service name="msrpc"
product="Microsoft Windows RPC" ostype="Windows" method="probed"
conf="10"><cpe>cpe:/o:microsoft:windows</cpe></service></port>
    <port protocol="tcp" portid="139"><state state="open"
reason="syn-ack" reason_ttl="127"/><service name="netbios-ssn"
method="probed" conf="10"/></port>
    </ports>
    <times srtt="647" rttvar="71" to="100000"/>
    </host>'
I want to store this as a list of lists, so I do:
library(XML)
mytree <- xmlTreeParse(mydoc, asText = TRUE)  # parse the string itself, not a file name
myroot <- xmlRoot(mytree)
mylist <- xmlToList(myroot)
Now my problem is that when I want to fetch the attributes of the
services running on each port, the behavior is not consistent:
> mylist[["ports"]][[1]][["service"]]$.attrs["name"]
   name
"msrpc"
> mylist[["ports"]][[2]][["service"]]$.attrs["name"]
Error in mylist[["ports"]][[2]][["service"]]$.attrs :
  $ operator is invalid for atomic vectors
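The difference can be reproduced on two stripped-down <service> nodes. This is only a minimal sketch: it uses xmlParse() (internal nodes) instead of xmlTreeParse() to keep it short, and the shortened node contents and the names with_child, without_child, a and b are made up for illustration.

```r
library(XML)

# Two minimal <service> nodes: one with a child element, one without.
with_child    <- xmlRoot(xmlParse('<service name="msrpc"><cpe>x</cpe></service>',
                                  asText = TRUE))
without_child <- xmlRoot(xmlParse('<service name="netbios-ssn"/>',
                                  asText = TRUE))

a <- xmlToList(with_child)     # node has children: attributes go into ".attrs"
b <- xmlToList(without_child)  # node has no children: attributes are returned directly

str(a)         # inspect the structure of each result
str(b)
is.list(a)     # the "$" operator works on this one
is.list(b)     # "$" fails here because b is an atomic character vector
```

For a node without children, xmlToList() appears to hand back the result of xmlAttrs() itself, so there is no .attrs element to index into.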
I understand that the two services are not defined the same way in the
document (the first has a child element, the second does not), but I
think the behavior should still be consistent. I've tried many
combinations of parameters for xmlTreeParse(), but nothing has helped.
I can't find a way to retrieve the name of the service consistently,
regardless of whether the node has children or not. Any tips?
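One direction that might work is a small wrapper that accepts either shape. The sketch below assumes that xmlToList() returns a bare named character vector for a childless node and a list with a .attrs element otherwise. get_attr is a hypothetical helper, not part of the XML package, and svc1 and svc2 are hand-built stand-ins for the two service entries rather than real parser output.

```r
# Hypothetical helper: fetch one attribute from an xmlToList() result,
# whether the attributes sit in ".attrs" or are the value itself.
get_attr <- function(x, name) {
  attrs <- if (is.list(x)) x[[".attrs"]] else x
  if (is.null(attrs) || !(name %in% names(attrs))) {
    return(NA_character_)  # node has no such attribute
  }
  unname(attrs[[name]])
}

# Hand-built stand-ins for the two shapes (assumed, see above).
svc1 <- list(cpe = "cpe:/o:microsoft:windows",
             .attrs = c(name = "msrpc", conf = "10"))
svc2 <- c(name = "netbios-ssn", method = "probed", conf = "10")

get_attr(svc1, "name")   # "msrpc"
get_attr(svc2, "name")   # "netbios-ssn"
get_attr(svc2, "ostype") # NA
```

With the list from above, the call would be get_attr(mylist[["ports"]][[2]][["service"]], "name"). I have not checked this against a complete nmap report.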
All the best,
S.G.
--
-------------------------------------------------------------------------------
http://barabasilab.neu.edu/people/gil/
    
    