Title: | Read, Write and Work with 'XML' Data |
---|---|
Description: | 'XML' package for creating, reading and manipulating 'XML', with an object model based on 'Reference Classes'. |
Authors: | Per Nyfelt [cre, aut], Alipsa HB [cph], Steven Brandt [ctb] |
Maintainer: | Per Nyfelt <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.2.0.9001 |
Built: | 2025-03-12 04:28:50 UTC |
Source: | https://github.com/alipsa/xmlr |
An abstract base class with some utility methods
m_parent
the parent (if any)
The base container for the DOM
## S4 method for signature 'Document'
as.vector(x)

## S4 method for signature 'Document'
as.character(x)
x | the object to convert |
Methods allow access to the root element as well as the DocType and other document-level information.
as.vector
: as.vector(Document)
as.character
: as.character(Document)
getBaseURI()
Returns the URI from which this document was loaded
setBaseURI(uri)
Sets the effective URI from which this document was loaded
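A short sketch of the document-level methods described above. The URI and the element name are made up for illustration; `setRootElement()` and `getRootElement()` are taken from the package example further down in this manual.

```r
library("xmlr")

# Create an empty document and give it a root element.
doc <- Document$new()
doc$setRootElement(Element$new("catalog"))

# Record where the document (hypothetically) came from, then read it back.
uri <- "file:///tmp/catalog.xml"  # hypothetical URI, not a real file
doc$setBaseURI(uri)
baseUri <- doc$getBaseURI()
rootName <- doc$getRootElement()$getName()

# as.character() is documented for Document; the exact formatting of the
# resulting XML string is not asserted here.
print(as.character(doc))
```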
Creates an xmlr object tree based on parsing events
endDocument()
Event signalling parsing has completed
endElement(name)
End element event; name is the element name
startDocument()
Event signalling parsing has begun
startElement(name, attributes)
Start element event; name is the element name, attributes is a named list of attributes
text(text)
Text event; text is the character content of the Text node
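To illustrate how a tree can be assembled from such events, here is a minimal base R sketch. It is not the package's implementation: `newTreeBuilder()`, `getRoot()` and the plain lists standing in for Element objects are all hypothetical, and only the event names mirror the methods listed above.

```r
# Hypothetical event-driven builder. Plain lists stand in for xmlr Elements.
newTreeBuilder <- function() {
  openElements <- list()  # stack of elements that have started but not ended
  root <- NULL

  startElement <- function(name, attributes = list()) {
    # A new element becomes the innermost open element.
    openElements[[length(openElements) + 1]] <<-
      list(name = name, attributes = attributes, content = list())
  }

  onText <- function(text) {
    # Text is appended to the innermost open element.
    n <- length(openElements)
    openElements[[n]]$content <<- c(openElements[[n]]$content, list(text))
  }

  endElement <- function(name) {
    n <- length(openElements)
    node <- openElements[[n]]
    stopifnot(identical(node$name, name))  # events must be properly nested
    openElements[[n]] <<- NULL             # pop the finished element
    if (n == 1) {
      root <<- node                        # the outermost element is the root
    } else {
      # Attach the finished element to its parent.
      openElements[[n - 1]]$content <<-
        c(openElements[[n - 1]]$content, list(node))
    }
  }

  list(startElement = startElement, text = onText,
       endElement = endElement, getRoot = function() root)
}

# Events for <foo><bar id="1">hello</bar></foo>
builder <- newTreeBuilder()
builder$startElement("foo")
builder$startElement("bar", list(id = "1"))
builder$text("hello")
builder$endElement("bar")
builder$endElement("foo")
tree <- builder$getRoot()
```

The stack of open elements is what lets `endElement` know which parent the finished element belongs to.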
An XML element. Methods allow the user to get and manipulate its child elements and content, directly access the element's textual content, and manipulate its attributes.
## S4 method for signature 'Element'
as.vector(x)

## S4 method for signature 'Element'
as.character(x)
x | the object to convert |
as.vector
: as.vector(Element)
as.character
: as.character(Element)
name
The local name of the element
contentList
all the children of this element
attributeList
a list of all the attributes belonging to this element
addAttributes(attributes)
Add the supplied attributes to the attributeList of this Element
addContent(content)
Appends the child to the end of the content list and returns the parent (the calling object)
contentIndex(content)
Find the position of the content in the contentList; returns -1 if it is not found
getAttribute(name)
Get an attribute value
getAttributes()
Get the list of attributes
getChild(name)
Return the first child element matching the name
getChildren()
Get all the child Elements belonging to this Element
getContent()
Returns the full content of the element as a List that may contain objects of type Text, Element, Comment, ProcessingInstruction, CDATA, and EntityRef
getName()
Return the name of this Element
getText()
Return the text content of this element if any
hasAttributes()
return TRUE if this element has any attributes, otherwise FALSE
hasChildren()
Return TRUE if this element has any child Element nodes
hasContent()
return TRUE if this element has any content, otherwise FALSE
hasText()
Return TRUE if this element has a Text node
removeContent(content)
Remove the specified content from this element
removeContentAt(index)
Remove the content at the given index and return the content that was removed
setAttribute(name, value)
Add or replace an attribute; both parameters are converted to character
setAttributes(attributes)
Replace the attributes with this named list. NULL or an empty list removes all attributes. All values are converted to character
setName(name)
Set the name of this Element
setText(text)
Replace all content with the text supplied
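The following sketch exercises a few of the Element methods listed above. The element names and values are invented for illustration. It relies on `setText()` returning the element so that calls can be chained, as the package example later in this manual does.

```r
library("xmlr")

# Build <book id="42"><title>R in Practice</title></book> step by step.
book <- Element$new("book")
book$setAttribute("id", 42)  # per the docs, the value is converted to character

title <- Element$new("title")$setText("R in Practice")
book$addContent(title)

hasAttrs <- book$hasAttributes()
id <- book$getAttribute("id")
hasKids <- book$hasChildren()
titleText <- book$getChild("title")$getText()

# Rename the element and inspect the result as an XML string.
# The exact formatting of the string is not asserted here.
book$setName("publication")
print(as.character(book))
```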
Common utility functions
isRc(x, clazz = "refClass")
x | the object to check |
clazz | the name of the class, e.g. "Element" for the Element class. Optional; if omitted, the function checks that the object is a reference class |
A boolean indicating whether the object x belongs to the class or not
isRc
: Check if the object is a reference class, similar to isS4().
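A brief usage sketch for `isRc()`, based only on the description above; the element name is arbitrary.

```r
library("xmlr")

el <- Element$new("item")

isAnyRc <- isRc(el)             # is it a reference class object at all?
isElement <- isRc(el, "Element")  # is it specifically an Element?
isString <- isRc("plain text")  # an ordinary character vector
```

A check like this can be useful when walking a content list that mixes Element and Text objects.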
An XML parser based on an article on creating a quick and dirty XML parser by Steven Brandt: https://www.javaworld.com/article/2077493/java-tip-128--create-a-quick-and-dirty-xml-parser.html
A general purpose linked stack
size
the size of the stack (number of elements in the stack)
stackNode
an environment containing the current element and the one under it
peek()
Get the top element from the stack without changing it
pop()
Pull the top element from the stack removing it from the stack
push(val)
Add an element to the top of the stack
size()
Get the current size of the stack
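For readers unfamiliar with linked stacks, here is a minimal sketch of the idea using a reference class and environments as nodes. It is an illustration only, not the package's code: the class name `LinkedStack` and the fields `count` and `top` are hypothetical.

```r
library(methods)

# Hypothetical linked stack. Each node is an environment holding a value
# and a reference to the node below it.
LinkedStack <- setRefClass(
  "LinkedStack",
  fields = list(
    count = "integer",  # number of elements on the stack
    top = "ANY"         # environment of the top node, or NULL when empty
  ),
  methods = list(
    initialize = function(...) {
      count <<- 0L
      top <<- NULL
      callSuper(...)
    },
    push = function(val) {
      node <- new.env(parent = emptyenv())
      node$value <- val
      node$below <- top   # link the new node to the previous top
      top <<- node
      count <<- count + 1L
      invisible(.self)
    },
    pop = function() {
      if (is.null(top)) stop("the stack is empty")
      val <- top$value
      top <<- top$below   # unlink the top node
      count <<- count - 1L
      val
    },
    peek = function() {
      if (is.null(top)) stop("the stack is empty")
      top$value
    },
    size = function() {
      count
    }
  )
)

s <- LinkedStack$new()
s$push("a")
s$push("b")
peeked <- s$peek()     # looks at the top without removing it
sizeAfterPeek <- s$size()
first <- s$pop()
second <- s$pop()
```

Because each node only points at the one beneath it, push and pop touch a single node regardless of how many elements the stack holds.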
Reference class representing text content
as.vector for Text classes
as.character for Text classes
## S4 method for signature 'Text'
as.vector(x)

## S4 method for signature 'Text'
as.character(x)
x | the object to convert |
An XML character sequence. Provides a modular, parentable method of representing text.
as.vector
: as.vector(Text)
as.character
: as.character(Text)
XML import functions
parse.xmlstring(xml)
parse.xmlfile(fileName)
xml | an XML character string to parse |
fileName | the name of the XML file to parse |
a Document object
parse.xmlstring
: create a Document from a character string
parse.xmlfile
: create a Document from an XML file
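A small sketch of both import functions. The XML string is the one used in the package example in this manual; the temporary file and its content are made up for illustration.

```r
library("xmlr")

# Parse a character string and navigate to a nested attribute.
doc <- parse.xmlstring("<foo><bar><baz val='the baz attribute'/></bar></foo>")
baz <- doc$getRootElement()$getChild("bar")$getChild("baz")
val <- baz$getAttribute("val")

# Parse the same kind of content from a file (a temporary file here).
path <- tempfile(fileext = ".xml")
cat("<foo><bar>hello</bar></foo>", file = path)
fileDoc <- parse.xmlfile(path)
barText <- fileDoc$getRootElement()$getChild("bar")$getText()
unlink(path)
```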
A package for creating, reading and manipulating XML, providing an object model implemented with Reference Classes. It is perhaps especially useful when dealing with deeply nested XML structures.
library("xmlr")
doc <- Document$new()
root <- Element$new("table")
root$setAttribute("xmlns", "http://www.w3.org/TR/html4/")
doc$setRootElement(root)
root$addContent(
  Element$new("tr")
    $addContent(Element$new("td")$setText("Apples"))
    $addContent(Element$new("td")$setText("Bananas"))
)
table <- doc$getRootElement()
stopifnot(table$getName() == "table")
stopifnot(table$getAttribute("xmlns") == "http://www.w3.org/TR/html4/")
children <- table$getChild("tr")$getChildren()
stopifnot(length(children) == 2)
stopifnot(children[[1]]$getText() == "Apples")
stopifnot(children[[2]]$getText() == "Bananas")
# you can also parse character strings (or parse a file using parse.xmlfile(fileName))
doc <- parse.xmlstring("<foo><bar><baz val='the baz attribute'/></bar></foo>")
This is a convenience function that takes all the children of the given Element and creates a data frame from them: each child constitutes a row, and its attributes or elements (including text) constitute the columns. It assumes a homogeneous structure; the column names are taken from the first child.
xmlrToDataFrame(element)
element | the element to convert |
a data frame
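A hedged sketch of how `xmlrToDataFrame()` might be used on a homogeneous structure. The XML content is invented for illustration, and the exact column names and types of the result are not asserted here; only the one-row-per-child behaviour described above is assumed.

```r
library("xmlr")

# Two <person> children with the same structure: each should become one row.
xml <- paste0(
  "<people>",
  "<person><name>Ann</name><age>31</age></person>",
  "<person><name>Bo</name><age>27</age></person>",
  "</people>"
)
doc <- parse.xmlstring(xml)
df <- xmlrToDataFrame(doc$getRootElement())
print(df)
```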