python - How to find all occurrences of a tag using lxml
I have a file with multiple doc tags. Each doc tag has a docid tag inside it. I need to get everything inside a doc tag if its docid tag matches. I am using HTMLParser to parse the file.
What I need is:
1. Recursively iterate over every doc tag.
2. For each doc tag, if the docid tag inside it matches, get everything under that doc tag.
3. Repeat the 2nd step for all doc tags.
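The three steps above can be sketched roughly as follows. This is only an illustration under assumptions: the sample markup, the `content` tag, and the helper name `find_docs_by_id` are hypothetical, and docid is assumed to be a direct child of doc.

```python
from io import BytesIO
from lxml import etree

# Hypothetical sample data; the real files are assumed to look similar.
SAMPLE = (
    "<doc><docid>A1</docid><content>alpha</content></doc>"
    "<doc><docid>B2</docid><content>beta</content></doc>"
)

def find_docs_by_id(source, wanted_id):
    """Return every doc element whose docid child text equals wanted_id."""
    parser = etree.HTMLParser()
    tree = etree.parse(source, parser)
    matches = []
    # './/doc' searches the whole tree, so nesting depth does not matter.
    for doc in tree.findall('.//doc'):
        docid = doc.findtext('docid')  # None when there is no docid child
        if docid is not None and docid.strip() == wanted_id:
            matches.append(doc)
    return matches

matches = find_docs_by_id(BytesIO(SAMPLE.encode("utf-8")), "B2")
for doc in matches:
    # Serialize the matching doc element together with everything under it.
    print(etree.tostring(doc, encoding="unicode"))
```

If docid can sit deeper inside doc, `doc.findtext('.//docid')` would be the variant to try.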
    from lxml import etree

    def get_docs(self, filepaths):
        parser = etree.HTMLParser()
        for file in filepaths:
            tree = etree.parse(file, parser)
            # tree = etree.parse(file)
            doc = tree.findall('.//doc')  # every doc element in the file
            for elem in doc:
                print(etree.tostring(elem))
I am trying to get the content inside each doc tag with text_content(), but it is failing. I get the error below when I do so:
AttributeError: 'lxml.etree._Element' object has no attribute 'text_content'
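For context on this error: `text_content()` is a method of the element class used by `lxml.html`, not of plain `lxml.etree` elements. Below is a minimal sketch of two possible ways around it, assuming a hypothetical one-doc snippet: either join the pieces from `itertext()` on the etree element, or parse with `lxml.html` so that `text_content()` is available.

```python
from lxml import etree, html

# Hypothetical snippet standing in for one doc entry of the real file.
snippet = "<doc><docid>A1</docid><content>alpha <b>bold</b> tail</content></doc>"

# Option 1: keep etree.HTMLParser and collect the text with itertext().
root = etree.fromstring(snippet, etree.HTMLParser())
doc = root.find('.//doc')
text_via_itertext = "".join(doc.itertext())

# Option 2: parse with lxml.html, whose elements provide text_content().
html_root = html.document_fromstring(snippet)
html_doc = html_root.find('.//doc')
text_via_text_content = html_doc.text_content()

print(text_via_itertext)
print(text_via_text_content)
```

Both options return the concatenated text of the element and all its descendants, so the docid text is included in the result.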