Image Parsers

Base Image Parser

class html2ans.parsers.image.AbstractImageParser

Bases: html2ans.parsers.base.BaseElementParser

Abstract class for image parsing.

ANS schema

construct_output(element, *args, **kwargs)

Convenience method for constructing an output dictionary. If element is a Tag with attributes, those attributes will be stashed in additional_properties.

Parameters
  • element (bs4.element.Tag or bs4.element.Comment or bs4.element.NavigableString) – the element being parsed

  • ans_type (str) – the ANS type to put in the output type field

  • content (str) – the content to put in the content field

  • version (str) – the version to put in the version field. Note: if no version is provided but version_required=True on this parser, the output receives its version from the root parser.
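
The behavior described above can be approximated with a small standalone sketch. This is not html2ans's implementation: construct_output_sketch, its attrs dictionary (standing in for a Tag's attributes), and the root_version argument are hypothetical names used only for illustration.

```python
def construct_output_sketch(attrs, ans_type, content=None, version=None,
                            version_required=False, root_version="0.8.0"):
    """Build an ANS-style output dictionary (illustrative only).

    attrs stands in for the attributes of the parsed element;
    root_version stands in for the version held by the root parser.
    """
    output = {"type": ans_type}
    if content is not None:
        output["content"] = content
    # Fall back to the root parser's version only when one is required.
    if version is None and version_required:
        version = root_version
    if version is not None:
        output["version"] = version
    # Stash the element's attributes, if it has any.
    if attrs:
        output["additional_properties"] = dict(attrs)
    return output


plain = construct_output_sketch({"class": "lead"}, "text", content="Hello")
versioned = construct_output_sketch({}, "image", version_required=True)
```

In this sketch an element without attributes produces no additional_properties key at all; whether the library emits an empty dictionary instead is not stated in the text above.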

Basic Image Parser

class html2ans.parsers.image.ImageParser

Bases: html2ans.parsers.image.AbstractImageParser

Basic img element parser.

Example:

<img src="postreports.jpg" alt="The Post Reports" width="50" height="50" />

->

{
    "type": "image",
    "version": "0.8.0",
    "url": "postreports.jpg",
    "caption": "The Post Reports",
    "width": 50,
    "height": 50
}

parse(element, *args, **kwargs)

Parses the given element.

Parameters

element (bs4.element.Tag or bs4.element.Comment or bs4.element.NavigableString) – the element to parse
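
The img-to-ANS mapping shown in the example for this parser can be reproduced with a minimal sketch. It uses xml.etree.ElementTree from the standard library rather than BeautifulSoup, so it only handles well-formed markup, and image_to_ans is a hypothetical helper, not part of html2ans. The version string is copied from the example output and is an assumption about what the root parser supplies.

```python
import xml.etree.ElementTree as ET

ANS_VERSION = "0.8.0"  # assumed; copied from the example output


def image_to_ans(img):
    """Map an <img> element to an ANS-style image dictionary."""
    output = {"type": "image", "version": ANS_VERSION, "url": img.get("src")}
    # The alt text becomes the caption when present.
    if img.get("alt"):
        output["caption"] = img.get("alt")
    # Numeric width/height attributes are emitted as integers.
    for dimension in ("width", "height"):
        value = img.get(dimension, "")
        if value.isdigit():
            output[dimension] = int(value)
    return output


img = ET.fromstring(
    '<img src="postreports.jpg" alt="The Post Reports" width="50" height="50" />'
)
result = image_to_ans(img)
```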

Linked Image Parser

class html2ans.parsers.image.LinkedImageParser

Bases: html2ans.parsers.image.ImageParser

Link-wrapped image parser.

Example:

<a href="https://www.stitcher.com/podcast/the-washington-post/post-reports">
    <img src="postreports.jpg" alt="The Post Reports" width="50" height="50" />
</a>

->

{
    "type": "image",
    "version": "0.8.0",
    "url": "postreports.jpg",
    "caption": "The Post Reports",
    "width": 50,
    "height": 50,
    "additional_properties": {
        "image_link": "https://www.stitcher.com/podcast/the-washington-post/post-reports"
    }
}

is_applicable(element, *args, **kwargs)

Checks applicability using applicable_elements and, optionally, applicable_classes.

Parameters

element (bs4.element.Tag or bs4.element.Comment or bs4.element.NavigableString) – the element to parse
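
A check based on applicable_elements and applicable_classes might look roughly like the sketch below. The function name and its plain-argument signature are hypothetical; the real method receives the element itself, and LinkedImageParser may apply further conditions that are not described here.

```python
def is_applicable_sketch(tag_name, classes, applicable_elements,
                         applicable_classes=()):
    """Return True when the tag name and (optionally) its classes match."""
    if tag_name not in applicable_elements:
        return False
    # Classes are only consulted when the parser declares any.
    if applicable_classes:
        return any(cls in applicable_classes for cls in classes)
    return True


matches_link = is_applicable_sketch("a", [], ["a"])
matches_paragraph = is_applicable_sketch("p", [], ["a"])
matches_class = is_applicable_sketch("a", ["promo"], ["a"], ["image-link"])
```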

parse(element, *args, **kwargs)

Parses the given element.

Parameters

element (bs4.element.Tag or bs4.element.Comment or bs4.element.NavigableString) – the element to parse
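
To illustrate the link-wrapped case, here is a minimal sketch that accepts an a element only when it wraps exactly one img and records the href under image_link, as in the example for this parser. It relies on xml.etree.ElementTree instead of BeautifulSoup, linked_image_to_ans is a hypothetical helper, and the "exactly one child" rule is an assumption rather than documented behavior.

```python
import xml.etree.ElementTree as ET


def linked_image_to_ans(anchor):
    """Convert an <a> wrapping a single <img>; return None otherwise."""
    children = list(anchor)
    if anchor.tag != "a" or len(children) != 1 or children[0].tag != "img":
        return None
    img = children[0]
    # "0.8.0" is copied from the example output (assumed version).
    output = {"type": "image", "version": "0.8.0", "url": img.get("src")}
    if img.get("alt"):
        output["caption"] = img.get("alt")
    for dimension in ("width", "height"):
        value = img.get(dimension, "")
        if value.isdigit():
            output[dimension] = int(value)
    # Keep the link target alongside the image data.
    output["additional_properties"] = {"image_link": anchor.get("href")}
    return output


anchor = ET.fromstring(
    '<a href="https://www.stitcher.com/podcast/the-washington-post/post-reports">'
    '<img src="postreports.jpg" alt="The Post Reports" width="50" height="50" />'
    "</a>"
)
linked = linked_image_to_ans(anchor)
```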

Figure Parser

class html2ans.parsers.image.FigureParser

Bases: html2ans.parsers.image.ImageParser

Figure-wrapped image parser.

Example:

<figure>
    <img src="postreports.jpg" alt="The Post Reports logo" width="50" height="50" />
    <figcaption>The Post Reports</figcaption>
</figure>

->

{
    "type": "image",
    "version": "0.8.0",
    "url": "postreports.jpg",
    "caption": "The Post Reports",
    "width": 50,
    "height": 50
}

parse(element, *args, **kwargs)

Parses the given element.

Parameters

element (bs4.element.Tag or bs4.element.Comment or bs4.element.NavigableString) – the element to parse
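
In the figure example the caption comes from the figcaption text rather than from the img alt attribute. A minimal sketch of that precedence follows, again using xml.etree.ElementTree in place of BeautifulSoup. figure_to_ans is a hypothetical helper, and falling back to alt when no figcaption exists is an assumption, not documented behavior.

```python
import xml.etree.ElementTree as ET


def figure_to_ans(figure):
    """Convert a <figure> containing an <img>; return None otherwise."""
    img = figure.find("img")  # direct children only
    if figure.tag != "figure" or img is None:
        return None
    # "0.8.0" is copied from the example output (assumed version).
    output = {"type": "image", "version": "0.8.0", "url": img.get("src")}
    figcaption = figure.find("figcaption")
    caption = ""
    if figcaption is not None:
        caption = "".join(figcaption.itertext()).strip()
    # Prefer the figcaption text; fall back to alt (assumption).
    caption = caption or img.get("alt")
    if caption:
        output["caption"] = caption
    for dimension in ("width", "height"):
        value = img.get(dimension, "")
        if value.isdigit():
            output[dimension] = int(value)
    return output


figure = ET.fromstring(
    "<figure>"
    '<img src="postreports.jpg" alt="The Post Reports logo" width="50" height="50" />'
    "<figcaption>The Post Reports</figcaption>"
    "</figure>"
)
figure_result = figure_to_ans(figure)
```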