Scryer-prolog: A Rust-based XML parser could be an interesting project for new contributors

Created on 1 May 2020 · 3 Comments · Source: mthom/scryer-prolog

Prolog is ideally suited for reasoning about HTML, XML and SGML documents, because such tree-shaped markup documents can be directly mapped to Prolog terms.

An HTML element with tag T, attributes As and children Cs could be mapped to the Prolog term element(T, As, Cs), and thus become amenable to fast and convenient Prolog-based reasoning.

For instance, the following HTML file, represented as a list of characters:

<html>
  <head>
    <title>
      Hello!
    </title>
  </head>
  <body style="padding-left: 5%; padding-right: 5%">
    Hello.
  </body>
</html>

can be directly mapped to the Prolog term:

[element(html, [],
     [element(head, [],
          [element(title, [], ["      Hello!\n    "])]),
      element(body, [style="padding-left: 5%; padding-right: 5%"],
          ["    Hello.\n  "])])]

roxmltree looks like a useful Rust component to parse XML files and convert them to Prolog terms.
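
To make the proposed mapping concrete, here is a minimal Rust sketch of how a parsed tree could be rendered as an element(T, As, Cs) term. It uses only the standard library. The `Node` type and the `escape`, `atom` and `to_prolog_term` functions are hypothetical names chosen for illustration; they are not part of roxmltree or Scryer Prolog, and a real integration would build Prolog terms on the heap instead of printing text. The atom-quoting rule is deliberately simplified.

```rust
/// A node of a parsed markup tree: either an element or a run of text.
/// Hypothetical type for illustration; a real parser exposes its own node type.
pub enum Node {
    Element {
        tag: String,
        attrs: Vec<(String, String)>,
        children: Vec<Node>,
    },
    Text(String),
}

/// Escape a string so it can appear inside a double-quoted Prolog string.
fn escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '\\' => out.push_str("\\\\"),
            '"' => out.push_str("\\\""),
            '\n' => out.push_str("\\n"),
            '\t' => out.push_str("\\t"),
            _ => out.push(c),
        }
    }
    out
}

/// Write a name as a Prolog atom, quoting it unless it is a plain
/// lowercase-initial alphanumeric name (a simplified rule).
fn atom(name: &str) -> String {
    let mut chars = name.chars();
    let plain = matches!(chars.next(), Some(c) if c.is_ascii_lowercase())
        && chars.all(|c| c.is_ascii_alphanumeric() || c == '_');
    if plain {
        name.to_string()
    } else {
        format!("'{}'", name.replace('\\', "\\\\").replace('\'', "\\'"))
    }
}

/// Render a node as the text of a Prolog term: element(T, As, Cs) for
/// elements, a double-quoted string for text.
pub fn to_prolog_term(node: &Node) -> String {
    match node {
        Node::Text(t) => format!("\"{}\"", escape(t)),
        Node::Element { tag, attrs, children } => {
            // Each attribute becomes Name="Value".
            let attrs: Vec<String> = attrs
                .iter()
                .map(|(k, v)| format!("{}=\"{}\"", atom(k), escape(v)))
                .collect();
            // Children are rendered recursively, in document order.
            let children: Vec<String> = children.iter().map(to_prolog_term).collect();
            format!(
                "element({}, [{}], [{}])",
                atom(tag),
                attrs.join(", "),
                children.join(", ")
            )
        }
    }
}
```

Calling `to_prolog_term` on a `Node::Element` with tag `body`, one `style` attribute and a single text child yields a term of the same shape as the body element in the example above. Text is written as a double-quoted string, which matches the list-of-characters representation used in the issue.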

Labels: feature request, good first issue

All 3 comments

Another approach would be to use Tree-sitter-based parsers, which would cover various languages at once. Tree-sitter already has an HTML grammar, though sadly no XML grammar yet.

I have filed #596 for HTML.

This is now available via library(sgml).
