Scryer-prolog: A Rust-based XML parser could be an interesting project for new contributors

Created on 1 May 2020 · 3 Comments · Source: mthom/scryer-prolog

Prolog is ideally suited for reasoning about HTML, XML and SGML documents, because such tree-shaped markup documents can be directly mapped to Prolog terms.

An HTML element with tag T, attributes As and children Cs could be mapped to the Prolog term element(T, As, Cs), and thus become amenable to fast and convenient Prolog-based reasoning.

For instance, the following HTML file, represented as a list of characters:

<html>
  <head>
    <title>
      Hello!
    </title>
  </head>
  <body style="padding-left: 5%; padding-right: 5%">
    Hello.
  </body>
</html>

can be directly mapped to the Prolog term:

[element(html, [],
     [element(head, [],
          [element(title, [], ["      Hello!\n    "])]),
      element(body, [style="padding-left: 5%; padding-right: 5%"],
          ["    Hello.\n  "])])]
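
To illustrate the kind of reasoning such a term makes possible, here is a minimal sketch in plain Prolog. The predicate names example_document/1, node_descendant/2 and document_tag_attribute/4 are made up for this illustration and are not part of any library; the sketch only assumes member/2 from library(lists) and Scryer Prolog's default representation of double-quoted text as lists of characters.

```prolog
:- use_module(library(lists)).

% Hypothetical fact that stores the example term shown above.
example_document([element(html, [],
     [element(head, [],
          [element(title, [], ["      Hello!\n    "])]),
      element(body, [style="padding-left: 5%; padding-right: 5%"],
          ["    Hello.\n  "])])]).

% node_descendant(Node, D): D is the element Node itself, or an
% element nested anywhere below it. Text children are not
% element/3 terms, so they are skipped automatically.
node_descendant(element(T, As, Cs), element(T, As, Cs)).
node_descendant(element(_, _, Cs), D) :-
    member(C, Cs),
    node_descendant(C, D).

% document_tag_attribute(Nodes, Tag, Name, Value): some element
% with tag Tag in the document carries the attribute Name=Value.
document_tag_attribute(Nodes, Tag, Name, Value) :-
    member(N, Nodes),
    node_descendant(N, element(Tag, As, _)),
    member(Name=Value, As).
```

With these definitions, the query `?- example_document(Ns), document_tag_attribute(Ns, Tag, Name, Value).` should find exactly one attribute in the example, namely style on the body element. Because the predicates are pure relations, the same definition can also be used to enumerate all elements or to check for a specific tag.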

roxmltree looks like a useful Rust component to parse XML files and convert them to Prolog terms.

Labels: feature request, good first issue

triska

All 3 comments

Another approach would be to use Tree-sitter-based parsers, which would cover many languages at once. Tree-sitter already has an HTML grammar, though sadly no XML grammar yet.

XVilka on 9 May 2020

👍1

I have filed #596 for HTML.

triska on 15 Jun 2020

This is now available via library(sgml).

triska on 4 Aug 2020

👍3
