Bat: Complete collection of syntax highlighting test files

Created on 3 Oct 2020 · 45 Comments · Source: sharkdp/bat

With all the recent news about Hacktoberfest, I thought it would be a good idea to point out good beginner issues that would actually be helpful for bat. In past years, I have experienced Hacktoberfest as a really great event, both as a contributor and as a maintainer.

As of recently, bat has a set of syntax highlighting regression tests (see #1124 for more details). The main idea is that we have a large collection of test files for each and every language that bat can highlight. This way, we can make sure that we do not run into issues we had in the past where either (1) syntax highlighting for some language is suddenly not working anymore or (2) bat suddenly crashes/panics for some input (due to incompatibilities in the regex flavors in syntect and Sublime Text).

In order to add a new test file, you can follow these steps (let's take "Ruby" as an example):

  1. Make sure that you are running the latest version of bat (master or bat 0.16) and that bat is available on the path.
  2. Find an example Ruby source file or write one yourself. If possible, the file should aim to be "comprehensive" (i.e. include a lot of the possible syntax), but this is not strictly necessary. A simple file is better than none at all. Also, the files shouldn't be gigantic.
  3. Save the file in tests/syntax-tests/source/Ruby (adapt for your language). The file name could be test.rb (adapt extension) but can also be adapted if that is necessary in order for bat to highlight it correctly (e.g. Makefile).
  4. If you have copied the file from somewhere else, please make sure that the file may be copied under the respective license and that the license is compatible with bat's license. If it requires attribution, please add a LICENSE.md in the same folder with a text like this:

    The `test.rb` file has been added from [enter source here] under the following license: 
    
    [add license text here]
    
  5. Go to tests/syntax-tests and run the update.sh Bash script. A new file should be generated in the highlighted folder (e.g. highlighted/Ruby/test.rb).
  6. Use cat or bat --language=txt to display the content of this file and make sure that the syntax highlighting looks correct.
  7. git add the new files in the source folder as well as the autogenerated files in the highlighted folder.
  8. Commit and submit a PR! Please reference this issue (#1213).
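
Steps 3 and 5 above can be sketched in Python. This is only an illustration of where the files end up, assuming the folder layout described in the steps; `add_source_file` is a hypothetical helper and not part of bat's repository.

```python
import shutil
from pathlib import Path
from typing import Tuple


def add_source_file(repo_root: Path, language: str, sample: Path) -> Tuple[Path, Path]:
    """Copy `sample` into the syntax-test source folder (step 3).

    Returns the path of the copied file and the path where update.sh
    is expected to write the highlighted version (step 5).
    """
    tests_dir = repo_root / "tests" / "syntax-tests"
    destination = tests_dir / "source" / language / sample.name
    # Create tests/syntax-tests/source/<language>/ if it does not exist yet.
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(sample, destination)
    highlighted = tests_dir / "highlighted" / language / sample.name
    return destination, highlighted
```

After copying, update.sh still has to be run from tests/syntax-tests, and both returned paths are the ones to git add in step 7.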

List of languages / syntaxes:

  • [x] ActionScript
  • [x] Apache Conf
  • [ ] AppleScript
  • [x] ARM Assembly
  • [x] AsciiDoc (Asciidoctor)
  • [ ] ASP
  • [x] Assembly (x86_64)
  • [x] AWK
  • [x] Batch File
  • [x] BibTeX
  • [x] Bourne Again Shell (bash)
  • [x] C
  • [x] C#
  • [x] C++
  • [ ] Cabal
  • [x] Clojure
  • [x] CMake
  • [ ] CoffeeScript
  • [x] CpuInfo (/proc/cpuinfo)
  • [x] Crystal
  • [x] CSS
  • [x] CSV
  • [x] D
  • [x] Dart
  • [x] Diff
  • [x] Dockerfile
  • [x] DotENV
  • [x] Elixir
  • [x] Elm
  • [x] Email
  • [x] Erlang
  • [ ] F#
  • [ ] Fortran
  • [ ] Friendly Interactive Shell (fish)
  • [x] fstab
  • [x] Git Attributes
  • [x] Git Config
  • [x] Git Ignore
  • [x] GLSL
  • [x] Go
  • [x] GraphQL
  • [x] Graphviz (DOT)
  • [x] Groovy
  • [x] /etc/group
  • [x] Haskell
  • [x] Highlight non-printables (--show-all)
  • [x] /etc/hosts
  • [x] HTML
  • [x] INI
  • [x] Java
  • [ ] Java Server Page (JSP)
  • [x] JavaScript
  • [x] Jinja2
  • [x] JSON
  • [ ] jsonnet
  • [x] Julia
  • [x] Kotlin
  • [x] LaTeX
  • [x] Less
  • [x] Lisp
  • [ ] Literate Haskell
  • [x] Lua
  • [x] Makefile
  • [x] Manpage
  • [x] Markdown
  • [x] MATLAB
  • [x] MemInfo (/proc/meminfo)
  • [ ] NAnt Build File
  • [x] nginx
  • [x] Nim
  • [x] Nix
  • [ ] Objective-C
  • [ ] Objective-C++
  • [x] OCaml
  • [x] orgmode
  • [x] Pascal
  • [x] /etc/passwd
  • [x] Perl
  • [x] PHP
  • [x] Plain Text
  • [x] PowerShell
  • [x] Protocol Buffer
  • [ ] Puppet
  • [x] PureScript
  • [x] Python
  • [x] QML
  • [x] R
  • [ ] Rego
  • [ ] Regular Expression
  • [x] requirements.txt
  • [ ] resolv (/etc/resolv.conf)
  • [x] reStructuredText
  • [ ] Robot Framework
  • [x] Ruby
  • [ ] Ruby Haml
  • [ ] Ruby on Rails
  • [x] Rust
  • [x] Salt State (SLS)
  • [x] Sass
  • [x] Scala
  • [x] SCSS
  • [x] SML
  • [x] SQL
  • [x] SSH Config
  • [x] SSHD Config
  • [ ] Strace
  • [ ] Stylus
  • [x] Swift
  • [ ] syslog
  • [ ] Tcl
  • [x] Terraform
  • [x] TeX
  • [ ] Textile
  • [x] TOML
  • [x] TypeScript
  • [ ] TypeScriptReact
  • [ ] varlink
  • [ ] Verilog
  • [ ] VimL
  • [x] Vue Component
  • [x] XML
  • [x] YAML

It would be great if we could focus on the syntaxes in sublimehq/Packages first (marked in bold) as this would allow us to merge #1174 without having to worry (too much) about syntax highlighting regressions. Also, most of these are really popular languages, so it makes sense to have them in the test suite.

good first issue hacktoberfest syntax-highlighting

All 45 comments

Can I submit a python file for this issue?

yes :+1:

I will submit an Elixir file soon :)

I am not able to run bash update.sh. It gives me this error:
  File "/usr/lib/python3.8/subprocess.py", line 1702, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'bat'

@AkshatGadhwal @sharkdp I was also thinking about it. I think the script should first try to run a _locally_ built binary from target/ if one is available. I'm not very familiar with Rust workflows yet though.
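
A minimal sketch of that lookup order, assuming cargo's default output folders target/debug and target/release (custom build options may put the binary elsewhere). `find_bat` is a hypothetical helper, not code from the test script.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_bat(repo_root: Path) -> Optional[str]:
    """Return a path to a bat executable, preferring a locally built one."""
    # Assumed default cargo output locations; "bat.exe" covers Windows builds.
    for profile in ("debug", "release"):
        for name in ("bat", "bat.exe"):
            candidate = repo_root / "target" / profile / name
            if candidate.is_file():
                return str(candidate)
    # Fall back to whatever `bat` is on the PATH (None if there is none).
    return shutil.which("bat")
```

A script could call `find_bat` once at startup and report a clear error if it returns None.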

I'm also getting an error

Traceback (most recent call last):
  File "create_highlighted_versions.py", line 85, in <module>
    create_highlighted_versions(output_basepath=args.output)
  File "create_highlighted_versions.py", line 35, in create_highlighted_versions
    ["bat"] + BAT_OPTIONS + [source], stderr=subprocess.PIPE, env=env,
  File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1364, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'bat': 'bat'

@CCreativeCND The solution is to add the target/debug/ dir to your $PATH, or copy the binary target/debug/bat to somewhere in your $PATH. That's assuming you run cargo build without changing any build options (otherwise the target directory may be something other than debug).

FileNotFoundError: [Errno 2] No such file or directory: 'bat': 'bat'

See step 1 of the instructions above: "Make sure that you are running the latest version of bat (preferably master, but the latest release should also work) and that bat is available on the path."

We could probably improve the error message in the Python script.

Hey @sharkdp :wave:

This looks like a simple try/except statement. I'll be happy to submit a PR to add this if it helps :)
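
A minimal sketch of that idea, assuming the script keeps calling bat through subprocess.check_output as the traceback above shows. In Python the construct is try/except; `run_bat` and the wording of the message are hypothetical, not the script's actual code.

```python
import subprocess
import sys
from typing import List


def run_bat(args: List[str], executable: str = "bat") -> bytes:
    """Run bat with `args` and return its standard output."""
    try:
        return subprocess.check_output([executable] + args, stderr=subprocess.PIPE)
    except FileNotFoundError:
        # Replace the long traceback with a short hint pointing at step 1.
        sys.exit(
            f"Error: could not find '{executable}'. "
            "Make sure that bat is installed and available on the PATH."
        )
```

Exiting with a message keeps the hint visible to contributors who are not familiar with Python tracebacks.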

Which files are supposed to be changed?
Running bash update.sh modifies all the files in the 'highlighted' folder.

Please make sure that you are running the latest version of bat. Only one new file should be generated.

I ran choco upgrade bat, but it reports that I already have the latest version, v0.16.0.

@AkshatGadhwal What is the output of bat -V? It must be 0.16.0; otherwise you may have conflicting installations.
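
To script this check, here is a small sketch that parses the version, assuming bat -V prints a line of the form `bat 0.16.0` (an assumption about the output format). `parse_bat_version` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple


def parse_bat_version(output: str) -> Optional[Tuple[int, ...]]:
    """Extract (major, minor, patch) from the assumed `bat X.Y.Z` output."""
    match = re.search(r"\bbat (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())
```

Tuples compare element by element, so the result can be compared directly against a minimum such as `(0, 16, 0)`.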

@Mithil467 yes it is 0.16.0
[screenshot: ss_bat_v]

Should I discard the changes in the other files?

No, do not run choco upgrade, but just bat -V. That should give you the version of the bat that is installed and detected on the path.

I did that too.

[screenshot: bat_V]

Hm... I see that you opened a PR now anyway. Did it work for you? Or did you simply exclude all other files?

Maybe there is an incompatibility problem when running these scripts in Windows? Has someone else used Windows to make their change?

I pulled your branch locally and ran ./update.sh, and I do see some changes in the highlighted test.dart.
The diff shows: [image]

Also, the dart file has ^M characters (\r) in it, which might be an issue?

I discarded other changes

Yes, you are right, \r is creating an issue.
To run bash update.sh, I had to run dos2unix update.sh.

Not the update.sh file, but your test.dart file.

Ok, this looks like a line ending problem indeed. I just opened https://github.com/sharkdp/bat/pull/1254 which might fix this (after it is merged and after you re-cloned the repository).
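
A quick way to check a test file for Windows line endings before running update.sh, sketched under the assumption that converting CRLF to LF is acceptable for the file in question. The helper names are hypothetical.

```python
from pathlib import Path


def has_crlf(data: bytes) -> bool:
    """Return True if the data contains Windows (CRLF) line endings."""
    return b"\r\n" in data


def to_unix_newlines(data: bytes) -> bytes:
    """Replace every CRLF pair with a single LF."""
    return data.replace(b"\r\n", b"\n")


def normalize_file(path: Path) -> bool:
    """Rewrite `path` with LF endings; return True if it was changed."""
    data = path.read_bytes()
    if not has_crlf(data):
        return False
    path.write_bytes(to_unix_newlines(data))
    return True
```

Reading and writing bytes avoids any newline translation that text mode might apply on Windows.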

So what am I supposed to do now?
When I committed, two files were created:

  1. test.dart in highlighted/Dart folder
  2. test.dart in source/Dart folder

After #1254 is merged into master, you may pull the branch and update your fork's branch as well. Then rerun the changes.
Alternatively, an easier way would be to run:

git remote add upstream http://github.com/sharkdp/bat
git fetch upstream
git checkout your_test_feature_branch
git reset upstream/master

This would update your local branch with the main repo's master.
Then you may rerun the update.sh script and redo the process. Finally, you'll need to force push.

#1254 has been merged into master by now.

Edit: Moving this comment to the topmost post.

@sharkdp What do you think about taking example files from learn x in y minutes? We would still need to make them manually from their repo but it would be faster.

I'm not an expert on open source licenses, but it seems like the CC BY-SA 3.0 license is not a really great fit for us:

ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.

It's not completely clear what this means for bat as a project if we were to modify the files.

@sharkdp It's like GPL. You need to distribute modified versions of that file under the same license, CC-BY-SA. It only applies to that exact file in this case, since it's not a part of a larger work really and remains independent (by analogy, incorporating a CC-BY-SA song into a soundtrack of a CC-BY-NC-ND movie would be against the "share-alike" clause, but distributing songs under CC-BY-SA and other licenses on the same CD wouldn't).

@dmbaturin If that is the case, I'm also fine with adding files from this source. We should still check if they are really suited for our goal here. The large amount of comments in between is not too helpful, for example.

@sharkdp Yeah, that's a much bigger obstacle. Examples from http://rosettacode.org (GFDL-licensed) can make much better placeholders until a comprehensive syntax test file can be found or made.

In fact, maybe a collection of _comprehensive_ syntax sample files should be a project of its own.

That would be really helpful, yes :smile:

@sharkdp Removing comments shouldn't be that hard. Maybe I can strip all the comments out and make a new project as @dmbaturin mentioned, since learn x in y minutes seems to cover all the syntax of a particular language.

Let me know if you find it interesting.

EDIT: Just seen your comment. Will do it.
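
A naive sketch of the comment stripping discussed here, assuming the language uses a single line-comment prefix. It only drops lines that consist entirely of a comment; trailing comments and block comments are left alone, and a shebang line would be dropped as well. `strip_comment_lines` is a hypothetical helper, not anyone's actual script.

```python
from typing import Iterable, List


def strip_comment_lines(lines: Iterable[str], prefix: str) -> List[str]:
    """Drop lines whose first non-blank characters are the comment prefix."""
    return [line for line in lines if not line.lstrip().startswith(prefix)]
```

A real tool would need per-language rules, since the prefix can also appear inside string literals.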

Ok, but maybe it should be discussed in a bit more detail before you spend too much time on this.

Sure

@sharkdp I managed to hack around learnxinyminutes's repo to extract the needed parts with my dirty Python script. Here it is: syntax-samples.

I have also cleaned up the posts that were derivative of other languages, or were just explanation posts (like set theory...).

If we plan to go forward with it, it is a good starting point, though it needs further processing. And I could use some help with that :)

Hi there!

This project seems pretty good, I would like to contribute by adding a Dockerfile sample file. I will open a Pull Request as soon as possible.

Can I submit a perl file for this issue?

That would be great! :+1:

Can I submit a Java file for this issue?

Thanks, but we already have a Java test file

@sharkdp Do you have any thoughts about how you would add one of these for Regular Expressions? I tried writing a regular expression literal in JS and that gets highlighted well.

I think it should work fine if you choose a filename with .re extension. The content would be just the plain regex.

I'd like to submit a D file if that's ok.
