Nbconvert: Fails to convert bulletpoint-list to LaTeX itemize environment

Created on 16 May 2017 · 5 Comments · Source: jupyter/nbconvert

For some reason, the second bullet-point list (the one in the K-Means section) is not converted to a LaTeX itemize environment in this notebook: https://github.com/scikit-learn-contrib/hdbscan/blob/5b1ed1dfef216c4083567061f533a47609c4e3b2/notebooks/Comparing%20Clustering%20Algorithms.ipynb
Instead, the section is converted to the following LaTeX code:

So, in summary, here's how K-Means seems to stack up against out
desiderata: * \textbf{Don't be wrong!}: K-means is going to throw points
into clusters whether they belong or not; it also assumes you clusters
are globular. K-Means scores very poorly on this point. *
\textbf{Intuitive parameters}: If you have a good intuition[...]

The other bullet-point lists in the notebook are converted correctly, and they appear to be formatted exactly like the one that fails.

LaTeX workaround known

AllanLRH

👍3

All 5 comments

Try adding two spaces at the end of the line immediately preceding the list. In Markdown, two trailing spaces indicate a line break, which is what pandoc expects. Does that solve your problem?

mpacer on 16 May 2017

👍1

Two newlines (one empty line) before the list seems to do the trick. Pretty confusing though because without the empty line, it is still rendered fine in the HTML view.

slokhorst on 17 May 2017

👍3
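
For anyone who wants to apply the empty-line workaround across a whole notebook, here is a minimal sketch in Python. It is a heuristic, not part of nbconvert or pandoc: the names `ensure_blank_line_before_lists` and `fix_notebook` are hypothetical helpers, and it assumes the notebook has been loaded as a plain dict (for example with `json.load`) in the usual `cells` / `cell_type` / `source` layout.

```python
import re

# A line that starts a bullet or numbered list item (up to 3 leading spaces).
LIST_ITEM = re.compile(r"^ {0,3}(?:[*+-]|\d+[.)])\s+\S")


def ensure_blank_line_before_lists(markdown: str) -> str:
    """Insert an empty line between a paragraph line and a list that
    directly follows it. Fenced code blocks are left untouched."""
    out = []
    in_fence = False  # inside a ``` fenced block
    in_list = False   # already inside a list, so no separator is needed
    for line in markdown.split("\n"):
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        is_item = not in_fence and bool(LIST_ITEM.match(line))
        if not line.strip():
            in_list = False  # a blank line ends the current block
        elif is_item:
            # First item of a list glued to the previous text: add a blank line.
            if not in_list and out and out[-1].strip():
                out.append("")
            in_list = True
        out.append(line)
    return "\n".join(out)


def fix_notebook(nb: dict) -> dict:
    """Apply the fix in place to every Markdown cell of a notebook dict."""
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "markdown":
            src = cell["source"]
            # The source may be stored as a list of lines or as one string.
            text = "".join(src) if isinstance(src, list) else src
            cell["source"] = ensure_blank_line_before_lists(text)
    return nb
```

The regular expression is deliberately simple, so unusual input (for example a `- - -` horizontal rule) may be misread as a list item. Check the result before overwriting the original notebook.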

I'm guessing that's because we're using pandoc for LaTeX export, and it generally has more consistent behaviour than the markdown spec requires around blank lines and lists. There is wide variation in how markdown is actually implemented.

However, I should mention, I actually get perfectly well formatted LaTeX on the linked notebook (which doesn't have the two newlines).

So, in summary, here's how K-Means seems to stack up against out desiderata:
* **Don't be wrong!**: K-means is going to throw points into clusters whether they belong or not; it also assumes you clusters are globular. K-Means scores very poorly on this point.
* **Intuitive parameters**: If you have a good intuition for how many clusters the dataset your exploring has then great, otherwise you might have a problem.
* **Stability**: Hopefully the clustering is stable for your data. Best to have many runs and check though.
* **Performance**: This is K-Means big win. It's a simple algorithm and with the right tricks and optimizations can be made exceptionally efficient. There are few algorithms that can compete with K-Means for performance. If you have truly huge data then K-Means might be your only option.

\begin{itemize}
\tightlist
\item
  \textbf{Don't be wrong!}: We inherited all the benefits of DBSCAN and
  removed the varying density clusters issue. HDBSCAN is easily the
  strongest option on the 'Don't be wrong!' front.
\item
  \textbf{Intuitive parameters}: Choosing a mimnimum cluster size is
  very reasonable. The only remaining parameter is \texttt{min\_samples}
  inherited from DBSCAN for the density based space transformation.
  Sadly \texttt{min\_samples} is not that intuitive; HDBSCAN is not that
  sensitive to it and we can choose some sensible defaults, but this
  remains the biggest weakness of the algorithm.
\item
  \textbf{Stability}: HDBSCAN is stable over runs and subsampling (since
  the variable density clustering will still cluster sparser subsampled
  clusters with the same parameter choices), and has good stability over
  parameter choices.
\item
  \textbf{Performance}: When implemented well HDBSCAN can be very
  efficient. The current implementation has similar performance to
  \texttt{fastcluster}'s agglomerative clustering (and will use
  \texttt{fastcluster} if it is available), but we expect future
  implementations that take advantage of newer data structure such as
  cover trees to scale significantly better.
\end{itemize}

What version of pandoc are you using?

mpacer on 17 May 2017

I'm using pandoc 1.19.2.1 (from the Arch Linux repositories).

slokhorst on 21 May 2017

$ pandoc --version
pandoc 1.19.2.1
Compiled with pandoc-types 1.17.0.5, texmath 0.9, skylighting 0.1.1.4

Installed using Homebrew on Mac OS X

AllanLRH on 22 May 2017
