JabRef: convert LaTeX encoding - code in entry that crashes JabRef

Created on 3 May 2020  ·  22 Comments  ·  Source: JabRef/jabref

JabRef 5.1--2020-05-02--1d9957b
Linux 5.6.8-200.fc31.x86_64 amd64
Java 14.0.1

In a test database with more than 10,500 entries, adding an entry (Ctrl+N or the toolbar button) crashes JabRef. No log message appears.

In a smaller database, adding a new entry works without problems. I can then copy and paste that entry into the larger database.

bug 🐛

All 22 comments

JabRef 5.1--2020-05-04--b5599c9
Windows 10 10.0 amd64
Java 14.0.1

AND

JabRef 5.1--2020-05-04--b5599c9
Linux 5.3.0-51-generic amd64
Java 14.0.1

using a database with >19,000 entries

I cannot reproduce this issue. It might be related to the specific database, preferences or hardware. @ilippert: Can you reliably reproduce this problem, or does it appear only sometimes?

JabRef 5.1--2020-05-04--7bb1e24
Linux 5.6.8-200.fc31.x86_64 amd64
Java 14.0.1

Yes, I can still reproduce this reliably.

Can you tell us what JabRef's RAM usage is when this happens?

Before adding a new entry:
Memory: 1.7 GB
Virtual memory: 107.1 GB
Resident memory: 1.9 GB
Shared memory: MB

When adding: shared memory goes up to 250 MB; memory and resident memory each go up by 100 MB.

I don't really know much about Java memory usage, but a 250 MB rise in RAM usage when adding one entry does not seem normal...
@tobiasdiez @koppor @Siedlerchr ?

Sorry, I just checked something else: I created an entirely new bib file and copied in the 10,500 entries. Adding an entry then succeeded.

OK, comparing the original database file and the newly created one, which contain the same entries, I see a difference in file size of 1.5 MB.
I tried to compare both files, but the entries are stored in an entirely different order.

I have now copied the group structure from the original bib file into the new bib file. The new bib file is still 1.5 MB smaller than the original file. I can still add entries to this new file.

Oops, the new JabRef file has, of course, changed all my timestamps. That's not good; I need the old timestamps...

So maybe we can close the issue at this point, since it appears to have been an "artefact" of that original database.

However, the original database is simply one that has grown over the years and across versions of JabRef. Other users may also have such naturally grown biblatex databases.

JabRef 5.1--2020-05-04--7bb1e24
Linux 5.6.8-200.fc31.x86_64 amd64
Java 14.0.1

Wait: now, with the timestamp update deactivated, creating a new file and pasting the entries results (tested in two instances) in a new database of the same size as the original bib file.

And now adding a new entry crashes JabRef.

This bug is quite new; it started to emerge around last weekend. Before that, I was able to add entries to the original database file without a crash.

I think we need your database to be able to reproduce the issue. Would it be possible for you to share it? Only the core developers will have access to the file; it won't be published, ...

Yes, I am happy to share it. Please advise how you would like to receive the file.

You'll see my email address at my GitHub profile. Could you try sending it there?

I have now noticed that, on my Intel® Core™ i7-6700HQ CPU @ 2.60GHz × 8 system, JabRef regularly needs 40-60% of my CPU while the said file is open. If I close the file, JabRef needs only about 2%.

I investigated the database and identified one entry that reliably breaks JabRef. However, I cannot detect what is wrong with it.

@Article{Kolb2003,
  Title                    = {Protest, \"{O}ffentlichkeitsarbeit und {L}obbying schlie{\ss}en sich nicht aus. {D}ie {M}assen als {S}chl\"{u}ssel zur {M}acht. {F}elix {K}olb vergleicht die politischen {S}trategien von {U}mweltbewegung und {G}lobalisierungskritikern, extract from `\textit{politische \"{o}kologie}' (85) 2003},
  Author                   = {Felix Kolb},
  Year                     = {2003},
  Month                    = {14. Aug.},
  Number                   = {188},
  Pages                    = {7},

  Journal                  = {Frankfurter Rundschau}
}

Without this entry, jabref seems to run more smoothly...
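For anyone hunting a similar entry in a large database, halving the file repeatedly is one way to narrow it down. The following is only a minimal sketch of that idea and not what was actually done in this thread: `EntryBisect`, `findCulprit` and the `triggersProblem` check are hypothetical names, and the check itself (for example, loading the subset into JabRef and trying to add an entry) has to be done by hand or scripted separately. It assumes exactly one entry causes the problem.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper: narrows a list of entries down to the single one
// that triggers a problem, by repeatedly testing one half of the candidates.
final class EntryBisect {

    // triggersProblem is an assumption: it must return true when the given
    // subset of entries still reproduces the crash.
    static <T> T findCulprit(List<T> entries, Predicate<List<T>> triggersProblem) {
        if (entries.isEmpty()) {
            throw new IllegalArgumentException("need at least one entry");
        }
        List<T> candidates = entries;
        while (candidates.size() > 1) {
            int mid = candidates.size() / 2;
            List<T> firstHalf = candidates.subList(0, mid);
            // Keep whichever half still contains the problematic entry.
            candidates = triggersProblem.test(firstHalf)
                    ? firstHalf
                    : candidates.subList(mid, candidates.size());
        }
        return candidates.get(0);
    }
}
```

With about 10,500 entries this needs roughly 14 checks instead of one per entry.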

Just an idea and possible workaround (?). Have you tried making the following changes:

\"{O}ffentlichkeitsarbeit to {\"{O}}ffentlichkeitsarbeit
{S}chl\"{u}ssel to {S}chl{\"{u}}ssel
\"{o}kologie} to {\"{o}}kologie} (on a side note: {\"{O}}kologie} should be upper case)

Does that make any difference?

Maybe a parsing error in the month field? Is there maybe a max length for the title?

Actually, it might be best to write the umlauts differently (see https://tex.stackexchange.com/questions/366546/jabref-cant-read-bib-file-created-by-jabref-3-0/434268#434268 and https://tex.stackexchange.com/questions/57743/how-to-write-%c3%a4-and-other-umlauts-and-accented-letters-in-bibliography):

So change \"{O} to {\"O}
and
\"{u} to {\"u}
and
\"{o} to {\"o} (or {\"O} if you are allowed to correct the capitalization)
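If many entries need this change, the rewrite from `\"{o}` to `{\"o}` could be applied with a regular expression instead of by hand. This is a minimal sketch using only `java.util.regex`; the class and method names (`UmlautRewriter`, `rewrite`) are made up for illustration and are not part of JabRef. It only handles the umlaut accent `\"` followed by a single braced ASCII letter, and whether the rewritten form actually avoids the crash is not confirmed in this thread.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not JabRef code: rewrites \"{X} to {\"X}.
final class UmlautRewriter {

    // Matches a backslash, a double quote, and one ASCII letter in braces.
    private static final Pattern ACCENT_THEN_BRACES =
            Pattern.compile("\\\\\"\\{([A-Za-z])\\}");

    static String rewrite(String fieldValue) {
        // In the replacement, "\\\\" yields one literal backslash
        // and $1 is the captured letter.
        return ACCENT_THEN_BRACES.matcher(fieldValue).replaceAll("{\\\\\"$1}");
    }
}
```

Applied to the title above, `Protest, \"{O}ffentlichkeitsarbeit` would become `Protest, {\"O}ffentlichkeitsarbeit`, while capitalization braces such as `{S}` are left alone because they are not preceded by `\"`.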

The problem lies in inserting \"{o} inside

`\textit{}'

This breaks JabRef.

Thank you for triangulating this.
Internally, JabRef uses an external library (latex2unicode) to convert the LaTeX encoding. Sadly, this library no longer seems to be in active development, so we have already started to think about a replacement. But this is going to be a larger project.
I don't know yet whether a quick fix is possible.

Refs #5547
Refs #6155

Regarding the issue topic:
I am now on
JabRef 5.1--2020-05-25--6f34de3
Linux 5.6.13-300.fc32.x86_64 amd64
Java 14.0.1

I have moved all my old entries from my 15-year-old database to a new database (and in that process caught https://github.com/JabRef/jabref/issues/6399#issuecomment-633720078 with this bug https://github.com/JabRef/jabref/issues/6399#issuecomment-633743391).
I no longer have the original problem, the crash when adding a new entry to a database with 10,500 entries. Therefore I am changing the title of this issue. Please feel free to alter it if it does not fit.

I can't shed much light on the underlying issue, but I don't think the latex2unicode converter is to blame. Adding the following test case to LatexToUnicodeFormatterTest.java works for me:

@Test
void formatUmlautsInTextit() {
    assertEquals("\uD835\uDC5D\uD835\uDC5C\uD835\uDC59\uD835\uDC56\uD835\uDC61\uD835\uDC56\uD835\uDC60\uD835\uDC50ℎ\uD835\uDC52 \uD835\uDC5C̈\uD835\uDC58\uD835\uDC5C\uD835\uDC59\uD835\uDC5C\uD835\uDC54\uD835\uDC56\uD835\uDC52",
            formatter.format("\\textit{politische \\\"{o}kologie}"));
}

where the expected Unicode string (the first argument) comes from yaytext.com.
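For readers wondering how to read that expected string: it consists of characters from the Unicode "Mathematical Italic" block, where small `a` is U+1D44E and capital `A` is U+1D434, with the exception of small `h`, which Unicode encodes as U+210E (ℎ). The umlaut appears as an italic `o` followed by the combining diaeresis U+0308. The sketch below rebuilds the same string with plain JDK calls. It is only an illustration of the code point mapping; `MathItalic` and `toMathItalic` are made-up names, and this is not claimed to be how latex2unicode or yaytext.com produce their output.

```java
import java.text.Normalizer;

// Hypothetical helper: maps ASCII letters to Unicode mathematical italic.
final class MathItalic {

    static String toMathItalic(String text) {
        // NFD splits e.g. "ö" into "o" plus the combining diaeresis U+0308,
        // so the base letter can be mapped and the accent kept as-is.
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        StringBuilder out = new StringBuilder();
        decomposed.codePoints().forEach(cp -> {
            if (cp == 'h') {
                out.appendCodePoint(0x210E); // italic h lives outside the block
            } else if (cp >= 'a' && cp <= 'z') {
                out.appendCodePoint(0x1D44E + (cp - 'a'));
            } else if (cp >= 'A' && cp <= 'Z') {
                out.appendCodePoint(0x1D434 + (cp - 'A'));
            } else {
                out.appendCodePoint(cp); // spaces, combining marks, etc.
            }
        });
        return out.toString();
    }
}
```

Calling `MathItalic.toMathItalic("politische ökologie")` yields the same sequence of code points as the expected value in the test case above.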
