ICML differentiates between hyperlinks (external links) and cross-references (document-internal links). Pandoc's ICML writer does not correctly write document-internal links. I will summarize below the output changes that are needed to fix this; the full input and output files for the markup below can be found in the related topic on the pandoc-discuss list (see link-citations in ICML writer). It is likely not a simple one-line change but does not require a massive overhaul of the entire writer either, only the part that writes links. I don't know enough about Haskell to fix the writer myself, though I may try if nobody else wants to fix this.
There are essentially four things to change, enumerated below:
1. A <CrossReferenceFormat> element should be present just before the <Story> element. The following example worked in my test:
<CrossReferenceFormat Self="u1" Name="Text Anchor Name">
<BuildingBlock Self="u1BuildingBlock0" BlockType="BookmarkNameBuildingBlock" CustomText="$ID/" AppliedDelimiter="$ID/" IncludeDelimiter="false" />
</CrossReferenceFormat>
2. The source text of an internal link should be wrapped in a <CrossReferenceSource> element, NOT a <HyperlinkTextSource> element (which is for external links only). The Self attribute of the <CrossReferenceFormat> tag (from number 1 above) should be referenced in the AppliedFormat attribute of <CrossReferenceSource> tags, and the Name attribute of <CrossReferenceSource> tags should have a relevant value. For example, current output:
<HyperlinkTextSource Self="htss-1" Name="" Hidden="false">
<CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Cite Link">
<Content>2017</Content>
</CharacterStyleRange>
</HyperlinkTextSource>
Desired output:
<CrossReferenceSource Self="htss-1" AppliedFormat="u1" Name="2017" Hidden="false">
<CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Cite Link">
<Content>2017</Content>
</CharacterStyleRange>
</CrossReferenceSource>
3. The destination of an internal link should be written as a <HyperlinkTextDestination> element, NOT a <HyperlinkURLDestination> element (which is for external links only). <HyperlinkTextDestination> elements for internal-link destination points should be written at the relevant points in the document, NOT at end of file, and the Name attribute of <HyperlinkTextDestination> tags should have a relevant value. For example, current output (with the corresponding <Hyperlink> element at end of file):
<HyperlinkURLDestination Self="HyperlinkURLDestination/#ref-Citation1%3a2017" Name="link" DestinationURL="#ref-Citation1:2017" DestinationUniqueKey="1" />
<Hyperlink Self="uf-1" Name="#ref-Citation1:2017" Source="htss-1" Visible="true" DestinationUniqueKey="1">
<Properties>
<BorderColor type="enumeration">Black</BorderColor>
<Destination type="object">HyperlinkURLDestination/#ref-Citation1%3a2017</Destination>
</Properties>
</Hyperlink>
Desired output (in this example, written just before the relevant bibliography entry):
<HyperlinkTextDestination Self="HyperlinkTextDestination/#ref-Citation1%3a2017" Name="#ref-Citation1:2017" Hidden="false" DestinationUniqueKey="9" />
<Content>Last1, F. (2017). </Content>
...with the corresponding <Hyperlink> element written at end of file, and without the DestinationURL attribute (which is for external links only):
<Hyperlink Self="uf-1" Name="#ref-Citation1:2017" Source="htss-1" Visible="true" DestinationUniqueKey="9">
<Properties>
<BorderColor type="enumeration">Black</BorderColor>
<Destination type="object">HyperlinkTextDestination/#ref-Citation1%3a2017</Destination>
</Properties>
</Hyperlink>
4. The DestinationUniqueKey attribute shared by each <Hyperlink> tag and its corresponding <HyperlinkURLDestination> or <HyperlinkTextDestination> tag (for external and internal links respectively) should be unique to that pair, NOT the same for every link. For example, current output has <Hyperlink> tags with identical DestinationUniqueKey attributes:
<HyperlinkURLDestination Self="HyperlinkURLDestination/https%3a//pandoc.org/MANUAL.html" Name="link" DestinationURL="https://pandoc.org/MANUAL.html" DestinationUniqueKey="1" />
<Hyperlink Self="uf-9" Name="https://pandoc.org/MANUAL.html" Source="htss-9" Visible="true" DestinationUniqueKey="1">
<Properties>
<BorderColor type="enumeration">Black</BorderColor>
<Destination type="object">HyperlinkURLDestination/https%3a//pandoc.org/MANUAL.html</Destination>
</Properties>
</Hyperlink>
<HyperlinkURLDestination Self="HyperlinkURLDestination/https%3a//www.adobe.com/InDesign" Name="link" DestinationURL="https://www.adobe.com/InDesign" DestinationUniqueKey="1" />
<Hyperlink Self="uf-8" Name="https://www.adobe.com/InDesign" Source="htss-8" Visible="true" DestinationUniqueKey="1">
<Properties>
<BorderColor type="enumeration">Black</BorderColor>
<Destination type="object">HyperlinkURLDestination/https%3a//www.adobe.com/InDesign</Destination>
</Properties>
</Hyperlink>
<HyperlinkURLDestination Self="HyperlinkURLDestination/#de-optimo-modo-percipiendi" Name="link" DestinationURL="#de-optimo-modo-percipiendi" DestinationUniqueKey="1" />
<Hyperlink Self="uf-7" Name="#de-optimo-modo-percipiendi" Source="htss-7" Visible="true" DestinationUniqueKey="1">
<Properties>
<BorderColor type="enumeration">Black</BorderColor>
<Destination type="object">HyperlinkURLDestination/#de-optimo-modo-percipiendi</Destination>
</Properties>
</Hyperlink>
Desired output has <Hyperlink> tags with unique DestinationUniqueKey attributes (and the last <Hyperlink> tag, because it is an internal link, correctly lacks a companion <HyperlinkURLDestination> element):
<HyperlinkURLDestination Self="HyperlinkURLDestination/https%3a//pandoc.org/MANUAL.html" Name="https%3a//pandoc.org/MANUAL.html" DestinationURL="https://pandoc.org/MANUAL.html" DestinationUniqueKey="1" />
<Hyperlink Self="uf-9" Name="https://pandoc.org/MANUAL.html" Source="htss-9" Visible="true" DestinationUniqueKey="1">
<Properties>
<BorderColor type="enumeration">Black</BorderColor>
<Destination type="object">HyperlinkURLDestination/https%3a//pandoc.org/MANUAL.html</Destination>
</Properties>
</Hyperlink>
<HyperlinkURLDestination Self="HyperlinkURLDestination/https%3a//www.adobe.com/InDesign" Name="https%3a//www.adobe.com/InDesign" DestinationURL="https://www.adobe.com/InDesign" DestinationUniqueKey="2" />
<Hyperlink Self="uf-8" Name="https://www.adobe.com/InDesign" Source="htss-8" Visible="true" DestinationUniqueKey="2">
<Properties>
<BorderColor type="enumeration">Black</BorderColor>
<Destination type="object">HyperlinkURLDestination/https%3a//www.adobe.com/InDesign</Destination>
</Properties>
</Hyperlink>
<Hyperlink Self="uf-7" Name="#de-optimo-modo-percipiendi" Source="htss-7" Visible="true" DestinationUniqueKey="3">
<Properties>
<BorderColor type="enumeration">Black</BorderColor>
<Destination type="object">HyperlinkTextDestination/#de-optimo-modo-percipiendi</Destination>
</Properties>
</Hyperlink>
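The end-of-file part of points 3 and 4 can be sketched in a few lines of Python. This is only an illustration of the desired output, not pandoc's writer code (which is Haskell). The function names, the rule that a target starting with "#" is internal, the escaping of only ":" as "%3a", and the uf-N / htss-N numbering are all assumptions made for this example, and each target is assumed to appear only once.

```python
import xml.etree.ElementTree as ET


def is_internal(target):
    """Assumption: a target that starts with '#' is document-internal."""
    return target.startswith("#")


def escape_self(target):
    # Assumption: only ':' is escaped (as '%3a'), mirroring the examples
    # in this report; real ICML output may need more escaping.
    return target.replace(":", "%3a")


def build_hyperlinks(targets):
    """Return (url_destinations, hyperlinks) as lists of XML elements.

    Every link gets its own DestinationUniqueKey. Only external links get
    a HyperlinkURLDestination here; the HyperlinkTextDestination of an
    internal link belongs in the story, at the point being linked to.
    """
    url_destinations, hyperlinks = [], []
    for key, target in enumerate(targets, start=1):
        internal = is_internal(target)
        kind = "HyperlinkTextDestination" if internal else "HyperlinkURLDestination"
        dest_self = f"{kind}/{escape_self(target)}"
        if not internal:
            url_destinations.append(ET.Element(kind, {
                "Self": dest_self,
                "Name": escape_self(target),
                "DestinationURL": target,
                "DestinationUniqueKey": str(key),
            }))
        link = ET.Element("Hyperlink", {
            "Self": f"uf-{key}",        # hypothetical numbering scheme
            "Name": target,
            "Source": f"htss-{key}",    # hypothetical numbering scheme
            "Visible": "true",
            "DestinationUniqueKey": str(key),
        })
        props = ET.SubElement(link, "Properties")
        ET.SubElement(props, "BorderColor", {"type": "enumeration"}).text = "Black"
        ET.SubElement(props, "Destination", {"type": "object"}).text = dest_self
        hyperlinks.append(link)
    return url_destinations, hyperlinks


targets = [
    "https://pandoc.org/MANUAL.html",
    "https://www.adobe.com/InDesign",
    "#de-optimo-modo-percipiendi",
]
url_destinations, hyperlinks = build_hyperlinks(targets)
for element in url_destinations + hyperlinks:
    print(ET.tostring(element, encoding="unicode"))
```

The key handed to an internal link would also have to be written on the matching <HyperlinkTextDestination> inside the story, which this sketch does not cover.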
Thanks for the excellent bug report!
Thanks a lot for your report, exactly the info I needed!
I've started implementing this... most of your points should be implemented, but I haven't looked at the tests yet. A few questions:
Is there an equivalent to the CrossReferenceFormat thing for external links as well? Currently, we don't have that. Seems like it's used only for styling the link with the AppliedFormat attribute?
<HyperlinkTextDestination> elements for internal-link destination points should be written at relevant points in the document, NOT at end of file
What's the reason for this? Since the <HyperlinkTextDestination> is the analogue of the <HyperlinkURLDestination>, wouldn't it make sense to place them both at the end? (Or both right after the corresponding link source in the body text?) This is also easier to do in the current source code...
the Name attribute of <CrossReferenceSource> tags should have a relevant value
What is this used for? Is this only for the GUI somewhere? What's a good value? Does the same go for external links? Currently, it's set to the title of the link or the empty string. Try for example this markdown: [link text](http://pandoc.org "my link title"). I see I couldn't make my mind up about this: the Name attribute on the <Hyperlink> element is actually the url/href. So what's the name used for there?
Is there an equivalent to the CrossReferenceFormat thing for external links as well? Currently, we don't have that. Seems like it's used only for styling the link with the AppliedFormat attribute?
<CrossReferenceFormat> is only for internal links (cross-references). InDesign usually uses this to automatically update certain cross-reference tags that it calls "building blocks" (hence the <BuildingBlock> element within the <CrossReferenceFormat> element); for example, this could include page numbers, paragraph numbers, or other variables that are automatically updated in the text. In InDesign's GUI, users can edit these building blocks. In my tests, the <CrossReferenceFormat> element (and corresponding AppliedFormat attribute of <CrossReferenceSource> tags) appeared to be necessary for cross-references to work, even though the content of <CrossReferenceSource> chosen here is not something that InDesign can automatically update.
When the markup chosen here (the "desired-output.icml" file that I provided on pandoc-discuss) is imported into InDesign, an "update" icon appears next to each cross-reference in the Hyperlinks pane. The InDesign reference manual (p. 434) explains: "An update icon indicates that the cross-reference destination text has changed or that the cross-reference source text has been edited." This is because in the file "desired-output.icml", Pandoc's ICML writer has "edited" (written) the cross-reference source text to be whatever is in the <Content> element of the relevant <CrossReferenceSource> element, instead of one of InDesign's building blocks. For example:
<CrossReferenceSource Self="htss-1" AppliedFormat="u1" Name="2017" Hidden="false">
<CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Cite Link">
<Content>2017</Content>
</CharacterStyleRange>
</CrossReferenceSource>
I can't see any better way to do this given InDesign's limitations, and it works well enough.
<HyperlinkTextDestination> elements for internal-link destination points should be written at relevant points in the document, NOT at end of file
What's the reason for this? Since the <HyperlinkTextDestination> is the analogue of the <HyperlinkURLDestination>, wouldn't it make sense to place them both at the end? (Or both right after the corresponding link source in the body text?) This is also easier to do in the current source code...
This has a simple answer: <HyperlinkTextDestination> is the analogue of the <HyperlinkURLDestination>, but since <HyperlinkTextDestination> represents an internal destination point, it needs to be written at the point in the file at which the internal link points. <HyperlinkURLDestination> is written at end of file because it doesn't "point inside" the document; it points at an external URL.
Here's an analogy to HTML that comes to mind: the <HyperlinkTextDestination> tag is analogous to named anchor tags in HTML4, where <a href="#value">text</a> points to named anchor tag <a name="value" /> at some other point in the document. Pandoc's HTML and DOCX writers, for example, do this correctly, so there may be some clue about how to implement this in those writers.
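To make the placement concrete, here is a rough Python sketch of the idea (pandoc's writers are Haskell, so this is only an illustration). It walks a simplified list of blocks and writes a <HyperlinkTextDestination> immediately before the content of any block whose identifier is the target of an internal link. The (identifier, text) block model, the function name, and the escaping of only ":" as "%3a" are assumptions for this example.

```python
from xml.sax.saxutils import escape, quoteattr


def render_blocks(blocks, link_targets):
    """Render simplified blocks, placing anchors where links point.

    blocks: list of (identifier, text) pairs; identifier may be "".
    link_targets: set of '#identifier' strings used by internal links.
    Returns (lines, key_for), where key_for maps each anchor to the
    DestinationUniqueKey that the matching <Hyperlink> must reuse.
    """
    lines = []
    key_for = {}
    for identifier, text in blocks:
        anchor = "#" + identifier if identifier else None
        if anchor in link_targets:
            key = len(key_for) + 1
            key_for[anchor] = key
            # Assumption: only ':' is escaped in Self, as in the thread's examples.
            dest_self = "HyperlinkTextDestination/" + anchor.replace(":", "%3a")
            lines.append(
                f'<HyperlinkTextDestination Self={quoteattr(dest_self)} '
                f'Name={quoteattr(anchor)} Hidden="false" '
                f'DestinationUniqueKey="{key}" />'
            )
        lines.append(f"<Content>{escape(text)}</Content>")
    return lines, key_for


blocks = [
    ("", "Some body text citing a source."),
    ("ref-Citation1:2017", "Last1, F. (2017). "),
]
lines, key_for = render_blocks(blocks, {"#ref-Citation1:2017"})
print("\n".join(lines))
```

The returned key_for mapping is what an end-of-file pass would need in order to write each internal <Hyperlink> with the same DestinationUniqueKey as its destination.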
There is another relevant way of doing internal links in ICML using <ParagraphDestination> instead of <HyperlinkTextDestination> but that would involve changing values of other elements, and I didn't want to complicate things by presenting a completely different option that would not be any easier to implement.
the Name attribute of <CrossReferenceSource> tags should have a relevant value
What is this used for? Is this only for the GUI somewhere? What's a good value? Does the same go for external links? Currently, it's set to the title of the link or the empty string. Try for example this markdown: [link text](http://pandoc.org "my link title"). I see I couldn't make my mind up about this: the Name attribute on the <Hyperlink> element is actually the url/href. So what's the name used for there?
This value is used by InDesign in several link-related dialogue boxes ("Hyperlink Options...", "Cross-Reference Options...", etc.) in a drop-down menu that lists all the links.
It appears that the best value for the Name attribute of a <CrossReferenceSource> tag would be the same as the Name attribute of the corresponding <Hyperlink> tag. Several of the Name values that I provided in the file "desired-output.icml" are wrong; instead of <CrossReferenceSource Self="htss-1" AppliedFormat="u1" Name="2017" Hidden="false"> it would be <CrossReferenceSource Self="htss-1" AppliedFormat="u1" Name="#ref-Citation1:2017" Hidden="false">. So, yes, it would be the same as the value of the href attribute in HTML.
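As a small illustration of that correction, the following Python sketch builds a <CrossReferenceSource> whose Name is the link target itself. The function name and parameters are hypothetical, and the default format id "u1" and the "Cite Link" character style are simply copied from the examples in this thread.

```python
import xml.etree.ElementTree as ET


def cross_reference_source(index, href, text, format_id="u1"):
    """Build a CrossReferenceSource whose Name equals the link target.

    index, href, text and format_id are hypothetical parameters; the
    'htss-N' Self value follows the numbering seen in the examples.
    """
    source = ET.Element("CrossReferenceSource", {
        "Self": f"htss-{index}",
        "AppliedFormat": format_id,   # Self of the CrossReferenceFormat
        "Name": href,                 # same value as the Hyperlink's Name
        "Hidden": "false",
    })
    style_range = ET.SubElement(source, "CharacterStyleRange", {
        "AppliedCharacterStyle": "CharacterStyle/Cite Link",
    })
    ET.SubElement(style_range, "Content").text = text
    return source


element = cross_reference_source(1, "#ref-Citation1:2017", "2017")
print(ET.tostring(element, encoding="unicode"))
```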
Thanks so much for working on this.
it needs to be written at the point in the file at which the internal link points.
Ah, that's what I was missing. Yes, that makes sense, but is somewhat tricky to implement, since a lot of elements can be linked to (every element that can have an id attribute). I'll take a look...
A year after opening this issue, I just wanted to note that I think this issue is still worth working on, and I am willing to help in any way that I can!
I just discovered this as well and would be willing to do what I can to help. Is anyone actively working on the ICML writer?
I wrote most of the ICML writer a long time ago... since then, I haven't been actively working on it... but pulls welcome! I'm also happy to answer any questions...
Looked over the code last night...
One question @mb21, have you ever tried to use a Lua filter to write direct ICML? I've used them for other formats incl. docx and HTML, but wonder if it would work here as well because of the XML...
not sure I understand your question... most of pandoc is written in Haskell, which is a much nicer programming language for big projects. to do a little bit of AST transformations, lua is great though.
btw. if you want a lua writer, you should look at https://pandoc.org/MANUAL.html#custom-writers
Spent some time looking at this today, and there appear to be two possible approaches to getting internal links working in ICML:
1 - there is the cross-reference approach that @nathan-artist mentions above
2 - there is a simpler "text anchor" model, which maps fairly well to the HTML or Docx models of bookmarks/anchors.
The first two could be done with a pandoc filter (I've done it for docx and could easily adjust for ICML), but since those only allow changes to the content, it's not possible to do #3. That has to be done in the writer itself. I looked at the writer code and while I can read and understand it - haskell is just quirky enough that I might break something.
thanks for the info! I probably won't have time to implement this anytime soon, but maybe someone else wants to give it a shot? Or a first step would be to figure out some example XML that we'd need to generate...
haskell is just quirky enough that I might break something.
the compiler and test-suite most probably will catch it :-)
Thanks @lrosenthol, I am looking forward to testing this when the PR is accepted!