Similar to https://github.com/jgm/pandoc/issues/1793 for PDF generation, it would be nice to automatically convert SVG images to PNG (or any other Word-supported format) when going from Markdown to docx.
Hi, the latest version of Word 2016 finally supports SVG:
https://support.office.com/en-us/article/edit-svg-images-in-microsoft-office-2016-69f29d39-194a-4072-8c35-dbe5e7ea528c?ui=en-US&rs=en-US&ad=US#id0eaafaaa=office_2016_on_mac
Thanks for the pointer!
It took them a while for a 1999 standard, and it seems it's only for those with active Office 365 subscriptions.
Anyway, better late than never. Nice!
Posting the relevant part of the attached docx here for reference. Apparently, the docx contains both the svg and a fallback png.
<w:drawing>
<wp:inline distT="0" distB="0" distL="0" distR="0" wp14:anchorId="589F1FF2" wp14:editId="553CEC71">
<wp:extent cx="5943600" cy="3608070"/>
<wp:effectExtent l="0" t="0" r="0" b="0"/>
<wp:docPr id="1" name="Graphic 1"/>
<wp:cNvGraphicFramePr>
<a:graphicFrameLocks xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" noChangeAspect="1"/>
</wp:cNvGraphicFramePr>
<a:graphic xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
<a:graphicData uri="http://schemas.openxmlformats.org/drawingml/2006/picture">
<pic:pic xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture">
<pic:nvPicPr>
<pic:cNvPr id="1" name="cars.svg"/>
<pic:cNvPicPr/>
</pic:nvPicPr>
<pic:blipFill>
<a:blip r:embed="rId8"> <!-- pretty sure this is the file Id of the PNG -->
<a:extLst>
<a:ext uri="{28A0092B-C50C-407E-A947-70E740481C1C}">
<a14:useLocalDpi xmlns:a14="http://schemas.microsoft.com/office/drawing/2010/main" val="0"/>
</a:ext>
<a:ext uri="{96DAC541-7B7A-43D3-8B79-37D633B846F1}"> <!-- this is a constant, identifying the svg extension -->
<asvg:svgBlip xmlns:asvg="http://schemas.microsoft.com/office/drawing/2016/SVG/main"
r:embed="rId9"/> <!-- pretty sure this is the file Id of the SVG -->
</a:ext>
</a:extLst>
</a:blip>
<a:stretch>
<a:fillRect/>
</a:stretch>
</pic:blipFill>
<pic:spPr>
<a:xfrm>
<a:off x="0" y="0"/>
<a:ext cx="5943600" cy="3608070"/>
</a:xfrm>
<a:prstGeom prst="rect">
<a:avLst/>
</a:prstGeom>
</pic:spPr>
</pic:pic>
</a:graphicData>
</a:graphic>
</wp:inline>
</w:drawing>
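As a rough illustration of the part that matters in the XML above, here is a minimal Haskell sketch that renders just the a:blip element with two distinct relationship ids, one for the fallback PNG and one for the SVG. The function name svgBlipXml and the plain-string rendering are hypothetical and only for illustration (pandoc's docx writer builds XML nodes instead); the extension URI and namespace are copied from the sample. The ids are not escaped, so this assumes they are simple tokens such as rId8.

```haskell
-- | Extension URI that marks the SVG extension (copied from the sample above).
svgExtensionUri :: String
svgExtensionUri = "{96DAC541-7B7A-43D3-8B79-37D633B846F1}"

-- | Namespace of the asvg:svgBlip element (copied from the sample above).
svgNamespace :: String
svgNamespace = "http://schemas.microsoft.com/office/drawing/2016/SVG/main"

-- | Render an a:blip element whose main r:embed points at the fallback PNG
-- and whose extension list points at the SVG.
-- Hypothetical helper: ids are inserted verbatim, without XML escaping.
svgBlipXml :: String -> String -> String
svgBlipXml pngRelId svgRelId = concat
  [ "<a:blip r:embed=\"", pngRelId, "\">"
  , "<a:extLst>"
  , "<a:ext uri=\"", svgExtensionUri, "\">"
  , "<asvg:svgBlip xmlns:asvg=\"", svgNamespace, "\" r:embed=\"", svgRelId, "\"/>"
  , "</a:ext>"
  , "</a:extLst>"
  , "</a:blip>"
  ]
```

With the sample's ids this would be called as svgBlipXml "rId8" "rId9".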
Since we need to generate a fallback PNG for this anyway, this seems as good a time as any to factor out the SVG-to-PNG rendering feature that was added for LaTeX/PDF output and generalize it to docx and EPUB output (https://github.com/jgm/pandoc/issues/2766).
btw. see https://github.com/jgm/pandoc/pull/2211 for some code that may be usable.
@jgm What do you think of factoring out the rsvg-convert-related code from PDF.hs? I'm not sure where it should go: in Text.Pandoc.Shared (where PDF.hs lives) or Text.Pandoc.Writers.Shared (where EPUB.hs etc. are). Or even Text.Pandoc.ImageSize which already has morphed into a kind of general-purpose image module, or a new module?
The nice thing is that even older Word (2013) can open documents with SVGs (I guess it falls back to showing the PNG).
Are you talking about convertImage from the PDF module, or are you thinking of something more special-purpose, just for SVG?
I don't remember, it's been a while ;-) But yes, maybe LaTeX and Word understand about the same subset of image formats, then we could just move the whole convertImage function...
As a heavy user of the docx output format, I'm really interested in this issue. It would allow outputting better quality documents.
The issue is still there: when converting Markdown to PDF, the SVGs are embedded, but in DOCX files the images are not present, only placeholders (Broken image...). Is there any way to deal with Word documents?
It would be good to do something here. Note that convertImage from the PDF module currently has type
convertImage :: WriterOptions -> FilePath -> FilePath
-> IO (Either Text FilePath)
It creates a file. I think we should create a new unexported module Text.Pandoc.Image with
convertImage :: WriterOptions
-> MimeType -- ^ Input mime type
-> ByteString -- ^ Input image as bytestring
-> MimeType -- ^ Desired output mime type
-> IO (Either Text ByteString)
This would be more suitable for use in Word, since we don't have a tmp dir. We'd need a bit of code around this in Text.Pandoc.PDF, but it would be simple.
Note: rsvg-convert can be used as a pipe.
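To make the pipe idea concrete, here is a minimal sketch of a bytestring-in, bytestring-out conversion that needs no temporary directory. It is not pandoc's actual code: pipeBytes and svgToPngSketch are hypothetical names, the error type is a plain String rather than Text, WriterOptions is left out, and the "-f png" arguments are an assumption about how rsvg-convert would be invoked when reading from stdin.

```haskell
import Control.Concurrent (forkIO)
import Control.Exception (IOException, try)
import qualified Data.ByteString as B
import System.Exit (ExitCode (..))
import System.IO (hClose)
import System.Process
  ( CreateProcess (std_in, std_out), StdStream (CreatePipe)
  , proc, waitForProcess, withCreateProcess )

-- | Feed bytes to a command's stdin and collect its stdout (hypothetical helper).
pipeBytes :: FilePath -> [String] -> B.ByteString -> IO (Either String B.ByteString)
pipeBytes cmd args input = do
  let spec = (proc cmd args) { std_in = CreatePipe, std_out = CreatePipe }
  result <- try $ withCreateProcess spec $ \mIn mOut _ ph ->
    case (mIn, mOut) of
      (Just hIn, Just hOut) -> do
        -- Write stdin on a separate thread so a full pipe cannot deadlock us;
        -- write errors (e.g. the process exited early) are ignored here.
        _ <- forkIO $ do
          _ <- try (B.hPut hIn input >> hClose hIn) :: IO (Either IOException ())
          pure ()
        out <- B.hGetContents hOut   -- strict read until EOF
        code <- waitForProcess ph
        pure (code, out)
      _ -> ioError (userError "could not create pipes")
  pure $ case (result :: Either IOException (ExitCode, B.ByteString)) of
    Left err -> Left (show err)
    Right (ExitSuccess, out) -> Right out
    Right (ExitFailure n, _) -> Left (cmd ++ " exited with code " ++ show n)

-- | SVG bytes in, PNG bytes out. The arguments are an assumption.
svgToPngSketch :: B.ByteString -> IO (Either String B.ByteString)
svgToPngSketch = pipeBytes "rsvg-convert" ["-f", "png"]
```

A missing rsvg-convert binary surfaces as a Left value rather than an exception, so a caller could emit a warning and keep the original image.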
Or maybe Text.Pandoc.ImageSize should be folded into the new Text.Pandoc.Image (which would then need to be exported). ImageSize contains an ImageType type, which could be used instead of MimeType in the new convertImage.
I added an unexported module Text.Pandoc.Image with svgToPng.
I'm confused. Is this a milestone for the next release, or is it not?
Either way, I'm really happy to see that you folks are working on this.
hm... I was taking a stab at this... and svgToPng in Text.Pandoc.Image is great... but we cannot run it in the Pandoc Monad.. :S
for the record, this is how far I got:
diff --git a/src/Text/Pandoc/Writers/Docx.hs b/src/Text/Pandoc/Writers/Docx.hs
index 2caba59cc..d5403e65b 100644
--- a/src/Text/Pandoc/Writers/Docx.hs
+++ b/src/Text/Pandoc/Writers/Docx.hs
@@ -44,6 +44,7 @@ import Text.Pandoc.Definition
import Text.Pandoc.Generic
import Text.Pandoc.Highlighting (highlight)
import Text.Pandoc.Error
+import Text.Pandoc.Image (svgToPng)
import Text.Pandoc.ImageSize
import Text.Pandoc.Logging
import Text.Pandoc.MIME (MimeType, extensionFromMimeType, getMimeType,
@@ -1328,7 +1329,12 @@ inlineToOpenXML' opts (Image attr@(imgident, _, _) alt (src, title)) = do
imgs <- gets stImages
let
stImage = M.lookup (T.unpack src) imgs
- generateImgElt (ident, _, _, img) =
+ svgBlip ident = mknode "a:extLst" [] $
+ mknode "a:ext" [("uri", "{96DAC541-7B7A-43D3-8B79-37D633B846F1}")] $
+ mknode "asvg:svgBlip" [
+ ("xmlns:asvg", "http://schemas.microsoft.com/office/drawing/2016/SVG/main")
+ , ("r:embed", ident) ] ()
+ generateImgElt (ident, _, mbMimeType, img) =
let
(xpt,ypt) = desiredSizeInPoints opts attr
(either (const def) id (imageSize opts img))
@@ -1343,7 +1349,10 @@ inlineToOpenXML' opts (Image attr@(imgident, _, _) alt (src, title)) = do
[("descr",T.unpack src),("id","0"),("name","Picture")] ()
, cNvPicPr ]
blipFill = mknode "pic:blipFill" []
- [ mknode "a:blip" [("r:embed",ident)] ()
+ [ mknode "a:blip" [("r:embed",ident)] $
+ case mbMimeType of
+ Just "image/svg+xml" -> [svgBlip ident]
+ _ -> []
, mknode "a:stretch" [] $
mknode "a:fillRect" [] ()
]
@@ -1414,6 +1423,8 @@ inlineToOpenXML' opts (Image attr@(imgident, _, _) alt (src, title)) = do
else do
-- insert mime type to use in constructing [Content_Types].xml
modify $ \st -> st { stImages = M.insert (T.unpack src) imgData $ stImages st }
+
+ svgToPng opts $ toLazy img
return [generateImgElt imgData]
)
`catchError` ( \e -> do
but we cannot run it in the Pandoc Monad.
svgToPng may be too specialized to make a method of PandocMonad.
One possibility would be to add a method to PandocMonad class
ioWithFallback :: PandocMonad m => a -> IO a -> m a
This would be implemented in PandocIO by simply running the IO action.
In PandocPure it would simply return the fallback.
With this we could easily integrate svgToPng.
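A minimal sketch of how such a method could behave, using a stand-alone class instead of the real PandocMonad so that it compiles on its own. The class name FallbackIO is hypothetical, and Identity stands in for PandocPure here; only the shape of ioWithFallback is taken from the proposal above.

```haskell
import Data.Functor.Identity (Identity (..))

-- | Stand-in for the proposed PandocMonad method (hypothetical class).
class Monad m => FallbackIO m where
  -- | Run the IO action if the monad can do IO, else return the fallback.
  ioWithFallback :: a -> IO a -> m a

-- | The IO-capable case: simply run the action (as PandocIO would).
instance FallbackIO IO where
  ioWithFallback _ action = action

-- | The pure case: ignore the action and return the fallback
-- (Identity plays the role of PandocPure in this sketch).
instance FallbackIO Identity where
  ioWithFallback fallback _ = pure fallback
```

A writer could then call something like ioWithFallback Nothing (Just <$> convert svgBytes), where convert is whatever IO conversion is available, and skip the PNG fallback when it gets Nothing back.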
Thoughts? @tarleb @jkr
I'm not sure. Seems like a reasonably clean and pragmatic solution, but feels a bit weird, too.
Questions that came to mind, in no particular order:
Could a Text.Pandoc.App.Transform be a viable option to perform the conversion?
MonadFallbackIO or the like could allow for finer-grained effects handling, but might be too complicated.
Would an IOException in the IO action trigger the fallback, or would it bubble up in the form of a PandocError?
I suppose we'd want to trap exceptions in the IO action and raise a PandocError. The fallback would just be for cases that can't perform IO. If you wanted to return the fallback if there were IO exceptions, you could just handle the exception yourself.
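Trapping exceptions in the IO action could look roughly like the following sketch. SketchError is a hypothetical stand-in for PandocError, and trapIO is a made-up name; the only real API used is try from Control.Exception.

```haskell
import Control.Exception (IOException, try)

-- | Hypothetical stand-in for PandocError.
newtype SketchError = SketchIOError String
  deriving (Show, Eq)

-- | Run an IO action, turning any IOException into an error value
-- instead of letting it escape or silently using a fallback.
trapIO :: IO a -> IO (Either SketchError a)
trapIO action = do
  result <- try action
  pure $ case result of
    Left e  -> Left (SketchIOError (show (e :: IOException)))
    Right x -> Right x
```

Under this reading, the fallback value is used only where IO is impossible; a caller who prefers the fallback on failure can pattern match on the Left case.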
Could a Text.Pandoc.App.Transform be a viable option to perform the conversion?
Ah, I see; you mean do a pass through the AST first, converting SVGs to PNGs, before rendering the AST? This might be a bit less performant, but we could do it without changes to the Class API.
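Such a pre-pass might look like the sketch below. It uses a toy Inline type so that it is self-contained; the type, convertSvgImages, and isSvg are all hypothetical. With the real AST, one would presumably use walkM from Text.Pandoc.Walk over the Image constructor instead.

```haskell
import Data.Char (toLower)
import Data.List (isSuffixOf)

-- | Toy inline AST (hypothetical; much smaller than pandoc's real one).
data Inline
  = Str String
  | Emph [Inline]
  | Image FilePath
  deriving (Show, Eq)

-- | Rewrite every SVG image with the given conversion action,
-- leaving all other nodes untouched.
convertSvgImages :: Monad m => (FilePath -> m FilePath) -> [Inline] -> m [Inline]
convertSvgImages convert = mapM go
  where
    go (Image src) | isSvg src = Image <$> convert src
    go (Emph xs)               = Emph <$> mapM go xs
    go other                   = pure other

-- | Decide by file extension only (an assumption; a MIME type would be more robust).
isSvg :: FilePath -> Bool
isSvg path = ".svg" `isSuffixOf` map toLower path
```

Because the conversion is passed in as a parameter, the pass itself stays pure in structure and the IO happens once, before any writer runs.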