Thursday, 16 November 2017

Convert HTML to Docx and Pdf using Microsoft.Office.Interop.Word

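// requires: using System.IO; using Microsoft.Office.Interop.Word;
// use _Application and _Document instead of var to avoid ambiguity
// between same-named methods and events (e.g. Close, Quit)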
_Application word = new Application();
_Document document = word.Documents.Open(FileName: @"C:\index.html", ConfirmConversions: false, ReadOnly: false);

word.Visible = false;

// impersonated user must have read-write access to the following locations

var pdfOutputFile = @"C:\Temp\document.pdf";
var docOutputFile = @"C:\Temp\document.docx";

byte[] Content = null;

.
.
.
case "pdf":
document.SaveAs2(FileName: pdfOutputFile, FileFormat: WdSaveFormat.wdFormatPDF);
document.Close();
word.Quit();
Content = File.ReadAllBytes(pdfOutputFile);
break;

case "docx":

foreach (InlineShape image in document.InlineShapes)
{
// make sure images are embedded
if (image.LinkFormat != null)
{
try
{
image.LinkFormat.SavePictureWithDocument = true;
image.LinkFormat.BreakLink();
}
catch { /* do nothing */ }
}
}
document.SaveAs2(FileName: docOutputFile, FileFormat: WdSaveFormat.wdFormatDocumentDefault);
document.Close();
word.Quit();
Content = File.ReadAllBytes(docOutputFile);
break;
.
.
.

// download file as byte[]
if (Content != null)
{
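// use whichever path was written above: pdfOutputFile or docOutputFile
var outputFile = pdfOutputFile; // or docOutputFile
// send only the file name, not the full server path
Response.AddHeader("content-disposition", "attachment; filename=" + Path.GetFileName(outputFile));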
Response.BufferOutput = true;
Response.OutputStream.Write(Content, 0, Content.Length);
Response.End();
}
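
The snippet above calls document.Close() and word.Quit() only on the success path, so an exception in Open or SaveAs2 can leave a WINWORD.EXE process running on the server. A minimal sketch of the same conversion wrapped in try/finally follows. The class name HtmlConverter and the method ConvertHtml are hypothetical names introduced for illustration, and releasing the COM objects with Marshal.ReleaseComObject is an assumption about what suits a server-side setup, not something the original code does.

```csharp
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Office.Interop.Word;

public static class HtmlConverter
{
    // Hypothetical helper: opens an HTML file in Word, saves it in the
    // requested format, and returns the saved file's bytes.
    public static byte[] ConvertHtml(string htmlPath, string outputPath, WdSaveFormat format)
    {
        _Application word = null;
        _Document document = null;
        try
        {
            word = new Application();
            word.Visible = false;
            document = word.Documents.Open(FileName: htmlPath, ConfirmConversions: false, ReadOnly: false);
            document.SaveAs2(FileName: outputPath, FileFormat: format);
        }
        finally
        {
            // Runs even if Open or SaveAs2 threw, so Word is always shut down.
            if (document != null)
            {
                document.Close(SaveChanges: WdSaveOptions.wdDoNotSaveChanges);
                Marshal.ReleaseComObject(document);
            }
            if (word != null)
            {
                word.Quit();
                Marshal.ReleaseComObject(word);
            }
        }
        // Read the file only after Word has closed it.
        return File.ReadAllBytes(outputPath);
    }
}
```

With this helper, the "pdf" case could reduce to Content = HtmlConverter.ConvertHtml(@"C:\index.html", pdfOutputFile, WdSaveFormat.wdFormatPDF); the "docx" case would still need its image-embedding loop before saving.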


Important notes: 


  1. For Word to be able to save the document, create a new "Desktop" directory inside "C:\Windows\SysWOW64\config\systemprofile\".
  2. The library does not work on Windows Server 2016 with Office 365 or Office 2016. On Server 2016, use Office 2013 instead.
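
The directory from note 1 can also be created from code, for example once at application start-up. The following is a minimal sketch, assuming the process has permission to write under the Windows directory (normally this requires an elevated account). WordProfileSetup and EnsureDesktopDirectory are hypothetical names, not part of any library.

```csharp
using System.IO;

public static class WordProfileSetup
{
    // Path taken from note 1 above.
    public const string SystemProfileDir = @"C:\Windows\SysWOW64\config\systemprofile";

    // Creates "Desktop" under profileDir if it is missing and returns its full path.
    // Directory.CreateDirectory does nothing when the directory already exists.
    public static string EnsureDesktopDirectory(string profileDir)
    {
        string desktop = Path.Combine(profileDir, "Desktop");
        Directory.CreateDirectory(desktop);
        return desktop;
    }
}
```

Calling WordProfileSetup.EnsureDesktopDirectory(WordProfileSetup.SystemProfileDir) before the first conversion is one way to use it. Taking the profile directory as a parameter keeps the helper usable against a different location.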


