Convert PDF to XML in C# using PDF Focus .Net library

PDF Focus .Net

A .NET assembly that provides an API to convert PDF to DOCX, RTF, HTML, XML, Text, Excel and Images in C# and other .NET languages.

Introduction

Let's see how to add a "PDF to XML" feature to any .NET application. To give your .NET application the ability to convert PDF documents to XML, first add a reference to the "SautinSoft.PdfFocus.dll" assembly. The download is 104.0 MB.

Let's take a look at a very straightforward example in C#:

            SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();
            f.XmlOptions.ConvertNonTabularDataToSpreadsheet = false;
            f.OpenPdf(@"c:\Table.pdf");
            f.ToXml(@"c:\Table.xml");
          

After running this code you will get an XML document produced from Table.pdf. Because the property "ConvertNonTabularDataToSpreadsheet" is set to false, all textual data is skipped. In other words, only tables are converted to XML.


<document>
  <page index="1">
    <table>
      <tgroup cols="5">
        <row>
          <entry rowspan="2">September</entry>
          <entry colspan="2">October</entry>
          <entry colspan="2">November</entry>
        </row>
      </tgroup>
    </table>
  </page>
</document>
                                  

Thus, by adjusting the component's options you can control what ends up in the resulting XML document.
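
Once the XML file exists, it can be read back with the standard LINQ to XML classes. The following is a minimal sketch, not part of PDF Focus .Net: the class and method names (TableXmlReader, ReadRows, ColumnSpan) are hypothetical, and it assumes the output uses the element names from the sample above (row, entry) with no XML namespace. If the real output declares a namespace, the element lookups would need to be adjusted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for reading table XML shaped like the sample above.
public static class TableXmlReader
{
    // Returns one list of cell texts per <row> element, in document order.
    public static List<List<string>> ReadRows(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Descendants("row")
                  .Select(row => row.Elements("entry")
                                    .Select(entry => entry.Value.Trim())
                                    .ToList())
                  .ToList();
    }

    // Sums the colspan of every <entry> in a row; a missing colspan counts as 1.
    public static int ColumnSpan(XElement row)
    {
        int total = 0;
        foreach (XElement entry in row.Elements("entry"))
        {
            XAttribute attr = entry.Attribute("colspan");
            int span;
            total += (attr != null && int.TryParse(attr.Value, out span)) ? span : 1;
        }
        return total;
    }
}
```

To process a file on disk, pass its contents, for example TableXmlReader.ReadRows(System.IO.File.ReadAllText(@"c:\Table.xml")). For the sample row above, ColumnSpan adds up to 5 (1 + 2 + 2), which matches the cols="5" attribute on the tgroup element.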


Download

To see this functionality firsthand, download the latest «PDF Focus .Net» with code examples (104.0 MB).

Limitations

The limitations of the free version of PDF Focus .Net are the trial notice "Created by unlicensed version of PDF Focus .Net" and the random addition of the word "TRIAL".


Examples of converting PDF to XML in C# and VB.Net

1. Convert PDF file to XML file in C#:


            string pathToPdf = @"c:\Table.pdf";
            string pathToXml = System.IO.Path.ChangeExtension(pathToPdf, ".xml");

            // Convert PDF file to XML file.
            SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();

            // This property is necessary only for the registered version.
            //f.Serial = "XXXXXXXXXXX";

            // Let's convert only tables to XML and skip all textual data.
            f.XmlOptions.ConvertNonTabularDataToSpreadsheet = false;

            f.OpenPdf(pathToPdf);

            if (f.PageCount > 0)
            {
                int result = f.ToXml(pathToXml);

                // Open the resulting XML document in the default application.
                if (result == 0)
                {
                    // UseShellExecute = true is needed on .NET Core to open a document by its path.
                    System.Diagnostics.Process.Start(
                        new System.Diagnostics.ProcessStartInfo(pathToXml) { UseShellExecute = true });
                }
            }
      

2. Convert PDF file to XML file in VB.Net:


        Dim pathToPdf As String = "c:\Table.pdf"
        Dim pathToXml As String = System.IO.Path.ChangeExtension(pathToPdf, ".xml")

        ' Convert PDF file to XML file.
        Dim f As New SautinSoft.PdfFocus()

        ' This property is necessary only for the registered version.
        'f.Serial = "XXXXXXXXXXX"

        ' Let's convert only tables to XML and skip all textual data.
        f.XmlOptions.ConvertNonTabularDataToSpreadsheet = False

        f.OpenPdf(pathToPdf)

        If f.PageCount > 0 Then
            Dim result As Integer = f.ToXml(pathToXml)

            ' Open the resulting XML document in the default application.
            If result = 0 Then
                ' UseShellExecute = True is needed on .NET Core to open a document by its path.
                System.Diagnostics.Process.Start(
                    New System.Diagnostics.ProcessStartInfo(pathToXml) With {.UseShellExecute = True})
            End If
        End If
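
The same steps extend naturally to several files. The sketch below is an illustration only, not part of PDF Focus .Net: BatchPdfToXml and ConvertAll are hypothetical names, and the conversion itself is passed in as a delegate so that the helper compiles without the library. It assumes, as the examples above suggest, that a result of 0 means success.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical batch helper; the actual conversion is supplied by the caller.
public static class BatchPdfToXml
{
    // convert: (pdfPath, xmlPath) -> result code, where 0 is assumed to mean success.
    // Returns the PDF paths whose conversion reported a non-zero result.
    public static List<string> ConvertAll(IEnumerable<string> pdfPaths,
                                          Func<string, string, int> convert)
    {
        var failed = new List<string>();
        foreach (string pdfPath in pdfPaths)
        {
            // Same naming rule as the examples above: Table.pdf -> Table.xml.
            string xmlPath = Path.ChangeExtension(pdfPath, ".xml");
            if (convert(pdfPath, xmlPath) != 0)
            {
                failed.Add(pdfPath);
            }
        }
        return failed;
    }

    // A delegate built from the calls shown in the examples above might look like:
    //   (pdf, xml) => {
    //       var f = new SautinSoft.PdfFocus();
    //       f.XmlOptions.ConvertNonTabularDataToSpreadsheet = false;
    //       f.OpenPdf(pdf);
    //       return f.PageCount > 0 ? f.ToXml(xml) : -1;   // -1 is our own "no pages" marker
    //   }
}
```

Keeping the converter behind a delegate also makes the loop easy to exercise with a stub before the licensed assembly is referenced.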
      

Requirements and Technical Information

Requires .NET Framework 4.0 or higher. The product is compatible with all .NET languages and supports every operating system on which .NET Framework or .NET Core can be used. PDF Focus .Net is written entirely in managed C#, which makes it a standalone, independent library.

Supported platforms:

  • .NET Framework 4.0, 4.5, 4.6.1 and higher. An older version for .NET 2.0 is also available.
  • .NET Standard 2.0
  • .NET Core 2.0 and higher


Multi-platform component: it runs wherever .NET Framework or .NET Core is available.


Our component has proven itself on cloud platforms and services:

  • Microsoft Azure
  • Amazon Web Services (AWS)
  • Google Cloud Platform
  • SharePoint
  • Docker
  • etc.

If you need a new code example or have a question, email us at support@sautinsoft.com.