SautinSoft is an Internet-oriented software development company
 
How to extract Text from PDF in C# or VB.Net?
 
PDF Focus .Net is a standalone, 100% managed C# library. It enables any .Net application to convert PDF to Word or RTF, extract Text from PDF, and rasterize PDF to Images.

The library consists of a single file, "SautinSoft.PdfFocus.dll". It is completely standalone and does not require Adobe Acrobat or MS Word to be installed.

To add the ability to convert PDF to Word to any .Net app (ASP.Net, WinForms, Console, etc.), you only have to copy the file "SautinSoft.PdfFocus.dll" into your Bin folder and add a reference to it.

PDF Focus .Net works in both 32-bit and 64-bit applications under .Net 2.0 and above. It can also run at the Medium trust level.

The paragraphs above describe PDF Focus .Net in general; the main topic of this page is how to extract Text from PDF in C# and VB.Net:

  1. Extract Text from custom pages of PDF file using C#
  2. Convert whole PDF document to Text in memory using C#
  3. Extract Text from all pages of PDF in ASP.Net/VB.Net
  4. Convert 1st page of PDF to Text in VB.Net

1. Extract Text from custom pages of PDF file using C#:

            SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();
	
            f.OpenPdf(@"d:\Cook Book.pdf");

            if (f.PageCount > 2)
            {
                //Convert only pages 2 to 3 to Text
                f.ToText(@"d:\Cook Book.txt", 2, 3);
            }

2. Convert whole PDF document to Text in memory using C#:

            SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();

            //Read the PDF into a byte array (the File class is in System.IO)
            byte[] pdf = File.ReadAllBytes(@"d:\Sample.pdf");

            f.OpenPdf(pdf);

            if (f.PageCount > 0)
            {
                string text = f.ToText();

                //Save the extracted text to a file
                File.WriteAllText(@"d:\Sample.txt", text);
            }

3. Extract Text from all pages of PDF in ASP.Net/VB.Net:

        Dim f As New SautinSoft.PdfFocus()
        Dim url As New Uri("http://www.website.com/sample.pdf")
	
        f.OpenPdf(url)

        If f.PageCount > 0 Then
            'Convert whole PDF to Text (extract text from PDF)
            Dim text As String = f.ToText()

            'Show the extracted text
            TextBox1.Text = text

        Else
            TextBox1.Text = "Converting failed!"
        End If

4. Convert 1st page of PDF to Text in VB.Net:

        Dim f As New SautinSoft.PdfFocus()

        'Read the PDF into a byte array (the File class is in System.IO)
        Dim pdf() As Byte = File.ReadAllBytes("d:\Simple.pdf")
        Dim text As String = ""

        f.OpenPdf(pdf)

        If f.PageCount > 0 Then
            text = f.ToText(1, 1)

            'show text
            If text <> "" Then
                TextBox1.Text = text
            End If
        End If

If you need another code sample showing how to extract Text from PDF in C#, VB.Net, ASP.Net, etc., email us at support@sautinsoft.com or ask in the Online Chat (bottom-right corner of this page). We will be glad to help!

   

Copyright © 2002 - 2012, SautinSoft (started from sautin.com). All rights reserved.