PDF Focus .Net is a standalone, 100% managed C# library. It enables any .Net application to convert PDF to Word or RTF, extract text from PDF, and rasterize PDF to images.
The library consists of a single file, "SautinSoft.PdfFocus.dll". It is fully standalone and doesn't require Adobe Acrobat or MS Word to be installed.
To add PDF-to-Word conversion to any .Net app (ASP.Net, WinForms, Console, etc.), you only have to copy the file "SautinSoft.PdfFocus.dll" into your Bin folder and add a reference to it.
PDF Focus .Net works in 32-bit and 64-bit applications under .Net 2.0 and above. It can also run at the Medium trust level.
The paragraphs above describe PDF Focus .Net in general, but the main point of this page is "How to extract Text from PDF in C# and VB.Net":
- Extract Text from custom pages of PDF file using C#
- Convert whole PDF document to Text in memory using C#
- Extract Text from all pages of PDF in ASP.Net/VB.Net
- Convert 1st page of PDF to Text in VB.Net

1. Extract Text from custom pages of PDF file using C#:
SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();
f.OpenPdf(@"d:\Cook Book.pdf");
// The document needs at least 3 pages for the range 2-3 to exist
if (f.PageCount > 2)
{
    // Convert only pages 2 to 3 to Text
    f.ToText(@"d:\Cook Book.txt", 2, 3);
}
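The page-count check in the example above can also be written as a small reusable guard. The following is only a sketch, assuming 1-based page numbers as the samples on this page suggest. PageRange and TryClamp are hypothetical names used for illustration and are not part of PDF Focus .Net.

```csharp
using System;

// Hypothetical helper (NOT part of PDF Focus .Net): fits a requested
// 1-based page range into a document that has pageCount pages.
public static class PageRange
{
    // Returns false when none of the requested pages exist in the document.
    public static bool TryClamp(int pageCount, int first, int last,
                                out int firstPage, out int lastPage)
    {
        firstPage = Math.Max(first, 1);        // pages are assumed to start at 1
        lastPage = Math.Min(last, pageCount);  // never go past the last page
        return pageCount > 0 && firstPage <= lastPage;
    }
}
```

With such a helper, the sample could call PageRange.TryClamp(f.PageCount, 2, 3, out firstPage, out lastPage) and pass the two results to f.ToText only when it returns true. Note that this differs slightly from the check above: a 2-page document would still have its page 2 converted instead of being skipped.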
2. Convert whole PDF document to Text in memory using C#:
// File is in the System.IO namespace (add "using System.IO;")
SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();
// Read the PDF into a byte array
byte[] pdf = File.ReadAllBytes(@"d:\Sample.pdf");
f.OpenPdf(pdf);
if (f.PageCount > 0)
{
    // Convert the whole document to a string in memory
    string text = f.ToText();
    // Save to a text file
    File.WriteAllText(@"d:\Sample.txt", text);
}
3. Extract Text from all pages of PDF in ASP.Net/VB.Net:
Dim f As New SautinSoft.PdfFocus()
Dim url As New Uri("http://www.website.com/sample.pdf")
f.OpenPdf(url)
If f.PageCount > 0 Then
    'Convert whole PDF to Text (extract text from PDF)
    Dim text As String = f.ToText()
    'Show the text
    TextBox1.Text = text
Else
    TextBox1.Text = "Converting failed!"
End If
4. Convert 1st page of PDF to Text in VB.Net:
'File is in the System.IO namespace (add "Imports System.IO")
Dim f As New SautinSoft.PdfFocus()
Dim pdf() As Byte = File.ReadAllBytes("d:\Simple.pdf")
Dim text As String = ""
f.OpenPdf(pdf)
If f.PageCount > 0 Then
    'Convert only the 1st page to Text
    text = f.ToText(1, 1)
    'Show the text
    If text <> "" Then
        TextBox1.Text = text
    End If
End If
If you need a code sample in C#, VB.Net, ASP.Net, etc. on "How to extract Text from PDF", email us at support@sautinsoft.com or ask in the Online Chat (bottom-right corner of this page). We'll be glad to help!