To learn more about extracting only the hyperlinks from a PDF document, see the "Get Hyperlinks" example.
The following example demonstrates how to load an existing PDF document, get the text from a specific area of its first page, and then output that text to a newly created PDF document.
Get text located in a specific area of a PDF document (C#)

```csharp
public static void GetTextFromArea()
{
  Console.WriteLine( "=== GET TEXT FROM AN AREA ===" );

  var outputFileName = "GetTextFromArea.pdf";
  var outputPath = TextsSample.TextsSampleOutputDirectory + outputFileName;

  // Load a pdf document.
  using( var pdfInput = PdfDocument.Load( TextsSampleResourcesDirectory + @"Two Page Text Only - from libre office.pdf" ) )
  {
    // Get first page of input document.
    var page = pdfInput.Pages[ 0 ];

    // Get Text from a specific area of the first page.
    var areaText = page.GetTextFromArea( new Rectangle( 297, 77, 75, 12 ) );

    // Create an output Pdf to display areaText.
    using( var pdfoutput = PdfDocument.Create( outputPath ) )
    {
      // Get first page of output pdf.
      var outputPage = pdfoutput.Pages[ 0 ];

      // Set the title.
      var titleFont = pdfoutput.Fonts.GetStandardFont( StandardFontType.Helvetica );
      outputPage.AddParagraph( "Get Text From Area", TextStyle.WithFont( titleFont, 15 ), new ParagraphStyle( ParagraphHorizontalAlignment.Center ) );

      // Display the areaText.
      var textStyle = TextStyle.WithFont( titleFont, 12 );
      outputPage.AddText( $"The text found in the area (297, 77, 375, 88) is: \"{areaText}\".", new Point( 110, 145 ), textStyle );

      // Save the output document.
      pdfoutput.Save();
      Console.WriteLine( $"Created: {outputFileName}" );
    }
  }
}
```
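The example above passes the area as `new Rectangle( 297, 77, 75, 12 )` but prints hard-coded coordinates in its message. The sketch below shows one way to build that coordinate string from the same four values so the two cannot drift apart. It assumes the four arguments mean x, y, width, and height, which this page does not state. `AreaDescription` and `ToCorners` are hypothetical names for illustration and are not part of the library.

```csharp
using System;

public static class AreaDescription
{
    // Hypothetical helper (not part of the PDF library): formats an area given
    // as (x, y, width, height) by its two opposite corners.
    // Assumption: the Rectangle arguments in the example are x, y, width, height.
    public static string ToCorners( int x, int y, int width, int height )
    {
        return $"({x}, {y}, {x + width}, {y + height})";
    }

    public static void Main()
    {
        // Same values as the Rectangle used in the example above.
        Console.WriteLine( ToCorners( 297, 77, 75, 12 ) ); // prints "(297, 77, 372, 89)"
    }
}
```

If that assumption holds, the opposite corner of the area is (372, 89), so it may be worth checking the coordinates quoted in the example's message against the library's definition of `Rectangle`.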