pdfExtractTextFromRect2

Extract text from a rectangular region.
char *pdfExtractTextFromRect2(PDFHandle pdf, int page, double x0, double y0, double x1, double y1, int *length)
This function extracts text from a rectangular region on a page, and returns the resulting text in a string.

The rectangle is defined by two opposite corners: (x0, y0) and (x1, y1). The coordinates are in PDF coordinate space.
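
Page layouts are often measured in inches, so a small conversion helper can make the corner arguments easier to read. The sketch below is illustrative only: RectPts and rect_from_inches are hypothetical names, not part of the library, and it assumes the default PDF coordinate system (72 units per inch, measured from the left and bottom edges of the page) with no page rotation or crop-box offset.

```c
/* Hypothetical helper type, not part of the library. */
typedef struct {
    double x0, y0, x1, y1;
} RectPts;

/* Assumption: default PDF user space, 72 units per inch. */
#define POINTS_PER_INCH 72.0

/* Build two opposite corners from a lower-left position and a size,
 * all given in inches. */
static RectPts rect_from_inches(double left, double bottom,
                                double width, double height)
{
    RectPts r;
    r.x0 = left * POINTS_PER_INCH;
    r.y0 = bottom * POINTS_PER_INCH;
    r.x1 = (left + width) * POINTS_PER_INCH;
    r.y1 = (bottom + height) * POINTS_PER_INCH;
    return r;
}
```

For a rectangle 4" from the left, 1" up from the bottom, 2" wide and 0.5" high, this yields the corners (288, 72) and (432, 108), which could then be passed as x0, y0, x1, y1.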

pdfExtractTextFromRect2 returns a string if successful, or NULL if text extraction is prohibited by this PDF file.

On success, *length is filled in with the length of the returned string. The string will be zero-terminated, but it may also contain embedded zero bytes, depending on the current text encoding (see pdfSetTextEncoding), so use *length rather than strlen to determine its size. The caller is responsible for freeing the string with the pdfFreeMemory function.
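
Because the text may contain zero bytes, string functions such as strlen or strdup can truncate it. The following is a minimal sketch of copying the result by explicit length. copy_extracted_text is a hypothetical helper, not a library function, and it assumes that *length counts bytes and excludes the terminator.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate `length` bytes of extracted text into a
 * buffer owned by the caller (release it with free()). Returns NULL on
 * invalid arguments or allocation failure. */
static char *copy_extracted_text(const char *buf, int length)
{
    char *copy;

    if (buf == NULL || length < 0)
        return NULL;
    copy = (char *)malloc((size_t)length + 1);
    if (copy == NULL)
        return NULL;
    /* memcpy takes an explicit byte count, so embedded zero bytes are
     * preserved; strcpy would stop at the first one. */
    memcpy(copy, buf, (size_t)length);
    copy[length] = '\0';
    return copy;
}
```

Once the copy has been made, the original string can be released with pdfFreeMemory.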

This function is identical to pdfExtractTextFromRect except that the corner coordinates are given in PDF coordinate space.

See the "Setting parameters" section in the function list for settings that affect text extraction.

C:
char *buf;
int length;

/* extract a rectangle 4" from the left side, 1" up from
 * the bottom, 2" wide, 0.5" high, on page 1 */
if (!(buf = pdfExtractTextFromRect2(pdf, 1, 4*72, 1*72,
                                    6*72, 1.5*72, &length))) {
  /* handle the error */
}
...
pdfFreeMemory(buf);

See also:
pdfExtractTextFromRect
pdfConvertToTextFile
pdfConvertToTextString
pdfFreeMemory