Class: TextExtractorLine

Core.PDFNet.TextExtractorLine


new TextExtractorLine( [line] [, uni] [, num] [, cur_num] [, m_direction] [, mp_bld])

A TextExtractor::Line object represents a line of text on a PDF page. Each line consists of a sequence of words, and each word has one or more styles.
Parameters:
Name Type Argument Description
line number <optional>
uni number <optional>
num number <optional>
cur_num number <optional>
m_direction number <optional>
mp_bld <optional>
Properties:
Name Type Description
line number
uni number
num number
cur_num number
m_direction number
mp_bld

Methods


<static> create()

Constructor
Returns:
A promise that resolves to an object of type: "PDFNet.TextExtractorLine"
Type
Promise.<Core.PDFNet.TextExtractorLine>

compare(line2)

Comparison function. Determines if parameter object is equal to current object.
Parameters:
Name Type Description
line2 Core.PDFNet.TextExtractorLine
Returns:
A promise that resolves to true if the two objects are equivalent, false otherwise.
Type
Promise.<boolean>
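Because compare() returns a promise, an equality search over lines has to await each comparison. The following is a minimal sketch, not part of the API: `indexOfLine` is a hypothetical helper name, and `lines` is assumed to be an array of objects that behave like Core.PDFNet.TextExtractorLine.

```javascript
// Hypothetical helper: find the position of `target` in `lines`
// using the asynchronous compare() method described above.
async function indexOfLine(lines, target) {
  for (let i = 0; i < lines.length; i++) {
    // compare() resolves to true when the two line objects are equivalent.
    if (await lines[i].compare(target)) {
      return i;
    }
  }
  return -1; // no equivalent line found
}
```

Usage might look like `const idx = await indexOfLine(allLines, someLine);`.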

endsWithHyphen()

Returns:
A promise that resolves to true if this line of text ends with a hyphen (i.e. '-'), false otherwise.
Type
Promise.<boolean>
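One possible use of endsWithHyphen() is rejoining a word that was split across two lines. The sketch below is only a heuristic under stated assumptions: `joinWithNext` is a hypothetical helper, and `lineText` and `nextText` are assumed to be strings the caller has already extracted for this line and the following one.

```javascript
// Hypothetical helper: merge the text of a line with the text of the next line.
// If the line ends with a hyphen, the hyphen is dropped and the fragments are glued.
async function joinWithNext(line, lineText, nextText) {
  if (await line.endsWithHyphen()) {
    // Heuristic: treat the trailing '-' as a line-break hyphen.
    return lineText.replace(/-$/, '') + nextText;
  }
  // Otherwise separate the two lines with a single space.
  return lineText + ' ' + nextText;
}
```

Note that this simple rule also merges genuine hyphenated compounds that happen to break at the hyphen, so real code may need a dictionary check.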

getBBox()

Returns:
A promise that resolves to the bounding box for this line (in unrotated page coordinates).
Type
Promise.<Core.PDFNet.Rect>

getCurrentNum()

Returns:
A promise that resolves to the index of this line on the current page.
Type
Promise.<number>

getFirstWord()

Returns:
A promise that resolves to the first word in the line. Note: To traverse the list of all words on this line, use word.getNextWord().
Type
Promise.<Core.PDFNet.TextExtractorWord>
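The word traversal can be sketched as a loop over the linked list of words. This is an illustrative example, assuming the returned word objects expose isValid(), getNextWord() and getString() as on Core.PDFNet.TextExtractorWord, and that the traversal ends with a word whose isValid() resolves to false; `lineToString` is a hypothetical helper name.

```javascript
// Hypothetical helper: concatenate the words of one line into a string.
// `line` is expected to behave like a Core.PDFNet.TextExtractorLine.
async function lineToString(line) {
  const parts = [];
  // Walk the words until an invalid word marks the end of the line.
  for (
    let word = await line.getFirstWord();
    await word.isValid();
    word = await word.getNextWord()
  ) {
    parts.push(await word.getString());
  }
  // Joining with a single space is a simplification of the real layout.
  return parts.join(' ');
}
```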

getFlowID()

Returns:
A promise that resolves to the unique identifier for a paragraph or column that this line belongs to. This information can be used to identify which lines/paragraphs belong to which flows.
Type
Promise.<number>

getNextLine()

Returns:
A promise that resolves to the next line on the page.
Type
Promise.<Core.PDFNet.TextExtractorLine>
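Lines can be walked in the same linked-list fashion. The sketch below assumes that the line after the last one on the page reports isValid() as false, and that the caller already holds the first line (for example, from a text extractor); `collectLines` is a hypothetical helper name.

```javascript
// Hypothetical helper: gather every line reachable from `firstLine`.
async function collectLines(firstLine) {
  const lines = [];
  // Follow getNextLine() until an invalid line signals the end of the page.
  for (
    let line = firstLine;
    await line.isValid();
    line = await line.getNextLine()
  ) {
    lines.push(line);
  }
  return lines;
}
```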

getNumWords()

Returns:
A promise that resolves to the number of words in this line.
Type
Promise.<number>

getParagraphID()

Returns:
A promise that resolves to the unique identifier for a paragraph or column that this line belongs to. This information can be used to identify which lines belong to which paragraphs.
Type
Promise.<number>
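As an illustration of using the paragraph identifier, lines can be bucketed by the value that getParagraphID() resolves to. This is a sketch only: `groupByParagraph` is a hypothetical helper, and `lines` is assumed to be an array of line objects collected beforehand.

```javascript
// Hypothetical helper: group lines by their paragraph identifier.
// Returns a Map from paragraph ID to the lines that share it, in input order.
async function groupByParagraph(lines) {
  const groups = new Map();
  for (const line of lines) {
    const id = await line.getParagraphID();
    if (!groups.has(id)) {
      groups.set(id, []);
    }
    groups.get(id).push(line);
  }
  return groups;
}
```

The same pattern should apply to getFlowID() by swapping the method call.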

getQuad()

Gets the quadrilateral bounding box for the line (in unrotated page coordinates).
Returns:
A promise that resolves to an object of type: "PDFNet.QuadPoint"
Type
Promise.<Core.PDFNet.QuadPoint>

getStyle()

Returns:
A promise that resolves to the predominant style for this line.
Type
Promise.<Core.PDFNet.TextExtractorStyle>

getWord(word_idx)

Parameters:
Name Type Description
word_idx number An integer representing the index of the word to get.
Returns:
A promise that resolves to the word at index word_idx in this line.
Type
Promise.<Core.PDFNet.TextExtractorWord>
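Index-based access pairs naturally with getNumWords(). The sketch below assumes zero-based indices, which this page does not state explicitly; `wordsByIndex` is a hypothetical helper name.

```javascript
// Hypothetical helper: fetch all words of a line by index.
// Assumption: valid indices run from 0 to getNumWords() - 1.
async function wordsByIndex(line) {
  const count = await line.getNumWords();
  const words = [];
  for (let i = 0; i < count; i++) {
    words.push(await line.getWord(i));
  }
  return words;
}
```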

isSimpleLine()

Returns:
A promise that resolves to true if this line is not rotated (i.e. if the quadrilaterals returned by getBBox() and getQuad() coincide), false otherwise.
Type
Promise.<boolean>
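A caller that needs the outline of a line might branch on isSimpleLine() to decide between the rectangle and the quadrilateral. The sketch below rests on assumptions: the rectangle is taken to expose `x1`, `y1`, `x2`, `y2` and the quad `p1x` through `p4y`, the corner ordering for the rectangle is an arbitrary choice, and `lineCorners` is a hypothetical helper name.

```javascript
// Hypothetical helper: return the four corners of a line as [x, y] pairs.
// Assumed field names: Rect {x1, y1, x2, y2}, QuadPoint {p1x, p1y, ..., p4x, p4y}.
async function lineCorners(line) {
  if (await line.isSimpleLine()) {
    // Unrotated line: the axis-aligned bounding box is sufficient.
    const r = await line.getBBox();
    return [[r.x1, r.y1], [r.x2, r.y1], [r.x2, r.y2], [r.x1, r.y2]];
  }
  // Rotated line: use the quadrilateral to keep the true orientation.
  const q = await line.getQuad();
  return [[q.p1x, q.p1y], [q.p2x, q.p2y], [q.p3x, q.p3y], [q.p4x, q.p4y]];
}
```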

isValid()

Returns:
A promise that resolves to true if this is a valid line, false otherwise.
Type
Promise.<boolean>