new TextExtractorWord( [line] [, word] [, end] [, uni] [, num] [, cur_num] [, mp_bld])
A TextExtractor::Word object represents a word on a PDF page.
Each word contains a sequence of characters in one or more styles
(see TextExtractor::Style).
Parameters:
Name | Type | Argument | Description |
---|---|---|---|
line | number | <optional> | |
word | number | <optional> | |
end | number | <optional> | |
uni | number | <optional> | |
num | number | <optional> | |
cur_num | number | <optional> | |
mp_bld | | <optional> | |
Properties:
Name | Type | Description |
---|---|---|
line | number | |
word | number | |
end | number | |
uni | number | |
num | number | |
cur_num | number | |
mp_bld | | |
Methods
-
<static> create()
-
Constructor
Returns:
A promise that resolves to an object of type "PDFNet.TextExtractorWord".
Type: Promise.<PDFNet.TextExtractorWord>
-
compare(word)
-
Comparison function. Determines whether the parameter object is equal to the current object.
Parameters:
Name | Type | Description |
---|---|---|
word | PDFNet.TextExtractorWord | |
Returns:
A promise that resolves to true if the two objects are equivalent, false otherwise.
Type: Promise.<boolean>
-
getBBox()
-
Returns:
A promise that resolves to the bounding box for this word (in unrotated page coordinates).
Type: Promise.<PDFNet.Rect>
-
getCharStyle(char_idx)
-
Parameters:
Name | Type | Description |
---|---|---|
char_idx | number | The index of a character in this word. |
Returns:
A promise that resolves to the style associated with a given character.
Type: Promise.<PDFNet.TextExtractorStyle>
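As a rough illustration of how getCharStyle(char_idx) might be combined with getStringLen(), the sketch below gathers one style object per character. The helper name `collectCharStyles` is hypothetical, and the code assumes that valid indices run from 0 to getStringLen() - 1; the text above does not state the index range. The `word` argument is assumed to be a PDFNet.TextExtractorWord obtained elsewhere.

```javascript
// Hypothetical helper, not part of PDFNet.
// Assumption: character indices run from 0 to getStringLen() - 1.
async function collectCharStyles(word) {
  const length = await word.getStringLen();
  const styles = [];
  for (let i = 0; i < length; i++) {
    // Each call resolves to the style of the character at index i.
    styles.push(await word.getCharStyle(i));
  }
  return styles;
}
```

Comparing the returned styles with the result of getStyle() would be one way to spot words that mix several styles.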
-
getCurrentNum()
-
Returns:
A promise that resolves to the index of this word within the current line. A word that starts the line will return 0, whereas the last word in the line will return (line.GetNumWords()-1).
Type: Promise.<number>
-
getNextWord()
-
Returns:
A promise that resolves to the next word on the current line.
Type: Promise.<PDFNet.TextExtractorWord>
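A minimal sketch of walking a line word by word with getNextWord(), isValid() and getString(). The helper name `collectLineWords` is hypothetical. The loop assumes that calling getNextWord() on the last word of a line resolves to a word for which isValid() is false; this page does not state that explicitly. How the first word is obtained is outside the scope of this page.

```javascript
// Hypothetical helper, not part of PDFNet.
// Assumption: getNextWord() on the last word of a line resolves to a word
// whose isValid() resolves to false, which ends the loop.
async function collectLineWords(firstWord) {
  const texts = [];
  let word = firstWord;
  while (await word.isValid()) {
    texts.push(await word.getString());
    word = await word.getNextWord();
  }
  return texts;
}
```

Every call is awaited in sequence because each method on this class returns a promise.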
-
getNumGlyphs()
-
Returns:
A promise that resolves to the number of glyphs in this word.
Type: Promise.<number>
-
getQuad()
-
Returns:
A promise that resolves to the quadrilateral representing a tight bounding box for this word (in unrotated page coordinates).
Type: Promise.<PDFNet.QuadPoint>
-
getString()
-
Returns:
A promise that resolves to the content of this word represented as a string.
Type: Promise.<string>
-
getStringLen()
-
Returns:
A promise that resolves to the number of characters in this word.
Type: Promise.<number>
-
getStyle()
-
Returns:
A promise that resolves to the predominant style for this word.
Type: Promise.<PDFNet.TextExtractorStyle>
-
isValid()
-
Returns:
A promise that resolves to true if this is a valid word, false otherwise.
Type: Promise.<boolean>
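Since every method listed above resolves asynchronously, it can be convenient to gather the scalar values of a word into one plain object. The following is only a sketch: `describeWord` and the property names of the returned object are hypothetical, and checking isValid() before calling the other methods is a cautious assumption rather than a documented requirement.

```javascript
// Hypothetical helper, not part of PDFNet.
// Resolves to null for an invalid word; otherwise to a plain summary object.
async function describeWord(word) {
  if (!(await word.isValid())) {
    return null;
  }
  return {
    text: await word.getString(),            // content of the word
    length: await word.getStringLen(),       // number of characters
    glyphs: await word.getNumGlyphs(),       // number of glyphs
    indexInLine: await word.getCurrentNum(), // 0 for the first word on the line
  };
}
```

The calls are awaited one after another to keep the sketch simple; whether they can safely run concurrently is not stated on this page.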