Class TextExtractor.Word
A TextExtractor.Word object represents a word on a PDF page. Each word contains a sequence of characters in one or more styles (see TextExtractor.Style).
Namespace: pdftron.PDF
Assembly: PDFNet.dll
Syntax
public class TextExtractor.Word : IDisposable
Constructors
Word()
Declaration
public Word()
Methods
Dispose()
Releases all resources used by the Word.
Declaration
public override sealed void Dispose()
Dispose(bool)
Declaration
[HandleProcessCorruptedStateExceptions]
protected virtual void Dispose(bool A_0)
Parameters
Type | Name | Description |
---|---|---|
bool | A_0 |
Equals(object)
Checks whether this Word object is the same as the specified object.
Declaration
public bool Equals(object o)
Parameters
Type | Name | Description |
---|---|---|
object | o | The object to compare with this Word. |
Returns
Type | Description |
---|---|
bool | true if this Word is equal to the specified object; false otherwise. |
~Word()
Declaration
protected ~Word()
GetBBox()
Gets the bounding box of this word.
Declaration
public Rect GetBBox()
Returns
Type | Description |
---|---|
Rect | The bounding box for this word (in unrotated page coordinates). |
Remarks
To account for the effect of the page's '/Rotate' attribute, transform all points using page.GetDefaultMatrix().
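As a rough illustration of the remark above, the sketch below applies an affine matrix to the four corners of a bounding box and returns the enclosing axis-aligned box. It is deliberately written without PDFNet types so that it runs on its own: the six coefficients (a, b, c, d, h, v) stand in for the values of the matrix returned by page.GetDefaultMatrix(), and the RotationSketch class and its methods are hypothetical names, not part of the library. It assumes the usual PDF convention x' = a*x + c*y + h, y' = b*x + d*y + v.

```csharp
using System;

public static class RotationSketch
{
    // Applies a PDF-style affine matrix { a, b, c, d, h, v } to one point.
    // Assumption: x' = a*x + c*y + h and y' = b*x + d*y + v.
    public static (double X, double Y) Apply(double[] m, double x, double y)
    {
        return (m[0] * x + m[2] * y + m[4],
                m[1] * x + m[3] * y + m[5]);
    }

    // Transforms all four corners of the box (x1, y1, x2, y2) and returns
    // the axis-aligned box { minX, minY, maxX, maxY } that encloses them.
    public static double[] TransformBox(double[] m, double x1, double y1, double x2, double y2)
    {
        var corners = new[]
        {
            Apply(m, x1, y1), Apply(m, x2, y1),
            Apply(m, x2, y2), Apply(m, x1, y2)
        };

        double minX = double.PositiveInfinity, minY = double.PositiveInfinity;
        double maxX = double.NegativeInfinity, maxY = double.NegativeInfinity;
        foreach (var (x, y) in corners)
        {
            minX = Math.Min(minX, x);
            minY = Math.Min(minY, y);
            maxX = Math.Max(maxX, x);
            maxY = Math.Max(maxY, y);
        }
        return new[] { minX, minY, maxX, maxY };
    }
}
```

With a pure 90-degree rotation matrix (a=0, b=1, c=-1, d=0, h=0, v=0), the box (10, 20, 30, 40) maps to (-40, 10, -20, 30). In real code the coefficients would come from the page's default matrix instead of being passed in by hand.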
GetCharStyle(int)
Gets the style of the character at the given index.
Declaration
public TextExtractor.Style GetCharStyle(int char_idx)
Parameters
Type | Name | Description |
---|---|---|
int | char_idx | The index of a character in this word. |
Returns
Type | Description |
---|---|
TextExtractor.Style | The style associated with a given character. |
GetCurrentNum()
Gets the index of this word in the current line. The word that starts the line returns 0, and the last word in the line returns (line.GetNumWords()-1).
Declaration
public int GetCurrentNum()
Returns
Type | Description |
---|---|
int | The index of this word in the current line. |
GetGlyphQuad(int)
Gets the bounding quadrilateral of the glyph at the given index.
Declaration
public double[] GetGlyphQuad(int glyph_idx)
Parameters
Type | Name | Description |
---|---|---|
int | glyph_idx | The index of a glyph in this word. |
Returns
Type | Description |
---|---|
double[] | The quadrilateral representing a tight bounding box for a given glyph in the word (in unrotated page coordinates). |
GetNextWord()
Gets the next word.
Declaration
public TextExtractor.Word GetNextWord()
Returns
Type | Description |
---|---|
TextExtractor.Word | The next word. |
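A minimal sketch of how GetNextWord(), IsValid(), GetCurrentNum(), GetString(), and GetBBox() might be combined to walk the words of a line. It assumes that `line` comes from a TextExtractor that has already processed a page, that TextExtractor.Line exposes GetFirstWord(), that stepping past the last word yields a word for which IsValid() returns false, and that Rect exposes x1, y1, x2, y2. None of these are documented on this page, so check them against the library. WordIterationSketch and PrintWords are hypothetical names.

```csharp
using System;
using pdftron.PDF;

public static class WordIterationSketch
{
    // Prints each word of a line with its index in the line and its bounding box.
    // Assumption: line.GetFirstWord() exists and returns the first word of the line.
    public static void PrintWords(TextExtractor.Line line)
    {
        for (TextExtractor.Word word = line.GetFirstWord();
             word.IsValid();                 // assumed to become false after the last word
             word = word.GetNextWord())
        {
            if (word.GetStringLen() == 0)
            {
                continue;                    // skip words with no characters
            }

            Rect bbox = word.GetBBox();      // in unrotated page coordinates
            Console.WriteLine("{0}: \"{1}\" box=({2}, {3}, {4}, {5})",
                word.GetCurrentNum(), word.GetString(),
                bbox.x1, bbox.y1, bbox.x2, bbox.y2);
        }
    }
}
```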
GetNumGlyphs()
Gets the number of glyphs in this word.
Declaration
public int GetNumGlyphs()
Returns
Type | Description |
---|---|
int | The number of glyphs in this word. |
GetQuad()
Gets the quadrilateral representing a tight bounding box for this word (in unrotated page coordinates).
Declaration
public double[] GetQuad()
Returns
Type | Description |
---|---|
double[] | The quadrilateral representing a tight bounding box for this word (in unrotated page coordinates). |
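When only an axis-aligned rectangle is needed, the quadrilateral can be reduced to its enclosing box. The sketch below assumes that the returned array holds four corner points as eight values in the order { x1, y1, x2, y2, x3, y3, x4, y4 }. This page does not state that layout, so treat it as an assumption. QuadSketch and ToAxisAlignedBox are hypothetical names, and the code has no PDFNet dependency.

```csharp
using System;

public static class QuadSketch
{
    // Returns { minX, minY, maxX, maxY } for a quadrilateral.
    // Assumption: quad = { x1, y1, x2, y2, x3, y3, x4, y4 }.
    public static double[] ToAxisAlignedBox(double[] quad)
    {
        if (quad == null || quad.Length != 8)
        {
            throw new ArgumentException("Expected 8 values (4 points).", nameof(quad));
        }

        double minX = quad[0], maxX = quad[0];
        double minY = quad[1], maxY = quad[1];
        for (int i = 2; i < 8; i += 2)
        {
            minX = Math.Min(minX, quad[i]);
            maxX = Math.Max(maxX, quad[i]);
            minY = Math.Min(minY, quad[i + 1]);
            maxY = Math.Max(maxY, quad[i + 1]);
        }
        return new[] { minX, minY, maxX, maxY };
    }
}
```

The same helper could be applied to the result of GetGlyphQuad(int), under the same layout assumption.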
GetString()
Gets the content of this word as a Unicode string.
Declaration
public string GetString()
Returns
Type | Description |
---|---|
string | The content of this word represented as a Unicode string. |
GetStringLen()
Gets the number of characters in this word.
Declaration
public int GetStringLen()
Returns
Type | Description |
---|---|
int | The number of characters in this word. |
GetStyle()
Gets the predominant style for this word.
Declaration
public TextExtractor.Style GetStyle()
Returns
Type | Description |
---|---|
TextExtractor.Style | The predominant style for this word. |
IsValid()
Checks whether this is a valid word.
Declaration
public bool IsValid()
Returns
Type | Description |
---|---|
bool | true if this is a valid word; false otherwise. |
Set(Word)
Sets this word to the value of the given Word object.
Declaration
public void Set(TextExtractor.Word r)
Parameters
Type | Name | Description |
---|---|---|
TextExtractor.Word | r | A given Word object. |
op_Assign(Word)
Assignment operator.
Declaration
public TextExtractor.Word op_Assign(TextExtractor.Word r)
Parameters
Type | Name | Description |
---|---|---|
TextExtractor.Word | r | A given Word object. |
Returns
Type | Description |
---|---|
TextExtractor.Word | |
Operators
operator ==(Word, Word)
Equality operator. Checks whether two Word objects are the same.
Declaration
public static bool operator ==(TextExtractor.Word l, TextExtractor.Word r)
Parameters
Type | Name | Description |
---|---|---|
TextExtractor.Word | l | |
TextExtractor.Word | r | |
Returns
Type | Description |
---|---|
bool | true if both Word objects are the same; false otherwise. |
operator !=(Word, Word)
Inequality operator. Checks whether two Word objects are different.
Declaration
public static bool operator !=(TextExtractor.Word l, TextExtractor.Word r)
Parameters
Type | Name | Description |
---|---|---|
TextExtractor.Word | l | |
TextExtractor.Word | r | |
Returns
Type | Description |
---|---|
bool | true if the two Word objects are different; false otherwise. |