java.lang.Object
   ↳ com.pdftron.pdf.ElementReader
ElementReader can be used to parse and process content streams. It provides a
convenient interface for traversing the Element display list of a page. The display list,
which represents graphical elements (such as text runs, paths, images, shadings, forms, etc.), is
accessed using the intrinsic iterator. ElementReader automatically concatenates page contents
spanning multiple streams and provides a mechanism to parse the contents of sub-display lists
(e.g. form XObjects and Type3 fonts).
A sample use case for ElementReader is given below:
...
ElementReader reader = new ElementReader();
reader.begin(page);
// next() returns null once the display list is exhausted.
for (Element element = reader.next(); element != null; element = reader.next()) {
    Rect bbox;
    if ((bbox = element.getBBox()) != null) System.out.println("Bounding Box: " + bbox.getRectangle());
    switch (element.getType()) {
        case Element.e_path: { // Process path data...
            double[] data = element.getPathPoints();
            break;
        }
        case Element.e_text:
            // ...
            break;
    }
}
// Close the display list after the traversal loop has finished.
reader.end();
For a full sample, please refer to ElementReader and ElementReaderAdv sample projects.
Public Constructors

| Constructor | Description |
|---|---|
| ElementReader() | Instantiates a new element reader. |
Public Methods

| Return type | Method | Description |
|---|---|---|
| void | begin(Obj content_stream, Obj resource_dict) | Begin processing the given content stream. |
| void | begin(Obj content_stream) | Begin processing the given content stream. |
| void | begin(Page page) | Begin processing a page. |
| void | begin(Obj content_stream, Obj resource_dict, Context ctx) | Begin processing the given content stream. |
| void | begin(Page page, Context ctx) | Begin processing a page. |
| void | clearChangeList() | Clear the list containing identifiers of modified graphics state attributes. |
| void | close() | Frees the native memory of the object. |
| Element | current() | Get the current page element. Note: Every call to ElementReader.next() destroys the current Element. |
| void | destroy() | Frees the native memory of the object. |
| boolean | end() | Close the current display list. |
| void | formBegin() | When the current element is a form XObject, you can either skip form processing (by not calling formBegin()) or open the form stream and continue Element traversal into the form. |
| GSChangesIterator | getChangesIterator() | Get the changes iterator. |
| Obj | getColorSpace(String name) | Get the color space. |
| Obj | getExtGState(String name) | Get the ExtGState. |
| Obj | getFont(String name) | Get the specified font. |
| Obj | getPattern(String name) | Get the pattern. |
| Obj | getShading(String name) | Get the specified shading object. |
| Obj | getXObject(String name) | Get the specified XObject. |
| boolean | isChanged(int gstate_attrib) | Checks if the given GState attribute has changed. |
| Element | next() | Get the next page element. Note: Every call to ElementReader.next() destroys the current Element. |
| void | patternBegin(boolean fill_pattern, boolean reset_ctm_tfm) | Spawn the sub-display list representing the tiling pattern of the current element in the ElementReader. |
| void | patternBegin(boolean fill_pattern) | Spawn the sub-display list representing the tiling pattern of the current element in the ElementReader. |
| void | type3FontBegin(CharData char_data, Obj resource_dict) | Spawn a sub-display list representing a Type3 font glyph. |
| void | type3FontBegin(CharData char_data) | Spawn a sub-display list representing a Type3 font glyph. |
Inherited Methods

| Inherited from | Type |
|---|---|
| Class | java.lang.Object |
| Interface | com.pdftron.pdf.__Delete |
| Interface | java.lang.AutoCloseable |
begin(Obj content_stream, Obj resource_dict)
Begin processing the given content stream. The content stream may be a Form XObject, a Type3 glyph stream, a pattern stream, or any other content stream.
Note: When processing is complete, make sure to call ElementReader.end().

| Parameter | Description |
|---|---|
| content_stream | A stream object representing the content stream (usually a Form XObject). |
| resource_dict | An optional '/Resources' dictionary. If the content stream refers to named resources that are not present in the local resource dictionary, the names are looked up in the supplied resource dictionary. |

Throws: PDFNetException
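The fallback rule described for resource_dict can be pictured as a two-level lookup. The sketch below is a self-contained model built on plain maps, not PDFNet code: `ResourceLookupSketch`, `resolve`, and the string values are hypothetical stand-ins, and the library's internal lookup may differ in detail.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical model: a resource dictionary is a plain map from resource name to a description.
final class ResourceLookupSketch {
    // Look the name up in the stream's own dictionary first, then in the supplied one.
    static Optional<String> resolve(String name,
                                    Map<String, String> local,
                                    Map<String, String> supplied) {
        if (local != null && local.containsKey(name)) {
            return Optional.of(local.get(name));
        }
        if (supplied != null && supplied.containsKey(name)) {
            return Optional.of(supplied.get(name));
        }
        return Optional.empty(); // the name is not defined in either dictionary
    }
}
```

In this model a name defined in both dictionaries resolves to the local entry, and the supplied dictionary is consulted only for names the local one lacks.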
begin(Obj content_stream)
Begin processing the given content stream. The content stream may be a Form XObject, a Type3 glyph stream, a pattern stream, or any other content stream.
Note: When processing is complete, make sure to call ElementReader.end().

| Parameter | Description |
|---|---|
| content_stream | A stream object representing the content stream (usually a Form XObject). |

Throws: PDFNetException

begin(Page page)
Begin processing a page.
Note: When page processing is complete, make sure to call ElementReader.end().

| Parameter | Description |
|---|---|
| page | The page to start processing. |

Throws: PDFNetException

begin(Obj content_stream, Obj resource_dict, Context ctx)
Begin processing the given content stream. The content stream may be a Form XObject, a Type3 glyph stream, a pattern stream, or any other content stream.
Note: When processing is complete, make sure to call ElementReader.end().

| Parameter | Description |
|---|---|
| content_stream | A stream object representing the content stream (usually a Form XObject). |
| resource_dict | An optional '/Resources' dictionary. If the content stream refers to named resources that are not present in the local resource dictionary, the names are looked up in the supplied resource dictionary. |
| ctx | The Optional Content (OC) Context to use when processing the content stream. Element.isOCVisible() returns true or false depending on the visibility of the current Optional Content Group (OCG) and the states of flags in the given context. |

Throws: PDFNetException

begin(Page page, Context ctx)
Begin processing a page.
Note: When page processing is complete, make sure to call ElementReader.end().

| Parameter | Description |
|---|---|
| page | The page to start processing. |
| ctx | The Optional Content (OC) Context to use when processing the page. Element.isOCVisible() returns true or false depending on the visibility of the current Optional Content Group (OCG) and the states of flags in the given context. |

Throws: PDFNetException
clearChangeList()
Clear the list containing identifiers of modified graphics state attributes. The list of modified attributes then accumulates again during subsequent calls to ElementReader.next().

Throws: PDFNetException
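The accumulate-then-clear behavior can be illustrated with a small stand-alone model. `ChangeListSketch` and `recordChanges` are hypothetical names, and the attribute identifiers are arbitrary integers. The sketch shows only the bookkeeping described above, not how PDFNet tracks graphics state internally.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical model of the change list; attribute ids are arbitrary ints here.
final class ChangeListSketch {
    private final Set<Integer> changed = new LinkedHashSet<>();

    // Stands in for what one call to next() would record for a single element.
    void recordChanges(int... attribs) {
        for (int attrib : attribs) {
            changed.add(attrib);
        }
    }

    // True if the attribute was recorded since the last clear.
    boolean isChanged(int attrib) {
        return changed.contains(attrib);
    }

    // Forget everything recorded so far; later steps start accumulating again.
    void clearChangeList() {
        changed.clear();
    }
}
```

Under this model, clearing the list just before a step means that isChanged() afterwards reflects only the attributes modified by that step.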
close()
Frees the native memory of the object. This can be called explicitly to control the deallocation of native memory and avoid situations where the garbage collector does not free the object in a timely manner.

Throws: PDFNetException
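Because the class is listed as implementing java.lang.AutoCloseable, a try-with-resources statement is one way to make sure close() runs even when processing throws. The sketch below uses a hypothetical stand-in class, `NativeResourceSketch`, so that it runs without PDFNet. With the real class, the resource in the try header would be an ElementReader.

```java
// Hypothetical stand-in for an object that owns native memory.
final class NativeResourceSketch implements AutoCloseable {
    private boolean closed = false;

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true; // a real object would release its native memory here
    }

    // Uses the resource inside try-with-resources and returns it so the caller can inspect it.
    static NativeResourceSketch useAndRelease() {
        NativeResourceSketch resource = new NativeResourceSketch();
        try (NativeResourceSketch scoped = resource) {
            // ... work with 'scoped' while it is still open ...
            if (scoped.isClosed()) {
                throw new IllegalStateException("resource closed too early");
            }
        }
        return resource; // close() has already run at this point
    }
}
```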
current()
Get the current page element.
Note: Every call to ElementReader.next() destroys the current Element. An Element therefore becomes invalid after any subsequent call to ElementReader.next().

Throws: PDFNetException

destroy()
Frees the native memory of the object. This can be called explicitly to control the deallocation of native memory and avoid situations where the garbage collector does not free the object in a timely manner.

Throws: PDFNetException
end()
Close the current display list. If the current display list is a sub-list created using formBegin(), patternBegin(), or type3FontBegin(), this method ends the sub-list and returns processing to the parent display list at the point where it left off before entering the sub-list.

Throws: PDFNetException
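The way end() hands control back to the parent list can be modeled as a stack of iterators. The following is a minimal, self-contained sketch, not the library's implementation: `NodeSketch` and `SubListReaderSketch` are hypothetical types, and the boolean returned by the model's end() (true when a list was closed) is an assumption, since the return value is not described here.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical element: a name plus, for form-like elements, a nested display list.
final class NodeSketch {
    final String name;
    final List<NodeSketch> children;

    NodeSketch(String name, NodeSketch... children) {
        this.name = name;
        this.children = Arrays.asList(children);
    }
}

// Hypothetical reader that keeps one iterator per open display list.
final class SubListReaderSketch {
    private final Deque<Iterator<NodeSketch>> stack = new ArrayDeque<>();
    private NodeSketch current;

    void begin(List<NodeSketch> displayList) {
        stack.push(displayList.iterator());
    }

    // Returns the next element of the innermost open list, or null when it is exhausted.
    NodeSketch next() {
        Iterator<NodeSketch> it = stack.peek();
        current = (it != null && it.hasNext()) ? it.next() : null;
        return current;
    }

    // Opens the current element's nested list as a sub-list.
    void formBegin() {
        if (current == null) {
            throw new IllegalStateException("no current element");
        }
        stack.push(current.children.iterator());
    }

    // Closes the innermost list; the parent iterator resumes exactly where it stopped.
    // Returning true when a list was closed is an assumption of this model.
    boolean end() {
        if (stack.isEmpty()) {
            return false;
        }
        stack.pop();
        return true;
    }
}
```

Because the parent iterator is never advanced while the sub-list is open, closing the sub-list early (before it returns null) still resumes the parent at the element following the form.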
formBegin()
When the current element is a form XObject, you can either skip form processing (by not calling formBegin()) or open the form stream and continue Element traversal into the form. To open a form XObject display list, call formBegin(). The Element returned by the following next() call is the first Element in the form XObject display list. Subsequent calls to next() traverse the form's display list until null is returned. At any point you can close the form sub-list using ElementReader.end(). After the form display list is closed, processing returns to the parent display list at the point where it left off before entering the form XObject.

Throws: PDFNetException
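getChangesIterator()
Get the changes iterator.

Throws: PDFNetException

getColorSpace(String name)
Get the color space.

| Parameter | Description |
|---|---|
| name | The name of the color space. |

getFont(String name)
Get the specified font.

| Parameter | Description |
|---|---|
| name | The font name. |

Throws: PDFNetException

getPattern(String name)
Get the pattern.

| Parameter | Description |
|---|---|
| name | The pattern name. |

getShading(String name)
Get the specified shading object.

| Parameter | Description |
|---|---|
| name | The name of the shading object. |

getXObject(String name)
Get the specified XObject.

| Parameter | Description |
|---|---|
| name | The name of the XObject. |

isChanged(int gstate_attrib)
Checks if the given GState attribute has changed.

| Parameter | Description |
|---|---|
| gstate_attrib | The GState attribute to check. |

Throws: PDFNetException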
next()
Get the next page element.
Note: Every call to ElementReader.next() destroys the current Element. An Element therefore becomes invalid after any subsequent call to ElementReader.next().

Throws: PDFNetException
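One way to picture the invalidation rule is a reader that reuses a single element slot. The model below is hypothetical (`ReusedElementSketch` and `Slot` are not PDFNet types) and only approximates the behavior: it overwrites the slot, whereas the note above says the real Element is destroyed, so the earlier reference should be treated as unusable rather than merely stale.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical reader that hands out the same mutable slot on every step.
final class ReusedElementSketch {
    // Mutable element that the reader overwrites on each call to next().
    static final class Slot {
        String text;
    }

    private final Iterator<String> source;
    private final Slot slot = new Slot();

    ReusedElementSketch(List<String> texts) {
        this.source = texts.iterator();
    }

    // Overwrites the single slot and returns it, or returns null at the end of the list.
    Slot next() {
        if (!source.hasNext()) {
            return null;
        }
        slot.text = source.next();
        return slot;
    }
}
```

The practical consequence is the same in the model and in the note: copy out whatever data is needed from an element before calling next() again, and do not keep the element reference itself.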
patternBegin(boolean fill_pattern, boolean reset_ctm_tfm)
Spawn the sub-display list representing the tiling pattern of the current element in the ElementReader. You can call this method at any point as long as the current element is valid.
To open a tiling pattern sub-display list, call patternBegin(). The Element returned by the following next() call is the first Element in the pattern display list. Subsequent calls to next() traverse the pattern's display list until null is returned. At any point you can close the pattern sub-list using ElementReader.end(). After the pattern display list is closed, processing returns to the parent display list at the point where the pattern display list was spawned.

| Parameter | Description |
|---|---|
| fill_pattern | If true, the filling pattern of the current element is spawned; otherwise, the stroking pattern of the current element is spawned. Note that the graphics state is inherited automatically from the parent content stream (the content stream in which the pattern is defined as a resource). |
| reset_ctm_tfm | An optional parameter indicating whether the pattern's display list should set its initial CTM and transformation matrices to the identity matrix. In general, it should be left false. |

Throws: PDFNetException

patternBegin(boolean fill_pattern)
Spawn the sub-display list representing the tiling pattern of the current element in the ElementReader. This overload omits the reset_ctm_tfm parameter.

| Parameter | Description |
|---|---|
| fill_pattern | If true, the filling pattern of the current element is spawned; otherwise, the stroking pattern of the current element is spawned. Note that the graphics state is inherited automatically from the parent content stream (the content stream in which the pattern is defined as a resource). |

Throws: PDFNetException
type3FontBegin(CharData char_data, Obj resource_dict)
Spawn a sub-display list representing a Type3 font glyph. You can call this method at any point as long as the current element in the ElementReader is a text element whose font is a Type3 font.
To open a Type3 font sub-display list, call type3FontBegin(). The Element returned by the following next() call is the first Element in the glyph's display list. Subsequent calls to next() traverse the glyph's display list until null is returned. At any point you can close the glyph sub-list using ElementReader.end(). After the glyph display list is closed, processing returns to the parent display list at the point where the glyph display list was spawned.

| Parameter | Description |
|---|---|
| char_data | The information about the glyph to process. You can get this information by dereferencing a CharIterator. |
| resource_dict | An optional '/Resources' dictionary. If any glyph descriptions refer to named resources but the font's resource dictionary is absent, the names are looked up in the supplied resource dictionary. |

Throws: PDFNetException

type3FontBegin(CharData char_data)
Spawn a sub-display list representing a Type3 font glyph. This overload omits the resource_dict parameter.

Throws: PDFNetException