PDFTron is now Apryse, learn more here.

Search for text in a PDF in Ruby

The following code searches a PDF for text using a regular expression and then applies a link annotation to each highlighted result.

In this example we add a link annotation, but any other type of annotation can be applied here, such as a redaction annotation in a search-and-redact workflow.
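doc = PDFDoc.new(input_path)  # input_path is a placeholder for the path to the PDF to search
txt_search = TextSearch.new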
mode = TextSearch::E_whole_word | TextSearch::E_page_stop
pattern = ""

# use regular expression to find credit card number
mode |= TextSearch::E_reg_expression | TextSearch::E_highlight
pattern = "\\d{4}-\\d{4}-\\d{4}-\\d{4}"  # or "(\\d{4}-){3}\\d{4}"

# call Begin method to initialize the text search.
txt_search.Begin(doc, pattern, mode)
searchResult = txt_search.Run

if searchResult.IsFound
  # add a link annotation based on the location of the found instance
  hlts = searchResult.GetHighlights
  # rewind the highlights to the first entry before iterating
  hlts.Begin(doc)
  while hlts.HasNext do
    cur_page = doc.GetPage(hlts.GetCurrentPageNumber)
    quadsInfo = hlts.GetCurrentQuads

    i = 0
    while i < quadsInfo.size do
      q = quadsInfo[i]
      # assume each quad is an axis-aligned rectangle
      x1 = [q.p1.x, q.p2.x, q.p3.x, q.p4.x].min
      x2 = [q.p1.x, q.p2.x, q.p3.x, q.p4.x].max
      y1 = [q.p1.y, q.p2.y, q.p3.y, q.p4.y].min
      y2 = [q.p1.y, q.p2.y, q.p3.y, q.p4.y].max
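      hyper_link = Link.Create(doc.GetSDFDoc, Rect.new(x1, y1, x2, y2), Action.CreateURI(doc.GetSDFDoc, ""))
      hyper_link.RefreshAppearance
      # attach the link annotation to the page that contains the match
      cur_page.AnnotPushBack(hyper_link)
      i = i + 1
    end
    hlts.Next
  end
end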
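
The min/max step in the loop above can be tried in isolation. The sketch below is plain Ruby and does not use the SDK: `Point`, `Quad`, and `quad_bounds` are hypothetical stand-ins introduced only for illustration, and the coordinates are made up. It shows how the four corners of a quad reduce to the `x1, y1, x2, y2` values passed to the rectangle, under the same assumption that the quad is axis-aligned.

```ruby
# Hypothetical stand-ins for the SDK's point and quad objects, so this
# sketch runs without the PDF library installed.
Point = Struct.new(:x, :y)
Quad  = Struct.new(:p1, :p2, :p3, :p4)

# Returns [x1, y1, x2, y2]: the smallest axis-aligned box that contains
# all four corners of the quad.
def quad_bounds(q)
  xs = [q.p1.x, q.p2.x, q.p3.x, q.p4.x]
  ys = [q.p1.y, q.p2.y, q.p3.y, q.p4.y]
  [xs.min, ys.min, xs.max, ys.max]
end

# Made-up coordinates for one highlighted word, in page units.
quad = Quad.new(Point.new(72.0, 700.0), Point.new(200.0, 700.0),
                Point.new(200.0, 712.0), Point.new(72.0, 712.0))
bounds = quad_bounds(quad)
```

For rotated text the quad is not axis-aligned, so the resulting box would be larger than the highlighted text itself.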

Search PDF files for text
Full code sample which shows how to use TextSearch to search text on PDF pages using regular expressions.
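
Before running a search, it can help to sanity-check the pattern on a few strings. The sketch below compares the two credit card patterns from the snippet above using Ruby's built-in `Regexp`. This is an assumption made for illustration: the SDK uses its own matcher, which may treat some constructs differently, and the sample strings are made up.

```ruby
# Plain-Ruby check of the two credit card patterns from the sample.
# Ruby's own Regexp engine is used here (an assumption); the SDK's
# matcher may not behave identically for every construct.
long_form  = Regexp.new("\\d{4}-\\d{4}-\\d{4}-\\d{4}")
short_form = Regexp.new("(\\d{4}-){3}\\d{4}")

# Made-up test strings: a full number, a number with only three groups,
# and a full number embedded in a sentence.
samples = [
  "1234-5678-9012-3456",
  "1234-5678-9012",
  "card: 0000-1111-2222-3333."
]

# For each sample, record whether each pattern finds a match.
results = samples.map do |text|
  [long_form.match?(text), short_form.match?(text)]
end
```

Both patterns agree on every sample here, which is the expected outcome since the short form is just the long form with the repeated group factored out.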
