Read Data from PDF/Image Using UiPath & Python
In last month's blog post, we learned how to use different OCR engines with UiPath for Optical Character Recognition (OCR). In that post, we applied six different OCR engines to a very small set of example images and PDF files to test and evaluate their performance.
As our results demonstrated, most of the cloud providers performed better than the traditionally available OCR tools.
However, many readers have reached out to ask: why can't we use the power of Python to read an image/PDF in UiPath instead of using the cloud variants of ABBYY, the Microsoft Vision API, or the Google Vision API?
Either way, it's important to understand how OCR works with Python. Almost all of the cloud OCR engine providers offer an SDK for Python.
Even if you use Python instead of the activities provided by UiPath, it works in a similar fashion:
- You need to read pdf/image
- You need to pass image to engine
- The engine will return data in a structured format
Then why do some people prefer Python for OCR? The reason is the preprocessing of the image before it is passed to the engine and the post-processing of the data received from the engine. The extracted text can then be subjected to any amount of post-processing for additional tasks.
Think of OCR as a process made up of several sub-processes, each of which should perform as accurately as possible. These sub-processes are:
- Preprocessing of the Image
- Text Localization
- Character Segmentation
- Character Recognition
- Post Processing
If you wish to read more about OCR working, you can read the links provided in the reference section.
In the remainder of this blog post, we'll learn to work with Tesseract OCR + Python and to integrate the same Python script into UiPath.
By the end of the tutorial, you'll be able to convert the text in an image/PDF to a Python string and then use the Python script inside UiPath to post-process the data however you wish.
Just keep reading...
Prerequisites:
- Install pytesseract (using pip install pytesseract), a Python wrapper on top of Tesseract.
- OpenCV (cv2) can also be used with Tesseract for better image processing.
- Add the UiPath.Python.Activities package to your project dependencies in UiPath, and set the Python path.
- Install and configure the Tesseract binary on the same machine where this script will be executed.
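Before wiring anything into UiPath, it can help to confirm that the Python packages are importable from the interpreter UiPath will use. The snippet below is a minimal sketch, not part of the original script; `missing_prerequisites` is a hypothetical helper name, and it only checks the Python packages (pytesseract, cv2 from opencv-python, PIL from Pillow), not the Tesseract binary.

```python
import importlib.util


def missing_prerequisites(modules=("pytesseract", "cv2", "PIL")):
    """Return the names of top-level modules that cannot be imported.

    Hypothetical helper for illustration; the default module names are the
    ones imported by the OCR script in this post.
    """
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


# Usage: an empty list means every package was found.
print(missing_prerequisites())
```

Run it with the same Python installation that you point the UiPath Python Scope at, since packages installed for another interpreter will not be visible.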
Sample example:
For better understanding, this post is divided into two parts:
- OCR Python code that takes an image as input and returns the relevant data in text format for further processing
- UiPath workflow that uses the Python activities and the OCR Python code (written in step 1)
Python Code for OCR (Say UiPathOCR.py)
    # import the necessary packages
    from PIL import Image
    import pytesseract
    import cv2
    import os


    def ocr(image_path, preprocess):
        """
        Takes an image and preprocesses it for some common handling
        :param image_path: path of the image to read
        :param preprocess: "thresh" or "blur"; any other value skips this step
        :return: the text recognized in the image
        """
        # load the example image and convert it to grayscale
        image = cv2.imread(image_path)
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

        # check to see if we should apply thresholding to preprocess the image
        if preprocess == "thresh":
            gray = cv2.threshold(gray, 0, 255,
                                 cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
        # check to see if median blurring should be done to remove noise
        elif preprocess == "blur":
            gray = cv2.medianBlur(gray, 3)

        # write the grayscale image to disk as a temporary file so we can
        # apply OCR to it
        filename = "{}.png".format(os.getpid())
        cv2.imwrite(filename, gray)

        # load the image as a PIL/Pillow image, apply OCR, and then delete
        # the temporary file
        # You might need to set the path for tesseract in case it's not in
        # your system path, like below
        # pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
        text = pytesseract.image_to_string(Image.open(filename))
        os.remove(filename)
        return text
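To get a feel for what the "thresh" option does, here is a pure-Python sketch of Otsu's method on a flat list of grayscale values. It is only an illustration of the idea behind cv2.THRESH_OTSU, not OpenCV's implementation; `otsu_threshold` and `binarize` are hypothetical names, and the six pixel values are made up.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0-255) that maximises between-class variance.

    Pixels <= t form one class and pixels > t the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # no pixels in the lower class yet
        if w1 == 0:
            break     # every pixel is in the lower class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(pixels, t):
    """Map pixels above t to 255 and the rest to 0 (binary thresholding)."""
    return [255 if p > t else 0 for p in pixels]


# Made-up example: three dark "ink" pixels and three bright "paper" pixels.
sample = [10, 12, 11, 200, 205, 198]
t = otsu_threshold(sample)
print(t, binarize(sample, t))
```

The point of the step is that the threshold is picked from the image itself, so dark text and a light background end up cleanly separated before the image is handed to Tesseract.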
UiPath workflow
Drag the "Python Scope" activity into the Designer panel and set the required parameters.
- Path: the folder where Python is installed, e.g. "C:\Python36"
- Version: the Python version, e.g. 3.6

Drag the "Load Python Script" activity into the Designer panel to load the Python script (UiPathOCR.py) and supply the parameters below.
- File: the path of your Python script
- Result: a variable of type PythonObject

Drag the "Invoke Python Method" activity into the Designer panel and supply the function name that we want to invoke, along with the required arguments.
- Function Name: ocr
- Two arguments, i.e. 1. image_path: "Data\invoice-sample.jpg" and 2. preprocess: "", passed as dictionary {"","None"}

Drag the "Get Python Object" activity into the Designer panel to convert the Python object obtained from the above activity to the desired data type (in our case, it's the "dictionary" type).
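One way to keep this conversion simple is to have the Python side return a single JSON string and read it in UiPath as a String. This is only a sketch under that assumption; `to_uipath_payload` is a hypothetical helper that is not part of UiPathOCR.py, and the key names are made up for illustration.

```python
import json


def to_uipath_payload(text):
    """Package raw OCR text as a JSON string (hypothetical helper).

    Blank lines are dropped and surrounding whitespace is trimmed so the
    consumer receives only lines that carry text.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return json.dumps({"line_count": len(lines), "lines": lines})


# Usage with a made-up OCR result:
print(to_uipath_payload("Invoice\n\n  Your Company LLC \n"))
```

A string crosses the Python/UiPath boundary without any type mapping questions, and the JSON can then be parsed on the UiPath side into whatever structure the workflow needs.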

Finally, when you execute the workflow, you will receive the result in your desired variable, as below:
['', '', '', '', "'", 'Invoice', '', '', '', 'Your', 'Company', 'LLC', 'Address', '123,', 'State,', 'My', 'Country', 'P', '111-222-333,', 'F', '111-222-334', '', '', '', 'BILL', 'TO:', '', 'John', 'Doe', '', '', 'Alpha', 'Bravo', 'Road', '33', '', '', 'P:', '111-222-338,', 'F:', '111-222-334', '', 'client@example.net', '', '', '', 'SHIPPING', 'TO:', '', 'John', 'Doe', 'Office', '', '', 'Office', 'Road', '38,', '', '', 'P:', '111-383-222,', 'F:', '122-222-834', '', 'office@example.net', '', '', '', 'http://mrsinvoice.com', '', '', '', ' ', '', '', '', ' ', '', '', '', ' ', '', '', '', ' ', '', '', '', ' ', '', '', '', 'Invoice', '#', '00001', '', 'Invoice', 'Date', '12/12/2001', '', 'Name', 'of', 'Rep.', 'Bob', '', '', 'Contact', 'Phone', '101-102-103', '', '', '', ' ', '', '', '', 'Payment', 'Terms', '', '', '', ' ', '', '', '', ' ', '', '', '', 'Cash', 'on', 'Delivery', '', '', '', ' ', '', '', '', ' ', '', '', '', 'Amount', 'Due:', '$4,170', '', '', '', ' ', '', '', '', ' ', '', '', '', ' ', '', '', '', ' ', '', '', '', 'NO', 'PRODUCTS', '/', 'SERVICE', 'QUANTITY', '/', 'RATE', '/', 'UNIT', 'AMOUNT', '', 'HOURS.', 'PRICE', '', '', '1', 'tye', '2', '$20', '$40', '', '', '2__|', 'Steering', 'Wheel', '5', '$10', '$50', '', '', '3', '|', 'Engine', 'oil', '10', '$15', '$150', '', '', '4', '|', 'Brake', 'Pad', '24', '$1000', '$2,400', '', '', 'Subtotal', '$275', '', '', 'Tax', '(10%)', '$27.5', '', '', 'Grand', 'Total', '$302.5', '', '', '', "'THANK", 'YOU', 'FOR', 'YOUR', 'BUSINESS.']
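A token list like this is where the post-processing comes in. As a rough sketch, the empty and whitespace-only tokens can be dropped and a few fields pulled out with regular expressions. The helper names and the patterns here are assumptions chosen for this sample invoice, and `tokens` is a short subset of the output above, not the full list.

```python
import re

# A short subset of the OCR output shown above.
tokens = ["", "", "Invoice", "#", "00001", "", "Invoice", "Date",
          "12/12/2001", "", "client@example.net", "", " ",
          "Amount", "Due:", "$4,170", "", "Grand", "Total", "$302.5"]


def clean_tokens(tokens):
    """Drop empty and whitespace-only tokens and trim the rest."""
    return [t.strip() for t in tokens if t.strip()]


def extract_fields(tokens):
    """Pull a few invoice fields out of the token list (illustrative patterns)."""
    text = " ".join(clean_tokens(tokens))
    fields = {}

    match = re.search(r"Invoice # (\S+)", text)
    if match:
        fields["invoice_number"] = match.group(1)

    match = re.search(r"Invoice Date (\d{2}/\d{2}/\d{4})", text)
    if match:
        fields["invoice_date"] = match.group(1)

    match = re.search(r"Amount Due: \$([\d,]+(?:\.\d+)?)", text)
    if match:
        # remove thousands separators before converting to a number
        fields["amount_due"] = float(match.group(1).replace(",", ""))

    fields["emails"] = re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", text)
    return fields


print(extract_fields(tokens))
```

Patterns like these are brittle against OCR mistakes (note 'tye' in the real output), so in practice they usually need some tolerance for misread characters.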
Summary
Today we learned how to use Tesseract on our machines with UiPath, the first part in a two-part series on using Tesseract for OCR.
- Use a Python script with the Tesseract binary to apply OCR to input images.
- Then use UiPath to invoke the Python script and perform the task.
However, if you compare the result of this OCR with the result from a cloud-based engine, it is poor. For better accuracy, we can train a custom machine learning model to recognize characters in our specific use case.
We can also use OpenCV (cv2) with Tesseract to pre-process the image better before applying OCR.
Tesseract is best suited to high-resolution inputs, such as a sample invoice PDF or a formatted image with a clear background.
Next week we'll learn how to use UiPath with ServiceNow, so stay tuned.
Notes:
- You need to install UiPath.Python.Activities
- You might get pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path
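If you hit that error, one option is to look for the binary yourself and point pytesseract at it before calling the OCR function. The sketch below assumes the executable is named tesseract; `resolve_tesseract` and `configure_pytesseract` are hypothetical helpers, and the Windows path in the usage comment is the example location from the script's comments, which may differ on your machine.

```python
import os
import shutil


def resolve_tesseract(candidates=(), which=shutil.which):
    """Return a usable tesseract path, or None if none is found.

    PATH is searched first, then each explicit candidate path in order.
    """
    found = which("tesseract")
    if found:
        return found
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def configure_pytesseract(candidates=()):
    """Point pytesseract at the binary; returns the path used (or None)."""
    path = resolve_tesseract(candidates)
    if path is not None:
        # imported lazily so the lookup above works even without pytesseract
        import pytesseract
        pytesseract.pytesseract.tesseract_cmd = path
    return path


# Usage (the path is an example install location, adjust as needed):
# configure_pytesseract([r"C:\Program Files\Tesseract-OCR\tesseract.exe"])
```

Returning None instead of raising leaves it to the caller to decide whether a missing binary should stop the UiPath workflow or just be logged.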
References:
- Tesseract installer binary for Windows: https://digi.bib.uni-mannheim.de/tesseract/
- OCR with Tesseract: https://nanonets.com/blog/ocr-with-tesseract/
- https://www.learnopencv.com/deep-learning-based-text-recognition-ocr-using-tesseract-and-opencv/