PDF Data Extraction Engineer | Brightleaf Solutions

A minimum of 2 years of relevant industry experience is required.
This role requires entity extraction from PDFs.
Must be well versed in the architecture and different layers of PDFs and able to decompose these layers.
Must be able to identify font information, color schemes, handwritten text, images, tables, tables of contents, headers and footers, and so on from PDFs using Python.
Must have hands-on experience working with different Python libraries like pdfminer, pytesseract, pypdf, PDFBox, etc.
At least 2 years of industry experience in Python programming.

BE / B.Tech / MCA / MSc / BSc
Proven experience building applications using Python, OCR, etc.
Contribute to various development projects.
Excellent English communication skills, both written and verbal
Ability to self-learn
Confident and friendly
Critical thinking, logical analysis, and ability to work independently, prioritize, and take initiative