As a Data Extraction Engineer, you will be responsible for designing, building and maintaining tooling that extracts data from a variety of sources, ranging from highly structured to seemingly unstructured. You will work closely with the engineering and data science teams to turn the raw output of these sources into usable data.
What you’ll be responsible for
- You will be in charge of developing extraction tooling for a wide range of sources.
- You will design systems that are highly automated and adapt to changing conditions.
- You will take charge of your own work and proactively suggest improvements across the range of tooling.
- You will vigilantly identify errors and help define technical solutions.
- You will help meet these goals using a wide range of technologies.
What we’re looking for
- Strong agile mindset - iterate fast and get feedback early
- Proven experience in Python (pandas, NumPy)
- Thorough understanding of data structures
- Ability to work on projects using Git
- Excellent communication skills in English
- Previous experience building tooling for extracting information, be it web scraping, text extraction, or sentiment analysis
- Positive attitude
- Desire and eagerness to contribute to the success of the company and grow along with it
- Hands-on mentality
Bonus points for
- Extensive knowledge of XPath and regular expressions
- Understanding of document structures, especially PDF files
- Experience in developing and monitoring long-running Python jobs
- Experience with AWS Lambda or other serverless platforms