Job Description
As an Automation Engineer, you will develop reliable, scalable systems for browser automation, crawling, and web data extraction. Your main objective is to support AI agents in mimicking human actions on the internet, such as navigating websites, filling in forms, clicking buttons, and extracting structured insights from unstructured pages. Ideal candidates excel at solving real-world problems through intelligent automation and are passionate about integrating automation capabilities with AI technologies.

Your key responsibilities will include building and managing scalable web scrapers, crawlers, and data extractors for various platforms, including dynamic and JavaScript-heavy sites. You will create browser automation flows with tools such as Puppeteer, Playwright, and Selenium that replicate human actions such as navigation, form submission, and page interactions. You will design robust retry mechanisms, failure handling, and rate-limiting strategies to ensure high reliability and compliance. You will collaborate with the AI team to encapsulate automation flows as APIs or callable modules that AI agents can use. You will also continuously monitor for changes in site structure, adjust automation workflows accordingly, and maintain the scraping infrastructure, including job schedulers, proxies, headless browser management, and logging pipelines. All automation must adhere to ethical guidelines and comply with legal and platform-specific terms of use.

To excel in this role, you must have 3-6 years of solid experience with web scraping tools and libraries such as Puppeteer, Playwright, and Selenium. Proficiency in Python or JavaScript (Node.js) is a must, along with a deep understanding of HTML, CSS, and browser DOM structures. Experience with headless browsers, CAPTCHA-solving techniques, rotating proxies, APIs, session-based authentication, and cookies will be beneficial. The ability to write clean, modular, and well-tested code is also crucial for success in this position.