Job description

KrawlNet Technologies provides services to advertisers and publishers to run affiliate programs effectively. KrawlNet aggregates products from various retailers so that publishers and analytics can readily and effectively grow their business. An integral part of our offerings is web-scale crawl and extraction. Our objective is to solve the business problems faced in the industry and to provide the associated services of cleansing and normalizing web content.

Responsibility:
As a software developer in this full-time permanent role, you will be responsible for:
Ensuring an uninterrupted flow of data from the various sources by crawling the web
Extracting and managing large volumes of structured and unstructured data, and parsing that data into a standardized format for ingestion into data sources
Actively participating in troubleshooting, debugging and maintaining broken crawlers
Scraping difficult websites by deploying anti-blocking and anti-captcha tools
Applying strong data analysis skills to data quality, data consolidation and data wrangling
Applying a solid understanding of data structures and algorithms
Complying with coding standards and technical design

Requirements:
Experience with complex crawling, such as captcha, reCAPTCHA and proxy bypassing
Regular expressions
Basic understanding of front-end technologies, such as JavaScript, HTML5, and CSS3
Strong fundamental C.S. skills (data structures, algorithms, multi-threading, etc.)
Good communication skills (must)
Experience with web crawler projects is a plus

Required skills:
Python, Perl, Scrapy, Selenium, headless browsers, Puppeteer, Node.js, Beautiful Soup, SVN, GitHub, AWS

Desired:
Experience in productionizing machine learning models
Experience with DevOps tools such as Docker and Kubernetes
Familiarity with a big data stack (e.g. Airflow, Spark, Hadoop, MapReduce, Hive, Impala, Kafka, Storm, and equivalent cloud-native services)

Education: B.E / B.Tech / B.Sc.
Experience: 0-2 years

Location: Pune (in-office)

How to Apply: Please email a copy of your CV to hr@krawlnet.com