Python Scrapy Engineer Recruitment Requirements

Duties include:

Design and develop efficient, stable web crawlers to collect data from a wide range of websites.

Implement parsing for complex web pages, including sites that require login, CAPTCHAs, or anti-scraping countermeasures.

Use browser automation tools (such as Selenium, Playwright, Puppeteer) to simulate user actions and complete data acquisition.

Optimize the crawler framework to handle distributed crawling, data cleaning, deduplication, and storage.

Participate in real business projects such as scraping payment-system interface data and collecting risk-control data.

Quickly develop and maintain data collection scripts according to business requirements, ensuring stability and data accuracy.

Job Requirements

Bachelor's degree or above in computer science or a related field (this requirement may be relaxed for exceptionally strong candidates).

3+ years of Python crawler development experience, with solid programming skills and good coding habits.

Proficient in the following:

Python web scraping libraries: Requests, BeautifulSoup, lxml, Scrapy, etc.;

Browser automation: Selenium, Playwright, Pyppeteer, etc.;

Countering anti-scraping measures: proxy pools, user-agent rotation, cookie/session maintenance, CAPTCHA recognition, etc.

Payment-related business experience is preferred, such as calling third-party payment platform interfaces, collecting transaction data, and supporting risk-control strategies.

Familiar with multithreaded/asynchronous programming and able to optimize crawling efficiency; experience with distributed crawling (e.g., Scrapy-Redis, Kafka, RabbitMQ) is a plus.

Familiar with MySQL, MongoDB, and Redis; able to handle data storage and query optimization.

Experienced with Linux and able to write shell/Python scripts; familiarity with Docker is preferred.

Strong analytical skills; able to independently resolve technical difficulties and keep data collection tasks running stably over the long term.

Preferred qualifications:

Familiar with the data collection logic required by risk-control systems and anti-fraud models.

Experienced in large-scale data collection and processing; able to design high-concurrency, distributed crawler architectures.

Open-source projects or technical articles on GitHub or in the technical community.