<p>Our client is looking for a Junior to Mid-level Developer who is skilled in PySpark.</p><p></p><p><strong>Job Duties:</strong></p><p>• Develop, test, and maintain data processing applications and pipelines using PySpark and related technologies.</p><p>• Perform data extraction, transformation, and loading (ETL) from multiple sources into target systems.</p><p>• Ensure data quality, consistency, and performance across data workflows.</p><p>• Participate in code reviews, documentation, and continuous improvement of data processes.</p><p>• Troubleshoot and resolve issues in data processing and integration environments.</p><p>• Support the deployment, monitoring, and maintenance of data solutions in production.</p><p></p><p><strong>Requirements:</strong></p><p>• 1-3 years of experience in Python, including data structures, algorithms, and libraries for data manipulation (e.g., Pandas).</p><p>• Deep understanding of Apache Spark, its architecture, and components (RDDs, DataFrames, Datasets).</p><p>• Strong knowledge of SQL for data querying and manipulation.</p><p>• Experience in ETL (Extract, Transform, Load) processes using PySpark.</p><p>• Ability to analyze and interpret complex datasets and derive insights.</p><p>• Strong analytical skills to troubleshoot issues in data processing pipelines.</p><p>• Good command of spoken and written English, Cantonese, and Mandarin.</p>