Senior Data Engineer - NoSQL, PySpark, AWS, CI/CD

  • Company: UST

Experience: 4+ years required

Pay: Salary information not included

Type: Full Time

Location: Kerala

Skills: RDBMS, NoSQL, AWS, PySpark, CI/CD


Job Description

Role Description

Key Responsibilities:

  • Data Pipeline Design & Architecture: Design and architect robust data pipelines for structured, semi-structured, and unstructured data. Develop and manage databases (RDBMS, NoSQL, data lakes) to ensure data accessibility and integrity.

  • ETL & Data Transformation: Implement efficient ETL processes using tools like PySpark and Hadoop to prepare data for analytics and AI use cases (a minimal PySpark sketch follows this description).

  • Database Optimization: Optimize database performance through query tuning, indexing, and caching strategies using Azure and GCP caching databases. Monitor, maintain, and troubleshoot database infrastructure to ensure high availability and performance.

  • CI/CD Pipelines & Version Control: Build and maintain CI/CD pipelines and manage YML files for efficient code integration and deployment. Use GitHub for version control and team collaboration.

  • Containerized Deployment: Leverage Docker for containerized deployment, and manage database and pipeline processes with Docker commands.

  • System Design & Best Practices: Ensure solutions follow best practices in system design, focusing on trade-offs, security, performance, and efficiency. Collaborate with engineering teams to design scalable solutions for large-scale data processing.

  • Technology Research & Implementation: Stay current with the latest database technologies and implement best practices for database design and management.

Qualifications

  • Experience: 4+ years of experience in database architecture and optimization. Proven expertise in RDBMS, NoSQL, and semi-structured databases (e.g., MySQL, PostgreSQL, MongoDB).

  • Technical Skills: Proficiency in programming languages for database integration and optimization (Python preferred). Strong knowledge of distributed data processing tools like PySpark and Hadoop. Hands-on experience with AWS services for data storage and processing, including S3. Strong familiarity with Redis for caching and query optimization (illustrative S3 and Redis sketches follow below). Experience with Docker for containerized deployment and writing CI/CD pipelines using YML files.

  • Problem-Solving & Collaboration: Ability to collaborate effectively with engineering teams to design scalable data solutions. Strong problem-solving skills and attention to detail.

Skills

RDBMS, NoSQL, PySpark, AWS, CI/CD
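
For illustration, here is a minimal PySpark ETL sketch of the kind of pipeline this role describes: read semi-structured JSON from a data lake, clean and type the records, and write partitioned Parquet for analytics. The bucket paths, column names, and job name are hypothetical, not part of the posting.

```python
# Minimal PySpark ETL sketch: JSON in a lake -> cleaned, partitioned Parquet.
# All paths and column names below are hypothetical examples.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("events-etl")  # hypothetical job name
    .getOrCreate()
)

# Extract: semi-structured event logs landed in the raw zone
raw = spark.read.json("s3a://example-lake/raw/events/")  # hypothetical path

# Transform: drop malformed rows, normalize timestamps, derive a date column
cleaned = (
    raw.dropna(subset=["event_id", "event_ts"])
       .withColumn("event_ts", F.to_timestamp("event_ts"))
       .withColumn("event_date", F.to_date("event_ts"))
       .dropDuplicates(["event_id"])
)

# Load: columnar output partitioned by date for efficient downstream queries
(
    cleaned.write
           .mode("overwrite")
           .partitionBy("event_date")
           .parquet("s3a://example-lake/curated/events/")  # hypothetical path
)

spark.stop()
```

Partitioning by date keeps downstream scans cheap, since analytics queries can prune to the partitions they need instead of reading the whole dataset.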
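Likewise, a small sketch of the S3 work mentioned under Technical Skills, using boto3. It assumes credentials come from the standard AWS credential chain (environment variables, a profile, or an IAM role); the bucket and key names are hypothetical.

```python
# Minimal boto3 sketch: upload a local extract to S3 and read it back.
# Bucket and key names are hypothetical examples.
import boto3

s3 = boto3.client("s3")

# Upload a pipeline artifact to the lake's raw zone
s3.upload_file("daily_extract.csv", "example-lake", "raw/daily_extract.csv")

# Fetch it back and inspect the contents
obj = s3.get_object(Bucket="example-lake", Key="raw/daily_extract.csv")
body = obj["Body"].read().decode("utf-8")
print(body[:200])  # first 200 characters, just as a sanity check
```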
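Finally, a minimal cache-aside sketch with Redis (redis-py), the common pattern behind "caching and query optimization": repeated lookups are served from the cache, with a database fallback on a miss. The connection details, key scheme, TTL, and the fetch_user_from_db helper are all hypothetical stand-ins.

```python
# Minimal cache-aside sketch with redis-py: serve repeated lookups from
# Redis, fall back to the database on a miss, and expire stale entries.
import json

import redis

r = redis.Redis(host="localhost", port=6379, db=0)  # hypothetical connection

CACHE_TTL_SECONDS = 300  # expire entries so stale data ages out


def fetch_user_from_db(user_id: int) -> dict:
    # Hypothetical stand-in for a real (expensive) database query.
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user(user_id: int) -> dict:
    key = f"user:{user_id}"  # hypothetical key scheme
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)  # cache hit: skip the database entirely
    user = fetch_user_from_db(user_id)
    r.setex(key, CACHE_TTL_SECONDS, json.dumps(user))  # miss: populate cache
    return user
```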