Job Description
We are looking for a highly skilled Data Engineer to join our growing team. In this role, you will be responsible for designing, building, and maintaining scalable big data pipelines and data architectures. You will play a key role in enabling analytics and real-time use cases by ensuring the availability, quality, and accessibility of data across the organization. If you are passionate about working with distributed databases, real-time analytics, and open data table formats, we want to hear from you!
Key Responsibilities:
• Design, develop, and optimize big data pipelines for both batch and real-time data processing.
• Build scalable and robust data architectures to support analytics, machine learning, and real-time applications.
• Create and maintain efficient database models tailored for OLAP and OLTP workloads.
• Work with distributed databases (e.g., Cassandra, ScyllaDB, CockroachDB) to enable horizontal scalability and low-latency operations.
• Implement and optimize distributed query engines like Presto, Trino, or similar tools for interactive querying.
• Leverage columnar databases (e.g., ClickHouse, Apache Druid) for high-performance analytics use cases.
• Work with open table formats like Apache Iceberg, Delta Lake, and Apache Hudi to enable transactional capabilities on data lakes.
• Collaborate with cross-functional teams, including data scientists, analysts, and software engineers, to understand data needs and deliver scalable solutions.
• Ensure data reliability, consistency, and quality through proper validation and monitoring frameworks.
• Stay updated with emerging technologies and advocate for the adoption of tools and practices to improve system efficiency and scalability.
Required Skills and Qualifications:
• 4+ years of experience as a Data Engineer or in a similar role.
• Strong experience building big data pipelines using tools like Apache Spark and Kafka.
• Proficiency in working with distributed databases for efficient storage and retrieval of time-series, tabular, semi-structured, and unstructured data.
• Expertise in data modeling for both relational and non-relational (e.g., columnar) databases to support complex analytical workloads.
• Hands-on experience with distributed query engines like Presto, Trino, or Apache Impala.
• In-depth knowledge of columnar storage formats like Apache Parquet and ORC, and of row-based formats like Avro.
• Familiarity with open table formats (e.g., Apache Iceberg, Delta Lake, Apache Hudi) and their transactional capabilities.
• Strong understanding of data architecture principles, including data warehousing, data lakes, and lakehouse paradigms.
• Programming experience in languages like Python, Scala, or Java for ETL development and pipeline automation.
• Solid understanding of cloud platforms (e.g., AWS, GCP, Azure) and tools for data processing and storage.
• Strong knowledge of SQL and experience writing complex queries for analytics and performance optimization.
• Familiarity with streaming frameworks like Kafka, Kinesis, or Pulsar for real-time data processing.
Preferred Skills:
• Experience working with data governance tools and ensuring compliance with data security policies.
• Knowledge of DevOps practices for data infrastructure, including CI/CD pipelines for data engineering workflows.
• Exposure to machine learning pipelines and collaboration with data science teams.
• Understanding of data observability and monitoring tools (e.g., Prometheus).
• A cloud platform certification related to data engineering is a plus.
Tagged as: Engineering