Data Engineer

3 days ago


Islamabad, Islamabad, Pakistan Fusemachines Full time

About Fusemachines

Fusemachines is a leading AI strategy, talent, and education services provider. Founded by Sameer Maskey Ph.D., Adjunct Associate Professor at Columbia University, Fusemachines has a core mission of democratizing AI. With a presence in 4 countries (Nepal, United States, Canada, and Dominican Republic) and more than 450 full-time employees, Fusemachines seeks to bring its global expertise in AI to transform companies around the world.

Type: Full-time, Remote

About the role

This is a remote full-time position responsible for designing, building, testing, optimizing and maintaining the infrastructure and code required for data integration, storage, processing, pipelines and analytics (BI, visualization and Advanced Analytics) from ingestion to consumption, implementing data flow controls, and ensuring high data quality and accessibility for analytics and business intelligence purposes. This role requires a strong foundation in programming, and a keen understanding of how to integrate and manage data effectively across various storage systems and technologies.

We are looking for a skilled Data Engineer with a strong background in Python, SQL, PySpark and large-scale, cloud-based data solutions on AWS, and with a passion for data quality, performance and cost optimization. The ideal candidate will develop in an Agile environment.

This role is perfect for an individual passionate about leveraging data to drive insights, improve decision-making, and support the strategic goals of the organization through innovative data engineering solutions. 

Qualification & Experience

  • Must have a full-time Bachelor's degree in Computer Science, Information Systems, Engineering, or a related field.
  • At least 2 years of experience as a data engineer with strong expertise in Python, SQL, PySpark and AWS in an Agile environment, with a proven track record of building and optimizing data pipelines, architectures, and datasets, and proven experience in data storage, modeling, management, lake, warehousing, processing/transformation, integration, cleansing, validation and analytics.
  • 2+ years of experience with DevOps tools and technologies: GitHub or AWS DevOps.
  • Proven experience delivering large-scale projects and products for Data and Analytics, as a data engineer within AWS.
  • Preferred: previous experience working with retail or other similar data models.
  • Following certifications:
      • AWS Certified Cloud Practitioner
      • AWS Certified Data Engineer - Associate
  • Nice to have:
      • Databricks Certified Associate Developer for Apache Spark
      • Databricks Certified Data Engineer Associate

Required skills/Competencies

  • Strong programming skills in one or more object-oriented languages such as Python (must have), Scala, and Java, and proficiency in writing high-quality, scalable, maintainable, efficient and optimized code for data integration, storage, processing, manipulation and analytics solutions.
  • Strong SQL skills and experience working with complex data sets, Enterprise Data Warehouse and writing advanced SQL queries.
  • Proficient with relational databases (RDS, MySQL, Postgres, or similar) and NoSQL databases (Cassandra, MongoDB, Neo4j, etc.).
  • Strong analytic skills related to working with structured and unstructured datasets.
  • Thorough understanding of big data principles, techniques, and best practices.
  • Experience with scalable and distributed data processing technologies such as Spark/PySpark (must have, including Spark SQL) and Kafka, to be able to handle large volumes of data.
  • Experience with stream-processing systems (Storm, Spark Streaming, etc.) is a plus.
  • Experience in implementing data pipelines and efficient ELT/ETL processes, batch and real-time, in AWS and using open source solutions, being able to develop custom integration solutions as needed, including data integration from different sources such as APIs (PoS integrations are a plus), ERP (Oracle and Allegra are a plus), databases, flat files, Apache Parquet, and event streaming, including cleansing, transformation and validation of the data.
  • Experience in data cleansing, transformation, and validation.
  • Understanding of data modeling and database design principles, with the ability to implement efficient database schemas that meet the requirements of data solutions, and a good understanding of dimensional data modeling.
  • Knowledge of cloud computing, specifically AWS services related to data and analytics, such as S3, EMR, Glue, SageMaker, RDS, Redshift, Lambda, Kinesis, Lake Formation, EC2, ECS/ECR, EKS, IAM, CloudWatch, etc.
  • Experience implementing data warehousing, data lake and data lakehouse solutions in AWS.
  • Experience in orchestration using technologies like Azkaban, Luigi, Airflow, etc.
  • Good understanding of BI solutions, including Looker and LookML (Looker Modeling Language).
  • Nice to have: familiarity with advanced analytics, AI/ML services and tools, and the ability to integrate advanced analytics, machine learning, and AI capabilities into data solutions.
  • Strong understanding of the software development lifecycle (SDLC), especially Agile methodologies.
  • Knowledge of SDLC tools and technologies, including project management software (Jira or similar), source code management (GitHub, AWS CodeCommit or similar), CI/CD systems (GitHub Actions, Jenkins, AWS CodePipeline or similar) and binary repository managers (Sonatype Nexus, AWS CodeArtifact or similar).
  • Knowledge and hands-on experience of DevOps principles, tools and technologies (GitHub and AWS DevOps), including continuous integration, continuous delivery (CI/CD), infrastructure as code (IaC – Terraform), configuration management, automated testing, performance tuning, and cost management and optimization.
  • Knowledge of data structures and algorithms and good software engineering practices.
  • Strong analytical skills to identify and address technical issues, performance bottlenecks, and system failures.
  • Proficiency in debugging and troubleshooting issues in complex data and analytics environments and pipelines.
  • Understanding of data quality and governance, including implementation of data quality and integrity checks and monitoring processes to ensure that data is accurate, complete, and consistent.
  • Good problem-solving skills: being able to troubleshoot data processing pipelines and identify performance bottlenecks and other issues.
  • Strong interpersonal skills and ability to work with a wide range of stakeholders.
  • Excellent communication skills to collaborate with cross-functional teams, including business users, data architects, DevOps/DataOps/MLOps engineers, data analysts, data scientists, developers, and operations teams. The ability to convey complex technical concepts and insights to non-technical stakeholders effectively is essential.
  • Ability to document processes, procedures, and deployment configurations.
  • Understanding of security practices, including network security groups, encryption, and compliance standards, and ability to implement security controls and best practices within data and analytics solutions, including proficient knowledge of and working experience with various cloud security vulnerabilities and ways to mitigate them.
  • Self-motivated with the ability to work well in a team.
  • Strong project management and organizational skills.
  • A willingness to stay updated with the latest services, data engineering trends, and best practices in the field.
  • Comfortable with picking up new technologies independently and working in a rapidly changing environment with ambiguous requirements.
  • Care about architecture, observability, testing, and building reliable infrastructure and data pipelines.

Responsibilities:

  • Design, implement, deploy, test and maintain highly scalable and efficient data architectures, defining and maintaining standards and best practices for data management independently with minimal guidance.
  • Ensure systems meet business requirements and industry practices for data integrity, performance, and reliability.
  • Integrate new data management technologies and software engineering tools into existing structures.
  • Create custom software components and analytics applications.
  • Employ a variety of languages and tools to marry systems together or to hunt down opportunities to improve current processes.
  • Evaluate and advise on technical aspects of open work requests in the data pipeline with the project team.
  • Handle ELT/ETL processes, including data extraction, loading and transformation, from different sources, ensuring consistency and quality.
  • Transform and clean data for further analysis and storage.
  • Design and optimize data models and schemas to support business requirements and analysis.
  • Implement monitoring tools and systems to ensure the availability and performance of data systems.
  • Manage data security and access, ensuring confidentiality and integrity.
  • Automate repetitive tasks and processes to improve operational efficiency.
  • Collaborate with data science teams to establish pipelines and workflows for training, validation, deployment, and monitoring of machine learning models.
  • Automate deployment and management of machine learning models in production environments.
  • Contribute to data quality assurance efforts, such as implementing data validation checks and tests to ensure reliability, efficiency, accuracy, completeness and consistency of data.
  • Test software solutions and meet product quality standards prior to release to QA.
  • Ensure the reliability, scalability, and efficiency of data systems are maintained at all times.
  • Identify and resolve performance bottlenecks in pipelines due to data, queries and processing workflows to ensure efficient and timely data delivery.
  • Work with DevOps teams to optimize resources.
  • Assist in the configuration and management of data warehousing and data lake solutions.
  • Collaborate closely with cross-functional teams, including Product, Engineering, Data Scientists, and Analysts, to thoroughly understand data requirements, provide data engineering support, and extend the company's data with third-party sources of information when needed.
  • Take ownership of the storage layer and database management tasks, including schema design, indexing, and performance tuning.
  • Evaluate and implement cutting-edge technologies and methodologies, and continue learning and expanding skills in data engineering and cloud platforms, to improve and modernize existing data systems.
  • Develop, design, and execute data governance strategies encompassing cataloging, lineage tracking, quality control, and data governance frameworks that align with current analytics demands and industry best practices, working closely with the Data Architect.
  • Ensure technology solutions support the needs of the customer and/or organization.
  • Define and document data engineering architectures, processes and data flows.

Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.

