
Lead Data Engineer
3 weeks ago
Fusemachines is a leading AI strategy, talent, and education services provider. Founded by Sameer Maskey Ph.D., Adjunct Associate Professor at Columbia University, Fusemachines has a core mission of democratizing AI. With a presence in 4 countries (Nepal, United States, Canada, and Dominican Republic) and more than 450 employees, Fusemachines seeks to bring its global expertise in AI to transform companies around the world.
Location: Remote (Full-time)
About the role
This is a remote full-time position, responsible for designing, building, testing, optimizing and maintaining the infrastructure and code required for data integration, storage, processing, pipelines and analytics (BI, visualization and Advanced Analytics) from ingestion to consumption, implementing data flow controls, and ensuring high data quality and accessibility for analytics and business intelligence purposes. This role requires a strong foundation in programming, and a keen understanding of how to integrate and manage data effectively across various storage systems and technologies.
We're looking for someone who can quickly ramp up, contribute right away and lead the work in Data & Analytics, helping with everything from backlog definition to architecture decisions, and technically leading the rest of the team with minimal oversight.
Qualifications / Skill Set
- Must have a full-time Bachelor's degree in Computer Science, Information Systems, Engineering, or a related field.
- 5+ years of real-world data engineering development experience in AWS and GCP (certifications preferred).
- Strong expertise in Python, SQL, PySpark and AWS in an Agile environment, with a proven track record of building and optimizing data pipelines, architectures, and datasets, and proven experience in data storage, modeling, management, lake, warehousing, processing/transformation, integration, cleansing, validation and analytics.
- Senior person who can understand requirements and design end to end solutions with minimal oversight.
- Strong programming skills in Python and Scala, and proficiency in writing efficient and optimized code for data integration, storage, processing and manipulation.
- Strong knowledge of SDLC tools and technologies, including project management software (Jira or similar), source code management (GitHub or similar), CI/CD systems (GitHub Actions, AWS CodeBuild or similar) and binary repository managers (AWS CodeArtifact or similar).
- Good understanding of Data Modeling and Database Design Principles. Able to design and implement efficient database schemas that meet the requirements of the data architecture to support data solutions.
- Strong SQL skills and experience working with complex data sets and Enterprise Data Warehouses, and writing advanced SQL queries. Proficient with relational databases (RDS, MySQL, Postgres, or similar) and NoSQL databases (Cassandra, MongoDB, Neo4j, etc.).
- Skilled in Data Integration from different sources such as APIs, databases, flat files, event streaming.
- Strong experience in implementing data pipelines and efficient ELT/ETL processes, batch and real-time, in AWS and using open source solutions, being able to develop custom integration solutions as needed, including data integration from different sources such as APIs (PoS integrations are a plus), ERP (Oracle and Allegra are a plus), databases, flat files, Apache Parquet, and event streaming, as well as cleansing, transformation and validation of the data.
- Strong experience with scalable and distributed Data Technologies such as Spark/PySpark, DBT and Kafka, to be able to handle large volumes of data.
- Experience with stream-processing systems (Storm, Spark Streaming, etc.) is a plus.
- Strong experience in designing and implementing Data Warehousing solutions in AWS with Redshift. Demonstrated experience in designing and implementing efficient ELT/ETL processes that extract data from source systems, transform it (DBT), and load it into the data warehouse.
- Strong experience in Orchestration using Apache Airflow.
- Expert in Cloud Computing in AWS, including deep knowledge of a variety of AWS services like Lambda, Kinesis, S3, Lake Formation, EC2, EMR, ECS/ECR, IAM, CloudWatch, etc.
- Good understanding of Data Quality and Governance, including implementation of data quality checks and monitoring processes to ensure that data is accurate, complete, and consistent.
- Good understanding of BI solutions including Looker and LookML (Looker Modeling Language).
- Strong knowledge and hands-on experience of DevOps principles, tools and technologies (GitHub and AWS DevOps) including continuous integration, continuous delivery (CI/CD), infrastructure as code (IaC – Terraform), configuration management, automated testing, performance tuning and cost management and optimization.
- Good problem-solving skills: able to troubleshoot data processing pipelines and identify performance bottlenecks and other issues.
- Possesses strong leadership skills with a willingness to lead, create ideas, and be assertive.
- Strong project management and organizational skills.
- Excellent communication skills to collaborate with cross-functional teams, including business users, data architects, DevOps/DataOps/MLOps engineers, data analysts, data scientists, developers, and operations teams. Must be able to convey complex technical concepts and insights to non-technical stakeholders effectively.
- Ability to document processes, procedures, and deployment configurations.
Responsibilities
- Design, implement, deploy, test and maintain highly scalable and efficient data architectures, defining and maintaining standards and best practices for data management independently with minimal guidance.
- Ensuring the scalability, reliability, quality and performance of data systems.
- Mentoring and guiding junior/mid-level data engineers.
- Collaborating with Product, Engineering, Data Scientists and Analysts to understand data requirements and develop data solutions, including reusable components.
- Evaluating and implementing new technologies and tools to improve data integration, data processing and analysis.
- Design architecture, observability and testing strategies, and build reliable infrastructure and data pipelines.
- Take ownership of the storage layer and data management tasks, including schema design, indexing, and performance tuning.
- Swiftly address and resolve complex data engineering issues and incidents, and resolve bottlenecks in SQL queries and database operations.
- Conduct Discovery on existing Data Infrastructure and Proposed Architecture.
- Evaluate and implement cutting-edge technologies and methodologies and continue learning and expanding skills in data engineering and cloud platforms, to improve and modernize existing data systems.
- Evaluate, design, and implement data governance solutions: cataloging, lineage, quality and data governance frameworks that are suitable for a modern analytics solution, considering industry-standard best practices and patterns.
- Define and document data engineering architectures, processes and data flows.
- Assess best practices and design schemas that match business needs for delivering a modern analytics solution (descriptive, diagnostic, predictive, prescriptive).
- Be an active member of our Agile team, participating in all ceremonies and continuous improvement activities.
Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.