
Sr. Data Engineer Azure Databricks
1 week ago
About Fusemachines
Fusemachines is a leading AI strategy, talent, and education services and products provider. Founded by Sameer Maskey Ph.D., Adjunct Associate Professor at Columbia University, Fusemachines has a core mission of democratizing AI. With a presence in 4 countries (Nepal, United States, Canada, and Dominican Republic) and more than 400 full-time employees, Fusemachines seeks to bring its global expertise in AI to transform companies around the world.
About the Role
This is a remote, contract position responsible for designing, building, and maintaining the infrastructure required for data integration, storage, processing, and analytics (BI, visualization, and Advanced Analytics).
We are looking for a skilled Senior Data Engineer with a strong background in Python, SQL, PySpark, Azure, Databricks, Synapse, Azure Data Lake, DevOps, and cloud-based large-scale data applications, and with a passion for data quality, performance, and cost optimization. The ideal candidate will develop in an Agile environment, contributing to the architecture, design, and implementation of data products in the aviation industry, including migration from Synapse to Azure Data Lake. This role involves hands-on coding, mentoring junior staff, and collaborating with multidisciplinary teams to achieve project objectives.
Qualification & Experience
- Must have a full-time Bachelor's degree in Computer Science or similar.
- At least 5 years of experience as a data engineer with strong expertise in Databricks, Azure, DevOps, or other hyperscalers.
- 5+ years of experience with Azure DevOps and GitHub.
- Proven experience as a data engineer delivering large-scale data and analytics projects and products, including migrations.
- The following certifications:
  - Databricks Certified Associate Developer for Apache Spark
  - Databricks Certified Data Engineer Associate
  - Microsoft Certified: Azure Fundamentals
  - Microsoft Certified: Azure Data Engineer Associate
  - Microsoft Exam: Designing and Implementing Microsoft DevOps Solutions (nice to have)
Required Skills/Competencies
- Strong programming skills in one or more languages such as Python (must have) and Scala, with proficiency in writing efficient, optimized code for data integration, migration, storage, processing, and manipulation.
- Strong understanding and experience with SQL and writing advanced SQL queries.
- Thorough understanding of big data principles, techniques, and best practices.
- Strong experience with scalable and distributed Data Processing Technologies such as Spark/PySpark (must have: experience with Azure Databricks), DBT, and Kafka, to be able to handle large volumes of data.
- Solid Databricks development experience with significant use of Python, PySpark, Spark SQL, Pandas, and NumPy in an Azure environment.
- Strong experience in designing and implementing efficient ELT/ETL processes in Azure and Databricks, using open-source solutions, with the ability to develop custom integration solutions as needed.
- Skilled in data integration from different sources such as APIs, databases, flat files, and event streams.
- Expertise in data cleansing, transformation, and validation.
- Proficiency with relational databases (Oracle, SQL Server, MySQL, Postgres, or similar) and NoSQL databases (MongoDB or Table).
- Good understanding of Data Modeling and Database Design Principles.
- Strong experience in designing and implementing data warehouse, data lake, and data lakehouse solutions in Azure and Databricks.
- Good experience with Delta Lake, Unity Catalog, Delta Sharing, Delta Live Tables (DLT).
- Strong understanding of the software development lifecycle (SDLC), especially Agile methodologies.
- Strong knowledge of SDLC tools and technologies such as Azure DevOps and GitHub.
- Strong understanding of DevOps principles, including continuous integration, continuous delivery (CI/CD), infrastructure as code (IaC), configuration management, automated testing, performance tuning, and cost management and optimization.
- Strong knowledge of cloud computing, specifically Microsoft Azure services related to data and analytics.
- Experience in orchestration using technologies like Databricks Workflows and Apache Airflow.
- Strong knowledge of data structures and algorithms and good software engineering practices.
- Proven experience migrating from Azure Synapse to Azure Data Lake, or other technologies.
- Strong analytical skills to identify and address technical issues, performance bottlenecks, and system failures.
- Proficiency in debugging and troubleshooting issues in complex data and analytics environments and pipelines.
- Good understanding of Data Quality and Governance.
- Experience with BI solutions, including Power BI, is a plus.
- Strong written and verbal communication skills.
- Ability to document processes, procedures, and deployment configurations.
- Understanding of security practices, including network security groups, Azure Active Directory, encryption, and compliance standards.
- Ability to implement security controls and best practices within data and analytics solutions.
- Self-motivated with the ability to work well in a team, and experienced in mentoring and coaching different members of the team.
- A willingness to stay updated with the latest services, Data Engineering trends, and best practices in the field.
- Comfortable with picking up new technologies independently and working in a rapidly changing environment with ambiguous requirements.
- Care about architecture, observability, testing, and building reliable infrastructure and data pipelines.
Responsibilities
- Architect, design, develop, test, and maintain high-performance, large-scale, complex data architectures.
- Contribute to detailed design, architectural discussions, and customer requirements sessions.
- Actively participate in the design, development, and testing of big data products.
- Construct and fine-tune Apache Spark jobs and clusters within the Databricks platform.
- Migrate out of Azure Synapse to Azure Data Lake or other technologies.
- Assess best practices and design schemas that match business needs for delivering a modern analytics solution.
- Design and implement data models and schemas that support efficient data processing and analytics.
- Design and develop clear, maintainable code with automated testing.
- Collaborate with cross-functional teams to understand data requirements and develop data solutions.
- Evaluate and implement new technologies and tools to improve data integration, data processing, storage, and analysis.
- Evaluate, design, implement, and maintain data governance solutions.
- Continuously monitor and fine-tune workloads and clusters to achieve optimal performance.
- Provide guidance and mentorship to junior team members.
- Maintain clear and comprehensive documentation of the solutions.
- Promote and enforce best practices in data engineering, data governance, and data quality.
- Ensure data quality and accuracy.
- Design, implement, and maintain data security and privacy measures.
- Be an active member of an Agile team.
Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.