Sr. Data Engineer Azure Databricks
5 days ago
About Fusemachines
Fusemachines is a leading provider of AI strategy, talent, and education services and products. Founded by Sameer Maskey, Ph.D., Adjunct Associate Professor at Columbia University, Fusemachines has a core mission of democratizing AI. With a presence in 4 countries (Nepal, United States, Canada, and Dominican Republic) and more than 400 full-time employees, Fusemachines seeks to bring its global expertise in AI to transform companies around the world.
About the Role
This is a remote, contract position responsible for designing, building, and maintaining the infrastructure required for data integration, storage, processing, and analytics (BI, visualization, and Advanced Analytics).
We are looking for a skilled Senior Data Engineer with a strong background in Python, SQL, PySpark, Azure, Databricks, Synapse, Azure Data Lake, DevOps, and cloud-based large-scale data applications with a passion for data quality, performance, and cost optimization. The ideal candidate will develop in an Agile environment, contributing to the architecture, design, and implementation of Data products in the Aviation Industry, including migration from Synapse to Azure Data Lake. This role involves hands-on coding, mentoring junior staff, and collaboration with multi-disciplined teams to achieve project objectives.
Qualification & Experience
- Must have a full-time Bachelor's degree in Computer Science or similar.
- At least 5 years of experience as a data engineer with strong expertise in Databricks, DevOps, and Azure or another hyperscaler.
- 5+ years of experience with Azure DevOps, GitHub.
- Proven experience delivering large scale projects and products for Data and Analytics, as a data engineer, including migrations.
- The following certifications:
  - Databricks Certified Associate Developer for Apache Spark
  - Databricks Certified Data Engineer Associate
  - Microsoft Certified: Azure Fundamentals
  - Microsoft Certified: Azure Data Engineer Associate
  - Microsoft Exam: Designing and Implementing Microsoft DevOps Solutions (nice to have)
Required Skills/Competencies
- Strong programming skills in one or more languages such as Python (must have) or Scala, and proficiency in writing efficient, optimized code for data integration, migration, storage, processing, and manipulation.
- Strong understanding and experience with SQL and writing advanced SQL queries.
- Thorough understanding of big data principles, techniques, and best practices.
- Strong experience with scalable and distributed Data Processing Technologies such as Spark/PySpark (must have: experience with Azure Databricks), DBT, and Kafka, to be able to handle large volumes of data.
- Solid Databricks development experience with significant Python, PySpark, Spark SQL, Pandas, NumPy in Azure environment.
- Strong experience in designing and implementing efficient ELT/ETL processes in Azure and Databricks, using open-source solutions and developing custom integration solutions as needed.
- Skilled in Data Integration from different sources such as APIs, databases, flat files, event streaming.
- Expertise in data cleansing, transformation, and validation.
- Proficiency with relational databases (Oracle, SQL Server, MySQL, Postgres, or similar) and NoSQL databases (MongoDB or Table).
- Good understanding of Data Modeling and Database Design Principles.
- Strong experience in designing and implementing data warehouse, data lake, and data lakehouse solutions in Azure and Databricks.
- Good experience with Delta Lake, Unity Catalog, Delta Sharing, Delta Live Tables (DLT).
- Strong understanding of the software development lifecycle (SDLC), especially Agile methodologies.
- Strong knowledge of SDLC tools and technologies such as Azure DevOps and GitHub.
- Strong understanding of DevOps principles, including continuous integration, continuous delivery (CI/CD), infrastructure as code (IaC), configuration management, automated testing, performance tuning, and cost management and optimization.
- Strong knowledge of cloud computing, specifically Microsoft Azure services related to data and analytics.
- Experience in Orchestration using technologies like Databricks workflows and Apache Airflow.
- Strong knowledge of data structures and algorithms and good software engineering practices.
- Proven experience migrating from Azure Synapse to Azure Data Lake, or other technologies.
- Strong analytical skills to identify and address technical issues, performance bottlenecks, and system failures.
- Proficiency in debugging and troubleshooting issues in complex data and analytics environments and pipelines.
- Good understanding of Data Quality and Governance.
- Experience with BI solutions, including Power BI, is a plus.
- Strong written and verbal communication skills.
- Ability to document processes, procedures, and deployment configurations.
- Understanding of security practices, including network security groups, Azure Active Directory, encryption, and compliance standards.
- Ability to implement security controls and best practices within data and analytics solutions.
- Self-motivated with the ability to work well in a team, and experienced in mentoring and coaching different members of the team.
- A willingness to stay updated with the latest services, Data Engineering trends, and best practices in the field.
- Comfortable with picking up new technologies independently and working in a rapidly changing environment with ambiguous requirements.
- Care about architecture, observability, testing, and building reliable infrastructure and data pipelines.
Responsibilities
- Architect, design, develop, test, and maintain high-performance, large-scale, complex data architectures.
- Contribute to detailed design, architectural discussions, and customer requirements sessions.
- Actively participate in the design, development, and testing of big data products.
- Construct and fine-tune Apache Spark jobs and clusters within the Databricks platform.
- Migrate out of Azure Synapse to Azure Data Lake or other technologies.
- Assess best practices and design schemas that match business needs for delivering a modern analytics solution.
- Design and implement data models and schemas that support efficient data processing and analytics.
- Design and develop clear, maintainable code with automated testing.
- Collaborate with cross-functional teams to understand data requirements and develop data solutions.
- Evaluate and implement new technologies and tools to improve data integration, data processing, storage, and analysis.
- Evaluate, design, implement, and maintain data governance solutions.
- Continuously monitor and fine-tune workloads and clusters to achieve optimal performance.
- Provide guidance and mentorship to junior team members.
- Maintain clear and comprehensive documentation of the solutions.
- Promote and enforce best practices in data engineering, data governance, and data quality.
- Ensure data quality and accuracy.
- Design, implement, and maintain data security and privacy measures.
- Be an active member of an Agile team.
Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.