Reliable System Expert

4 days ago


Hyderabad City Taluka, Pakistan beBee Careers Full time
About This Role

Splunk, Information Technology Infrastructure Library (ITIL), IT Service Management (ITSM)


We're seeking an experienced Site Reliability Engineer with 8+ years in IT Service Management (ITSM) and hands-on expertise in Application Performance Monitoring (APM) tools like Datadog and Splunk. The ideal candidate will be a self-driven professional who can build resilient monitoring systems and improve application observability.

Role Type: Full Time

Experience: 8+ Years


Key Responsibilities
  • Utilize and manage APM tools such as Datadog to ensure efficient performance monitoring.
  • Leverage expertise in Splunk to support system observability and troubleshooting.
  • Integrate APM tools with third-party platforms for seamless communication.
  • Design and implement alerting mechanisms using monitoring and logging tools to detect and resolve potential issues proactively.
  • Define and track Key Performance Indicators (KPIs) for system health and efficiency.
  • Streamline alerts to reduce noise and improve incident relevance.
  • Conduct periodic reviews to expand observability coverage across applications.
  • Analyze metrics regularly to identify anomalies and performance patterns.
  • Update dashboards as needed to reflect current operational metrics and system status.
Requirements
  • 8+ years of experience in ITIL/ITSM process management.
  • At least 5 years of hands-on experience with Datadog or equivalent APM tools.
  • Proficiency in Splunk for data analysis and visualization.
  • Strong verbal and written communication skills.
  • Excellent organizational abilities with the capability to manage and prioritize multiple tasks.
  • Previous experience in an application support or similar SRE role.
  • Familiarity with other monitoring tools and emerging technologies is a plus.
Benefits

The selected candidate will have the opportunity to work on challenging projects, collaborate with a talented team, and contribute to the growth and success of our organization.



  • Hyderabad City Taluka, Pakistan beBee Careers Full time

    **Job Title:** Software Reliability Specialist. We are seeking a highly skilled Software Reliability Specialist to ensure the high availability and performance of our critical systems. The role involves designing and implementing monitoring systems, analyzing system performance, and optimizing efficiency. The ideal candidate will have hands-on experience in...


  • Hyderabad City Taluka, Pakistan GSPANN Technologies, Inc Full time

    Site Reliability Engineering (SRE), Python, Django, FastAPI, Flask, SQL, RESTful, pytest. Description: GSPANN is hiring a Site Reliability Engineer to ensure high availability and performance of critical systems using tools like Prometheus and Nagios. The role involves developing reliable Python code, managing APIs, and optimizing system efficiency across...

  • Database Architect

    2 weeks ago


    Hyderabad City Taluka, Pakistan beBee Careers Full time

    Database Architect - SCADA Systems. This Database Architect role is responsible for designing, architecting, and optimizing SCADA databases to ensure scalability, reliability, and efficiency. The ideal candidate will have expert-level knowledge of database management systems and experience in large-scale data handling.


  • Hyderabad City Taluka, Pakistan beBee Careers Full time

    Senior Site Reliability Engineer. Elevate your engineering expertise to unprecedented heights by working with a team of exceptionally talented professionals and position yourself among the top echelon in site reliability. You will work closely with stakeholders to define non-functional requirements (NFRs) and availability targets for applications and product...


  • Hyderabad City Taluka, Pakistan GSPANN Technologies, Inc Full time

    Splunk, Information Technology Infrastructure Library (ITIL), IT Service Management (ITSM). Description: GSPANN is hiring an experienced Site Reliability Engineer (SRE) with 8+ years in IT Service Management (ITSM) and hands-on expertise in Application Performance Monitoring (APM) tools like Datadog and Splunk. We're looking for a self-driven professional who...


  • Hyderabad City Taluka, Pakistan beBee Careers Full time

    About the Role: Take on a critical leadership position, defining the future of a global organization and driving significant impact in site reliability. Key Responsibilities: Demonstrate and champion site reliability culture and practices, exerting technical influence across your team. Lead initiatives to improve application and platform reliability, leveraging...


  • Hyderabad City Taluka, Pakistan JP Morgan Chase Full time

    Elevate your engineering prowess to unprecedented levels by joining a team of exceptionally gifted professionals and position yourself among the top echelon in site reliability. As a Principal Site Reliability Engineer at JPMorgan Chase within the Consumer & Community Banking, you will work with your stakeholders to define non-functional requirements (NFRs)...


  • Hyderabad City Taluka, Pakistan beBee Careers Full time

    High Availability and Performance Engineer. We are seeking a skilled professional to ensure the high availability and performance of our critical systems. The ideal candidate will have hands-on experience in developing reliable Python code, managing APIs, and optimizing system efficiency. This is an exciting opportunity for someone who enjoys working with...


  • Hyderabad City Taluka, Pakistan JP Morgan Chase Full time

    Assume a critical role in defining the future of a globally recognized firm and have a direct and significant effect in a realm tailored for top achievers in site reliability. As a Lead Site Reliability Engineer at JPMorgan Chase within the Consumer & Community Banking, you hold a leadership role in your team, demonstrate strong knowledge across multiple...


  • Hyderabad City Taluka, Pakistan beBee Careers Full time

    Site Reliability Engineer. This role is perfect for a skilled engineer who wants to take their career to the next level. As a Site Reliability Engineer, you will play a crucial part in ensuring the high availability and performance of critical systems. The ideal candidate should have 8+ years of experience in this field and be proficient in Python 3. They...