Senior Site Reliability Engineer II
6 days ago
Egypt
Careem is building the Everything App for the greater Middle East, making it easier than ever to move around, order food and groceries, manage payments, and more. Careem is led by a powerful purpose to simplify and improve the lives of people and build an awesome organisation that inspires. Since 2012, Careem has created earnings for over 2.5 million Captains, simplified the lives of over 50 million customers, and built a platform for the region’s best talent to thrive and for entrepreneurs to scale their businesses. Careem operates in over 70 cities across 10 countries, from Morocco to Pakistan.
**About the team**:
We are looking for engineers who will work within the Cloud Engineering team. The team develops and maintains cloud-native technology for the Careem Service teams:
- Highly scalable Kubernetes clusters
- Cloud Access management automation and integration with k8s
**About the role**:
As an SRE, you’ll need to solve problems that arise using empirical data, teamwork, and your own unique expertise.
The Data Platform SRE will work directly with our data platform and engineering teams in an embedded SRE model, operating in unison with the developers to deliver seamless experiences for our customers.
**Key responsibilities include**:
- Make an impact from the design phase through the development and operation of the Data Platform running on Kubernetes and its ecosystem on AWS
- Build core services and tooling, and create technical processes that simplify and enable the work of engineers across multiple services
- Identify, automate, and scale system configurations without compromising on security and reliability
- Participate in on-call rotations and help improve incident response
**Education and Experience**:
BS/MS in Computer Science or equivalent (7+ years of software development or production operations experience in a large-scale environment)
**Qualifications**:
- Strong sense of ownership and integrity demonstrated through clear communication and collaboration
- Experience in architecting, developing, operating, and troubleshooting Kubernetes clusters and/or other highly available systems at scale
- Proficiency with the architecture, deployment, performance tuning, and troubleshooting of open-source data analytics technologies, especially Apache Spark, Trino and related software in a large-scale environment
- The ability to design, author, and release code in languages like Go, Python, or Java
- Acute drive to automate manual operations and to improve them through repeated iteration
- Understanding of the Linux operating system and of standard networking protocols and components
- Experience with cloud-native services on AWS/GCP
- Hands-on experience managing large numbers of diverse systems with configuration management or software delivery platforms (such as Terraform, CloudFormation, ArgoCD, and Flux)
- Excellent troubleshooting and problem-solving skills
- Experience with scale testing, disaster recovery, and capacity planning
- Effective communication and collaboration skills, including the ability to drive and promote technical partnerships across teams
- Incident response and/or incident management experience
**What we’ll provide you**
We offer colleagues the opportunity to drive impact in the region while they learn and grow. As a full-time Careem colleague, you will be able to:
- Work and learn from great minds by joining a community of inspiring colleagues.
- Put your passion to work in a purposeful organisation dedicated to creating impact in a region with a lot of untapped potential.
- Explore new opportunities to learn and grow every day.
- Work 4 days a week in the office and 1 day from home, work remotely from any country in the world for 30 days a year, and take unlimited vacation days.
- Access healthcare benefits and fitness reimbursements for health activities, including gym, health club, and training classes.