IBM
Site Reliability Engineer
4 weeks ago
Introduction
management tasks more efficiently.
We’re seeking skilled, automation-focused Network Engineers to maintain and administer the
Power Virtual Server Cloud Infrastructure-as-a-Service environment and provide reliable and secure
network operations.
The Network Infrastructure Operations Site Reliability Engineer works with clients to ensure
their specific networking requirements are provided, and handles issues reported by
monitoring/automation. Adhering to strict change control, the SRE will make required
configuration changes in the environment and perform various updates/upgrades to the Cisco
ACI-based software-defined networking environment. Constant attention to automating
manual toil is a core focus of this role.
Power Virtual Server is a fast-paced environment. Our engineers provide technical support and resolve
client networking issues within the Power Virtual Server IaaS offering. They identify repetitive tasks, develop automation to reduce manual toil, and work proactively to avoid client-impacting
events.
Your Role and Responsibilities
As a Compute Operations Site Reliability Engineer, you will perform the following tasks:
- Remotely administer Power Server hardware environments across numerous datacenter
locations around the world (currently 20 datacenters and growing).
- Develop automation to reduce manual toil (repetitive, automatable tasks) using shell
scripts (bash, etc.), Python, Ansible, and related tools and languages.
- Perform code stack updates on infrastructure systems (VIOS, firmware, PowerVC, HMC,
Novalink, NIM servers) as well as cloud supporting systems (jump servers, sobox,
network nodes, gateways, TSM servers).
- Upload/maintain stock images.
- Remotely administer AIX and Linux servers.
- Maintain user IDs (add/delete) and passwords.
- Monitor daily/weekly backups to ensure they are working.
- Manage and maintain the Nagios monitoring environment, and troubleshoot scripts/plug-ins
when issues arise.
- Perform periodic Live Partition migrations, inactive migrations, or remote restarts of
customer VMs to perform system maintenance, balance workloads, or free up
resources.
- Monitor and report the capacity utilized in each datacenter.
- Attend scheduled meetings planned by customers for cutover/maintenance windows.
- Verify capacity requirements when customers report provisioning failures.
- Work with customers to resolve any RSCT issues so that LPM activities can be performed
without impacting customer workloads.
Required Technical and Professional Expertise
- In-depth knowledge of Power Server hardware.
- Significant scripting/coding experience for automating all aspects of IBM Power systems
administration.
- Automation using Python, shell scripting (bash, etc.), Ansible, and related tools and
languages.
- Experience with AIX and Linux administration, commands, and networking.
- Strong experience in one or more of the following: VIO, Novalink, and PowerVC.
Familiarity with at least one more (including installation, configuration, and administration).
- In-depth knowledge of PowerVM including installation/configuration and
administration.
- High-level knowledge of Power Systems supported operating systems (AIX and IBM i).
- In-depth knowledge of how storage is connected and allocated to Power systems via
NPIV connections.
- Good understanding of Power Systems network configuration at the system level.
Preferred Technical and Professional Expertise
- Experience with configuring and tuning PowerVC.
- Experience accessing PowerVS resources using the IBM Cloud Portal, IBM Cloud CLI, APIs, and Terraform.
- 3+ years’ experience supporting customers using ServiceNow or Salesforce.
- Experience training new personnel on tooling and processes.
- Storage & Power RTS, MVS Network for Cisco, Juniper; general support skills
About Business Unit
IBM Systems helps IT leaders think differently about their infrastructure. IBM servers and storage are no longer inanimate - they can understand, reason, and learn so our clients can innovate while avoiding IT issues. Our systems power the world’s most important industries and our clients are the architects of the future. Join us to help build our leading-edge technology portfolio designed for cognitive business and optimized for cloud computing.
Being an IBMer means you’ll be able to learn and develop yourself and your career. You’ll be encouraged to be courageous and experiment every day, with continuous trust and support in an environment where everyone can thrive, whatever their personal or professional background.
Our IBMers are growth-minded, always staying curious, open to feedback, and learning new information and skills to constantly transform themselves and our company. They are trusted to provide ongoing feedback to help other IBMers grow, and to collaborate with colleagues using a team-focused approach that includes different perspectives to drive exceptional outcomes for our customers. The courage our
-
Site Reliability Engineer
3 weeks ago
Egypt | Convertedin | Full time
**About us**: Convertedin is a marketing operating system for e-Commerce. It utilizes data and shoppers' insights to create personalized multi-channel marketing that boosts customer engagement and maximizes their return on their marketing budget by leveraging artificial intelligence capabilities. Convertedin has helped more than 800 e-Commerce...
-
Site Reliability
1 week ago
Egypt | ASWAT | Full time
**We Are Hiring: Site Reliability and DevOps Engineer** **About Us**: **ZIWO** is an omni-channel Cloud Contact Center Software (CCaaS) providing straightforward solutions for companies to communicate with their clients via Phone, WhatsApp, SMS, and more. We connect 145 countries globally, including the GCC, enabling users to instantly expand their reach...
-
Junior Site Reliability Engineer
1 week ago
Egypt | DXC Technology | Full time
**Job brief** We are looking for a Junior Site Reliability Engineer for our Early Grads Program. **Responsibilities**: - Set up, operate and manage environments. - Handle code deployments in all environments. - Operate over all types of infrastructures like on-prem cloud and/or container-based platforms. - Develop automation with a focus on scalability,...
-
Site Reliability Engineer
2 weeks ago
Egypt | IBM | Full time
Introduction management tasks more efficiently. We’re seeking skilled, automation-focused Network Engineers to maintain and administer the Power Virtual Server Cloud Infrastructure-as-a-Service environment and provide reliable and secure network operations. The Network Infrastructure Operations Site Reliability Engineer works with clients to...
-
Reliability Engineer
2 weeks ago
Egypt | PepsiCo | Full time
**Responsibilities**: - Key stakeholder in delivering PEMM results for the Maintenance Support Department. - Will have ownership of the Reliability section of the site maintenance improvement plan (MIAP) coming from every PeMM assessment. - Lead the site Asset Reliability program. Own and develop the site Major Incident Report (MIR), Analytical Problem...
-
Mid / Senior Site Reliability Engineer
3 weeks ago
Egypt | Convertedin | Full time
**About us**: Convertedin is a marketing operating system for e-Commerce. It utilizes data and shoppers' insights to create personalized multi-channel marketing that boosts customer engagement and maximizes their return on their marketing budget by leveraging artificial intelligence capabilities. Convertedin has helped more than 800 e-Commerce...
-
Mid / Senior Site Reliability Engineer
4 weeks ago
Egypt | Convertedin | Full time
**About us**: Convertedin is a marketing operating system for e-Commerce. It utilizes data and shoppers' insights to create personalized multi-channel marketing that boosts customer engagement and maximizes their return on their marketing budget by leveraging artificial intelligence capabilities. Convertedin has helped more than 800 e-Commerce...
-
Site Reliability/DevOps Engineer
2 weeks ago
Egypt | Qoyod | Full time
Job Summary: As a Site Reliability/DevOps Engineer, you will introduce processes, tools, and methodologies to balance needs throughout the software development life cycle, from coding and deployment to maintenance and updates. **Responsibilities**: - Focus on improving the scalability, robustness, and automation of our tools and processes, as well as...
-
Senior Site Reliability Engineer
1 week ago
Egypt | Evolvice | Full time
Evolvice is a German nearshore service provider with branches in Egypt and Ukraine. Founded in 2012, Evolvice has a strong technical background and business domain knowledge, combining software engineering and Agile methodology, leading its clients’ path to digital transformation. Headquartered in the heart of the automobile industry, Stuttgart...
-
Reliability Engineer
2 weeks ago
Egypt | Si-Ware Systems | Full time
We are seeking a highly motivated and detail-oriented Reliability Engineer to join our team. As a Reliability Engineer, you will play a crucial role in ensuring the reliability and durability of our products. You will collaborate with cross-functional teams to identify potential issues, design and execute tests, and analyze data to provide valuable insights...
-
Site Reliability Engineer, Wwcc Coe
4 weeks ago
Egypt | VMware | Full time
**Why will you enjoy this new opportunity?** **Success in the Role: What are the performance outcomes over the first 6-12 months you will work toward completing?** **The Work: What type of work will you be doing? What assignments, requirements, or skills will you be performing on a regular basis?** Your regular activities may be modified to suit your...
-
Senior Site Reliability Engineer II
3 weeks ago
Egypt | Careem | Full time
Careem is building the Everything App for the greater Middle East, making it easier than ever to move around, order food and groceries, manage payments, and more. Careem is led by a powerful purpose to simplify and improve the lives of people and build an awesome organisation that inspires. Since 2012, Careem has created earnings for over 2.5 million...
-
Senior Site Reliability Engineer
2 months ago
Egypt | Procore | Full time
We’re looking for a **Senior Site Reliability Engineer** to join Procore’s Fintech cloud infrastructure team. In this role, you’ll work collaboratively with software engineers, software testing engineers, and product/project managers, to build, design, and shape the cloud infrastructure. Your role will also include improving and developing new platform...
-
Senior Site Reliability Engineer
2 months ago
Egypt | Procore Technologies | Full time
**Job Description**: We’re looking for a **Senior Site Reliability Engineer** to join Procore’s Fintech cloud infrastructure team. In this role, you’ll work collaboratively with software engineers, software testing engineers, and product/project managers, to build, design, and shape the cloud infrastructure. Your role will also include improving and...
-
Senior Site Reliability Engineer
3 weeks ago
Egypt | Careem | Full time
Careem is building ‘the everything app’ for the greater Middle East, making it easier than ever to move around, order food and groceries, manage payments, and more. Careem is led by a powerful purpose to simplify and improve the lives of people and build an awesome organisation that inspires. Since 2012, Careem has created earnings for over 2.5 million...
-
Staff Site Reliability Engineer I
4 weeks ago
Egypt | Careem | Full time
Careem is building ‘the everything app’ for the greater Middle East, making it easier than ever to move around, order food and groceries, manage payments, and more. Careem is led by a powerful purpose to simplify and improve the lives of people and build an awesome organisation that inspires. Since 2012, Careem has created earnings for over 2.5 million...
-
Senior Site Reliability Engineer I
2 weeks ago
Egypt | Careem | Full time
Careem is building the Everything App for the greater Middle East, making it easier than ever to move around, order food and groceries, manage payments, and more. Careem is led by a powerful purpose to simplify and improve the lives of people and build an awesome organisation that inspires. Since 2012, Careem has created earnings for over 2.5 million...
-
Senior Site Reliability Engineer I
4 days ago
Egypt | Careem | Full time
Careem is building the Everything App for the greater Middle East, making it easier than ever to move around, order food and groceries, manage payments, and more. Careem is led by a powerful purpose to simplify and improve the lives of people and build an awesome organisation that inspires. Since 2012, Careem has created earnings for over 2.5 million...
-
Site Reliability Engineer vois
3 weeks ago
Egypt | Vodafone | Full time
**Role Purpose**: **Key Accountabilities and Decision Ownership**: - Use software as a tool to manage systems, solve problems, and automate resolution to achieve zero touch operations. - Design and enhance software architecture to improve scalability, service reliability, capacity, and performance. - Define, create, promote, and monitor SLOs/SLIs...