Job Description
Designation: Site Reliability Engineer (SRE) – Observability
Experience: 4-6 Years
Employment Type: Full Time (1 month Assignment) (Contractual)
Job Description:
We are seeking a highly skilled SRE – Observability Engineer to design, implement, and
manage observability solutions across complex infrastructure environments. The ideal
candidate will have hands-on experience in monitoring applications, APIs, networks, and
both on-premises and cloud-based systems, ensuring high availability, performance, and
reliability for end customers.
Key Responsibilities:
- Design and implement end-to-end observability solutions (metrics, logs, traces).
- Manage and support observability platforms for customer environments.
- Monitor:
  - Application performance and availability
  - APIs and payment gateway systems
  - Network infrastructure
  - On-premises and cloud servers
- Build dashboards, alerts, and SLIs/SLOs for proactive monitoring.
- Collaborate with DevOps, development, and infrastructure teams.
- Automate monitoring, alerting, and remediation workflows.
- Ensure system reliability, scalability, and performance optimization.
- Provide L2/L3 support for observability tools and platforms.
Mandatory Skills Set:
Strong experience in observability and monitoring platforms such as:
- Pandora FMS – preferred
- Nagios, OpManager, Datadog, New Relic
- Zabbix, PRTG
- Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana), ManageEngine (any)
Hands-on experience:
- Application Performance Monitoring (APM)
- API monitoring and payment system tracking
- Network monitoring (e.g., routers, switches, firewalls)
- Experience with cloud platforms: AWS / Azure / GCP
- Knowledge of on-premises infrastructure and hybrid environments
Strong understanding:
- Linux systems
- Networking concepts (TCP/IP, DNS, Load Balancing)
- Experience with scripting/automation: Python / Bash / Shell scripting
Familiarity with containerization and orchestration:
- Docker, Kubernetes
Preferred Qualifications:
- Experience in implementing observability for fintech/payment systems
- SRE best practices (SLI/SLO, error budgets)
- Certification in cloud platforms (AWS/Azure/GCP)
Soft Skills:
- Strong problem-solving and analytical skills
- Excellent communication and stakeholder management
- Ability to work in a fast-paced, customer-facing environment
Key Deliverables:
- Understand client system/application architecture
- Configure customized monitoring workflows for applications and infrastructure
- High system uptime and reliability
- Effective monitoring dashboards and alerting systems
Good To Have:
- Experience in 24x7 production support environments
- Knowledge of security monitoring and compliance
Location: Pune (Work from Office)
Pay: Up to ₹75,000.00 per month
Work Location: In person