Site Reliability Engineer II
Talkdesk
This job is no longer accepting applications
At Talkdesk, we are courageous innovators focused on redefining the customer experience, making the impossible possible for companies globally. We champion an inclusive and diverse culture representative of the communities in which we live and serve. And, we give back to our community by volunteering our time, supporting non-profits, and minimizing our global footprint. Each day, thousands of employees, customers, and partners all over the world trust Talkdesk to deliver a better way to great experiences.
We are recognized as a cloud contact center leader by many of the most influential research organizations, including Gartner and Forrester. With $498 million in total funding, a valuation of more than $10 billion, and a ranking of #16 on the Forbes Cloud 100 list, now is the time to become part of the Talkdesk legacy and help accelerate our success in a new decade of transformational growth.
At Talkdesk, we embrace FAST, our fundamental operating principles that define who we are as an organization. These principles drive us to make the impossible possible. FAST: Focus + Accountability + Speed = Talkdesker.
- Focus: Focus time, energy, and attention on what is most impactful for the business, and be thoughtful about how and when to partner with others.
- Accountability: Hold self and others accountable to meet commitments and drive results. Accept responsibility for successes and failures.
- Speed: Execute with agility and urgency. Act promptly, decisively, and without delay. Make good and timely decisions that keep the organization moving forward.
- Talkdesker: YOU!
We are looking for an engineer who will focus on Developer Experience and help us design, build, and maintain high-performance, scalable, and reliable services. Because Talkdesk provides a contact center service, we play a critical role in our customers' business operations and therefore need to provide a highly available, fault-tolerant service.
We believe in a DevOps philosophy in which every engineering team at Talkdesk is responsible for the software it builds and deploys. SREs play a critical role in ensuring that teams have the tools, practices, and expertise to make that happen in a blame-free culture.
Our mission is to improve the developer experience by giving developers the tools to manage the entire software lifecycle and be self-sufficient. To that end, we are building our own internal PaaS using technologies such as Kubernetes, Prometheus, and Kotlin. This platform is an important pillar of Talkdesk's engineering effort and helps us deliver better, faster, and more reliable solutions for our customers.
Responsibilities:
- Maintain and improve availability, latency, and performance of production services.
- Participate in incident response.
- Write and maintain operational documentation, runbooks, and architecture diagrams.
- Evolve infrastructure automation using Terraform or similar tools to remove as much human intervention as possible.
- Help automate infrastructure provisioning and other engineering processes.
- Build internal platforms, tools, and frameworks to improve developer productivity and service reliability.
- Work with software engineers to understand system behavior and build reliability into services.
Skills and Qualifications:
- 3–5 years of experience in Site Reliability Engineering, DevOps, or Platform Engineering.
- Understanding of large-scale, complex systems from a reliability perspective.
- Passion for producing clean, standards-compliant, secure code.
- A developer mindset, applied to infrastructure.
- Experience with Linux/Unix systems.
- Experience with Kubernetes.
- Solid experience with tools like Terraform, Ansible, and Helm.
- Proficiency in scripting to automate tasks, using Python, Bash, or another scripting language.
- Experience with at least one relational and one non-relational database (e.g., PostgreSQL, MySQL, MongoDB, Redis, Elasticsearch).
- Familiarity with debugging distributed systems and analyzing system logs and metrics.
Nice to haves / Pluses:
- Experience with cloud platforms such as AWS, Google Cloud, or Microsoft Azure.
- Experience supporting scalable databases like PostgreSQL or MongoDB in production.
Work Environment and Physical Requirements:
Primarily office-environment, computer-based work with extended periods of sitting or standing. Limited lifting; equipment usage is limited to computer-related equipment (keyboard, mouse, etc.).
The Talkdesk story hinges on empathy and acceptance. It is the shared goal among all Talkdeskers to empower a new kind of customer hero through our innovative software solution, and we firmly believe that the best path to success for our mission is inclusivity, diversity, and genuine acceptance. To that end, we will hire, promote, work alongside, cheer for, bond with, and warmly welcome into the Talkdesk family all persons without regard to ethnic and racial identity, indigenous heritage, national origin, religion, gender, gender identity, gender expression, sexual orientation, age, disability, marital status, veteran status, genetic information, or any other legally protected status.