Platform Engineer
Job Title: Platform Engineer
Type: Remote
Coverage: Pacific Time, 8am-5pm
Job Description:
We are looking for an experienced Platform Engineer to join our team. This role involves designing and building scalable infrastructure while ensuring reliability and performance across critical systems and services. The ideal candidate will have a strong background in software engineering and platform reliability, with a deep understanding of core system components. They should be comfortable taking on diverse tasks across different technologies and able to switch context easily.
Key Responsibilities:
Platform Design and Infrastructure:
- Design, develop, and maintain reliable and scalable infrastructure solutions.
- Partner with engineering teams to ensure platform architecture supports reliability, scalability, and optimal performance.
- Evaluate and implement new technologies and tools to enhance the infrastructure.
Monitoring, Incident Response, and Troubleshooting:
- Set up, maintain, and improve monitoring and alerting systems to detect issues proactively.
- Lead incident response, troubleshooting, and root cause analysis efforts for critical platform issues.
- Perform post-incident reviews to identify areas for improvement and drive future initiatives.
Automation and Infrastructure Management:
- Develop and implement automation projects (preferably in Python, Go, or a similar language) to streamline platform tasks and minimize manual intervention.
- Create scripts for automating system upgrades, health checks, and deployments.
- Utilize Infrastructure as Code (IaC) tools like Terraform, Ansible, or Pulumi to manage infrastructure configuration and deployment.
Collaboration and Technical Leadership:
- Collaborate with cross-functional teams to deliver high-quality infrastructure solutions.
- Mentor junior engineers and advocate for platform engineering best practices across teams.
- Promote a culture of reliability and automation through workshops, documentation, and hands-on guidance.
Continuous Improvement:
- Drive initiatives to enhance platform reliability, capacity planning, and service performance.
- Participate in disaster recovery planning and execution.
- Stay updated with industry trends, tools, and technologies to continually improve platform capabilities.
Qualifications:
Education and Experience:
- Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
- 8+ years of industry experience, including roles as a Software Engineer, SRE, or Platform Engineer.
- At least 3 years of experience in platform engineering, SRE, or infrastructure roles in large-scale, mission-critical environments.
Technical Skills:
- Strong knowledge of Linux/Unix systems, networking, and core system internals.
- Experience with one or more programming languages (e.g., Python, Go, Java).
- Advanced skills in Bash scripting for task automation.
- Proficiency with cloud platforms (AWS, Azure, GCP), containers (Docker), and container orchestration (Kubernetes).
- Familiarity with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
- Hands-on experience with CI/CD tools and workflows (e.g., Jenkins, GitLab CI).
Soft Skills:
- Strong analytical and problem-solving skills with a proactive mindset.
- Excellent communication skills and the ability to work collaboratively across teams.
- Leadership qualities with a track record of mentoring and guiding team members effectively.
Our team: Virtasant - Consulting
Locations: HQ
Remote status: Fully Remote