About
Ryan is experienced in developing reliable and scalable production cloud systems. He specializes in SRE, DevOps, microservices, cloud architecture, and observability. He has a solid technical background as a back-end developer. He has good soft skills, is self-motivated, and is comfortable networking to achieve project goals. Ryan has an excellent ability to understand the business needs behind requirements and is able to program in several languages.
Experience
About 19.4 yrs of professional experience, estimated from the roles below (overlaps counted once).
- Jan 2025 – Present
Resident Architect
Honeycomb
Onboarding large enterprise organisations onto Honeycomb. OpenTelemetry migrations and custom instrumentation. Honeycomb.io setup and refinement. Educating clients on Honeycomb and observability 2.0.
- Jan 2025 – Jan 2025
DevOps Developer
Pfizer - PGS Operations And Insights
Added OpenTelemetry instrumentation across services. Set up Sysdig dashboards to monitor deployed services. Introduced root cause analysis for production issues and got the project owner on board. Instrumented Elasticsearch monitoring for shard issues. Onboarded a project onto Honeycomb for traces and metrics.
- Jan 2021 – Jan 2023
Site Reliability Engineer (Datadog Specialist)
BCG - Gamma
Worked with multiple product teams within the organization, designing their observability (monitoring) solutions. Guided teams on architectural considerations for observability. Defined observability best practices and coached the various teams. Worked to get as close to real-time awareness of customer-visible issues as possible. Segmented alerting into different paths for different levels of severity. Developed Terraform configurations to set up Datadog dashboards and alerting for Kubernetes clusters and canonical-architecture (front end/back end plus database) applications.
- Jan 2020 – Jan 2021
Site Reliability Engineer (ECS)
Toptal Project
Re-architected parts of the system that were vulnerable to high load, resulting in no performance degradation during peak Black Friday traffic. Launched the new version of their website on the new infrastructure with only 10 minutes of planned downtime. The total downtime over two years on the project was less than three hours. Implemented alerting and monitoring for the new clusters. Customized Fastly CDN to provide outage mitigation. Wrapped the endpoint for an unreliable third-party API with a CDN-managed endpoint that redirected to a backup if latency was high on the main API. Coached the team to improve their architectural designs according to the twelve-factor app principles and SRE best practices. Created Terraform-managed AWS Fargate clusters for deployed services.
- Jan 2019 – Jan 2019
Site Reliability Engineer (EKS)
Global Fashion Group
Created new Terraform-managed AWS EKS Kubernetes clusters (multi-region). Executed live cluster migrations to new Kubernetes clusters with zero downtime. Broke up a PHP back end into microservices, which improved reliability and scalability. Moved self-hosted Redis and SQL databases to AWS-managed services, improving reliability. Replaced Jenkins with AWS CodePipeline, which reduced maintenance costs. Replaced legacy storage with S3, resulting in improved reliability. Reworked database usage, eliminating bottlenecks during high load.
- Jan 2016 – Jan 2018
DevOps Engineer and Release Manager
HERE Technologies
Designed and developed Jenkins deployment pipelines into AWS. Contributed to the programmatic generation of Jenkins pipelines using Job DSL. Set up production Docker on Amazon EC2 instances. Ran AWS autoscaling, microservices, Kafka, Flink, and windowed stream processing. Developed IoT-specific testing that fed continuous test data into production, which allowed the team to build real-time dashboards identifying which part of a complex microservices system was failing.
- Jan 2015 – Jan 2016
Test Lead
HERE Technologies
Oversaw analytics and A/B testing using Apptimize and Amplitude. Developed test strategies for mobile devices.
- Jan 2013 – Jan 2014
Test Lead
Auckland Transport
Defined and executed test strategies for citywide critical infrastructure. Created tooling to optimize work methods.
- Jan 2012 – Jan 2013
Test Lead
Serato, Inc.
Oversaw and mentored junior developers. Introduced tools and processes for bug tracking, test management, peer review, crash report collection and analysis, and beta test cycles, and improved communication between the customer support and product management teams. Tested iOS apps. Helped Scrum teams adopt best practices in their testing and quality control.
- Jan 2011 – Jan 2012
Test Team Manager
IBM
Oversaw the management and technical rigor of a team of 11 testers covering five products in flight from IBM's virtualization, security, operating system performance, and failover stacks. Changed the way the development and QA teams interacted by focusing on rapid iterative feedback, which reduced release cycles from 2-3 months to 2-3 weeks. Successfully oversaw two new major product launches.
- Jan 2010 – Jan 2011
Project Manager
IBM
Managed the development and release cycle for a small software team.
- Jan 2001 – Jan 2009
C++ Developer
Transitive
Developed automated testing infrastructure, including toolchains (cross-linking and bootstrapping build systems), assembly, linkers, CPU and memory management architecture (SPARC, x86, x86_64, ARM, Itanium), and Linux kernel patching and building. Developed dynamic binary translators that would load binaries for one processor and execute them on another using the UNIX kernel interface (syscalls). Acted as the lead engineer on a specialist performance analysis team. Studied the principles of performance analysis and improvement and applied them to solve performance issues when clients experienced lower-than-expected on-site performance.