Responsible for a globally distributed computing grid running
across 300K+ servers in multiple datacenters
Contributed design leadership and code for the automation and
monitoring of major infrastructure services: distributed
storage, job scheduling, a distributed locking service, and
automated machine management
Provided emergency response and troubleshot system-level issues
across the entire computing grid
Assisted in testing, qualification, and rollout automation of new
Linux kernels across Google's server fleet
Produced training material and weekly disaster simulations for
the entire team
Worked as part of a small worldwide team responsible for
some of the most critical servers in the Fixed Income,
Currency, and Commodities (FICC) division
Provided long-term system engineering and 24/7 operations
support for Solaris, Linux, and NetApp servers
Maintained the most widely deployed Linux distribution in
the firm