Engineering the Silicon Synapse.

Senior Staff Engineer at Google, building production ML systems at the intersection of large-scale learning and neuromorphic computing.

Leto Hillza Profile
ACTIVE

Core Directives

01. Biomimicry

Borrow from biology only where it improves efficiency, adaptability, and signal fidelity in real systems.

02. Asynchronicity

Compute should react to events, not waste power waiting for them. The best systems wake up only when the signal matters.

03. Scalability

A model is only valuable when it survives production traffic, hard latency budgets, and real failure modes at scale.

Key Architectures

Representative systems across reinforcement learning, retrieval, model delivery, and neuromorphic runtime design. The through-line is production ML: high-scale inference, efficient adaptation, and architectures that survive real traffic.

Reinforcement Learning

Budgeting Optimization System (BOS)

A real-time decision engine for Google Ads that used reinforcement learning to optimize pacing, allocation, and market liquidity across multi-billion-dollar spend.

Retrieval & Ranking

Semantic Discovery Stack

Retrieval and ranking systems for Google Drive and Ads that combined embeddings, nearest-neighbor search, and intent signals to surface the right result under tight latency budgets.
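As a rough illustration of the embedding plus nearest-neighbor idea, here is a minimal brute-force sketch. It is not the production stack: the names `cosine`, `top_k`, and the toy `index` are hypothetical, and a real system at this scale would likely use approximate nearest-neighbor search rather than scoring every vector.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the ids of the k vectors most similar to `query`.

    `index` maps a document id to its embedding (hypothetical toy data).
    This scores every entry, so it is O(n) per query.
    """
    scored = [(cosine(query, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Toy usage: three two-dimensional "embeddings".
index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1]}
nearest = top_k([1.0, 0.0], index, k=2)
```

Intent signals would typically enter as extra features in a later ranking stage; that part is left out of this sketch.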

ML Platform

ML Deployment & Model Freshness

Production pipelines for training, deployment, and low-latency serving that refreshed models within hours instead of days while maintaining reliability at global scale.

Neuromorphic Runtime

Hybrid SNN Edge Runtime

A low-power runtime for spiking neural networks that pushed event-driven inference closer to the sensor, reducing wasted compute and power on edge hardware.
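To make "event-driven inference" concrete, here is a minimal sketch of a leaky integrate-and-fire neuron that does work only when an input spike arrives. It is an illustration under stated assumptions, not the runtime itself: the class name `EventDrivenLIF`, the 20 ms time constant, and the threshold of 1.0 are all hypothetical choices.

```python
import math


class EventDrivenLIF:
    """Leaky integrate-and-fire neuron updated only on input events."""

    def __init__(self, tau_ms=20.0, threshold=1.0):
        self.tau_ms = tau_ms          # membrane time constant (assumed value)
        self.threshold = threshold    # firing threshold (assumed value)
        self.potential = 0.0
        self.last_event_ms = 0.0

    def on_event(self, t_ms, weight):
        """Process one input spike at time t_ms; return True if the neuron fires."""
        # Apply the leak for the whole idle interval in one step,
        # instead of ticking a clock while nothing happens.
        dt = t_ms - self.last_event_ms
        self.potential *= math.exp(-dt / self.tau_ms)
        self.last_event_ms = t_ms

        self.potential += weight
        if self.potential >= self.threshold:
            self.potential = 0.0  # reset after the output spike
            return True
        return False


def run(events, **kwargs):
    """Feed (time_ms, weight) events in time order; return output spike times."""
    neuron = EventDrivenLIF(**kwargs)
    return [t for t, w in events if neuron.on_event(t, w)]


# Two closely spaced inputs sum past threshold; a late, isolated one does not.
spikes = run([(1.0, 0.6), (2.0, 0.6), (100.0, 0.6)])
```

The cost scales with the number of events rather than with elapsed time, which is the property that saves power when the sensor is mostly quiet.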

NEURAL_ARCHIVE_V1

Declassified Concepts


SNN: SPIKING NEURAL NET

RL: REINFORCEMENT LEARNING

VON NEUMANN: THE BOTTLENECK

EDGE: LOCAL INFERENCE

Career Trajectory

Operational History

Senior Staff Software Engineer

Neuromorphic Computing & Hybrid AI

  • Integrating spiking neural networks (SNNs) into scalable production systems, developing spike-timing-dependent plasticity (STDP) learning rules for real-time adaptation.
  • Designing neuromorphic algorithms that target sub-milliwatt inference on custom silicon for low-power edge devices.
  • Co-designing AI accelerators (TPU/ASIC) with hardware teams, translating biological neural dynamics into dataflow architectures.
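For readers unfamiliar with STDP, the textbook pair-based form of the rule can be sketched as below. This is the generic rule, not necessarily the variant used in the work described above; the function names and every constant (`a_plus`, `a_minus`, the 20 ms time constants, the weight bounds) are assumptions chosen for illustration.

```python
import math


def stdp_delta(dt_ms, a_plus=0.01, a_minus=0.012,
               tau_plus_ms=20.0, tau_minus_ms=20.0):
    """Weight change for one pre/post spike pair.

    dt_ms = t_post - t_pre. A presynaptic spike shortly before a
    postsynaptic one (dt_ms > 0) strengthens the synapse; the reverse
    order weakens it. The effect decays exponentially with |dt_ms|.
    """
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_plus_ms)
    if dt_ms < 0:
        return -a_minus * math.exp(dt_ms / tau_minus_ms)
    return 0.0


def apply_stdp(weight, t_pre_ms, t_post_ms, w_min=0.0, w_max=1.0):
    """Apply one pair-based update and clip the weight to [w_min, w_max]."""
    new_weight = weight + stdp_delta(t_post_ms - t_pre_ms)
    return min(w_max, max(w_min, new_weight))


# Pre fires 5 ms before post: potentiation. Reverse order: depression.
w_up = apply_stdp(0.5, t_pre_ms=0.0, t_post_ms=5.0)
w_down = apply_stdp(0.5, t_pre_ms=5.0, t_post_ms=0.0)
```

Because each update depends only on the timing of a local spike pair, the rule can run online as spikes arrive, which is what makes it a candidate for real-time adaptation.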

Staff Software Engineer

Applied ML & RL Infrastructure

  • Built reinforcement learning systems that optimized Google Ads budgeting and pacing for multi-billion-dollar spend in real time.
  • Designed online learning, retrieval, and inference pipelines that kept ads models fresh at trillions-of-events scale.
  • Set the engineering standard for reproducibility, observability, and production reliability across the ads ML stack.

Senior Software Engineer

Ranking, Retrieval & ML Platform

  • Built ML deployment systems for Google Workspace that cut model release latency by 60%.
  • Improved Google Drive discovery with semantic retrieval and ranking models that surfaced the right files faster for enterprise users.
  • Scaled collaboration and model-serving systems through a 300% traffic surge without degrading user experience.

Software Engineer

Computer Vision & Multimodal ML

  • Built computer vision and retrieval systems for real-time visual understanding in Google Lens.
  • Scaled multimodal inference pipelines that powered visual discovery for 1B+ users.
  • Improved on-device vision latency and efficiency across mobile inference and retrieval workflows.

Research & IP

PATENT US-110452A Pending

Event-Driven Memory Allocation for Sparse Tensors

WHITEPAPER Google Internal

Optimizing RL Agents for TPUv4 Architecture

Initiate Handshake

Working on large-scale ML systems, neuromorphic computing, or production AI architecture? Let's talk.

Send Signal

System_Dependencies

Python: ML SYSTEMS
C++: LOW LATENCY
Go: BACKEND SYSTEMS
Rust: MEMORY SAFE
JAX: AUTOGRAD
Verilog: HDL
Torch: DEEP LEARNING
TPUv4: HARDWARE