Position Overview
The Performance Engineer will play a critical role in analyzing, optimizing, and scaling ArcOne's data and AI systems, with a focus on revenue management. This role involves deep performance profiling across application, middleware, runtime, and infrastructure layers, developing advanced observability tools, and collaborating with cross-functional teams to meet stringent latency, throughput, and scalability goals.
Qualifications
Education: Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
Experience
- 7+ years of software engineering experience, with a strong focus on performance or reliability engineering for high-scale distributed systems.
- Proven expertise in optimizing performance across one or more layers of the stack (e.g., database, networking, storage, application runtime, GC tuning, Python/Golang internals, GPU utilization).
- Hands-on experience with real-time and batch processing frameworks (e.g., Apache Kafka, Spark, Flink).
- Demonstrated success in building observability, benchmarking, or performance-focused infrastructure at scale.
- Experience in revenue management systems or similar domains (e.g., pricing, forecasting) is a plus.
Technical Skills
- Deep proficiency with performance profiling tools (e.g., perf, eBPF, VTune) and tracing systems (e.g., Jaeger, OpenTelemetry).
- Strong understanding of OS internals, including scheduling, memory management, and IO patterns.
- Expertise in programming languages such as Python, Go, or Java, with a focus on runtime optimization.
Key Responsibilities
Performance Analysis & Optimization
- Analyze and optimize performance across the full stack, including application, middleware, runtime (e.g., Python runtime, GPU utilization), and infrastructure layers (e.g., networking, storage).
- Perform deep performance profiling, tuning, and optimization for databases, data pipelines, AI model inference, and distributed systems.
- Optimize critical components such as garbage collection (GC), memory management, IO patterns, and scheduling to ensure high efficiency.
Observability & Tooling
- Develop and maintain tooling and metrics to provide deep observability into system performance, enabling proactive identification of bottlenecks and inefficiencies.
- Implement and enhance performance monitoring systems (e.g., tracing, logging, dashboards) to track latency, throughput, and resource utilization in real time.
- Contribute to benchmarking frameworks and performance-focused infrastructure to support continuous improvement.
Cross-Functional Collaboration
- Partner with infrastructure, platform, training, and product teams to define and achieve key performance goals for revenue management systems.
- Influence architecture and design decisions to prioritize latency, throughput, and scalability in large-scale data and AI systems.
- Align stakeholders around performance objectives, navigating ambiguity to deliver measurable improvements.
Performance Testing & SLAs
- Lead the development and execution of performance testing strategies, including load, stress, and scalability tests, for real-time and batch processing workloads.
- Define and monitor Service Level Agreements (SLAs) and Service Level Objectives (SLOs) around latency, throughput, and system reliability.
- Drive investigations into high-impact performance regressions or scalability issues in production, ensuring rapid resolution and root cause analysis.
System Design & Scalability
- Collaborate on the design of robust data architectures and AI systems, ensuring scalability and performance for revenue management use cases.
- Optimize real-time streaming (e.g., Apache Kafka, Flink) and batch processing (e.g., Spark, Hadoop) workloads for high-scale environments.
- Advocate for simplicity and rigor in system design to address complex performance challenges.
(ref:hirist.tech)