Splunk ITSI: Expertise in using Citrix observability data with Splunk IT Service Intelligence (ITSI) to monitor and analyze Citrix environments.
VDI and VM Observability: Proficient in monitoring Virtual Desktop Infrastructure (VDI) and Virtual Machines (VMs) for performance, reliability, and scalability.
OpenTelemetry, Kafka, and Splunk/Grafana: Hands-on experience with OpenTelemetry for unified telemetry, Kafka for real-time data ingestion, and Splunk/Grafana for data visualization and alerting.
Data Science & Machine Learning: Proficient in Python, TensorFlow, PyTorch, scikit-learn, pandas, and NumPy for developing machine learning models for anomaly detection and root cause analysis.
ETL & Data Analysis: Extensive knowledge of ETL techniques using Apache Airflow, Apache NiFi, and Apache Spark.
Distributed Systems & Cloud: Thorough understanding of Kubernetes, Docker, and service mesh technologies (Istio, Linkerd) with experience in AWS, Azure, and GCP observability services.
Event-Driven Architectures: Experience with Kafka, RabbitMQ, and integrating observability into event-driven architectures.
Time-Series Analysis & Predictive Analytics: Skills in time-series analysis for predictive maintenance and alerting.
Security & Compliance: Experience ensuring that observability data is managed securely and in compliance with regulations such as GDPR and HIPAA.
Performance Optimization: Ability to conduct root cause analysis and improve observability for system optimization.
Experience:
Lead, Observability Project in Citrix Environment: Directed the implementation of Citrix observability with Splunk ITSI, enhancing the monitoring of over 1,000 virtual desktops and applications. Reduced MTTR by 40% and increased user satisfaction.
VDI and VM Observability: Designed and deployed observability solutions for VDI and VMs using OpenTelemetry and Splunk, ensuring performance and availability of critical applications and infrastructure.
Advanced Monitoring & Analytics with Kafka and Splunk: Spearheaded the deployment of real-time monitoring solutions using Kafka for event streaming and Splunk for visualization and alerting in a high-traffic environment.
Machine Learning-Driven Anomaly Detection: Developed and implemented machine learning algorithms in Python for anomaly detection in telemetry data.
Cross-Functional Collaboration: Worked closely with SREs, DevOps, and development teams to enhance system reliability and incident response.