0 years
0 Lacs
Posted: 1 day ago
Remote
Full Time
Dusker AI is an innovation-driven company advancing the reliability and real-world performance of AI systems. We build research-grade benchmarks, evaluation frameworks, and automated testing environments that push AI agents beyond demos and into practical, terminal-based workflows.
Our work includes large-scale evaluation projects such as Terminal-Bench, where we rigorously test how AI agents behave in real Linux environments. We focus on reproducibility, correctness, and robustness, combining deep systems engineering with thoughtful automation.
We’re growing quickly and are looking for engineers who enjoy hands-on problem solving, clean systems design, and real ownership.
We are looking for a Software Engineer (Terminal Systems & Automation) to join us on a 3-month remote contract, with a strong likelihood of extension based on performance and project needs.
This role is ideal for engineers who enjoy working close to the operating system, debugging real execution environments, and building reliable automation that runs at scale.
You’ll work on designing, maintaining, and improving Terminal-Bench tasks, automation pipelines, and evaluation workflows that test AI agents in realistic Linux environments.
The role is fully remote, and the contract starts immediately.
Responsibilities
Design, implement, and maintain terminal-based evaluation tasks for AI benchmarking
Build and debug Bash and Python automation used in large-scale evaluation pipelines
Work extensively in Linux/Ubuntu environments, diagnosing process, filesystem, and execution issues
Create and maintain Docker-based execution environments to ensure isolation and reproducibility
Investigate flaky or failing tasks and improve their reliability, determinism, and clarity
Document task behavior, edge cases, and execution requirements
Collaborate asynchronously with a distributed engineering team to improve evaluation quality and system robustness
We’re looking for engineers who are curious, detail-oriented, and comfortable working independently.
Required Qualifications
Strong proficiency with Linux/Ubuntu command-line environments
Solid experience writing and debugging Bash scripts for automation
Hands-on experience with Docker (Dockerfiles, containers, volumes, execution environments)
Proficiency in Python for scripting, automation, or tooling
Good understanding of processes, signals, and system behavior, as well as filesystems, permissions, and environment variables
Comfortable using Git in collaborative codebases
Strong debugging and problem-solving skills
Ability to work effectively in a fully remote environment
Preferred Qualifications
Experience with Terminal-Bench, OSWorld-SFT, or similar evaluation frameworks
Background in systems programming, developer tooling, or infrastructure
Familiarity with additional languages (Go, Rust, C/C++, JavaScript)
Interest in AI systems, agent evaluation, or benchmarking
Compensation
$30 per completed task
Clearly scoped, engineering-focused tasks
Increased task volume
Contract extensions beyond 3 months
Long-term collaboration opportunities
Work on real systems problems, not toy projects
Flexible, remote-first work environment
Clear, measurable impact on AI reliability and evaluation
Opportunity to grow with a fast-moving, technically focused team
Merit-based opportunities for longer-term engagement
Dusker AI