Job Description
Role Summary

You will own the full ML stack that turns raw dielines, PDFs, and e-commerce images into a self-learning system that reads, reasons about, and designs packaging artwork. This includes:

- Building data-ingestion and annotation pipelines (SVG and PDF to JSON)
- Designing or modifying model heads on top of LayoutLMv3, CLIP, GNNs, and diffusion LoRAs
- Training and fine-tuning on GPUs
- Shipping inference APIs and evaluation dashboards

You will work day-to-day with packaging designers and a product manager, and you will be the technical authority on deep learning for this domain.

Key Responsibilities

Data and Pre-processing (approx. 40 percent)

- Write robust Python scripts to parse PDF, AI, and SVG files; extract text, colour separations, images, and panel polygons
- Implement Ghostscript, Tesseract, YOLO, and CLIP pipelines
- Automate synthetic-copy generation for ECMA dielines
- Maintain vocabulary YAMLs and JSON schemas

Model R&D (approx. 40 percent)

- Modify LayoutLMv3 heads (panel ID, bounding-box regression, colour, contrastive learning)
- Build panel-encoder pre-training (masked-panel prediction)
- Add Graph Transformer and CLIP-retrieval heads
- Optional: a diffusion generator
- Run experiments, hyperparameter sweeps, and ablations
- Track KPIs such as IoU, panel F1, and colour recall

MLOps and Deployment (approx. 20 percent)

- Package training and inference into Docker, SageMaker, or GCP Vertex jobs
- Maintain CI/CD and experiment tracking using Weights & Biases or MLflow
- Serve REST or GraphQL endpoints used by designers and front-end teams
- Implement an active-learning loop that ingests designer corrections nightly

Must-Have Qualifications

- 5+ years of experience with Python and 3+ years with deep learning (PyTorch, Hugging Face)
- Hands-on experience with Transformer-based vision-language models (LayoutLM, Pix2Struct)
- Familiarity with object detection (YOLOv5 or YOLOv8, DETR)
- Comfort with PDF and SVG toolchains such as PyMuPDF, pdfplumber, Ghostscript, svgpathtools, and OpenCV
- Experience designing custom model heads and loss functions
- Ability to fine-tune large pretrained checkpoints on limited data
- Solid Linux and GPU knowledge
- Experience with GNNs or relational transformers
- Clean, idiomatic Git usage with reproducible experiment tracking

Nice to Have

- Knowledge of colour science (Lab, ICC profiles, Pantone tables) or print production
- Experience with multimodal retrieval (CLIP, ImageBind) or diffusion fine-tuning (LoRA, ControlNet)
- Background in the packaging or consumer-goods industry (e.g., Nutrition Facts, ECMA codes)
- Experience with FAISS and ML tooling on AWS or GCP
- Familiarity with TypeScript or React for simple label-preview UIs

Primary Tool Stack

- PyTorch, Hugging Face Transformers, torch-geometric
- PyMuPDF, pdfplumber, svgpathtools, OpenCV, Ghostscript
- Tesseract, YOLOv8, Grounding DINO
- CLIP, ImageBind, FAISS
- Docker, GitHub Actions, W&B, MLflow
- AWS SageMaker, GCP Vertex
- Python (primary), plus some Bash, JSON, and YAML

First 6 Months: Sample Deliverables

- Data pipeline v1 that converts 500+ ECMA dielines and 200 PDFs into training-ready JSON
- Panel-encoder model with less than 5 percent masked-panel error
- MVP copy-placement model achieving 85 percent IoU on validation data
- Inference service with preview UI capable of generating a packaging layout for one SKU in under 10 seconds
- Nightly active-learning retraining loop integrated with real user corrections
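To make the SVG-to-JSON ingestion expectation concrete, here is a minimal, stdlib-only sketch of turning dieline panel polygons into training-ready records. The `<polygon>` markup, panel `id` attributes, and the `dieline_to_records` helper are illustrative assumptions; real dielines typically carry cut/crease layers as `<path>` elements, which is where tools like svgpathtools or PyMuPDF from the stack above come in.

```python
import json
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

def dieline_to_records(svg_text):
    """Convert <polygon> panels in a dieline SVG into JSON-ready records.

    Assumes each panel is a <polygon> whose `id` names the panel.
    """
    root = ET.fromstring(svg_text)
    records = []
    for poly in root.iter(SVG_NS + "polygon"):
        # "x,y x,y ..." -> list of (x, y) float pairs
        pts = [tuple(map(float, p.split(","))) for p in poly.get("points").split()]
        xs, ys = zip(*pts)
        records.append({
            "panel_id": poly.get("id", "unknown"),
            "polygon": pts,
            "bbox": [min(xs), min(ys), max(xs), max(ys)],
        })
    return records

svg = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<polygon id="front" points="0,0 100,0 100,150 0,150"/>'
    '<polygon id="side" points="100,0 140,0 140,150 100,150"/>'
    '</svg>'
)
print(json.dumps(dieline_to_records(svg), indent=2))
```

A production version would additionally pull text runs and colour separations (via pdfplumber/Ghostscript for PDF sources) into the same record schema.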
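For reference on the IoU KPI used in the deliverables above: intersection-over-union compares a predicted placement box against ground truth. A minimal sketch, assuming axis-aligned boxes in `[x0, y0, x1, y1]` form:

```python
def bbox_iou(a, b):
    """Intersection-over-union for two axis-aligned [x0, y0, x1, y1] boxes."""
    # Intersection rectangle (clamped to zero width/height if disjoint)
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Two 10x10 boxes overlapping by half: IoU = 50 / 150 ≈ 0.333
print(bbox_iou([0, 0, 10, 10], [5, 0, 15, 10]))
```

Hitting an 85 percent mean IoU on copy placement implies predicted boxes that almost coincide with the designer's ground truth, so the evaluation dashboard would typically report the IoU distribution, not just the mean.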