Senior Application Security Tester & AI Red Team Subject Matter Expert

Evolve Security

Chicago, Illinois 60290 United States

Posted: Jun 09, 2026

Full Time
Federal Government

Summary

The Senior Application Security Tester & AI Red Team Subject Matter Expert is a senior-level offensive security role for a tester who has mastered modern web and API security and is now defining how Evolve Security tests AI-enabled applications, large language models, and agentic systems. This role wears two hats: hands-on senior application penetration tester for our most complex client engagements, and the firm-wide subject matter expert who builds, scales, and represents Evolve Security's AI red team practice. The senior tester executes assessments with full autonomy, owns the technical relationship with client security and engineering leadership, mentors mid-level engineers and OSOC analysts, and is the recognized internal authority on offensive AI/ML testing methodology, tooling, and threat modelling.

Typical Experience: 5–8+ years of offensive security experience with a deep concentration in web application and API penetration testing, plus demonstrable hands-on work testing AI/ML systems: LLM-backed applications, RAG pipelines, fine-tuned models, multi-agent systems, or production ML inference. A track record of dozens of completed assessments, published research, conference talks, CVEs, or open-source contributions is expected.

Domain Expertise: Mastery of web application and API security beyond the OWASP Top 10: business logic abuse, complex authentication and authorization flows (OAuth 2.0 / OIDC, SAML, JWT, mTLS), SSRF chains, deserialization, request smuggling, prototype pollution, and modern SPA / GraphQL attack surface. Equally fluent in the OWASP Top 10 for LLM Applications and OWASP ML Top 10: prompt injection (direct, indirect, multi-modal), jailbreaks and safety bypasses, insecure output handling, training data poisoning and extraction, model denial of service, supply chain vulnerabilities in model and plugin ecosystems, excessive agency in agentic systems, sensitive data leakage from system prompts and embeddings, and vector store / RAG poisoning.

Technical Skills: Expert with the modern offensive toolchain (Burp Suite Pro with custom extensions, OWASP ZAP, Nuclei, Postman, Nmap, Metasploit, BloodHound) and able to build bespoke tooling when the off-the-shelf option falls short. Comfortable with AI red-teaming tooling such as Garak, PyRIT, Promptfoo, Giskard, and adversarial ML libraries, and confident designing custom evaluation harnesses against client-specific LLM and agent stacks. Strong scripting and small-tool development in Python, with working knowledge of JavaScript / TypeScript, Bash, and PowerShell. Familiar with the components of modern AI applications: vector databases (Pinecone, Weaviate, pgvector), embedding models, retrieval pipelines, agent frameworks (LangChain, LlamaIndex, CrewAI), and tool-use protocols including MCP.

Soft Skills: Excellent written and verbal communication: produces publication-quality reports with no editorial rework, leads CISO and engineering-leader briefings, and de-escalates contested findings with technical rigor. Mentors mid-level engineers and OSOC analysts through code review, paired testing, and methodology coaching. Comfortable representing Evolve Security externally through webinars, podcasts, conference CFPs, and client thought-leadership content.

Certifications (Preferred, not required): OSWE, OSCP, OSEP, GWAPT, GXPN, Burp Suite Certified Practitioner; AI/ML-adjacent credentials and contributions such as AI Red Team certifications, published prompt injection research, MITRE ATLAS contributions, or SANS SEC545/SEC595.

Expertise that aligns to our approach

- Lead end-to-end web application and API penetration tests as the senior technical owner, scoping the engagement, executing the assessment, and presenting findings to client security and engineering leadership.
- Apply structured testing techniques aligned to OWASP WSTG and OWASP API Security Top 10 to assess authentication, session management, access control (vertical and horizontal privilege escalation), input validation, error handling, and business logic flaws.
- Design and execute AI red-team engagements against LLM-backed applications, RAG systems, and agentic workflows, covering prompt injection (direct, indirect, multi-modal), jailbreak resilience, system prompt and tool-use exfiltration, training data and embedding leakage, insecure output handling, and excessive agency in tool-using agents.
- Map AI findings to the OWASP Top 10 for LLM Applications, OWASP ML Top 10, MITRE ATLAS, and the NIST AI Risk Management Framework so client stakeholders can defend severity and remediation calls internally.
- Test the full AI application surface: model endpoints, prompt and response pipelines, retrieval augmentation, vector stores, fine-tuning pipelines, plugin / tool integrations (including MCP servers), guardrail and safety layers, and supporting cloud infrastructure.
- Demonstrate proficiency in manual exploit development for both classical web vulnerabilities (XSS, SQLi, SSRF, IDOR, CSRF, deserialization) and LLM-specific attacks (jailbreak chains, indirect prompt injection via RAG content, agent hijacking via crafted tool outputs).
- Validate authentication mechanisms (OAuth, OIDC, SAML, MFA implementations, and JWT) and how they extend into AI-specific surfaces such as agent identity, per-user tool scoping, and prompt-level authorization.
- Assess session management, secrets handling, and data-flow controls in AI applications, including how user data ends up in prompts, logs, vector stores, and model fine-tunes.
- Execute client-side testing using browser dev tools and proxy-based inspection, evaluating DOM-based vulnerabilities, insecure local storage, and AI-driven client behaviors (e.g., embedded copilots and in-page agents).
- Test REST and GraphQL APIs using a combination of dynamic, manual, and automated methods; extend the same rigor to model and agent APIs.
- Perform code-assisted (grey-box) and full source review when available, identifying logic flaws, insecure configurations, and dangerous patterns specific to AI integrations (untrusted-content-to-prompt, unbounded tool use, missing output sanitization).
- Build, maintain, and contribute to Evolve Security's AI red-team methodology, payload libraries, evaluation harnesses, and reporting templates, and serve as the firm-wide reviewer for AI-related findings.
- Mentor mid-level penetration testing engineers and OSOC analysts through paired testing, technical review, knowledge-sharing sessions, and contributions to internal training and the academy.
- Represent Evolve Security externally through conference talks, blog posts, webinars, and client thought-leadership content on application security and AI red-teaming.
- Communicate findings clearly, with strong emphasis on business impact, reproducibility, and strategic remediation guidance that engineering teams can actually ship.

Success in the first 6 months looks like:

- Published, version-controlled AI red-team methodology covering LLM applications, RAG systems, and agentic workflows, adopted across Evolve Security engagements.
- A reusable AI red-team toolkit (custom Garak/PyRIT probes, payload libraries, evaluation harnesses) ready for any tester to use on a client engagement.
- Senior technical ownership of at least one strategic, AI-focused client account.
- Mentorship cadence in place with mid-level engineers and OSOC analysts; demonstrable uplift in their AI-related findings and reporting quality.
- At least one piece of public thought leadership (talk, blog, or research) attributed to Evolve Security.

Benefits Include

- Healthcare Benefits
- 401(k) Match
- Parental Leave
- Flexible Paid Time Off
- Annual vacation reimbursement

ABOUT THE COMPANY
- Government Careers
Government jobs offer stability, competitive benefits, and the chance to make a meaningful impact on your community and country.

Whether you’re starting your career or seeking new opportunities, these roles provide pathways for growth, security, and service.

Explore positions across a wide range of fields and take the first step toward a rewarding future in public service.
