The short answer
AI safety jobs are scarce, prestige-heavy, and very competitive — most research leads have PhDs. But it's the best-funded field for a student to enter: paid fellowships like Anthropic Fellows ($3,850/week plus compute) and MATS explicitly don't require a PhD, and red-teaming arenas offer a zero-credential on-ramp with real prize money.
Read this before you apply anywhere
Here’s the hard part first, because everyone else buries it. AI safety is a small, prestigious field, and it is genuinely difficult to get into. 80,000 Hours — the career org closest to this space — describes technical safety research as “very competitive to enter” and notes that the vast majority of research leads have a PhD. Their cited practical bar for empirical research work: you should be able to reproduce a typical ML paper in a few hundred hours. If you were hoping this was a back door into AI because it sounds less technical than machine learning engineering, it isn’t. In some ways it’s harder — fewer seats, more applicants who care intensely.
Now the part that makes this page worth reading anyway: of the three “AI conscience” fields — safety, AI ethics, and AI governance — safety is the only one that pays students to learn the research itself. Ethics is the most credential-gated of the three, and governance’s entry door — paid compliance internships around $20–40/hour — is real but conventional. Safety, unusually, funds its own research pipeline: multiple fellowships pay real stipends — in one case $3,850 a week — to people with no PhD, no prior ML research, and no published papers, because the field has more money than trained people. That’s a structural oddity you can exploit, and the verified list is below.
One reframe before the details: you don’t enter AI safety by searching job boards for “AI safety analyst” and applying. Almost nobody gets in that way. You enter sideways, through a funded program or a public red-teaming track record, and convert that into a role. The job boards are where you land after the on-ramp, not before.
What “AI safety” actually means (and the two tracks inside it)
When people search this term they usually mean the technical field: making sure advanced AI models behave as intended, and building the tests that prove it. It’s distinct from its two sibling fields, and employers notice when you conflate them. AI ethics jobs ask “is this system fair, and does it harm anyone?” — fairness, bias, harm review inside product companies. AI governance jobs ask “does this system meet the law?” — compliance, risk documentation, the EU AI Act. AI safety asks “will this model do what we intend, and how do we test that?” — and it lives in frontier labs, safety nonprofits like METR, Apollo Research, and Redwood Research, and government evaluation bodies.
Inside safety there are two tracks, and you should pick one before you apply to anything:
The technical track — alignment research, interpretability, and evaluations. Reproducing papers, running experiments, building eval harnesses, adversarially testing models. This is where the money and the prestige are, and where the Python bar applies.
The policy track — researching how frontier AI should be governed, drafting policy, translating eval results into regulation. Home base is think tanks like GovAI, IAPS, and RAND, plus government institutes. No ML engineering required; sharp writing and quantitative literacy are the currency. This track borders governance closely enough that you should read the AI governance jobs guide alongside this one.
A naming note that doubles as a trust test: the US AI Safety Institute was renamed CAISI (Center for AI Standards and Innovation) in June 2025, with its mission shifted toward voluntary standards and national-security evaluations, and the UK’s institute is now the AI Security Institute. Both still hire technical evaluators. If a career guide you’re reading still says “US AISI,” it hasn’t been updated since 2025 — and in a field this fast-moving, that tells you what its pay numbers are worth too.
The real roles and what they pay
Per-role ranges, never per-company. The oft-quoted SF-lab numbers — 80,000 Hours cites a ~$222k median for SF software engineers as the reference band for lab research engineers — are context for where the field tops out, not what you’ll be offered. Safety nonprofits pay less than labs. Your entry expectations should anchor to the stipend and contract rates below, because that’s what entry actually pays.
Research engineer / research assistant (empirical safety). Reproduce ML papers, run interpretability and robustness experiments, build eval harnesses. Lab-scale pay at the top; nonprofit pay below it. Requires strong Python and ML engineering — the reproduce-a-paper bar — but notably, many contributor roles do not require a PhD. Apply via the 80,000 Hours job board (~850 roles listed across the ecosystem), aisafety.com/jobs, and the labs and safety orgs directly. As of mid-2026, Apollo Research alone had around a dozen open roles across evals, governance, and operations.
Model-evaluation / red-team analyst. Design evals, adversarially test models, write up failures as reproducible reports. Contract eval work runs $18–35/hour; ZipRecruiter’s “AI safety” average of $32.38/hour (June 2026) is a low-confidence aggregate but lands in the same band. Creative prompting and clear writeups matter more than credentials here — a leaderboard placing beats a resume line. Some queues need no coding at all.
Safety fellow. The real student entry. Three to four months of mentored research aimed at a public output, paid a genuine stipend. Detailed table below, because this is the heart of the page.
AI safety policy analyst. Early-career think-tank roles run from fellowship stipends up to roughly $70k–110k. IAPS pays its policy fellows $15,000 for three months ($22,000 at senior level) — a useful anchor for what funded policy entry looks like.
Ranges compiled from platform listings, job postings, and worker reports · last verified July 2026.
The paid fellowship list — the best-funded door in AI
This is the differentiator of the field. Every program below is verified and designed for people who are not yet safety researchers, and most pay a real stipend. Windows rotate fast, so every date is stamped: all windows below are as of July 2026 — re-check the program page before you plan around one.
| Program | What it pays | Format | Window (as of July 2026) |
|---|---|---|---|
| Anthropic Fellows | $3,850/week stipend + ~$15k/mo compute budget | 4 months, mentored empirical research | Cohorts starting May & July 2026 |
| OpenAI Safety Fellowship | Paid fellowship | Safety evals, robustness, oversight | Sep 14, 2026 – Feb 5, 2027 |
| MATS | Funded research program | Mentors from OpenAI, Anthropic, DeepMind, Redwood, ARC; J-1 visa support | Summer & Autumn 2026 cohorts |
| CBAI Summer Research Fellowship | Fully funded | 9 weeks, Cambridge MA | Jun 8 – Aug 10, 2026 |
| ERA:AI Fellowship | Stipend + lodging, visa, transport | 10 weeks, Cambridge UK | From July 2026 |
| ARENA 9.0 | Technical upskilling bootcamp (London) | Alignment engineering curriculum | Oct 5 – Nov 6, 2026 — apply by Jul 12, 2026 |
| SPAR | Part-time, remote, undergrad-friendly | 5–20 hrs/week research | Spring 2026 closed; next cohort TBA |
Three things to notice in that table. First, the Anthropic Fellows bar, in the program’s own words: you don’t need a PhD, prior ML experience, or published papers — they select on ability to execute research. That is a radically lower credential gate than the full-time roles, attached to a higher weekly stipend than most graduate jobs. Second, the OpenAI Safety Fellowship explicitly welcomes people from CS, social science, cybersecurity, and HCI backgrounds — the policy-and-evals side of safety genuinely wants non-ML people. Third, MATS reports that around 80% of its alumni now work in alignment — these programs are conversion machines, not resume decorations.
The honest caveat: fellowships this well-paid are competitive in proportion. Treat the application itself as a project — most of them ask you to demonstrate research execution, which is exactly what the proof-of-work method in AI jobs with no experience manufactures. A rejected-then-reapplied cycle with a stronger portfolio is a normal path in, not a failure.
The zero-credential on-ramp: get paid to break models
If the fellowship table feels out of reach this semester, there’s a floor-level entrance that requires no degree, no Python, and no application essay: adversarial red-teaming.
Gray Swan Arena is the standout. It’s a public red-teaming competition platform — you try to make models misbehave, in a browser, with words. No coding required. Prize pools have historically run past $40,000 per event, the top 50 red-teamers get invited to a paid private network, and standout performers have been hired off the leaderboard. This is the rare corner of AI hiring where a public scoreboard genuinely substitutes for a resume. HackerOne’s AI bounty programs are the adjacent, more security-flavored version.
One tier below that sits paid evaluation-queue work: rating model outputs against safety rubrics, flagging unsafe responses, writing up failure cases at the $18–35/hour contract rates above. It’s the same platform ecosystem covered in AI training jobs, pointed at safety-specific queues — and it teaches you, from the inside, how models actually fail. That intuition is precisely what eval roles and fellowship applications select for. If you’re starting from absolute zero, the broader entry landscape is mapped in entry-level AI jobs.
The sequence that works: eval-queue work or Gray Swan for income and model intuition → one public artifact (a writeup of a novel jailbreak or failure mode, published where a stranger can check it) → fellowship application with that artifact attached. Each step funds the next.
How students actually break in
If you’re technical (or willing to become technical): your gap is research execution, not credentials. Work through ARENA’s curriculum or BlueDot Impact’s free AI Safety Fundamentals technical stream, then reproduce one ML paper — yes, the few-hundred-hours version — and publish the reproduction with your notes. That single artifact answers the exact question every fellowship asks. Then apply to MATS, Anthropic Fellows, and CBAI in the same cycle; the applications overlap heavily. A machine learning internship in parallel keeps a conventional path open while you swing at the fellowships.
If you’re policy-shaped: skip the Python guilt. Take BlueDot’s free governance stream, write two short, sharply-argued pieces on a live frontier-AI policy question, and apply to IAPS ($15k, no policy experience required), GovAI’s fellowship, and the Horizon Fellowship’s US policy placements. Watch CAISI and the UK’s AI Security Institute for early-career openings — government eval bodies hire more junior than labs do.
If you’re neither yet: start in the paid queues and the Arena tonight. It’s the only entrance with no gate, and it’s real money while you figure out which track fits.
Tools that get the interview
Fellowship and safety-org applications are won on evidence, not gear. But once you’re converting a fellowship or a leaderboard run into actual job applications, a few tools save time. Our current picks — with the honest caveats and what each actually costs — live on one page: the tools we actually recommend.
FAQ
Do you need a PhD to work in AI safety? For research lead roles, effectively yes — 80,000 Hours notes the vast majority of leads have one. For everything below that, no. Many contributor roles don’t require a PhD, and the major fellowships say so outright: Anthropic Fellows requires no PhD, no prior ML experience, and no published papers. The bar is demonstrated research execution, not the credential.
How do students get into AI safety? Sideways, through programs — not by applying to job postings. The working sequence is: free coursework (BlueDot, ARENA curriculum) → one public artifact (a paper reproduction or a red-teaming writeup) → a paid fellowship (Anthropic Fellows, MATS, CBAI, ERA) → conversion into a lab or safety-org role. Paid eval work and Gray Swan Arena fund the early steps.
Do AI safety fellowships actually pay? Yes, unusually well — this is the field’s defining quirk. Anthropic Fellows pays $3,850/week plus a compute budget; IAPS pays policy fellows $15,000 for three months; CBAI and ERA are fully funded including lodging. All figures as of July 2026 — windows and terms move, so verify on the program page.
What’s the difference between AI safety and AI ethics? Safety is the technical field: will the model behave as intended, and how do we test it — evals, alignment, red-teaming, mostly at labs and safety nonprofits. AI ethics is responsible-AI practice inside companies: fairness, bias, harm review. Governance is the legal-compliance sibling. They interlink, but employers hire for them separately.
Can you get an AI safety job without coding? Two doors, honestly. The policy track — think tanks, fellowships like IAPS, government institutes — runs on writing and analysis, not code. And red-teaming via Gray Swan Arena requires no coding at all; it’s adversarial creativity in plain language, with prize money and hiring attached. The technical research track, though, does require real Python and ML engineering — no way around it.
Is AI safety a growing field? Yes, but from a small base — think hundreds of roles across the ecosystem, not tens of thousands. The 80,000 Hours board lists ~850 roles across all its categories, and individual safety orgs hire in the single digits to low dozens. Funding is growing faster than headcount, which is exactly why the fellowship pipeline is so well-paid: the field is buying its future researchers.
Related guides
- AI governance jobs — the compliance-driven sibling field, and the most enterable of the three via internships and GRC work.
- AI ethics jobs — the responsible-AI sibling: hardest to enter directly, and what the titles really look like at entry.
- AI jobs with no experience — the proof-of-work method that fellowship applications quietly select for.