
Blue Books Are Back. The Real Question Is What Schools Do Next.

Princeton ended 133 years of unproctored exams under its honor code, and blue book sales are surging as schools revert to handwritten, proctored, and oral exams to fight AI cheating. Here is why reverting the format is a tourniquet, not a treatment, and what a layered approach looks like.

The Checkmark Plagiarism Team
When Princeton's faculty voted in May 2026 to require proctors at every in-person exam, they were not just changing a scheduling rule. They were ending a 133-year-old tradition. Since 1893, Princeton students had taken unproctored exams under an honor code that originated in a student petition to abolish proctoring. The faculty reversed that arrangement with only one dissenting vote, citing AI and personal devices as the catalyst (Inside Higher Ed; The Daily Princetonian).

Princeton is the headline, but it is not an outlier. Across higher education, instructors are quietly rolling assessment back to a pre-digital state: handwritten blue books, in-class essays, oral exams. The question worth asking is not whether this is happening — it clearly is — but whether reverting the format of assessment actually solves the problem it is reacting to. Our read, as a company that builds originality and AI-detection tools: the reversion is a reasonable emergency measure, but it is a blunt instrument, and the schools that lean on it alone will trade one set of failures for another.

The blue book comeback is real, and the numbers show it

The clearest evidence is commercial. According to Wall Street Journal reporting summarized by The Daily Cardinal, blue book sales at the University of Florida rose roughly 50% in the 2024–2025 academic year, and at UC Berkeley they jumped about 80%. At the University of Wisconsin–Madison, the 12-page books sold out in early September with no restock date, and smaller books faced three-week backorders. When a paper product that had been drifting toward obsolescence suddenly sells out, something structural is going on.

That something is a collapse in confidence that faculty can spot AI-generated work. A survey of 337 higher-education leaders (presidents, provosts, chancellors, and deans) run by the American Association of Colleges & Universities and Elon University's Imagining the Digital Future Center between November 4 and December 7, 2024, found that 21% believed cheating on their campus had increased "a lot" and another 38% said it had increased "a little" since generative AI became widely available (GovTech). More telling was the second finding: more than half of those leaders said their faculty were "not at all effective" or "not very effective" at recognizing AI-generated work. That is the real driver. When you cannot trust your own eyes on a take-home essay, the in-class blue book starts to look like the only ground you can stand on.

Individual instructors describe exactly that reasoning. At Georgia State, associate professor Jason Coupet switched back to paper blue books for 2024–2025 after watching generative AI seep into students' online work (KQED). St. Michael's College historian Alexandra Garrett never left them: "I've never not done blue books for exams and I have no incentive to change it." There is even a learning-science argument in the mix — Vanderbilt's Sophia Vinci-Booher researches the cognitive benefits of handwriting, noting that when "note-taking and testing modes align, a student is more likely to perform better." The pen-and-paper revival is not pure nostalgia; some of it is defensible pedagogy.

Oral exams and proctoring are the same instinct in different clothes

The blue book is the low-tech end of a broader move back toward witnessed work. At the high-stakes end sits the oral exam. The Washington Post reported in December 2025 on professors reviving oral examinations specifically because a student can submit a flawless written assignment and then fail to explain it out loud (Washington Post). Education researchers have been making the case for a while that the spoken defense is one of the few formats a language model cannot sit in for (The Conversation), and assessment specialists have floated "sampled vivas" — short oral checks on a random subset of students — as a scalable deterrent (Times Higher Education).

There is a genuine insight buried in the oral-exam enthusiasm, and it is worth stating plainly because it applies to detection too: oral exams do not merely catch AI use after the fact, they change how students study in the first place. If you know you will have to talk through your reasoning, you prepare to own the ideas rather than to assemble them. That incentive shift is the part of this trend most worth protecting.

Why reversion alone is the wrong lesson to take

Here is where we part company with the "just go back to paper" framing. Reverting the format addresses the symptom — unverifiable take-home work — without addressing the thing schools actually exist to do, which is teach students to write, research, and think over time. Three problems make pure reversion a poor long-term strategy.

It penalizes the wrong students. Timed, single-draft handwriting is not a neutral test of knowledge; it is also a test of handwriting speed, working under pressure, and writing in your first language. The UW–Madison instructors quoted by The Daily Cardinal saw this immediately: many first-year students are "taking their first semester of college in another country, in another language," which is why some instructors permit dictionaries during exams. Students with documented accommodations — extended time, assistive technology, processing differences — are disadvantaged by exactly the constraints that make a blue book "AI-proof." An integrity measure that quietly lowers scores for multilingual and disabled students has simply swapped an academic-integrity problem for an equity problem.

It throws out the skill it claims to protect. Real writing is iterative: you draft, you sit with an idea, you revise. UW–Madison's Clinton Castro responded to AI not by banning the keyboard but by building multi-step assessments with peer review and rewrites, precisely to preserve "something really valuable about sitting with an idea, writing in stages, doing drafts." A semester of in-class, one-shot essays trains students to produce rushed first drafts under a clock, which is neither how good writing happens nor what employers say they want from graduates. In surveys, employers consistently rank communication and the ability to develop and explain ideas above raw speed.

It does not scale, and it does not cover everything. Oral exams are hard to run fairly in a 300-person lecture, grading consistency suffers, and the lack of anonymity invites bias. Proctored in-class work says nothing about the take-home lab report, the semester project, the thesis. You cannot put every meaningful assignment in a blue book without gutting the curriculum.

What a layered approach looks like — and where detection fits

The honest position, even from a detection company, is that no single tool — not a blue book, not an oral exam, and not an AI detector — is a verdict on its own. What works is layering, where each method covers another's blind spot.

Start with assessment design, because it is the highest-leverage and least-discussed lever. Assignments that ask for personal connection to course material, that build on in-class discussion, that require students to critique an AI-generated draft rather than produce one — these are harder to outsource and more worth doing. This is the Castro model: redesign the task so that doing it honestly is also the best way to learn.

Layer in process evidence. Version history in Google Docs, draft submissions, brief in-person check-ins, the occasional sampled viva — these create a record of the work developing over time. A student who can walk you through their reasoning has demonstrated something a finished document never can, and the threat of being asked changes behavior up front.
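For instructors who want to try the sampled viva, the selection step can be very small. The sketch below is one possible way to draw the random subset, not a prescribed procedure: the function name `sample_for_viva`, the 10% default fraction, and the roster format are all illustrative assumptions.

```python
import random


def sample_for_viva(student_ids, fraction=0.1, seed=None):
    """Pick a uniformly random subset of students for a short oral check.

    All names and defaults here are hypothetical; adjust the fraction to
    whatever an instructor can realistically schedule.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # De-duplicate and sort so that a fixed seed gives a reproducible draw.
    ids = sorted(set(student_ids))
    if not ids:
        return []
    # Always select at least one student, even in a very small class.
    k = max(1, round(len(ids) * fraction))
    rng = random.Random(seed)
    return sorted(rng.sample(ids, k))


# Example: a 300-person lecture with 10% of students sampled.
roster = [f"student-{n:03d}" for n in range(300)]
selected = sample_for_viva(roster, fraction=0.1, seed=2026)
print(len(selected))  # 30
```

Because the draw is uniform, who gets called in does not depend on an instructor's impression of any particular student, and recording the seed lets the selection be reproduced if it is ever questioned.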

Then use detection as a signal, not a sentence. This is where our own product lives, so we will be precise about what it is for. An AI or plagiarism detector is an early-warning flag that tells an instructor where to look more closely, not a tribunal that decides guilt. The same AAC&U/Elon survey that explains the blue book panic also explains why: campus leaders doubt that unaided faculty judgment can reliably recognize AI text, and that doubt is reasonable. A good detector restores some of that signal. But it is a starting point for a conversation with a student, paired with process evidence and the instructor's knowledge of that student's prior work, never a number that ends the conversation. Detection tools produce false positives, they can be unreliable on writing by multilingual authors, and treating a probability score as proof recreates the same equity failure as the blue book. Used as one input among several, detection lets schools keep assigning the take-home, iterative, real-world writing that reverting to paper forces them to abandon.
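A little arithmetic shows why a flag cannot stand in for proof. The sketch below works out what share of flagged submissions would be honest work under a set of made-up rates. Every number in it is a hypothetical assumption chosen for illustration, not a measurement of any detector (ours included) or of any campus.

```python
def share_of_flags_on_honest_work(ai_use_rate, true_positive_rate, false_positive_rate):
    """Return the fraction of flagged submissions that were written honestly.

    ai_use_rate:         assumed share of submissions that used AI
    true_positive_rate:  assumed share of AI-assisted work the detector flags
    false_positive_rate: assumed share of honest work the detector flags
    """
    flagged_ai = ai_use_rate * true_positive_rate
    flagged_honest = (1 - ai_use_rate) * false_positive_rate
    total_flagged = flagged_ai + flagged_honest
    if total_flagged == 0:
        return 0.0
    return flagged_honest / total_flagged


# Hypothetical inputs: 10% of submissions use AI, the detector catches 90%
# of those, and it wrongly flags 5% of honest submissions.
wrong_share = share_of_flags_on_honest_work(0.10, 0.90, 0.05)
print(round(wrong_share, 2))  # 0.33
```

Under those assumed rates, about one flag in three lands on a student who did nothing wrong, which is why a flag is a reason to look at drafts and talk to the student rather than a finding.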

The takeaway

The blue book revival and the Princeton vote are rational reactions to a real loss of trust. We do not think instructors who reach for them are wrong to be worried — the survey data, the sold-out bookstores, and Princeton's own senior survey (where nearly 30% of seniors admitted to cheating and 44.6% knew of violations they never reported) all describe a genuine crisis. But reverting the format of assessment is a tourniquet, not a treatment. The schools that come out of this era strongest will not be the ones that retreat fastest to 1893. They will be the ones that redesign assessment so honest work is the path of least resistance, gather evidence of the writing process, and use detection the way a good clinician uses a lab result — as a signal that tells you where to look, not a diagnosis that ends the inquiry.
