A ChatGPT detector is a tool that reads a piece of writing and estimates the probability that a large language model, rather than a person, produced it. That is the whole job described in one sentence. Everything else, all the statistics and the confidence bars and the highlighted sentences, is in service of that single guess: human or machine.
The word "estimates" is doing a lot of work in that sentence, and most of the confusion around these tools comes from skipping over it. A ChatGPT detector does not know who wrote anything. It has never seen the student, the assignment, or the draft history. It is looking at the finished text and making a statistical bet. Understanding how it places that bet is the difference between using a detector well and misusing it.
This guide walks through what is actually happening under the hood, what the numbers mean, why two detectors can look at the same paragraph and disagree, and how a teacher or administrator should read a result without either ignoring it or trusting it blindly.
What a detector is actually measuring
Start with a strange but useful fact: ChatGPT and a human writing about the same topic will often use many of the same words. The detector is not looking for forbidden vocabulary or secret watermarks. It is looking at the shape of the writing, the statistical texture that emerges when you measure how predictable each word is given the words around it.
Two concepts carry most of the weight here.
The first is perplexity. Loosely, perplexity measures how surprised a language model is by a piece of text. If every word is exactly what the model would have predicted, perplexity is low. If the writing keeps zigging where the model expected it to zag, perplexity is high. Human writing tends to be a little surprising. We reach for an odd word, we leave a thought half-finished, we make a strange comparison because it amused us at the time. Models trained to predict the most likely next word tend to produce text that is, by design, less surprising. So low perplexity is one fingerprint a detector looks for.
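To make the idea concrete, here is a minimal sketch of the textbook perplexity formula: the exponential of the average negative log-probability per token. The token probabilities are made-up numbers for illustration. A real detector would take them from its own scoring model, and may use a different variant of the calculation.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(mean negative log-probability per token)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

# Hypothetical probabilities a scoring model assigned to each token.
predictable = [0.9, 0.8, 0.85, 0.9]   # the model saw these words coming
surprising = [0.2, 0.05, 0.4, 0.1]    # the model kept getting surprised

print(perplexity(predictable))  # low: close to 1
print(perplexity(surprising))   # noticeably higher
```

The lower bound is 1, which would mean the model predicted every token with full confidence. The more often the text surprises the model, the higher the number climbs.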
The second is burstiness, which measures variation in sentence structure and rhythm. Humans write in bursts. A long winding sentence followed by a short one. A fragment. Then a paragraph that runs on because the writer got excited. Machine-generated prose, especially from a default ChatGPT prompt, often settles into a smoother, more even cadence where sentences cluster around a similar length and complexity. Low burstiness is the second fingerprint.
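There is no single agreed formula for burstiness. One simple stand-in, used here purely as an assumption for illustration, is the coefficient of variation of sentence length: the standard deviation of the word counts divided by their mean. The sentence splitter below is deliberately naive and the two sample texts are invented.

```python
import re
import statistics

def sentence_lengths(text):
    """Split on ., ! or ? and return the word count of each sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def burstiness(text):
    """Coefficient of variation of sentence length (std dev / mean)."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # not enough sentences to measure variation
    return statistics.pstdev(lengths) / statistics.mean(lengths)

even = "The cat sat on the mat. The dog lay on the rug. The bird sat on the sill."
varied = ("Stop. The cat, having considered every option available to it "
          "that morning, sat on the mat. Why?")

print(burstiness(even))    # every sentence is six words, so no variation
print(burstiness(varied))  # one long sentence between two one-word ones
```

A production detector would likely look at more than length, for example variation in syntax or in per-sentence perplexity, but the principle is the same: measure how uneven the writing is.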
A detector combines signals like these, usually many more than two, and produces a probability. That is the engine. Now the details.
How the score is produced, step by step
Modern detectors do not eyeball perplexity and call it a day. The typical pipeline looks like this.
First, the text is broken into tokens, the small chunks (roughly word-pieces) that language models actually operate on. The detector runs a model over those tokens and records, for each one, how likely the model thought that token was. This produces a long sequence of probabilities, a kind of confidence trace through the whole document.
Second, the tool computes features from that trace. Average predictability, the variance of predictability, how the predictability rises and falls across sentences, where the most "model-like" stretches are. Some detectors also feed the raw text into a separate classifier that has been trained on thousands of labeled human and AI samples, learning patterns no hand-written rule would catch.
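As a rough sketch of this second step, the function below turns a per-token probability trace into three simple features: average predictability, its variance, and the start of the most predictable stretch. The feature names, the window size, and the sample trace are all assumptions chosen for illustration, not the feature set of any particular detector.

```python
import math
import statistics

def trace_features(token_probs, window=3):
    """Summarize a per-token probability trace into a few simple features."""
    if len(token_probs) < window:
        raise ValueError("trace is shorter than the window")
    log_probs = [math.log(p) for p in token_probs]
    # Sliding-window averages show where the text is most predictable.
    windows = [
        statistics.mean(log_probs[i:i + window])
        for i in range(len(log_probs) - window + 1)
    ]
    most_predictable_start = max(range(len(windows)), key=windows.__getitem__)
    return {
        "mean_log_prob": statistics.mean(log_probs),
        "variance_log_prob": statistics.pvariance(log_probs),
        "most_predictable_start": most_predictable_start,
    }

# Hypothetical trace: a predictable run in the middle of less predictable text.
trace = [0.1, 0.3, 0.9, 0.95, 0.9, 0.2]
features = trace_features(trace)
print(features)
```

Here the most predictable window starts at index 2, the run of high-probability tokens in the middle of the trace.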
Third, those features are scored, often passage by passage rather than all at once. This is why good detectors can highlight specific sentences as more or less likely to be AI-generated instead of stamping a single number on the whole essay. A document is rarely all human or all machine. A student might draft most of it themselves and paste in two AI paragraphs, and a sentence-level detector is built to surface exactly that.
Fourth, everything is rolled up into the summary you actually see: a percentage, a band like "likely AI," or a color. That final number is a compression of all the underlying signal, which is both convenient and dangerous, because it hides how confident the tool really is.
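The last two steps can be sketched as follows. Per-sentence scores are taken as given (they would come from an upstream classifier), then rolled into a headline number, a band, and a list of flagged sentences. The word-weighted average, the band cutoffs, and the flagging threshold are assumptions made for this example. Real tools choose their own.

```python
def summarize(sentence_scores, flag_at=0.8):
    """Roll per-sentence AI scores (0..1) into a headline number and band.

    sentence_scores: list of (sentence, score) pairs. The scores are inputs
    here, assumed to come from some upstream classifier.
    """
    total_words = sum(len(s.split()) for s, _ in sentence_scores)
    if total_words == 0:
        raise ValueError("no text to summarize")
    # Word-weighted average, so long sentences count for more (an assumption).
    overall = sum(len(s.split()) * score for s, score in sentence_scores) / total_words
    if overall >= 0.7:
        band = "likely AI"
    elif overall >= 0.3:
        band = "uncertain"
    else:
        band = "likely human"
    flagged = [s for s, score in sentence_scores if score >= flag_at]
    return {"overall": overall, "band": band, "flagged": flagged}

# Invented essay with invented scores.
essay = [
    ("I wrote this part late at night.", 0.10),
    ("In conclusion, technology has many benefits and drawbacks.", 0.95),
    ("My brother disagreed, loudly.", 0.05),
]
report = summarize(essay)
print(report)
```

In this example the headline lands in the "uncertain" band, while the sentence-level view shows one strongly flagged sentence surrounded by two that look human. That is the information the single number compresses away.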
Why detectors disagree with each other
Run the same paragraph through three ChatGPT checkers and you may get three different answers. This is not a glitch. It is a direct consequence of how they are built.
Each detector uses a different underlying model to measure predictability, and was trained on a different mix of human and AI text. One might have been tuned heavily on student essays, another on news articles, another on generic web text. They weight perplexity and burstiness differently. They set their decision thresholds in different places, with some erring toward catching more AI at the cost of more false alarms, and others doing the reverse.
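The threshold effect alone is enough to produce disagreement. The sketch below applies two different cutoffs to the same set of scores. The scores and labels are invented, and the function name is hypothetical.

```python
def confusion_at(threshold, scored):
    """Count outcomes when every score >= threshold is called 'AI'.

    scored: list of (score, is_ai) pairs with known ground truth.
    """
    false_alarms = sum(1 for score, is_ai in scored if score >= threshold and not is_ai)
    missed = sum(1 for score, is_ai in scored if score < threshold and is_ai)
    return {"false_alarms": false_alarms, "missed_ai": missed}

# Made-up scores for six texts whose true origin is known.
samples = [
    (0.95, True), (0.80, True), (0.55, True),
    (0.65, False), (0.30, False), (0.10, False),
]

strict = confusion_at(0.9, samples)   # flags only the most confident cases
lenient = confusion_at(0.5, samples)  # flags anything above the midpoint
print(strict)   # no false alarms, but two AI texts slip through
print(lenient)  # catches every AI text, at the cost of one false alarm
```

Neither setting is wrong. They reflect different choices about which mistake is more costly, and two tools that made different choices will disagree on the borderline texts.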
There is also the moving-target problem. Detectors are trained on outputs from particular model versions. When OpenAI ships a new version of ChatGPT, or a student asks it to "write more casually" or "vary your sentence length," the statistical fingerprint shifts. A detector tuned on last year's output can lag behind this year's. Detection and generation are locked in a quiet arms race, and the detectors are always responding to a model that has already moved.
So disagreement is the normal state of affairs. The right response is not to pick whichever tool gives the answer you want, but to treat each score as one noisy reading rather than a verdict.
What the percentage does and does not mean
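Here is the single most important thing to internalize: a result of "90% AI" does not mean there is a 90% chance a student cheated. At best, it means the text has statistical properties that, in the data the tool was trained and calibrated on, belonged to AI-generated writing about 90% of the time. Those are different claims. The second says nothing about how common AI use is among the students you are checking, or about how closely their writing resembles the tool's training data.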
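A short worked example shows how far apart a flag and a conclusion can be. It uses Bayes' rule with a hypothetical detector that catches 90% of AI text and wrongly flags 5% of human text. These rates are assumptions for illustration, not measurements of any real product.

```python
def prob_ai_given_flag(base_rate, true_positive_rate, false_positive_rate):
    """Bayes' rule: P(AI-written | flagged)."""
    flagged_ai = base_rate * true_positive_rate
    flagged_human = (1 - base_rate) * false_positive_rate
    return flagged_ai / (flagged_ai + flagged_human)

# Same hypothetical detector, two different classrooms.
rare = prob_ai_given_flag(base_rate=0.05, true_positive_rate=0.90, false_positive_rate=0.05)
common = prob_ai_given_flag(base_rate=0.50, true_positive_rate=0.90, false_positive_rate=0.05)

print(round(rare, 3))    # AI use is rare: a flag is close to a coin flip
print(round(common, 3))  # AI use is common: the same flag is far stronger
```

When only 1 in 20 submissions is AI-written, just under half of the flagged texts in this scenario actually are. When half of all submissions are AI-written, the figure rises to about 95%. The detector did not change. The population did.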
Why the gap matters in practice:
- Some humans naturally write "machine-like" prose. Highly structured, even-paced, formulaic writing can score as AI even when a person produced every word. English-language learners, students taught rigid five-paragraph templates, and technical writers are especially prone to false positives because their style is, by training, low in burstiness.
- Short samples are unreliable. Perplexity and burstiness need room to show themselves. A 40-word answer simply does not contain enough signal, and detectors are far less trustworthy on anything under a few hundred words.
- Light editing can flip the score. Paraphrasing a few sentences, swapping vocabulary, or asking the AI to write less predictably can pull a confident "AI" result down toward the middle, where the tool is honestly unsure.
A detector score is evidence, in the way a smoke alarm is evidence. It is a reason to look more closely. It is not a conviction.
Common misconceptions
"The detector found proof." It found a probability. There is no proof inside the text itself, because ChatGPT does not leave a signature. Anyone selling certainty is overselling.
"A 0% AI score clears the student." Low scores can be false negatives. A student who edited AI output, or used a tool specifically built to evade detection, may sail through clean. Absence of a flag is not evidence of original work.
"If I run it twice I will get the same number." Many detectors are deterministic, but plenty produce slightly different results across runs or across small text changes. The number has a margin of error even if the interface does not show one.
"Detectors can tell which AI model was used." Some make a guess, but reliably distinguishing ChatGPT from a competing model from a heavily-edited hybrid is far harder than detecting machine involvement at all. Treat any specific model attribution with extra skepticism.
How to actually use a ChatGPT detector
The tool is a starting point for a conversation, not the end of one. A few practices keep it honest.
Read the sentence-level highlights, not just the headline percentage. A document flagged 70% overall is much more useful when you can see which two paragraphs drove that number.
Weigh the score against everything you already know: the student's prior writing, the draft history, whether the voice in the essay sounds like them. When a detector contradicts a folder full of consistent earlier work, suspect the detector first, not automatically the student.
Use a score as a reason to ask, never as the answer. "This came back with a high AI signal, walk me through how you wrote it" is a fair conversation. "The computer says you cheated" is not, and it does not survive contact with a false positive.
And give the detector enough text to judge. Run full assignments, not fragments, and be openly cautious about anything short.
ChatGPT detection is real, it is useful, and it is getting better. It is also probabilistic, fallible, and easy to misread as something more certain than it is. The teachers who get the most out of these tools are the ones who understand exactly that: a detector hands you a well-informed guess, and the judgment of what to do with it stays human.

