What is a context window? Why your AI chat degrades
Your long chat used to be sharp and now it forgets, repeats itself, or contradicts what it said earlier. The context window is the reason. Here is what it actually is, in plain English.
Every AI chat has a limit on how much text it can hold in mind at once. That limit is the context window. It is measured in tokens, small chunks of text that are often a word or a piece of a word. It is not a memory bank that grows forever. It is a fixed budget. Once you understand that, it becomes clear why long chats get worse.
The one idea: a fixed budget the model re-reads every turn
Here is the part most people never get told. The model does not "remember" your conversation the way a person does. Each time you send a message, it re-reads the whole thread from the top to figure out what to say next. Your system instructions, every message you sent, every reply it gave, and any files you pasted all go back in together, every single turn.
All of that has to fit inside one fixed budget. Think of it as a whiteboard of a fixed size. Everything the model can consider right now has to be written on that one whiteboard. There is no second whiteboard, and it does not get bigger as your chat grows.
The short version
The context window is a fixed amount of text the model can look at in a single turn. Your entire conversation competes for that space. When the conversation gets longer than the budget, something has to give.
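To make the "re-reads everything every turn" idea concrete, here is a minimal sketch in Python. It is an illustration, not how any particular product is built: `rough_tokens` and `prompt_size` are hypothetical helpers, and counting words is only a crude stand-in for a real tokenizer.

```python
# Hypothetical sketch: every turn, the full thread goes back to the model.
# rough_tokens() is an assumption: real tools count tokens, not words.

def rough_tokens(text: str) -> int:
    """Approximate a token count as the number of whitespace-separated words."""
    return len(text.split())


def prompt_size(system: str, messages: list[str]) -> int:
    """Size of what the model reads this turn: instructions plus every message so far."""
    return rough_tokens(system) + sum(rough_tokens(m) for m in messages)


system = "You are a helpful writing assistant."
thread: list[str] = []
sizes: list[int] = []

for turn in ["Draft a launch email for our app.", "Make it shorter.", "Add a subject line."]:
    thread.append(turn)                         # your new message joins the thread
    sizes.append(prompt_size(system, thread))   # the whole thread is read again
    thread.append("(model reply) " + turn)      # the reply joins the thread too

print(sizes)  # [13, 25, 34]
```

The number grows every turn even though each new message is short, because nothing that came before is ever left out of the count.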
Why long chats get worse
At the start of a chat, everything fits comfortably. The model can see your original goal, your constraints, and every decision you made along the way. It feels sharp because it genuinely has the full picture in view.
As the thread grows, the whiteboard fills up. Eventually there is more conversation than the budget can hold. At that point the tool has to make room, and it does so by dropping or compressing the parts it judges least relevant right now. That is usually the oldest content: your first message, where you often stated the real goal and the rules that matter most.
So the failures you notice are not random. They are predictable side-effects of a full budget:
- It forgets earlier decisions. The turn where you locked something in has scrolled out of view.
- It repeats questions you already answered. The answer is no longer in what it can see.
- It contradicts itself. It can only reason over what is currently on the whiteboard, not the parts that dropped off.
- It drifts off your original goal. The recent messages weigh more heavily than the beginning that framed everything.
People call this "context rot": the slow decay of quality in a chat that has simply outgrown its budget. Nothing broke. The window just filled up.
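The forgetting described above can be sketched with one simple trimming strategy: keep the newest messages that fit and drop everything older. This is an assumption for illustration only. Real tools may summarize or compress instead, and `fit_to_budget`, `rough_tokens`, and the budget of 20 are hypothetical.

```python
# Hypothetical sketch of "drop the oldest content when the budget is full".
# Word counts stand in for tokens (assumption).

def rough_tokens(text: str) -> int:
    """Approximate a token count as the number of whitespace-separated words."""
    return len(text.split())


def fit_to_budget(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose combined size fits the budget."""
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):       # walk from newest to oldest
        cost = rough_tokens(msg)
        if used + cost > budget:
            break                        # this message and everything older is dropped
        kept.append(msg)
        used += cost
    return list(reversed(kept))          # restore chronological order


thread = [
    "Goal: write a formal launch email, no emojis.",
    "Here is a first draft of the email.",
    "Make the second paragraph shorter.",
    "Now add a subject line.",
]

visible = fit_to_budget(thread, budget=20)
for msg in visible:
    print(msg)
```

With this budget, the first message (the goal and the "no emojis" rule) is the one that falls out of view, which is exactly the kind of loss that shows up later as drift or contradiction.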
Why you can't fix it by asking nicely
The instinct is to type "please remember what we agreed earlier." But if that agreement has already dropped out of the window, there is nothing to remember. You are asking the model to recall something it can no longer see. Adding more messages only pushes more of the early context out, which makes the problem worse, not better.
This is also why re-explaining inside the same thread rarely sticks. The re-explanation itself takes up budget, and the underlying cause, an overloaded window, is still there.
The real fix: a fresh chat plus a handoff
A brand-new chat starts with an empty budget. Everything you put in it gets full attention again. The trick is to carry over only what matters, instead of dragging the entire bloated history along.
The short summary you carry over is called a handoff: a compact reboot prompt that captures the goal, the locked decisions, the constraints that kept coming up, and where things stand right now. Paste it as the first message in a new chat and the model is back to full sharpness, minus the dead weight.
You can write one by hand:
- State the goal. One or two sentences on what you are actually trying to produce.
- List the locked decisions. Things already settled that should not be reopened.
- List the standing constraints. The rules you found yourself repeating.
- Describe the current state. What is done, what is in progress, and the immediate next step.
Keep it tight. The whole point is to spend as little of the new budget as possible while still giving the model everything it needs. A focused handoff beats a giant transcript every time.
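If it helps to see the four parts as a template, here is a minimal sketch that assembles them into one prompt. The `Handoff` class, its field names, and the section labels are hypothetical choices for illustration; any layout that keeps the four parts distinct would work.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    """Hypothetical container for the four parts of a handoff."""
    goal: str
    decisions: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    current_state: str = ""

    def render(self) -> str:
        """Assemble the parts into a single block of text to paste into a new chat."""
        lines = ["GOAL", self.goal, "", "LOCKED DECISIONS"]
        lines += [f"- {d}" for d in self.decisions]
        lines += ["", "STANDING CONSTRAINTS"]
        lines += [f"- {c}" for c in self.constraints]
        lines += ["", "CURRENT STATE", self.current_state]
        return "\n".join(lines)


handoff = Handoff(
    goal="Write a launch email for our app.",
    decisions=["Formal tone", "Under 150 words"],
    constraints=["No emojis"],
    current_state="Draft two is approved; the next step is the subject line.",
)

prompt = handoff.render()
print(prompt)
```

The output is a few short lines rather than a full transcript, so it uses very little of the new chat's budget.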
The faster way
Let Uncook write the handoff for you
Reading back through a long chat to pull out the goal and decisions is exactly the tedious part. Paste a share link to your ChatGPT or Claude conversation and Uncook does it for you — your goal, locked decisions, repeated constraints and current state, already assembled into a clean reboot prompt. Skim it, paste it into a fresh chat, and keep going.
Uncook my chat →
Honest about your data: pasted text is analyzed in your browser; a share link is fetched once through our server to read the conversation, then discarded. It is never stored and never used for training. A share link makes the chat viewable by anyone with the URL; un-share it once you're done.