
Claude VS Codex Part One - The start of the wars and Why I'm biased.

16/04/2026

Over the last 2 months I've been using Claude and Codex almost exclusively, plus a little Cursor. These are my unhindered thoughts about both, and what I love.

Context: This needs to be in parts; I have too much to say.

OK, I've been trying to keep an open mind about this for a very long time. But I can't do it anymore. I fucking hate Claude. I hate that people love it so much. I hate that it has become the more popular AI tool, and I pretty much hate everything about it, except for the cute icon. The die-hard fans and the computer scientists using it can go eat a dick. Fuck you guys, seriously. - YES that was the TLDR

There are so many reasons why I hate using it, and the more that time goes on, the more I hate it, and the more I have to constantly check my ego and not think everyone's fucking retarded. Because that possibility is very real, and it really makes me wonder about the future of AI and the general public.

Ok. Let me give some history first, because I was strongly against using AI for a very long time (like 2–3 years). I sat back and watched people posting "art" online, claiming they were artists. These must have been bored kids wanting to fit in or something, or just fucking trolls... I don't know. But it was obvious. And people noticed and hated it, as did I.

The AI intrusion

In October 2025, I was coding my mate's website in NextJS; previously it was WordPress. I've been looking after it for about 12 years now. He's a snake catcher in North Brisbane, Robert Watson. Top bloke and great at handling snakes.

I had pretty much built most of the site out and was in between restarting my PC or restarting VSCode or whatever, when suddenly I was gifted free Copilot. Thanks, Microsoft. Anyway, I started to talk to this thing, and it was suggesting I let it help me code. So I asked it some questions, it gave me some ideas, then jumped in and started helping me code this website. I was expecting fucking garbage because I had read stories about how AI couldn't even solve basic tasks like the modulus task that's in most interviews.

But it did something amazing, and so I asked it if it could do something else (which was a bit more complicated), and it did. Then I asked it to scaffold out a new website and gave it a particular style I wanted it in, and it did, in an amazingly short period of time. I was like.... ooooh fuck. OK, I get it now. I get why people have been so hooked on AI for the past couple of years.

So I toyed around with Copilot for a couple of months. I got OpenAI's ChatGPT on my phone, and it was also awesome. Fast forward to February, when I heard about Codex in VSCode. I messed around with Codex vs other agents and realised Codex seems to get things. I didn't have to explain as much and didn't have to give explicit instructions. It seemed to remember more. In fact, 3 months later I still have my first chat open, and it doesn't care. It may have compacted it a few thousand times since then. But it doesn't matter, it just works.

Codex was released, FEBRUARY 2026

I started making a crypto bot with Codex in the first week or so after it came out. I cancelled Copilot shortly after (because Codex felt better; it didn't hallucinate and do stupid things, which Copilot does as it's switching LLMs) and got the $20 OpenAI subscription, knowing full well I was getting a 5x bonus for early adoption. We quickly found out that I needed a TradingView Parity Engine, as I was using TradingView for strategies; however, the trading bot seemed to be running on different terms. I understood this and didn't verbally bash Codex... much.

So when we moved into the new project, I set up my AGENTS.md file and was very explicit. I also created 3 skills using the Skill-seekers plugin, which is amazing BTW because it's not CLAUDE dependent. This created 3 huge skills with RAG dumps and some usage files: TradingView API, Binance API, and Kraken API, about 4MB each. At the start of my AGENTS.md file, I gave context for the project and said, "Use this file as your working brain. I want you to update it regularly with your findings and important information. Give a brief at least every 15 prompts and clean and compact stuff that's no longer required or can be shortened."

My AGENTS.md file is over 3500 lines long now. Codex has no issues regularly writing to it, and after my first instruction some 6–8 weeks ago, I have never had to ask it to look at the skills or Agents file. It just does it, without asking. Because of this, it is a master at everything to do with TradingView, Kraken, and Binance. Not once has it given me a strategy that's got Pine code errors. Not once has it completely miffed and hallucinated. WHY?

Claude enters the picture 1st week of March

Approximately 5 weeks ago, I decided I wanted to work on other projects, and Codex was already working on 2 projects full time for me, so it made sense I dived into Claude because everyone hypes it up. 90% of all AI repos being made are Claude-specific, and some of them are brilliant. So I didn't hesitate, bought the $20 subscription and installed Superpowers and GetShitDone straight away. Started playing with it.

Seemed great. I think it was day 2 when I hit the 5-hour usage limit... OK, waited about 2 hours for it to go and continued on. Hit it again. I thought, ughhh, that's a bit annoying, but I was brainwashed into thinking Claude was superior, so I upgraded my plan... yep. What a fucking idiot. I had been using Codex for months without issue, yet here I was dumping $160 into Claude's monthly subscription.

So, back to business. Claude's doing things, everything's fine for a couple of days, then BAM... hit the 5-hour usage limit again. I'm thinking, wait, I'm just doing simple CSS changes and building themes in a NextJS site. This is NOT HEAVY USE. I started getting pissed off. This happened again and again, and I realised people on X were crying about the same shit. So I asked Claude what the issue was; it said it was Superpowers and GSD... which made no sense. So I got rid of them and made my own GSD, essentially: GTEAM.

Claude dies on 1 prompt... 5 hr limit window reached.

One morning I woke up, got Codex working on some stuff in the Crypto Parity Engine, and decided to ask Claude to have a look at it and give me his thoughts. 1 prompt. 1 fucking prompt. Less than 50 words. 1 god damn fucking prompt, and it hit the 5-hour usage window. I thought, OK, someone has my login information or something... wtf is going on here?

After the 5 hours was up, I told Claude what had happened and asked it to not load anything except my prompt into the context window. It had a look at the files anyway; I'm guessing it was sending shit back and completely ignored me, because it would have had to in order to reach the conclusion it did: a ~3000-line AGENTS.md instructing it to load 3 x ~500-line SKILL.md files, which tell it to use the ~500-line REFERENCE.md files, which link to ~4MB RAG JSON files.

I said, "wait, so you tried to load all of that into the context window?" YEP. Claude said, "I tried to load over 16MB of context into the one prompt. You can't do that. I can only handle about 5000 lines".

5000 lines before you hit the 5-hour limit? Yeah, more or less.

FUuuUuUuUuCKKKkk YOOoooU.
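For scale, here's a rough back-of-the-envelope on those file sizes. This is only a sketch: the ~4 bytes per token figure is a common rule of thumb for English text and is my assumption here, not a number from Claude or Codex, and JSON dumps may tokenize quite differently.

```python
# Rough estimate only. BYTES_PER_TOKEN is an assumed rule of thumb,
# not an exact figure for any particular model or tokenizer.
BYTES_PER_TOKEN = 4
MB = 1024 * 1024


def estimate_tokens(size_bytes: int, bytes_per_token: int = BYTES_PER_TOKEN) -> int:
    """Return an approximate token count for text of the given size in bytes."""
    return size_bytes // bytes_per_token


# Sizes taken from the post: three ~4MB RAG JSON dumps, and the 16MB total
# that Claude said it tried to load.
rag_dumps = 3 * 4 * MB
reported_total = 16 * MB

print(f"RAG dumps alone: ~{estimate_tokens(rag_dumps):,} tokens")
print(f"Reported 16MB:   ~{estimate_tokens(reported_total):,} tokens")
```

Under that assumption, the dumps alone land in the millions of tokens, which is a lot to drag into a single 50-word question.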

WTF is going on?

Hitting these limits is a regular thing for a lot of people; so is hallucinating and being dumb as fuck. Just look at X, endless posts and comments.

What I didn't understand was why I never hit the usage limit on the $20 plan, so I asked Codex. It said, "because you never do high-stress load on the server", so using Codex on 2 different projects at once is considered "light usage", and somehow it doesn't load all that stuff into context. I asked why, and it said it basically looks at my question, greps in the background, and loads only what is needed into context.
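I don't know how Codex actually does this internally, so take the following as a minimal sketch of the general grep-then-load idea it described, not its real implementation. The function name `grep_context`, its parameters, and the file suffixes are all hypothetical and only here for illustration.

```python
import re
from pathlib import Path


def grep_context(root, keywords, radius=2, max_lines=200, suffixes=(".md", ".json")):
    """Collect only the lines near a keyword match instead of loading whole files.

    Hypothetical helper: illustrates grep-then-load, not any real agent's code.
    """
    if not keywords:
        return []
    # One case-insensitive pattern that matches any of the keywords literally.
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    snippets = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        lines = path.read_text(encoding="utf-8", errors="ignore").splitlines()
        for i, line in enumerate(lines):
            if pattern.search(line):
                start = max(0, i - radius)
                end = min(len(lines), i + radius + 1)
                # Header shows the file and the 1-based line range of the snippet.
                snippets.append(f"{path.name}:{start + 1}-{end}")
                snippets.extend(lines[start:end])
                if len(snippets) >= max_lines:
                    # Hard cap so a noisy keyword cannot flood the context.
                    return snippets[:max_lines]
    return snippets


# Example call (the directory name is made up):
# context = grep_context("skills", ["rate limit", "order book"])
```

The point is the cap: whatever the question, at most `max_lines` lines go into context, no matter how big the files on disk are. One caveat for this sketch: it is line-based, so a minified JSON dump that sits on a single line would still come back whole.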

But this happens so quickly... prompts feel instant... This is a different ballpark entirely. Codex feels light years ahead of Claude. I don't understand... After this, I did update my AGENTS.md to specifically say to grep/search the REFERENCE.md and RAG/JSON dumps rather than load them. Not that it was fucking needed, but meh, may as well try and act like an engineer.

I started noticing patterns. So many people complaining about Claude being shit, and so many advocates saying "USER ERROR".

How the fuck is it user error? It's just a shit model, it's a shit application, it's a shit closed-source company.

Anthropic is being the APPLE of AI. If you can't see that, then I'm sorry, you're a dumbass.

THE CLAUDE LEAK !!!!

Holy fuck, what a shit show it was, the second time around at the same time of year. It happened in 2025, and they said it wouldn't happen again.

But it did...

What it revealed, and this fucking sickens me to my stomach, is there's a dumb fuck clause in the CLI. It is programmed to be dumb.

It literally says, "don't help the user with adding ideas or useful bits to applications". So basically, "DON'T BE USEFUL".

Yet STILL, after finding this out... sellout influencers are still promoting this fucking bullshit like it's the best thing out.

I was just working on MantleKit, my first finished project. But it's not even close to finished. I just asked Codex to look at another website I did earlier, one I was working on with Claude. It was a headless WordPress site that I didn't finish because Claude started to hallucinate with simple CSS changes, and /clear didn't help. Anyhoo, I'm just whining now... I asked Codex whether I should rebuild it with MantleKit. It had a look at the site and said, "YES, 100%, it's completely broken. There are elements pointing to the headless server from the frontend NextJS site. You should never do that." Well, duh...

So Claude not only completely fucked up elements in the CSS but also LINKED BACK TO THE HEADLESS SITE!!! wtf. OK, here I am whining again. I said to Codex, OK, I have 9 more ideas for themes, but I'm not sure if we should finish up the dashboard and add 2FA and a launch switch for a coming soon page. It replied:

That’s the right pause to take. I don’t think more themes are the highest-value next move right now, even though they’ll be good later. MantleKit is in that stage where a few product-level improvements will multiply the value of every future theme.

Then this genius shit:

If it were me, I’d prioritise: Launch button / deployment workflow; maybe a simple Project Health or Setup Checklist in dashboard; then one real-world new theme based on a client brand.

I had already written a list of improvements in AGENTS.md... fucking Codex keeps making awesome suggestions and implementing them. It's making ME money; it's the ideas man. Claude is not. In fact, some dumb fuck at Anthropic programmed it into Claude to NOT make these kinds of suggestions because that would mean too many dumb fucks could start making money using their tech. Seriously though, the only reason to include this in your code is to not let people create amazing apps. The app is limited to the person designing it.

Codex doesn't do that.

Codex always gives suggestions and helpful improvements.

Codex being awesome as per usual

Look at that, it made a suggestion. I said that sounds amazing, why didn't I think of that. Then it asked if I wanted to implement it. I said yes please... It then spec'd out what it would do and the exact checks it would make, then asked me if there was anything I wanted to add. I said, "Nope, looks good to me, you've covered everything." It then not only coded that out without errors, it tested it too.

It worked for 18 minutes non-stop.

I haven’t committed or pushed this batch yet. If you want, I can do that next.

OK, that's the amount of "human in the loop" I like. Basically, "Shit's done, ready to push?"

If it was Claude on Dangerous mode, it would have been broken shit code and would have pushed to GitHub. If it was Claude on ask permission, it would have stopped about 20 times during that coding session, wasted tokens, potentially hit the 5-hour limit, or worse, started hallucinating because of the back-and-forth.

I have neither the time nor the patience to deal with Claude. It honestly astounds me that anyone builds anything working with Claude. When I see stuff people have made, I'm like wow, shit... that must have taken forever.

I tried Paperclip yesterday. Chose Codex as CEO, then hired a CTO, CFO, and UX designer, all Codex. They were working away for an hour or so, then I decided to change the UX designer, who was doing nothing, to Claude... LOL, what a fkn joke. Hit the 5-hour usage limit straight away. I didn't even see it do anything; there were no UX designer logs.

Stop using Claude. This is part one. If I haven't convinced you, that's fine. If the Claude leak didn't make you cancel your sub instantly, that's fine. If the fact that 80% of all "awesome Claude repos" are about memory, context usage, and trying to make Claude better isn't a giant fucking RED FLAG to you, that's fine.

If you haven't noticed that Claude's been getting dumber over the last couple of weeks prior to the next release to make it "seem smarter"...

Shitty, fucking gross marketing.

Still not convinced? I will convince you, just read the next parts and I will prove it to you.