Kaden Bilyeu

@bikatr7

CS/Japanese Senior @UCCS | Jr Full-Stack Eng @ NGi | Working on @NegationGame | views are my own

UCCS | Colorado Springs

Joined February 2024

165 Following

2K Followers

Pinned

Kaden Bilyeu@bikatr7 · Mar 20

I just wrote my first blog post in 3 months. I have class in 3 hours, oof. Here it is: kadenbilyeu.com/blog/82b60d0e-… It's regarding why I find Sonnet 3.5 so human compared to literally every other model. Also why GPT-4.5 isn't there yet, and how 3.7 is sadly a downgrade in that aspect.

13.0K

Kaden Bilyeu@bikatr7 · 11 h

Oh Core Flight System, how I hate you so much

120

Kaden Bilyeu@bikatr7 · 11 h

Grok 4 is absolutely not the smartest model by raw intellect, lmfao. Not even close. It's a good model, but it's too expensive and falls behind o3 in intelligence and Opus on agency. Gemini 2.5 Pro is still excellent, but it's falling out of my favor as it's getting really…

Sauers@Sauers_ · Jul 28

Am I wrong or is OpenAI not really on the capabilities frontier right now?
Best agentic ability: Claude 4 Opus
Best raw intellect: Grok 4
Best over long context: Gemini 2.5 Pro

794

Kaden Bilyeu@bikatr7 · 12 h

It's actually so over 😭

232

Kaden Bilyeu@bikatr7 · 13 h

bruh these aren't even the good vpns

Dmitrii Kovanikov@ChShersh · 24 h

Apparently, gooning in the UK is so essential for the country to function, VPN is #1 app in the productivity chart

846

Kaden Bilyeu@bikatr7 · 13 h

o3-pro is still the smartest publicly available model btw. It's not even close either. Even if I prefer other models like Sonnet or Opus for daily programming, I don't think any model has come remotely close to simply solving tough issues or actually surprising me like o3.

232

Kaden Bilyeu@bikatr7 · Jul 27

I don't get why people think like this, because literally every study we have shows it's completely bullshit lmfao.

__s.a.m.e.m.e.m.e_@st_louis_stan · Jul 27

As we all know, SSRIs and therapy are a joke. The human brain is actually a very finely-tuned, sophisticated machine and conditions like depression+anxiety are reactions to very real environmental conditions and stressors, not "chemical imbalances" that you were randomly cursed…

460

Kaden Bilyeu@bikatr7 · Jul 27

I’m fucking stupid. I backed up an encrypted system. But Macrium Reflect backs up a live OS. So when I booted into this it’s expecting a decryption key. But the system isn’t encrypted, and it’s impossible to actually get the password right since again, the fucking encryption…

Kaden Bilyeu@bikatr7 · Jul 27

Done. Need to add a memory stick whenever it gets here (fuck you crucial for sending me a defective stick). But otherwise waiting on an image restore and i’m solid. Laptop has been put aside, pending a home lab transformation whenever I can get around to that. Putting windows…

1.0K

Kaden Bilyeu@bikatr7 · Jul 27

Kaden Bilyeu@bikatr7 · Jul 15

Recently decided that after 6 years since my last desktop pc i’m going to finally splurge and ditch laptop at home. Waiting on parts to be delivered but:
CPU: AMD Ryzen 9 7900X – 12 cores / 24 threads
Motherboard: MSI MAG B650 Tomahawk WiFi (AM5, ATX)
Memory: Corsair…

2.0K

Kaden Bilyeu@bikatr7 · Jul 27

Amazon fucking sold me a bad ram stick…. Boots with one. Now I gotta order another pack sigh

Kaden Bilyeu@bikatr7 · Jul 25

Soon

404

Kaden Bilyeu@bikatr7 · Jul 25

If you haven’t eliminated em dashes out of your vocabulary you’re honestly harming yourself lol. My grammar was never anything special but thankfully I have some mannerisms that don’t look LLM-ish. Completely unrelated but a good anonymizer is feeding your text through an LLM on…

Garry Tan@garrytan · Jul 25

How long before people intentionally put in typos to make sure people believe an LLM didn’t write a thing?

849

Kaden Bilyeu@bikatr7 · Jul 25

Was not happy about the vet

387

Kaden Bilyeu@bikatr7 · Jul 25

> hack > attributed to vibe coding > look inside > firebase misconfiguration Every time

Crémieux@cremieuxrecueil · Jul 25

The Tea app has been hacked, and you can go download 59.3 gigabytes of user selfies right now. The hack is real. A picture from someone I know who signed up just to see what was on there was in it. This was an obviously vibe-coded app and was bound to be insecure.

2.0K

44.0K

Kaden Bilyeu@bikatr7 · Jul 25

Soon

Kaden Bilyeu@bikatr7 · Jul 15

1.0K

Kaden Bilyeu@bikatr7 · Jul 24

"1M+ bugs" while admitting that half of them weren’t even real isn’t *great*. It’s an admission of noise at scale. That’s a 50% false positive rate clogging dev workflows and wasting time. Silent on false negatives too: how many actual bugs slipped through such a system? Without…

Cursor@cursor_ai · Jul 24

In the past month, Cursor found 1M+ bugs in human-written PRs. Over half were real logic issues that were fixed before merging. Today, we're releasing the system that spotted these bugs. It's already become a required pre-merge check for many teams.

483

Kaden Bilyeu@bikatr7 · Jul 24

Looking at my old code is like coming to lucidity at a murder scene, but I'm holding the weapon.

321

Kaden Bilyeu@bikatr7 · Jul 24

Former employer of mine sent the letter of recommendation I had requested at 4am this morning 😭 Told them to go to sleep but quickly realized I’m not one to talk since I was also up at 4 lmfao

300

Kaden Bilyeu@bikatr7 · Jul 24

Real photo of me talking to Claude after it lies to my face directly for the eighth time in a row after responding: "You're absolutely correct! I made a mistake."

140

Kaden Bilyeu@bikatr7 · Jul 24

Literally the only benchmark I needed for Grok 4 aside from my own vibe check was Taelin and it confirms that Grok 4 is kinda meh compared to the other 2 models that actually matter for SOTA

Taelin@VictorTaelin · Jul 23

sorry my verdict on Grok-4 is that it is not better than Opus for coding, and not better than o3 for reasoning. I don't think it has been trained on benchmarks, but I think its brain is deep fried into a problem-solution mindset that doesn't extend to real-world situations...…

310

Kaden Bilyeu@bikatr7 · Jul 24

It’s crazy how being called European is an insult 😭😭😭

628

Kaden Bilyeu@bikatr7 · Jul 24

The amount of money I spent unfucking my LLC paperwork because I was a dipshit a few years ago is insane. You should totally have an LLC though.

206