You've Been a Bad Agent | Transcript: Structuring Codebases for AI, Claude Code in GitHub, Scale Acquired! Granola Cafe, AI Rules and More MCP


June 12, 2025 / 01:03:46/E4

Wilhelm Klopp (00:00)
Back in the studio. Good to see you Matt.

Matt Carey (00:02)
Dude, lovely to see you too.

Wilhelm Klopp (00:04)
How's it going man? We're doing this a lot earlier than normal. How are you feeling?

Matt Carey (00:06)
Yeah, I'm a bit sleepy, not gonna lie. I did that thing yesterday where I got really stoked on... I've been watching Yellowstone, you know, the series. I don't know, I'm assuming you guys get it. It's about ranching in Montana, and I got really stoked on it at like midnight. You should not get really stoked on anything at midnight. And yeah, I definitely watched.

Wilhelm Klopp (00:16)
No. Tell me more.

Wait, so what do you mean? You got like really hyped up about it and you wanted to watch it.

Matt Carey (00:30)
Yeah, I was like just close to the end of the series and then I ended up watching it and I was like, it's a bit late now. I'm silly.

Wilhelm Klopp (00:34)
Cool, cool. No way. Yeah, I feel like I remember seeing some ads for this many, many years ago.

Matt Carey (00:41)
It's really good.

It's really good. And just the scenery is beautiful. that's fun.

Wilhelm Klopp (00:45)
Does it make you want to live in Montana?

Matt Carey (00:47)
on a ranch in Montana. No, there's like a crazy amount of fighting. They just keep beating each other up, which I'm not the biggest fan of, but the scenery's lovely. It makes me want to go ride some more horses.

Wilhelm Klopp (00:58)
nice, that's good, Yeah, there used to be a lot more fighting in our lives, right? Back in the day. We all used to fight a lot, right? What happened?

Matt Carey (01:05)
This is pretty modern.

I don't know. I was watching an episode where they... I'm going to spoil it now for anyone, so turn this off. Basically, some of the farmhands get beaten up in a pub because they were starting with someone else. And so they go back to the pub in the evening and they just let a bull off in the pub. They just release a bull into the pub.

Wilhelm Klopp (01:25)
God.

Matt Carey (01:26)
And then as everyone's coming out, they just like beat them up with baseball bats. And it's like, holy shit. This is so extra. Yeah.

Wilhelm Klopp (01:32)
Things get pretty intense.

Yeah, what's the closest equivalent to that we see in tech? You don't really see that. Maybe the whole Deel v Rippling drama.

Matt Carey (01:43)
I was literally gonna say that, some like Twitter spat that like has something underlying. Yeah.

Wilhelm Klopp (01:50)
Yeah, we don't see a lot of fights, mostly for the better, I would imagine.

Matt Carey (01:54)
Did you see, okay, speaking of that sort of thing, some crazy stuff happened in the last few weeks. Did you see that Meta bought Scale AI? 49%.

Wilhelm Klopp (01:57)
Ha, yeah, mm-hmm, mm-hmm. I saw something, some stuff about this. Yeah, yeah, yeah. Yeah, fascinating. I mean, congrats to Scale, everyone at Scale, I guess.

Matt Carey (02:11)
Yeah, I mean, it's pretty cool, but it's also like, you're buying the distribution of data. You're buying the collection of data, which is kind of interesting. And you buy 49% of it. I mean, the rumors are like you buy 49% so then you don't cause like any sort of regulatory pressure, which is kind of interesting.

Wilhelm Klopp (02:19)
Mm-hmm.

Yeah, yeah, yeah, I totally buy it. Although I feel like also with these acquisitions, it's always a little bit like... I just remember when GitHub was acquired by Microsoft and everyone on Hacker News... if you were just reading Hacker News at the time, you would think like, yeah, these guys have figured out all the reasons for why this happened and what it means. And if you have the inside perspective, it's just like, wow.

Matt Carey (02:48)
You

Wilhelm Klopp (02:54)
So much of this is just like completely wrong. But like, yeah, I don't know. Who knows, I guess. I mean, like, do you think it's related to this whole like Mark Zuckerberg wanting to like staff up like a much better like internal AI lab? Like that was the other rumor that was going around like the past few days, paying like seven to nine figure salaries to people who are like coming from OpenAI or whatever.

Matt Carey (03:10)
Yeah.

Yeah, well, I think they've had a massive talent drain recently. Like you see all the time, oh, I left Meta, it was great working on Llama, but now I'm joining Anthropic. And it's like, yeah. So I think that's happened. That's like been an actual phenomenon recently. And I don't know if that's just caused by how bad Llama 4 was. I don't know, I'm assuming something along those lines, like people.

Wilhelm Klopp (03:26)
I see, Yeah, yeah, yeah.

Yeah.

Matt Carey (03:40)
There's like a researcher cred, right? And like, if you're not working on the best model, then you've got no cred. So they all go join Anthropic instead. Yeah.

Wilhelm Klopp (03:49)
Yeah, that makes sense to me. That's funny. I wanted to start this week with some letters from listeners. And then I want to ask you about what you've been up to. And I think we should both talk about what we've been up to, like just in the past week, because I think that'll be interesting to a lot of people. So the first one was a friend over the weekend mentioned that I gave a shout out last week to the Cortex podcast.

Matt Carey (03:54)
Wow.

Wilhelm Klopp (04:09)
And I just mentioned it in passing without saying, and I quote, that it's a god-tier podcast. Have you come across this podcast before? So like, just wanted to correct the record on that one. Say again.

Matt Carey (04:09)
Hmm.

You talked about it last week. I need to have a look.

Wilhelm Klopp (04:22)
Yeah, have you, are you familiar with it or have you listened to it ever?

Matt Carey (04:25)
literally there.

Wilhelm Klopp (04:26)
Yeah, it is a very, very good podcast. So everyone should check it out. It's just like

Matt Carey (04:32)
I've got like five on the go now, so there's only a certain amount that one can take, but I'll add it to my rotation.

Wilhelm Klopp (04:37)
That is very true, yeah.

It just fits the bill nicely. It's the same two people. The release schedule is actually just every month they put out one. I think the best selling point, the best pitch I can make for it is they do once a year a couple of episodes, which are just these mega monster, incredibly interesting episodes. We could even maybe do a format of one of these. So for example, they do State of the Apps.

So once a year they talk through all of the apps they use in their personal life, their professional life, like all the workflows, they talk about like what's in their home screen. They like review each other's home screens. And it's just like productivity porn galore. So like that's my sales pitch for it.

Matt Carey (05:18)
Okay, okay, I'm gonna think about it.

Wilhelm Klopp (05:20)
Sounds like you're not a fan of that idea. That would be super fun.

Matt Carey (05:23)
I think seeing my home screen, I think that's like, I mean, we need to go for dinner first.

Wilhelm Klopp (05:27)
No one else gets to see your home screen. I get to see your home screen. And then I get to review it. Okay. And then the other listener letter was from my brother actually, who has been listening and who says that you and I are just too nice to each other.

Matt Carey (05:29)
But just.

Okay, what if we mean a do it?

Wilhelm Klopp (05:43)
Which I thought

was interesting. Yeah, one to think about maybe.

Matt Carey (05:47)
did.

Not enough contention.

Wilhelm Klopp (05:48)
Not a... Yeah, exactly. Okay, so with that out of the way, what's been going on in your life the past week,

Matt Carey (05:55)
I went to the Granola co-working cafe on Sunday.

Wilhelm Klopp (05:58)
that looked incredibly cool.

Matt Carey (05:59)
It was really fun actually. Shrey's done an awesome job there. I'm not normally that stoked on doing organized fun on the weekends, especially when it's like organized fun plus organized work. But this was really cool. It was like a really nice group of people, completely different circle to people I'd met before in London.

Wilhelm Klopp (06:11)
Mm-hmm.

no way.

Matt Carey (06:19)
Yeah, it's really fun when you go to an event and it's like, it's not just the same people that I invite to my events. It's like they're tapped into like a different subculture of London. And that's sick. And yeah, this one was awesome. And first time having a Salad Project for me. So I think it was a big step in corporate... corporate Matt. Salad Project. You know, there's a

Wilhelm Klopp (06:24)
Totally.

Yeah, yeah, yeah.

Having what, sorry?

No.

Matt Carey (06:42)
They're like big pots of salad. Shrey got them for lunch, like one for each of us. It's like a really big pot. It's like a Farmer J's, but it's one meal.

Wilhelm Klopp (06:51)
Damn, has London lunch culture evolved as much since I left a few weeks ago?

Matt Carey (06:56)
I was like, this feels, it feels so corporate America. Cause you're like, everyone's individually like mandated salad box. It was massive. Like, there's that place in Soho that's really good.

Wilhelm Klopp (07:05)
Salad? That sounds great. Get me though.

Matt Carey (07:12)
Anyway, it was like three times the size of any salad box that I've had before. And it made me, yeah, I've now entered the workforce apparently, properly. That's all.

Wilhelm Klopp (07:22)
Oh because

you okay nice nice nice

Matt Carey (07:26)
So yeah, no, that was sick. Brought my girlfriend's brother along as well. He's like just thinking about getting a job in tech. So that was cool. He got to meet some real techies, not just like me and my friends. People that actually.

Wilhelm Klopp (07:35)
sick.

That's awesome.

What was the setup? Was it like a day of hacking on things on a Sunday?

Matt Carey (07:44)
Yeah, so you turn up at around 11, there's lunch at 12. Shrey was like, there's an hour between 12 and one, like the only hour that I'll mandate you to do something. And that's like, we go around the room and say who everyone is, everyone introduces themselves, what they're working on, like what they're struggling with, what they can help with, that sort of stuff. So it was a bit organized fun, but it was great. Cause like there were people that I'd never met before. And then I think I left around five.

Wilhelm Klopp (08:04)
Nice, nice, nice.

Matt Carey (08:11)
Released like two new things on Shippie, which was cool. Shippie's got subagents now. Yeah, well, I was pretty inspired by the Amp team. They looked like they were doing some awesome stuff with subagents. So I was like, I'll just try it and add it. And then when I added it, I was like, how can I tell if this is working better? So then I like ditched my evals pipeline and started making a new one. So if anyone's interested, Shippie has a new evals section, which...

Wilhelm Klopp (08:15)
know it. Nice.

yeah.

Mmm.

Matt Carey (08:37)
I think it's gonna be kind of cool. It's like a whole new framework. It's like scenario-based testing. It's like, given, fully jumping topics here, but given like, you've opened this file and you've opened this file and you've looked at this file, but this file has like an exposed secret. What should the next tool call be? And what should the summary have in it? And should it push a suggestion? If all of those things are true, then Shippie is working as expected.

Wilhelm Klopp (08:43)
Mmm.

No, no, the script.

Mmm.

Matt Carey (09:03)
So it's like very much scenario-based and you like write a little config for each scenario. And I think it's kind of cool. I've really been doing a lot of config-based coding recently, which, I don't know, it's kind of fun.
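
A minimal sketch of the scenario-based eval idea described here, assuming each scenario is a small config that states which files were opened, what the next tool call should be, what the summary must mention, and whether a suggestion should be pushed. The class names, field names, and tool names (`suggest_change`, `read_file`) are hypothetical and are not taken from Shippie.

```python
from dataclasses import dataclass, field


@dataclass
class Scenario:
    """One eval scenario: the setup plus the expectations (hypothetical schema)."""
    name: str
    files_opened: list[str]
    expected_next_tool: str
    summary_must_mention: list[str] = field(default_factory=list)
    expect_suggestion: bool = False


@dataclass
class AgentResult:
    """What the agent actually did when replayed from the scenario's state."""
    next_tool: str
    summary: str
    pushed_suggestion: bool


def evaluate(scenario: Scenario, result: AgentResult) -> dict[str, bool]:
    """Return one pass/fail flag per expectation in the scenario."""
    summary = result.summary.lower()
    return {
        "next_tool": result.next_tool == scenario.expected_next_tool,
        "summary": all(term.lower() in summary for term in scenario.summary_must_mention),
        "suggestion": result.pushed_suggestion == scenario.expect_suggestion,
    }


def passed(checks: dict[str, bool]) -> bool:
    """A scenario passes only if every individual check is true."""
    return all(checks.values())
```

Keeping the per-check flags separate, rather than returning a single boolean, makes it easier to see which expectation a change to the agent broke.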

Wilhelm Klopp (09:04)
Nice.

That's awesome. No, that sounds great. Wait, so was the eval change inspired by or like necessitated by this subagent change, or unrelated?

Matt Carey (09:23)
Well, I wrote the subagent tool. So essentially the subagent is just a tool that the main agent can call. And I wrote that and then I was trying to work out how you could call it, because by default it just wasn't getting called. And so then I added something for custom instructions based on the Claude Code setup. They have a little thing for custom instructions, I think.

Wilhelm Klopp (09:43)
Yeah, wait, you were tweeting something about this. It seems like there was some insight you uncovered or whatever, right? From the Claude Code system prompt?

Matt Carey (09:50)
Yeah, the system prompt. I hope I don't get sued by Anthropic. We'll talk about that in a sec. So yeah. Where was I going with this? So Shippie, yeah, subagents. Basically I was trying to work out how to test the subagents. And so I made a little bit of an evals pipeline. No, but Claude Code is sick, dude. Have you tried it?

Wilhelm Klopp (09:55)
Okay, okay.

Wait, wait, just on the subagents, like just for people who aren't familiar, as I understand it, the main point of the subagents... there's like two kinds of points, but the idea is like you're running your agent, right? You're like doing stuff, and then the subagents are a way for the main agent to delegate a task to the subagent, with the two benefits being that the subagent can go off and do something without polluting the context window of the main agent with like tons of tool calls and whatever. And then the other benefit is that you can have multiple subagents at the same time. So you can like parallelize some stuff.

Matt Carey (10:37)
Exactly.

In reality, the parallelizing doesn't work very well for me yet. I think I need to sort out some prompting situation to actually make it call multiple subagents at the same time. But the really cool thing is code bases can be quite fractured. And so you have like one part of the code base over here, maybe it's the testing framework, you have another part of the code base over here, maybe it's some like UI components, and you've made a change to some logic of like stores or something.

Wilhelm Klopp (10:51)
Hmm.

Mm-hmm.

Matt Carey (11:10)
or some like database logic, and you want to see how that impacts like the testing framework. Well, if you go and open all the testing files in your main agent, now you've got this like crazy large context in your main agent, which is going to get over a certain context window and it's going to start descending into madness. So what's really nice is just to ask it to be like, go and have a look at this testing stuff and write a report. And then you end up with like a simplified amount of context that comes back into the main agent.
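
The context-saving effect described here can be sketched as follows, assuming a subagent is just a function the main agent calls as a tool. The `Agent` class and `run_subagent` helper are hypothetical, and the report is a plain string stand-in; a real subagent would have a model write it.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Toy agent that only tracks what has been put into its context."""
    context: list[str] = field(default_factory=list)

    def read(self, text: str) -> None:
        self.context.append(text)

    def context_chars(self) -> int:
        return sum(len(chunk) for chunk in self.context)


def run_subagent(task: str, files: dict[str, str], max_report_chars: int = 200) -> str:
    """Read every file in a fresh, throwaway context and return a short report."""
    sub = Agent()
    sub.read(task)
    for path, body in files.items():
        sub.read(f"{path}\n{body}")
    # Stand-in for a model-written report; only this string leaves the subagent.
    report = (
        f"Report for '{task}': inspected {len(files)} files, "
        f"{sub.context_chars()} chars read."
    )
    return report[:max_report_chars]
```

The main agent calls `run_subagent(...)` and reads only the returned report, so its own context grows by at most `max_report_chars` no matter how many test files the subagent opened.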

Wilhelm Klopp (11:27)
Haha.

Yeah, that totally makes sense. Yeah, interesting.

Matt Carey (11:40)
It's really cool. It's like something that I know I've been thinking about for a while, agent-to-agent interaction and like why that's useful. And it feels like the main use for it is context window. So I don't know if this is also another one of those pieces of scaffolding that we're going to look back on in a year and be like, haha, that was pointless. As some. Yeah.

Wilhelm Klopp (11:59)
Right, right, right. Some big change. I mean, to me in theory, the parallelization thing is also like super appealing. Like I mentioned briefly last time that I'm doing this like big refactor and it ends up being like a bunch of transforms like across like dozens and dozens of files. And like I've been like doing some manual stuff, some find-and-replace stuff, some agent stuff.

But, you know, whenever it is agent stuff at the moment, it's very like step by step, like very, very slow. But if you could just like have Claude Code write a bunch of like rules and then apply them to like every single file that matches like the pattern we're trying to replace, like I could see that being super, super useful, because now like all the transforms are done like in parallel and it takes like two minutes.
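
As a rough sketch of the "write rules once, apply them to every matching file in parallel" idea, assuming the rules can be expressed as regex find-and-replace pairs. The function names and the example rule are hypothetical. Threads are used only to show the fan-out shape; for pure regex work in Python they will not give a real speed-up, and the benefit would come when each unit of work is a slow agent or network call.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# A rule is a (regex pattern, replacement) pair.
Rule = tuple[str, str]


def apply_rules(source: str, rules: list[Rule]) -> str:
    """Apply every rule, in order, to one file's contents."""
    for pattern, replacement in rules:
        source = re.sub(pattern, replacement, source)
    return source


def transform_all(files: dict[str, str], rules: list[Rule], workers: int = 8) -> dict[str, str]:
    """Fan the same rules out over all files; the result keeps the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda body: apply_rules(body, rules), files.values())
        return dict(zip(files.keys(), results))
```

For example, `transform_all(files, [(r"\bold_call\(", "new_call(")])` would rename every call to a hypothetical `old_call` across all files in one pass.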

Matt Carey (12:42)
Yeah, definitely that.

So, okay. So go back to Claude Code in GitHub. Okay. I've found this has opened up a whole new world for me. I guess Devin was released about a year ago. Devin's like the autonomous coding agent that has its own... Yeah. Background thing. Yeah. Cursor called them background agents. I think background agents is actually a really good idea. And then like Cosine Genie was released in the summer. And then there's been a bunch of others. There's like the one by Factory, Droid.

Wilhelm Klopp (12:48)
Mm.

Background agent tasks, yep.

Matt Carey (13:08)
And I tried a couple of them. I never tried Devin because we never paid the $500 a month. But I tried Cosine. It was pretty cool. I was like very impressed with the UX. The performance wasn't there for my tasks. I tried a bunch of other stuff and I never really like got that aha moment where it's like, this is actually helpful. Like this is

Wilhelm Klopp (13:21)
Mm-hmm.

Right.

Matt Carey (13:30)
saving me like significant amounts of time. And it is tough because you have to remove yourself from your IDE to go to some other platform to do this autonomous thing. That's where Cursor have a massive advantage, right? You can kick off background agents inside Cursor, inside your IDE, and then they go off and do stuff. But before that all existed. But now, Claude Code in GitHub, honestly, man, like, I think it's the best developer tool this year. It's insane. It's insane. The setup...

Wilhelm Klopp (13:53)
seriously?

Matt Carey (13:56)
they've made so neat, because I've been trying to work out this UX setup for Shippie as well. Claude Code is like, claude, and then forward slash install GitHub app. And it just installs the app and then it automatically pushes an Anthropic API key to your repo as well. That was super neat. It creates a PR with the workflow file. The workflow file is like very, very minimal. And then it's just on you to set it up, right?

Wilhelm Klopp (14:12)
That's good.

Matt Carey (14:19)
And then in any issue or any PR, you can just tag Claude and be like, review this, write this, do this. So back to your example, a thing that I did yesterday was I've been rebuilding how internally we specify MCP servers. Previously, each server had its own like piece of infrastructure, which was fine when we had like three internal servers. Now we have like 30 and a bunch of external ones that we're going to publicize and all of this sort of stuff. And we're going to have like thousands.

So each server can't have its own piece of infrastructure. So I built this whole new thing and it's taken like the last few days really to like work out how we're going to call this little thing and like how it's going to work and all this stuff. And so I made like a few versions of the V2, let's call it. And then last night I just set off a Claude Code task being like, see these few new versions, migrate the 20 that we had previously to this new version. And like that sort of task, like really, really good. And there's loads of stuff where it's like a bit lazy. It misses some functionality and you definitely have to review it, but all of that like sort of express find-and-replace you can remove. And I think you could do some even cooler stuff with getting it to write code to do a lot of these things, because a lot of...

Wilhelm Klopp (15:13)
Nice.

Matt Carey (15:35)
the migration type stuff, I reckon you can specify in a Python script if you're really smart about it. And so that's all.

Wilhelm Klopp (15:41)
Yeah, yeah, yeah. Or maybe I don't need to be smart, or Claude can be smart to specify it in the Python script, right? Like, yeah.

Matt Carey (15:46)
Yeah, you wouldn't have to write the Python script. You just have to set up Claude. I think what Anthropic definitely can do better is like the setup documentation, just some like best practice on how you set it up. Like I've been using it for just under a week now. And we've already built like a little utility library, because we use Cursor rules. So we use Cursor rules, but Claude automatically ingests the CLAUDE.md file. So it's like a tiny little library that's npx

Wilhelm Klopp (16:03)
interesting.

Matt Carey (16:11)
cursor rules to Claude, and you just stick that thing in, and then it generates you a CLAUDE.md file on demand.
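
A rough sketch of what such a converter could look like, not the actual utility mentioned here. It assumes the Cursor rules live as `*.mdc` files under `.cursor/rules/` and simply concatenates them into a `CLAUDE.md` at the repo root; the function names and the generated headings are hypothetical, and any frontmatter in the rule files is copied through unchanged.

```python
from pathlib import Path


def build_claude_md(rules_dir: Path) -> str:
    """Concatenate every rule file into one Markdown document, sorted by name."""
    sections = []
    for rule_file in sorted(rules_dir.glob("*.mdc")):
        body = rule_file.read_text(encoding="utf-8").strip()
        sections.append(f"## {rule_file.stem}\n\n{body}")
    return "# Project rules\n\n" + "\n\n".join(sections) + "\n"


def write_claude_md(repo_root: Path) -> Path:
    """Generate CLAUDE.md at the repo root from .cursor/rules (assumed layout)."""
    target = repo_root / "CLAUDE.md"
    target.write_text(build_claude_md(repo_root / ".cursor" / "rules"), encoding="utf-8")
    return target
```

Generating the file on demand, rather than committing it, keeps the Cursor rules as the single source of truth.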

Wilhelm Klopp (16:13)
Very, very cool.

That's awesome. This is actually super interesting because, so I was thinking about what could be like the main topic to discuss this time around. Just, you know, so we have some big meaty things, and this is all actually very related. But yeah, actually I'll say it in a second, but just to clarify one thing on the Claude Code GitHub app, because I think I tried it out as well when it shipped as part of the Sonnet 4 release, right?

Matt Carey (16:41)
Yeah, it was...

Wilhelm Klopp (16:42)
And I agree, it sounded like one of the best parts of the whole thing.

Matt Carey (16:45)
It was, I think it was undersold, like how beautiful the UX is. For something that's a GitHub action, normally people just like dump text into a comment. This is so sick. They've gone to levels. There's like a GIF that moves. Dude, it's super nice. And basically, what I did was I was like, how does all this work? Cause I've spent quite a long time making GitHub apps, right? Like Shippie's a GitHub app.

Wilhelm Klopp (16:48)
Mm.

the little animation, like loading animation thing.

Matt Carey (17:12)
And I was like, this is way better than I've ever done. I was like, this is sick. Like the level of quality here is so good. And so, because it's in GitHub and because they log everything, they literally log the whole system prompt. So that's what I extracted. How any of it works, it's in the system prompt, and I made a GitHub gist. Anthropic, please don't sue me. I think this is fine.

Wilhelm Klopp (17:24)
Gotcha. Cool.

No, that's awesome. That's very cool. And there's some things to learn from that. And actually, we can also talk about interesting things in the system prompt if you want. But I remember trying out the... I clearly need to give it another try. I think I tried it for some random thing. And then the way it works is it's basically Claude Code running in your GitHub Actions. And then it has the ability to push up stuff.

Matt Carey (17:55)
So it could be super expensive if it runs for a long period of time, because Actions are famously expensive.

Wilhelm Klopp (18:00)
Right, because with GitHub Actions you pay per minute of runtime. And usually that makes some sense because the CI workloads, the tests you run, are very, very compute constrained. Sorry, CPU constrained. Whereas Claude Code is very much not that. It's just doing a ton of network calls, waiting for things, editing files. So it feels like a slightly strange model, but it's certainly very, very convenient, right? Because you can just have the runtime there.

Matt Carey (18:19)
Yeah.

You know how you can do external GitHub Actions? I wonder if anyone has built like a proxy to do your GitHub Action through like a worker, or like a Cloudflare Worker?

Wilhelm Klopp (18:31)
Mm.

I love that. That's so good. So there's, so I definitely saw some chatter about someone like wiring up like a VM because you can have X.

Matt Carey (18:45)
Hmm.

With the company called Warp. Like, I don't know if they're any good, but we use them. Our DevOps guy loves them. And so we outsource a lot of our GitHub Actions, like, compute minutes. I think they.

Wilhelm Klopp (18:49)
Yeah.

Totally. Yeah, we use one as well. I forget what it's called, but it's a very easy setup. We use something called BuildJet, I think, which is just half the cost of GitHub Actions and double the speed.

Matt Carey (19:08)
I think Depot, is Depot the same thing as well? Depot? I think they were really, yeah, was a really good podcast.

Wilhelm Klopp (19:12)
Uhhhh

But I love the Workers thing because like, obviously you can't really run your CI workload on Cloudflare Workers or whatever. Like, probably you can't spin up a Postgres inside your Worker, or whatever dependencies. I mean, who knows? Maybe at some point. But if you're just running Claude Code, maybe you can do a bunch of that in a Worker. Can you? I don't know.

Matt Carey (19:24)
No.

Yeah, I think you would need Worker plus VM, because like Claude Code does do things like it's going to run linting and it's going to run tests and it's going to run all that stuff. So you do need a full VM, but it's whether like the actual part of Claude Code could run not in the VM, or maybe that's like over-optimizing. Maybe like just stand up a VM on AWS. But I can't wait for the containers to come out soon.

Wilhelm Klopp (19:53)
Yeah.

Mm-hmm.

How would you... yeah, I heard, oh yeah, it's coming this month or something, right? Yeah, I'm excited for that. Although I remember chatting to someone about this at an event like a few months ago, because I'm excited to just like run full-on Python in Cloudflare, but I'm not sure how well it'll actually work. But I'm excited for this. In the past couple of weeks, I've often thought, oh nice, like as soon as the container Cloudflare thing comes out, like I can do this, I can do that. So I'm pumped for it.

Matt Carey (20:32)
They're

to crazy usage. just, wonder what got you as up. Go on, go on. No, no, no, no. had a question.

Wilhelm Klopp (20:34)
But wait, the question... sorry, go on. Go on, go on. No,

we're just too nice to each other.

Matt Carey (20:41)
No go on hit me, I'm just trying not to be shit. Shut up boy!

Wilhelm Klopp (20:42)
Shut the fuck up, Matt. ⁓

No, I was going to ask, how would you clone a repo inside a worker? Is that a thing?

Matt Carey (20:52)
Nah, you can't do it. What you can do, okay, so there are two options that I've seen on Twitter recently. Inside a Durable Object, which is like the stateful bit, you can do, so the ready-made option is something called GitLip, which you've probably heard of.

Wilhelm Klopp (21:02)
Mm-hmm.

Yeah, yeah, yeah, yeah, yeah, I've seen this around, yep, yep.

Matt Carey (21:11)
It's Natalie Malini and Zoran. They've done that. It's an awesome startup. They've remade a Git server in Wasm. They hosted it in a Durable Object. It's super smart.

Wilhelm Klopp (21:20)
no way. That's wild.

I mean, were they at AI demo days or something?

Matt Carey (21:25)
Yeah, they demoed quite recently, they've demoed a lot of stuff as well.

Wilhelm Klopp (21:29)
Yeah, I remember meeting Natalie like ages and ages ago at some Vercel event or something adjacent. Yeah, she seemed great.

Matt Carey (21:36)
Nah, she's lovely. Yeah, it's an incredible piece of technology. I think they're the third people ever to make a Git server from scratch. How nuts is that? I'm pretty sure, I don't know, when you say it, it feels like an anti-signal, but it's just so nuts. How did they explain it to me? So they're the third people ever to make a Git server implementation, like the full one from scratch. GitHub

Wilhelm Klopp (21:43)
Amazing.

Matt Carey (21:58)
didn't even make a full Git implementation because they're using like one of the original two Git servers. There's one in JavaScript and there's one in C, I think. Yeah.

Wilhelm Klopp (22:05)
Mm.

I assume Git has its own server thing built into Git? Or is that not how it works?

Matt Carey (22:15)
They wrote a really good blog post on it. I don't want to paraphrase and get it all completely wrong.

Wilhelm Klopp (22:18)
Nice, maybe we can link to it in the show notes and I can educate myself.

Matt Carey (22:20)
Yeah, it was pretty good. Okay, so that's the first way. That's like ready-made. They've spent a while working on it and that's literally its use case. And then the second way is to do some slightly more hacky thing, running a file system on object storage inside a Durable Object. So they have like an object storage API and you can kind of run a file system if you're a bit smart with it. And so there's a bunch of guys on, I'm gonna get this guy's

Wilhelm Klopp (22:47)
That's wild.

Matt Carey (22:49)
this completely wrong, but there's a bunch of people that took the challenge to make this. I don't have any of them set, any of the stuff saved, but there's something like Dorm maybe they called it. I don't know. If you search durable object file system, I'm sure you'll find it.

Yeah.

Wilhelm Klopp (23:04)
Yeah, fascinating. But okay, that sounds like it becomes a lot harder. If, yeah, getting access to a file system is like so tricky, then it's going to be hard to run Claude Code.

Matt Carey (23:14)
Yeah, super tough, super tough. Your file system is not going to be legit unless you're using... Yeah, so Jan made one, Cloudflare Virtual FSDO. What a catchy name. What a catchy name. It's a very cool project though. And yeah, it's like the...

Wilhelm Klopp (23:26)
Right.

The DOFS, the Durable Object File System.

Matt Carey (23:34)
Yeah, he made this and then a bunch of other people made a different version. I think there's a few versions now out there. They're all quite hacky. I really want Cloudflare to have a POSIX-compatible file system inside a Durable Object. It would make all this stuff so easy. And then it becomes the platform for agents. I actually might write this down and put it on my grief list.

Wilhelm Klopp (23:56)
Maybe you should work there and just build it.

Matt Carey (23:58)
I'll put it on my grief list for Sunil. He gets a list of things I get angry about. Every week, sometimes more than that. Well dude, I've been playing around with, obviously, MCP a lot recently, and this thing called Muppet.

Wilhelm Klopp (24:03)
Every week. That's awesome.

here we go, MCP, back on the Bad Agent podcast.

Matt Carey (24:17)
Dude, we're building agents and agents need integrations. So, fuck you. There's this thing called Muppet. There's this thing called Muppet, right? Muppet.dev. And I think they have the best MCP server SDK for JavaScript around. It runs on any runtime. It's based on Hono.

Wilhelm Klopp (24:21)
No, I'm a big fan of MCP. Sorry, go on. Muppet.

Matt Carey (24:38)
It's very like router centric. I'm not sure about the dev experience, but it works and that's the main thing like it

Wilhelm Klopp (24:43)
Is Hono different from Honcho?

Matt Carey (24:46)
Yeah, Hono's the router. You know the router?

Wilhelm Klopp (24:48)
Mate, I feel like I feel so old. Sorry, I feel like I want to go onto some rant about how there's a new framework in JavaScript all the time. No, let's not have this chat right now. No, no, no. A long time away from 30. But I just remember, remember when we were having the hackathon at my place and I was just like trying to figure out, okay, how do I deploy?

Matt Carey (24:55)
Hold on, are you 30 yet?

Hahaha!

Wilhelm Klopp (25:11)
I just kept running into invalid combinations of like, oh, you can't deploy this project on this provider, that's so 2018. You've to use the Linux combo or whatever. Okay, anyway, so Hono is the new hotness in the...

Matt Carey (25:27)
Hono is Express, but a little bit cleaner, a little bit smaller, a little bit less bloated.

Wilhelm Klopp (25:34)
Didn't know you could make Express any cleaner. It was already so minimal.

Matt Carey (25:38)
Yeah, I don't know. Hono is good. Hono is good. Yeah. This Muppet uses Hono and it uses Hono to the absolute extreme, where you specify your tools as a Hono app. You do app.post and then add a new tool, which is super weird. I'm not sure about this.

Wilhelm Klopp (25:40)
Anyway, so Muppet uses Hono.

Right?

Matt Carey (26:02)
But the rest of what they've done is really good. Like the transport layer, which I think is going to live in Hono now, they're moving it to Hono. That solved so many bugs that were present with the other JavaScript implementations.

Wilhelm Klopp (26:13)
Hmm.

I had a MCP question for you actually. Do you know... So obviously everyone has been going ham implementing tool support in their clients. But do you know any clients that support resources very well?

Matt Carey (26:18)
Oh, well, now you want to talk about MCP, do you?

Nope.

Wilhelm Klopp (26:33)
Okay. And for context, for anyone who doesn't know, in MCP, resources are this thing where MCP servers, like the stuff that we're building, just expose documents of text, right? Or maybe images as well. So it's not like call this tool and get a response. It's like, here is just stuff that you, the client, could consider. But yeah, I also am not sure who supports this well.

Matt Carey (26:52)
Yeah. Kind of an

Yeah, I don't think anyone's forced it. Hopefully we'll get a comment on this, or we'll get David messaging us and telling us, no, these clients do. Maybe Goose. I don't know, that might be a good one to have a look at. Codename Goose. Yeah, maybe them, but I don't know, I haven't used it. Most people, what they do to get around this, and it's so hacky, is you just return it from a tool. You can do that, but it's not the nice UX. It feels like using like a...

Wilhelm Klopp (27:05)
Codename Goose.

Yeah, yeah, right.

Matt Carey (27:21)
POST for a GET or something in HTTP and REST. It's like, you can do it. It's fine. Just annoying.
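A rough sketch of the workaround described here, not taken from the episode: the same document exposed once as a resource and once through a tool call. The types are simplified shapes modeled on MCP results rather than imports from any SDK, and the `docs://style-guide` URI and function names are made up for illustration.

```typescript
// Simplified shapes modeled on MCP resource and tool results.
// Illustrative types only; they are not imported from an SDK.
interface ResourceContents {
  uri: string;
  mimeType: string;
  text: string;
}

interface ToolResult {
  content: Array<{ type: "text"; text: string }>;
}

// Hypothetical in-memory documents a server might expose.
const documents: Record<string, ResourceContents> = {
  "docs://style-guide": {
    uri: "docs://style-guide",
    mimeType: "text/markdown",
    text: "# Style guide\nPrefer small files.",
  },
};

// The resource route: the client asks for a URI and decides what to do with it.
function readResource(uri: string): { contents: ResourceContents[] } {
  const doc = documents[uri];
  if (!doc) throw new Error("Unknown resource: " + uri);
  return { contents: [doc] };
}

// The workaround: wrap the same document in a tool result, so clients that
// only implement tools can still reach it.
function readDocumentTool(args: { uri: string }): ToolResult {
  const { contents } = readResource(args.uri);
  return {
    content: contents.map((c) => ({ type: "text" as const, text: c.text })),
  };
}
```

Both paths return the same text. The difference is who initiates: with a resource the client chooses what to attach, while with a tool the model has to decide to call it.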

Wilhelm Klopp (27:27)
Yeah, there used to be this, or I mean, I'm sure there still is, the list of example clients on the MCP website. And there are a bunch on there that have resource support, like...

Matt Carey (27:34)
Yeah.

On the docs.

Okay, that'd be cool. Well, I should keep playing this game.

Wilhelm Klopp (27:44)
But I just don't know how it works from a user's perspective.

Matt Carey (27:46)
No, Claude Desktop does. Claude Desktop does. And it does, actually. Yeah. No, because there were a few things that came out recently that I was really interested in. If anyone wants to see a very clean MCP implementation, the Supermemory MCP server is like a hundred lines of code and it's actually beautiful. Dhravya did such a good job with that and he open sourced it. When he was open sourcing it, he was like, this repo is probably worth quite a lot of money.

Wilhelm Klopp (27:50)
Oh, Claude.

nice.

Matt Carey (28:12)
But here you go. And I'm like, yeah, fair enough. It probably is, because it's like one of the cleanest implementations. And he wrote a blog post about how it all works as well.

Wilhelm Klopp (28:20)
nice. Does

he use resources too? Or does he mostly use?

Matt Carey (28:23)
He has one resource, I think. Oh no, he has one prompt. He has one prompt. Because it's prompts, resources and tools. And he has one prompt, like, teach the client how to use this MCP server, which I think he says works quite well. I haven't tried it myself. I just copied and pasted it and then immediately gutted it for my own use case. So I never actually tried it for his, but it's very clean.

Wilhelm Klopp (28:28)
cool. Right, right, right.

nice.

Can you

actually explain this MCP server to me? Because I've seen the hype about it and he seems like a really cool guy. I should hit him up when I'm back in SF. But like what actually is the idea or like how does it work?

Matt Carey (28:49)
Thank you.

So I don't know how far we want to go into this, but Supermemory is like an external memory service. So it's just a database that you can carry around between different chat applications. So you could have it in Claude Desktop and it knows about you. You could then put it into ChatGPT when they support MCP and it knows about you. You could put it into Cursor and it knows something about you. And so I think the use case is quite far and wide about what you want it to know.

Wilhelm Klopp (29:20)
Mm-hmm.

Matt Carey (29:21)
I think that's probably what he needs to work on is like explaining use cases for like normies like us. Like I'm assuming if you have a very good use case for an external memory and you're using his API, then using his MCP is pretty good because you can just query his API through natural language. That's kind of neat. But, but if you're not using his API, like in a project already, then I'm sure there are use cases. just, we need those explained. And that's why, why he's writing a closer.

Wilhelm Klopp (29:38)
cool.

Yeah, yeah. Wow, interesting. Okay, so wait, Supermemory is a separate thing from the Supermemory MCP server?

Matt Carey (29:57)
Well, the MCP server wraps the Supermemory API.

Wilhelm Klopp (30:01)
But the Supermemory API is like a much older thing. It has like 10,000 stars on GitHub.

Matt Carey (30:04)
Yeah, well, I think that was the v0 he published and it got a lot of stars. I think now it's like a hosted version. He went closed source for the next version. And they just leave the v0 up because it's got stars, you know.

Wilhelm Klopp (30:14)
Okay.

I see, see, I see. Okay, interesting. That's great context.

Matt Carey (30:20)
It's like

pretty free, I think almost. And it's a very good demonstration of a lot of what you can do on the Cloudflare platform as well, which is, which is cool.

Wilhelm Klopp (30:29)
Nice. Yeah, I'm just

on the mcp.supermemory.ai website. And one thing that's really cool is that you get your own MCP remote URL that has a secret in it. That's really cool. Cool way to solve auth. It's just like, here's your URL.

Matt Carey (30:39)
Yeah. Well, there's a lot of controversy around this, right? I'll explain. Okay. So if you put a secret in the URL, or some sort of key in the URL, it means that you never have to worry about client-side authentication. You never have to do it. So all of these apps with a broken OAuth flow, all of these apps that only support a bearer token, not...

Wilhelm Klopp (30:46)
really?

Matt Carey (31:07)
Like Retool came out and they support no auth or basic auth. I'm pretty sure basic auth is not even in the spec. Like how are these internal servers?

Wilhelm Klopp (31:12)
Hmm. Basic auth is where you have like username, colon, password, like in the URL or something.

Matt Carey (31:19)
Yeah. And you have to write Basic. It's actually interpreted a little bit differently as well, because then it's Base64. So you have to write Basic to know to un-Base64 the key and then split it into username and password. And I was like, why is this called basic auth? Basically, the authentication is pretty much a mess with the clients. All of the clients are doing different stuff. And so by having authentication in the URL,

Wilhelm Klopp (31:35)
delightful.
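A rough sketch of the Basic scheme being described, not taken from the episode. In HTTP Basic auth the credentials travel in the `Authorization` header as the word `Basic` followed by the Base64 of `username:password`. The function names are made up for illustration, and the sketch assumes ASCII credentials and the global `btoa`/`atob` found in browsers and recent Node versions.

```typescript
interface BasicCredentials {
  username: string;
  password: string;
}

// Build the header value a client would send: "Basic " + base64("user:pass").
function buildBasicAuthHeader(username: string, password: string): string {
  return "Basic " + btoa(username + ":" + password);
}

// Reverse the process on the server: check the scheme, decode, split once.
function parseBasicAuthHeader(header: string): BasicCredentials | null {
  const match = /^Basic\s+([A-Za-z0-9+/]+=*)$/i.exec(header.trim());
  if (!match) return null;

  let decoded: string;
  try {
    decoded = atob(match[1]);
  } catch (err) {
    return null; // the payload was not valid Base64
  }

  // Only the first colon separates the two parts; passwords may contain colons.
  const separator = decoded.indexOf(":");
  if (separator === -1) return null;
  return {
    username: decoded.slice(0, separator),
    password: decoded.slice(separator + 1),
  };
}
```

Base64 is an encoding, not encryption, so this scheme only protects credentials when the connection itself is encrypted.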

Matt Carey (31:46)
you get around all this and it's a very neat way. And that's actually what we do with integrations.cool, the key is in the URL. So we're quite pro this. There's a bunch of security people that are not super pro this, and they have various reasons stemming from the fact that these keys get logged quite regularly, in logs or wherever, because logging a URL is kind of normal. And so,

Wilhelm Klopp (32:05)
Hmm.

Sure,

yep.

Matt Carey (32:10)
that they're not kept super secret. And I think this would be a problem for people where the key is actually an API key to an external system. Because I think that's what some people were doing. And that is pretty uncool, because you're logging your external API key. So probably don't do that. But yeah, what Dhravya is doing is cool. It is cool. And we've, we've

Wilhelm Klopp (32:23)
I see.

Matt Carey (32:32)
we've done the same sort of thing. When the Supermemory MCP came out, I was like, this is neat. We're going to do that because this frees you from all of this pain of dealing with client authentication. Because so many of them are broken. For the first two weeks, I couldn't get the Claude.ai one working at all. The one online, because it was just also broken. But there's like...

Wilhelm Klopp (32:38)
Yeah, yeah, yeah.

Yep, yep, yep.

Yeah, no, exactly.

that's what I was going to say. I think if you put this URL in your claude.ai, that seems perfectly safe, right? Even if it gets logged somewhere, you probably trust Anthropic to figure that stuff out on their end.
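A rough sketch of the key-in-URL approach and the logging concern, not taken from the episode. It assumes a made-up URL shape of `https://<host>/<key>/<endpoint>` and hypothetical function names; the real services mentioned here may lay out their URLs differently.

```typescript
// Hypothetical URL shape: https://<host>/<key>/<endpoint>

// Read the per-user key from the first path segment.
function extractKey(rawUrl: string): string | null {
  const segments = new URL(rawUrl).pathname
    .split("/")
    .filter((segment) => segment !== "");
  return segments.length > 0 ? segments[0] : null;
}

// Produce a version of the URL that is safe to write to logs.
function redactKey(rawUrl: string): string {
  const url = new URL(rawUrl);
  const segments = url.pathname.split("/");
  // segments[0] is always "" because the pathname starts with "/".
  if (segments.length > 1 && segments[1] !== "") {
    segments[1] = "REDACTED";
  }
  url.pathname = segments.join("/");
  return url.toString();
}
```

The server authenticates with `extractKey` and only ever logs `redactKey(url)`, which addresses the "keys end up in logs" objection for logs the server controls. It does nothing for proxies or clients that log the full URL themselves.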

Matt Carey (32:56)
Yeah, yeah, I think that's cool. I mean, just not wanting to beat a dead horse, but with the MCP stuff, there is still a massive lack of a good collection of servers. I was chatting with someone yesterday, you know Ashley, I'm going to out him now. You know, Ashley Peacock, he's got a new name, Peacock. He wrote a book on Cloudflare. He's very much a Cloudflare advocate. Anyway, I was chatting with him. They were doing a hackathon at work and he was like,

Wilhelm Klopp (33:19)
Actually who?

Matt Carey (33:30)
where do I find remote MCP servers that I can just use, and then we can build an agent that connects to Notion and then to Slack and we can just build that internally at work? And he was like, this must be a solved problem, right? And I was like, yeah, kinda. You can look at Smithery, you can look at mcp.run, you can look at mcp.so. These are all collections of MCP servers. mcp.run does some extra stuff I haven't quite worked out yet, but...

They're all collections of servers and integrations that you can turn on or off. Smithery gives you a URL. And so I sent that to him and he was like, yeah, I mean, I tried for an hour playing around with Smithery and it doesn't really work for me, because the OAuth flow has to be per user. I kind of just want to set it up initially. And so I think, yeah, there's...

Wilhelm Klopp (34:14)
Hmm.

Matt Carey (34:16)
It's still not a solved problem, this whole thing. And he came back to me this morning actually and said the closest thing that he'd found was Zapier agents. Because Zapier has exposed an MCP server for every one of their integrations. And so you don't have to use their integration builder. You can just directly use an MCP server. So I think Rafael there has done an amazing job productizing that in a way that's absolutely useful. Because there is a difference, like...

Wilhelm Klopp (34:24)
Mmm, right.

Yeah.

Matt Carey (34:42)
from "this kind of works" to "this actually works and we can build something with it today". And we are still very early in this.

Wilhelm Klopp (34:42)
Totally.

100%, yeah, no, that's really interesting. Yeah, I haven't tried the Zapier thing at all. I think there's also the centralized MCP server registry that's coming, right? That's being discussed. Yeah, yeah, I mean, it's interesting because I think it will be Anthropic-endorsed in the same way MCP is Anthropic-endorsed, but I think it's literally being built in the open by like

Matt Carey (35:02)
The Anthropic one, the legit Anthropic one, yeah.

Wilhelm Klopp (35:15)
people from lots of different companies and it's being like what it even is, is decided by like a bunch of people who are not affiliated with Anthropic and then also some people affiliated with Anthropic. like, and I think it's just on one of the open source repos and like the discussions and issues where they're talking about how it'll work.

Matt Carey (35:33)
Okay, wild one here. Do you think that if MCP becomes a big protocol that agents use across the internet, we'll have a standard where every website will have an MCP subdomain? And do you think that at that point, maybe not even an MCP subdomain, maybe an actual MCP protocol, so, you know, instead of HTTP we actually have an MCP protocol? If you think that, then is this registry actually gonna become like a new version of registering DNS?

Wilhelm Klopp (35:56)
Mm-hmm.

That's very interesting.

Hmm. Yeah, I don't know. I haven't really thought about it in this way. I mean, for sure, I think the registry will be helpful in the short term. And I imagine, I don't know, the whole "everything will have an MCP server" idea feels a bit like the semantic web kind of idea, right? In that every website will publish its structured data and, no, no, like the

Matt Carey (36:29)
like the airlines of TFT.

Wilhelm Klopp (36:33)
Like SPARQL and RDF, is that what it's called? Do you know what I mean?

Matt Carey (36:38)
You're gonna have to send me this. I have no idea what you're on about.

Wilhelm Klopp (36:40)
Like this was a thing like so Tim Berners-Lee, the creator of the web, right? He came up with this future version of the web where instead of just publishing unstructured HTML, you have this like much more complicated XML format where like an event website would like encode, for example, the event location, the event date and the event time in like a really structured format. And then you could,

Matt Carey (37:00)
It's kind of like RSS,

but like for any website, not just something that releases content.

Wilhelm Klopp (37:06)
for any website, and you could run these SQL-esque queries across the whole web to resolve stuff. I think actually Wikipedia has something like this built in, so you can run these SPARQL queries on Wikipedia. But it just never took off for the web. I just think in reality the web stuff is always gonna be very messy and people will...

Matt Carey (37:12)
Wow.

Wilhelm Klopp (37:31)
not do what conforms to the standard, just like do whatever works for their use case.

Matt Carey (37:36)
Yeah, that definitely, that makes sense. Sparkle is such a great name for a query language. It just makes...

Wilhelm Klopp (37:42)
We had a whole

module at my uni about this and they were teaching it as if the web was about to like adopt all of this stuff. And I was like, this is really dumb. And I actually got the module canceled for the following year. And now I don't think it's taught anymore. At least not as a core module.

Matt Carey (37:59)
Well, that's hectic. Okay. I didn't have a look into that. That's kind of cool. Yeah. Go on.

Wilhelm Klopp (38:02)
I'll send you, yeah. The other thing is called RDF. So I think SPARQL is one thing and then RDF is, what does RDF stand for? I think RDF was supposed to be how you write the documents.

Matt Carey (38:17)
You do realise we've gone from agents to nerding about MCP to now nerding about the internet. We've gone down such a rabbit hole.

Wilhelm Klopp (38:27)
We've gone a big

rabbit hole. Okay, let me pick this back up. I have one more question for you on the Supermemory thing, and then I have a big question for you. So for Supermemory, would you say the problem that Supermemory solves is solved more effectively by rules files for coding use cases?

Matt Carey (38:34)
Let's go, let's go.

But the problem that Supermemory, no, Supermemory solves the problem where you want to take knowledge and move it around different chat applications or different applications. So for instance, you're in Notion AI and you're writing a report. Now you're in Claude Desktop and you want to ideate on something. You know all of the custom instructions that you write, like, I am Matt, I am a software engineer, I'm in London? That sort of stuff that you can end up writing out, Supermemory just knows that by default.

Wilhelm Klopp (39:09)
Okay, okay. I

get that part, but I think like the most effective version of those instructions that I've ever written aren't like I am well, I live in blah, blah. It's like this code base is X that uses this testing format, like whatever. And those tend to be like rules files and you can take those across different models today, right? Like.

Matt Carey (39:27)
So there's this thing called, yeah, I'm not sure Supermemory is a great use case for that, because that's a developer use case. And I think Supermemory is an application-level or consumer-level use case. So it's made to be used as infrastructure in your application, or it's made to be used directly by the consumer. I'm not sure it would be like...

I don't know, we can see what Dhravya does with his go-to-market. The thing that you want to use there is something called Vibe Tools or Vibe Rules, I think. Vibe Rules, also a really perplexing name. But yeah, what that does is like, is...

Wilhelm Klopp (39:56)
⁓ Nice.

Matt Carey (40:03)
Some libraries export their rules files at forward slash LLM, and so it allows you to import libraries' rules files. It allows you to change rules files from one format to another. And their whole vibe, not their whole vibe, their whole thing, is that you create packages in your repo. This is the thing I'm not so sure about. They have all the rules...

Wilhelm Klopp (40:18)
Ha ⁓

This is

awesome.

Matt Carey (40:28)
You create a package in your monorepo, which means you have to have a monorepo, but you make a package to say what the repo is about. And that has all your rules in it. And then whenever you're using a different IDE or whatever, you just do like vibe rules install cursor, and it looks for the package in your repo and installs them as Cursor rules. And then you never commit your rules files. I think that's the idea. I'm probably going to get slated and told that's not the idea, but to me that seems like so much setup.

Wilhelm Klopp (40:33)
Mm-hmm.

That's

amazing. I think I just found, yeah, yeah, yeah, no, it seems like a lot. It's that, yeah.

Matt Carey (40:57)
Like I just, yeah. Like what I want is you just want to

store it natively as Cursor rules, because I think you've got such an anti-pattern there. Your IDE is going to be tuned to store rules files in the format that your IDE uses. So why are you trying to store it in a different format? Why don't you just use the format your IDE uses? And then if your IDE changes, have like a migration script

Wilhelm Klopp (41:08)
Mmm.

Matt Carey (41:24)
that just changes it. And that's what we do with Claude Code. It's like a one-liner where you just, on demand, make a CLAUDE.md file.

Wilhelm Klopp (41:24)
Yeah.

Yeah, totally. With Zed, by the way, we made it look at all of the different rules files that we could find. And then also, it'll just respect .rules, which maybe would be cool if that became a bit of a pattern.

Matt Carey (41:37)
Yeah, we're just sick. We're just sick.

Yeah, vibe-rules, they use that as well, I think. The .rules.

Wilhelm Klopp (41:51)
nice,

nice. By the way, this is great. So I'm just on the vibe-rules repo. It has 95 stars. And this situation just reminds me of, I feel like you're so close to the action. You've just shown me, you know, some cool undiscovered tech. Reminds me a bit of one time a friend showed me this music video for some music he loved. And I looked on YouTube and it had like 300 views. And I'm like...

Matt Carey (41:57)
That's it.

Wilhelm Klopp (42:14)
Wow, how have you even found this? How have you even discovered this thing? I feel so privileged to have been one of the 300 people that has even seen this.

Matt Carey (42:23)
Nah, Twitter's pretty cool like that. Like the people that you meet and that sort of thing.

Wilhelm Klopp (42:27)
Okay,

I have a big question for you. Big topic. I was thinking about this a little bit and I think it would be interesting to chat about. I think like, if you, yeah, okay, so here's the thing. If you were to build a product, maybe even a paid product, where the purpose of the product is to transform a code base, any code base, into such a shape that agents can work really seamlessly within the code base.

Coding agents, how, like what features would the product have? Like what would you do to the code base?

Matt Carey (42:56)
Yes.

Wilhelm Klopp (42:56)
I know there's some

stuff with this that you've already thought about with Shippie, especially with rules files.

Matt Carey (43:02)
Shippie, we've thought about a little bit. It's like the classic thing with Shippie where I've thought about it a lot for other people and haven't necessarily implemented it that well for myself. So I need to be better at that. But I would say the first thing, number one, has to be that you have really good tooling in the codebase that's well signposted. So your linter has to be super strict. Your formatter has to work,

Wilhelm Klopp (43:24)
Mm-hmm.

Hmm.

Matt Carey (43:28)
and actually apply the linting rules. Like type checking, you really should be using a typed language, or enforcing type hints if you're using Python. And I think that's number one, mega. And that should be super well signposted with a rules file. Like, this is what you do every time you make a change, run...

Wilhelm Klopp (43:31)
smart. Yeah, yeah, yeah.

Mmm.

Matt Carey (43:50)
lint or something. And if it's broken, run the lint fix before you do anything else. And maybe that'll fix it. For instance, if you're using TypeScript, in your tsconfig you need to have things like no unused variables, no unused function parameters, all of the very strict stuff, because models are awful at cleaning up after themselves. So that'd be the first thing. Second thing,

Wilhelm Klopp (43:52)
Yeah, yeah. Love it.
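A rough sketch of the strict settings being described, not taken from the episode. `strict`, `noUnusedLocals` and `noUnusedParameters` are standard TypeScript compiler options; which others a given project should enable is a judgment call.

```jsonc
{
  "compilerOptions": {
    "strict": true,              // turns on the strict type-checking family
    "noUnusedLocals": true,      // error on unused local variables
    "noUnusedParameters": true   // error on unused function parameters
  }
}
```

With these on, leftover variables and parameters fail the type check, which gives an agent a concrete error to fix rather than relying on it to tidy up unprompted.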

Matt Carey (44:14)
I'm sure there's a lot of debates about codebase structure that I really don't want to get into. I think it's super up in the air what the best codebase structure is. Probably something that is in the training data a lot. So you probably want to use libraries that are either very familiar to the model or rely on basic fundamentals that the model is very used to.

Wilhelm Klopp (44:17)
Let's get into them, man.

You

Mmm.

Matt Carey (44:40)
So things like

Wilhelm Klopp (44:41)
Damn,

yeah, that's such a good point.

Matt Carey (44:44)
So you don't necessarily have to use React from five years ago. You could use something like RedwoodJS, because it relies on React Server Components and it relies on a fundamental computing paradigm that is quite well understood by a model. And also they've done a lot of work in releasing rules files for that particular repo. So it's actually quite a good use.

Yeah, I think you just have to think very carefully about what libraries you're going to use. And that doesn't always mean you have to use something from five years ago. It just means you have to use the thing where the maintainer cares about keeping this stuff relevant and using standards. The standards-based stuff is always going to work better than anything else. Yeah. I guess that's the high-level overview.

Wilhelm Klopp (45:23)
Yep. Then some like custom thing. Yep.

Yeah,

that makes sense. No, that's really cool. Those are like two axes which I haven't even really thought about, especially not the good tooling stuff. Well, I was just kind of brainstorming. I mean, imagine if literally this was a product, right? And it was like, okay, click a button, turn my codebase into a thing that will just work a bunch better for agents. I think there's a couple of really dumb things, like...

Matt Carey (45:35)
How is it?

Wilhelm Klopp (45:52)
I think the Amp guys were talking about this as well, that if you have giant files, they just don't work very well. So you just want to be splitting them up. I think Claude Code has a limit of 25,000 tokens: it won't be able to ingest the file with its read file tool if it has more than that. So at a minimum, you just want to be splitting up files if they're more than 25,000 tokens. For me, in the Semmlepool code base,

Matt Carey (46:04)
Yeah.

Wilhelm Klopp (46:19)
most files are smaller than that, but a bunch of test files aren't. But it's really, really easy to split up test files, right? That's not a hard thing to do. You could just run some Claude Code tasks, like, okay, split up the tests in the test file. So I think that by itself is gonna be an improvement that'll just make agents work better.

Matt Carey (46:37)
Yeah. In your linter, the way you can stop that is you can literally just have a max file length in your linter. I remember one company I worked at had a max file length of 200 lines, mandated in the linter. And it's like, that is pretty strict, but I mean, it works. It makes you think about your file structure.
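A rough sketch of a check for both limits mentioned here, not taken from the episode. It flags files over a line limit or an estimated token limit. The four-characters-per-token ratio is a crude heuristic assumed for illustration rather than a real tokenizer, and the function and type names are made up.

```typescript
interface Limits {
  maxTokens: number;
  maxLines: number;
}

interface OversizedFile {
  path: string;
  lines: number;
  estimatedTokens: number;
}

// Rough heuristic only: real tokenizers vary by model and by content.
const CHARS_PER_TOKEN = 4;

// Takes a map of file path to file content and returns the files over a limit.
function findOversizedFiles(
  files: Record<string, string>,
  limits: Limits
): OversizedFile[] {
  const oversized: OversizedFile[] = [];
  for (const path of Object.keys(files)) {
    const content = files[path];
    const lines = content.split("\n").length;
    const estimatedTokens = Math.ceil(content.length / CHARS_PER_TOKEN);
    if (lines > limits.maxLines || estimatedTokens > limits.maxTokens) {
      oversized.push({ path, lines, estimatedTokens });
    }
  }
  return oversized;
}
```

Run against a repo with `{ maxTokens: 25000, maxLines: 200 }`, the result is a to-do list of files to split. A linter rule such as a max-lines check enforces the same idea continuously.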

Wilhelm Klopp (46:43)
That's cool.

That is pretty strict, yeah.

Yeah, that's really interesting. It reminds me a little bit of, I feel like there was an implicit rule at GitHub that any function that was longer than like three lines was a bit sus and you should really break it up into more functions.

Matt Carey (47:07)
Like

the Clean Code vibe.

Wilhelm Klopp (47:11)
Yeah, exactly. Some of the stuff, although speaking of that, it makes me wonder like if having, like a bunch of logic kind of co-located in, one place, like smart co-location in the same file is probably also really, really helpful for an agent. So it's not like spread out across the whole code base.

Matt Carey (47:28)
Yeah, maybe. think that's kind of tough to think about. think like grouping your, grouping your, your functionality, like very logically. So for instance,

Like I've been getting quite good success having a monorepo with packages, where each package does one thing really well, rather than, like previously, you might've stuck them in a lib folder or some sort of modules folder. Now you just have another package. That package has a readme, that package maybe even has its own rules. And...

That package just does one thing really well. So for my MCP server hub, one of my packages is MCP config types. And all that package does is export a very opinionated way of how we specify MCP config. It's a type-only package. I could have put that in the main app, but actually it needed a bit more documentation and it's a very high touch point, like...

Wilhelm Klopp (48:09)
Yeah, yeah, that's awesome.

Matt Carey (48:20)
If I'm touching this, I'm going to break a lot of other stuff. And so immediately it's like, I can write some things there about that and inform the agent that if you mess around with this, you're going to have to run some linting. You're going to have to run some tests, even. Like 100%, as soon as you change this, you're going to have to run a build script also. Rebuild the...

Wilhelm Klopp (48:39)
Yep, yep. I

love the thing you said earlier about like, you have to, like having a rules file that says you have to run this thing whenever you make a change. That seems like really, good.

Matt Carey (48:50)
There was an awesome way of working with models that Jonas from Iterate told me about, which I'm going to spill the beans on now, because I think it's been long enough since he told me about this that stuff's moved on. They had this coding agent, and this whole coding agent was just a script that returns the next thing to do, and then says, once you've done this thing, call this script again.

Wilhelm Klopp (49:14)
Mmm.

Matt Carey (49:14)
That's how they created this loop. And I thought that was a really neat way of doing it. So the control flow is very separate from the execution flow. And I think you probably need to think about those patterns quite heavily when you're optimizing a codebase now.
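A rough sketch of that "script that returns the next thing to do" loop, not taken from the episode and not Iterate's actual implementation. The step list, the `--done` flag and the function name are all made up for illustration.

```typescript
interface Step {
  id: string;
  instruction: string;
}

// Hypothetical checklist; a real script would derive this from the task at hand.
const STEPS: Step[] = [
  { id: "lint", instruction: "Run the linter and fix any errors." },
  { id: "test", instruction: "Run the test suite and fix any failures." },
  { id: "build", instruction: "Run the build script." },
];

// Control flow lives here; the agent only executes whatever text comes back.
function nextInstruction(completed: string[]): string {
  for (const step of STEPS) {
    if (completed.indexOf(step.id) === -1) {
      const done = completed.concat(step.id).join(",");
      return (
        step.instruction +
        "\nWhen finished, call this script again with: --done " +
        done
      );
    }
  }
  return "All steps are complete. Stop.";
}
```

The agent never sees the whole plan. It gets one instruction plus the exact call to make next, which keeps the control flow in ordinary code and out of the model's context.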

Wilhelm Klopp (49:24)
Totally.

What's a good way to, like, say your codebase has zero rules files. Imagine you're maybe even a reasonably sized company, or take something like the Zed codebase actually, because it's quite large. I always thought, especially in a world before models were very good at Rust... Supposedly Sonnet 4 is very good at Rust. I've heard a few takes that, oh, you can't really tell the difference between Sonnet 4 and 3.7, or it's a very incremental jump.

But Nathan, I actually saw them briefly at the AI World Fair thing. I ended up getting in the other day I went. And it turns out the guy who had sent me away was completely wrong, by the way. there's a guy said, oh, your badge doesn't get you access to this. The this was literally the whole conference. Like there was nowhere else for me to go besides where he didn't let me in. He didn't like the look of me. Exactly.

Matt Carey (50:16)
He didn't like the look of you, he was like, this man looks like a wrong'un.

Wilhelm Klopp (50:21)
And anyway, I got in the other day, saw all of them, and Nathan was like, yeah, Sonnet 4 can write Rust. Whereas previously the models really couldn't. Anyway, so that's interesting by itself. So if you had something like the Zed codebase, maybe it has like one rules file. How would you go about just peppering it with lots and lots of great rules files, especially nested rules files? The CLAUDE.md ones can be nested, right? Which feels like a good pattern. How should a company think about going from like,

zero or like almost no rules files, just like having lots and lots of great rules files that help with like agentic development.

Matt Carey (50:54)
I mean, the way I do it, which probably isn't the most systematic way... In a few weeks, I'll say run shippie rules migrate. That's what I'll say. And that should go through your codebase and build rules files for stuff that looks interesting. But for now, what I tend to do is you've just got to use the coding agent. And when it does something silly,

Wilhelm Klopp (51:02)
Nice. Nice. Nice.

Mm-hmm.

Matt Carey (51:16)
where you're like, that's not how our codebase does it. The most simple one is, what package manager is it using? If you're in JavaScript, in TypeScript land, there's like eight package managers. Is it using npm by default, which it probably is, but maybe you're using Bun? Okay, that's a really easy one. Write a rules file that's like, we use Bun, never use npm. These are all the commands you can use in Bun.

A big one for me for Shippie is we use Bun, but we don't use any of the Bun-specific features because we want it to be able to run in Node. So it's a big toss-up. It's like, use Bun. We use tsup to build it and we don't use any of the Bun SQLite stuff. We don't use any of this stuff because that breaks Node compatibility. So please do that. And I mean, as soon as you have an issue

Wilhelm Klopp (51:52)
Right.

Nice.

Matt Carey (52:01)
with your agent coder, just stop the agent coder, write a rules file, restore from checkpoint, try it again. If it works the second time, your rules file's good. Knock yourself out.
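A rough sketch of one way to back the "use Bun, but nothing Bun-specific" rule with a check, not taken from the episode and not how Shippie does it. It scans source text for imports of `bun` or `bun:*` modules and for uses of the `Bun` global. The function name is made up, and a regex scan like this is only an approximation of real import analysis.

```typescript
// Flags Bun-only usage in a source file so Node compatibility is not broken.
function findBunOnlyUsage(source: string): string[] {
  const findings: string[] = [];

  // Matches `from "bun:sqlite"`, `import "bun"`, `import("bun:ffi")`
  // and `require("bun:sqlite")`.
  const importPattern =
    /(?:from\s+|import\s+|import\s*\(\s*|require\s*\(\s*)["'](bun(?::[\w-]+)?)["']/g;
  let match: RegExpExecArray | null;
  while ((match = importPattern.exec(source)) !== null) {
    findings.push('import of "' + match[1] + '"');
  }

  // Matches uses of the runtime global, for example Bun.file(...).
  if (/\bBun\.\w+/.test(source)) {
    findings.push("use of the Bun global");
  }
  return findings;
}
```

A rules file can then tell the agent to run a check like this after each change, which turns "please don't use Bun-specific features" into something that fails loudly.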

Wilhelm Klopp (52:13)
That makes sense. That kind of goes back a little bit to the tooling thing. Are you familiar with, there's this documentation framework which is quite popular in the Python world called Diátaxis. Have you ever heard of this?

Matt Carey (52:23)
No. Briefly, on the documentation, I was gonna ask you, are the Zed docs for like their rendering engine and everything, is it in their codebase?

Wilhelm Klopp (52:32)
Yeah, for... my god, I'm blanking on the name. What's it called? What is the framework called?

Matt Carey (52:38)
That's a big deal, right? Like if

the docs aren't co-located with the code, I think that's going to be a massive anti-pattern.

Wilhelm Klopp (52:46)
Yeah, I think the problem is the docs are more than 128k tokens.

Matt Carey (52:51)
Yeah, you

cannot put the whole docs in, but like they should either be co-located with the code, like in files next to it, or they should be separate, but you should set up like the MCP tooling to be able to search the docs. That's what I just did with the StackOne ones.

Wilhelm Klopp (53:05)
GPUI is the name of it. Yeah, yeah, yeah, I agree. I think like, what I was thinking is like, ideally, you have a couple of key examples or something from that, right? Or like a list of where to go and find out more or like this sort of stuff. But yeah, I think that would be one of the biggest, like, I imagine, improvements for folks working on Zed, if you really teach the model well about

GPUI and how to build with GPUI.

Matt Carey (53:32)
Yeah.

So the way I've done it with StackOne, which I think is kind of fun, is we use Mintlify for our docs. So we have an llms-full.txt, which is like the whole docs as one markdown file. And so I have this very, very simple MCP server, which has one tool on it. It's like search StackOne docs and

Wilhelm Klopp (53:50)
Mmm.

Matt Carey (53:51)
that search StackOne docs tool does a request to that llms-full.txt, gets the whole markdown file and then does like a fuzzy match and returns some chunks which contain the fuzzy match of the keywords that you pass in. It's not smart at all, but it does work and.

Wilhelm Klopp (54:04)
That's awesome.

Matt Carey (54:08)
I think that sort of tooling people are going to really value in the future. Like we're probably going to have that sort of tool for, for your docs, but we're also going to have that sort of tooling for every library that you use. Yeah.
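A minimal sketch of the kind of docs search described above, assuming the single-file docs export has already been downloaded into one markdown string. The function names, the per-heading chunking and the 0.8 similarity threshold are illustrative assumptions, not the actual StackOne MCP server.

```python
import difflib
import re


def split_into_chunks(markdown: str) -> list[str]:
    """Split one big markdown document into chunks, one per heading."""
    parts = re.split(r"(?m)^(?=#{1,6} )", markdown)
    return [part.strip() for part in parts if part.strip()]


def fuzzy_score(keyword: str, chunk: str) -> float:
    """Best similarity (0 to 1) between the keyword and any word in the chunk."""
    keyword = keyword.lower()
    words = set(re.findall(r"[a-z0-9_]+", chunk.lower()))
    if not words:
        return 0.0
    return max(difflib.SequenceMatcher(None, keyword, word).ratio() for word in words)


def search_docs(markdown: str, keywords: list[str],
                threshold: float = 0.8, limit: int = 3) -> list[str]:
    """Return up to `limit` chunks that fuzzily contain at least one keyword."""
    scored = []
    for chunk in split_into_chunks(markdown):
        # Only keyword matches above the threshold contribute to the score.
        score = sum(s for s in (fuzzy_score(k, chunk) for k in keywords) if s >= threshold)
        if score > 0:
            scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:limit]]
```

In an MCP server this would sit behind a single tool that takes the keywords as input. Like the original, it is deliberately not smart: a misspelled keyword such as "webhok" still matches a chunk that mentions "webhooks", and nothing more clever than that is going on.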

Wilhelm Klopp (54:20)
Yeah, no, that

I really like that. I think like it's a bit similar to the Context7 approach, right? Like first find the library then. Right. That feels like pretty smart. Having something like that, especially because yeah, I don't know. Like it feels like searching through an existing thing that you're dealing with is just like such an unlock. Like one thing I've been doing for actually back for this migration thing is.

Matt Carey (54:26)
It's exactly like taken from that.

Looking

at the node_modules, briefly, the way that the vibe-rules thing works is they're like, you export your rules files for this particular thing as an LLMs in the node_modules. I'm not sure.

Wilhelm Klopp (54:49)
Right.

Hmm.

Matt Carey (55:00)
that's gonna give you varying levels of success, right? Like I think docs are generally bigger than this singular rules file is gonna be. They're gonna be more expansive. They're gonna have more examples in them. I feel like I prefer access to docs rather than to like a hastily written

rules file that's meant to tell you about how to use the thing, because I installed the PostHog one and, like, the PostHog setup is beautiful. They have this like AI setup now and it installs a rules file for you about how to use PostHog. But that rules file, it is rubbish. It doesn't even say you're using PostHog. It's called like PostHog setup. It's dreadful. Like, I actually can't explain how bad it is. I deleted it immediately and hopefully they improve. Yeah.

Wilhelm Klopp (55:17)
Yeah. Right.

Hehe.

Maybe a

good MCP server would be like an actually good search agent that goes out and just finds relevant stuff to what you're trying to do.

Matt Carey (55:54)
Yeah, yeah, like, I mean, you'd hope that would be solved by the underlying platform, like Cursor solves that problem.

Wilhelm Klopp (56:02)
Yeah, although none of them really do it, right? So I mean, like something that can like search Context7, find the right library docs, but also like search a GitHub repo, like search for the right stuff in, because well, Amp has like Sourcegraph access, right? But I actually haven't really seen it used or I don't know.

Matt Carey (56:19)
Cursor

used to use the web search a lot. My guess was, I've been through these fluctuations where I'm like, Cursor is amazing, Cursor is rubbish, Cursor is amazing, Cursor is rubbish. I genuinely think a lot of this is about how much they've tweaked how much it can use web search. I'm not even, because web search must be super expensive. Like, SERP APIs are really expensive. If you're using Tavily or you're using like, like,

Wilhelm Klopp (56:28)
Yeah, I know what you mean.

⁓ really?

Mmm.

Matt Carey (56:47)
like they're so expensive to run at scale. And so I genuinely think like a lot of whether Cursor is good or not is like whether it can do that.

Wilhelm Klopp (56:54)
That's fascinating. But even with web search, web search rarely does GitHub code search.

Matt Carey (57:00)
Yeah, yeah, having GitHub code, but that's what Context7 is meant to be, right? GitHub code search. That's meant to be it. And that's what, what is Sourcegraph?

Wilhelm Klopp (57:05)
or I mean...

That is what Sourcegraph is, yeah. But I think it's, I mean, Sourcegraph, the product, I think is mostly that for giant internal monorepos, but they do have like a public like search GitHub one. I just haven't really seen it used much. Like one of the ways I managed to get around some problems in this migration I was doing is by using this tool called Repomix, which I heard about on the How About Tomorrow pod. Very cool.

Matt Carey (57:32)
I'm not doing very cool. Very, very cool.

Wilhelm Klopp (57:35)
So the idea is like you just dump in like a GitHub URL and then it gives you a giant XML file of all of the files in the GitHub repo. And like for the libraries, like I was migrating from like one mocking library to another, neither of the libraries are like that big. They both kind of fully fit into the context window if you exclude some test files or some other like random big files. And then you just, three just has like the full.

you know, thing, like the full codebase, and you can just be like, how does this thing work in the old library? How can we add that to the new library? So that was like really, really useful. And that would have been a smart thing to try and search for, but I didn't see any of Claude Code or any of the agents like try and find the relevant stuff, even though it was really instrumental for making this refactor work.
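A rough sketch of the "pack a whole library into one file" idea, assuming the repository has already been cloned to a local directory. This is not Repomix's implementation or its real output format; the `<repository>` and `<file>` tag names, the excluded directory names and the size limit are all illustrative assumptions.

```python
from pathlib import Path
from xml.sax.saxutils import escape, quoteattr


def pack_directory(root: Path,
                   exclude_parts: tuple[str, ...] = ("tests", ".git", "node_modules"),
                   max_bytes: int = 200_000) -> str:
    """Concatenate every small text file under root into one XML-like string."""
    lines = ["<repository>"]
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        # Skip directories and anything inside an excluded folder (e.g. tests).
        if not path.is_file() or any(part in exclude_parts for part in rel.parts):
            continue
        # Skip random big files so the result has a chance of fitting in context.
        if path.stat().st_size > max_bytes:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # Not valid UTF-8 text, most likely a binary file.
        lines.append(f"<file path={quoteattr(rel.as_posix())}>")
        lines.append(escape(text))
        lines.append("</file>")
    lines.append("</repository>")
    return "\n".join(lines)
```

Packing the old and the new library this way and pasting both results into the prompt is what makes a question like "how does this work in the old library, and how can we add it to the new one" answerable in a single turn.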

Matt Carey (58:20)
Yeah.

Yeah. Originally what

I ended up doing is finding the right thing, copying the GitHub raw URL and pasting it in and being like, this is how I want you to do it. Like regularly. The raw URL, the one that just returns like text. And just like, you know, this is how I want you to do it. Do it like.

Wilhelm Klopp (58:34)
Wait, sorry, copying the what GitHub link?

cool. Yeah.

Right, right.

And you paste that into Cursor.

Matt Carey (58:44)
Into Cursor, into code, into whatever really. Often it's the most useful when you're like building a prompt really methodically and you're like, we're going to take this structure example here. We're going to use this library example here. And I feel like that's very useful if you're writing it as a GitHub issue, whereas Cursor feels a little bit more like ephemeral. Like I could do it, I could roll it back, I could do it, I could roll it back. Whereas in a GitHub issue, it's more like

Wilhelm Klopp (58:46)
Right. Yeah.

Matt Carey (59:08)
this shit lasts forever. Let's make it good this time. So I think more about it then.

Wilhelm Klopp (59:13)
Just one more thing. I know we've been talking for a while, but this Diátaxis docs framework that I was talking about, I quite like it. At the core it is this two by two matrix. And obviously we love a two by two matrix. But the idea is just like, how should you structure your documentation? And I think like two or three of these are like pretty straightforward. So like the top left is like, you should have a tutorial.

Matt Carey (59:15)
We need to finish this. Yeah, yeah, sure.

Wilhelm Klopp (59:39)
The bottom right is like, you should have reference documentation, right? So like, it goes into big detail about like what literally every symbol in your code does or every function does. And then in the top right, you have how-to guides, which is like how to accomplish specific things, specific goals, like with your thing. But then the thing that I thought is really, really fascinating and relevant to like the question of what to put in the rules files is like, you should explain concepts. Like you should just have like some.

sections that are not a tutorial, not a how-to guide, not a reference, but it just explains like core concepts in your thing, right? So in Kolo it would be like, what is a trace? Or in Simple Poll it's like, you know, everything is around the poll. Everything is about like a channel in Slack, like these kinds of like core underlying fundamentals that like, if you understand them, everything else makes like a lot more sense. So I think like, this is another interesting thing you could talk about in your rules file for whatever

codebase you're working on is like here are the like four or five key concepts that if you really understand them well like everything else would click into place.

Matt Carey (1:00:41)
Yeah, I think Pydantic AI has like a really good method of, like Pydantic in general, like they do this really well. Like their documentation is all about concepts about different things. Then they have a bunch of examples, which I think is missing from this. Where do you think examples fit in that two by two?

Wilhelm Klopp (1:00:52)
Hmm.

how-to guides maybe

Matt Carey (1:01:00)
Yeah. Okay. Fair. Yeah. Like specifically. Yeah. And that's what Modal Labs do so well. Like they have, they must have a hundred examples of how to use their model to do or how to use their, like their infrastructure paradigm to do useful things. Like actually useful things, not flight booking.

Wilhelm Klopp (1:01:02)
Yeah, top right.

Mmm.

Yep.

That's awesome. One last thing I'll say on the what to put in your rules files thing is like a compact representation of some traces I think is like really key. I think most codebases have, and obviously this is where like Kolo comes in. I think most codebases have like sort of a couple of really key transactions that happen in the codebase where transaction is like, I don't know, the thing that like, you know, where a lot of code is hit or where most people

Matt Carey (1:01:26)
Whoa!

Wilhelm Klopp (1:01:43)
do the thing. So in Simple Poll, it would be like creating a poll, voting on a poll. In like an e-commerce site, it might be like the user checks out, right? Like you really want that checkout code. Like probably it's like very optimized, lots of tracking in it, all this stuff. And if you have like a compact representation of a trace that shows like all of the functions that were hit in that key transaction, whether that's checking out or creating a poll or I don't know, in Shippie, it might be like pull request opened or, you know.

new code pushed up or Shippie invoked or something like that, right? Then I think like you teach the agent like not just here's a bunch of files and directories or whatever, but it's like, here's all the functions, here's all the files that were kind of invoked in that key transaction, which I think is probably quite useful.
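A minimal sketch of what a compact trace of a key transaction could look like, using Python's built-in `sys.setprofile` hook. This is an illustration of the idea, not how Kolo records traces; `trace_transaction`, `format_trace` and the poll functions are hypothetical names.

```python
import sys
from pathlib import Path


def trace_transaction(func, *args, **kwargs):
    """Run func and record (depth, function name, file name) for every Python call."""
    calls = []
    depth = 0

    def profiler(frame, event, arg):
        nonlocal depth
        if event == "call":
            code = frame.f_code
            calls.append((depth, code.co_name, Path(code.co_filename).name))
            depth += 1
        elif event == "return":
            depth -= 1
        # c_call / c_return events (built-ins such as dict.get) are ignored.

    sys.setprofile(profiler)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(None)
    return result, calls


def format_trace(calls) -> str:
    """Render the recorded calls as an indented call tree, one line per call."""
    return "\n".join(f"{'  ' * depth}{name} ({filename})" for depth, name, filename in calls)


# Hypothetical key transaction: voting on a poll.
def save_vote(poll, option):
    poll[option] = poll.get(option, 0) + 1


def notify_channel(poll):
    return f"{sum(poll.values())} votes"


def vote_on_poll(poll, option):
    save_vote(poll, option)
    return notify_channel(poll)
```

Calling `trace_transaction(vote_on_poll, {}, "pizza")` and passing the recorded calls to `format_trace` gives a short indented tree of the functions and files that were hit, which is the kind of text that could be pasted into a rules file.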

Matt Carey (1:02:23)
Yeah, definitely, definitely. Man, it's been great fun.

Wilhelm Klopp (1:02:26)
Once again, we don't run out of things to talk about.

Matt Carey (1:02:29)
We don't, we could do this for a long time. An hour of that is enough. What time is it for you?

Wilhelm Klopp (1:02:32)
We could.

Time to do some work.

Well, so I'm in Europe at the moment, right? So like, I was in London last weekend. We didn't see each other sadly. Yeah, I was gonna say this actually. I organized like a 10-year high school reunion and it was a blast. It was so good.

Matt Carey (1:02:43)
God!

You organised this.

When you sent me the picture, I was like, I was like, cool. I don't know any of these people, like, you had a great time. yeah, that makes way more sense. You were proud. I was, I read, I was looking at it. It was like midnight and I was just very confused while you were sending me this picture. And I was very happy. But okay, I'm really glad that you organised it. Very lovely.

Wilhelm Klopp (1:02:59)
I was just proud man. No, we just had

Yeah, yeah. And it was just so fun. Like, I highly recommend organizing your 10-year high school reunion or whatever year it is. It just needs someone to do it then it happens.

Matt Carey (1:03:23)
happens yeah no no no very cool very cool very cool okay

Wilhelm Klopp (1:03:26)
So I'm

in Europe this week and next week. like, yeah. And then the weekend after this one, I'll be in London again. So we should definitely hang out.

Matt Carey (1:03:34)
Cool cool cool cool cool cool. Yeah, no, we definitely should. Yeah, because I'm in Portugal next week. Then I'm back. So I'll be around. Sweet. All bye dude. Bye.

Wilhelm Klopp (1:03:40)
Nice.

Sweet, have a good one. Bye.

Creators and Guests

Host

Matt Carey

ai engineer @StackOneHQ

Host

Wilhelm Klopp

building @kolo_ai
