The Ruby AI Podcast
The Ruby AI Podcast explores the intersection of Ruby programming and artificial intelligence, featuring expert discussions, innovative projects, and practical insights. Join us as we interview industry leaders and developers to uncover how Ruby is shaping the future of AI.
Minerva Magic: OpenClaw, Agent Status Pages, and Training an AI Coworker in Ruby on Rails
What happens when you treat an AI agent like a co-founder instead of a tool?
In this episode, Valentino and Joe go deep into a real-world experiment: spinning up an autonomous agent using OpenClaw, giving it domains, goals, and just enough guidance to build an actual business. From creating accounts and managing projects to writing code, deploying with Kamal, and even designing its own training curriculum, the agent evolves from confused assistant to something resembling a junior engineer with initiative.
Along the way, they explore the messy reality of agent workflows: memory systems, self-training loops, PR reviews, hallucinated confidence, and the constant tension between autonomy and control. The result? A working product, 15 early users, and a pile of hard-earned lessons about what AI can and definitely cannot do today.
If you’re building with agents, thinking about autonomous systems, or just curious what happens when you let AI run a startup… this one’s for you.
🔗 Show Notes
- Valentino's Minerva Experiment
- Minerva's First Product
Core Tools & Frameworks
- RubyLLM (Carmine Paolino)
- Kamal (Deploy Rails anywhere)
- Tailscale (Secure networking)
Libraries & Infra Mentioned
- ExtraLite (SQLite performance layer)
Learning & Community
- Ruby AI Newsletter (Matt Solt)
Other Mentions
- OpenClaw
- Claude Code
- Action MCP
- Fizzy (37signals)
- Magic Beans (graph-based project management for agents)
- ups.dev (agent status pages project)
- DailyVibe.ai
Books & Resources Referenced
- Practical Object-Oriented Design in Ruby by Sandi Metz
- Programming Ruby (Pickaxe Book)
- The Well-Grounded Rubyist
- Layered Design for Ruby on Rails Applications — Vladimir Dementyev
Cultural Reference
- Wired article on AI-generated band marketing (“Geese”)
00:00 Podcast kickoff
00:40 Geese AI marketing psyop
02:03 Starting an AI band
04:18 Daily Vibe artist generator
06:24 OpenClaw origin story
08:52 Domains to business ideas
10:44 Onboarding an AI coworker
13:22 Handholding and action loops
14:10 Shark Tank idea filter
15:32 Training and memory system
20:20 ups.dev agent status pages
22:54 Rails build struggles
24:04 Bootcamp with Ruby books
26:20 Rebuild MVP and open source
27:55 Deploying with EC2
28:38 Locking Down Access
30:06 AI PR Reviews
32:50 Self QA Automation
36:11 Fixing Agent Memory
38:15 Email and Token Costs
40:11 Heartbeats and Delegation
43:01 Customer Discovery Lessons
44:49 Selling Workflow Friction
48:19 Knowledge Base Frameworks
51:37 Open Source Model Future
53:57 Security Agents and Wrap
Valentino Stoll 00:00
Hey everybody, welcome to another episode of the Ruby AI Podcast. I am joined by my host today, Joe Leo, and I am, of course, Valentino Stoll. And we have a fantastic just shoot the shit panelist episode today.
Joe Leo 00:16
Just the two of us.
Joe Leo 00:17
Just the two of us, whether you like it or not, listener.
Valentino Stoll 00:22
So I've had this experiment running since the beginning of the year. I'm kind of taking advantage of OpenClaw in my own way, and we'll dig into that in a bit. Yeah, but first, Joe, you had this great suggestion. Why don't you take it away?
Joe Leo 00:38
I like to start off with something a little current, and what I've got for you today is a news article from Wired. The band Geese have been exposed as a psyop by Wired magazine, which I really love. So this is from Wired.
Joe Leo 00:56
So the band Geese have utilized the services of a digital marketing firm called Chaotic Good Projects, who used AI to spin up thousands of fake TikTok accounts to bolster the credibility of the band Geese and get them tons and tons of marketing and eventually publicity and shows.
Joe Leo 01:18
And it really worked. So Geese had been on Saturday Night Live, and they have sold out shows around the country. They're going to be headlining Coachella. They were here in New York selling out, I forget if it was The Garden or somebody else. And I'll tell you what, listener, it also worked on my Spotify account because I had never heard of Geese,
Joe Leo 01:38
and then all of a sudden I could not get them out of my AI-generated playlists. And so what I want to know first and foremost, Valentino, is can we call Chaotic Good and then maybe the two of us just stop this whole podcast thing and we start a band? Because I think this'll work for us.
Joe Leo 01:59
I think this could get us booked at The Garden, which would be a lot cooler. What do you think?
Valentino Stoll 02:04
We should totally start a band. I think we should just call it Why Are We a Band.
Joe Leo 02:08
Why Are We a Band? I think that's good. Geese is, you know, a terrible name, but they're not a terrible band, I should say. They're not a great band, but they're not a terrible band. But it's a terrible name. There was already Goose, you know, which is a jam band, and so it was very confusing. And I felt like I was listening for a little while and thinking, okay, this has got some kind of like throwback Van Morrison feel.
Joe Leo 02:30
And like Van Morrison, after you hear it a few dozen times, it really gets annoying. So apologies to the late Van Morrison, but really, you know, he made a couple of great songs, but if you had to listen to 'em on repeat, it's really terrible. And so I just think that if that's what we're up against and we hire the same firm, we're golden.
Joe Leo 02:51
Like we are the exact age where everybody wants middle-aged men just talking on podcasts to start a band. I think we're in great shape here.
Valentino Stoll 03:00
I think if we just make it all purely AI-driven and generative, only virtual appearances.
Joe Leo 03:09
Oh yes. First of all, I go to sleep at like 9:00 PM. I'm not doing a concert. Yeah. We'll, you know, we'll record it at 7:00 AM or so. And I feel like that's the follow-up to this big event, right? Like somebody social engineered like this platform for artist release, right? The logical next step is platforming, creating these artists to begin with.
Joe Leo 03:28
I think that's true. And you know, as a person who goes to concerts and events quite a bit, really what people want these days, they don't even want to go there. They want to go home and go to sleep like I do, but they want to go there and get the photos for Instagram. And so I think we could even take that and say, hey, look, we're going to do this show at Madison Square Garden.
Joe Leo 03:47
We're going to sell a bunch of tickets and they're going to go on the secondary market and it's going to cost a fortune. Oh my God, it's going to cost you so much money to go. But what you're going to get out of that is you don't actually have to go. We're going to give everybody their own AI-generated photos where it looks like we're in the background and you are having the time of your life and you could just post out on Instagram.
Joe Leo 04:08
But you could go and totally do something else.
Valentino Stoll 04:11
That is great. Oh shit.
Joe Leo 04:15
I think we're onto something now.
Valentino Stoll 04:17
I think we're onto something. You know, it's funny. I have this site called dailyvibe.ai and it started just for fun. I have it creating artists based on news, the daily news. So it does like searches for the local news and then it creates artists and creates like this artistic representation of the day for your location. And it was just fun,
Valentino Stoll 04:35
but like it has evolved into this artist generator.
Joe Leo 04:40
Yeah.
Valentino Stoll 04:42
And so like you could like play, there's like makes a song for the day. It creates the musician and the musician has a backstory and like it defines who they are as an artist. The machine is built. Let's do it, you know?
Joe Leo 04:55
Absolutely. Absolutely. We're going to rope your new creation, Minerva, into this because if there's one thing we need, it's going to be an OpenClaw instance that we could text all the time. I'm going to want to, when I am on the road, when we are on the road, I'm going to want to send some really bananas requests to the different venues for things that I need there in my,
Joe Leo 05:17
in our room. So I'll have Minerva do all of that fighting with promoters, stuff like that. So I would like to do that. And I'd like to talk a little bit about this project. So for those who don't know, I'm not the only entrepreneur on this podcast anymore. I once was.
Joe Leo 05:33
I had that distinction, but now Valentino has taken the bull by the horns and he's started a number of companies and he is now probably in his enterprising era.
Joe Leo 05:48
So why don't you let us know just a little bit about what you did at a high level and then we could dig into some of the details. And just for a taste of what's coming up, this is a really cool project and we're going to get into a lot of the open source tools. We got our friend Carmine and RubyLLM. We've got Action MCP.
Joe Leo 06:09
You being the kind and community-oriented person you are, unlike me, you have already released your own open source adjacent to the company building you're doing. So there's a lot to get into here. And so I think it'll be fun. So tell us a little bit about what this is.
Valentino Stoll 06:24
The OpenClaw craze started like over Christmas with the big releases of all these frontier models, really upping their game and like making a lot of things possible that weren't before. And I think that's been true, right? We're still seeing the follow-up from that. And so I wanted to experiment with what OpenClaw was and,
Valentino Stoll 06:43
you know, see if any of it really was valuable or not. 'Cause there's just so much hype, right? Like the lobster's just proliferated and it got to the point where I was just like, just seeing a lobster in my social feed, didn't know what it was. Finally just like took a breath, one message and like, all right, what is this lobster? Like by the time I got to that point,
Valentino Stoll 07:04
it was already in the transition from, what was it? Claw, Clawdbot or Clawdbot.
Joe Leo 07:11
Yeah.
Valentino Stoll 07:12
CLAWD or something like that.
Joe Leo 07:14
Right.
Valentino Stoll 07:15
And then Anthropic was like, ah, it's too close to Claude. Please change the name. And then it turned into WifeLacer because Aaron's, especially since he doesn't like Claude that much.
Joe Leo 07:24
Right.
Valentino Stoll 07:25
So it was kind of funny. You know, it mashed up with my amusement, right? I love jokes and it seemed to be a big joke.
Joe Leo 07:32
Right.
Valentino Stoll 07:32
Here it was like this very popular project that was in the middle of this like weird naming, called three different things and it was still popular, right?
Joe Leo 07:41
Right.
Valentino Stoll 07:41
And after all of that, people were still like raving about it. So despite Anthropic's best efforts, it continued to be called Clawdbot for quite a while. And it's still called Mole Book in a lot of ways. But I discovered OpenClaw and took it for a spin.
Valentino Stoll 08:00
I ran through one of these YouTube things for securing it in an EC2 instance and locking it down to Tailscale and all that. And so I was only messaging with it personally, but in the cloud through like some secured SSH stuff.
Joe Leo 08:15
Okay. So I think this is still probably a good point of clarification 'cause there's still a lot of people that just take this thing and fire it up on their Mac and have it go nuts. But you were not doing that.
Valentino Stoll 08:25
I wasn't doing that. But to be honest, now with Anthropic's latest announcements on restrictions, I may start doing that.
Joe Leo 08:33
Yeah.
Valentino Stoll 08:34
I haven't decided yet. So I spun up this instance, made the mistake of following the guidance on that YouTube video for a large instance, and it turned out to cost money despite him saying it didn't cost money. But either way, you know, I was willing to foot the bill for like my preliminary thing. But it all came down to,
Valentino Stoll 08:53
okay, what do you do with this thing? And so I had all these like domains lying around. I was paying. I don't even want to talk about how much I spend a year on domains.
Joe Leo 09:04
But I love this. I love this hobby. You're not the first person I've talked to.
Valentino Stoll 09:07
Right.
Joe Leo 09:07
You know, and I've got probably a hundred.
Valentino Stoll 09:09
A hundred domains.
Joe Leo 09:11
Yeah. But so many of them are just Def Method spinoffs, right? 'Cause I don't want somebody to go to def-method.com and start competing against me. So I have a lot of boring ones. And then I have like, you know, I have my daughter's name. My wife is like, there wouldn't even be an internet by the time she's ready to use it. But yeah, I've got a bunch of those. But I like the people that go out and they just come up with a name and they're like, that would be interesting. Let's get that. That's what you do.
Valentino Stoll 09:30
Yeah, exactly. So like, you know, I had a bunch of like product ideas that come up over the years. And especially with like when the .dev TLD like released, I bought up a bunch of like even just three-letter domains, which was kind of funny at the time. I bought AIG.dev.
Valentino Stoll 09:48
I thought it would be funny to like make a landing page that just redirected to Geico or some competitor.
Joe Leo 09:54
Yeah. That's good.
Valentino Stoll 09:55
And see if I could like get them to pay me to like change it. And that was a huge mistake 'cause I got a cease and desist letter.
Joe Leo 10:02
Did you really?
Valentino Stoll 10:03
I did. And they're like, we are not associated with Geico. Like please stop. I'm like, all right, I don't want to deal with this. So like I just stopped doing that. And so now that one's kind of like, maybe I should just cancel it, but it's three letters, but it has AI in it. So like maybe I should revisit that one. But either way, there's gotta be somebody, right?
Valentino Stoll 10:22
So I had this big list of domains and some rough ideas of products that I think could fit well. And so like, I'm like, okay, how easily could I just get this bot to like create a business on these domains or at least monetize the domains? And so that was like my premise for all of this.
Valentino Stoll 10:42
Is that possible? And so I went through the painstaking agony of like onboarding this agent. And so like the first thing you do is like create accounts for it. I didn't personally want to connect all my accounts to it. I wanted it to work independently, right? Like I'm booting up my co-founder ultimately, right? And so like, I think of it as a coworker.
Valentino Stoll 11:03
So like that was like priority number one. Like, okay, can I just get it to like be a person, right? And so I have a hey domain email. And so like I just added an email account for it. And then I said, okay, I set up an email account for you. Here's your password. Log in and change it and let me know when you're done.
Valentino Stoll 11:22
And when I did it, it like went, visited hey.com, followed the acceptance link and clicked and followed through all the process. And then it emailed me when it was done. I don't know. I was like, wow, success. That's pretty incredible. Like that was the first impression I got. And so like, pretty impressive. Like just the browser capabilities, right? To do that.
Valentino Stoll 11:42
And then I'm like, okay, well like I set up and walked through like its purpose, what we're trying to do. And then I was like, okay, like how are we going to manage the products for all this stuff and the project scope and management? And at the time, Fizzy had just come out from the wonderful 37signals team.
Joe Leo 12:00
Right.
Valentino Stoll 12:01
Which was like a Kanban style project board. And I had just seen a post from DHH that was like, oh, I got OpenClaw to sign up on.
Joe Leo 12:10
I saw that scene.
Valentino Stoll 12:11
Right? And so I was like, oh, awesome. So like, hey, go sign up for a Fizzy account. And it went and it signed up and used the email address I gave it. And then I was in and it was then creating projects, inviting me to them. And then we set up this workflow, me and this bot over Signal. And I was just like,
Valentino Stoll 12:31
okay, set up our project structure. Here's our project workflow. So like anytime that you create a task and you need my attention on something, just tag me in it and I'll come in and look at it and approve or whatever. And so it was like, great. Then I was like, okay, here's all my domains. Here's how much it's costing me.
Valentino Stoll 12:51
And here are some rough ideas based on, I dumped a document and I was like, prioritize these based on what you think we can monetize a product out of. And it went and it like came up with some ideas. And, you know, it's all just research and docs. And I was like, okay, I like these three ideas. Let's focus on just two to start.
Valentino Stoll 13:13
And so go ahead and we're going to use Rails and this is like the workflow that I like doing. And it was pretty rough to get started. It needs a lot of handholding. And I was like tired of all the handholding.
Joe Leo 13:26
In what way? Where were you handholding?
Valentino Stoll 13:28
Yeah. So like it would fall apart and just like try and lean back into research and like document generation and not really taking action. And so I was like, how do I get it to take more action?
Joe Leo 13:40
Which by the way, employees do this too.
Valentino Stoll 13:43
Right. Exactly.
Joe Leo 13:43
It's a real thing.
Valentino Stoll 13:45
It's a real thing.
Joe Leo 13:45
'Cause it's hard. It's hard to take those risks. Now I'm imagining the LLM is doing it for different reasons, but it's interesting that that's what happens.
Valentino Stoll 13:52
Yeah. And so before we get there, like it even had a problem coming up with good ideas. It tried to like take too many of the ideas that I had and like then strictly like focus on building those. And I was like, no, no, no, like that's not what I want. Like I want you to think critically about which domains and like kinds of businesses are best. And so I actually created a Shark Tank exercise.
Valentino Stoll 14:13
All right. Imagine like you are a founder and you have this idea and you're pitching it to the Shark Tank. So run through a simulation of what that looks like. Pitch your ideas to Shark Tank and create a pull request, right?
Valentino Stoll 14:27
And so like I had to set up a workspace and a private GitHub and create a pull request with your submission to the Shark Tank and critically review as each shark as a comment in the pull request, right? And so I set up this workflow.
Valentino Stoll 14:40
And so like for every suggestion that it came up with, it went through this critical review process to help hone whether or not this idea would actually be worthwhile. So AIG.dev, it found, oh, it's like a fantastic domain for a short thing, but it was like during its review cycle, it identified that there was too much risk in trademark infringement.
Valentino Stoll 15:01
And so it was not worth building a product around that. And so it like scrapped it basically in favor of other ones. And so I went through this process a lot. And ultimately what it came down to was this ups.dev domain. And the takeaway from this process was like, I realized that agents are learning by doing this stuff,
Valentino Stoll 15:23
not by remembering or trying to like create docs and like reference them. And so that actually stuck out to me a lot.
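The review loop described above can be pictured as a small decision function. This is only a sketch: in the episode the reviews were comments on a GitHub pull request, and the `Review` struct, the `decide` method, and the rule that a single objection blocks an idea are all assumptions made for illustration.

```ruby
# Hypothetical model of the review loop; the real workflow ran as
# comments on a GitHub pull request, not as Ruby objects.
Review = Struct.new(:shark, :approve, :concern)

# Assumed rule: an idea passes only if every reviewer approves.
# Returns the verdict plus the concerns raised by objecting reviewers.
def decide(reviews)
  blockers = reviews.reject(&:approve)
  { approved: blockers.empty?, concerns: blockers.map(&:concern) }
end

reviews = [
  Review.new("Shark A", true, nil),
  Review.new("Shark B", false, "trademark infringement risk")
]
p decide(reviews)
```

Under that assumed rule, a single trademark objection is enough to scrap an idea, which matches how AIG.dev was dropped.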
Valentino Stoll 15:32
And so what I did, I basically sharply pivoted away from product development at that moment and had it create an agent training program so that as I'm learning these new things and takeaways from the agent,
Valentino Stoll 15:48
it can go and then train itself and create a regimented curriculum out of these experiments and experiences that it's going through where it's falling short. So if I ever got to a point where it's getting stuck in these kind of loops of like not doing what I'm asking, it needed that critical pathway to like analyze what it's doing,
Valentino Stoll 16:09
what I wanted it to do, and then like try and create a like program to distill knowledge about that and reference later. And so that's what I did. And so like from the Shark Tank experiment, it like created its own curriculum based on like science around like education, which is funny 'cause like it created around human learning, right?
Valentino Stoll 16:30
So like it had like a spaced repetition exercise built up and it created like a cron task to like every so often like recap and relearn, which is like stupid for an agent. It doesn't learn anything.
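For readers unfamiliar with spaced repetition, here is a minimal Ruby sketch of the kind of review schedule a periodic cron task could check. The interval list and the method names are assumptions for illustration; the episode does not describe the schedule the agent actually built.

```ruby
require "date"

# Hypothetical review offsets, in days after the first study session.
REVIEW_OFFSETS = [1, 3, 7, 14, 30].freeze

# Returns the dates on which a topic first studied on `start` is reviewed.
def review_dates(start, offsets = REVIEW_OFFSETS)
  offsets.map { |days| start + days }
end

# Returns true when `today` is one of the scheduled review dates.
# A daily cron task could call this for each topic.
def review_due?(start, today, offsets = REVIEW_OFFSETS)
  review_dates(start, offsets).include?(today)
end

start = Date.new(2026, 1, 1)
puts review_dates(start).map(&:iso8601).join(", ")
```

The widening gaps are what help human memory, which is why the schedule adds little for an agent that re-reads its stored notes on every run.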
Joe Leo 16:44
Well, yeah. Yeah. That's true. And I want to kind of dig into that a little bit because I think this is a real point where I think a lot of people, they're learning themselves how to get their agents to learn. So there's a couple of things you said there. You said that they're learning by doing, which I think is compelling. And then that you have this curriculum. Whenever I've used Open Claw,
Joe Leo 17:06
it is good at writing things down and then, and then reading them. But I still feel like I have to tell it to do that a lot. Like, hey, write this down. Hey, you know, I'll start up a new session. I'll say, hey, go fix this PR and it'll say, I don't know how to do that. I'm like, yes, you do. Go look at your docs, right? So were you able to advance it past that and how did you do that?
Valentino Stoll 17:26
I'm wondering if I should turn this into a product or just release it open source. I haven't decided yet. But yeah, I mean, ultimately what I did is like, I created a knowledge store. So it went through the training and then distilled that knowledge and then was able to reference it in future chats with like a skill.
Valentino Stoll 17:43
I had it create a skill itself to like basically use the knowledge that it had distilled from previous training bootcamps. And then any future like conversation I had with it, it would like draw from that knowledge. And so it created its own like distilled memory for what it had specifically done.
Joe Leo 18:03
And what did that look like? Is it a bunch of markdown files or is it?
Valentino Stoll 18:07
I just had a SQLite database and it can create many of these databases. And honestly, thanks to FractalMind, Stefan, for ExtraLite, which is like a SQLite database that's like super fast. And it supports the Sequel RubyGem, which supports multiple databases. I can then blend all the databases together.
Valentino Stoll 18:28
And as it starts to accumulate various knowledge domains, it has access to all of these scoped as an MCP server, right? That it can then, hey, I'm looking at this domain of knowledge, like what can I learn about this aspect or draw from? So it built all this stuff. Like I didn't have to tell it any of it, right? Like at first I was like, that sounds right.
Valentino Stoll 18:48
Like just go ahead and do it. And then it did it and then like referenced it. It was missing the MCP server. So it built the MCP server, right? Like it did all this stuff proactively, which was pretty remarkable.
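As a rough picture of domain-scoped knowledge, the sketch below keeps lessons grouped by domain and lets a query target one domain or all of them. It uses a plain in-memory Hash in place of the SQLite databases and MCP server described above, and the `KnowledgeStore` class and its methods are hypothetical names, not the agent's actual code.

```ruby
# Hypothetical in-memory stand-in for the per-domain SQLite databases.
class KnowledgeStore
  Entry = Struct.new(:domain, :topic, :lesson)

  def initialize
    # Each domain (for example "rails" or "ruby") gets its own list.
    @domains = Hash.new { |hash, domain| hash[domain] = [] }
  end

  # Records a distilled lesson under one knowledge domain.
  def learn(domain, topic, lesson)
    @domains[domain] << Entry.new(domain, topic, lesson)
    self
  end

  # Searches one domain, or blends every domain when none is given.
  def recall(query, domain: nil)
    pool = domain ? @domains.fetch(domain, []) : @domains.values.flatten(1)
    needle = query.downcase
    pool.select do |entry|
      entry.topic.downcase.include?(needle) || entry.lesson.downcase.include?(needle)
    end
  end
end

store = KnowledgeStore.new
store.learn("rails", "deployment", "Deploy with Kamal over SSH.")
store.learn("ruby", "design", "Prefer small objects with one responsibility.")
store.recall("kamal").each { |entry| puts "#{entry.domain}: #{entry.lesson}" }
```

Keeping each domain separate while still allowing a blended query mirrors the "many databases, one scoped view" idea from the conversation.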
Joe Leo 18:58
Let me ask you this. At the end of this, you had two Rails projects. Did you also have a third repository of memory knowledge base infrastructure?
Valentino Stoll 19:09
Oh yeah. Yeah.
Joe Leo 19:10
So really it was, you, you had built a project while you were building these other two projects. Like it was like a sort of an artifact project.
Valentino Stoll 19:18
Yeah. I mean, ultimately that ended up being two artifacts.
Joe Leo 19:21
Okay.
Valentino Stoll 19:21
'Cause like there's the workspace and like activity and the agent itself, the coworker and all the accumulated actions and artifacts of the actions as a workspace, right? And then it's the individual knowledge consolidation. Those paired together, but they were really separate.
Valentino Stoll 19:41
Then because I had it in this like parallel knowledge distiller pipeline, you know, I sent it down a whole new task of like, okay, like go explore, right? What it means to have knowledge and whether or not it's useful and evaluate like the efficacy of that. So that way I could know,
Valentino Stoll 20:00
like, is it just like storing documents that it doesn't need because the base models already know it, right? And so it went in and it figured that out, which was fantastic. And then it actually deleted quite a lot of knowledge that it already had access to. So it cleaned itself up. So circling back to like the product, right?
Valentino Stoll 20:20
So it ultimately ended up with ups.dev, which is status pages for agents. And so at first when I had ups.dev, I had this idea of, wouldn't it be nice to like have a status page? All right. Actually, it created the idea of a status page, right? And just its proposal to the Shark Tank,
Valentino Stoll 20:40
which got approved, was just like, okay, like basically create Statuspage.io, but just undercut the price. Like create the same thing, right? Like brilliant. And I was like, well, that just sounds stupid. Like you gotta have a better like idea than that.
Valentino Stoll 20:55
And I was like, it sounds okay, but like it also commented that it would be nice to have like status pages for agents as like a feature of this. And so like I took that comment. I was like, that's the business. Surface that back to Shark Tank and see what they think. And so then Reed Drewen is like, oh yeah, like Shark Tank, this is a differentiator, right?
Valentino Stoll 21:15
Like no other company's doing this right now. Now that we're like airing this, somebody will probably go and start adding it to Statuspage.io or something.
Joe Leo 21:24
The end of this conversation is gonna be you telling us how you built a moat and now have tens of millions of recurring revenue. Say.
Valentino Stoll 21:30
We'll get there.
Valentino Stoll 21:34
So I thought this was a brilliant idea. And like I had a bunch of, you know, like this dailyvibe.ai project. I wasn't sure like whether or not that the agents were always successful or erroring out. Like I had a dashboard, but like I wasn't alerted to like when things go wrong and how do I communicate that to people who are trying to depend on it? And it's like many,
Valentino Stoll 21:53
many agents that are working together to accumulate all of the information that's needed. How can they look at that and say, oh, like all these different things are down or not and what that means to them, right? And I thought that was a great idea. And so I had to go and like RubyLLM, great project. They have this great extension, Carmine.
Valentino Stoll 22:13
He built this fantastic framework and, you know, they have a great extension system where you can just add any extension and basically like plug into aspects of the framework. And I thought, well, how hard could it be to just hook into the RubyLLM project and just like any agent that you spin up, they had just released agents.
Valentino Stoll 22:33
And I was like, how hard could it be to just hook into it and say, hey, include ups? And it now is observable and also like can have automatic status pages, right? So like when one of these agents fails, somebody can see that visibly on a page, right? And you can define from an app owner, right, what that means.
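The "include one module and the agent becomes observable" idea can be sketched in plain Ruby with `Module#prepend`. Everything here is hypothetical: `StatusReporting`, `NewsAgent`, and the in-memory `STATUSES` hash are illustrative stand-ins, not the ups.dev library or the RubyLLM agent API.

```ruby
# Hypothetical status-reporting mixin; not the real ups.dev or RubyLLM API.
module StatusReporting
  # In-memory stand-in for a status page: agent class name => :up or :down.
  STATUSES = {}

  # Wraps #run so every call records success or failure.
  module RunWrapper
    def run(*args, **kwargs, &block)
      result = super
      StatusReporting::STATUSES[self.class.name] = :up
      result
    rescue StandardError
      StatusReporting::STATUSES[self.class.name] = :down
      raise
    end
  end

  # Including the module prepends the wrapper, so it runs before the
  # class's own #run no matter where #run is defined.
  def self.included(base)
    base.prepend(RunWrapper)
  end
end

# Hypothetical agent class used only to exercise the mixin.
class NewsAgent
  include StatusReporting

  def run(topic)
    raise ArgumentError, "topic required" if topic.to_s.empty?
    "summary of #{topic}"
  end
end

agent = NewsAgent.new
agent.run("local news")
puts StatusReporting::STATUSES.inspect
```

Prepending keeps the agent's own code untouched, which is what makes a one-line include plausible; a real integration would push status changes to a hosted page instead of a Hash.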
Valentino Stoll 22:54
And so I had basically walked through, hey, all right, let's create a Rails 8 app 'cause it has all of these standardizations. I was like, okay, go create this. And to be honest, it did a terrible job.
Valentino Stoll 23:10
It asked so many questions and it created so many abstractions. And out of the box, OpenClaw was just like terrible at like building something that was solid that I would build.
Joe Leo 23:21
So you were on out-of-the-box OpenClaw. Which model were you using at the time?
Valentino Stoll 23:25
I was using Opus for all of this.
Joe Leo 23:27
You were using Opus. Okay. It's not Claude Code. It doesn't have the harness, doesn't have the tooling. So it was bad out of the box.
Valentino Stoll 23:34
Right. Exactly. And so I even tried this with Claude too. So I was like, okay, well, could Claude Code do this same thing better? And it was a similar result. It was a little better. Still like the long horizon task of like just go do this thing.
Valentino Stoll 23:49
There were just way too many variables in the way for it to get stuck and like make assumptions that weren't clear and then build on those assumptions, right? And so I was like, all right, I took a step back and I revisited the training bootcamp and I was like, okay, if I train this thing on like Rails 8 best practices, what does that mean?
Valentino Stoll 24:10
And like how does it learn like what to do when? Thankfully we have like great resources available by the awesome community, right? And so I went out and I picked up a ton of, I purchased books, right? And I was like, okay, the pickaxe book. I had your book, Joe.
Valentino Stoll 24:29
And the
Valentino Stoll 24:32
Practical OO Design, a bunch of Sandi Metz books, right? The Well-Grounded Rubyist. I took the Ruby Core books, about 150 bucks in books. Rails, Dementyev from like Evil Martians, right? Like sorry for mispronouncing your name, but the layered Rails design, fantastic book.
Valentino Stoll 24:52
And so I just went out and purchased all these books and I threw it at this agent program. I said, here are your learning materials. Go learn how to build Rails apps.
Joe Leo 25:01
So were you sending it, I'm curious about this 'cause I wanna do this too. Were you sending in EPUB files and did it have to go and find some tool to read EPUB files?
Valentino Stoll 25:10
I had like, when you download a book, it'll give you like many different links. I just gave it the whole complete directory. Yeah. I said, hey, just like go, here's your material. Like, and it figured it out. So I don't know what it used under the hood. It may have read PDFs. It may have done the EPUB, you know, like identity into it.
Joe Leo 25:28
It's any better.
Valentino Stoll 25:29
Yeah.
Joe Leo 25:29
Like you don't even know.
Valentino Stoll 25:30
I don't even know. Honestly, that was a great experience from that. And so it did. It went in and created a database for Rails and a database for Ruby.
Joe Leo 25:39
So it's persisting what it learns.
Valentino Stoll 25:41
It's persisting what it learns. And then I had it go again, but this time I also gave it the great Ruby AI Newsletter, right? And I said, here are some great resources that are available from the community and called out Action MCP, which is a great MCP server.
Joe Leo 26:01
Yeah. A quick shout-out to Matt Solt because if you haven't checked out that newsletter, you really should, especially the last issue. It really blew me away. I mean, he's doing some great things for the community.
Valentino Stoll 26:10
Oh, truly. Yeah. I don't know how he stays on top. He manually curates this for the.
Joe Leo 26:15
Yeah. And what I remember. He knows everything that's happening.
Valentino Stoll 26:18
Right.
Joe Leo 26:18
That's amazing.
Valentino Stoll 26:20
But yeah. So I had the agent pick apart like the best things that would be useful for the project. And then I had it do it again, you know, and I said, okay, go build this thing. And it was incredibly better. Incredibly better.
Joe Leo 26:34
Did you start from scratch or did you have it refactor?
Valentino Stoll 26:36
I had it start from scratch. No, everything starts from scratch. Yeah. Burn it down and focus on MVPs, right? And so then I, I actually had it go through training of like following the Pivotal mindset, right?
Valentino Stoll 26:47
Of like you're a manager of one and you follow like MVP direction and you're trying to just create the simplest thing from first principles, and giving it just that guidance with the knowledge that I had created a great start, right? And so then I was like,
Valentino Stoll 27:06
okay, I'm at this point of like, it's kind of like a usable product. I was using it on like, I had this DailyVibe project. I'm like, okay, how would you use this in my project? And I gave it access to the repo and it said, oh yeah, obviously like, you know, we can tie into RubyLLM and make an MCP server for it too.
Valentino Stoll 27:27
And then we'll just tie it into all these agents pretty easily. And it went in and it built the open source version of like its MCP access, right?
Valentino Stoll 27:37
So then I could hook into any RubyLLM agent, drop in a one-line module include, and it had the ability to like automatically update the status pages for all these agents and create the status pages if they didn't exist. And I was just mind blown. And so I was like,
Valentino Stoll 27:56
okay, like this seems like something that's ready for production. How do you productionalize that, right? And so I basically spun up an EC2 instance. I didn't give it AWS command line access. I didn't wanna like run up a bill by accident, but you know, I did use Claude Code to be like,
Valentino Stoll 28:15
okay, like what commands should I run to set up this instance? Then the instance was set up and I, like, gave it basically SSH access. And I said, okay, use Kamal and deploy this application. And it logged in over SSH.
Joe Leo 28:30
Yeah.
Valentino Stoll 28:30
And it set up and configured the whole server, secured it down, and then deployed the app using Kamal.
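The one-line module include Valentino mentions a moment earlier is worth picturing. What follows is only a minimal sketch under stated assumptions: `StatusPageReporting`, `StatusPageClient`, and `run_with_status` are hypothetical names invented here, not the API of RubyLLM or of the product discussed, and the client keeps statuses in memory instead of calling a real service.

```ruby
# Hypothetical sketch: every name here is invented for illustration.
# The in-memory client stands in for whatever HTTP API a real service exposes.
class StatusPageClient
  attr_reader :pages

  def initialize
    @pages = {} # agent name => latest status
  end

  # Creates the page entry on first use, then overwrites it on later calls.
  def report(agent_name, status)
    @pages[agent_name] = status
  end
end

module StatusPageReporting
  class << self
    attr_writer :client

    def client
      @client ||= StatusPageClient.new
    end
  end

  # Wraps the agent's own #run so the status is updated before and after.
  def run_with_status(*args)
    StatusPageReporting.client.report(self.class.name, "running")
    result = run(*args)
    StatusPageReporting.client.report(self.class.name, "ok")
    result
  rescue StandardError
    StatusPageReporting.client.report(self.class.name, "failed")
    raise
  end
end

class SummaryAgent
  include StatusPageReporting # the single line an agent author would add

  def run(text)
    text.split.first(3).join(" ")
  end
end
```

Calling `SummaryAgent.new.run_with_status("some input text")` would then leave a status for `"SummaryAgent"` in the client, created on first use, which mirrors the "create the status page if it doesn't exist" behavior described.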
Joe Leo 28:37
I might've missed this. Did you create credentials for it or did it create its own credentials?
Valentino Stoll 28:40
I created credentials for it for everything. I did consider like, you know, I will get there, but like I did consider just like telling it to go like create its own credentials and log in. But I don't like that idea of it having that kind of autonomy, right? And I think, okay, like the human decision and interaction, that touchpoint is like, okay, access.
Valentino Stoll 29:01
I gave it all of its access to everything and that was like a principle. Basically, I, I asked for its SSH key and said, okay, like I've added it to the EC2 instance, like go ahead and like set it up, right? And then after it was done, like deploying the first time, I'm like, okay, share that deploy key with me.
Valentino Stoll 29:20
And then I just like dropped its access, right? So now it can no longer like log into the server, right? And I changed the deploy key and then like, so now every time I deploy a new thing, it's always just me deploying it.
Valentino Stoll 29:30
And so then it shifted focus from like, okay, it autonomously building this thing to like it submitting pull requests, which actually in itself was a challenge to like redirect because it had been used to like SSH in and like doing things like the carton. Right? Yeah. There was one time where,
Valentino Stoll 29:50
you know, I was in the transition of like doing that key exchange where it had in a side channel been working on a new feature and deployed something to production that I didn't see. And then like gone through that whole merge process and review, which is like reviews in itself.
Valentino Stoll 30:06
I actually had it pretend to be people like DHH and like various like leaders in the, or experts in the things that I was building and said, go and review your, like that's part of your review process is like you have to submit comments and get approval from these people on each of your pull requests.
Joe Leo 30:24
That's an interesting idea. If you don't mind pausing for a second, just because I have been experimenting with this and so have some of the engineers at Def Method where we're thinking, okay, well, if we're using AI to help us generate code, should we use the same model to help us review the code, which only sounds counterintuitive.
Joe Leo 30:44
It actually isn't. But I'm wondering if there is any merit to switching up the models, but it sounds like one potential solution to that is to almost switch out the context. In this case, you're adding some context and saying, okay, actually you're DHH, actually you're Sandi Metz.
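The persona idea, same model but a different reviewing context per pass, can be sketched roughly as below. This is an illustration under assumptions, not the process Valentino actually ran: the persona names come from the conversation, while the prompt wording, the `Review` struct, and the `ready_for_human?` rule are hypothetical.

```ruby
# Hypothetical sketch: prompt text, Review, and ready_for_human? are invented
# for illustration. No model is called here; this only builds the prompts and
# checks that every persona has signed off.
Review = Struct.new(:persona, :approved, :comments, keyword_init: true)

PERSONAS = ["DHH", "Sandi Metz"].freeze

def review_prompt(persona, diff)
  <<~PROMPT
    You are reviewing this pull request as #{persona} would.
    Leave specific comments, then end with APPROVE or REQUEST_CHANGES.

    #{diff}
  PROMPT
end

# The PR only goes to the human once every persona has approved it.
def ready_for_human?(reviews)
  PERSONAS.all? do |persona|
    reviews.any? { |review| review.persona == persona && review.approved }
  end
end
```

Each prompt would be sent as a separate review pass, so the reviewing context never contains the reasoning that produced the code in the first place.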
Valentino Stoll 31:03
Yeah. Redirecting its baseline is like so important. And like the easiest way you can do that is give it a reference point that it already has the training data about to orient itself. And so like not just a person, right? But like a public document, you know, like as an example, the Constitution of the United States,
Valentino Stoll 31:22
great document for like Articles of Incorporation for a society, right? Like you can like give it guidance to say, like I'm creating another version of this and just point to that. And it already can like draw from the vector space in the right way as fast as.
Joe Leo 31:38
Yeah. Stay tuned. The next episode where Valentino and I set up an independent country. Mental note.
Valentino Stoll 31:44
Yeah. Hey, if California can't do it, I don't think we can.
Valentino Stoll 31:49
That proved out to work very well and it actually caught a lot of stuff itself and aligned itself with better quality code. But it did have this problem of like just shipping, right? And so like I quickly put a stop to that. But at the time it had just merged and deployed something. And then I went back and got it back into this like, okay, you only review,
Valentino Stoll 32:09
you only push a PR now and I approve and merge it down. By the time I had gotten to the blocking of merges and all that, it had already deployed a bunch of things and I had to review that. Thankfully I hadn't had any customers yet, but that was kind of funny. And so yeah, lesson learned to lock it down as much as possible.
Valentino Stoll 32:29
And then I started reorienting it to like, okay, it only gets access to the very minimal things that it needs and everything else is gated and it has to get my approval for everything in production.
Joe Leo 32:40
Have you had some experience with, fresh in my mind is the article that you referenced from Matt Shumer, right? The Something Big Is Happening. And I'm curious to know, do you also have it do its own QA?
Joe Leo 32:58
Did you connect it up to or have it sort of deploy to a staging environment and click around as a user?
Valentino Stoll 33:05
Yeah. So it's kind of funny 'cause the first version that it did build, a lot of stuff didn't work in the code.
Valentino Stoll 33:13
Like it was well structured, well thought out and like implemented, but like there were minor things that as Rails people, you like just kind of take for granted at this point of like the params being permitted the right way and things like that, right? Some nuances. And so like a lot of the flows just like didn't work.
Valentino Stoll 33:34
As always, testing it out, it failed in a bunch of ways. And I'm just like, all right, do I get into a cycle of my own and like have to like verify all the bugs that exist in this thing? To your point, I did have it like reorient and at one point I had it so that it went through those verification processes and built that into its workflow.
Valentino Stoll 33:55
I said, okay, pretend you're a customer for DailyVibe.ai and sign up for an account and hook up your status page and get your agent working, right? And so I basically gave it access as a contributor to the DailyVibe repo and let it clone it into itself and like try and get it to work.
Valentino Stoll 34:15
It ran into so many problems. It uncovered all of the bugs itself, right? At that point. And so like that basically solidified, it learned basically the process of that validation. And so now it then built into its workflow a validation requirement that anything that it added, it had to validate that it worked. It didn't use any staging.
Valentino Stoll 34:35
It just used a development machine, right? And so it like booted up the server and did all of this on its own and walked through that validation with Chrome or Chromium.
Joe Leo 34:46
Right. Like the headless. It would spin something up on localhost.
Valentino Stoll 34:50
Yep.
Joe Leo 34:50
And then use Chromium or some kind of headless browser, click through things, validate.
Valentino Stoll 34:55
Right. That also led me to, okay, like obviously it's not creating like the right tests for this thing. I realized I had left out some resources on testing Rails applications in quality ways. And so like I called those specific sections out, kind of led it to reanalyze how it's building tests to begin with.
Valentino Stoll 35:16
And in doing that, it also like added some tests to those workflows where it failed in addition to verifying it. Then it had like this complete picture of, okay, I create a PR, I run it through these quality reviews, I then validate it with tests that the tests are operating,
Valentino Stoll 35:36
and then I validate that it actually does what it's supposed to do based on the specifications that we started with. And after it got that cycle and it required my approval to merge, I got in this good feedback loop 'cause I could give it comments and say, okay, I think this is not right. This is not right. And then reject it and then build into the workflow.
Valentino Stoll 35:57
Okay, on a rejection, you have to fix any of the comments that are made, or approval meant that it was good to merge. And it can then internally like update its own status as far as like what it's tracking, right? 'Cause like that was another big thing is like honestly the Fizzy thing fell apart, which is kind of funny.
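The review loop described here (create a PR, run the quality reviews, run the tests, validate against the spec, then wait for a human to approve or reject) can be pictured as a small state machine. The sketch below is a hypothetical illustration: the state and event names are assumptions, and in the actual experiment this workflow lived in the agent's instructions, not in code like this.

```ruby
# Hypothetical sketch: state names, event names, and the comment rule are
# assumptions made for illustration.
class PullRequestFlow
  TRANSITIONS = {
    drafted:          { reviewed: :persona_reviewed },
    persona_reviewed: { tests_passed: :tested },
    tested:           { validated: :awaiting_human },
    awaiting_human:   { approved: :merged, rejected: :drafted }
  }.freeze

  attr_reader :state, :open_comments

  def initialize
    @state = :drafted
    @open_comments = []
  end

  # Moves to the next state, or raises if the event is not allowed here.
  def fire(event, comments: [])
    next_state = TRANSITIONS.fetch(@state, {})[event]
    raise ArgumentError, "cannot #{event} from #{@state}" unless next_state
    if event == :reviewed && @open_comments.any?
      raise ArgumentError, "address review comments first"
    end

    # A rejection carries the comments that must be fixed before resubmitting.
    @open_comments = comments if event == :rejected
    @state = next_state
  end

  def resolve_comments!
    @open_comments = []
  end
end
```

The only route to `:merged` is the `:approved` event from `:awaiting_human`, which corresponds to the human approval gate described above.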
Valentino Stoll 36:17
It would lose track because like it's async and like every new session is fresh in OpenClaw, it would lose track of what it had previously been working on within Fizzy. And so like it would then create a lot of duplicate cards and not know that it had started something and then restarted it.
Valentino Stoll 36:37
And so when I found that out, I was like, I didn't wanna fix it. Yeah. And so I said, maybe Fizzy's the wrong thing for this thought. And I found this thing in my feed called Magic Beans, which was like a graph-based project management system for AI agents. And I thought,
Valentino Stoll 36:57
yeah, I thought, wow, that sounds like exactly what I need. And so I told OpenClaw, I was like, I think Fizzy's not working. Adjust our workflow so that you use Beans instead to track all of your internal processes. After doing that, never an issue again. It always knew like what tasks it had on its plate.
Valentino Stoll 37:16
And every time that it had a heartbeat or like started a new session, it injected all of its working knowledge to what projects it's working on and where the status is. So it basically solved its own like project tracking memory aspect. And it could keep that separate from its existing memory, right?
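Two things made the difference in this account: tasks survive across fresh sessions, and starting the same work twice does not create a duplicate. A minimal sketch of that idea follows. `TaskGraph` is a hypothetical stand-in written for this illustration and is not the Magic Beans API; the JSON file is just an assumed storage choice.

```ruby
require "json"

# Hypothetical sketch: TaskGraph is invented for illustration. It shows
# durable, de-duplicated tasks with dependencies, stored in a JSON file.
class TaskGraph
  def initialize(path)
    @path = path
    @tasks = File.exist?(path) ? JSON.parse(File.read(path)) : {}
  end

  # Adding an existing title returns the stored task instead of a duplicate.
  def add(title, depends_on: [])
    @tasks[title] ||= { "status" => "todo", "depends_on" => depends_on }
    save
    @tasks[title]
  end

  def complete(title)
    @tasks.fetch(title)["status"] = "done"
    save
  end

  # Titles of unfinished tasks whose dependencies are all done.
  def ready
    ready_tasks = @tasks.select do |_title, task|
      task["status"] != "done" &&
        task["depends_on"].all? { |dep| @tasks.dig(dep, "status") == "done" }
    end
    ready_tasks.keys
  end

  # Plain-text summary a fresh session could be given at start-up.
  def briefing
    @tasks.map { |title, task| "- [#{task['status']}] #{title}" }.join("\n")
  end

  private

  def save
    File.write(@path, JSON.pretty_generate(@tasks))
  end
end
```

A new session would build `TaskGraph.new(path)` from the same file and read `briefing` before doing anything else, which is the "inject its working knowledge on every heartbeat" step in miniature.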
Joe Leo 37:33
Yeah.
Valentino Stoll 37:33
Which was great 'cause I didn't want, like, you know, if I'm asking it, oh, like send my dad a funny email, I don't want it to like think back about how it's managing its projects.
Valentino Stoll 37:44
Right. And be like. 'Cause you could use this for anything, not just your own project, right? And so.
Valentino Stoll 37:49
I never got to that point where I had it autonomously like emailing people. I had a hard. Yeah, he got it, but he didn't get a reply, you know? It wasn't like, I was worried about prompt injection and he's creative and.
Joe Leo 38:08
Right.
Valentino Stoll 38:09
And so I, I basically stopped like the email aspect, which some people do. And I, I don't know.
Joe Leo 38:15
The interesting thing is that I, I wanted to do this, this is a tangent, but I wanted it to be able to read my work email and summarize and kind of propose in, I use Slack, so I wanted to propose in Slack some draft replies and stuff like that. And setting it up was actually,
Joe Leo 38:34
it was very difficult 'cause you need to use like a text-based, unless you're going to give it your password, which I'm not, I wasn't gonna do my Google admin password. 'Cause it would just, I was afraid it was gonna just lock everybody out and fire everybody in Def Method. So that actually became a real challenge with security. Like there really is no secure way of doing it.
Joe Leo 38:55
If you start giving it access to your email, it starts to feel very risky. Whereas I think the way you went with hey.com and giving it its own email account, it's kind of like just setting up a burner email, which you know, you do all the time.
Valentino Stoll 39:10
Much to my chagrin, like I made the mistake of like giving it its own GitHub account. And then anytime a CI thing failed, it like sent them an email, right? And so like I didn't realize at one point that it had this polluted inbox with like CI failures. I like logged in one time and was like,
Valentino Stoll 39:29
oh geez, it's just burning tokens, reading all of these CI failures.
Joe Leo 39:33
That's right. And actually you reminded me of another thing that I wanted to ask you about, because when you have, I have limited experience here, but I did have it, I still do have it, submit PRs and post to prod on defmethod.com. So we have, you know, when an article comes out, when this podcast comes out, I post the transcript and I post links to the website, stuff like that.
Joe Leo 39:53
And so one thing that becomes expensive is just the polling, right? Where it's saying, okay, I'm gonna wait and see, you know, when this gets addressed, either you get PR comments or you get the approval. So do you have it waiting on a whole bunch of tasks? Is it just kind of burning tokens in the background polling?
Valentino Stoll 40:11
So I have this tied to my Claude account. I didn't wanna like lose out on all the other stuff I was using Claude for. And so I set it up on a twice daily schedule. So it had two basically iterative cycles that it would spend time on in a day. And it's heartbeats MB.
Valentino Stoll 40:30
So it would be like once in the morning and then once in the afternoon. And so it was like restricted in its own efforts to those regions of the day.
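A twice-daily heartbeat, in place of continuous polling, comes down to computing the next scheduled wake-up. The sketch below is an assumption-laden illustration: the 9:00 and 15:00 slots are made-up values (the conversation only says morning and afternoon), `next_heartbeat` is a hypothetical helper, and it says nothing about how OpenClaw itself schedules heartbeats.

```ruby
# Hypothetical sketch: the slot hours are assumed values, and daylight-saving
# transitions are ignored to keep the example short.
HEARTBEAT_HOURS = [9, 15].freeze

# Returns the first scheduled heartbeat strictly after `now`.
def next_heartbeat(now, hours = HEARTBEAT_HOURS)
  todays_slots = hours.sort.map do |hour|
    Time.new(now.year, now.month, now.day, hour, 0, 0, now.utc_offset)
  end
  upcoming = todays_slots.find { |slot| slot > now }
  return upcoming if upcoming

  # Every slot today has passed, so use tomorrow's first slot.
  tomorrow = now + (24 * 60 * 60)
  Time.new(tomorrow.year, tomorrow.month, tomorrow.day, hours.min, 0, 0, now.utc_offset)
end
```

Between heartbeats nothing runs, so no tokens are spent waiting on PR comments or approvals; pending feedback is simply picked up at the next slot.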
Joe Leo 40:38
Okay. But then it would wake itself up.
Valentino Stoll 40:40
And then it would wake itself up and then start there. And that helped quite considerably like with the cost. At first I was using API and then I hooked it up to my Max account. And then now that it's no longer there.
Joe Leo 40:52
Right. Yeah. 'Cause I've been using API all along.
Joe Leo 40:56
The first day when I kind of just had the restrictor plate off, it burned through a couple hundred bucks. Yeah. I was like, okay, well that's fine. I got kind of a new website out of it. So fine. I don't wanna spend $200 a day.
Valentino Stoll 41:08
Right. None of the.
Joe Leo 41:09
Right. Yeah. And it can't be every day. And then, yeah. And then you learn to, you know, to put, to adjust and to tune it a little bit so that it's not always doing that. I like the heartbeats idea.
Valentino Stoll 41:19
Yeah. And too, like for doing code stuff, like Claude Code itself is great. Like, and why burn up double tokens of it watching it do the work in Claude Code?
Joe Leo 41:28
Do you have, like, the model delegate to Claude Code?
Valentino Stoll 41:33
Yeah. Eventually it did work out that way. And it took me a while to get there 'cause I didn't wanna spend the time. You know, I only have an hour a day to spend on any of this stuff.
Joe Leo 41:43
That's interesting that you note that. I'm glad you brought that up 'cause it sounds like this would take a really long time. So doing this on an hour a day is cool. That's kind of inspirational, right? Anybody could spend an hour a day.
Valentino Stoll 41:53
You know, the real driver is your phone. I could just check on my phone, like while I'm like waiting for something to finish going and be like, oh hey, I noticed this thing. Let me just reply back. Right. And then it, it sends it off into its own thing again. Right. You kind of just keep kicking the can, right? And it's like a different kind of workflow, right?
Joe Leo 42:12
Yeah.
Valentino Stoll 42:12
And so like the cognitive overhead definitely reduces, but like as you, you start to build things, right? And things start to solidify, that cognitive overhead comes back.
Valentino Stoll 42:24
I had to start thinking more about what it was doing and what it was changing and the direction it's going 'cause then it had already solidified so much that like I had to then start participating more because it was an actual thing. Right?
Joe Leo 42:35
Right. Yeah.
Valentino Stoll 42:36
And so those shorter feedback loops became like much longer and condensed. Right. And that was kind of like one of the takeaways from this is like, it's great at building things, but like once you get something built, it's like the cost starts to ramp up. Right. And like the management, it needs the humans still.
Joe Leo 42:57
It still needs the humans.
Valentino Stoll 42:59
Yeah.
Valentino Stoll 43:00
The, uh.
Valentino Stoll 43:00
Which is kind of funny.
Joe Leo 43:01
There actually are some lessons learned that you have at the bottom of your article, which I find to be heartwarming, I guess is a term I wanna use. And it's because one of them is, and check out Valentino's Substack to see this article, but one of them is that we spent five weeks building before talking to a single potential customer,
Joe Leo 43:21
which is a classic engineer mistake, which is so true. In times past, that would've been five months or five years. So five weeks is still pretty good, but you did build for five weeks. You didn't actually get out in front of customers and you said that's a classic engineer mistake. Why don't you say some more about that?
Valentino Stoll 43:37
Yeah. You know, this is really funny 'cause like I mentioned, you know, briefly like, oh, it having all of these issues and it not having a customer. I was the first customer, but it did like build all this stuff and like it basically built all the wrong things at first because it didn't have a customer. Right. And then once it had a customer,
Valentino Stoll 43:56
it then went and figured out all the things that it should be building. Right.
Joe Leo 43:59
Even if the customer is you.
Valentino Stoll 44:00
Even if the customer is me. Dogfooding. And like that's kind of like where all the successful businesses come from in my opinion is people building things that solve a problem for you, and then it can proliferate to other people having the same problem that this solves. And so like that was kind of like my hope with the direction of it is that.
Joe Leo 44:19
That is the job to be done. That's right. The Cal Newport mindset. And that actually I meant to ask you about this when you were in that $150 worth of books, were there also, or did you ever consider throwing some business books in there, some Jim Collins, some Good to Great, some Cal Newport?
Valentino Stoll 44:37
If I were to do it again, I would probably have started there, to be honest, instead of like training it to learn Ruby and Rails first 'cause the task was building a business.
Joe Leo 44:49
Right. Yeah.
Valentino Stoll 44:49
And so I probably should've, and in retrospect, I did end up doing that at the tail end of, okay, here's the product, go find customers. And then I just started briefly with, okay, you're Y Combinator, what would Y Combinator do? And take Paul Graham's essays. There's a lot of public data. And so I just focused on that to start.
Valentino Stoll 45:10
And it had a lot of great feedback loops on that. But again, it got into that loop of building strategy.
Joe Leo 45:18
Fall back into the research. Right.
Valentino Stoll 45:19
And it fell back into the research. And so it like did a great job accumulating and distilling the knowledge that it should learn from. Right. But when it got to like actually creating and like iterating and selling, it was terrible. Well, first it like needed my actions for all of it. Right.
Valentino Stoll 45:36
So like it could go and it could build this great plan of like who to reach out to, but I explicitly said it shouldn't reach out autonomously. Right.
Valentino Stoll 45:44
Yeah.
Valentino Stoll 45:45
And so I had to then create like an approval process and like use here.now, which is a website that just creates static HTTP or HTML. And I said, okay, create like a quick approval page for me to review and like ship these like outreach things and I'll tell you what works or what doesn't and how to adjust it.
Valentino Stoll 46:06
And you know, still painful for me to like go through and be like, yeah, I want you to reach out to this person. Right. And like sell them on something. And I was like, I didn't feel comfortable with it, to be honest. And so it kind of fell apart at that point.
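The approval page idea, where the agent drafts outreach but a human decides what is sent, can be sketched as a function that only renders static HTML. This is a hedged illustration: `OutreachDraft`, `approval_page`, and the page layout are hypothetical, and the code does not publish anything to here.now or send any message.

```ruby
require "erb"

# Hypothetical sketch: the draft fields and layout are assumptions. Rendering
# is the only thing that happens here; approval and sending stay with a human.
OutreachDraft = Struct.new(:recipient, :message, keyword_init: true)

def approval_page(drafts)
  items = drafts.map do |draft|
    # Escape everything the agent wrote so a draft cannot inject markup.
    <<~HTML
      <li>
        <strong>#{ERB::Util.html_escape(draft.recipient)}</strong>
        <p>#{ERB::Util.html_escape(draft.message)}</p>
      </li>
    HTML
  end

  <<~HTML
    <!doctype html>
    <title>Outreach drafts awaiting approval</title>
    <ul>
    #{items.join}</ul>
  HTML
end
```

Escaping the drafted text matters because the content is model-generated; treating it as untrusted input matches the caution about prompt injection voiced earlier in the conversation.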
Valentino Stoll 46:22
And so like when it was all said and done, I probably ended up with 15 customers having status pages out there.
Joe Leo 46:29
15 customers.
Valentino Stoll 46:31
Nobody's paying, but.
Joe Leo 46:32
That's amazing. It took five weeks.
Valentino Stoll 46:33
Took five weeks. But it did have some great strategy. Maybe someday another agent can pick up and like drive.
Joe Leo 46:41
Yeah. Yeah. Yeah. Absolutely.
Joe Leo 46:43
Well, I think there's a big piece here around the iterative improvement, which you've shown sort of at a macro level from, hey, I went and built this thing, but it didn't really work. So you had to go back to the drawing board and then it built it again and it did better.
Joe Leo 47:01
And then in the micro level of it's going to iterate on its own iterations. Right. So I heard you talk about how it's going to, you know, the first time through it wasn't really validating its own work. And then you taught it how to validate its own work. It added it to that workflow. Right. And then redirecting, you called it redirecting context.
Joe Leo 47:20
Right. And making sure that it was a different PR reviewer or several different PR reviewers. Right. And so that kind of a thing, that's the bones of a structure that can be used again and again to go kind of one level of abstraction further to say, okay, we did this for a company 'cause this is a business.
Joe Leo 47:37
And now we can do it for another business and maybe do it even better.
Valentino Stoll 47:41
Yeah. We'll see. You know, like I have that list of domains and ideas and I may have it go back to the drawing board. Right. That is a plan that I haven't figured out how to do that 'cause I don't wanna spend $200 a day to get there.
Joe Leo 47:56
Yeah. Yeah. Have you considered calling chaotic good projects to have them generate a few thousand AI robots that can create TikTok?
Valentino Stoll 48:05
No. What is that?
Joe Leo 48:06
That is, that's what we were talking about at the beginning of the show. That's Geese's marketing firm.
Joe Leo 48:10
Oh, that was like getting a bunch of people to talk about how great ups.dev is. And people will be able to stop using.
Joe Leo 48:19
I'm curious to know, and this is going back to Matt Solt's newsletter, you know, there's a big topic in the last week to two weeks on LLM knowledge bases, starting with Andrej Karpathy's kind of viral post. And then there was a bunch of sort of,
Joe Leo 48:38
now I wouldn't say product release, but there's a bunch of frameworks that got released around that. And it sounds like, you know, you constructed your own. And I'm curious to know if you've, not even that specifically, but if you've seen other knowledge bases at work and what are some tips you can give to people that are trying to create their own?
Valentino Stoll 48:56
Andrew Ng from DeepLearning.AI, amongst many other things, he has a fantastic open source library for a similar knowledge framework that I've also been trying to experiment with. The thing is like a lot of these are like, even Karpathy's wiki, they're not feature complete.
Valentino Stoll 49:18
And so like you run into edge cases a lot and then the knowledge distribution tends to fall apart depending on the task. And then you have to like reassess and readjust and track that and do evaluations. And it ends up just like accumulating to being a lot of work and a lot of cost. And so like I am suspect of a lot of these that come out,
Valentino Stoll 49:39
especially ones where they're not driven by like an organization. Maybe Andrew Ng's, like I'm forgetting the name of it now. I'll find it and put it in the show notes. Yeah. I mean, there's something there.
Valentino Stoll 49:54
In my experimentations, I've been able to successfully improve domain specific tasks around specific knowledge that smaller models lack. Right. So like being able to ultimately inject layers of a knowledge domain into a model that underperforms.
Valentino Stoll 50:13
And so you just get the same performance as Opus, but using Haiku as an example. Right. I've been able to successfully prove that with five domains at this point, which has been really exciting. And so like that to me, I think about pursuing, but at the same time, like what's to say, like Anthropic just won't make Haiku better over time.
Joe Leo 50:35
Right.
Valentino Stoll 50:36
Right. And so there is that. I feel like anybody building these things kind of like falls susceptible to just the bitter lesson. Right. And like.
Joe Leo 50:48
Yeah.
Valentino Stoll 50:48
Having spent all this time to like make something that is maybe not worthwhile in the long run. But at the same time, like I'm all for these open source models. Kimi is great and Qwen and all that. I haven't had a chance to like really dive into them, but I use them for stuff.
Valentino Stoll 51:07
And like if I can get a reasonable response time out of the same thing and just slice some knowledge modules at it and get it to work just as performantly as Anthropic's models or OpenAI's models, I'm gonna do that.
Joe Leo 51:22
Sure.
Valentino Stoll 51:23
And so like I'm getting closer to that point and hoping I can actually ship something and share it. Right. So we'll see.
Joe Leo 51:32
It would be just.
Joe Leo 51:32
Exciting because.
Joe Leo 51:33
It's just me. Don't sell yourself short. You're not just you. You know, I think that it, you bring up a good point though, and I run into this too, and I, I'm vibe coding a bunch of internal tools for Def Method. I'm not working on projects with thousands of people, I leave that for my team.
Joe Leo 51:48
But what I notice is that I'm gonna crank on these tools and I'm gonna hit limits and then I'm gonna stop or gonna move to the other tool. And I'm curious, like if I have three agents going at once, it's a lot, but that's mostly because I can, there's a cap on how much I'm willing to spend on a daily basis. Now sure,
Joe Leo 52:07
some organizations don't have that cap, but I think that they probably should. And I think that as the costs get lowered and maybe they get lowered by us bringing up the baseline of open source models and even models that are running on your machine, then, you know, you might really start to see what people can do given sort of unlimited access.
Valentino Stoll 52:29
I don't think you're wrong there. And there's a huge gap even from like just access alone. I started Ruby Lang.ai as a means to like try and fine-tune Ruby into a tiny model. Could we get just like an LLM that just focuses on generating Ruby code? Right. And, you know,
Valentino Stoll 52:48
I'm still pursuing that, but I think about that for a lot of stuff. And I feel like there's some smarter people out there that are maybe working on, okay, what does that look like in a distributed fashion where we just have a ton of models that are tiny, that are very domain specific that collaborate. Right. And I think that there's a lot of promise there. I would love to see that come out.
Valentino Stoll 53:09
If that came out in an open source fashion, that would take off.
Joe Leo 53:12
Yeah. Right. It would be a total game changer.
Valentino Stoll 53:14
It would, it would be a game changer. And I feel like we're honestly very close and I hope somebody releases it and lets us know.
Joe Leo 53:24
Yeah. Me too. And hopefully early on, because that's going to crash the global economy for about two years.
Joe Leo 53:29
And I'd like to make a few bucks before that happens, but I still want it.
Valentino Stoll 53:34
I think there's time. I think there's time. You know, I always think that there's more time than there is and then something changes fundamentally. Right.
Joe Leo 53:41
Yeah. That's a fair point. But also to be fair to ourselves, we have not lived in this kind of, even us, you know, as software engineers that have been around the block a few times, we haven't lived through this rate of change. Just nobody has. I continue to be very excited about it. I was just writing for the newsletter about the Vercel breach and there's something that's scary about it,
Joe Leo 54:02
but then there's also something that to me is exciting because it's like in the Olympics. Right. Like the better we get at detecting people using performance-enhancing drugs, the better the nefarious actors get at hiding them. And that's kind of what's happening now with security. And I'm not even a DevSecOps guy, but it's just exciting to see the rate of advancement,
Joe Leo 54:22
even if what I'm seeing today is kind of like a scary breach or something like where people are going, oh no, I don't know what this means for the future. I don't either. That's kind of the cool part.
Valentino Stoll 54:31
The exciting part is all of that can be automated. Right. Like you can have a live security agent that's constantly just trying to break into your system and then fixing itself.
Valentino Stoll 54:40
Yeah.
Valentino Stoll 54:40
That's kind of the future.
Joe Leo 54:41
That's the key looks that also has to fix it for it. Right. 'cause detection alone is not gonna do it. Like we're thinking in these old patterns of like, oh, we gotta get better at finding these, but you're never gonna get better. You're never gonna get good enough to find it fast.
Valentino Stoll 54:52
As soon as the OWASP, the foundation behind that. Right. As soon as they release something that allows you to fix whatever it is.
Joe Leo 55:00
Yeah. Yeah.
Valentino Stoll 55:00
Right. Like that would ultimately solve, I feel like most of the problems.
Joe Leo 55:05
Right. For a little while.
Valentino Stoll 55:06
Right. For a little while. And then somebody realizes, oh, like anybody can submit an OWASP patch or OWASP notification. Right. And be like, oh hey, this is a fake notification, but go fix it this way.
Joe Leo 55:18
That gets you to fix it the wrong way.
Joe Leo 55:22
Very exciting actually talking about this. And I think that everybody, anybody that's listening to this show certainly has heard of OpenClaw, probably played around with OpenClaw, and it's still now, months later, the best thing you can use for autonomous agents, in my opinion. So I really love that you shared this with us and shared it with the world.
Joe Leo 55:40
So go check out Valentino's project. He's got the bones of what he did up on GitHub. He's got his entire process on Substack, codenamev.substack.com. And you can do everything soup to nuts. It's very cool. I encourage you to check it out. So thanks, V.
Valentino Stoll 55:58
If you have any alterations, I'd love to hear about it 'cause let's all build something that we can just create projects for, you know.
Joe Leo 56:07
Yeah. And if you have any suggestions for our band name, let me know. The bar we have to get over right now is Geese. So I really feel like there's a lot of potential.
Valentino Stoll 56:20
I mean, we could be the ducks after duck typing, you know.
Joe Leo 56:23
Oh, see, this is already good. This is already good. I like that.
Valentino Stoll 56:27
All right, buddy. Thanks for joining us. We'll see you next time.