The Ruby AI Podcast

You Can’t Vibe-Code Trust: Why Real SaaS Still Wins in the AI Era

Valentino Stoll, Joe Leo Season 1 Episode 18


On the Ruby AI Podcast, hosts Valentino and Joe Leo welcome Scholarly CTO/co-founder Kelly Sutton to discuss building a vertical SaaS “faculty information system” for universities. Sutton explains why competitors can’t easily replicate Scholarly: higher ed is moving off decades-old homegrown software, and the product must meet trust, security, compliance, and regulatory demands such as SOC 2 Type II. He describes how Scholarly expanded from replacing Excel/Access tracking to sophisticated workflow automation and how universities recently shifted from AI skepticism to AI FOMO. Scholarly uses AI in product surfaces, heavily in engineering, and via an admin MCP server that helps ops/customer success rapidly configure workflows from faculty handbooks with human-in-the-loop review. The conversation debates MCP’s likely temporariness versus traditional APIs, emphasizes smaller reviewable “PR-sized” outputs, and frames AI as an implementation detail focused on customer value. Valentino also shares an experiment training Claude to build products, including ups.dev and an open-source Ruby uptime-monitoring gem.

Valentino Stoll  00:00
Hello everybody, welcome back to another episode of the Ruby AI Podcast. I'm Valentino, joined by my co-host.

Joe Leo  00:07
Hi, I'm the co-host, I'm Joe Leo. Very excited today. I've got Kelly Sutton, who's going to join us in just a second, but first, Valentino, NASA has developed an AI nose, an e-nose technology that can help to sniff out things,

Joe Leo  00:28
including tracking the health of astronauts on long-distance or long-duration missions. What will you do with your first AI nose?

Valentino Stoll  00:40
Bullshit. It's got to be able to detect bullshit.

Joe Leo  00:44
It's got to be able to detect bullshit. I love that answer. I thought what I'm going to use it for is I'm going to use it to determine which car on the subway to step into and which to avoid, because that is a real bummer getting into the smelly train car. But I like the bullshit answer better. All right, so without further ado, let's welcome the CTO and co-founder of Scholarly,

Joe Leo  01:06
Kelly Sutton. Welcome to the show, Kelly.

Kelly Sutton  01:08
Thanks for having me.

Joe Leo  01:09
It is great to have you on. We've known each other and been in similar circles for a long time. Love the work that you do in Ruby. And now we've got this thing called AI, which is kind of taking over. You run some pretty important software that I think could be described as software as a service.

Joe Leo  01:29
Would you describe it as a SaaS product?

Kelly Sutton  01:31
Vertical SaaS product, yeah.

Joe Leo  01:33
A vertical SaaS product, and yet you are still in business. How is that possible?

Kelly Sutton  01:37
Yeah, and yet we still exist. In fact, we're doing better than ever. Don't believe all the headlines that you read.

Joe Leo  01:42
I don't want to believe all the headlines I read, but I do want to understand what is the thing that sets Scholarly apart, that makes it so that the next person who comes around and says, "Oh, okay, I can recreate Scholarly, you know, in 15 minutes, and then I'll charge half the price," what prevents that from happening?

Kelly Sutton  02:04
Yeah, so Scholarly is what we call a faculty information system, and you can also run some pretty sophisticated workflows off of the platform. So you can think of it as really just an HR platform for higher ed. So think like Gusto, Rippling, Workday for colleges and universities in the US.

Kelly Sutton  02:24
I think there are two things that really prevent folks from just vibe-coding a competitor into existence. One is that I think colleges and universities are on this current track where they're trying to move away from operating their own software, software that would last like 20, 30, 40 years. Very capable,

Kelly Sutton  02:44
talented professors, usually in the computer science department, built a tool for managing some piece of the faculty life cycle there, but they didn't understand the maintenance costs of that, right? And so a lot of like customers that switched to us are actually like turning off old systems that sometimes like predate the internet,

Kelly Sutton  03:04
which is always a fun challenge there. So universities in general are kind of like getting out of like, "We want to like maintain software that we built that is bespoke for us." And then the second thing is we deal with like very sensitive information. We are an HR product, so there is like the legal, regulatory,

Kelly Sutton  03:24
security, and compliance aspects to what we do. You don't necessarily just get like a SOC 2 type 2 certification overnight. That requires some time.

Joe Leo  03:35
I think that's true. There's also a trust component there that I want to get into, but a quick sidebar: I think that that is how Ruby luminary and my co-author of The Well-Grounded Rubyist, David Black, got his start.

Joe Leo  03:48
It was building applications that were pre-web and then web applications in the communications department at his university and then came across Ruby at some point, and the rest is kind of history. But yeah, you're right, that story of software being developed and managed by just somebody who, you know,

Joe Leo  04:07
is at the university and had an interest in programming is real. I like that. So you mentioned this piece with SOC 2 type 2 compliance, and that's real. You also said it's a regulated industry. There's high risk there.

Joe Leo  04:20
And I think because of that, you could start to sort of build your moat as a business with the trust that you build up over years. So first, I guess I'd ask you whether you agree with that statement or not. That's kind of a softball question. And then the harder question is, well, how do you do it with your engineering team and with the need to,

Joe Leo  04:39
of course, move fast and service your customers?

Kelly Sutton  04:43
Yes, I do agree with that. And I would also agree with it from the other side: like, I'm not going to trust a two-person startup running my university's payroll. That's nuts. That's an objectively poor idea.

Kelly Sutton  04:56
And so I think much like with any B2B SaaS company that is taking on a big piece of what their customers are trying to do, you need to find customers whose risk appetite can stomach, "Okay, we're going to go with these folks who have only been around for six months,

Kelly Sutton  05:15
12 months because we are so desperate for something." And then as the company matures, we're almost three years old now. We've got more than two dozen folks working for us or like in our orbit here. Those effects compound, right? It's like, well, they've been around for two years now. They've been around for three years now.

Kelly Sutton  05:34
And then you have the social proof of other folks trusting them, and those other folks seem to be doing really well. Maybe the risk calculus becomes we're willing to take a bet on this young company. But there are still institutions out there where it doesn't make sense for them, right? And those are usually the larger institutions or larger systems.

Kelly Sutton  05:57
And that's okay. We'll get there. We'll get them all eventually.

Joe Leo  06:00
Yeah, that's spoken like a co-founder because, you know, you make this interesting comment that I think is true. You know, a university shouldn't trust a two-person payroll team that just spun up last week, but at some point you were a two-person team. And so at some point you need to build that traction and get somebody to make the leap. So what was that story like for you three years ago?

Kelly Sutton  06:20
It was not starting with payroll. Yeah, we still don't do payroll. Someday we might. It was starting with what are parts of the process that are important, but also something that you wouldn't mind going with a vendor for.

Kelly Sutton  06:34
So a lot of our earliest customers were switching off of like Microsoft Excel spreadsheets or like Microsoft Access databases for like tracking faculty activity data and like faculty rosters there.

Kelly Sutton  06:48
The ability to show, okay, a connected web database for this information is vastly superior because more than two people can work on this at the same time. Yeah, it was a pretty easy sell there. So that's where we started and then started adding to what we did, got into more workflow management,

Kelly Sutton  07:09
and then we continued to add functionality to what we call our workflow engine. So rather than just being able to run a very simple process, you can now run very sophisticated processes that have 50-plus people involved for a single run of this workflow here. And so we just continue to add to that.

Kelly Sutton  07:29
And all of that really comes from like just sitting down with customers or prospects and saying, "Okay, what do you need? Like what problems are you trying to solve? Explain it to us in plain language. Like don't tell us like I want the button over here and I want a spreadsheet over here. Like tell us like, okay, what is this process? Explain it to us in depth." And then we don't stop until they're happy and their problems are solved.

Joe Leo  07:51
From an institution perspective, I imagine they have a lot of like departments and it's not like maybe one person you're talking to to set these up. So how do you kind of deal with that from an AI business perspective to disseminate and connect all of the related parties?

Joe Leo  08:08
So you're not like relearning how a university works differently when you're trying to onboard somebody new. Do you have like systematic approaches? Like what's your kind of workflow from a business adopting AI and implementing AI?

Joe Leo  08:24
Do you have like channels set up for certain workloads to like line up your customers to yours? Is it that in depth? Do you have this desire to like jump into that kind of preemptively without knowing it? You know, how do you handle all the trade-offs and balance there?

Kelly Sutton  08:41
Yeah, our platform has been built to be extremely flexible from day one. No two universities are alike. We're talking just like hilariously different sets of requirements. Just when you think this requirement, this is going to be set in stone,

Kelly Sutton  09:00
like the privacy of this document coming out of this workflow; it's always going to be this way. I guarantee you the next customer is going to show up and say, "Nope, we do it totally differently." So we've always been, I would say, a very nimble engineering and product org. And then when it comes to how we plug AI into that,

Kelly Sutton  09:19
there are a few things. I think universities were very AI-skeptical up until about two to three months ago. Now they very much don't want to get left behind, like extreme FOMO, really trying to lean into how they can responsibly integrate AI into their administration and all of their operations there.

Kelly Sutton  09:42
When it comes to like how we use AI, there's kind of like three places that we use AI. We have surfaces in our product, everything from like a chat surface to like, okay, this complicated thing is actually being offloaded to an LLM for like building like the draft of a workflow, for example. So you can drop in a document, we put together a workflow.

Kelly Sutton  10:02
We've got an MCP server, which our administration team is using a lot for setting up new customers and guiding them through implementation. That's been a surprising like win in the last few months here.

Kelly Sutton  10:15
We're building fewer administration dashboards and tooling and more and more just letting the LLMs handle that. And then we also use it as an engineering team all the time. Probably 80, 90% of the code that we write has Claude somewhere in it.

Joe Leo  10:33
So I wanted to zoom in on this and I want to get Valentino's take on it also. I was reading some of the stuff you wrote, which is excellent. Your retrospective posts are great. And you made a comment about MCP being useful, but likely temporary. I think you're not alone in that assessment, though nobody really knows. And I'm curious, you know,

Joe Leo  10:52
why you came to that, not really conclusion, but why you came to that prediction. And I'm also kind of curious how that resonates with V.

Kelly Sutton  10:59
I think it's fair to say that the three of us have been around the block a few times, right?

Joe Leo  11:04
Yes. Brought up maybe a little bit too much on this show.

Kelly Sutton  11:06
Yeah, yeah, I know. Damn, people keep mentioning it. Sorry.

Joe Leo  11:09
Yeah.

Kelly Sutton  11:11
You develop a nose for the things where like a new technology is introduced that solves a problem, but it has some echoes of the past of like this technology like solves this problem, sure, but is it necessary? No. Okay.

Kelly Sutton  11:28
Or are there things that already do this, or you're going to run into a constraint or a shortcoming? So I always like to pick on GraphQL. I've picked on GraphQL for a decade now. Similar thing: I don't think we need this. And I think at this point GraphQL is, I don't know if it's dead, but it's just not really used as much.

Kelly Sutton  11:48
If you rewind like five years, everyone's all in on GraphQL, right? And if you look at what GraphQL was providing, it was some very useful stuff: partial field sets, related resource fetching, the ability to issue one request and get many things back. But GraphQL didn't have a monopoly on that.

Kelly Sutton  12:09
MCP, similarly, is a way of programmatically speaking with a web application. But we've had JSON APIs for 10, 20 years now. Those are well documented. If you choose a spec like JSON:API, it's very predictable how you can interact with something.

Kelly Sutton  12:29
And I think a lot of the early MCP stuff threw the kitchen sink of technology in, when we're kind of just reverting to, okay, this thing is just issuing requests, and they're RPC-style requests. It's like, all right, I think for us as a business, we must adopt this. But if we fast-forward like five years and someone's like,

Kelly Sutton  12:50
yeah, we just stopped using MCP because all the models actually could write curl commands, I'd be like, yeah, of course. That seems like a perfectly fine place to land. That's my thought.
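
A minimal sketch of the plain-HTTP equivalent Kelly is describing; the endpoint, payload, and JSON:API shape here are hypothetical, not Scholarly's actual API:

    require "net/http"
    require "json"
    require "uri"

    # The same RPC-style operation an MCP tool call would perform, expressed
    # as a plain JSON:API request that a model could just as easily curl.
    uri = URI("https://api.example.edu/v1/workflows")
    request = Net::HTTP::Post.new(uri, "Content-Type" => "application/vnd.api+json")
    request.body = {
      data: {
        type: "workflows",
        attributes: { name: "Annual Review", steps: 13 }
      }
    }.to_json

    response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
      http.request(request)
    end
    puts JSON.parse(response.body)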

Joe Leo  13:01
All right, Valentino, you've been sitting on your hands. Let's hear it.

Valentino Stoll  13:04
I am so torn on MCP because from an efficacy perspective, models perform much better with CLI-style discoverability. Claude Code itself adopts memory, which is progressively disclosed via successive files, right? And it circles back to the whole hypermedia idea.

Valentino Stoll  13:24
To your point, Kelly, regular APIs, RESTful APIs, are meant for that discoverability. If it needs more information, you can link to that in successive documents and kind of reflect that same idea.

Valentino Stoll  13:41
But I think a lot is lost in the architecture of MCP in terms of fruitfulness. I think it's one of those things like GraphQL where, in GraphQL's example, if you don't want to adopt it as your data delivery strategy,

Valentino Stoll  13:56
then it becomes very problematic in that abstractions and complexity kind of proliferate throughout the rest of your systems. And so that does complicate things more. But if you have things that need that complexity, it makes it simpler for all of your downstream services and systems.

Valentino Stoll  14:16
And like that, MCP is the same, right? If you adopt the entirety of MCP, it's got great resource control, great discoverability. It has a lot of things that agents lack as far as accuracy.

Valentino Stoll  14:31
But to your point, Kelly, do you need GraphQL for a 20-page or 20-route Rails app? Probably not. Just automatically dumping it in every repo maybe isn't a great idea, right? And I think the same is true for MCP. Maybe you don't need MCP to just deliver tools to your agent.

Valentino Stoll  14:53
Start with an array or like something very minimal and then build up as you get there.
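
A sketch of that minimal starting point; the tool shape and the Faculty model below are illustrative assumptions, not any particular library's API:

    # "Start with an array": tool definitions as plain Ruby data handed to a
    # model call, with a simple dispatcher instead of a full MCP server.
    TOOLS = [
      {
        name: "lookup_faculty",
        description: "Find a faculty member by name",
        parameters: {
          type: "object",
          properties: { name: { type: "string" } },
          required: ["name"]
        }
      }
    ].freeze

    # Invoked when the model requests a tool by name with JSON arguments.
    def call_tool(name, args)
      case name
      when "lookup_faculty"
        Faculty.find_by(name: args["name"]) # assumes an ActiveRecord model
      else
        raise ArgumentError, "unknown tool: #{name}"
      end
    end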

Valentino Stoll  14:59
But I think what we're seeing from a scale perspective is that MCPs have a hard time with quality of delivery when the tools scale, right? So if you get beyond even the 50 mark of tools, it starts to fall apart quickly, and it picks the wrong ones.

Valentino Stoll  15:19
It uses the wrong parameters and like it becomes hard to manage the context and communicate that with the tools,right? And so there's a lot of stuff.

Joe Leo  15:29
Are you saying this independent of how things are connected and whether you're utilizing MCP, there's kind of this threshold of tools?

Valentino Stoll  15:37
Yeah. And you know, if you lean heavy into MCP, then it does that resource restriction and it helps limit the number of tools, but it's still a number of tools, right? If you have a big domain, even for, you know, an institution like a university, it has all its own departments, and each department may have a hundred tools that it wants to use,

Valentino Stoll  15:58
but it doesn't need each one for every task, right? Even within the domains of the larger organization, it becomes hard to say, well, what do you present to the model for a given user request? And you can get so far, but it still has its limitations. And so even if you adopt it at a large scale and embrace it fully, you're still going to hit those limits.

Valentino Stoll  16:19
And so you end up having to build in different mechanisms to get there.

Valentino Stoll  16:24
But at the same time, you know, there's a whole other aspect of MCP land that I think is often lost: prompt resources and other kinds of resources that you can make available to an LLM that can benefit your application. But more than anything is just consistency.

Valentino Stoll  16:44
So if you could just implement consistent abstractions like Rails has for a long time, that is better than the actual technical specification. Whether it's MCP or something else, as long as it's all like following a convention, it's going to perform the same, right? I don't know. I'm torn on MCP.

Joe Leo  17:05
Well, Kelly, can you walk us through, you said that it's being used to great effect with onboarding and it's replaced kind of building out dashboards, which of course takes time. And as you've already alluded to, every university's paying attention to something different, different KPIs, different requirements. So what does it look like on the ground? A new university or a new prospect decides,

Joe Leo  17:27
yeah, okay, I'm going to take the plunge with Scholarly. Do your engineers, do you have some forward-deployed engineers that set up MCP connections, or what happens? Take us through it.

Kelly Sutton  17:36
We have an MCP server, which is just for administrators. So this is our operations and customer success folks, right? And so they plug in their Claude. I say Claude now; it might be ChatGPT in six months. I love being a consumer in a highly competitive market like AI right now.

Joe Leo  17:57
Facts, yeah.

Kelly Sutton  17:58
So there's a lot of setup that is required when onboarding onto a platform like Scholarly. And our admin MCP tools help us take like a good first draft approach of, okay, we need to configure a lot of things, what are known as like faculty activities on the platform.

Kelly Sutton  18:18
So these are things like publications, grants, service, like the things that make up teaching, research, and service in a university that are required to be tracked for things like accreditation or promotion or any sort of like merit processes, as well as like the workflows themselves.

Kelly Sutton  18:36
And the good news is higher ed loves writing stuff down. And so there are a lot of faculty handbooks out there. And so using something like Claude plugged into our admin MCP tools, we're able to take a faculty handbook and drop that into Claude and say like,

Kelly Sutton  18:55
spin up all of the workflows that you find, right? And the faculty handbook in detail is going to spell out like, okay, here's what the annual review process looks like for a faculty member. Here's what the tenure process from going from like assistant to associate professor looks like at this institution. And Claude, as it does with our code,

Kelly Sutton  19:15
it'll sit there and make a plan and then say, okay, I'm going to create these workflows. Someone on our end says, yep, looks good. And then it sets it up in the app, right? We have a web interface to do all of that, but that does take hundreds or thousands of clicks because, you know, on the sixth step of this 13-step process,

Kelly Sutton  19:36
this question on the fourth section of the second survey should only be shown to folks whose last names begin with the letters N through Z. And so there's just like an incredible amount of detail that we have to like plug into these workflows on behalf of our customers.

Kelly Sutton  19:56
And that is something that LLMs are great at doing, or I guess these harnesses like Claude and Claude Cowork.
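
A rough sketch of that approval gate, not Scholarly's actual code; the extraction helper and service class are hypothetical, and the point is only that nothing is created until a person says yes:

    # The model proposes workflows parsed from a handbook; a human approves
    # each one before anything is written to the app.
    proposed = extract_workflows(handbook_text) # hypothetical LLM-backed helper

    proposed.each do |workflow|
      puts "Proposed workflow: #{workflow[:name]} (#{workflow[:steps].size} steps)"
      print "Create it? [y/N] "
      next unless gets.strip.downcase == "y"

      WorkflowBuilder.create!(workflow) # hypothetical service object
    end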

Joe Leo  20:03
That's interesting. I see this part and I see, you know, you also do some CV import flow. And in both of those, it sounds like the human in the loop is by design. And I like that; it's a very, you know, affirmative thing: we would never just trust it. Is that the case? Do you see a time where you can just trust it?

Joe Leo  20:22
And I guess what's the failure rate when your humans are going in verifying this stuff?

Kelly Sutton  20:28
I think really in the last three months, these models are getting so good that most of the time when I see a model make a mistake and we drill in and we're like, okay, like why did it make this mistake? Turns out the source documentation was wrong.

Kelly Sutton  20:48
Meaning it was doing the right thing based on the information that it had. And so that's a little spooky to me because previously it'd be like, okay, got it wrong and it just got it wrong. Now it's like, oh no, there was a mistake in that doc and it just did what it said. And some of them are even highlighting like, hey, I did this, but I don't think this is right.

Kelly Sutton  21:08
So that's fascinating in that in some respects it might be better than a human. But for the time being, and just given the type of software that we build, to help me sleep better at night, I want a human in the loop, right? As much as possible.

Kelly Sutton  21:23
I think there's probably some interesting stuff that happens when the human in the loop becomes lazy and we trust it too much and we aren't keeping as critical of an eye toward like the outputs or what it's asking us to do. But that exists kind of like regardless of AI. You know, that's just any kind of like management.

Kelly Sutton  21:41
If a manager isn't verifying the results or if the customer isn't doing acceptance testing or someone's not doing acceptance testing and just saying like, yep, looks good. Like yeah, that's, you're cutting a corner. So not new to AI there.

Valentino Stoll  21:53
Yeah. This topic fascinates me. I have so many questions. So I guess how are you like evaluating that they're doing their job?

Valentino Stoll  22:01
As much as a human can get lazy, AI can be lazy as well and kind of just like gloss over maybe details or like use instructions that you gave it maybe overly so. Like you mentioned, like it may notice, but maybe it doesn't.

Valentino Stoll  22:19
Like where does that human sit from a quality control perspective for feedback? And how are you managing that aspect of things to make sure of certain things, right? What do you actually care about and test?

Kelly Sutton  22:33
I take a lot of inspiration from receiving a pull request. Okay. You get a pull request that's a 10,000 line change. You are either saying, I'm not reviewing this, or you are going through it and you get through like the first three files and you're like, I guess this is good enough. When the human is in the loop,

Kelly Sutton  22:52
it's much better to like package things in a size that a human can consume. So I'd much rather get a 100, 200, 300 line pull request to review whether from an engineer or a tool. And I think the same goes for any like business process as well.

Kelly Sutton  23:09
If you can't easily verify that the outputs are correct, then one strategy that you can take is like, well, let's make the outputs a little bit smaller. So in our example, rather than look at like the whole workflow that might be a 12-month process with a dozen plus steps,

Kelly Sutton  23:28
it's like, okay, well, let's have it do like a step at a time and verify that as it goes. And then we also, as reviewers, we also get a sense for what does it do well? Where is it likely to make a mistake? Because it's all being driven by MCP, like there's just certain capabilities that we have built into the MCP server. So our ops or CS folks might file a ticket saying like,

Kelly Sutton  23:49
hey, I can't set what shows up in a select question on our platform because that's not available to the MCP yet. Can we build that? So just scoping down what you are reviewing and getting that cycle time up.

Kelly Sutton  24:02
So you're reviewing smaller things more frequently, I think, is a great strategy for keeping the human in the loop and keeping them in the loop where they're doing more than just rubber stamping things.

Valentino Stoll  24:15
So are you allowing kind of the humans in the loop to have access to AI tooling as well to help their work, or is that like not necessarily part of it and it's more you focus the tooling around the pipeline that they fall into?

Kelly Sutton  24:33
So the MCP tools look a lot like our APIs. And this is why I also don't think MCP is long for this world. It's like, okay, so we build some APIs and then we go build the same things into MCP and they're doing the same things, right? So me as someone maintaining this stuff, it's like, well, I would rather maintain one thing than two.

Kelly Sutton  24:55
But yeah, like our ops and CS folks, they're the biggest users of our internal MCP tooling for our platform.

Kelly Sutton  25:02
And it's really given them the superpower of doing things that would previously require a Ruby script to do something programmatic on the platform to load data in or stand something up or move like an implementation forward. It gives them like that programmatic power without being programmers themselves.

Kelly Sutton  25:22
They're just doing that out of a Claude Cowork these days.

Valentino Stoll  25:26
It makes me wonder if a .mcp format for Rails would be valuable here. Or just any Rails endpoint can be an MCP tool.

Kelly Sutton  25:38
Yeah. And if you look at how like we're organizing the code as well, it's like controller, MCP, both point at the same like service class or like business logic doer. And so thankfully those layers are pretty thin.
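
A sketch of that layout; the class and tool names here are hypothetical, but the shape is one service object with the controller and the MCP handler as thin entry points:

    # The shared "business logic doer"; both entry points delegate to it.
    module Workflows
      class Create
        def self.call(name:, steps:)
          Workflow.create!(name: name, steps: steps)
        end
      end
    end

    # Thin Rails controller entry point.
    class WorkflowsController < ApplicationController
      def create
        workflow = Workflows::Create.call(
          name: params.require(:name),
          steps: params.fetch(:steps, [])
        )
        render json: workflow, status: :created
      end
    end

    # Equally thin MCP entry point: same arguments, same service class.
    def handle_tool_call(tool_name, args)
      case tool_name
      when "create_workflow"
        Workflows::Create.call(name: args.fetch("name"), steps: args.fetch("steps", []))
      end
    end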

Joe Leo  25:52
You've talked a little bit about, and I think we're touching on it here as well, the idea of AI becoming an implementation detail versus a feature. And I think that's very quickly where we're shifting as an industry, right? It's becoming really rote, I guess, to say, well, you know, we're doing this thing with AI.

Joe Leo  26:11
I'm like, well, of course you are. If you're not doing it with AI, why would we do business with you? That said, there are, as you've mentioned, you've got a customer base that has been shifting more recently, you know, started very conservative around AI and now is moving forward quickly. So in your words, what's the difference?

Joe Leo  26:29
What's the difference between AI as a feature and AI as implementation detail?

Kelly Sutton  26:34
You got to look at the value. Like what is valuable to our customers or just any customer? The customer does not care that your Pentium processor clocks at 75 megahertz. Your customer does not care if your server has 16 megabytes or 32 megabytes of RAM.

Kelly Sutton  26:52
Your customer does not care if you're using GPT-5.4 or 5.3 nano, right? Or whatever the models are, right? Customers care: are you helping them make more money or saving them a lot of time?

Kelly Sutton  27:05
And I think in the last, well, since GPT-3.5, we've gotten a new tool in our tool belt, which is this magical probabilistic computing thing that we can plug into places, and it speaks in plain language.

Kelly Sutton  27:21
And so being around the block a few times, you kind of see like the arc of people being very like chat forward of like, okay, everyone's going to put a chat into their stuff. And we're kind of in this like.

Joe Leo  27:31
This was a bad time.

Kelly Sutton  27:32
Yeah.

Joe Leo  27:33
AI development. Yes, I agree.

Kelly Sutton  27:34
Yes, exactly. And then you're like, this was the real bucket. Yeah. And you're like, well, are we a chat company? Like no, leave that to ChatGPT and Claude. And so we're kind of like settling into like, okay, what are the new things that we can do that are very valuable to our customers using this probabilistic genie?

Kelly Sutton  27:54
And is that appropriate for the tasks that we are giving it? You know, we've been living in like the discrete world for a long time as programmers here. And most of us have been trained on like, yeah, it either works or it doesn't. What do you mean it might work, right? That's a weird thing to get our heads around. And like, how do we need to change the products around that?

Kelly Sutton  28:15
Over time, I don't think we're even like going to be adding like the sparkles icon or like a chat surface. Like just parts of an application are going to be probably powered by a probabilistic LLM to help you save time or accomplish a task. And whether that's falling in like the fully discrete or more probabilistic area,

Kelly Sutton  28:37
like you won't be able to tell. Does that make sense?

Joe Leo  28:40
I'm interested in this just from a business standpoint because I'm seeing this trend, the same thing you are. AI is first, it's too out there, people don't understand it. And then, oh, actually it's really exciting. So let's everybody make a chat, which is basically funneling money into ChatGPT right through an API.

Joe Leo  29:00
And then predictably, smart companies started using it for implementation details early. But I bet it's still hard to resist from your salespeople and your marketing team's perspective to say, well, you know, we're building this thing with AI. It's really exciting, which is something that's different than, well, we're building this thing at 75 megahertz, right? Which is a thing that doesn't appear on anybody's radar.

Joe Leo  29:22
But maybe that's all marketing and sales fluff and internally you guys can just focus on what's actually working.

Kelly Sutton  29:28
You need to do both. Like you got to embrace it. I think something that I, maybe like a mistake that I made in my career like five, 10 years ago was like actually not embracing GraphQL, to go back to that, right? I should have just embraced it. There's a lot to be gained with just like going along with the flow and like hyping people up and like,

Kelly Sutton  29:47
but still maintaining that critical eye of, what are we doing here? What problems are we solving? GraphQL is maybe a bad example of this, but when it comes to a customer, they've been told, whatever you need to do, go figure out how we're using AI, in some cases. And so they were going to go, in our case, they're going to go look for a vendor that's very AI-forward or AI-first.

Kelly Sutton  30:08
And so we're doing ourselves a disservice if we're not just embracing this. It's a good type of waste where we're just experimenting with stuff. We're throwing it against the wall, but if one out of every 10 experiments hits, we're like, ooh, that is actually really useful. This admin MCP was not something that I even predicted would be useful six to 12 months ago.

Kelly Sutton  30:28
And now it's like, oh, this is actually the only way that we do certain things at this company now. And you don't get that if you say, well, we're not going to experiment with this at all. You really have to just dive in and figure it out and know that 90% of the stuff you build or try isn't going to do it, but you at least get to be in that conversation.

Joe Leo  30:48
You know, it's interesting because you've said this a couple of times, and, it's now 12:15 on the East Coast, I've already had three conversations about this where I've heard anything from September or October, but no later than January of this year, as when there was some kind of sea change. It seems to be among the engineering set, probably the more senior or experienced engineering set,

Joe Leo  31:09
but it's also among, you know, in your case, Kelly, your customers. What do you attribute that to?

Kelly Sutton  31:15
Opus 4.5, 4.6, whichever one came out around that time.

Joe Leo  31:19
Yeah. You get specific to the models.

Kelly Sutton  31:21
I think it's the models. And I think, man, I would hate to be running like an AI tooling company right now because you're just like, is today the day that I wake up and Anthropic like publishes a blog post and we're just like toast? I think their speed of execution, their focus on the enterprise market,

Kelly Sutton  31:41
and like they're kind of like the Microsoft of these AI companies where it's like, no, we just build like tools for like productivity, like professional use here. We're not trying to build like an advertising company and they've just nailed it with Claude Code and they're like, okay, let's try like Claude Cowork. And like anyone who's not an engineer loves that internally.

Kelly Sutton  32:03
It's a very exciting time. And I think the models getting better just obviates all sorts of minutiae that we might have been doing a year ago or two years ago around orchestration and tooling and CLAUDE.md files. It's like, well, if you just wait three months,

Kelly Sutton  32:21
like you can just delete your CLAUDE.md file and you don't need that anymore. Like the model's that good.

Valentino Stoll  32:25
So I guess the question is, do you find yourself being more productive as a business owner and like producing more value faster for customers?

Kelly Sutton  32:33
For the first time in my career, I am consistently overestimating how long things will take from an engineering perspective. It used to be like, yeah, we could get that done in an afternoon, and three months later, where is that? Now, you know, I'm chatting with folks on the business side. I'm like, yeah, it feels like a, I don't know, week or two.

Kelly Sutton  32:54
And then like that afternoon, it's in production, right? Without a bug. And I'm still adjusting. I still haven't found my ground yet. You're usually not punished for underpromising and overdelivering, so I might stay there for a little while.

Kelly Sutton  33:09
But it is just a weird world that we're living in here where like what used to be the longest or one of the longest poles in the tent, yeah, engineering time basically becomes zero.

Valentino Stoll  33:20
As that execution timeline tightens, do you find your planning strategy changing, the way that you do those things, because the execution feedback loop is so tight?

Kelly Sutton  33:33
We got really lucky in how we set up the business and the relationships that we built with our early customers. So we're doing the, every customer gets a Slack Connect channel. And we train our customers in, you're on our board too. You are looking at the tickets that are moving through.

Kelly Sutton  33:54
You get to prioritize them. You are kind of doing some acceptance testing depending on like what the task is in front of us here. We're going to train you to get really good at high cycle time here, or I guess a low cycle time.

Kelly Sutton  34:09
So things that our competitors would say like, yep, we'll put a version of that in front of you in three months. It's like, okay, we need you to test this like this afternoon and tell us if this is meeting your needs or not. So like from the early days, we've been like optimizing for that really quick cycle time.

Valentino Stoll  34:25
And do your customers do that? Do they jump in and test things?

Kelly Sutton  34:28
Yeah. Yeah. And if they're not, there's a chance that we're not building something valuable. So it's like a nice lowercase-a agile, extreme programming, whatever you want to call it. We're just in it with our customers, whole team building, solving their problems here. So the LLMs have shortened that,

Kelly Sutton  34:48
like, okay, it's going to take a few days for engineers to put that together. So it's shortened that part of the process, but there's still everything that kind of goes into adding a feature or something into the platform here. So that's always existed, but we've always tried to keep that pretty short.

Joe Leo  35:04
We've got a couple of minutes left. And you did just mention this around, you know, you got lucky with how you structure the business. And by the way, I really love that. I love that approach because I've done a lot of work with customers that expect a three to six month timeline. And there is, I think you framed it perfectly, there is a teaching moment there to say,

Joe Leo  35:24
no, no, we're going to release very, very frequently, but you got to be part of it or else we can't. I think that's great to build that in from the beginning. So I have another question for you, which is just zooming out. I love the timeline of three years. I don't love it.

Joe Leo  35:36
I think it's interesting that you've been around for three years and that is sort of like very beginning slash predating the release of ChatGPT to now where, you know, so much has changed. What, if anything, would you have changed knowing that we were going to end up where we are right now?

Kelly Sutton  35:59
I feel like we just got so lucky. I don't know.

Joe Leo  36:05
What are you glad that you did, knowing where we are right now?

Kelly Sutton  36:09
I was listening to the episode with Mr. Searls on this show before coming on, and I consider myself a pretty particular programmer with some strongly held ways of doing things, but he might be the person.

Joe Leo  36:22
Do tell, talk more.

Kelly Sutton  36:23
Yeah, he might be more particular than me. And so I think early on when we were just starting this company, GPT-3.5 or 3 was the model du jour and it couldn't really do anything, right? And so we were still copy-pasting Ruby code in and out of the ChatGPT interface.

Kelly Sutton  36:41
But we were able to set up the Rails code base exactly how I wanted it. And the model is very much like a junior or mid-level engineer. It's just going to copy what it sees. So we got to be very particular in how we set this thing up. And so now it just like follows the patterns and follows some like the conventions that we have in our code base.

Kelly Sutton  37:01
So it's just an extension of us. I think some of the shortcomings around AI that I hear people complain about is like it kind of doesn't know what to do still with like the zero to one. And it's like, well, yeah, it doesn't know your way of doing things, but we've got 12, 18, 24 months of history of like, here's how we do things here. So plug in and get with the program and that's been great.

Joe Leo  37:23
That's a great answer. And a little teaser for our next episode: we have somebody coming in who found the exact opposite. In other words, went to try to use AI to do a bunch of code gen on a legacy application and said, oh, wait a second, we don't want it copying these practices. We've got to clean stuff up first, right? And so maybe that is also a benefit of timing for you, right?

Joe Leo  37:44
You started in a place where, yeah, it was not mature, but you were able to get something out of it. And then subsequent iterations of LLMs could build on a good foundation.

Joe Leo  37:57
But in five, you know, to 10 years, when, you know, you've had enough turnover and things are really atrophying, Def Method is great at modernizing and de-risking applications. Let's throw that in there. Hey, we've only got a couple of minutes, but Valentino, if you don't mind, I want to hear a little bit about ups.dev.

Valentino Stoll  38:16
Oh man, I have so much to say about this, but the TL;DR of it is in January, I decided to spin up my own OpenClaw.

Joe Leo  38:26
As did we.

Valentino Stoll  38:27
As you do. And what does it do? And yeah, I was very underwhelmed by the code output that it made. I was just like, ah, make a Rails app that does this stuff and use all the latest stuff. And it produced very lukewarm results. And I'm like, okay, clearly this is a context problem. Garbage in, garbage out.

Valentino Stoll  38:45
It's probably just making up a ton of stuff, and it has a limited context window, and X, Y, Z. How can I condense the knowledge chunks that it takes on each iteration of its loops and make more use of them? And all the main model providers are making training bootcamps, and that's how they improve their models.

Valentino Stoll  39:03
And so I'm like, okay, what does that look like on a smaller scale? Can I just train it on Ruby? Can I just train it on Rails? And so I was like, okay, Claw, go use the latest educational science: what does a program look like to train you on using Ruby? And it's like, oh, here's what I'd do, like the rubrics, the pipeline.

Valentino Stoll  39:25
I was like, okay, that sounds great. And I was like, so what do you need? And it was like, well, I need a list of materials to build the courses and stuff like that. So I went out and I purchased all the materials for that course that it would want. So that way at least like all of the authors are like getting paid for me using like the content. So now it's like a distribution problem.

Valentino Stoll  39:43
Like can I share this knowledge module? I don't know. Future problem. So then, it actually worked and it produced much better results. And so I just gave it more domains. I was like, all right, try Rails now. Can you build a Rails app and make something useful? And so it did and it produced better results. And I'm like, okay, this is great.

Valentino Stoll  40:01
I get a bill all the time for these stupid domains that I own, and they're just sitting there for years, burning money. And I'm just like, all right, I'm either going to get rid of them and just eat that burden, and that's it. Or I'm going to start making use of them. I'm like, well, this seems perfect. Maybe I could train this model to make use of and monetize these domains for me.

Valentino Stoll  40:21
And so I set up like a product building training camp. And as a result of this training camp, it produced the very first product built entirely by a Claw, I think, maybe not, but ups.dev.

Valentino Stoll  40:38
And it basically took the business model of statuspages.com and undercut their pricing and then added a nice little feature I thought, which was agent status pages.

Joe Leo  40:51
Yeah. Yeah. I think.

Valentino Stoll  40:52
So it built a Ruby gem to make it easy to monitor the health and uptime of an agent. And I went and added it to another site I built called Daily Vibe. And I said, okay, add this, you know, heartbeat to all of your agents that you built for the app so I can know what the status is of all the agents running and operating this site.

Valentino Stoll  41:13
And sure enough, we have Daily Vibe's ups.dev page that will show you the uptime of all the agents that are running and operating Daily Vibe.ai. And it's fantastic. I love it. I hope it's successful because if not, I'm going to have to decommission it and get rid of the domain. That's part of the path. But I think.

Joe Leo  41:34
Part of the path. Yeah.

Valentino Stoll  41:35
Better than that, it's more of like, okay, does this experiment work? Can I distribute these knowledge modules to others? And how might we be able to create these pipelines where we can start sharing and distilling knowledge in general? So we'll see. I have an article started. We'll see where it goes, but it's promising.

Joe Leo  41:54
And you open-sourced the gem as a result of this too. Ruby LLM ups gem.

Valentino Stoll  41:59
Yep. So there's now a simple one-line change you can add to your Ruby LLM agents to just include monitoring and it will automatically give those heartbeat updates to the status pages. So your customers can know that, you know, your operations are operational. Super fun.
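
Purely illustrative of the shape Valentino describes; the module and method names below are guesses, not the gem's documented API:

    # One-line include that, hypothetically, emits periodic heartbeats to an
    # ups.dev status page for this agent.
    class SupportAgent
      include UpsDev::Monitoring # hypothetical module name

      def run
        heartbeat! # hypothetical helper: POST a liveness ping to the status page
        # ... the agent's actual work ...
      end
    end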

Joe Leo  42:17
Yeah. I think the best way to get people to use this is just to threaten them: support Valentino's product so that we don't have to put ads in our show. We've got to make money somehow.

Valentino Stoll  42:29
Yeah, please do. My bill is $10 a month, so I think just a few customers would make it stay afloat. So if you have $10 to spare...

Joe Leo  42:41
And you got some agents.

Valentino Stoll  42:42
Throw it at it and it'll stay up.

Joe Leo  42:43
That's right. Status pages.

Valentino Stoll  42:45
That's right.

Joe Leo  42:47
Well, okay, we've run out of time here. Kelly, any parting words you'd like to give us?

Kelly Sutton  42:52
Thanks for having me on the show. I really enjoyed it.

Joe Leo  42:56
Yeah, it was great having you. Great talking about the details and getting into the weeds about what's working, you know, on a startup that is maturing like yours is. And of course we wish you all the best with Scholarly as a Ruby product. We're all rooting for you. And I guess we'll see you around the conferences sometime soon. All right.

Valentino Stoll  43:16
Take care.

Joe Leo  43:16
Good night, Kelly.
