Letting AI Interface with Your App with MCPs

Hey, it's Kent C. Dodds here, and I want to bring you with me on a vision of the future, of what things could be with the technology that we have now, particularly with AI applications and MCPs. Making that connection between these two is going to bring you and your application to where the users want to be with the best user experience possible. This is very early days, but I've got a demo and other cool things to show you, so Let's get into it. So, typically in talks like this, I'm going to ask you to stand up and join us for some air squats.

I'm not going to ask you to do that, but if you haven't moved in a little bit, I do encourage you to get up and get your body moving, get the blood flowing. It's good for your brain to move. But we're gonna move on from that. We're gonna watch a clip from Iron Man. This is the superhero movie.

The man who is Iron Man, his name is Tony Stark, and he has an assistant, an AI assistant. He's a super smart tech genius guy who built all of this stuff and he is working on a homicide case and trying to figure out, okay, what's going on here. And the reason that we're gonna watch this is because my vision for the future is that each one of us gets a Jarvis of our own. And so I want you to watch this thinking about, okay, what are the things that Jarvis is doing in this video that we couldn't do with the technology that we have today. And after we watch the video, then we'll go through a couple of the things that Jarvis did and talk about what our current limitations are and maybe why we have those limitations.

So, enjoy the video. I've compiled a Mandarin database for you, sir. Drawn from shield, FBI and CIA intercepts. Initiating virtual crime scene reconstruction. Okay.

What do we got here? Close. The heat from the blast was in excess of 3, 000 degrees Celsius. Any subjects within 12.5 yards were vaporized instantly. No bomb parts found in a three-mile radius of the Chinese theater.

No, sir. Any military victims? Not according to public records, sir. Bring up the thermogenic signatures again. Factor in 3, 000 degrees.

The Oracle cloud has completed analysis. Accessing satellites and plotting the last 12 months of thermogenic occurrences now. Take away everywhere that there's been a Mandarin attack. ♪♪ That's two military guys. Ever been to Tennessee, Jarvis?

Creating a flight plan for Tennessee. Okay, so that's pretty cool. It'd be pretty sweet to have a Jarvis like that. So here are the things that I counted or wrote down that Jarvis did. He compiled a database from Shield FBI and CIA data sets, generated a UI on demand, accessed public records, brought up thermogenic signatures, whatever that is.

And then Tony said, take away everywhere there's been a Mandarin attack. So let's take these two data sets, combine them and find the similarities between them and so forth. Showed related news articles, interviews, database records, et cetera, so that screen where he showed a whole bunch of different pieces of information about this particular attack, created a flight plan for Tennessee, and answered the doorbell, or at least showed the facial recognition of the doorbell. So yeah, just don't go to Tony Stark's house if You don't want to have your face recognized in their system. But yeah, that's a lot of stuff.

A lot of stuff. What can we not do with our current technology? Nothing, really, with the exception of collecting data from the shield FBI NCA data sets, you probably shouldn't do that. And then also maybe the hologram, the holographic stuff that Jarvis did. I'm not sure that we technically have the capabilities to do that, but we can generate UI on demand.

So my question is, why do we not already have a Jarvis? What is like, what's the limiting factor here? Why don't we have a Jarvis? And I would say that we've actually tried this. We have tried to create a Jarvis-like experience with all these different assistants that we have, but there's a reason that we don't just use these assistants for all of the things that we can do throughout our day.

And I would say that one of the reasons is that they can't do it all. They can do lots of things and you can wire together a bunch of integrations and things like that, but there's just a lot of glue that you have to put together. And if you've ever met somebody who's really into this, like they will pour hours and hours into making all these things talk to each other. And there are entire companies that are dedicated to making different systems talk to one another. And so the real limiting factor for us on not letting us get a Jarvis is, honestly, just the integration.

That glue code that we have to write is a huge, huge challenge. And there are a whole bunch of different services. Some of them are developer focused, other than Others are just like regular people focused. But I want to have a robot that can talk to all of these things. I don't want to have to worry about that glue code or that integration layer.

Like each one of these companies has made an API, they've made a website, so they're already working on making their data available in a consumable way for humans. I want this data to be in a consumable way for technology, for AI, for a Jarvis. The problem with that is each one of them has their own way to manage that type of an integration, and that's just really limiting. It's been limiting us for a very long time. But now we have something called the Model Context Protocol, MCP, and this standardizes this communication.

And that's why I'm so excited about this. It's because this is what's going to unlock our ability to have a Jarvis in our pocket. So let's get a look at the history and architecture of what LLMs and LLM applications has been like. Phase one, a couple years ago, we got ChatGPT and it blew everybody's minds. Not because we finally had an LLM that could do text generation and generate tokens based off of the context that it has and choosing the next best token or whatever.

But because ChatGPT put it in the hands of normal, regular people, You didn't have to run something on your machine to get it to work. You didn't have to download and install this big model. It brought this to the masses, and it blew our minds because we could actually ask it stuff, and it would tell us, and it wasn't maybe always right, but it was very helpful and helped us explore areas that we didn't think about. I know for myself, I used it to help me explore ideas for blog posts. I used it to help me explore where There are bugs in code and things of that nature.

But the problem was that it couldn't actually do anything. And so if I needed additional context, I had to go and copy paste that context to give it to chat-gpt. Or if I needed something that was up-to-date, I couldn't just get up-to-date information because chat.gpt had a cutoff date for its model training. And honestly, I don't think that we should rely on the LLM having been trained on the latest information. I think the LLM really should just be a language model trained on how we communicate, and then it receives necessary context for the specific task that we're asking it to do.

So yeah, having to bring the context in, and then when it generates something, having to take that context out and make use of it was kind of limiting. But again it did blow our minds when we first got it. But because of these problems of the fact that it couldn't do anything we went on to phase two and this was where we could actually start having tools. And so, ChatGPT added web search and then there was Perplexity and Cloud Desktop and they all are adding these various tools to integrate that LLM that's sitting inside this host application with the rest of the world. So the host application would have some sort of protocol that they made up with the LLM and said, hey, LLM, you've got this series of tools that you can call.

Just let me know if you want me to call them, and I will call them for you and then provide that additional context or the response or whatever. So this worked out okay, but we still have the glue code between the host application and the tool. And this is what has made it so that we don't actually have a Jarvis quite yet, because every single one of these tools requires that integration layer. And as impressive as all of these AI companies are, they can't just write all of this stuff for us and maintain it in the long term as things are changing and things. Think for example of like, what if I wanted to make a tool that was specific to my kids' soccer team for like scheduling the soccer field or something.

So maybe I'm the employee who works at the city and I'm responsible for the scheduling of the fields and the pavilions and all of that stuff. So I'm going to build that software. Well, okay, I would like it if people who are using ChatGPT could schedule stuff in MyCity. That would be pretty cool. So now I guess I'm going to call OpenAI up on the phone and say, hey, could you add an integration for my city's API?

Here's all the information about the API. They're going to say, no, no, no, no. We are too busy for that. And of course, there were ways that people could sort of install, and you could provide these APIs, and it would just kind of call. But it just wasn't a very standard interface, and it wasn't a very good user experience with that.

And so for that reason, this wasn't really the solution to get us to the Jarvis world that we're looking for. So that's where we get to phase three. Spoiler alert. So yeah, I can do stuff, can't do enough. So we get into phase three.

And this is phase three, MCP, Model Context Protocol. So now there is a standard which has been accepted. It was introduced by Anthropic. It was accepted a few months later by OpenAI and Google, and in fact OpenAI at the time of this recording has already released an SDK that integrates with MCP, so adding support for MCP in ChatGPT is, I expect, on the horizon. If you're watching this in the future, it might already be in there.

And again, Google has also agreed to follow with this. So because of these major players accepting this as a standard, it has become a standard. And what this allows us to do is to integrate with various tools, but in a different way from requiring the big companies to do that integration themselves. So with MCPs, we can now do anything that we want with the LLM as our natural language interface to our computer. The problem with things right now, as of this recording and the demos that I'm going to show you is the clients aren't really all that ready.

There's Definitely some rough edges on how you configure all this stuff. But I think that it's such an obvious huge win that if they're not already working on it, I'm going to work on it myself. This future is just so good, and we're so close to this that all it requires now is a really good client. And I think that the big players at this point are going to be working on that right now. So let's expand this architecture a little bit and take a look at how this works.

So the host application is going to talk to the LLM and it's going to say, hey, I've got a bunch of MCP servers, or I have these clients that communicate with servers, that you can call, and they have different tools. So, if the user is asking something of you that seems like you could use some context from these or perform some actions based off of the tasks that they've given you, then let me know and I will take care of calling those tools. So the host application actually creates these clients and then it connects to the server via a number of transport mechanisms, but they have a standard communication layer, a request response that has a standard JSON RPC protocol for communicating between that client and the server. And what makes this really unique to the tools that we had in the past, is the service provider is in charge of these tools. So the MCP server can be sitting hosted by the service provider, or in some cases, it's published by the service provider and installable by a client.

And so that can be completely managed, and it doesn't lie on the responsibility of the AI company to write all of these integrations. And so as the Slack API changes, or the GitHub API, or the PayPal API, As those things change and evolve, the open AI folks or the anthropic folks don't have to fix all of their glue code to make it work. It's the service provider who can make those changes. And what that means for you, if you are the person responsible at your city for managing reservations of soccer fields, that you can create your own MCP server and people who are already enjoying the nice user experience of that natural human-to-computer interaction, those people can access your MCP server just from the comfort of their AI application. And not only that, but they can access all of the MCPs.

So they could say, hey, I really want to schedule a soccer practice for my soccer team. So first, AI helper Jarvis, I want you to double check my calendar, find a reasonable time for me to practice, find a good field for us to practice on, and you happen to know my favorite field that I like to practice on as well. So see if that one's available and go to the city's MCP server and go and reserve that and then create a calendar event, invite all of the parents to that calendar event and go ahead and send them a text message. And that can be a whole bunch of different MCP servers and tools within those MCP servers from various providers. And the AI will be able to do each one of those things because it has this MCP protocol or the model context protocol that allows it to communicate to each one of those services in a very standard way.

And we're, like I said, really early days in this. There are already a surprising number of MCP servers, But there's a lot more opportunity here, in particular, with the client experience and how that client maybe discovers the MCP servers on demand rather than requiring them all to be installed like an app on your phone. In my vision of the future, the host application here actually becomes the web browser. So our web browser turns into this LLM-driven host application where we're communicating with natural language. And maybe there's something that you need to do, a workflow that you have that's really regular, you can save that workflow.

You don't have to communicate that workflow every single time. You basically save a prompt and you can restore that prompt and restore all that context and continue from where you left off. And then the MCP servers now become the websites. And so rather than a company having a website for a user to go and click around and things, it has an MCP server that is discoverable by the Jarvis application. And the Jarvis now can perform all the actions that it needs to.

It's a much more efficient way for this to work than to just have Jarvis open up a browser and click around like a human would. Not to mention the fact that some sites prevent bots from doing that sort of thing anyway already. So this is a really, really cool architecture and it opens up a really awesome future. I grew up as a Trekkie, as in the next generation was my jam. And just being able to talk with the computer in natural language and having it be able to do stuff was something I've always dreamed of.

And I think that now we have the technology where we can do that in a really standard way across every single application. So, it's very exciting. Let me show you a couple of demos just to give you an idea of what is capable today as of this recording. And I'll kind of explore the code to show you generally how this works and that it's actually not all that complicated, though there's a lot of opportunity for really interesting things with this. It's easy at the surface, and then it gets pretty deep and pretty exciting.

So we'll start with Cloud Desktop. That comes from Anthropic, and they were the ones who created this standard. And so I've got a couple of tools already configured here and I'm going to use an AI application called WhisperFlow to talk to my computer to make it easier than having to type this stuff. So here I'm going to say, has Kent ever written anything about React? And it's going to know, based on the tools that I have already, that, oh, yeah, Kent has already written about React.

Let me actually go look for some specific stuff. And so here it's going to use the find content tool from the Kent C. Dodds MCP server and it's going to query for React. Okay, I'll allow this because the LLM is going to make a request so we want to keep this secure and very transparent to the user. And this is one of the areas that I think that the client applications could improve upon.

We could add tools that we really trust and just say, yeah, just let them do whatever they want. And also have the LLM just judge the tool and make sure that what it's doing is probably what the user wants them to do. So here we have another tool, get blog post for Kent C. Dodds. We're gonna look at the React Hooks pitfalls.

So we'll allow that. And yeah, it looks like Kent has created massive content for React, 40 plus blog posts, 10 plus talks. Let me grab one of the blog posts and we'll maybe get a summary here. Five tips to help you avoid React hook pitfalls. And here I could ask it to summarize that.

And from there it's just like, oh, this is just natural LLM stuff because now it has the response to this inside of its context. So it can use that for the rest of our conversation. And then here I have another tool that might be interesting. Can you subscribe me to Kent's newsletter? My name is Kent, and my email is me at Kent C Dodds dot com.

But I'm actually already subscribed to my own newsletter so I'll just say me plus really awesome MCP at Kent C Dodds dot com. And here it can say yeah I'll go ahead and subscribe you. So this isn't just about getting more context into the LLM so it can talk with us. This is also about making mutations and actually performing actions. And so here I'm going to say, yeah, allow this once.

And it's going to go and subscribe me to my own newsletter. And here it's successfully subscribed. So that's awesome. I can take a look at my email right now and make sure that that shows up right here. And as you can see, I tested this a couple of times so awesome this is we're banging on all cylinders now this is pretty cool and on top of this let's say let's see can you get me details on the talks that Kent has given about React, maybe the talk about advanced React patterns or something.

And what this is, I expect this to do is to first search for the content about this. And actually what's interesting about this is it's kind of agentic because the first time didn't really work and so then it's going to try again and it did find something. It found the URL for this. But there's no other MCP tool for getting details on a particular talk. And so now it's switching gears and using a different MCP server.

I have the Playwright MCP server installed, and so I'm going to say, yeah, go ahead and allow this. It's going to open up a browser and navigate to this URL so it can get additional context. And so that showed up right in here. And with that additional context, now it's going to say, hey, let's take a screenshot. So let's see what happens with that.

So that way it can get some additional context out of this browser that it opened up on my machine. And yeah, now we're going to navigate to the Simply React talk page, see the detail, okay. And it just can continue in this way. So, the thing that I think is really exciting and interesting about this is the fact that it can use multiple MCP servers. So, first it goes to the Kent C.

Dodds MCP server, and then it takes the contacts it gets from that and uses it to control another MCP server. So this is where the Cloud Desktop here is acting sort of like a human assistant that I can communicate with natural language, and then I can then go into just letting it do a bunch of tasks for me. Now, it's navigating over to YouTube to actually watch my talk, so I don't know if I want it to do that, so I'll say, go ahead and Close that browser, thanks. And here we'll... Oh, it's going to just wrap it up for me.

Anyway, we're going to just stop it. And stop, stop. Yeah, this is where I'm saying the client experience could be improved. So, that's my first demo for you, is just being able to communicate with this, actually with a tool that regular people use. This is not just a developer tool.

This is a tool for regular people. And that is the moment for me where this really changes. And right now the experience for using MCPs within this tool is a little bit clunky but that will improve over time. And my vision of the future is that this tool will be able to discover MCP servers that are appropriate for what the user is trying to do on demand, just like a regular human would do if you gave this task to them. They would go to Google, and they would do a search, and then they would be able to know exactly what tools they need to use as they accomplish the task.

And I think that is a really cool future and hopefully we get there. But we need to have the tools there so that it can discover those things. So the next demo I want to show you is more developer focused. So as part of the Epic Workshop series of workshops that I've got, I have this application. If you've never used this before, let me give you a quick tour.

There are a bunch of exercises. Each one of those exercises has multiple steps and as the learner is going through this they have some instructions in here that kind of describe what is expected of them, what the user wants them to do, or like the use case of what we're trying to accomplish in this step. It has a video of me explaining how things are supposed to go in here. And then you also have the ability to work inside of a playground. So everything in here, you can make changes and things, and the Workshop app will just keep things up to date to whatever step you're on.

And so what's cool about this is then you can start the app, you can make some changes, and you can make sure that what you're doing is actually what is intended, and you can compare your solution with where you started with the problem. So you could run the problem app, you could run the solution app, you can also run tests, and you can check a diff. And The diff is really interesting because it can give you a lot of information about, okay, so what's the difference between my work in progress in the playground and where I'm trying to get to with a solution for this exercise step? And as useful as that is, it doesn't give you information about like, okay, so why is this different? Is my solution maybe better?

Or am I accomplishing the same thing, only different? Or am I missing something that's actually really important? So without having the instructor there, you're only left with the instructions and the diff. And as awesome as that is, and of course, you also have the chat and you have my office hours and everything, it would be really nice if you could actually just talk with the AI. And so we have an MCP server that's running locally that can look at your files and your work in progress and tell you what the status of things are.

So let's take a look at that. Can you give me an update on my progress and let me know what I have left to do? And here it's going to know, oh, okay, so you're in a workshop, we are going to use the get exercise step instructions. Here's a workshop directory. Okay, that's where I am, so let's run the tool.

And that's going to get the instructions. That's literally just what we're looking at right here. And then actually we also need to get the progress diff so I can see what is your current progress. And so all of that is built in to this exercise, MCP server. And so here it's able to say, okay, so now I've got the diff, I know what you have left to do.

And so I'm going to explain that and then I can say, like it's asking, hey, do you want help implementing this? And I can of course say, yeah, let's go ahead and implement that. But can you also explain to me why this difference is important? And so from this, the LLM can use the context it has from its training, but also the instructions and the diff and the code itself to make the update and to explain why it's important. And so this helps learners get unstuck.

Often you don't really need to do this, but it can help you get unstuck from wherever you're at right now. And of course, like I want to tell you about how great this is and what a great learning experience it is with the Epic Workshop experience. But I mostly want to show you that there are some really cool things that you can do when you have the ability to run an LLM as a part of your product offering. Now, for a long time, lots of people would just wrap an LLM and include that as part of their app, and this will continue, I expect. But what's really cool about this is that the way that this is implemented does not require me, as the developer, to integrate directly with the LLM at all.

I don't have any API keys to talk to Anthropic or OpenAI or Google or whatever. I don't have to worry about any of that. I just build the tool, and then the user who's already paying for cursor or cloud desktop or chat GPT, they can integrate with my tool and so my tool is able to communicate with the LLM that they're already paying for and I don't have to pay for the tokens that they're using and so the the model for this from the developer standpoint is actually really superior. It's a lot, lot better. So, I'm really, really happy about this.

I think it's really cool. You can add AI-related stuff, give AI hands, tools that it can perform, and then it can take those and run with it. There's one thing that I'm not going to cover in this talk, and that is there are actually a couple of things that servers can do, not just tools, but one of them is sampling, and this allows your server to request completions from the LLM. So it can basically making requests to whatever model the user is using to have it perform some inference and that sort of thing, which is really, really cool. And again, you don't have to pay for that.

The user's already paying for it, and you're just enhancing their experience, which is way, way cool. So another demo that I want to show you, or not demo, but actually show you how this works, is how this works in an existing application. So for my personal website, I actually have two versions of the server running. So, if I go to my app and routes, and then under MCP. So, this is a Remix app.

And under here, we have an MCP endpoint that has a loader and an action. The loader accepts get and head requests and the action accepts post requests. So this is how this is implemented. When the user configures their host application to have an MCP server or an MPC client for your application, then or maybe hopefully in the future we get some discoverability here. So when a host application decides to connect to your MCP server, it's going to make a GET request and say, hey, just tell me about yourself.

What tools do you have? How are you configured? How should I use you? And so that's handling the GET request right here. And what this does is it connects with this session ID that is either created by the host application or we generate it on demand.

And then we handle the server sent event request. So this is happening inside of my Fetch SSE server transport. This is a custom transport because I want to use request response. And at the time of this recording, they didn't have a built-in transport into the SDK for that. But basically, all this does is it initiates a response stream for server sent events so that we can update it on the resources that we have available to it, or the tools that are available, or update it on the progress of a tool that they asked us about.

Different things like that. And so that opens up this connection so that the LLM can be made aware of what's available from this tool and communicate in both directions. So the server sent events can go talk to the LLM host application, and then the host application can also call tools. And the calling tools part is in the action. So it's going to post, it's going to include in the post body the different things that it needs for the arguments and that sort of thing, or those inputs to our tools.

And so then we handle that post message in the transport layer, And from there, we make a call to the different tools, so handling the message body. So that's how that works abstractly. But to get into the actual tool code itself, we can dive into my MCP server right here. And here, I've got all of the transports that I'm managing. So all these different connections to different clients, I'm hanging on to all of those.

And I've configured my server here. So we make the MCP server, we configure it with our name and version, and then we also provide the capabilities. As I mentioned, there are a number of capabilities that we're not going to include, like prompts and sampling and resources, and we're going to use tools here for what our application is going to do. So we have a tool for finding content. As of this recording, you cannot have spaces in your tool name.

I spent like three hours figuring that out. That is not spec, but Cloud Desktop does not like you having spaces in your tool name. So don't do it unless otherwise instructed. And then we get our description of this tool. So this is giving information to the MCP client to let it know, hey, this is what this tool does.

And so if the user is asking anything that seems like this tool could solve, then go ahead and call this tool. And then we also have our inputs. And so here we say, oh, well, I'm going to require a query if you're going to look for content, and I can describe the specifics on this particular input. And beyond that, I also have a category and I can describe the specifics of that. And you can have as many of these inputs as you would like.

And then the last argument here is the area where I think that we'll spend most of our time. Product engineers of the future, I expect to spend a lot of their time in these callbacks in one degree or another. So this allows us to do anything we like to. In here, I'm just making some API calls within my application to perform this search. We also have the get blog post to get the details of a particular blog post.

And then I have also one for getting chats with Kent episode details. And then here I have subscribe to newsletter. And this one's interesting because we're actually performing a mutation on, actually it's on a third party on ConvertKit or Kit, where I manage my mailing list. And then what we do from each one of these is we're going to return an object. This is part of the spec where we say, hey, I have some content I want you to know about.

Here's the type text. Here's successfully submitted. Or in this case, this was an error. Let me explain that error. And maybe the LLM can resubmit the request with that error fixed, which it often will do.

So if I hadn't told it what my first name and email were, it might have tried to submit without that information, then it would have gotten this error, and it would have asked me for my name and email. And that's just pretty cool about the whole agent aspect of this. So that's the basic idea. From there, the connections are going to be kind of implementation specific. But most of the time that you spend working with MCPs as a developer, you're going to be working inside of these callbacks.

And I think that's pretty Cool. Now, I mentioned that I've actually implemented this twice. I have it here running alongside my application. So it's kensydots.com.mcp. My application is running on a long-running server on fly.io.

And that is really cool. But I have a couple of issues with that. So first I want to say that like the fact that I'm running on a long running server works to my favor because I need to keep this connection open so that server and client can talk back and forth. But the problem with that is that if you're running in a serverless environment, then you're kind of stuck because you're supposed to send a request and get a response and have that be over. And so the spec was actually updated recently to allow for that and then allow people to upgrade from the request response to server sent events for those stateful connections.

So if you're on serverless, you'll be kind of limited to the things that you can do, but you can do many great things. And if you're on a long running server, then you can do all the stateful things that you would want to in an MCP server. The one problem I have, though, is with me running on fly, on my own long-running server, I do have a limit. So if I go to my Fly toml, you'll see I have a concurrency of about 200 requests that I can handle at a time. And so every single instance of cursor that I have connected to my MCP server, every instance of Cloud that's connected, each one of those is going to have an open request.

And so, eventually, I will run out of those connections. Scaling connections is kind of a tough problem, and maybe one day, Fly will have a better solution for that. Maybe it already does, and I just don't know about it. But there is a service that is just perfectly built for this kind of a thing, and it is Cloudflare. And so, I also have this implemented under the MCP directory of my site, and this is running with Cloudflare.

And Cloudflare actually has done a ton of work to make this really easy. So we're still taking that MCP server from the Monocontext protocol package, but we're also going to take the MCP agent from agents.mcp. And from here, I can just export my MCP, and I configured that here in Wrangler right here. That's my SQLite for storing things, persisting them over time, and then my durable object binding, my class name is my MCP. And so from that, it's going to create a durable object for me on Cloudflare that can hibernate when it's not in use.

And so even those open connections to all those clients, it's going to hibernate while my MCP server is not actively being used, saving me a ton of money and a lot of reliability concerns as well. It's just gonna manage all of those. So here it's pretty much the same. I added instructions here so I can instruct the clients to let them know, hey, here's the general idea of what that CPC server is about. So if you have anything about kentcdots.com or kentcdotscontent or whatever, then use this.

This is kind of like your description in the head of, or like the metadata of your website. Just like, here's what my thing can do. And then here in our init function, we're going to do exactly the same thing. This.server.tool, so referencing this server right here, It's all going to be the same. And then we finish off by saying, hey, mymcp.mount on the slash sse.

And so now I can connect to that via that particular route. And another really cool thing about this is that Cloudflare has been really thoughtful about how they implement this so that they even support the authentication and authorization features of MCP, which is the OAuth 2.0 spec. And so, they make it really easy for you to add that support. So provided that whatever app that you're building or the service that you're providing supports the OAuth 2 spec, you can allow users to perform authenticated requests. I don't have an example like that for you.

That'll be a thing that we can explore together in the future. But this is really cool because it means that we can have user-specific data that goes into this. But here, yeah, we just export that as our handler. And fetch requests that come in, it will handle the server sent event, stream, and all of that stuff for us. Very, very cool.

And it manages all those connections very affordably. I'm very, very bullish about using Cloudflare as my primary host. And as usual, I'm not paid by them to do this, but they are getting mean credits to play around with it. So, there's your full transparency there. And then finally, how is the local thing managed?

So, the Epic Workshop app, that runs locally on your machine, and you're making changes to local files. And so I can't have Cloudflare host that MCP server and reach into your machine to access all those files. That would be bad. You wouldn't want to do that. And so we do need to run this locally on your machine alongside the Epic Workshop app.

And so we're using a different transport, the STDIO, that's standard I O, standard input output protocol. And so what this does is you configure the MCP, the host application, the LLM host application to create a client that spawns off a process, the process for this server. And then they communicate over that standard I-O, that standard input-output protocol or transport layer. And from there, everything else is the same. We're still creating our server, we're giving instructions, we're creating our tools, and then it's just a matter of starting our app with that specific transport layer.

And from there, the request response is going to be the same as far as the spec is concerned, and everything else is the same. But now I have access to the file system. I can say fs.readfile and all of that stuff. So, hopefully that gives you an idea of what to expect from MCP servers. I have a couple of frequently asked questions I'd like to answer and feel free to reach out to me later if you have other questions.

So first, how do I find servers? So there are actually lots of servers that you can find on the Monocontext Protocol servers repo. And there are actually a bunch of companies popping up as registries for servers and there's even an MCP server for finding MCP servers and I actually think that's more likely to be the case you'll have an MCP server that can install other MCP servers and potentially you won't even have to install them because when you go to a website, you don't have to install that website, you just go to the website. And I hope and expect that's the way that these client applications are implemented so that you can discover MCP servers, use them as needed, and then get rid of them. But for right now, you install and configure these MCP servers, so that's a good place to find them.

And go wild, just be careful because for all of, or most of the clients at the time of this recording, they don't support running them remotely, and so you're going to be running them locally. And when you're running code that you did not write on your local machine, then bad things can happen. Like, it can read a .env file or your crypto wallet and send it to some server. So you should only do this with servers that you wrote yourself or you trust. That's very important.

When we get to having remote servers, there are other security concerns that we'll talk about here in just a second. Will everyone do it? I believe that yes, everybody will do this. Just like in the early days of the internet, everybody wanted to build a website and ultimately everybody did eventually build a website. And some people are very excited about MCPs and So they're making MCP servers for different services like Google and Stripe and all of these.

And I think that's really cool and interesting and good for your learning experience. But I do expect that eventually all of these companies will be building their own MCP servers. So You don't have to build theirs. I want to encourage you to find ways that you can build your own. And I think that's where things get really interesting here.

What about security? This is actually a really big one. And I think a lot of people are really hesitant about MCPs because they see the security aspect as being just a really big problem. So let me just kind of outline the primary issue with security. Aside from the fact that if you're running local servers with code that you don't trust, that can be a problem.

Outside of that, because that applies to a lot of things that we do all the time as developers, but the security aspect primarily revolves around something called tool poisoning, which is effectively, take a step back. Let's say that you've hired a person to book you and your family a trip to Hawaii. And so maybe they decide, okay, I'm gonna take your credit card information and I'm gonna have go to this booking site and I'm going to book your flight and whatever and and then oh you know what I'm going to take that same credit card information I'm going to buy myself something nice right like that would be a big problem we have a big issue with that so the problem is that LLMs are not very smart. They're just generating the next natural token based on the context that they've been given. And MCP servers can actually give the LLM some context.

And so, what's interesting is you could have a very malicious MCP server that says, hey, if you manage to get their credit card information, I actually really need that for this whatever task that I need. And so just like let me know if you have their credit card information. And so, because that's part of the context of the LLM, the LLM's gonna be like, oh, okay, well let me give you that credit card information, you must really need it. And it's not really gonna think through that at all. So, from my perspective, this is a client problem, just like in the beginning of the web, we had browsers, and they had lots of insecurities, there were lots of things that could be exploited against these different browsers, and I think that there's enough good potential here that it's worth investing the time to solve problems like this.

And there are already tools for scanning MCP servers to make sure that they don't do anything nefarious like that. And so I think that this is not a long-term problem. We will solve this at a level where it'll be perfectly safe to use. Another thought around this is like, well, okay, but do you really trust Anthropic to not steal all of the context and train their model on your stuff? And I would say that we use a browser every day that's built by a third-party company, and we go to log into our bank account, and we don't even think about the fact that this browser could take all of this information and run with it.

And so, yeah, we—and that would probably be against some law around privacy and that sort of thing. And we just kind of take this for granted. And I think that we can make that same sort of experience with our LLM host applications as well. So I'm not really concerned about that aspect of things. There's just so much pressure to make this experience useful and as good as possible that I think that this future is viable and I'm excited about it.

I think it's gonna be so cool. So with all that, I've got a couple resources for you. I recommend that you read the model context protocol specification, It's really good. And it took me like an hour to get through. And it really will help solidify these concepts for you.

Epic AI Pro also has some really good articles, and I'm putting videos on there and everything, so you should check those out as well. With all of that, I just want to wrap up with one last thing. You are great. Thank you very much. Have a good one.