Using MCP Sampling in VSCode (Insiders)

    Kent C. Dodds

    Hey friends! I’m super excited to share some awesome news: VS Code Insiders now supports the entire Model Context Protocol (MCP) spec. That means you get not just tools, prompts, and resources, but also roots and (🥁 drumroll) sampling! 🎉

    If you’ve been following along, you know sampling has always been a bit of a “just trust me, it works” kind of thing. But now, we can actually see it in action, play with it, and use it in real projects. This is a big deal!

    Real-World Example: Journaling with MCP

    In my MCP Fundamentals workshop, we build a journaling app that’s accessible only through MCP. It’s a great playground for these new features. We’ve got tools for creating entries, managing tags, and more. But here’s where it gets really cool: instead of manually adding tags to your entries, the LLM can now generate tags for you automatically using sampling.

    Here’s how it works (a rough code sketch follows the list):

    • When you create a journal entry, we make a sampling request to the client, asking it to generate recommended tags.
    • We use void instead of await so the user doesn’t have to wait for the sampling request to finish. The entry is created instantly, and the tag generation happens in the background.
    • The sampling request includes a system prompt and all the context needed to generate the right tags—either an empty array (if no tags are needed), references to existing tags, or recommendations for new ones.
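    Here’s a minimal sketch of what that can look like in a TypeScript MCP server. This isn’t the exact workshop code: db.createEntry and applyTags are hypothetical stand-ins for the app’s storage layer, but createMessage is the SDK’s way of issuing a sampling request to the client.

    ```ts
    // Minimal sketch, not the exact workshop code. db.createEntry and
    // applyTags are hypothetical stand-ins for the app's storage layer.
    import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js'
    import { z } from 'zod'

    declare const db: {
      createEntry(input: { title: string; content: string }): Promise<{ id: string; content: string }>
    }
    declare function applyTags(entryId: string, tags: Array<string>): Promise<void>

    const server = new McpServer({ name: 'journal', version: '1.0.0' })

    server.tool(
      'create_entry',
      { title: z.string(), content: z.string() },
      async ({ title, content }) => {
        const entry = await db.createEntry({ title, content })

        // void, not await: the tool responds right away and tag
        // generation continues in the background.
        void suggestTags(entry).catch(() => {})

        return { content: [{ type: 'text', text: `Entry ${entry.id} created` }] }
      },
    )

    async function suggestTags(entry: { id: string; content: string }) {
      // Sends a sampling/createMessage request to the client, which asks
      // the user for approval and then runs the completion on their LLM.
      const result = await server.server.createMessage({
        systemPrompt:
          'Suggest tags for this journal entry. Respond with a JSON array of tag names (or an empty array).',
        messages: [{ role: 'user', content: { type: 'text', text: entry.content } }],
        maxTokens: 200,
      })
      const text = result.content.type === 'text' ? result.content.text : '[]'
      await applyTags(entry.id, JSON.parse(text))
    }
    ```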

    Imagine in the future, with the elicitation feature (currently in the MCP draft spec), you could even approve or reject the LLM’s tag suggestions. That’s going to be a fantastic enhancement!
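    Since elicitation is still in the draft spec as of this writing, take this as purely speculative, but an approve/reject flow might look roughly like the sketch below. The elicitInput call and the result shape are assumptions based on the draft, not a shipped API.

    ```ts
    // Speculative sketch based on the MCP draft spec's elicitation feature.
    // Treat the elicitInput call and the result shape as assumptions.
    const answer = await server.server.elicitInput({
      message: 'The LLM suggested the tag "beach". Apply it?',
      requestedSchema: {
        type: 'object',
        properties: {
          apply: { type: 'boolean', title: 'Apply this tag?' },
        },
        required: ['apply'],
      },
    })

    if (answer.action === 'accept' && answer.content?.apply) {
      // apply the suggested tag here
    }
    ```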

    The User Experience

    Let’s walk through what this looks like:

    1. You ask the LLM to write a journal entry about a trip to the beach and insert it into the database using the MCP server.
    2. The entry is created—content, mood, location, weather—all there, but no tags yet.
    3. Instantly, you get a request: “Hey, the MCP server wants to make a language model call to generate tags. Allow?” (It’d be nice to see the exact prompt, but for now, you just approve it; a sketch of what that request looks like follows this list.)
    4. The sampling happens in the background. You don’t have to wait.
    5. Later, you can ask, “What tags were applied to this entry?” Boom: beach, nature, relaxation, sunset. Perfect!
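    For the curious, here’s roughly the JSON-RPC request behind that approval prompt. The method and parameter shapes come from the spec’s sampling/createMessage request; the actual values here are made up for illustration.

    ```ts
    // Roughly what the server sends to the client when it asks to sample.
    // Shapes per the MCP spec; the values are illustrative, not the real prompt.
    const samplingRequest = {
      jsonrpc: '2.0',
      id: 7,
      method: 'sampling/createMessage',
      params: {
        systemPrompt: 'Suggest tags for this journal entry...',
        messages: [
          {
            role: 'user',
            content: { type: 'text', text: 'Title: A Day at the Beach\n\n...' },
          },
        ],
        maxTokens: 200,
      },
    }
    ```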

    You can even peek behind the scenes to see the sampling requests and outputs. It’s all transparent and super handy for debugging or just satisfying your curiosity.

    Why This Matters

    This is a huge leap forward. Now, you can borrow the user’s LLM to accomplish all sorts of completions—like generating tags, summaries, or anything else you can dream up. And it’s all built right into VS Code Insiders.

    I expect this feature to land in other clients soon, so keep your eyes peeled. In the meantime, go try out sampling, have fun, and if you want to go deep, join me in the MCP Fundamentals workshop!

    Happy sampling! 🚀

    Transcript

    Hey, I'm way excited because VS Code Insiders has added support for the entire MCP spec. That means you get not only tools, prompts, and resources, but also roots and sampling. And this is very exciting because we've not been able to use sampling before. So it's always just been like, this is how it works. Just believe me.

    Now we can actually see how it works and start to use it. This is very exciting. So let's take a look at the way that this works. I have an MCP fundamentals workshop that we're going to be using here. In that, we build a journaling application that is accessible only through MCP.

    And we have a create entry tool. We also have tools for managing tags on entries and things. And as cool as it is to add tags manually to your entries, it's even cooler if when you create them, the LLM can just generate tags for you. And so once we create the entry, we actually make a sampling request to ask the client if they could just generate us some recommended tags. And we're using void here instead of await, so we don't have to wait for the user's approval of our sampling request.

    And then we just immediately return and say, yeah, created. Awesome. By the way, could you answer that sampling request? So in the sampling request that we create right here, we specify our system prompt and the messages necessary to ultimately generate the output that we're looking for. We need some structured output: either an empty array (if there are no suggested tags, maybe the user created all the appropriate tags themselves), or an array of references to existing tags or recommendations of tags to create.
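    A rough sketch of that structured output shape, assuming Zod for validation (the field names here are made up, not the workshop's actual schema):

    ```ts
    // Sketch of the structured output described above: an empty array, or a
    // mix of references to existing tags and new tags to create.
    import { z } from 'zod'

    const tagSuggestionsSchema = z.array(
      z.union([
        z.object({ id: z.string() }), // reference to an existing tag
        z.object({ name: z.string(), description: z.string() }), // new tag to create
      ]),
    )

    // The LLM's text response is parsed and validated before being applied:
    // const suggestions = tagSuggestionsSchema.parse(JSON.parse(responseText))
    ```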

    And you can imagine that in the future with the elicitation feature that's in the draft spec of MCP right now, that we could actually ask them, hey, your LLM recommended this tag. Are you good with it? Yes or no, which would be another nice enhancement on this feature as well. So then we provide it with all the context of the entry so it can generate the most appropriate tags, and then we parse those and create and apply those. So let's see what this experience is like.

    Please write a journal entry about a trip to the beach, go ahead and make it up, make it about three paragraphs long. Then insert it in the database. Make sure that it actually uses the MCP server. Okay, because sometimes it'll just like spit it out at me and I'll say, no, you got to put it in the database. So great.

    There's the content, mood: relaxed, location: beach, weather: sunny. OK, nice. But it doesn't include any tags. So we've got our content and title and stuff, but no tags yet.

    So awesome. Now we're getting the request. Hey, the MCP server, MCP fundamentals, that's the workshop, has issued a request to make a language model call. Do you want to allow it to make the request? It would be nice if I could actually see the specifics of the request, like what's the prompt that it's going to give and stuff.

    But I'm going to go ahead and allow it because I trust this server. Then we're done here. So it should actually be very fast. We're not going to get any feedback on the results, but we can take a look at that in a second. Let's just ask, could you look up the tags that have been applied to this journal entry, please?

    And let's see, it's gonna look up entry one, and oh, sweet, beach, nature, relaxation, sunset. That's cool. And we can take a look at the actual content right here. Looking for tags, there they are. Boom, we've got our tags.

    Awesome. And we can also take a look at the sampling to see what actually happened behind the scenes there. So if I go to list servers and then select MCP fundamentals, there is this option, show sampling requests. And so we've done six total requests in the last seven days. Here is what the request looked like.

    Interestingly, it doesn't include the system prompt in here, but we do have the output. So it suggested creating a beach tag with that description, nature, et cetera, et cetera. So it did exactly what I wanted it to do. And now you can borrow the user's LLM to accomplish some sort of completion, which I think is really, really exciting. So I'm very thrilled that VS Code Insiders now has support for this.

    I expect this to be added to a bunch of the other clients in the future. So keep an eye out for that. And that is sampling. Go sample stuff. Have a good time and I'll see you in the MCP workshop if you want to learn how to really go deep into all of this stuff.