How to Deploy an MCP Server with Billing
Complete walkthrough of deploying a production MCP server with integrated billing. Covers template selection, environment configuration, deployment (Vercel, Railway, or Fly.io), Stripe connection, going live, and monitoring.
Choose a Template
SettleGrid provides 13 MCP server templates and 4 REST API templates, each with billing pre-wired. Browse them at /templates or run npx create-settlegrid-tool --list to see all options in your terminal. Templates range from simple (a single-tool server that wraps an external API) to complex (multi-tool servers with database connections, caching, and rate limiting).
For your first deployment, choose a template that closely matches your use case. The "web-search" template is a good starting point if your tool calls external APIs. The "database-query" template works well for tools that read from a database. The "ai-proxy" template suits tools that call an LLM and add value on top (summarization, classification, extraction).
Each template includes a complete project structure: source code, tests, Dockerfile, deployment configs for three platforms (Vercel, Railway, Fly.io), environment variable documentation, and a CI/CD pipeline. Fork the template, customize the handler, and you are ready to deploy.
Configure Environment Variables
Every SettleGrid deployment requires three environment variables: SETTLEGRID_API_KEY (your publisher API key), SETTLEGRID_TOOL_ID (your tool's unique identifier, assigned when you register), and STRIPE_CONNECT_ACCOUNT_ID (your Stripe Connected Account ID for payouts). All three are available in the SettleGrid dashboard under Settings > API Keys.
If your tool calls external APIs, add those credentials as environment variables too. Never hardcode secrets in your source code — the templates use process.env for all configuration and include a .env.example file documenting every required variable. For local development, copy .env.example to .env.local and fill in your values.
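A startup check can catch a missing variable before the first billed call. The following is a minimal sketch, not part of the SettleGrid SDK: the helper names missingEnvVars and assertEnv are hypothetical, and only the three variable names come from this guide.

```typescript
// The three variables this guide lists as required for every deployment.
const REQUIRED_VARS: readonly string[] = [
  "SETTLEGRID_API_KEY",
  "SETTLEGRID_TOOL_ID",
  "STRIPE_CONNECT_ACCOUNT_ID",
];

type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the names of required variables that are
// unset or blank. `extra` lets you add credentials for your own upstream APIs.
function missingEnvVars(env: Env, extra: readonly string[] = []): string[] {
  return REQUIRED_VARS.concat(extra).filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Hypothetical helper: fails fast with one message listing everything missing.
function assertEnv(env: Env, extra: readonly string[] = []): void {
  const missing = missingEnvVars(env, extra);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

In a Node.js server you would typically call assertEnv(process.env) once at startup, passing any upstream API credential names as the second argument.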
For production deployments, configure secrets through your hosting provider's dashboard or CLI. On Vercel, use vercel env add. On Railway, use the Variables tab. On Fly.io, use fly secrets set. All three platforms encrypt secrets at rest and inject them as environment variables at runtime. Double-check that your SETTLEGRID_API_KEY is the production key, not the sandbox key — they are different.
Deploy to Your Platform
The deployment process varies by platform, and SettleGrid templates include configuration files for all three supported platforms. For Vercel, push to GitHub and import the repository; the vercel.json in the template handles the build configuration. For Railway, click "Deploy from GitHub" and select your repo; the railway.toml defines the service. For Fly.io, run fly launch followed by fly deploy; the fly.toml and Dockerfile are ready to go.
Regardless of platform, verify that your deployment can reach the SettleGrid API by checking the health endpoint. The templates expose a /health route that reports SDK version, API connectivity, and billing pipeline status. Hit this endpoint after deployment and confirm all checks pass before proceeding.
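The post-deploy check can be automated once you know the response format. The sketch below assumes a payload shape: the HealthReport interface and its field names are hypothetical, so compare them against what your template's /health route actually returns before relying on it.

```typescript
// ASSUMED shape of the /health response; the real template payload may differ.
interface HealthReport {
  sdkVersion: string;
  // One boolean per check, e.g. { apiConnectivity: true, billingPipeline: true }.
  checks: Record<string, boolean>;
}

// Returns the names of checks that did not pass. An empty array means healthy.
function failedChecks(report: HealthReport): string[] {
  return Object.keys(report.checks).filter((name) => !report.checks[name]);
}
```

A deploy script could fetch the /health URL, parse the JSON body, and stop the rollout if failedChecks returns anything.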
For production workloads, configure auto-scaling. The SettleGrid SDK is stateless — it sends metering events asynchronously and does not require sticky sessions — so horizontal scaling works out of the box. Set minimum instances to 1 (to avoid cold starts), maximum to whatever your budget allows, and let the platform scale based on CPU or request count.
Connect Stripe for Payouts
SettleGrid uses Stripe Connect to pay tool publishers. If you do not already have a Stripe account, create one at stripe.com and complete identity verification. Then, in the SettleGrid dashboard, go to Settings > Payouts and click "Connect Stripe." This initiates the Stripe Connect onboarding flow, which takes about 5 minutes and requires your bank account details.
Once connected, SettleGrid automatically transfers your earnings to your Stripe balance on a rolling 7-day schedule. You can view pending payouts, completed transfers, and revenue breakdowns in both the SettleGrid dashboard and the Stripe dashboard. SettleGrid uses a progressive take rate on monthly revenue: 0% on the first $1K, 2% on $1K to $10K, 2.5% on $10K to $50K, and 5% above $50K. Most developers pay 0%.
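To estimate what the take rate means for a given month, the tiers can be worked through in code. This sketch assumes the rates apply marginally, so each rate applies only to the revenue inside its own band. The word "progressive" suggests this, but confirm it against your actual statements. The function name monthlyFeeCents is hypothetical. Amounts are in cents and rates in basis points to keep the arithmetic exact.

```typescript
// Tier upper bounds in cents (null = no upper limit), rates in basis points.
interface Tier {
  upToCents: number | null;
  rateBps: number;
}

const TIERS: Tier[] = [
  { upToCents: 100000, rateBps: 0 },    // first $1K: 0%
  { upToCents: 1000000, rateBps: 200 }, // $1K to $10K: 2%
  { upToCents: 5000000, rateBps: 250 }, // $10K to $50K: 2.5%
  { upToCents: null, rateBps: 500 },    // above $50K: 5%
];

// ASSUMPTION: marginal brackets. Returns the fee in cents for one month.
function monthlyFeeCents(revenueCents: number): number {
  let fee = 0;
  let lower = 0;
  for (const tier of TIERS) {
    const upper =
      tier.upToCents === null ? revenueCents : Math.min(revenueCents, tier.upToCents);
    if (upper > lower) {
      fee += ((upper - lower) * tier.rateBps) / 10000;
    }
    if (tier.upToCents === null || revenueCents <= tier.upToCents) break;
    lower = tier.upToCents;
  }
  return Math.round(fee);
}
```

Under this assumption, $5,000 of monthly revenue (500000 cents) costs $80 in fees, and $60,000 costs $1,680, an effective rate of 2.8%.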
Test the payment flow end-to-end before going live. Use the SettleGrid sandbox with Stripe test mode to simulate tool calls, verify that metering events are recorded, and confirm that settlement amounts are correct. The sandbox produces test webhook events that you can inspect in the Stripe dashboard under Developers > Webhooks. Verify that the amounts, descriptions, and metadata match your expectations.
Go Live and Verify
When you are ready to accept real payments, switch from sandbox mode to production mode in your settlegrid.config.ts by setting mode: 'live' (or by removing the mode field, since live is the default). Redeploy your server with the production SETTLEGRID_API_KEY. The SDK will now meter real usage and settle real payments.
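Because live is the default, a config with no mode field will bill real money. The sketch below models that rule so a pre-deploy script can make it explicit. Only the mode field and the 'live' value come from this guide. The 'sandbox' literal, the ConfigSketch type, and both helper names are assumptions, not the SDK's actual types.

```typescript
// ASSUMED values: 'live' is from the guide, 'sandbox' is a guess at the other mode.
type Mode = "sandbox" | "live";

// Hypothetical stand-in for the part of settlegrid.config.ts that matters here.
interface ConfigSketch {
  mode?: Mode;
}

// An omitted mode resolves to 'live', matching the default described above.
function effectiveMode(config: ConfigSketch): Mode {
  return config.mode === undefined ? "live" : config.mode;
}

// Hypothetical guard: throws if the config would run in a mode you did not intend.
function assertMode(config: ConfigSketch, expected: Mode): void {
  const actual = effectiveMode(config);
  if (actual !== expected) {
    throw new Error(`Expected mode '${expected}' but config resolves to '${actual}'`);
  }
}
```

A staging pipeline could call assertMode(config, "sandbox") so that a config with the mode field accidentally removed fails the build instead of silently going live.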
Verify the production deployment by making a few test calls from the SettleGrid dashboard's built-in tool tester. Check that calls succeed, latency is acceptable (p95 under 500 ms for most tools), and billing events appear in the Metering tab. Make one call with intentionally invalid input to verify that your error handling works and that failed calls are not billed.
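If you record the latency of your test calls yourself, the p95 figure is easy to compute. This is a minimal sketch using the nearest-rank method, one of several common percentile definitions. Your observability platform may interpolate and report slightly different values. The function names are illustrative.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = samplesMs.slice().sort((a, b) => a - b); // copy, then sort ascending
  const rank = Math.ceil((p * sorted.length) / 100);      // 1-based rank
  return sorted[rank - 1];
}

// True when the p95 latency is under the target (500 ms is the guide's figure).
function meetsLatencyTarget(samplesMs: number[], targetMs: number = 500): boolean {
  return percentile(samplesMs, 95) < targetMs;
}
```

With only a handful of samples the p95 is effectively the slowest call, so collect a few dozen calls before drawing conclusions.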
Publish your tool to the marketplace by running npx settlegrid publish --live. Your listing will appear in the explore page, category pages, search results, and the Discovery API within minutes. Monitor the dashboard closely for the first 24 hours — watch for error rate spikes, latency regressions, or unexpected billing patterns. Set up PagerDuty or Slack alerts for critical metrics so you can respond quickly to production issues.
Set Up Monitoring and Alerts
Production MCP servers need observability. The SettleGrid SDK exports OpenTelemetry spans for every tool call, so you can send traces to your preferred observability platform (Datadog, Grafana, Honeycomb, or any OTLP-compatible backend). The templates include a tracing.ts file that configures the OTLP exporter — just set the OTEL_EXPORTER_OTLP_ENDPOINT environment variable.
Configure alerts for three critical metrics: error rate (alert if >5% of calls fail), latency (alert if p95 exceeds 2 seconds), and revenue (alert if daily revenue drops more than 50% from the 7-day average). The SettleGrid dashboard supports webhook notifications that you can route to Slack, PagerDuty, or email.
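The three thresholds above can be expressed as a single evaluation function, which is useful if you compute alerts yourself rather than in the dashboard. This is a sketch under assumptions: the MetricsSnapshot shape and the alert names are hypothetical, and you would feed it numbers from your own metrics store.

```typescript
// Hypothetical snapshot of the metrics the thresholds refer to.
interface MetricsSnapshot {
  totalCalls: number;
  failedCalls: number;
  p95LatencyMs: number;
  revenueToday: number;
  revenueLast7Days: number[]; // daily revenue for the previous seven days
}

// Returns the names of every alert whose threshold is exceeded.
function evaluateAlerts(m: MetricsSnapshot): string[] {
  const alerts: string[] = [];

  // Error rate: more than 5% of calls failed (integer form avoids float rounding).
  if (m.totalCalls > 0 && m.failedCalls * 100 > m.totalCalls * 5) {
    alerts.push("error-rate");
  }

  // Latency: p95 exceeds 2 seconds.
  if (m.p95LatencyMs > 2000) {
    alerts.push("latency");
  }

  // Revenue: today is more than 50% below the 7-day average.
  const days = m.revenueLast7Days.length;
  if (days > 0) {
    const average = m.revenueLast7Days.reduce((sum, v) => sum + v, 0) / days;
    if (average > 0 && m.revenueToday < average * 0.5) {
      alerts.push("revenue");
    }
  }

  return alerts;
}
```

The returned names can then be mapped to whatever notification route you use, such as a Slack or PagerDuty webhook.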
Review your server logs weekly. Look for patterns: are certain input formats causing errors? Are specific AI agents making an unusually high number of calls? Is your upstream API returning rate limit errors during peak hours? Use these insights to improve your handler, adjust rate limits, and optimize your infrastructure for the actual usage patterns you observe.