API documentation

lincorelink speaks the OpenAI wire protocol. Point any OpenAI SDK at our base URL, swap in your key, and you are done — no new client, no rewrites.

Introduction

lincorelink is an OpenAI-compatible API gateway. It authenticates your request, meters your usage, and relays the call to a third-party large-language-model provider — currently DeepSeek V4 — then streams the response back to you. Because the wire format matches the OpenAI Chat Completions API, existing OpenAI SDKs and tools work without changes.

Base URL

bash
https://api.lincorelink.ai/v1

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /v1/chat/completions | Create a model response for a chat conversation (streaming or non-streaming). |
| GET | /v1/models | List the model IDs available to your account. |

Authentication

Authenticate every request with your API key in the Authorization header as a Bearer token. Create and manage keys in your dashboard. Keys look like sk-lcl-… and are shown in full only once, at creation — store them securely, because we keep only a hash and cannot recover a key later.

bash
Authorization: Bearer sk-lcl-...

A missing or invalid key returns 401 with code invalid_api_key. Keep keys server-side; never embed them in browser, mobile, or other client-side code. If a key is exposed, revoke it in the dashboard — revocation takes effect immediately.
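
As a minimal sketch of keeping the key out of source code, the helper below reads it from the environment and builds the request headers. The function name `auth_headers` and the prefix check are illustrative assumptions, not part of the lincorelink API; the variable name `LINCORELINK_API_KEY` follows the examples in this document.

```python
import os

KEY_PREFIX = "sk-lcl-"  # key prefix described in the text above


def auth_headers(env=os.environ, var="LINCORELINK_API_KEY"):
    """Build request headers from a key stored in an environment mapping.

    Hypothetical helper: the name and the local prefix check are
    assumptions made for illustration only.
    """
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"{var} is not set")
    if not key.startswith(KEY_PREFIX):
        raise ValueError(f"{var} does not look like a lincorelink key")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Failing fast on a missing key gives a clearer local error than waiting for a 401 from the gateway.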

Quickstart

Send your first request with cURL or any OpenAI SDK.

cURL

bash
curl https://api.lincorelink.ai/v1/chat/completions \
  -H "Authorization: Bearer $LINCORELINK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-pro",
    "messages": [{ "role": "user", "content": "Hello!" }]
  }'

Python (OpenAI SDK)

python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.lincorelink.ai/v1",
    api_key=os.environ["LINCORELINK_API_KEY"],  # your lincorelink key, kept out of source
)

resp = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[
        {"role": "system", "content": "You are concise."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(resp.choices[0].message.content)

Node.js (OpenAI SDK)

node
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.lincorelink.ai/v1",
  apiKey: process.env.LINCORELINK_API_KEY,
});

const resp = await client.chat.completions.create({
  model: "deepseek-v4-flash",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(resp.choices[0].message.content);

Chat completions

POST /v1/chat/completions accepts the standard OpenAI Chat Completions body. The most common parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| model | string | Required. One of deepseek-v4-flash or deepseek-v4-pro. |
| messages | array | Required. A non-empty list of { role, content } messages (system, user, assistant). |
| max_tokens | integer | Maximum tokens to generate. Clamped to your plan's output cap (see Rate limits & tiers). |
| stream | boolean | If true, tokens are sent as Server-Sent Events. See Streaming. |
| temperature, top_p | number | Sampling controls, passed through to the provider. |
| stop, frequency_penalty, presence_penalty | various | Standard OpenAI parameters, passed through to the provider. |
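
The required fields listed above can be checked locally before a request is sent. The sketch below is one way to do that; `build_chat_body` is a hypothetical helper, and the hard-coded model set simply mirrors the two IDs named in this document, so it would need updating if the live list from GET /v1/models changes.

```python
ALLOWED_MODELS = {"deepseek-v4-flash", "deepseek-v4-pro"}  # IDs named in this document
ALLOWED_ROLES = {"system", "user", "assistant"}


def build_chat_body(model, messages, **options):
    """Return a JSON-serialisable body for POST /v1/chat/completions.

    Hypothetical helper: it only checks the two required fields and
    copies optional parameters (max_tokens, stream, temperature, ...)
    through unchanged, dropping any that are None.
    """
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if not messages:
        raise ValueError("messages must be a non-empty list")
    for msg in messages:
        if msg.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {msg.get('role')!r}")
        if "content" not in msg:
            raise ValueError("each message needs a 'content' field")
    body = {"model": model, "messages": list(messages)}
    body.update({k: v for k, v in options.items() if v is not None})
    return body
```

For example, `build_chat_body("deepseek-v4-flash", [{"role": "user", "content": "Hello!"}], max_tokens=256)` yields a body with exactly the keys model, messages, and max_tokens.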

Response

A non-streaming call returns a chat.completion object. Its usage object reports cached and non-cached input tokens separately (prompt_cache_hit_tokens and prompt_cache_miss_tokens), and these counts are what we bill on (see Pricing & metering).

json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1734220000,
  "model": "deepseek-v4-pro",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Hello! How can I help?" },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 7,
    "total_tokens": 19,
    "prompt_cache_hit_tokens": 0,
    "prompt_cache_miss_tokens": 12
  }
}

Streaming

Set stream: true to receive tokens incrementally as Server-Sent Events, exactly like the OpenAI API. Each event is a chat.completion.chunk with a delta; the stream ends with a data: [DONE] line.

To receive token usage at the end of a stream, set stream_options.include_usage = true; a final frame then carries the usage object. If you do not request it, we strip our internal usage frame so the stream stays byte-for-byte like a native OpenAI stream.

python
stream = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[{"role": "user", "content": "Write a haiku about the edge"}],
    stream=True,
    stream_options={"include_usage": True},  # optional: get a final usage frame
)
for chunk in stream:
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")
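
If you read the stream without an SDK, the frames described above can be handled line by line. The following is a rough sketch, assuming each event arrives as a single `data: <json>` line in the OpenAI chunk shape; the sample frames are hand-written for illustration and are not captured gateway output.

```python
import json


def read_stream(lines):
    """Collect generated text and the optional usage frame from SSE lines."""
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices") or []:
            piece = (choice.get("delta") or {}).get("content")
            if piece:
                text_parts.append(piece)
        if chunk.get("usage"):
            usage = chunk["usage"]  # present only when include_usage was requested
    return "".join(text_parts), usage


# Hand-written sample frames (illustrative only).
SAMPLE = [
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "",
    'data: {"object": "chat.completion.chunk", "choices": [], "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7}}',
    "",
    "data: [DONE]",
]

text, usage = read_stream(SAMPLE)
print(text)
print(usage)
```

The usage frame carries an empty choices list, which is why the loop over choices must tolerate it rather than index into it.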

Models

| Model ID | Best for | Notes |
| --- | --- | --- |
| deepseek-v4-flash | High-volume, latency-sensitive, cheap | Available on every plan, including Basic. |
| deepseek-v4-pro | Harder reasoning, agents, code | Available on Pro and Max plans. |

Fetch the live list any time with GET /v1/models. As more models and providers are added, they become available under the same key and balance.
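
A minimal standard-library sketch of that call follows. It assumes the response uses the OpenAI list shape (`{"object": "list", "data": [{"id": ...}]}`), which this document implies but does not show; the function names are illustrative.

```python
import json
import urllib.request

BASE_URL = "https://api.lincorelink.ai/v1"


def models_request(api_key):
    """Build (but do not send) the GET /v1/models request."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def model_ids(payload):
    """Extract sorted model IDs, assuming the OpenAI list response shape."""
    return sorted(item["id"] for item in payload.get("data", []))


def list_models(api_key, timeout=10):
    """Send the request and return the model IDs visible to this key."""
    with urllib.request.urlopen(models_request(api_key), timeout=timeout) as resp:
        return model_ids(json.load(resp))
```

With the OpenAI SDK configured as in the Quickstart, `client.models.list()` performs the same request.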

Pricing & metering

Billing is prepaid and metered per token. You are charged on the usage the provider reports, with cached input billed far cheaper than fresh (non-cached) input. Rates per 1M tokens (USD):

| Model | Input | Cached input | Output |
| --- | --- | --- | --- |
| deepseek-v4-flash | $0.16 | $0.004 | $0.30 |
| deepseek-v4-pro | $0.60 | $0.005 | $1.20 |

Each request reserves an estimated maximum from your balance, then settles to the actual metered cost when it completes. A request is rejected up front with 402 if your balance cannot cover the estimate. See the pricing page for plans and the latest rates, which may change with notice.
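
To make the metering arithmetic concrete, the sketch below prices a usage object with the rates listed above. The rates are copied from this page and may change. The exact reservation formula is not documented here, so `worst_case_cost` is only a hypothetical upper bound (all input billed as fresh, full max_tokens generated), not the gateway's actual estimate.

```python
# USD per 1M tokens, copied from the rate table above (subject to change).
RATES = {
    "deepseek-v4-flash": {"input": 0.16, "cached": 0.004, "output": 0.30},
    "deepseek-v4-pro": {"input": 0.60, "cached": 0.005, "output": 1.20},
}


def request_cost(model, usage):
    """Cost in USD of one completed request, from its usage object."""
    rate = RATES[model]
    hit = usage.get("prompt_cache_hit_tokens", 0)
    miss = usage.get("prompt_cache_miss_tokens")
    if miss is None:
        miss = usage["prompt_tokens"] - hit  # fall back to the total
    out = usage["completion_tokens"]
    total = miss * rate["input"] + hit * rate["cached"] + out * rate["output"]
    return total / 1_000_000


def worst_case_cost(model, prompt_tokens, max_tokens):
    """Hypothetical upper bound: no cache hits, output runs to max_tokens."""
    rate = RATES[model]
    return (prompt_tokens * rate["input"] + max_tokens * rate["output"]) / 1_000_000


# Usage object from the Response example earlier in this document.
example_usage = {
    "prompt_tokens": 12,
    "completion_tokens": 7,
    "total_tokens": 19,
    "prompt_cache_hit_tokens": 0,
    "prompt_cache_miss_tokens": 12,
}
print(f"{request_cost('deepseek-v4-pro', example_usage):.8f} USD")
```

For that example, 12 fresh input tokens at $0.60 and 7 output tokens at $1.20 per 1M tokens come to (12 × 0.60 + 7 × 1.20) / 1,000,000 = $0.0000156.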

Rate limits & tiers

Per-request context and output caps, concurrency, and any daily token cap depend on your plan. A request whose input exceeds the context cap is rejected before it reaches the provider (413 context_length_exceeded), and max_tokens is clamped to your output cap.

| Plan | Context | Max output | Concurrency | Daily tokens |
| --- | --- | --- | --- | --- |
| Basic (free) | 128K | 32K | 10 | 100M |
| Pro | 256K | 32K | 20 | Unlimited |
| Max | 1M | 64K | 20 | Unlimited |
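
A client can mirror these caps to catch oversized requests before sending them. The sketch below is illustrative only: it assumes "K" means 1,000 tokens and "M" means 1,000,000 (the table does not say whether K is 1,000 or 1,024), and it expects the caller to supply an input token count, which may differ from the provider's own tokenizer.

```python
# Caps copied from the plan table above.
# Assumption: K = 1,000 and M = 1,000,000 tokens.
PLAN_CAPS = {
    "basic": {"context": 128_000, "max_output": 32_000},
    "pro": {"context": 256_000, "max_output": 32_000},
    "max": {"context": 1_000_000, "max_output": 64_000},
}


def fits_context(plan, input_tokens):
    """True if the input is within the plan's context cap (cap itself allowed)."""
    return input_tokens <= PLAN_CAPS[plan]["context"]


def clamp_max_tokens(plan, requested):
    """Mirror the gateway's clamping of max_tokens to the plan's output cap."""
    return min(requested, PLAN_CAPS[plan]["max_output"])
```

Checking locally avoids a round trip that would end in 413, but the gateway's own count remains authoritative.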

When you send too many requests, or reach your plan's concurrency limit or daily token cap, you receive 429 with a Retry-After header: wait that many seconds, then retry with backoff.
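
One possible shape for that retry loop is sketched below. The base delay, cap, jitter range, and attempt count are arbitrary illustrative choices, and `send` is a hypothetical callable returning `(status, headers, body)` with a `Retry-After` key as shown; adapt it to whatever HTTP client you use.

```python
import random
import time


def retry_delay(attempt, retry_after=None, base=0.5, cap=30.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based).

    Exponential backoff with jitter, never shorter than Retry-After
    (interpreted as a number of seconds) when the server sent one.
    """
    backoff = min(cap, base * (2 ** attempt))
    delay = backoff * (0.5 + rng() / 2)  # jitter: 50% to 100% of backoff
    if retry_after is not None:
        delay = max(delay, float(retry_after))
    return delay


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it succeeds, fails permanently, or attempts run out.

    `send` is a hypothetical callable returning (status, headers, body).
    Only 429 and 5xx responses are retried.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        retryable = status == 429 or 500 <= status < 600
        if not retryable or attempt == max_attempts - 1:
            return status, headers, body
        sleep(retry_delay(attempt, headers.get("Retry-After")))
```

Passing `sleep` and `rng` as parameters keeps the loop easy to test without real waiting.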

Errors

Errors use the OpenAI shape, so existing error handling works unchanged: { "error": { "message", "type", "param", "code" } }.

json
{
  "error": {
    "message": "Insufficient balance. Top up to continue.",
    "type": "insufficient_quota",
    "param": null,
    "code": "insufficient_funds"
  }
}

| Status | code | Meaning |
| --- | --- | --- |
| 400 | model_not_found | Malformed request, or an unknown model ID. |
| 401 | invalid_api_key | Missing or invalid API key. |
| 402 | insufficient_funds | Balance cannot cover the request. Top up to continue. |
| 403 | model_not_available | The model is not included in your plan (e.g. Pro model on Basic). |
| 413 | context_length_exceeded | Input exceeds your plan's context cap. |
| 429 | rate_limit_error | Too many requests / concurrency or daily cap reached. Honour Retry-After. |
| 5xx | api_error | Upstream/provider error. Safe to retry with backoff. |
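
The table above can be turned into a small dispatch step in client code. The sketch below parses the error body defensively and maps the HTTP status to a suggested next step; the action labels and the name `classify_error` are invented for illustration and are not part of the API.

```python
import json

# Suggested client reaction per status, following the table above.
# The action labels are illustrative, not defined by the API.
ACTIONS = {
    400: "fix_request",
    401: "check_key",
    402: "top_up",
    403: "change_model_or_plan",
    413: "shorten_input",
    429: "retry",
}


def classify_error(status, body_text):
    """Return the error code, message, and a suggested action."""
    try:
        parsed = json.loads(body_text)
    except (ValueError, TypeError):
        parsed = None  # body was not JSON (for example an HTML error page)
    err = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(err, dict):
        err = {}
    if 500 <= status < 600:
        action = "retry"  # upstream errors are described as safe to retry
    else:
        action = ACTIONS.get(status, "fail")
    return {
        "status": status,
        "code": err.get("code"),
        "message": err.get("message"),
        "action": action,
    }
```

Keying the decision on the status rather than the message text keeps it stable if error wording changes.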

Best practices

  • Keep keys server-side and rotate them periodically. Revoke immediately if one leaks.
  • Retry on 429 and 5xx with exponential backoff, and honour the Retry-After header when present.
  • Reuse system prompts and prefixes to benefit from cached-input pricing, which at the listed rates is 40× cheaper than fresh input on deepseek-v4-flash and 120× cheaper on deepseek-v4-pro.
  • Set max_tokens deliberately so reservations against your balance stay tight and predictable.
  • Stream long responses for better perceived latency, and request include_usage if you need token counts.

Need a key? Create an account, top up a few dollars, and make your first call in minutes.