Developer docs

API documentation

lincorelink speaks the OpenAI wire protocol. Point any OpenAI SDK at our base URL, swap in your key, and you are done — no new client, no rewrites.

Introduction

lincorelink is an OpenAI-compatible API gateway. It authenticates your request, meters your usage, and relays the call to a third-party large-language-model provider — currently DeepSeek V4 — then streams the response back to you. Because the wire format matches the OpenAI Chat Completions API, existing OpenAI SDKs and tools work without changes.

Base URL

text

https://api.lincorelink.ai/v1

Endpoints

Method	Path	Description
`POST`	`/v1/chat/completions`	Create a model response for a chat conversation (streaming or non-streaming).
`GET`	`/v1/models`	List the model IDs available to your account.

Authentication

Authenticate every request with your API key in the Authorization header as a Bearer token. Create and manage keys in your dashboard. Keys look like sk-lcl-… and are shown in full only once, at creation — store them securely, because we keep only a hash and cannot recover a key later.

http

Authorization: Bearer sk-lcl-...

A missing or invalid key returns 401 with code invalid_api_key. Keep keys server-side; never embed them in browser, mobile, or other client-side code. If a key is exposed, revoke it in the dashboard — revocation takes effect immediately.
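One way to follow the advice above is to read the key from the environment at startup and build the header from it. The sketch below assumes the key lives in the LINCORELINK_API_KEY variable used in the cURL example; the helper name auth_headers is hypothetical and not part of any SDK.

```python
import os


def auth_headers(env=os.environ):
    """Build request headers from the LINCORELINK_API_KEY environment variable.

    `auth_headers` is an illustrative helper, not part of lincorelink or the OpenAI SDK.
    """
    key = env.get("LINCORELINK_API_KEY", "").strip()
    if not key:
        # Fail fast locally instead of sending a request that would come back as 401.
        raise RuntimeError("LINCORELINK_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Passing a plain dict as env makes the helper easy to exercise in unit tests without touching the real environment.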

Quickstart

Send your first request with cURL or any OpenAI SDK.

cURL

bash

curl https://api.lincorelink.ai/v1/chat/completions \
  -H "Authorization: Bearer $LINCORELINK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-pro",
    "messages": [{ "role": "user", "content": "Hello!" }]
  }'

Python (OpenAI SDK)

python

import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.lincorelink.ai/v1",
    api_key=os.environ["LINCORELINK_API_KEY"],  # read the key from the environment, not source code
)

resp = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[
        {"role": "system", "content": "You are concise."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(resp.choices[0].message.content)

Node.js (OpenAI SDK)

node

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.lincorelink.ai/v1",
  apiKey: process.env.LINCORELINK_API_KEY,
});

const resp = await client.chat.completions.create({
  model: "deepseek-v4-flash",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(resp.choices[0].message.content);

Chat completions

POST /v1/chat/completions accepts the standard OpenAI Chat Completions body. The most common parameters:

Parameter	Type	Description
`model`	string	Required. One of `deepseek-v4-flash` or `deepseek-v4-pro`.
`messages`	array	Required. A non-empty list of `{ role, content }` messages (`system`, `user`, `assistant`).
`max_tokens`	integer	Maximum tokens to generate. Clamped to your plan's output cap (see Rate limits & tiers).
`stream`	boolean	If `true`, tokens are sent as Server-Sent Events. See Streaming.
`temperature`, `top_p`	number	Sampling controls, passed through to the provider.
`stop`, `frequency_penalty`, `presence_penalty`	various	Standard OpenAI parameters, passed through to the provider.
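As a rough illustration of the table above, a client could validate the two required fields before sending anything. This is a sketch only: build_chat_body, ALLOWED_MODELS, and ALLOWED_ROLES are hypothetical names, and the gateway may validate requests differently.

```python
ALLOWED_MODELS = {"deepseek-v4-flash", "deepseek-v4-pro"}
ALLOWED_ROLES = {"system", "user", "assistant"}


def build_chat_body(model, messages, **options):
    """Return a JSON-serialisable request body for POST /v1/chat/completions.

    Hypothetical client-side helper; it only mirrors the parameter table.
    """
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if not messages:
        raise ValueError("messages must be a non-empty list")
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES or "content" not in message:
            raise ValueError(f"bad message: {message!r}")
    body = {"model": model, "messages": list(messages)}
    # Options left as None are omitted from the body entirely.
    body.update({name: value for name, value in options.items() if value is not None})
    return body
```

For example, build_chat_body("deepseek-v4-flash", [{"role": "user", "content": "Hello!"}], max_tokens=256, temperature=None) yields a body with model, messages, and max_tokens, and no temperature key.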

Response

A non-streaming call returns a chat.completion object. Its usage object splits input tokens into cached (prompt_cache_hit_tokens) and non-cached (prompt_cache_miss_tokens) counts, which are billed at different rates (see Pricing & metering).

json

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1734220000,
  "model": "deepseek-v4-pro",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Hello! How can I help?" },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 7,
    "total_tokens": 19,
    "prompt_cache_hit_tokens": 0,
    "prompt_cache_miss_tokens": 12
  }
}

Streaming

Set stream: true to receive tokens incrementally as Server-Sent Events, exactly like the OpenAI API. Each event is a chat.completion.chunk with a delta; the stream ends with a data: [DONE] line.

To receive token usage at the end of a stream, set stream_options.include_usage = true; a final frame then carries the usage object. If you do not request it, we strip our internal usage frame so the stream stays byte-for-byte like a native OpenAI stream.

python

stream = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[{"role": "user", "content": "Write a haiku about the edge"}],
    stream=True,
    stream_options={"include_usage": True},  # optional: get a final usage frame
)
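for chunk in stream:
    # The final usage frame, if requested, arrives with an empty choices list.
    if chunk.choices:
        # flush=True so partial output appears immediately instead of waiting for a newline
        print(chunk.choices[0].delta.content or "", end="", flush=True)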
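If you are not using an SDK, the stream can be read by hand. The sketch below parses raw SSE lines as described above: it concatenates delta.content, captures the optional usage frame, and stops at data: [DONE]. The function name read_stream is hypothetical, and it assumes the usage frame carries an empty choices list, as it does in OpenAI's API.

```python
import json


def read_stream(lines):
    """Collect the generated text and the optional usage frame from raw SSE lines."""
    text_parts, usage = [], None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        # Assumption: the usage frame has an empty (or missing) choices list.
        for choice in chunk.get("choices") or []:
            text_parts.append(choice.get("delta", {}).get("content") or "")
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(text_parts), usage
```

The function accepts any iterable of decoded text lines, so it works equally with a list in a test or with a line iterator from an HTTP client.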

Models

Model ID	Best for	Notes
`deepseek-v4-flash`	High-volume, latency-sensitive, low-cost workloads	Available on every plan, including Basic.
`deepseek-v4-pro`	Harder reasoning, agents, code	Available on Pro and Max plans.

Fetch the live list any time with GET /v1/models. As more models and providers are added, they become available under the same key and balance.
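Because the available models depend on your plan, a client may want to choose one from the live list rather than hard-coding it. The sketch below assumes GET /v1/models returns the OpenAI list shape ({"object": "list", "data": [{"id": ...}]}), which this page does not show explicitly; model_ids and pick_model are hypothetical helpers.

```python
def model_ids(models_response):
    """Return the sorted model IDs from a parsed GET /v1/models response.

    Assumes the OpenAI list shape: {"object": "list", "data": [{"id": "..."}]}.
    """
    return sorted(item["id"] for item in models_response.get("data", []))


def pick_model(models_response, preferred=("deepseek-v4-pro", "deepseek-v4-flash")):
    """Return the first preferred model that the account can actually use."""
    available = set(model_ids(models_response))
    for name in preferred:
        if name in available:
            return name
    raise LookupError("none of the preferred models is available")
```

On a Basic plan, where only deepseek-v4-flash is listed, pick_model falls back to the flash model instead of triggering a 403.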

Pricing & metering

Billing is prepaid and metered per token. You are charged on the usage the provider reports, with cached input billed far cheaper than fresh (non-cached) input. Rates per 1M tokens (USD):

Model	Input	Cached input	Output
`deepseek-v4-flash`	$0.16	$0.004	$0.30
`deepseek-v4-pro`	$0.60	$0.005	$1.20

Each request reserves an estimated maximum from your balance, then settles to the actual metered cost when it completes. A request is rejected up front with 402 if your balance cannot cover the estimate. See the pricing page for plans and the latest rates, which may change with notice.
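To make the arithmetic concrete, here is a client-side estimate of a request's cost from its usage object, using the rates in the table above. It is a sketch under assumptions: the gateway's own rounding and reservation logic are not documented here, and request_cost and RATES_PER_MILLION are hypothetical names.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the rate table; rates may change with notice.
RATES_PER_MILLION = {
    "deepseek-v4-flash": {"input": Decimal("0.16"), "cached": Decimal("0.004"), "output": Decimal("0.30")},
    "deepseek-v4-pro": {"input": Decimal("0.60"), "cached": Decimal("0.005"), "output": Decimal("1.20")},
}


def request_cost(model, usage):
    """Estimate the USD cost of one request from its usage object."""
    rates = RATES_PER_MILLION[model]
    per_million_total = (
        usage["prompt_cache_miss_tokens"] * rates["input"]    # fresh input
        + usage["prompt_cache_hit_tokens"] * rates["cached"]  # cached input
        + usage["completion_tokens"] * rates["output"]        # generated output
    )
    return per_million_total / Decimal(1_000_000)
```

For the sample response in the Response section (12 non-cached input tokens and 7 output tokens on deepseek-v4-pro), this gives (12 × 0.60 + 7 × 1.20) / 1,000,000 = $0.0000156.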

Rate limits & tiers

Per-request context and output caps, concurrency, and any daily token cap depend on your plan. A request whose input exceeds the context cap is rejected before it reaches the provider (413 context_length_exceeded), and max_tokens is clamped to your output cap.

Plan	Context	Max output	Concurrency	Daily tokens
Basic (free)	128K	32K	10	100M
Pro	256K	32K	20	Unlimited
Max	1M	64K	20	Unlimited
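A client can mirror these caps locally to avoid a round trip that would end in 413. The sketch below is illustrative only: it assumes "K" means 1,000 tokens and "M" means 1,000,000 (the page does not say whether 128K is 128,000 or 131,072), it relies on your own token count for the input, and PLAN_CAPS and effective_max_tokens are hypothetical names.

```python
# Assumption: K = 1,000 and M = 1,000,000; the exact values are not documented here.
PLAN_CAPS = {
    "basic": {"context": 128_000, "max_output": 32_000},
    "pro": {"context": 256_000, "max_output": 32_000},
    "max": {"context": 1_000_000, "max_output": 64_000},
}


def effective_max_tokens(plan, input_tokens, max_tokens):
    """Return max_tokens after clamping to the plan's output cap.

    Raises ValueError when the input alone exceeds the plan's context cap,
    the case the gateway rejects with 413 context_length_exceeded.
    """
    caps = PLAN_CAPS[plan]
    if input_tokens > caps["context"]:
        raise ValueError("input exceeds the plan's context cap")
    return min(max_tokens, caps["max_output"])
```

Treat the result as a hint; the gateway's own count is authoritative.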

When you exceed a limit you receive 429 with a Retry-After header — wait that many seconds, then retry with backoff.
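The retry behaviour described above could be sketched as follows: honour Retry-After when it is a number of seconds, otherwise fall back to exponential backoff with jitter. The send callable, its (status, headers, body) return shape, and the default backoff constants are assumptions made for illustration, not lincorelink recommendations.

```python
import random
import time


def retry_delay(attempt, retry_after=None, base=0.5, cap=30.0):
    """Seconds to wait before retrying; `attempt` is 0 for the first retry."""
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a plain number of seconds; fall back to backoff
    # Exponential backoff with full jitter, capped at `cap` seconds.
    return random.uniform(0, min(cap, base * 2 ** attempt))


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it succeeds, fails permanently, or attempts run out.

    `send` is a hypothetical callable returning (status, headers, body),
    where headers is a dict keyed by "Retry-After" in that exact casing.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        retryable = status == 429 or 500 <= status < 600
        if not retryable or attempt == max_attempts - 1:
            return status, headers, body
        sleep(retry_delay(attempt, headers.get("Retry-After")))
```

Injecting sleep keeps the loop testable without real waiting. The OpenAI SDKs also retry some failures on their own, so check the SDK's max_retries setting before layering this on top.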

Errors

Errors use the OpenAI shape, so existing error handling works unchanged: { "error": { "message", "type", "param", "code" } }.

json

{
  "error": {
    "message": "Insufficient balance. Top up to continue.",
    "type": "insufficient_quota",
    "param": null,
    "code": "insufficient_funds"
  }
}

Status	code	Meaning
`400`	`model_not_found`	Malformed request, or an unknown model ID.
`401`	`invalid_api_key`	Missing or invalid API key.
`402`	`insufficient_funds`	Balance cannot cover the request. Top up to continue.
`403`	`model_not_available`	The model is not included in your plan (e.g. Pro model on Basic).
`413`	`context_length_exceeded`	Input exceeds your plan's context cap.
`429`	`rate_limit_error`	Too many requests, or the concurrency or daily token cap was reached. Honour Retry-After.
`5xx`	`api_error`	Upstream/provider error. Safe to retry with backoff.
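For clients that handle HTTP responses directly, the error shape and status table above can be folded into a small value object. This is a minimal sketch; ApiError and parse_error are hypothetical names, and the retryable rule simply restates the table (429 and 5xx).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiError:
    status: int
    code: str
    error_type: str
    message: str

    @property
    def retryable(self):
        # Per the status table: only 429 and 5xx are worth retrying.
        return self.status == 429 or self.status >= 500


def parse_error(status, body):
    """Build an ApiError from an HTTP status and a parsed JSON error body."""
    err = (body or {}).get("error") or {}
    return ApiError(
        status=status,
        code=err.get("code") or "",
        error_type=err.get("type") or "",
        message=err.get("message") or "",
    )
```

A caller can then branch on fields rather than strings scattered through the code, for example prompting a top-up when code is insufficient_funds and retrying only when retryable is true.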

Best practices

Keep keys server-side and rotate them periodically. Revoke immediately if one leaks.
Retry on 429 and 5xx with exponential backoff, and honour the Retry-After header when present.
Reuse system prompts and prefixes to benefit from cached-input pricing, which at the listed rates is 40× (deepseek-v4-flash) to 120× (deepseek-v4-pro) cheaper than fresh input.
Set max_tokens deliberately so reservations against your balance stay tight and predictable.
Stream long responses for better perceived latency, and request include_usage if you need token counts.

Need a key? Create an account, top up a few dollars, and make your first call in minutes.