API reference

Extract clean, agent-ready context from public URLs.

Sinyx runs a single extraction pipeline and returns agent-ready context, Markdown, text, HTML, full metadata, or semantic chunks, so AI workflows can use public web pages together with source and quality signals.

Authentication

The API is distributed through RapidAPI. Send your RapidAPI key and host headers with each request.

x-rapidapi-host: sinyx.p.rapidapi.com
x-rapidapi-key: YOUR_KEY
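
To keep both headers in one place, you could wrap them in a small helper. This is a minimal sketch: the function name `rapidApiHeaders` and the constant `SINYX_HOST` are illustrative and not part of the API.

```javascript
// Hypothetical helper: builds the headers sent with each Sinyx request.
const SINYX_HOST = "sinyx.p.rapidapi.com";

function rapidApiHeaders(apiKey) {
  // Fail early instead of sending a request without credentials.
  if (typeof apiKey !== "string" || apiKey.trim() === "") {
    throw new Error("Missing RapidAPI key");
  }
  return {
    "Content-Type": "application/json",
    "x-rapidapi-host": SINYX_HOST,
    "x-rapidapi-key": apiKey
  };
}
```

In Node.js you might call it as `rapidApiHeaders(process.env.RAPIDAPI_KEY)` and pass the result as the `headers` option of `fetch`.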

Single extraction

POST https://sinyx.p.rapidapi.com/v1/extract

Field     Type    Required  Description
url       String  Yes       Public HTTP or HTTPS URL to extract.
format    String  No        full, markdown, text, html, chunks, or context.
selector  String  No        Optional CSS selector for targeted extraction.

curl --request POST \
  --url https://sinyx.p.rapidapi.com/v1/extract \
  --header 'Content-Type: application/json' \
  --header 'x-rapidapi-host: sinyx.p.rapidapi.com' \
  --header 'x-rapidapi-key: YOUR_KEY' \
  --data '{
    "url": "https://example.com/article",
    "format": "context"
  }'
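
The request fields above can be checked locally before a call is made. The sketch below is one possible approach, not part of the API: `buildExtractBody` and `FORMATS` are hypothetical names, and the checks only mirror the field descriptions in this section.

```javascript
// Formats listed in the request field table.
const FORMATS = ["full", "markdown", "text", "html", "chunks", "context"];

// Hypothetical helper: validates input locally and returns the JSON body string.
function buildExtractBody({ url, format, selector }) {
  const parsed = new URL(url); // throws TypeError on a malformed URL
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error("url must use HTTP or HTTPS");
  }
  if (format !== undefined && !FORMATS.includes(format)) {
    throw new Error(`Unknown format: ${format}`);
  }
  const body = { url };
  if (format !== undefined) body.format = format; // optional field
  if (selector) body.selector = selector; // optional CSS selector
  return JSON.stringify(body);
}
```

For example, `buildExtractBody({ url: "https://example.com/article", format: "markdown", selector: "article" })` returns a string you can pass as the request body. The `article` selector here is only an illustration.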

Agent quickstart

For an AI agent, request format: "context" and pass the source, freshness, quality, top chunks, and Markdown into your prompt or retrieval step.

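// Request the "context" format for a single page.
const response = await fetch("https://sinyx.p.rapidapi.com/v1/extract", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "x-rapidapi-host": "sinyx.p.rapidapi.com",
    "x-rapidapi-key": process.env.RAPIDAPI_KEY
  },
  body: JSON.stringify({
    url: "https://example.com/article",
    format: "context"
  })
});

// fetch() resolves on HTTP error statuses too, so check before parsing.
if (!response.ok) {
  throw new Error(`Extraction request failed with HTTP ${response.status}`);
}

const page = await response.json();

// Keep only the signals the agent needs.
const agentContext = {
  source: page.citation,
  freshness: page.freshness,
  quality: page.quality,
  topChunks: (page.chunks ?? []).slice(0, 3), // first three chunks in document order
  markdown: page.markdown
};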
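
One way to pass the `agentContext` object into a prompt is to flatten it into a short text preamble. The sketch below assumes the field names shown in the context response example (`title`, `canonicalUrl`, `fetchedAt`, `score`, `headingPath`, `content`). The function name `formatAgentContext` and the output layout are illustrative choices.

```javascript
// Hypothetical helper: turns an agentContext object into a prompt preamble.
function formatAgentContext(ctx) {
  const { source, quality, topChunks } = ctx;
  const lines = [
    `Source: ${source.title} (${source.canonicalUrl})`,
    `Fetched: ${source.fetchedAt}`,
    `Quality score: ${quality.score}`,
    ""
  ];
  // Prefix each chunk with its heading path so the model sees where the text came from.
  for (const chunk of topChunks) {
    lines.push(`[${chunk.headingPath}] ${chunk.content}`);
  }
  return lines.join("\n");
}
```

Keeping the source line first makes it easier for the agent to cite the page in its answer.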

Batch extraction

POST https://sinyx.p.rapidapi.com/v1/extract/batch

Process up to five URLs concurrently in a single request. Failures for individual URLs are returned alongside the successful results.

curl --request POST \
  --url https://sinyx.p.rapidapi.com/v1/extract/batch \
  --header 'Content-Type: application/json' \
  --header 'x-rapidapi-host: sinyx.p.rapidapi.com' \
  --header 'x-rapidapi-key: YOUR_KEY' \
  --data '{
    "urls": ["https://apple.com", "https://github.com"],
    "format": "context"
  }'
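
If you have more than five URLs, you can split them into groups that respect the batch limit before sending. This is a minimal sketch. `toBatches` and `MAX_BATCH_SIZE` are hypothetical names, and the code only groups URLs; it does not call the API.

```javascript
const MAX_BATCH_SIZE = 5; // limit stated for the batch endpoint

// Splits a URL list into consecutive groups of at most `size` items.
function toBatches(urls, size = MAX_BATCH_SIZE) {
  const batches = [];
  for (let i = 0; i < urls.length; i += size) {
    batches.push(urls.slice(i, i + size));
  }
  return batches;
}
```

Each inner array can then be sent as the `urls` value of one batch request.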

Chunks schema

Use format: "chunks" when you want semantic blocks ready for retrieval or vector storage.

Field        Type    Description
chunkIndex   Number  Sequential position in the extracted document.
headingPath  String  Breadcrumb path for the content block.
content      String  Plain text content for the chunk.
priority     Number  Relevance score from 0.0 to 1.0.
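
The `priority` and `chunkIndex` fields can be combined to pick the most relevant chunks while keeping them readable. The sketch below assumes chunks shaped like the table above. `selectChunks`, `limit`, and `minPriority` are hypothetical names, not API parameters.

```javascript
// Returns the `limit` highest-priority chunks at or above `minPriority`,
// restored to document order so the text still reads top to bottom.
function selectChunks(chunks, { limit = 3, minPriority = 0 } = {}) {
  return chunks
    .filter((c) => c.priority >= minPriority) // filter() copies, so the input is not mutated
    .sort((a, b) => b.priority - a.priority || a.chunkIndex - b.chunkIndex)
    .slice(0, limit)
    .sort((a, b) => a.chunkIndex - b.chunkIndex);
}
```

Sorting back by `chunkIndex` is a design choice: it trades strict relevance ordering for context that follows the original page.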

Context format

Use format: "context" when an agent needs source-aware context instead of a single raw representation.

{
  "url": "https://example.com/article",
  "format": "context"
}

Signal     Why it matters
citation   Lets agents cite source title, domain, author, fetched time, and canonical URL.
freshness  Helps agents decide whether a page is current enough for the task.
quality    Scores whether the extracted content is dense, useful, and low-noise.
context    Returns Markdown, chunks, citation, freshness, and quality in one agent-ready response.

{
  "success": true,
  "url": "https://example.com/article",
  "title": "Example Article",
  "markdown": "# Example Article\n\nClean extracted content...",
  "chunks": [
    {
      "chunkIndex": 0,
      "headingPath": "Root > Overview",
      "priority": 0.7,
      "content": "Clean extracted content for agent workflows."
    }
  ],
  "citation": {
    "title": "Example Article",
    "sourceDomain": "example.com",
    "canonicalUrl": "https://example.com/article",
    "fetchedAt": "2026-05-01T12:00:00.000Z"
  },
  "freshness": {
    "fetchedAt": "2026-05-01T12:00:00.000Z",
    "ageDays": 11,
    "isRecent": true
  },
  "quality": {
    "score": 0.88,
    "pageType": "article",
    "hasTables": true
  }
}
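
An agent can use the `freshness` and `quality` signals to decide whether a page is worth passing on. The sketch below assumes the response shape shown in the example above. `isUsable`, `minScore`, and `requireRecent` are hypothetical names, and the 0.6 threshold is an arbitrary illustration, not a documented value.

```javascript
// Hypothetical gate: decides whether a context response should reach the agent.
function isUsable(page, { minScore = 0.6, requireRecent = true } = {}) {
  // Reject missing or unsuccessful responses outright.
  if (!page || page.success !== true) return false;
  // Treat absent signals conservatively: no score counts as 0, no flag as not recent.
  const score = page.quality?.score ?? 0;
  const recent = page.freshness?.isRecent === true;
  return score >= minScore && (!requireRecent || recent);
}
```

Tune the threshold against pages you know to be good or bad for your task before relying on it.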

MCP setup

Use Sinyx from Claude Code, Codex, Cursor, Windsurf, and other MCP clients with the published sinyx-mcp package.