NewsCatcher CatchAll

Authentication Type: API Key

Description: Transform natural language questions into structured, validated records extracted from web sources. Submit queries, track processing, and set up recurring monitors with webhook notifications.


Authentication

To authenticate, you'll need a CatchAll API key issued by NewsCatcher.

Note: The CatchAll API uses a separate API key from the standard NewsCatcher v3 API. Use the x-api-key header for authentication.
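
As a minimal sketch of the note above, the key can be attached to each request as an x-api-key header. The helper name build_headers, the CATCHALL_API_KEY environment variable, and the Content-Type header are assumptions made for illustration; only the x-api-key header name comes from this page.

```python
import os


def build_headers(api_key: str) -> dict:
    """Return HTTP headers carrying the CatchAll API key.

    Only the x-api-key header name comes from the documentation above;
    the Content-Type header is an assumption for JSON request bodies.
    """
    if not api_key:
        raise ValueError("A CatchAll API key is required")
    return {
        "x-api-key": api_key,
        "Content-Type": "application/json",  # assumption: JSON payloads
    }


# Hypothetical environment variable name; "demo-key" is only a placeholder
# so the sketch runs without configuration.
headers = build_headers(os.environ.get("CATCHALL_API_KEY") or "demo-key")
```

Keeping the key in the environment rather than in source code avoids committing it by accident. The standard NewsCatcher v3 key will not work here, so it is worth storing the two keys under distinct names.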


Jobs

Create and manage processing jobs that extract structured data from web sources.

Create Job

Submit a natural language query to create a new processing job. Jobs typically complete in 10-15 minutes. Use getJobStatus to poll for completion.

Operation Type: Mutation (Write)

Parameters:

  • query string (required): Natural language question describing what to find. More specific queries produce more focused results.
  • schema string (nullable): Template string to guide record formatting. Use placeholders like [COMPANY], [REVENUE], [AMOUNT].
  • context string (nullable): Additional context to focus on specific aspects of your query.

Returns:

  • job_id string: Unique job identifier. Use this to check status and retrieve results.

Example Usage:

{
  "query": "Find all Series A funding rounds announced in the last week for AI startups",
  "schema": "[COMPANY] raised [AMOUNT] in Series A funding led by [INVESTOR]",
  "context": "Focus on artificial intelligence and machine learning companies"
}
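
The parameters above can be assembled programmatically. The sketch below is illustrative only: build_create_job_payload and schema_placeholders are hypothetical helper names, and it assumes that nullable fields may simply be omitted and that placeholders are written as uppercase names in square brackets, as in the example.

```python
import json
import re


def build_create_job_payload(query, schema=None, context=None):
    """Build the Create Job parameters, leaving out unset optional fields."""
    if not query or not query.strip():
        raise ValueError("query is required")
    payload = {"query": query.strip()}
    if schema is not None:
        payload["schema"] = schema
    if context is not None:
        payload["context"] = context
    return payload


def schema_placeholders(schema):
    """List the [PLACEHOLDER] names used in a schema template, in order.

    Assumes placeholders are uppercase letters or underscores in brackets.
    """
    return re.findall(r"\[([A-Z_]+)\]", schema)


payload = build_create_job_payload(
    query="Find all Series A funding rounds announced in the last week for AI startups",
    schema="[COMPANY] raised [AMOUNT] in Series A funding led by [INVESTOR]",
)
print(json.dumps(payload, indent=2))
print(schema_placeholders(payload["schema"]))  # ['COMPANY', 'AMOUNT', 'INVESTOR']
```

Listing the placeholders before submitting a job is a cheap way to catch typos in a template, since a job may take 10-15 minutes to complete.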

Get Job Status

Check the current processing status of a job. Poll every 30-60 seconds until status is completed or failed.

Operation Type: Query (Read)

Parameters:

  • job_id string (required): Job ID returned from createJob

Returns:

  • job_id string: Job identifier
  • status string (nullable): Current processing status
  • steps array of objects: Detailed progress tracking for each stage
    • status string: Step status (submitted, analyzing, fetching, clustering, enriching, completed, failed)
    • order number: Sequential position (1-7)
    • completed boolean: Whether step has finished

Example Usage:

{
  "job_id": "job_abc123xyz"
}
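
The polling advice above can be sketched as a small loop. This is not an official client: wait_for_job and current_step are hypothetical names, get_status stands in for whatever function you use to call Get Job Status, and treating the first unfinished step as the current one is an assumption about how the steps array should be read.

```python
import time
from typing import Callable, Optional


def wait_for_job(
    job_id: str,
    get_status: Callable[[str], dict],
    interval_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll get_status(job_id) until the job reports completed or failed."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = get_status(job_id)
        # status is nullable, so a missing value just means "keep waiting".
        if response.get("status") in ("completed", "failed"):
            return response
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} did not finish in {timeout_seconds} seconds")
        sleep(interval_seconds)


def current_step(steps: list) -> Optional[str]:
    """Return the status name of the first unfinished step, or None if all are done."""
    for step in sorted(steps, key=lambda s: s["order"]):
        if not step["completed"]:
            return step["status"]
    return None
```

Passing get_status and sleep as arguments keeps the loop independent of any HTTP library and makes it easy to test without waiting. The default interval matches the lower end of the 30-60 second range recommended above; the one-hour timeout is an arbitrary choice.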

Get Job Results

Retrieve the final results of a completed job, including the extracted records and their citations.

Operation Type: Query (Read)

Parameters:

  • job_id string (required): Job ID returned from createJob
  • page number (default: 1): Page number
  • page_size number (default: 100): Records per page (max: 1000)

Returns:

  • job_id string: Job identifier
  • query string (nullable): Original natural language query
  • context string (nullable): Context provided with query
  • validators array of strings: Validation criteria applied to filter results
  • enrichments array of strings: Extracted field names in the enrichment object
  • status string: Job status
  • duration string (nullable): Total processing time
  • candidate_records number (nullable): Number of candidate records found before validation
  • valid_records number (nullable): Number of records that passed validation
  • page number: Current page
  • total_pages number: Total pages available
  • page_size number: Records per page
  • all_records array of objects: Extracted records
    • record_id string: Unique record identifier
    • record_title string: Short title summarizing the record
    • enrichment object: Structured data extracted from articles (dynamic schema based on query)
    • citations array of objects: Source articles used to extract this record
      • title string: Article title
      • link string: URL to source article
      • published_date string: Publication date (ISO 8601)

Example Usage:

{
  "job_id": "job_abc123xyz",
  "page": 1,
  "page_size": 50
}
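
Because results are paginated, collecting every record means walking from page 1 to total_pages. The following sketch assumes a caller-supplied get_results function (hypothetical) that performs the Get Job Results call and returns the response as a dictionary; iter_job_records is likewise an illustrative name.

```python
from typing import Callable, Iterator


def iter_job_records(
    job_id: str,
    get_results: Callable[[str, int, int], dict],
    page_size: int = 100,
) -> Iterator[dict]:
    """Yield every record of a job by walking pages 1..total_pages."""
    # The documented maximum page size is 1000.
    if not 1 <= page_size <= 1000:
        raise ValueError("page_size must be between 1 and 1000")
    page = 1
    while True:
        response = get_results(job_id, page, page_size)
        yield from response.get("all_records", [])
        # Stop once the last page reported by the API has been read.
        if page >= response.get("total_pages", 1):
            break
        page += 1
```

A generator keeps memory use flat for large jobs, and each record's citations can then be read from its citations list, for example `[c["link"] for c in record["citations"]]`.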

Monitors

Create recurring scheduled jobs with optional webhook notifications.

Create Monitor

Create a monitor that reruns a successfully completed job on a schedule. Optionally receive a webhook notification each time a run completes.

Operation Type: Mutation (Write)

Parameters:

  • reference_job_id string (required): Job ID to use as template for scheduled runs
  • schedule string (required): Natural language schedule (e.g., "every day at 12 PM UTC", "every 48 hours"). Minimum 24-hour interval.
  • webhook object (nullable): Optional webhook for completion notifications
    • url string (required): Webhook endpoint URL
    • method string (nullable): HTTP method (POST or PUT, default: POST)
    • headers object (nullable): Custom HTTP headers
    • params object (nullable): Query string parameters

Returns:

  • status string: Response status
  • monitor_id string: Unique monitor identifier

Example Usage:

{
  "reference_job_id": "job_abc123xyz",
  "schedule": "every day at 9 AM UTC",
  "webhook": {
    "url": "https://api.example.com/webhook/newscatcher",
    "method": "POST",
    "headers": {
      "Authorization": "Bearer token123"
    }
  }
}
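
A monitor request like the one above can be built with a small helper that checks the documented constraints before sending. This is a sketch under assumptions: build_create_monitor_payload is a hypothetical name, optional fields are omitted when unset, and the natural language schedule is passed through unchecked because its grammar is not specified here.

```python
from typing import Optional

ALLOWED_WEBHOOK_METHODS = ("POST", "PUT")


def build_create_monitor_payload(
    reference_job_id: str,
    schedule: str,
    webhook_url: Optional[str] = None,
    webhook_method: str = "POST",
    webhook_headers: Optional[dict] = None,
    webhook_params: Optional[dict] = None,
) -> dict:
    """Build Create Monitor parameters, adding a webhook only when a URL is given."""
    if not reference_job_id:
        raise ValueError("reference_job_id is required")
    if not schedule:
        raise ValueError("schedule is required")
    payload = {"reference_job_id": reference_job_id, "schedule": schedule}
    if webhook_url is not None:
        method = webhook_method.upper()
        if method not in ALLOWED_WEBHOOK_METHODS:
            raise ValueError("webhook method must be POST or PUT")
        webhook = {"url": webhook_url, "method": method}
        if webhook_headers:
            webhook["headers"] = dict(webhook_headers)
        if webhook_params:
            webhook["params"] = dict(webhook_params)
        payload["webhook"] = webhook
    return payload


payload = build_create_monitor_payload(
    "job_abc123xyz",
    "every day at 9 AM UTC",
    webhook_url="https://api.example.com/webhook/newscatcher",
    webhook_headers={"Authorization": "Bearer token123"},
)
```

With these arguments the helper reproduces the example request shown above.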

List Monitors

List all monitors with their schedules and status.

Operation Type: Query (Read)

Parameters:

None required.

Returns:

  • monitors array of objects: List of monitors
    • monitor_id string: Monitor identifier
    • reference_job_id string: Template job ID
    • schedule string: Schedule description
    • is_active boolean: Whether monitor is active
    • created_at string: Creation timestamp

Example Usage:

{}
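
The is_active flag in this response indicates which monitors are currently disabled and could be passed to Enable Monitor. A minimal sketch, assuming the response has already been parsed into a dictionary; inactive_monitor_ids is a hypothetical helper and the sample data is made up for illustration.

```python
def inactive_monitor_ids(list_response: dict) -> list:
    """Return the IDs of monitors whose is_active flag is false."""
    return [
        monitor["monitor_id"]
        for monitor in list_response.get("monitors", [])
        if not monitor.get("is_active", False)
    ]


# Made-up sample shaped like the Returns section above.
sample = {
    "monitors": [
        {"monitor_id": "mon_xyz789abc", "is_active": True},
        {"monitor_id": "mon_def456ghi", "is_active": False},
    ]
}
print(inactive_monitor_ids(sample))  # ['mon_def456ghi']
```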

Get Monitor Results

Retrieve aggregated results from monitor runs.

Operation Type: Query (Read)

Parameters:

  • monitor_id string (required): Monitor ID
  • page number (default: 1): Page number
  • page_size number (default: 100): Records per page

Returns:

  • monitor_id string: Monitor identifier
  • all_records array of objects: Aggregated records (same schema as Get Job Results)
  • page number: Current page
  • total_pages number: Total pages
  • page_size number: Records per page

Example Usage:

{
  "monitor_id": "mon_xyz789abc",
  "page": 1,
  "page_size": 100
}

Enable Monitor

Re-enable a disabled monitor to resume scheduled runs.

Operation Type: Mutation (Write)

Parameters:

  • monitor_id string (required): Monitor ID to enable

Returns:

  • status string: Response status
  • message string: Status message

Example Usage:

{
  "monitor_id": "mon_xyz789abc"
}

Disable Monitor

Disable a monitor to stop scheduled runs.

Operation Type: Mutation (Write)

Parameters:

  • monitor_id string (required): Monitor ID to disable

Returns:

  • status string: Response status
  • message string: Status message

Example Usage:

{
  "monitor_id": "mon_xyz789abc"
}

For real-time news search, headlines, and breaking news clustering, see NewsCatcher.


Common Use Cases

Data Extraction:

  • Extract structured company funding data from news articles automatically
  • Build datasets of product launches, acquisitions, or partnership announcements
  • Create validated records with citations for compliance and audit trails

Competitive Intelligence:

  • Monitor competitor activities with scheduled jobs running daily or weekly
  • Extract executive moves, hiring patterns, and strategic initiatives
  • Track industry trends with natural language queries about specific sectors

Research Automation:

  • Automate data collection for market research with recurring monitors
  • Extract financial metrics, pricing changes, or regulatory updates
  • Build time-series datasets by scheduling monitors to run periodically

Alert Systems:

  • Set up webhook notifications for critical business events
  • Receive structured data when monitors detect relevant news
  • Integrate with internal systems through webhook callbacks