Extract skills with source tracing

Returns a list of skills found in a document along with their trace information. The trace includes the contextual classification data in the document that resulted in each skill match and, optionally, the normalized text from the document that was used to extract skills.

Supported document types and their expected Content-Type:

  • JSON – application/json (document must be UTF-8 encoded)
  • Plain text – text/plain
  • PDF – application/pdf
  • Word (docx) – application/vnd.openxmlformats-officedocument.wordprocessingml.document
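
As a minimal sketch of the mapping above, a client could choose the Content-Type header from the file name before uploading. The file extensions and the helper name content_type_for are assumptions made for illustration; only the four Content-Type values come from the list above.

```python
import os

# Content-Type values from the supported document list.
# The file extensions used as keys are an assumption for illustration.
CONTENT_TYPES = {
    ".json": "application/json",
    ".txt": "text/plain",
    ".pdf": "application/pdf",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
}


def content_type_for(path: str) -> str:
    """Return the Content-Type to send for a document, based on its extension."""
    extension = os.path.splitext(path)[1].lower()
    try:
        return CONTENT_TYPES[extension]
    except KeyError:
        raise ValueError(f"Unsupported document type: {extension or path!r}")
```

Rejecting unknown extensions locally avoids spending a request from the quota on a document the endpoint is not documented to accept.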

For the most accurate mapping of the sourceStart and sourceEnd byte offsets on surface forms and context forms, set includeNormalizedText to true. Byte offsets are guaranteed to match the text returned in the normalizedText field.

Note that these are byte offsets, not character offsets. Make sure the language you parse this text in represents the returned string's characters as 8-bit bytes so that source offsets resolve correctly.
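
In a language whose strings are indexed by character, such as Python, the byte-offset requirement can be handled by encoding the text to bytes before slicing. The sketch below assumes the normalized text is UTF-8 and that sourceEnd is exclusive; neither is stated above, so check both against a real response. The helper name slice_by_byte_offsets is hypothetical.

```python
def slice_by_byte_offsets(normalized_text: str, source_start: int, source_end: int) -> str:
    """Return the span of normalized_text covered by the given byte offsets.

    Assumptions (not confirmed by the endpoint description):
    the text is UTF-8 and source_end is an exclusive offset.
    """
    raw = normalized_text.encode("utf-8")  # work on bytes, not characters
    return raw[source_start:source_end].decode("utf-8")


# "é" takes two bytes in UTF-8, so byte and character offsets diverge after it.
text = "Café Python developer"
print(slice_by_byte_offsets(text, 6, 12))  # slices the encoded bytes
print(text[6:12])                          # character slicing lands one position off
```

Slicing the str directly would shift every span that follows a multi-byte character, which is why the offsets are applied to the encoded bytes.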

Request document size is limited to 10MB, and the text parsed from the document is limited to 50KB.

Note that this endpoint has a free-tier monthly quota of 50 requests. Contact us if you'd like this increased or made unlimited. Responses from this endpoint include two headers, RateLimit-Remaining and RateLimit-Reset, which indicate how many requests remain in your current quota period and when that quota will reset, respectively.
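
A client can read these two headers after each call to track its quota. The sketch below works on any mapping of response headers; the helper name read_quota is hypothetical, and because the format of the RateLimit-Reset value is not described here, it is returned as the raw string rather than parsed.

```python
from typing import Mapping, Optional, Tuple


def read_quota(headers: Mapping[str, str]) -> Tuple[Optional[int], Optional[str]]:
    """Extract (remaining requests, reset value) from response headers.

    Header names are matched case-insensitively. The reset value is left
    unparsed because its format is an unknown in this sketch.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    remaining = lowered.get("ratelimit-remaining")
    reset = lowered.get("ratelimit-reset")
    return (int(remaining) if remaining is not None else None, reset)


# Example with made-up header values:
remaining, reset = read_quota({"RateLimit-Remaining": "49", "RateLimit-Reset": "example-reset-value"})
if remaining is not None and remaining == 0:
    print(f"Quota exhausted; resets at {reset}")
```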
