Returns a list of skills found in a document along with their trace information, including the contextual classification data in the document that resulted in each skill match and, optionally, the normalized text from the document used to extract skills.
Supported document types and their expected Content-Type:
- JSON – application/json. Document must be UTF-8 encoded.
- Plain text – text/plain
- PDF – application/pdf
- Word (docx) – application/vnd.openxmlformats-officedocument.wordprocessingml.document
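As a minimal sketch of choosing the Content-Type header for an upload, the helper below maps a file extension to one of the four values listed above. The extension-to-type mapping and the function name content_type_for are assumptions made for illustration; only the Content-Type strings themselves come from this page.

```python
from pathlib import Path

# Content-Type values from the list above, keyed by file extension.
# The choice of extensions is an illustrative assumption.
CONTENT_TYPES = {
    ".json": "application/json",
    ".txt": "text/plain",
    ".pdf": "application/pdf",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
}


def content_type_for(filename: str) -> str:
    """Return the Content-Type value for a supported document type."""
    suffix = Path(filename).suffix.lower()
    try:
        return CONTENT_TYPES[suffix]
    except KeyError:
        # Anything else is not in the supported list, so fail early.
        raise ValueError(f"unsupported document type: {filename!r}") from None
```

Failing before the request is sent keeps unsupported uploads from counting against the request quota.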
For the most accurate mapping of sourceStart and sourceEnd byte offsets on surface forms and context forms, be sure to request includeNormalizedText as true. Byte offsets are guaranteed to match the text returned in the normalizedText field.
Note that these are byte offsets: be sure the language you are parsing this text in represents the returned string's characters as 8-bit bytes for proper source offset referencing.
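The difference between byte offsets and character offsets can be sketched in Python, whose strings are indexed by code point rather than by byte. The example assumes the offsets count UTF-8 bytes and that sourceEnd is exclusive; neither is stated on this page, and the sample text and offsets are made up for illustration.

```python
def slice_by_byte_offsets(normalized_text: str, source_start: int, source_end: int) -> str:
    """Extract a span using byte offsets rather than character offsets.

    Assumes UTF-8 byte offsets with an exclusive end (an assumption).
    """
    raw = normalized_text.encode("utf-8")  # work on bytes, not code points
    return raw[source_start:source_end].decode("utf-8")


# Hypothetical normalized text: "é" occupies two bytes in UTF-8,
# so byte offsets after it are shifted by one relative to character indices.
text = "Café manager with SQL skills"
span = slice_by_byte_offsets(text, 19, 22)  # byte offsets of "SQL"
naive = text[19:22]                         # character indexing drifts
```

Here span is "SQL", while the naive character slice yields "QL ", which is the kind of drift the note about 8-bit bytes warns against.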
Request document size is limited to 10 MB; text parsed from the document is limited to 50 KB.
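A client-side check against these limits might look like the sketch below. It reads "MB" and "KB" as decimal units, which is the stricter interpretation and therefore safe under either reading; that interpretation, the function name size_problems, and the idea of comparing a plain-text body directly against the parsed-text limit are all assumptions.

```python
MAX_DOCUMENT_BYTES = 10_000_000   # 10 MB, read conservatively as decimal
MAX_PARSED_TEXT_BYTES = 50_000    # 50 KB, read conservatively as decimal


def size_problems(document: bytes, is_plain_text: bool = False) -> list:
    """Return human-readable warnings for a document that may exceed the limits."""
    problems = []
    if len(document) > MAX_DOCUMENT_BYTES:
        problems.append("document exceeds the 10 MB request limit")
    # For plain text, the parsed text is roughly the document itself,
    # so the smaller limit can be checked before sending.
    if is_plain_text and len(document) > MAX_PARSED_TEXT_BYTES:
        problems.append("text exceeds the 50 KB parsed-text limit")
    return problems
```

For PDF and Word documents the parsed text size is not known until the service extracts it, so only the request-size check applies up front.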
Note that this endpoint has a free tier monthly quota of 50 requests. Contact us if you'd like this increased or made unlimited. Responses from this endpoint will include two headers, RateLimit-Remaining and RateLimit-Reset, which indicate how many requests you have remaining in your current quota period and when that quota will reset, respectively.
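Reading the two quota headers from a response could be sketched as follows. The helper takes any mapping of header names to values, so it is independent of the HTTP client used. It assumes RateLimit-Remaining is an integer and leaves RateLimit-Reset as the raw string, because its format is not specified on this page; the names Quota and read_quota are hypothetical.

```python
from typing import Mapping, NamedTuple, Optional


class Quota(NamedTuple):
    remaining: Optional[int]  # requests left in the current quota period
    reset: Optional[str]      # raw RateLimit-Reset value, format not assumed


def read_quota(headers: Mapping[str, str]) -> Quota:
    """Pull the quota headers out of a response's headers."""
    # Header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    remaining = lowered.get("ratelimit-remaining")
    reset = lowered.get("ratelimit-reset")
    return Quota(int(remaining) if remaining is not None else None, reset)
```

A caller might check quota.remaining before queuing further documents and pause until the reset once it reaches zero.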