Extracting Structured Data
Request
Let’s run a sample extraction with the Documind open-source package.
Parameters
Currently, only URLs are accepted. Ensure your document is hosted and accessible via a public URL.
The file URL.
The schema that defines the structure of the data you want to extract. Read more on how to define a schema.
You can select a template schema that matches your document. [Template options](/guides/templates/overview)
Use autoSchema to auto-generate your schema.
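The parameters above can be sketched as a small request builder. This is only an illustration in Python, not the Documind package's API: the key names `file`, `schema`, and `template`, the helper `build_extraction_request`, the schema field layout, and the rule that exactly one schema source is supplied are all assumptions made for the example. Only `autoSchema` and the URL-only restriction come from the text.

```python
from urllib.parse import urlparse


def build_extraction_request(file_url, schema=None, template=None, auto_schema=False):
    """Assemble extraction options. All key names here are hypothetical."""
    parsed = urlparse(file_url)
    # Only hosted documents are accepted, so reject local paths and other schemes.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"file must be a public http(s) URL, got: {file_url!r}")

    # Assumption: exactly one schema source is given per request.
    chosen = [schema is not None, template is not None, bool(auto_schema)]
    if sum(chosen) != 1:
        raise ValueError("provide exactly one of schema, template, or auto_schema")

    request = {"file": file_url}
    if schema is not None:
        request["schema"] = schema
    elif template is not None:
        request["template"] = template
    else:
        request["autoSchema"] = True
    return request


# Illustrative schema; the field layout is an assumption, not the documented format.
invoice_schema = [
    {"name": "invoiceNumber", "type": "string", "description": "Invoice identifier"},
    {"name": "total", "type": "number", "description": "Total amount due"},
]

request = build_extraction_request("https://example.com/invoice.pdf", schema=invoice_schema)
```

Validating the URL before sending the request surfaces the "URLs only" restriction early, instead of waiting for the extraction itself to fail.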
Example Output
Once the extraction is complete, it returns a structured JSON object containing the extracted data:
Indicates whether the extraction succeeded.
The number of pages processed in the document.
The extracted data based on the schema.
The name of the processed file.
The Markdown content of the file.
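As a rough sketch of how the five fields described above might be handled in Python, the example below maps a result object onto a typed record. The JSON key names (`success`, `pages`, `data`, `fileName`, `markdown`) and the sample values are assumptions for illustration. Check the actual response before relying on them.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class ExtractionResult:
    success: bool          # whether the extraction succeeded
    pages: int             # number of pages processed
    data: Dict[str, Any]   # extracted data, shaped by the schema
    file_name: str         # name of the processed file
    markdown: str          # Markdown content of the file


def parse_result(payload: Dict[str, Any]) -> ExtractionResult:
    # Key names below are assumed; a missing key raises KeyError.
    return ExtractionResult(
        success=bool(payload["success"]),
        pages=int(payload["pages"]),
        data=dict(payload["data"]),
        file_name=str(payload["fileName"]),
        markdown=str(payload["markdown"]),
    )


# Made-up payload, used only to exercise the parser.
sample_payload = {
    "success": True,
    "pages": 1,
    "data": {"invoiceNumber": "INV-001", "total": 120.5},
    "fileName": "invoice.pdf",
    "markdown": "# Invoice INV-001",
}

result = parse_result(sample_payload)
```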
To extract data from documents on the dashboard:
- Log in to your account and navigate to the “Extract” page from the side menu.
- Upload the document you want to process.
- Define your schema using the visual editor, or switch to JSON mode for direct editing.
- Click the “Extract Data” button to start the extraction process.
- Once the extraction is complete, the results will be available in the “Result” tab.
Here’s how to extract data from a document using the API:
Start by calling the endpoint to create an extraction job.
Pass in the file URL along with your schema.
The response will include the job ID, which you can use with the next endpoint to check the status and retrieve the results of your extraction.
For detailed information, visit the API Reference.