PDF Segmentator

Civil Society, Human Rights

A user needs a PDF segmented, so an AI system segments it and returns an annotated PDF and a JSON file.

The AI system supports users in collecting information they need for different tasks.

A user submits a single PDF through a user interface (UI).

Model 1 segments the PDF; its output feeds Model 2, which classifies the segments. The outputs are a JSON file and an annotated PDF.
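The two-stage flow described above can be sketched as follows. This is a minimal illustration, not the system's actual implementation: the `Segment` fields, the function names, and the stand-in models are all assumptions. Rendering the annotated PDF is left out because it would depend on a PDF library the source does not name.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, List, Tuple


@dataclass
class Segment:
    """One region found in the PDF (field names are assumed for illustration)."""
    page: int
    bbox: Tuple[float, float, float, float]  # assumed order: left, top, right, bottom
    label: str = "unclassified"


def run_pipeline(
    pdf_bytes: bytes,
    segmenter: Callable[[bytes], List[Segment]],
    classifier: Callable[[List[Segment]], List[Segment]],
) -> str:
    """Run Model 1, feed its output to Model 2, and serialize the result as JSON."""
    segments = segmenter(pdf_bytes)   # Model 1: find the segments
    labelled = classifier(segments)   # Model 2: assign a class to each segment
    return json.dumps([asdict(s) for s in labelled])


# Hypothetical stand-ins for the two models, used only to exercise the flow.
def dummy_segmenter(pdf_bytes: bytes) -> List[Segment]:
    return [
        Segment(page=1, bbox=(0.0, 0.0, 100.0, 20.0)),
        Segment(page=1, bbox=(0.0, 30.0, 100.0, 200.0)),
    ]


def dummy_classifier(segments: List[Segment]) -> List[Segment]:
    labels = ["title", "text"]
    return [Segment(s.page, s.bbox, labels[i % 2]) for i, s in enumerate(segments)]


segments_json = run_pipeline(b"%PDF-", dummy_segmenter, dummy_classifier)
```

Passing the models in as callables keeps the sketch independent of any particular model framework; a real service would load the two models once and reuse them across requests.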

The user checks the output. If satisfied with the annotated PDF, they accept it and use the JSON; otherwise, they reject it and do not use the JSON.
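The approval step can be expressed as a simple gate: the JSON is released only when the user accepts the annotated PDF. The sketch below assumes a hypothetical `PipelineOutput` container and a boolean decision coming from the UI; neither name comes from the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PipelineOutput:
    """Both artifacts produced for one PDF (hypothetical container)."""
    annotated_pdf: bytes
    segments_json: str


def release_json(output: PipelineOutput, user_accepted: bool) -> Optional[str]:
    """Return the JSON only if the user accepted the annotated PDF, else None."""
    return output.segments_json if user_accepted else None


output = PipelineOutput(annotated_pdf=b"", segments_json="[]")
accepted = release_json(output, user_accepted=True)    # the JSON string
rejected = release_json(output, user_accepted=False)   # None: the JSON is not used
```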

Human-Approved AI

• Models should be pulled from a specific commit (ad-hoc practice)
• Services that use the AI should have a release version (ad-hoc practice)
• All benchmarks should be saved in a public repository (organization best practice)
• Test sets should be used to assess that performance is maintained (organization best practice)
• Implemented code is open-sourced (organization policy)
• Services should be covered by unit tests, integration tests, and end-to-end tests (organization policy)
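Two of the practices above, pinning a model to a specific commit and checking against a test set that performance is maintained, lend themselves to a short sketch. Everything here is an assumption made for illustration: the `ModelPin` container, the 40-character SHA-1 hash format (Git repositories configured for SHA-256 use 64 characters), accuracy as the metric, and the tolerance value.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class ModelPin:
    """Where a model comes from (hypothetical container)."""
    repository: str
    commit: str


def is_full_commit_hash(commit: str) -> bool:
    """Accept only a 40-character hexadecimal hash, not a branch or tag name."""
    return len(commit) == 40 and all(c in "0123456789abcdef" for c in commit.lower())


def accuracy(
    predict: Callable[[str], str],
    test_set: Sequence[Tuple[str, str]],
) -> float:
    """Fraction of (input, expected label) pairs the model gets right."""
    correct = sum(1 for x, expected in test_set if predict(x) == expected)
    return correct / len(test_set)


def performance_maintained(
    predict: Callable[[str], str],
    test_set: Sequence[Tuple[str, str]],
    baseline: float,
    tolerance: float = 0.01,  # assumed allowance for small fluctuations
) -> bool:
    """True if accuracy has not dropped more than `tolerance` below the baseline."""
    return accuracy(predict, test_set) >= baseline - tolerance


# Toy data and a toy model, used only to exercise the check.
toy_test_set = [("a", "title"), ("b", "text"), ("c", "text"), ("d", "title")]

def always_text(_: str) -> str:
    return "text"
```

A check like `performance_maintained` could run in the service's test suite, so that a change of the pinned commit fails the build when the score on the saved test set regresses.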

The AI system could make mistakes when annotating and segmenting the PDF.