Advanced Document Result Response

Our advanced document result response gives you full visibility into Inscribe's individual features. If you need more granular results than our standard response provides to make a decision, you can use this advanced response. This page guides you through the response fields that you can expect from the document result endpoint.

Inscribe's Document object has additional information available through an expanded response when performing a GET request. You can access this expanded response by passing the optional query parameter style=advanced. If you are using our webhooks, please email [email protected] to let us know if you would like to also receive this advanced response in your webhook responses.

curl "https://app.inscribe.ai/api/v2/customers/{customer_id}/documents/{document_id}?style=advanced" \
  -H "Authorization: YOUR_API_KEY"

import inscribe

api = inscribe.Client(api_key='YOUR_API_KEY', user_id='YOUR_USER_ID')

api.retrieve_document_results(customer_id=customer_id, document_id=document_id, style="advanced")
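
If you would rather not depend on the client library, the same request can be assembled with Python's standard library. The following is a minimal sketch: the helper name build_advanced_request is hypothetical, the URL and header are copied from the curl example above, and the request is only constructed here, not sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Base URL taken from the curl example above.
BASE_URL = "https://app.inscribe.ai/api/v2"


def build_advanced_request(customer_id: str, document_id: str, api_key: str) -> Request:
    """Build (but do not send) a GET request for the advanced document result."""
    path = (
        f"/customers/{quote(customer_id, safe='')}"
        f"/documents/{quote(document_id, safe='')}"
    )
    query = urlencode({"style": "advanced"})
    return Request(
        f"{BASE_URL}{path}?{query}",
        headers={"Authorization": api_key},
        method="GET",
    )


request = build_advanced_request("CUSTOMER_ID", "DOCUMENT_ID", "YOUR_API_KEY")
print(request.full_url)
```

Passing the returned object to urllib.request.urlopen would perform the call.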

For information about fields also returned in the standard response, please see our public-facing documentation.
In addition to the standard response, the following fields are returned:

  • page_urls
  • metadata
  • text_by_page
  • fraud_results

Breaking down the advanced document result response

In this guide, we will break down a single example response from the document endpoint with "?style=advanced" as a query parameter. For clarity, the guide explains one section of the response at a time. You will notice that the fraud_results field, for instance, is broken down into many different sections.

Metadata and text_by_page

The metadata and text_by_page fields contain additional information retrieved from the document.

  • metadata - Object - Contains all of the metadata information about the document
  • text_by_page - List - a list of objects
    * page_number - Integer - the page number the object represents
    * text - String - the OCR output of that page
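
As an illustration of how text_by_page might be consumed, the sketch below joins the OCR output of every page in page order. The function name full_text and the sample data are hypothetical; only the page_number and text keys come from the field list above.

```python
def full_text(document: dict) -> str:
    """Join the OCR text of every page, ordered by page_number."""
    pages = sorted(
        document.get("text_by_page", []),
        key=lambda page: page["page_number"],
    )
    return "\n".join(page["text"] for page in pages)


# Illustrative data only, not real API output.
sample_document = {
    "text_by_page": [
        {"page_number": 2, "text": "Ending Balance"},
        {"page_number": 1, "text": "Bank Statement"},
    ]
}
print(full_text(sample_document))
```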

Fraud Results

The results of Inscribe's fraud detection algorithms are detailed in the fraud_results field. The fraud_results field is only present in the response if the document is in the PROCESSED state.


Every detector is a key in the fraud results dictionary and each value of the key contains a type, failed, and details field.

  1. type - The type of fraud result. It can be one of:

    • TRANSFORMATION
    • SEEN_BEFORE
    • X_RAY
    • METADATA
    • DATABASES
  2. failed - A boolean that is true when this detector has "failed", meaning it found a fraudulent signal.

  3. details - The details field is unique to each detector and gives a deeper analysis of the result that we found for this detector.

Note: Depending on your account's configuration a combination of failed fraudulent detectors may be required for the final decision in the is_fraudulent field to indicate that the document is fraudulent.
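
Because every detector shares the type, failed, and details shape, a client can summarize the result generically. The sketch below groups the names of failed detectors by their type. The function name and sample data are hypothetical, and the output is only a summary of signals: as noted above, is_fraudulent remains the final decision.

```python
from collections import defaultdict


def failed_detectors_by_type(fraud_results: dict) -> dict:
    """Map each fraud result type to the names of detectors that failed."""
    grouped = defaultdict(list)
    for name, result in fraud_results.items():
        if result.get("failed"):
            grouped[result.get("type", "UNKNOWN")].append(name)
    return dict(grouped)


# Illustrative data only, shaped like the fraud_results field.
sample_fraud_results = {
    "fonts": {"type": "TRANSFORMATION", "failed": True, "details": {"pages": []}},
    "qpdf": {"type": "METADATA", "failed": False, "details": {}},
    "seen_before": {"type": "SEEN_BEFORE", "failed": True, "details": {"documents": []}},
}
print(failed_detectors_by_type(sample_fraud_results))
```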

A. Transformations

Inscribe classifies five fraud detectors as transformations. The five transformations are listed below; an in-depth analysis of each detector is available at our help center.

  1. Fonts
  2. Masking
  3. Overlaid Text
  4. Edited Text
  5. Text Compression

The details field of the five transformations contains a pages field that contains a list of page results for that detector. An entry is only present for a page if there were results found for the detector on that page. The list is ordered by ascending page number.

An item in the pages field:

  • page_id - String - unique identifier for the page
  • description - String - a brief description of the result found.
  • page_number - Integer - the page number the object represents
  • positions - List - This field is deprecated. The bounding_boxes field in instances now contains this information.
  • instances - List - This list of objects contains all the instances for the page. Each object contains text and the bounding boxes on the page that are associated with this text.
    • bounding_boxes - List - This list of objects contains the x0, x1, top, and bottom positions of the suspicious region on the page. For the fonts detector, a single text instance can span multiple regions.
    • text - String - The text that is found to be suspicious.
    • hidden_text - String - The text hidden by a masking operation (only available for the masking detector).

    Extra field available for the fonts detector:

    • inconsistent_words - List - A list of words with inconsistent fonts found on that page.

    "fraud_results":{
       "fonts":{
          "type":"TRANSFORMATION",
          "failed":true,
          "details":{
             "pages":[
                {
                   "page_id":"27",
                   "description":"1 inconsistent word found",
                   "inconsistent_words":[
    
                   ],
                   "page_number":1,
                   "positions":[
    
                   ],
                   "instances":[
                      {
                         "bounding_boxes":[
                            {
                               "x0":305.551055908203,
                               "x1":315.657440185547,
                               "top":332.974548339844,
                               "bottom":347.029235839844,
                               "color":[
                                  86,
                                  211,
                                  219,
                                  100
                               ]
                            },
                            {
                               "x0":160.502990722656,
                               "x1":305.454040527344,
                               "top":332.823974609375,
                               "bottom":347.372009277344,
                               "color":[
                                  94,
                                  86,
                                  219,
                                  100
                               ]
                            }
                         ],
                         "text":"2018"
                      }
                   ]
                }
             ]
          }
       },
       "masking":{
          "type":"TRANSFORMATION",
          "failed":true,
          "details":{
             "pages":[
                {
                   "page_id":"27",
                   "description":"1 masked region found",
                   "page_number":1,
                   "positions":[
    
                   ],
                   "instances":[
                      {
                         "bounding_boxes":[
                            {
                               "x0":160.502990722656,
                               "x1":315.657440185547,
                               "top":334.823974609375,
                               "bottom":345.372009277344
                            }
                         ],
                         "text":"Deposits",
                         "hidden_text": "Credits"
                      }
                   ]
                }
             ]
          }
       },
       "overlaid_text":{
          "type":"TRANSFORMATION",
          "failed":true,
          "details":{
             "pages":[
                {
                   "page_id":"27",
                   "description":"1 region with overlaid text",
                   "page_number":1,
                   "positions":[
    
                   ],
                   "instances":[
                      {
                         "bounding_boxes":[
                            {
                               "x0":79.5000152587891,
                               "x1":278.951019287109,
                               "top":355.454010009766,
                               "bottom":366.002014160156
                            }
                         ],
                         "text":"Ending Balance on"
                      }
                   ]
                }
             ]
          }
       },
       "edited_text":{
          "type":"TRANSFORMATION",
          "failed":true,
          "details":{
             "pages":[
                {
                   "page_id":"27",
                   "description":"1 suspicious regions",
                   "page_number":1,
                   "positions":[

                   ],
                   "instances":[
                      {
                         "bounding_boxes":[
                            {
                               "x0":160.502990722656,
                               "x1":315.657440185547,
                               "top":334.823974609375,
                               "bottom":345.372009277344
                            }
                         ],
                         "text":"Ending Balance on June 5, 20 18"
                      }
                   ]
                }
             ]
          }
       },
       "text_compression":{
          "type":"TRANSFORMATION",
          "failed":true,
          "details":{
             "pages":[
                {
                   "page_id":"27",
                   "description":"1 suspicious regions",
                   "page_number":1,
                   "positions":[

                   ],
                   "instances":[
                      {
                         "bounding_boxes":[
                            {
                               "x0":160.502990722656,
                               "x1":315.657440185547,
                               "top":334.823974609375,
                               "bottom":345.372009277344
                            }
                         ],
                         "text":""
                      }
                   ]
                }
             ]
          }
       }
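
To show how the pages, instances, and bounding_boxes levels nest, the sketch below flattens every bounding box reported by a TRANSFORMATION detector into one record. The function name and sample data are hypothetical, and computing the height as bottom minus top assumes the coordinate convention seen in the example above, where bottom is larger than top.

```python
def suspicious_regions(fraud_results: dict) -> list:
    """Return one record per bounding box reported by a TRANSFORMATION detector."""
    regions = []
    for name, result in fraud_results.items():
        if result.get("type") != "TRANSFORMATION":
            continue
        for page in result.get("details", {}).get("pages", []):
            for instance in page.get("instances", []):
                for box in instance.get("bounding_boxes", []):
                    regions.append({
                        "detector": name,
                        "page_number": page["page_number"],
                        "text": instance.get("text", ""),
                        "width": box["x1"] - box["x0"],
                        # Assumes bottom > top, as in the example response.
                        "height": box["bottom"] - box["top"],
                    })
    return regions


# Illustrative data only, with round numbers for readability.
sample_transformations = {
    "masking": {
        "type": "TRANSFORMATION",
        "failed": True,
        "details": {"pages": [{
            "page_id": "27",
            "page_number": 1,
            "instances": [{
                "bounding_boxes": [{"x0": 10.0, "x1": 30.0, "top": 5.0, "bottom": 15.0}],
                "text": "Deposits",
                "hidden_text": "Credits",
            }],
        }]},
    },
    "qpdf": {"type": "METADATA", "failed": True, "details": {}},
}
print(suspicious_regions(sample_transformations))
```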
    

B. Date metadata

The document's metadata can be used to find evidence of fraud. The date metadata detector can have one of four possible activations, three of which indicate fraud.

Indicate document is fraudulent:

1. dates_outside_window - Creation date and modification date do not match in the document's metadata.
2. dates_inside_window - Creation date and modification date differ by {specified value} in the document's metadata.
3. no_date - Incomplete date information in the document's metadata.

Indicate document is legitimate:

4. dates_match - Creation and modification dates match in the document's metadata.

The details field of the date metadata detector has the following properties:

• activation - String - indicates which of the four possible activations listed above is present in this document.
• description - String - a brief description of the result found.
• context - Object - gives further information when the activation is dates_outside_window or dates_inside_window.
  • mismatch_length - String - the difference between the two dates, as a Postgres interval value (the example below shows it serialized as an ISO 8601 duration).
  • date_type_1_name - String - the name of the first metadata field used as a reference for the suspicious date information.
  • date_type_2_name - String - the name of the second metadata field used as a reference for the suspicious date information.

    "date_metadata":{
       "type":"METADATA",
       "failed":true,
       "details":{
          "activation":"dates_outside_window",
          "description":"Creation date and modification date do not match",
          "context":{
             "mismatch_length":"P1937DT00H18M31S",
             "date_type_1_name":"CreateDate",
             "date_type_2_name":"ModifyDate"
          }
       }
    },
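
The mismatch_length value in the example can be turned into a Python timedelta for comparisons. This is a sketch under a stated assumption: it only handles the day, hour, minute, and second components seen in the example ("P1937DT00H18M31S") and does not cover years, months, or negative intervals. The function name parse_mismatch_length is hypothetical.

```python
import re
from datetime import timedelta

# Matches durations shaped like the example value, e.g. "P1937DT00H18M31S".
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?"
    r"(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def parse_mismatch_length(value: str) -> timedelta:
    """Convert a day/time duration string into a timedelta."""
    match = _DURATION.match(value)
    if match is None:
        raise ValueError(f"Unsupported duration format: {value!r}")
    parts = {
        unit: float(number)
        for unit, number in match.groupdict().items()
        if number is not None
    }
    return timedelta(**parts)


gap = parse_mismatch_length("P1937DT00H18M31S")
print(gap.days, gap.seconds)
```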
    

C. Metadata

If the data within a document is everything you see when you open it (the words and the images that make up the document), then the metadata is everything else. Inscribe is able to identify when something is suspicious in the metadata. The following suspicious metadata checks are performed on your document. If a detector has found suspicious information in the metadata, the failed field of that detector will be true. The details field contains a description that gives a brief overview of what the detector is indicating.

The following is an overview of our metadata detectors and what each one indicates.

• qpdf - File has been edited with QPDF.
• exif_data_modified - Metadata edited with exiftool. Showing original metadata.
• edited_while_scanned - Metadata indicates file has been edited during scanning process.
• high_user_access - PDF access settings allow editing.
• adobe_fonts - File has been edited with Adobe Pro DC.
• annotation_dates - Document has inconsistent annotation metadata.
• annotations - Document has annotations.
• malformed_date - Document has malformed date in metadata.
• text_layer_text - Text added with Adobe Photoshop.
• touchup_text - Document text has been edited.

    "qpdf":{
       "type":"METADATA",
       "failed":true,
       "details":{
          "description":"File has been edited with QPDF."
       }
    },
    "exif_data_modified":{
       "type":"METADATA",
       "failed":true,
       "details":{
          "description":"Metadata edited with exiftool. Showing original metadata."
       }
    },
    "edited_while_scanned":{
       "type":"METADATA",
       "failed":true,
       "details":{
          "description":"Metadata indicates file has been edited during scanning process."
       }
    },
    "high_user_access":{
       "type":"METADATA",
       "failed":true,
       "details":{
          "description":"PDF access settings allow editing."
       }
    },
    "adobe_fonts":{
       "type":"METADATA",
       "failed":true,
       "details":{
          "description":"File has been edited with Adobe Pro DC."
       }
    },
    "annotation_dates":{
       "type":"METADATA",
       "failed":true,
       "details":{
          "description":"Document has inconsistent annotation metadata."
       }
    },
    "annotations":{
       "type":"METADATA",
       "failed":true,
       "details":{
          "description":"Document has annotations."
       }
    },
    "malformed_date":{
       "type":"METADATA",
       "failed":true,
       "details":{
          "description":"Document has malformed date in metadata."
       }
    },
    "text_layer_text":{
       "type":"METADATA",
       "failed":true,
       "details":{
          "description":"Text added with Adobe Photoshop."
       }
    },
    "touchup_text":{
       "type":"METADATA",
       "failed":true,
       "details":{
          "description":"Document text has been edited."
       }
    },
    
• software_blacklist - Inscribe is able to identify which software tools were used to create or modify a document and maintains a blacklist of suspicious tools.

  • description - String - This field is deprecated. Use the instances field.
  • software - String - This field is deprecated. Use the instances field.
  • instances - List - A list of the suspicious software that was used to create this document.

    "software_blacklist":{
       "type":"METADATA",
       "failed":true,
       "details":{
          "description":"",
          "software":"",
          "instances":[
             "Adobe Acrobat"
          ]
       }
    },
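
The metadata detectors can be reported together, reading description for most detectors and instances for software_blacklist, since its description field is deprecated. The sketch below is one way to do that; the function name and sample data are hypothetical.

```python
def metadata_findings(fraud_results: dict) -> list:
    """Collect a readable line for every failed METADATA detector."""
    findings = []
    for name, result in fraud_results.items():
        if result.get("type") != "METADATA" or not result.get("failed"):
            continue
        details = result.get("details", {})
        if name == "software_blacklist":
            # description and software are deprecated, so read instances.
            for software in details.get("instances", []):
                findings.append(f"{name}: {software}")
        else:
            findings.append(f"{name}: {details.get('description', '')}")
    return findings


# Illustrative data only, shaped like the examples above.
sample_metadata_results = {
    "qpdf": {
        "type": "METADATA",
        "failed": True,
        "details": {"description": "File has been edited with QPDF."},
    },
    "annotations": {
        "type": "METADATA",
        "failed": False,
        "details": {"description": "Document has annotations."},
    },
    "software_blacklist": {
        "type": "METADATA",
        "failed": True,
        "details": {"description": "", "software": "", "instances": ["Adobe Acrobat"]},
    },
}
print(metadata_findings(sample_metadata_results))
```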
    

D. Seen Before

The seen before detector checks whether the document you have uploaded was previously uploaded by you or anyone in your organization. The detector fails if one or more matching documents are found, and all previous instances of that document are listed in the documents field of the details field.

Each entry in the documents field represents a previously uploaded instance of this document.

• id - String - unique identifier for the document
• created_at - String - timestamp of when the document was uploaded to our system. Format: yyyy-MM-dd'T'HH:mm:ssZ
• name - String - name of the document
• urls - Object - contains links to the document
  • web_app - a link to the document in the web app
  • api - a link to the document in the API
• customer - Object - the customer that this document was uploaded for
  • id - String - unique identifier for the customer
  • created_at - String - timestamp of when the customer was created in our system. Format: yyyy-MM-dd'T'HH:mm:ssZ
  • name - String - name of the customer
• creator - Object - the user that previously uploaded this document
  • id - String - unique identifier for the user
  • email - String - the email address of the user

    "seen_before": {
                "type": "SEEN_BEFORE",
                "failed": true,
                "details": {
                    "documents": [
                        {
                            "id": "48",
                            "created_at": "2020-02-21T15:10:32.759Z",
                            "name": "bank_statement_forgery.pdf",
                            "urls": {
                                "web_app": "https://app.inscribe.ai/#/customers/25d1e478-49df-4026-97e2-fbe84e4642d4/documents/48",
                                "api": "https://app.inscribe.ai/api/v1/customers/25d1e478-49df-4026-97e2-fbe84e4642d4/documents/48"
                            },
                            "customer": {
                                "id": "25d1e478-49df-4026-97e2-fbe84e4642d4",
                                "created_at": "2020-02-19T11:14:12.515Z",
                                "name": "John Smith"
                            },
                            "creator": {
                                "id": "1b942784-fbec-4210-8652-9831e41c0600",
                                "email": "[email protected]"
                            }
                        }
                    ]
                }
            },
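
A client may want to list earlier uploads in chronological order. The sketch below parses created_at and sorts the entries. It assumes the timestamps end in "Z" with optional milliseconds, as in the example above; the function names and sample data are hypothetical.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    """Parse a UTC timestamp such as 2020-02-21T15:10:32.759Z."""
    if value.endswith("Z"):
        # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11.
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def previous_uploads(seen_before: dict) -> list:
    """Return earlier uploads of the same document, oldest first."""
    documents = seen_before.get("details", {}).get("documents", [])
    uploads = [
        {
            "document_id": document["id"],
            "customer_name": document["customer"]["name"],
            "uploaded_at": parse_timestamp(document["created_at"]),
        }
        for document in documents
    ]
    return sorted(uploads, key=lambda upload: upload["uploaded_at"])


# Illustrative data only; the second entry is invented for the sort.
sample_seen_before = {
    "type": "SEEN_BEFORE",
    "failed": True,
    "details": {"documents": [
        {"id": "52", "created_at": "2020-03-01T09:00:00Z",
         "customer": {"name": "Jane Doe"}},
        {"id": "48", "created_at": "2020-02-21T15:10:32.759Z",
         "customer": {"name": "John Smith"}},
    ]},
}
for upload in previous_uploads(sample_seen_before):
    print(upload["document_id"], upload["uploaded_at"].isoformat())
```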
    

E. X-Ray

Inscribe attempts to reconstruct previous versions of all submitted PDF documents. Whether or not this is possible depends on whether there are any previous versions to recover, as well as the tool used to produce the PDF and the settings used when saving the PDF. For a more in-depth analysis of this detector, visit our help center.

If a previous version of a submitted PDF document is recovered, the failed field of this detector will be true.

The details field contains further information about the recovered document.

• recovered_version - Object - the complete copy of the earliest version that we could recover.
  • url - String - a temporary URL hosting the complete recovered document. This URL will be accessible for 20 minutes from the time of this request.
• compare_pages - List - information for comparing the recovered version against the uploaded version of the document. The list contains one item for each page where a recovered version of that page was found. Each item contains the recovered and uploaded version for comparison. The list is ordered by ascending page number.
  • page_number - Integer - the page number that this object references
  • urls - Object - URLs of the recovered and uploaded version.
    • recovered_version - String - A temporary URL hosting the recovered page. This URL will be accessible for 20 minutes from the time of this request.
    • uploaded_version - String - A temporary URL hosting the uploaded page. This URL will be accessible for 20 minutes from the time of this request.

    "xray": {
                "type": "X_RAY",
                "failed": true,
                "details": {
                    "recovered_version": {
                        "url": "https://app.inscribe.ai/inscribe-app-development/pdfresurrect_1_49_bank_statement_forgery.pdf?AWSAccessKeyId=admin&Signature=qteOTZdhTzHVwmQLwUABCIkX7Q4%3D&Expires=1583149572"
                    },
                    "compare_pages": [
                        {
                            "page_number": 1,
                            "urls": {
                                "recovered_version": "https://app.inscribe.ai/inscribe-app-development/xray_0_Post%20X-Ray_1_49_bank_statement_forgery.png?AWSAccessKeyId=admin&Signature=NDE4idvJY3%2B1cBEkACQ0Iu8Q3nQ%3D&Expires=1583149572",
                                "uploaded_version": "https://app.inscribe.ai/inscribe-app-development/xray_0_Pre%20X-Ray_1_49_bank_statement_forgery.png?AWSAccessKeyId=admin&Signature=eQtxyQRVY7scV1INKGugutZe%2FdE%3D&Expires=1583149572"
                            }
                        }
                    ]
                }
            }
        }
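
Since the X-Ray URLs are temporary, a client will usually pair them up and download them soon after the request. The sketch below extracts the page pairs and checks the 20 minute lifetime stated above against the time the response was requested. The function names, the sample URLs, and the choice to treat a URL as expired at exactly 20 minutes are assumptions.

```python
from datetime import datetime, timedelta, timezone

# Lifetime of the temporary URLs, per the field descriptions above.
URL_LIFETIME = timedelta(minutes=20)


def comparison_pairs(xray: dict) -> list:
    """Return (page_number, recovered_url, uploaded_url) for each compared page."""
    pages = xray.get("details", {}).get("compare_pages", [])
    return [
        (page["page_number"],
         page["urls"]["recovered_version"],
         page["urls"]["uploaded_version"])
        for page in pages
    ]


def urls_expired(requested_at: datetime, now: datetime) -> bool:
    """True once the lifetime has elapsed since the response was requested."""
    return now - requested_at >= URL_LIFETIME


# Illustrative data only; the URLs are placeholders.
sample_xray = {
    "type": "X_RAY",
    "failed": True,
    "details": {
        "recovered_version": {"url": "https://example.invalid/recovered.pdf"},
        "compare_pages": [{
            "page_number": 1,
            "urls": {
                "recovered_version": "https://example.invalid/recovered_1.png",
                "uploaded_version": "https://example.invalid/uploaded_1.png",
            },
        }],
    },
}
requested_at = datetime(2020, 3, 2, 12, 0, tzinfo=timezone.utc)
print(comparison_pairs(sample_xray))
print(urls_expired(requested_at, requested_at + timedelta(minutes=5)))
```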
    

F. Databases

Inscribe checks for matches against the blacklists, templates, and rejected customer decisions that you have provided in your account. If there is a match against a blacklist, template, or rejected customer, the failed field will be true.

For a blacklist match, the details field contains further information about the blacklist.

An item in the pages field:

• page_id - String - unique identifier for the page
• description - String - This field is deprecated. Use the results in the instances field.
• type - Integer - This field is deprecated. Use the results in the instances field.
• page_number - Integer - the page number the object represents
• instances - List - This list of objects contains all the suspicious blacklist results for the page. Each object contains information about a blacklist match.
  • blacklist - Object - The blacklist entry that you have previously stored.
    • id - Integer - id of the blacklist
    • name - String - the name provided by you for the blacklist entry.
    • phone - String - the phone number provided by you for the blacklist entry.
    • address - String - the address provided by you for the blacklist entry.
  • matched_text - String - The text that is found to match your blacklist.
  • similarity - String - The confidence score between the matched_text and your blacklist entry.
  • type - String - The type of blacklist match. This can be NAME, PHONE, or ADDRESS.

      "blacklists":{
         "type":"DATABASES",
         "failed":true,
         "details":{
            "pages":[
               {
                  "page_id":299279,
                  "description":"",
                  "type":"",
                  "page_number":1,
                  "instances":[
                     {
                        "blacklist":{
                           "id":1,
                           "name":"jane customer",
                           "phone":"8129771132",
                           "address":"123 fake street"
                        },
                        "matched_text":"jane customer",
                        "similarity":1.0,
                        "type":"NAME"
                     }
                  ]
               }
            ]
         }
      },
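
Blacklist matches can be filtered by their similarity score before they are shown to a reviewer. In the sketch below, the function name and the 0.9 threshold are assumptions, and the score is passed through float() because the field list describes it as a String while the example shows a number.

```python
def blacklist_matches(blacklists: dict, min_similarity: float = 0.9) -> list:
    """Return (page_number, type, matched_text, blacklist_id) for strong matches."""
    matches = []
    for page in blacklists.get("details", {}).get("pages", []):
        for instance in page.get("instances", []):
            # float() accepts both "1.0" and 1.0.
            if float(instance["similarity"]) >= min_similarity:
                matches.append((
                    page["page_number"],
                    instance["type"],
                    instance["matched_text"],
                    instance["blacklist"]["id"],
                ))
    return matches


# Illustrative data only; the second instance is invented for the filter.
sample_blacklists = {
    "type": "DATABASES",
    "failed": True,
    "details": {"pages": [{
        "page_id": 299279,
        "page_number": 1,
        "instances": [
            {"blacklist": {"id": 1, "name": "jane customer"},
             "matched_text": "jane customer", "similarity": 1.0, "type": "NAME"},
            {"blacklist": {"id": 2, "name": "john customer"},
             "matched_text": "jon custom", "similarity": "0.5", "type": "NAME"},
        ],
    }]},
}
print(blacklist_matches(sample_blacklists))
```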
      

For a template match, the details field contains further information about the template.

An item in the pages field:

• matched_template_name - String - the name of the template that the document has matched with
• similarity - String - the match expressed as a percentage
• page_number - Integer - the page number the object represents

      "templates":{
         "type":"DATABASES",
         "failed":true,
         "details":{
            "pages":[
               {
                  "matched_template_name":"bank_statement_forgery_1.pdf",
                  "similarity":"100%",
                  "page_number":1
               }
            ]
         }
      }
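
Because the template similarity is a percentage string such as "100%", it needs converting before any numeric comparison. The sketch below does that conversion; the function name and sample data are hypothetical.

```python
def template_matches(templates: dict) -> list:
    """Return (page_number, template_name, similarity_percent) per matched page."""
    results = []
    for page in templates.get("details", {}).get("pages", []):
        # "100%" -> 100.0
        percent = float(str(page["similarity"]).rstrip("%"))
        results.append((page["page_number"], page["matched_template_name"], percent))
    return results


# Illustrative data only, shaped like the example above.
sample_templates = {
    "type": "DATABASES",
    "failed": True,
    "details": {"pages": [{
        "matched_template_name": "bank_statement_forgery_1.pdf",
        "similarity": "100%",
        "page_number": 1,
    }]},
}
print(template_matches(sample_templates))
```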
      

For a rejected customer match, the details field contains further information about the rejected customer. The rejected_details field contains a list of customers you have rejected that match an address on the document.

An item in the rejected_details field:

• customer - Object - the rejected customer that was matched
  • id - String - unique identifier for the customer
  • name - String - name of the customer
• address - String - the normalized address found in the document that matches the rejected customer.

      "rejected_customer":{
         "type":"DATABASES",
         "failed":true,
         "details":{
            "rejected_details":[
               {
                  "customer":{
                     "id":"9e21ef1f-3d9f-475d-b2f7-2d549e64c5c5",
                     "name":"Ill Behaved"
                  },
                  "address":"123 fake street",
                  "urls": {
                     "web_app": "https://app.inscribe.ai/#/customers/25d149df-49df-4027-97e2-fbe84e4642d4",
                     "api": "https://app.inscribe.ai/api/v1/customers/25d149df-49df-4027-97e2-fbe84e4642d4"
                  }
               }
            ]
         }
      }
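
When several rejected customers share an address, grouping the entries by address makes the result easier to review. The sketch below is one way to do that; the function name and the second sample entry are hypothetical.

```python
def rejected_customers_by_address(rejected_customer: dict) -> dict:
    """Map each matched address to the names of the rejected customers found there."""
    by_address = {}
    entries = rejected_customer.get("details", {}).get("rejected_details", [])
    for entry in entries:
        by_address.setdefault(entry["address"], []).append(entry["customer"]["name"])
    return by_address


# Illustrative data only; the second entry is invented for the grouping.
sample_rejected = {
    "type": "DATABASES",
    "failed": True,
    "details": {"rejected_details": [
        {"customer": {"id": "9e21ef1f-3d9f-475d-b2f7-2d549e64c5c5", "name": "Ill Behaved"},
         "address": "123 fake street"},
        {"customer": {"id": "hypothetical-id", "name": "Also Rejected"},
         "address": "123 fake street"},
    ]},
}
print(rejected_customers_by_address(sample_rejected))
```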