Investment Report Extraction Example
Learn how to extract data from an investment report statement.
In this example, we will use the Feathery Document Intelligence feature to extract the following details from an uploaded investment report:
Account Holder Name
Holding name and ticker symbol for each holding of any type owned by the account
We'll create a new extraction named "Investment Report Extraction" and configure it to extract data from our file upload field named "investment-report-file". In this extraction we'll add two queries: one for the name of the account holder and another for extracting the holding details.
For the account holder query, we'll select the "Single Value" frequency because we expect only one occurrence of the account holder's name on the investment report. We'll name the entity of this query "Account Holder Name" and save it to the "account_holder_name" field on our Feathery form.
At this point, we could stop configuring the query and still see accurate results from the extraction. However, for more consistent results, we're going to provide criteria for the entity within the advanced options. The criteria help the query by giving it more context about the entity, along with any constraints or instructions that we may have.
We'll set the criteria to "The name of the person that the statement is regarding". This helps the query by telling it we're looking for a person's name, not a company or organization. It also describes that the account holder's name can be found by looking at who the statement is being addressed to.
Next, we'll add a new query by clicking "Add Query". For the holdings query, we'll select the "Tabular" frequency because we know that the statements contain the holding details within a table. The tabular frequency is optimized to understand and extract data from tables.
Once we select the tabular frequency, we have the option to configure multiple columns to extract data from. We'll start with setting the first column to "Holding Name" and saving it to the "holding_name" hidden field on our Feathery form. This will extract the full name of the holding from the table on the investment report.
Note: The column name you enter does not have to match the column header on the investment report exactly.
As with the account holder name, we will also define criteria for more accurate and consistent results. We'll set the criteria of the Holding Name to: "The full name of the holding whether it be a stock, bond, mutual fund, or other financial instrument. Do not include the ticker symbol in the name."
Breaking down the criteria, we're telling the query three things about the holding name column:
We're looking for the full name of the holding
The holding we're looking for can be of any type
Do not include the holding's ticker symbol in the name
We're including the third instruction because we already know that the holding name is combined with the ticker symbol in the table on the investment report.
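To illustrate what the third instruction asks for, here is a minimal Python sketch of the separation between name and ticker. It assumes a hypothetical cell format of "Name (TICKER)"; real statements vary, and the extraction itself is performed by Feathery, not by code like this. The function name split_holding_cell and the sample values are made up for illustration.

```python
import re
from typing import Optional, Tuple

# Assumed cell format: "<holding name> (<TICKER>)". Illustrative only.
_CELL = re.compile(r"^(?P<name>.*?)\s*\((?P<ticker>[A-Z.]{1,6})\)\s*$")


def split_holding_cell(cell: str) -> Tuple[str, Optional[str]]:
    """Split a combined table cell into (holding name, ticker symbol).

    Returns the ticker as None when the cell has no trailing "(TICKER)".
    """
    text = cell.strip()
    match = _CELL.match(text)
    if match is None:
        return text, None
    return match.group("name"), match.group("ticker")


# Made-up sample cells
print(split_holding_cell("Vanguard 500 Index Fund (VFIAX)"))
print(split_holding_cell("US Treasury Note 2.5%"))
```

A cell with no ticker, such as a bond, keeps its full text as the name, which matches the intent of extracting holdings of any type.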
Next, we'll add another entity for the ticker symbol of each holding by clicking "Add Entity". We'll set the column to "Holding Ticker Symbol" and save it to the "holding_ticker_symbol" hidden field on our Feathery form.
As with our other entities, we will define criteria for more accurate and consistent results. We'll set the criteria of the Holding Ticker Symbol to: "The ticker symbol of the holding. For example 'AAPL'."
A key tactic used in this criteria is giving an example of what a ticker symbol looks like ("AAPL"). This helps the query quickly understand what it's looking for within the investment report.
Now that we have our entities defined for the Holdings Query, we are going to define a "Query Criteria" for the query, found in the Advanced Options. A "Query Criteria" can be used on multiple-value and tabular queries to describe context, instructions and constraints around the set of entities that you defined.
We're going to set the query criteria of our Holdings Query to: "Only extract from pages that contain a table of holdings currently owned by the account. Not activity logs, summaries, or other types of data. Extract all types of holdings including (but not limited to) mutual funds, stocks, bonds, ETFs, etc."
Breaking down our query criteria, we're telling the query the following details:
Only extract from pages that contain holdings currently owned by the account
Don't extract from pages that are activity logs, summaries, or other types of data
Extract all types of holdings
These instructions help because an investment report may contain many tables that mention holdings, but not all of them describe the holdings the account currently owns. Page filtering instructions like these are especially useful in the query criteria when you are dealing with large documents. Telling the query what not to extract also improves the accuracy of the extraction.
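As a recap of the setup so far, the extraction can be summarized as a plain Python structure. This is only a reading aid: the key names are hypothetical and are not Feathery's configuration schema, since the real settings live in the Feathery dashboard. The criteria strings are omitted for brevity.

```python
# Hypothetical summary of the extraction configured in this guide.
# Key names are illustrative, not a Feathery API or export format.
extraction = {
    "name": "Investment Report Extraction",
    "file_field": "investment-report-file",
    "queries": [
        {
            "frequency": "Single Value",
            "entities": [
                {"name": "Account Holder Name", "save_to": "account_holder_name"},
            ],
        },
        {
            "frequency": "Tabular",
            "entities": [
                {"name": "Holding Name", "save_to": "holding_name"},
                {"name": "Holding Ticker Symbol", "save_to": "holding_ticker_symbol"},
            ],
        },
    ],
}


def fields_written(config: dict) -> list:
    """List every form field the extraction saves a value to, in order."""
    return [
        entity["save_to"]
        for query in config["queries"]
        for entity in query["entities"]
    ]


print(fields_written(extraction))
```

Listing the target fields this way is a quick check that every entity has a matching field on the form before the extraction is connected.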
Now that we have set up our extraction, we will connect it to our form by adding an "AI Document Extraction" action as the first action on the button found on the first page of the form. Because we want to see the account holder name on the next page, we will not run the extraction asynchronously.
Now that everything is connected, we can test our extraction by opening our form and uploading a document. Once we click the button that we configured the action to run on, the extraction will run and the form will proceed to the next page once it completes. After the extraction has run, we can see the form field populated with the account holder's name.
To see the holdings data that was extracted, we can go to the "AI" tab and click on "Investment Report Extraction". Here you will see a new submission.
Clicking on the submission will bring you to a page where you can view all the extracted data and where it was extracted from on the document.
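If you later work with the two hidden fields outside of Feathery, one option is to pair the column values back into rows. The sketch below assumes each column arrives as a list of values in table order, which this guide does not confirm, so check the actual format of your field data first. The function name holdings_rows and the sample values are made up for illustration.

```python
from typing import Dict, List


def holdings_rows(names: List[str], tickers: List[str]) -> List[Dict[str, str]]:
    """Pair two column lists into one dictionary per table row."""
    # Assumption: both columns come from the same table, so lengths must match.
    if len(names) != len(tickers):
        raise ValueError("holding_name and holding_ticker_symbol differ in length")
    return [
        {"holding_name": name, "holding_ticker_symbol": ticker}
        for name, ticker in zip(names, tickers)
    ]


# Made-up sample values
rows = holdings_rows(
    ["Vanguard 500 Index Fund", "Microsoft Corp"],
    ["VFIAX", "MSFT"],
)
for row in rows:
    print(row)
```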