[GH-ISSUE #99] Data Provenance #41

Open
opened 2026-03-03 11:58:33 +03:00 by kerem · 0 comments

Originally created by @sergeiosipov on GitHub (Sep 3, 2025).
Original GitHub issue: https://github.com/finmars-platform/finmars-core/issues/99

Rationale:

Finmars processes data from different sources and produces its own data.
We need to identify where this data came from and how it was created, so that data provenance can support data governance.

Solution:

Implement provenance-specific entities (Platform Version, Provider, Source) and add provenance fields (see section "Provenance Fields" below) to Finmars entities (including transactions).

Functionality needed

  1. Create entities (see section "Affected Entities" below) with provenance fields (see section "Provenance Fields" below)
  2. View these fields in the Entities Forms
  3. Use these fields in the Entities Import Schemes
  4. Aggregate/Filter/Search by these fields in Reports: Transaction, Balance, Profit and Loss

Affected Entities

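NEW Entities:

  • Platform Version: https://github.com/finmars-platform/finmars-core/issues/130
  • Provider: https://github.com/finmars-platform/finmars-core/issues/128
  • Source: https://github.com/finmars-platform/finmars-core/issues/126
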
Existing Entities:

  • account
  • account_import_scheme
  • account_type
  • accrual_schedule
  • accrual_schedule_import_scheme
  • client_entity
  • counterparty
  • counterparty_import_scheme
  • currency
  • currency_import_scheme
  • factor_schedule
  • factor_schedule_import_scheme
  • fx_rate
  • fx_rate_import_scheme
  • instrument
  • instrument_import_scheme
  • instrument_type
  • layout_dashboard
  • portfolio
  • portfolio_bundle
  • portfolio_history
  • portfolio_import_scheme
  • portfolio_reconcile_group
  • portfolio_reconciliation
  • portfolio_reconciliation_history
  • portfolio_register
  • portfolio_register_record
  • portfolio_type
  • price
  • price_import_scheme
  • responsible
  • responsible_import_scheme
  • strategy1
  • strategy1_import_scheme
  • strategy2
  • strategy3
  • transaction_base
  • transaction_complex
  • transaction_complex_import_scheme
  • transaction_type
    and all other internal platform entities

Provenance Fields

Data context

  • provider_version - [PrvdrVrsnUsrCd - input in ttypes] reference id to the Provider entity (may be something like '{obj["configuration_code"]}_ch._{obj["channel"]}_ver._{obj["version"]}').
  • source_version - [SrcVrsnUsrCd - input in ttypes] reference id to the Source entity (may be something like '{obj["configuration_code"]}_ch._{obj["channel"]}_ver._{obj["version"]}').
  • credential - [CrdntlRcrdUsrCd - input in ttypes] user code of the Finmars Vault credential used to access the provider's data.
  • reference_ids - [TxIdrsDtls - input in ttypes] JSON-like field of key-value pairs identifying the entity in different sources; e.g. instrument ids by standard (ISIN: xxx, FIGI: yyy) and by provider/source (bank1_id: xxx, provider1_id: yyy) can be used simultaneously.
  • actual_at - [DtVerified - input in ttypes] date when the data was last revised; imported data can be current as of a date different from the access date.
  • attributes_extra - [TxClssfctnDtls - input in ttypes] JSON-like field of key-value pairs for attributes that are present in the acquired data but absent from Finmars data models.
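
The suggested format for provider_version and source_version can be sketched in Python as follows. This is only an illustration of the pattern quoted above in its underscore-separated form; the helper name build_version_user_code and the sample values are hypothetical and not part of Finmars.

```python
def build_version_user_code(obj: dict) -> str:
    """Compose a version reference id from configuration code, channel and version.

    Hypothetical helper: it only mirrors the format string suggested in the issue.
    """
    return f'{obj["configuration_code"]}_ch._{obj["channel"]}_ver._{obj["version"]}'


# Sample values are invented for illustration only.
provider = {
    "configuration_code": "com.example.provider",
    "channel": "stable",
    "version": "1.2.0",
}
provider_version = build_version_user_code(provider)
print(provider_version)  # com.example.provider_ch._stable_ver._1.2.0
```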

Execution context

  • platform_version - [PltfrmVrsn - input in ttypes] reference id to the Platform Version entity.

Trigger context

  • origin_initiator_code - (to be renamed to origin_initiator_username) [OrgnInitrUsrnm - input in ttypes] login of the user responsible for the manual/scheduler/API call, e.g. john_doe
  • origin_initiator_type - [OrgnInitrTp - input in ttypes] how the action was initiated: manual, scheduler, third_party_push.
  • origin_initiator_third_party_push_code - [OrgnInitrThrdPtyPushCd - input in ttypes] exact third-party address code for third_party_push, for example ***
  • origin_manual_entry_point - (planned rename to origin_initiator_manual_code) [OrgnInitrMnlCd - input in ttypes] exact webpage address code (for manual only, including addon addresses), for example ***
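
One possible way to model the trigger context is sketched below. The three initiator types come from the list above. The consistency rule (a push code only with third_party_push, an entry point only with manual) is an assumption inferred from the field descriptions, and the names OriginInitiatorType and validate_trigger_context are hypothetical.

```python
from enum import Enum
from typing import Optional


class OriginInitiatorType(str, Enum):
    """Allowed values of origin_initiator_type, as listed in the issue."""
    MANUAL = "manual"
    SCHEDULER = "scheduler"
    THIRD_PARTY_PUSH = "third_party_push"


def validate_trigger_context(
    initiator_type: str,
    third_party_push_code: Optional[str] = None,
    manual_entry_point: Optional[str] = None,
) -> OriginInitiatorType:
    """Check that the trigger fields are mutually consistent (assumed rule)."""
    kind = OriginInitiatorType(initiator_type)  # raises ValueError for unknown types
    if third_party_push_code and kind is not OriginInitiatorType.THIRD_PARTY_PUSH:
        raise ValueError("third-party push code is only valid for third_party_push")
    if manual_entry_point and kind is not OriginInitiatorType.MANUAL:
        raise ValueError("manual entry point is only valid for manual")
    return kind


# "hypothetical-entry-point" is a placeholder, not a real Finmars address code.
kind = validate_trigger_context("manual", manual_entry_point="hypothetical-entry-point")
```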

Service context

  • workflow_module_user_code - [WrkflwMdlUsrCd - input in ttypes] workflow used for the data (something like com.finmars.standard-workflow:init)
  • workflow_module_version_semantic - [WrkflwMdlVrsnSmntc - input in ttypes] version of the workflow used for the data
  • workflow_id - [WrkflwID - input in ttypes] id of the workflow that created the data
  • platform_task_id - [PltfrmTaskId - input in ttypes] id of the Finmars platform task that created the data
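
Taken together, the four contexts could be grouped into a single structure that each affected entity carries. The dataclass below is a sketch under assumptions: the field names come from the lists above, while the class name ProvenanceFields, the Python types and the defaults are illustrative guesses, not the actual Finmars model.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class ProvenanceFields:
    # Data context
    provider_version: Optional[str] = None
    source_version: Optional[str] = None
    credential: Optional[str] = None
    reference_ids: Dict[str, str] = field(default_factory=dict)
    actual_at: Optional[datetime] = None
    attributes_extra: Dict[str, Any] = field(default_factory=dict)
    # Execution context
    platform_version: Optional[str] = None
    # Trigger context
    origin_initiator_code: Optional[str] = None
    origin_initiator_type: Optional[str] = None
    origin_initiator_third_party_push_code: Optional[str] = None
    origin_manual_entry_point: Optional[str] = None
    # Service context
    workflow_module_user_code: Optional[str] = None
    workflow_module_version_semantic: Optional[str] = None
    workflow_id: Optional[str] = None
    platform_task_id: Optional[str] = None


# Example record using the sample values quoted in the issue.
record = ProvenanceFields(
    origin_initiator_code="john_doe",
    origin_initiator_type="manual",
    workflow_module_user_code="com.finmars.standard-workflow:init",
)
serialized = asdict(record)  # plain dict, e.g. for an API payload
```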

Important Notice

One provider may have different sources per run.
The reference_ids JSON field cannot be standardized, but it must still be queryable.
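
To illustrate what querying an unstandardized reference_ids field might look like, here is a minimal in-memory sketch. The helper find_by_reference_id and the sample instruments are hypothetical; the keys and placeholder values (ISIN: xxx, bank1_id: xxx) are the ones used in the field description.

```python
from typing import Any, Dict, Iterable, List


def find_by_reference_id(
    entities: Iterable[Dict[str, Any]], key: str, value: str
) -> List[Dict[str, Any]]:
    """Return entities whose reference_ids contain the given key-value pair.

    No fixed set of keys is assumed: entities that lack the key are skipped.
    """
    return [e for e in entities if e.get("reference_ids", {}).get(key) == value]


# Invented sample data; user codes are placeholders.
instruments = [
    {"user_code": "instrument_a", "reference_ids": {"ISIN": "xxx", "FIGI": "yyy"}},
    {"user_code": "instrument_b", "reference_ids": {"bank1_id": "xxx"}},
    {"user_code": "instrument_c", "reference_ids": {}},
]

matches = find_by_reference_id(instruments, "ISIN", "xxx")
print([m["user_code"] for m in matches])  # ['instrument_a']
```

If the field were stored in a database JSON column (an assumption about the implementation, for example a Django JSONField), the same lookup would typically be expressed as a key filter on that column rather than in Python.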

To Do:

  1. Implement Platform Version;
  2. Implement Provider;
  3. Implement Source;
  4. Remove credential_version (https://github.com/finmars-platform/finmars-core/issues/129);
  5. Modify all other entities with fields from Provenance Fields section.
  6. Add these fields into API filters.
  7. Add these fields into Forms.
  8. Add these fields into Import Schemes.
  9. Add these fields into Reports.