General data issues and challenges

aro092
Jun 8, 2022

1. Data as the new oil

Data management has become an essential focus across a wide range of industries, and every company now has to deal with it. More importantly, it is not just about the contents of the data itself but also about the key intelligence derived from the data, which forms part of the business assets. However, these potential assets must be handled and maintained well from the start if they are to deliver the actual “value” that can truly help businesses excel.

This intelligence extraction has driven the evolution of data knowledge fields and created a wide range of data solutions, each attempting to deliver frontier results that help companies beat the odds and stay ahead of the game. If a company's operative functions are the engines, then the relevant data flow and its intelligence extraction are the crucial “oil” that smooths the engines and the parts closely tied to the business. The quality of the “oil” also determines the “sustainability” of each business component and of the relevant “actions” triggered at the point of need. Using it properly will prolong and advance the business functions and open up new prospects.

2. The hidden Data Debts storm is approaching

Retrospectively, the data discovery or discoverability aspects have dominated numerous applications, use cases, and developments of emerging data solutions, not just in terms of how the data is “well managed” but also in the capability of keeping it “fully controlled” as intended, covering operative sustainability as well as risk control management.

Data is potentially a nightmarish monster. If it is not tamed wisely, or is let loose out of control, it can lead to the many undesirable data debt scenarios that market players face nowadays, especially those in the financial field. It is almost like applying the wrong type of “oil” to the engine and waiting for it to “catch fire” or, more dangerously, for an “implosion” or meltdown within an organisation, whether inside one department or across departments, unless reserving cash for emergency repairs is not an issue.

“Data Debt” is often associated with the concept of “technology debt”, in which a weak data management solution leaves an organisation lacking the foresight and agility to innovate. From a financial perspective, Data Debt can be defined as the amount of money required to “fix” the problems arising from previous data mishandling and neglect. https://johnladley.com/a-bit-more-on-data-debt/

The data debt situation is a very real problem that many companies are facing. They do not yet have a good answer to it, but have already been swirled into it and are in the midst of attempting to escape. https://www.dataversity.net/data-debt-one-way-of-impacting-at-the-data-portfolio/

Some data solutions that deal with murky data scenarios, for instance Data-Lakes (replacing Data-Warehouses), look promising at first glance because they offer a clean-slate starting point. However, as time passes, it is often discovered rather too late that the “data-lakes” solution can be insufficient and often operates in a passive rather than dynamic mode, which amounts to trading problem-solving for migration instead of tackling the root of the problem. This dilemma has recurred all too often, adding to data debts that can one day become unmanageable.

https://www.chaossearch.io/blog/why-data-lakes-fail

https://databricks.com/discover/data-lakes/challenges

3. A wishful restart

If you ever have the chance to ask data domain experts within banking or financial institutions, you will often find that most of them wish they could revisit their data strategy and deployment approach and start afresh. It is not that they lacked the relevant knowledge while upgrading their data solutions. Rather, they did not foresee the complex relationships between the parties involved, and constantly mounting complexity led to unavoidable cascades of non-uniform short-term fixes, stacked on top of previous fixes, that have gradually cemented into the legacy system.

The rich choice of data solutions in the market has somewhat “blindfolded” them, obscuring any real foresight of best practice for long-term data sustainability. Complications arise from multi-vendor configurations: even slight differences can consume enormous time and effort in accounting for misalignments, mismatches and re-routed approaches.

Another scenario is that big entities such as investment banks often deliberately acquire several data solutions simultaneously as part of their strategy and then decide which one to retain for mainstream usage while keeping the others as backup solutions. An in-house inventory of more than 500 data solutions & tools is not unusual. While this approach may work in favour of their risk management control, it also creates an overloaded inventory in which it is rather difficult to “let go” of a solution once it has been “adopted”, even when it overlaps with others.

In short, many institutions are currently spending huge further sums of money on data issues that they never expected when they were convinced to buy into those “perfect solutions”. They now have no choice but to carry the burden of continuation, reluctant or unable to migrate or change to better resolutions.

4. The granularity and differentiation in data management

To help readers, we break down the facets of data management into the five taxonomy categories below:

a) Data Storage & Governance – Keeping the right and necessary stuff

· The Input aspect – how data is being established, logged, registered, and archived

· What referencing mechanism is used for differentiation and categorisation

· Application of corporate policies and rules on how stored data can be used

· How data information can be communicated and managed

· Data ownership assignment

· Data Segregation & Categorisation

b) Data Provisioning & Accessibility – Entering & Looking

· The Retrieval aspect – how data can be provided, obtained, viewed and inspected

· Frequency and Consistency: being able to provide data from source to consumption

· Data masking or encryption – sensitive data to be masked while still indicating its presence

· What Access Control mechanisms or settings are in place (Rule-based vs Attribute-based)

· How data sharing can be accomplished within the same department or across departments

· Priority setting on which data can be viewed & retrieved by which type of users

· Data Preparation & Events Sourcing

· Data Gatekeeping
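The masking point above, hiding sensitive values while still indicating that they are present, can be illustrated with a minimal Python sketch. The function names `mask_value` and `mask_record`, the number of visible characters and the field names are all assumptions made for illustration; they do not refer to any particular product.

```python
def mask_value(value, visible=4, mask_char="*"):
    """Mask all but the last `visible` characters of a value.

    None is returned unchanged, so a consumer can still tell
    "no data held" (None) apart from "data held but hidden" (a mask).
    """
    if value is None:
        return None
    text = str(value)
    if len(text) <= visible:
        # Too short to reveal anything safely: mask every character.
        return mask_char * len(text)
    return mask_char * (len(text) - visible) + text[-visible:]


def mask_record(record, sensitive_fields):
    """Return a copy of `record` with the named sensitive fields masked."""
    return {
        key: mask_value(val) if key in sensitive_fields else val
        for key, val in record.items()
    }


# Hypothetical customer record used only for illustration.
customer = {"name": "Ann", "account": "12345678", "passport": None}
masked = mask_record(customer, {"account", "passport"})
```

In this sketch the masked account still shows that a value exists and how long it is, while the empty passport field stays visibly empty. Whether revealing length or trailing characters is acceptable depends on each company's own policy.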

c) Data Discovery, Discoverability & Observability – Original case

· The Comprehension aspect – how data is interpreted, perceived, analysed and understood

· The ability to identify crucial information, e.g., relevancy, whereabouts, status, reasoning, purpose, intel & insights, etc.

· What relationships and connections exist between relevant data points

· What transformation the data has undergone: the before & after scenarios

· Value extractions, derivation and utilisation plus relevant processes & metrics involved

· Current lifecycle, operative timeframe and possible expiry

· Data Monitoring – Normality versus Abnormality – any detection of unusual events

· Data Catalogue and Compilation
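As one possible reading of the "Normality versus Abnormality" monitoring point above, the sketch below flags a new observation (for example a daily record count) when it deviates strongly from recent history. The z-score rule, the threshold of 3 and the function name `is_abnormal` are assumptions chosen for illustration; real observability tools typically use more robust methods.

```python
from statistics import mean, stdev


def is_abnormal(history, new_value, threshold=3.0):
    """Return True if `new_value` lies more than `threshold` sample
    standard deviations away from the mean of `history`."""
    if len(history) < 2:
        raise ValueError("need at least two historical observations")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # History is perfectly constant: any change counts as abnormal.
        return new_value != mu
    return abs(new_value - mu) / sigma > threshold


# Hypothetical daily row counts for one data feed.
daily_rows = [100, 102, 98, 101, 99, 100]
spike_flagged = is_abnormal(daily_rows, 500)
normal_flagged = is_abnormal(daily_rows, 101)
```

Comparing the new value against a separate history window, rather than including it in the statistics, avoids a single large outlier inflating the standard deviation and hiding itself.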

d) Data Augmentation & Enhancement – Enriched case

· The Upgrade aspect – how to learn from and elevate the quality of information & its further usefulness

· The act of filling in, with additional tweaking, to adjust, correct & synthesise possibly missing data to boost the quality of the data and enable further information enrichment.

· Also, the ability to filter out and eliminate unwanted noisy or polluted information.

· Incorporation with Business Intelligence (BI) and beefing up on information – via AI or ML

· The ability to estimate and squeeze any further predictions that may profit the company

· To provide foresight of potential upcoming trends and their possible changes of direction

· Data Quality, Data Analytics and Data Aggregation

e) Data Compliance & Reconciliation – The Correction & Tolerance issues.

· The Risk measure aspect – how to safeguard usage and responsibility of information

· Satisfying and meeting regulators’ requirements in handling sensitive information

· Consulting and obtaining rightful consents & permissions from relevant parties

· Data harmonisation, making arguments & storyline consistent with reporting

· Contingency provisioning, compatible assurance & risk mitigation

· Fiduciary duty on information accuracy at the point of dissemination

· Error Tolerance and Allowance setting – acceptable range before incurring fines

· The act/art of convincing regulatory bodies on accepting mishandling with reasoning

· Risk Management & Control
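The "Error Tolerance and Allowance setting" item above can be pictured as a simple reconciliation check between reported figures and a reference source. The sketch below is only an assumption-laden illustration: the 1% relative tolerance, the function names `within_tolerance` and `reconcile`, and the sample figures are hypothetical, and actual acceptable ranges are set by the relevant regulator or internal policy.

```python
def within_tolerance(reported, reference, rel_tol=0.01):
    """True if `reported` is within `rel_tol` (relative) of `reference`."""
    if reference == 0:
        # A relative difference is undefined at zero; require an exact match.
        return reported == 0
    return abs(reported - reference) / abs(reference) <= rel_tol


def reconcile(reported, reference, rel_tol=0.01):
    """Compare two {key: amount} mappings and return the breaks found.

    A key present on only one side is a "missing" break; a key whose
    amounts differ by more than the tolerance is "out_of_tolerance".
    """
    breaks = {}
    for key in sorted(set(reported) | set(reference)):
        if key not in reported or key not in reference:
            breaks[key] = "missing"
        elif not within_tolerance(reported[key], reference[key], rel_tol):
            breaks[key] = "out_of_tolerance"
    return breaks


# Hypothetical figures: internal report versus reference ledger.
report = {"A": 100.5, "B": 200.0, "C": 50.0}
ledger = {"A": 100.0, "B": 210.0, "D": 10.0}
breaks = reconcile(report, ledger)
```

Here account A passes because it differs by 0.5%, B is flagged because it differs by roughly 4.8%, and C and D are flagged because each appears on only one side.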

These five groupings will be encountered by most companies, and proper handling is needed to smooth out data operations while enhancing business functions. Unless the data is acquired externally from third parties, most data will be prepared and produced in-house and exchanged for usage afterwards.

Hence, how each data element arrives at its final, settled form as pertinent material is extremely crucial to producing the “right oil” for the company to operate efficiently and effectively, using proper tooling for data management and information exchange.

This leads to our next part: how to connect all these facets of data management together while allowing a flexible approach to addressing current data issues and mishandling crises.
