Data has become so important in our daily lives that many of us do not realize how much it controls them. Data can work for you, and it can also work against you if it is not well understood or properly handled. Used properly, data is an important tool.
I have worked with all kinds of data throughout my professional life. I started by assigning auditors to audit engagements, then I worked on financial and expense data. For the past 15 years I have focused on control functions within banks and on the remediation of audit issues. All the data that I worked on always ended up on someone’s desk. I also realized that my work could be used by senior management to make important business decisions. One of the most important lessons I learned is that, for important decisions to be made, the data must be complete and accurate.
I recall that my first exposure to data was a class I took in college on DB2. DB2 was a database application designed to work with “flat” data. I was amazed that I could filter and sort the data. The rest was just a haze. Microsoft Access, which was part of the Microsoft Office suite, was revolutionary. I was exposed to the application in my second year at Arthur Andersen. I realized its potential and found many uses for it in subsequent years.
Fast forward 15 years, and my work with data is more important than ever. My current work requires not only data analysis skills; I also have to make connections between different sets of data, tell a story and develop trends that can be easily understood by senior management. One of the biggest challenges in telling the story is incomplete data. In my line of work, important decisions are made that could impact the livelihood of other employees. Therefore, the data must exist and be factually accurate.
There are several challenges that every business must face to ensure its data is reliable. There are many solutions, but this article focuses on what-if scenarios and the best way forward to ensure the data is just, reliable and accurate. I will also touch on the role of Artificial Intelligence in using data.
Data Availability
With the use of smartphones (do we even call them that anymore?), the availability of data has multiplied tenfold. Computing power has also multiplied over the past decade, to the point that anyone with a regular desktop can churn out enough information to operate a robot. However, in certain industries such as banking, data continues to be siloed. Take HSBC, the bank I work for, as an example: we need to produce management reporting based on information that is generated internally. We can’t use data from outside the bank for various reasons, mainly relevancy. When data is not available, we need to create it internally.
In the AI field, the issue is different. Let’s use Amazon, one of the largest retailers on the planet, as an example. Amazon’s goal is to sell more products to its customers. Prior to the big data revolution, it needed to understand its customers and make suggestions that would entice them to buy more. To achieve that, it would combine the data available in house with external data to build an AI suggestion engine. The engine would use both sets of data to build a list of products and display them on the buyer’s homepage. Nowadays Amazon is so big and so vast that it relies only on its own data.
Data Relevancy and Linkages
Most companies today generate tons of data daily. However, an important task is to analyze how relevant the data is to the company’s day-to-day operations. If the data is relevant, how does the company create relationships or linkages within the data? Let me use my current work as an example. Every employee in the bank must complete required training annually. Employees are also encouraged to take training that is relevant to their work. How do we link internal employee training with retail clients? Are the two even relevant to each other?
A real-life example is how COVID-19 affected the way we manage the business. The pandemic caused the Internal Audit function to revise the annual audit plan. Any changes to the audit plan must be approved by the oversight board, the Audit Committee, and the changes must be communicated to the local regulators. Unfortunately, we can’t just “write” it off and hope for the best. My job was to understand the changes and review all past history, the current business environment and whether any of the risks could be mitigated. I would go through various data, review all handwritten notes and produce a summary for the Chief Audit Executive so that a sound judgement could be made. Unfortunately, the task was manual because some of the data was incomplete and human judgement was necessary.
Data Organization
Having tons of data available is useless if the data is not organized properly. This is why we have Data Scientists today. Their role is to analyze how the data is used and how to organize it so that it is readily available. If I had to guess, 90% of people who run reports from a preset system do not understand this concept. They export the data to Microsoft Excel and crunch the numbers without thinking twice about where the data comes from.
Data must be organized in a multi-dimensional way so it can be used in multiple ways. The data cannot be flat, like an Excel data set. Instead, data from multiple tables is linked using key fields. This allows the data to be molded, adjusted and sliced easily. This is where Microsoft Access and other relational databases shine. Most data is saved in SQL databases, where it can be easily retrieved and analyzed. Some companies have created front-end reporting systems that do all the hard work and require only limited programming knowledge. Qlik Sense and Crystal Reports are two that I am aware of.
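To make the idea of linking tables through key fields more concrete, here is a minimal sketch in Python using the built-in sqlite3 module. The table names, columns and rows are all made up for illustration (they borrow the employee training example from earlier) and do not reflect any real system.

```python
import sqlite3

# Hypothetical tables and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        department  TEXT NOT NULL
    );
    CREATE TABLE training_completions (
        completion_id INTEGER PRIMARY KEY,
        employee_id   INTEGER NOT NULL REFERENCES employees(employee_id),
        course        TEXT NOT NULL
    );
""")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ana", "Audit"), (2, "Ben", "Audit"), (3, "Cara", "Retail")],
)
conn.executemany(
    "INSERT INTO training_completions VALUES (?, ?, ?)",
    [
        (1, 1, "Data Privacy"),
        (2, 2, "Data Privacy"),
        (3, 3, "Data Privacy"),
        (4, 1, "Fraud Awareness"),
    ],
)


def completions_by_department(connection):
    """Count training completions per department by joining on the key field."""
    return connection.execute("""
        SELECT e.department, COUNT(*) AS completed
        FROM employees AS e
        JOIN training_completions AS t ON t.employee_id = e.employee_id
        GROUP BY e.department
        ORDER BY e.department
    """).fetchall()


print(completions_by_department(conn))  # [('Audit', 3), ('Retail', 1)]
```

Because the two tables are linked by employee_id, the same data could just as easily be sliced by course or by individual employee by changing the GROUP BY column, which is much harder to do with a single flat spreadsheet.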
Data Lifespan and Knowledge Transfer
Data must be kept for an undetermined lifespan. Regardless of how old the data is, it is a valuable asset. Data can produce trend analysis, provide deep insights and predict future occurrences. One good example is predicting weather patterns in the US. However, what is the value of the data if it is housed in a secure location that no one has access to? Proprietary data must be housed in a location that is properly secured and documented. Additionally, the data must be easy to transfer should the need arise. Documentation must include the location of the data, its intended use, a description of the data and its classification. Without proper documentation, knowledge of the data cannot be transferred.
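As a rough illustration of the documentation items listed above, the sketch below models one catalog entry per data set and flags any item that was left blank. The class name, field names and sample values are assumptions made up for this example, not a standard or an actual bank system.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetRecord:
    """One hypothetical catalog entry describing a data set."""
    name: str
    location: str        # where the data is housed
    intended_use: str    # why the data is kept
    description: str     # what the data contains
    classification: str  # e.g. "public", "internal", "restricted"


def missing_fields(record):
    """Return the names of documentation fields that were left blank."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Made-up entry with the intended use left undocumented.
entry = DatasetRecord(
    name="audit_issues_2020",
    location="shared-drive/audit/issues",
    intended_use="",
    description="Open and closed audit issues with owners and due dates",
    classification="internal",
)

print(missing_fields(entry))  # ['intended_use']
```

A check like this could run whenever a data set is registered, so that gaps in the documentation are caught before the person who knows the data moves on.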
Information Security
For data to be relevant, all data that is proprietary or used to make important business decisions must be protected at all costs. This has become so important that it is one of the subjects regulators around the world are tackling. Data is such a valuable asset that it is traded and offered for sale on the open market. Every day we hear about information being stolen by bad actors and sold on the dark web. Hence, protecting data is paramount for any company. Cybersecurity has become an important field that many companies are investing in. How can you contribute to cybersecurity? Make sure you are aware of the policies and procedures set forth by your company, and use common sense.
This blog is not intended to cover everything about managing data. Data is a subject that could fill multiple books, and most universities spend months just skimming it. However, I wanted to touch on several subjects that are essential to ensuring data is relevant in the business world, and I hope that I’ve done that. Thank you for reading.