How to Manage Data – A Quick Guide

Data has become so important that many of us do not realize how much it controls our daily lives. Data can work for you, and it can also work against you if it is not well understood or properly handled. Used properly, data is an important tool.

I have worked with all kinds of data throughout my professional life. I started by assigning auditors to audit engagements, then worked on financial and expense data. For the past 15 years I have focused on control functions within banks and on the remediation of audit issues. All the data I worked on ended up on someone’s desk, and I realized that senior management could use my work to make important business decisions. One of the most important lessons I learned is that, for important decisions to be made, the data must be complete and accurate.

My first exposure to data, as I recall, was a college class on DB2. DB2 was a database application designed to work with “flat” data. I was amazed that I could filter and sort the data. The rest is a haze. Microsoft Access, which was part of the Microsoft Office suite, was revolutionary. I was introduced to the application in my second year at Arthur Andersen. I realized its potential and found many uses for it in the years that followed.

Fast forward 15 years, and my work with data is more important than ever. My current work requires more than data analysis skills: I also have to connect different sets of data, tell a story and identify trends that senior management can easily understand. One of the biggest challenges in telling that story is incomplete data. In my line of work, important decisions are made that could affect the livelihood of other employees. Therefore, the data must exist and be factually accurate.

There are several challenges that every business must face to ensure its data is reliable. There are many solutions, but this article focuses on what-if scenarios and the best way forward to ensure the data is just, reliable and accurate. I will also touch on the role of Artificial Intelligence in using data.

Data Availability

With the use of smartphones (do we even call them that anymore?), the availability of data has multiplied tenfold. Computing power has also multiplied over the past decade, to the point that anyone with a regular desktop can churn out enough information to operate a robot. However, in certain industries such as banking, data continues to be siloed. Take HSBC, the bank I work for, as an example: we need to produce management reporting based on information that is generated internally. We can’t use data from outside the bank for various reasons, mainly relevancy. When data is not available, we need to create it internally.

In the AI field, the issue is different. Let’s use Amazon, the largest online retailer on the planet, as an example. Amazon’s goal is to sell more products to its customers. Prior to the big data revolution, it needed to understand its customers and make suggestions that would entice them to buy more. To achieve that, it would combine data available in house with external data to build a suggestion engine. The engine would use both sets of data to build a list of products and display them on the buyer’s homepage. Nowadays Amazon is so big that it relies only on its own data.

Data Relevancy and Linkages

Most companies today generate tons of data daily. An important task, however, is to analyze how relevant that data is to day-to-day operations. If the data is relevant, how does the company create relationships or linkages within it? Take my current work as an example. Every employee in the bank is required to complete mandatory training annually. Employees are also encouraged to take training that is relevant to their work. How do we link internal employee training with retail clients? Are the two even relevant to each other?

A real-life example is how COVID-19 affected the way we manage the business. The pandemic caused the Internal Audit function to revise the annual audit plan. Any changes to the audit plan must be approved by the oversight board, the Audit Committee, and must be communicated to the local regulators. Unfortunately, we can’t just “write” it off and hope for the best. My job was to understand the changes, review the history and the current business environment, and determine whether any of the risks could be mitigated. I would go through various data, review all handwritten notes and produce a summary for the Chief Audit Executive so that a sound judgement could be made. Unfortunately, the task was manual because some of the data was incomplete and human judgement was necessary.

Data Organization

Having tons of data available is useless if the data is not organized properly. This is why we have data scientists today. Their role is to analyze how the data is used and to organize it so that it is readily available. If I had to guess, 90% of people who run reports from a preset system do not understand this concept. They export the data to Microsoft Excel and crunch the numbers without thinking twice about where the data comes from.

Data must be organized in a multi-dimensional way so it can be used in multiple ways. The data cannot be flat, like an Excel data set. Instead, data from multiple tables is linked using key fields. This allows the data to be reshaped, adjusted and sliced easily. This is where Microsoft Access and other relational databases shine. Most of the data is stored in SQL databases, where it can be easily retrieved and analyzed. Some companies have created front-end reporting systems that do all the hard work, so that only limited programming knowledge is required. Qlik Sense and Crystal Reports are two that I am aware of.
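To make the idea of key fields concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The table names, column names and sample rows are all made up for illustration; they are not taken from any real system. Two tables are linked through an employee_id key field and then sliced by department.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two tables linked by a key field.

    All table names, columns and rows are hypothetical sample data.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employees (
            employee_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            department  TEXT NOT NULL
        );
        CREATE TABLE training (
            training_id INTEGER PRIMARY KEY,
            employee_id INTEGER NOT NULL REFERENCES employees(employee_id),
            course      TEXT NOT NULL,
            completed   INTEGER NOT NULL  -- 1 = completed, 0 = outstanding
        );
        """
    )
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?, ?)",
        [(1, "Ana", "Audit"), (2, "Ben", "Audit"), (3, "Cy", "Finance")],
    )
    conn.executemany(
        "INSERT INTO training VALUES (?, ?, ?, ?)",
        [(1, 1, "AML", 1), (2, 2, "AML", 0), (3, 3, "AML", 1), (4, 1, "Privacy", 1)],
    )
    conn.commit()
    return conn


def completion_by_department(conn):
    """Join the two tables on the key field and summarize per department.

    Returns a list of (department, completed_count, assigned_count) tuples.
    """
    return conn.execute(
        """
        SELECT e.department, SUM(t.completed), COUNT(*)
        FROM employees AS e
        JOIN training AS t ON t.employee_id = e.employee_id
        GROUP BY e.department
        ORDER BY e.department
        """
    ).fetchall()


if __name__ == "__main__":
    for department, done, assigned in completion_by_department(build_demo_db()):
        print(f"{department}: {done} of {assigned} trainings completed")
```

The same two tables could just as easily be sliced by course or by employee, which is the flexibility a single flat sheet does not give you.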

Data Lifespan and Knowledge Transfer

Data must be kept for an undetermined lifespan. Regardless of how old it is, data is a valuable asset. Data can produce trend analysis, provide deep insights and predict future occurrences. One good example is predicting weather patterns in the US. However, what is the value of the data if it is housed in a secure location that no one has access to? Proprietary data must be housed in a location that is properly secured and documented. Additionally, the data must be easy to transfer should the need arise. Documentation must include the location of the data, its intended use, a description and its classification. Without proper documentation, knowledge of the data cannot be transferred.
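The documentation items listed above can be captured in a very small record. The sketch below is only one possible shape, written in Python; the class name, field names and the helper function are my own assumptions for illustration and do not come from any standard.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical documentation entry for one data set."""

    name: str
    location: str        # where the data is housed
    intended_use: str    # what the data is meant to be used for
    description: str     # what the data contains
    classification: str  # e.g. "public", "internal", "restricted"


def missing_fields(record):
    """Return the names of documentation fields that were left blank."""
    return [field for field, value in asdict(record).items() if not value.strip()]


if __name__ == "__main__":
    record = DatasetRecord(
        name="audit_plan_history",
        location="shared drive, audit folder",
        intended_use="",
        description="Yearly audit plans and approved changes",
        classification="internal",
    )
    print(missing_fields(record))
```

A check like missing_fields can be run before a data set is handed over, so that gaps in the documentation are found while the person who knows the data is still around.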

Information Security

For data to be relevant, all data that is proprietary or used to make important business decisions must be protected at all costs. This has become so important that regulators around the world are tackling it. Data is such a valuable asset that it is traded and sold on the open market. Every day we hear about information being stolen by bad actors and sold on the dark web. Hence, protecting data is paramount for any company. Cybersecurity has become an important field in which a lot of companies are investing. How do you contribute to cybersecurity? Be aware of the policies and procedures set forth by your company, follow them, and use common sense.

This blog is not intended to cover everything about managing data. Data is a subject that could fill multiple books, and most universities spend months just skimming it. However, I wanted to touch on several topics that are essential to keeping data relevant in the business world, and I hope I’ve done that. Thank you for reading.

Is Artificial Intelligence (AI) the next big thing?

Earlier this week I attended New York’s Artificial Intelligence (AI) summit, which was offered for free through my company. The event is meant for businesses that are interested in applying AI in every facet of the company. It was attended by over 5,000 “delegates” and various speakers. While I feel the event was informative overall, I did not find it that useful. This is because AI is not an easy subject to tackle.

That brings me to the title of this blog: is AI the next big thing? Over the past century, there have been tons of “next big things”: personal mobility, flight and the coming of the internet, to name a few. After being exposed to the subject, being a developer of “AI” myself and attending many discussions about it, I find that AI is an unavoidable subject that everyone is already living with.

What is AI?

AI is a broad subject that covers a wide range of automation. Implying that AI just means robotics is incorrect. AI refers to automating tasks that we do every day. At a deeper level, AI refers to making those tasks not only easier, but better. However, there are a number of risks involved. In this blog I’m not going into the subject that deeply because I’m not an expert in this area.

What is AI, really?

AI is just a pure “if and then” statement. In other words, it translates to cause and effect. In computer lingo, you program the machine to identify a condition or action and, if that condition is satisfied, what will happen next. Let’s use Alexa as an example. You can program Alexa through a routine to tell you the weather conditions. “If” the temperature outside drops below 30 degrees Fahrenheit, you want Alexa to remind you to wear a heavy jacket before going out into the cold. AI can be divided into multiple categories, ranging from front-end uses (e.g. applications) to Machine Learning (ML) and Deep Neural Networks. For the sake of not confusing anyone, I will use AI to cover all these subjects.
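The “if and then” idea from the weather example can be sketched in a few lines of Python. This is not how Alexa routines are actually built; the function name and the reminder text are my own, and the threshold of 30 is taken from the example above. The function does not care about the unit, as long as the reading and the threshold use the same one.

```python
def jacket_reminder(outside_temp, threshold=30.0):
    """Return a reminder when the temperature drops below the threshold.

    Hypothetical helper: the condition is the "if", the reminder is the "then".
    Returns None when no reminder is needed.
    """
    if outside_temp < threshold:
        return "Wear a heavy jacket before going out."
    return None


if __name__ == "__main__":
    for reading in (25.0, 45.0):
        print(reading, "->", jacket_reminder(reading))
```

A reading below the threshold produces the reminder; a reading at or above it produces nothing. The cause and effect are spelled out by hand.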

My exposure to AI

Every one of us is using AI, whether we are aware of it or not. For example, if you are reading this, I can assume that you already have a Netflix account. A normal user would not know that various AI systems are running every time he or she opens the Netflix app. Netflix uses your past behavior to make movie suggestions on your home screen. Additionally, Netflix uses AI to create the thumbnails on the home screen. The biggest question is how Netflix manages to do that in a split second. Like everyone else, I am mainly a user of AI. However, only in the last year or so did I realize that I’m also a developer of AI, albeit on a small scale. I’ve been developing databases using Microsoft Access for over 15 years. During this time I created over 50 different databases (or tools) to do things more efficiently. Additionally, these tools were able to generate hundreds of different reports through conditions I built in.

Is AI easy?

Even though I’m an AI developer by definition, I would be lying if I said AI is easy. This is one of my criticisms of the AI summit that I attended: all the speakers, delegates and booths at the conference seemed to suggest that AI is a must-have and easy. After I attended an Amazon Web Services (AWS) bootcamp two months ago, I realized that AI is not as easy as advertised. Not only do you need to understand computer lingo, you also need a good understanding of programming. There is nothing “click” and “drag” in AI.

Should everyone work with AI?

The answer is yes. In one of the talks at the conference, the speaker mentioned that over the next several decades AI will be so important that it will decide whether a business succeeds or fails. Businesses that start incorporating AI into their business models will likely succeed, and those that do not think about AI today will fail (see above). That translates to the workforce. If you want to succeed in the workforce, you need to start thinking about how to incorporate AI into your career. For years I was satisfied staying in my “comfortable” spot and not worrying about my future. This is no longer the case, as I see my friends progressing further while I feel “stagnant”. That’s why I’ve begun to explore this area more and more.

What do you need to do now?

If you are currently in the workforce, start investigating AI and how it will help you, or vice versa. Start incorporating AI into your daily work. One way to do this is to look at your internal tools and processes and see whether there is any way to make things more effective. If you are a parent with children still in grade school, start encouraging them to learn computer coding (particularly Python).

In the next several blogs, I will invest more time in discussing AI. If you have any questions or comments, feel free to respond to this blog.