
Importance of Data in Advancement of Artificial Intelligence (AI)

There is a misconception that Artificial Intelligence (AI) started in the past decade. It’s true in a sense, but the real AI started well before then. According to Wikipedia, the field of AI started in the 1950s. In fact, AI can be traced back several centuries to the use of automata. AI has become one of the most important fields today because of two developments: the availability of computing power and the availability of data.

I am not an expert in this field, but having had to learn this subject over the past year, I was surprised to find out that my career background is somehow related to the development of AI. I started my career at Arthur Andersen, and I was exposed to “Big Data” when I was responsible for financial reporting there. Before Arthur Andersen disappeared from the business world, it had one of the best data systems, the Financial System of the 90 (FS90). I had the privilege to work with Ralph Schonenbach, who is now the CEO of Envoy, in developing several tools for the Financial Control of Arthur Andersen.

I found the data owned by Arthur Andersen fascinating. With a complete data map, I was able to generate various reports using Microsoft Access. Some of the tools that I created went on to become an integral part of management reporting. The experience I gained at Arthur Andersen helped me tremendously as I moved to Citigroup and HSBC, where I continued to create different reports for management and the US banking regulators.

The AI today is no different than what I experienced when I was at Arthur Andersen. Essentially, AI uses huge amounts of data to identify trends, outlooks and suggestions; from this, AI can harness the data to automate repetitive tasks. AI has become more important as Big Data has become readily available. The explosion of smartphones has also helped fuel AI, as more and more companies found ways to collect data from smartphone users through their apps. For example, apps such as Spotify, Netflix and the ubiquitous Google Chrome collect terabytes of data every single day. Many companies, particularly Google, saw the potential in this availability of data and started to monetize the asset.

Computers today have also advanced so much that they allow AI developers to crunch data more quickly and efficiently. I remember that my first PC, bought in the early 1990s, used a 386 Intel chip running at 60 MHz. Today, PC performance is measured in teraflops; a teraflop is a unit of computing speed equal to one million million (10^12) floating-point operations per second. Of course, not everyone needs that kind of computing power for everyday use. That’s where the computer hobbyists come into the picture with the development of micro-computers (Raspberry Pi and Arduino). Nowadays many companies in the AI business are talking about the Internet of Things (IoT). In case you are not aware, IoT refers to the interconnection via the Internet of computing devices embedded in everyday objects, enabling them to send and receive data. We are talking about everything from toasters to doorbells.

Why is data so essential to the development of AI? I’m not a data scientist (which is a new field that came out of the explosion of AI). But I can tell you that without data, AIs are just dumb machines. Data enables AI developers to piece and stitch different sets of data together and identify a trend. And from the trend, the developers can generate hypotheses and create predictive analyses.

Let me explain. Before I was exposed to Microsoft Access, I used Microsoft Excel to do a lot of computing work. All financial analysis requires Excel. However, Excel data is flat, meaning that whatever numbers you put into a formula will generate a known result. A Microsoft Access database is different because it is a relational database: essentially, the database contains multiple flat tables interconnected through relationships using key fields. From those relationships, Microsoft Access allows the user to produce different results based on selected criteria.
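To make the idea of tables linked by key fields more concrete, here is a minimal sketch in Python. It uses the built-in sqlite3 module as a stand-in for Microsoft Access, and the table names, columns and figures (offices, expenses, office_id and so on) are made up purely for illustration. They are not taken from any real reporting system.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two flat tables linked by a key field.

    All names and numbers here are hypothetical, for illustration only.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE offices (
            office_id INTEGER PRIMARY KEY,
            city      TEXT
        );
        CREATE TABLE expenses (
            expense_id INTEGER PRIMARY KEY,
            office_id  INTEGER REFERENCES offices(office_id),  -- the key field
            category   TEXT,
            amount     REAL
        );
        INSERT INTO offices VALUES (1, 'Chicago'), (2, 'New York');
        INSERT INTO expenses VALUES
            (1, 1, 'Travel',   1200.0),
            (2, 1, 'Software',  300.0),
            (3, 2, 'Travel',    800.0);
        """
    )
    return conn


def totals_by_city(conn, category):
    """Join the two tables on office_id and total one expense category per city."""
    return conn.execute(
        """
        SELECT o.city, SUM(e.amount)
        FROM expenses AS e
        JOIN offices AS o ON o.office_id = e.office_id
        WHERE e.category = ?
        GROUP BY o.city
        ORDER BY o.city
        """,
        (category,),
    ).fetchall()
```

Calling totals_by_city(build_demo_db(), "Travel") and then again with "Software" gives different results from the same two tables, which is the "selected criteria" idea: the relationship stays fixed and only the question changes.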

The AI today uses the same concept but at a bigger scale. The data sets may not even be related to each other, but the AI understands what the user is looking at and produces results that could be related. Let’s use Netflix as an example. When you first sign on to Netflix, the service asks you what genres of movies you like to watch. As you use Netflix more and more, the AI builds up your profile. It will begin to suggest movies that you would prefer to watch. For example, I have always been a WWII aficionado. When I first signed up for my Netflix account, I never told it that I wanted war movies. Over time it started to suggest war movies, documentaries and even Sci-Fi movies that are war related.
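The profile-building idea can be sketched in a few lines of Python. To be clear, this is not how Netflix actually works; it is a toy illustration I am assuming for the sake of the example, where each title carries a list of genre tags and the "profile" is simply a count of the tags the user has watched. The function names and data layout are hypothetical.

```python
from collections import Counter


def build_profile(watch_history):
    """Count how often each genre tag appears in the titles a user has watched."""
    profile = Counter()
    for title in watch_history:
        profile.update(title["genres"])
    return profile


def recommend(profile, catalog, watched_names, top_n=3):
    """Rank unwatched titles by how strongly their genres overlap the profile."""
    scored = []
    for title in catalog:
        if title["name"] in watched_names:
            continue  # do not suggest something already watched
        # A genre the user has never watched contributes 0 to the score.
        score = sum(profile[genre] for genre in title["genres"])
        if score > 0:
            scored.append((score, title["name"]))
    # Highest score first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]
```

With a history of two war titles, a Sci-Fi movie tagged as war related would score above an unrelated comedy, even though the user never said they liked war movies. The more titles in the history, the sharper the profile becomes.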

The above example is on the software side. But what about robotics or hardware? When does data come into play? When I attended the AI Summit, I had the privilege to attend Lockheed Martin's presentation on AI. I found the Automatic Ground Collision Avoidance System (GCAS), and how it saves lives, fascinating. The pilots of fighter jets go through maneuvers that can produce g-forces strong enough to render a pilot unconscious or cause spatial disorientation. When that happens, the GCAS kicks in, automatically levels the flight and prevents the fighter jet from crashing into the terrain. The GCAS requires multiple data feeds, such as wind speed, aircraft speed, location of the aircraft, the pilot's responsiveness and historical data, to determine when it is appropriate to take control of the aircraft.

Anyway, this is just a blog, not a scientific paper arguing how data became so important in the AI field. I am not qualified to provide an expert view in this field. But after being a champion of data quality and a user of data for over 15 years, I can tell you that data is everything. Our lives are driven by data, and they will continue to be driven by data. I won’t be surprised if we start to embed AI in our consciousness in the next decade or so. There are many opponents of this idea, as it crosses the line of privacy, but that is another subject for another day.
