Note – this article was originally published on www.information-management.com
We are seeing impressive gains in artificial intelligence (AI) systems across diverse industries today: in faster drug discovery with around double the screening success rate; in finance to prevent fraud by analysing billions of transactions for criminal behaviour across multiple networks; and in retail to provide improved, personalized experiences with smarter chatbots, real-time recommendations, and search.
Yet, at times, we find ourselves disappointed with suboptimal AI-produced results. Take, as recent examples, the worst-ever performance by AI hedge funds and, of course, the amplified bias found in recruiting tools. AI is making progress and being applied in just about every aspect of our lives – but we have a way to go to fulfil the promises of AI with robust and trustworthy systems.
Although we are just beginning to understand the full range of AI impact, there are practical steps we can take now to reap near-term benefits, ameliorate dangers, and create a foundation for what will undoubtedly be a frenetic future. In particular, it’s essential to increase the accuracy and flexibility of AI systems in tandem.
If we increase accuracy alone, then correct, exact solutions in one situation will quickly become inappropriate when data or circumstances change (sometimes by even the slightest amount). Likewise, if we only increase flexibility, we’ll never get to the precision required for real-world uses. How do we effectively work on both elements at the same time? We can use an originating AI concept to guide us and illustrate it with a modern use case.
A clue can be found in this definition of AI that harkens back to original ideals: “Computer processes that have learned to accomplish tasks in ways that mimic human decisions, which are probabilistic.” Adult humans make tens of thousands of decisions every day and most are dependent upon our surrounding circumstances or perspectives.
For example, when planning a trip, our decisions will vary significantly depending on whether the travel is for work, pleasure, or perhaps an emergency. In language, the intention of a phrase is highly dependent on the context in which it is uttered. For example, if I yell, “Get out!” context is the key to whether I’m expressing a friendly note of surprise or demanding that someone leave the room immediately.
Humans use contextual learning to figure out what’s important in a situation and how to apply that to new situations. Just as we use general frameworks of understanding for better decisions and adapt as circumstances demand, effective AI systems need these same qualities of contextual accuracy and flexibility – to think (sort of) like humans.
Being more correct, more precise, and more adaptive at the same time will enable AI decisions that are situationally appropriate just as people (usually) are. This will allow us to move beyond “narrow AI,” which is defined as being focused on performing one task very well, to more generalized AI that incorporates multiple abilities.
Although AI solutions today usually fall into the “narrow AI” category, one way to make them more widely capable is to provide them with context, surrounding them with related information to use in solving the problem at hand. This makes sense if we think about how humans in complicated, dynamic situations — like driving a car, for instance — continually observe, collect related data, and make connections. We then process all this connected information and make informed, in-context decisions.
For AI processes to make decisions that closely mimic those of humans, they also need to incorporate a lot of context so they can learn from related data, make refined judgments, and adjust as necessary. This is why knowledge graphs, which acquire and integrate adjacent information using data relationships to derive new knowledge, are so prevalent in supporting suggestions and decisions in AI systems today. For example, companies like eBay use knowledge graphs to power more intelligent commerce.
Furthermore, without peripheral and related information, AI requires more exhaustive training, more prescriptive rules, and more specific application. This means slower progress and less flexible solutions. Consider how we learn as children: we might touch a hot stove, but only once. We learn quickly, without hundreds of trials, that we need to be cautious in certain contexts, such as touching metal surfaces that happen to be at counter height.
Like AI, machine learning (ML) is more effective when more context and connections are added. We know that relationships are highly accurate predictors of behaviour and we know that most data scientists want more data to increase the accuracy of their ML models. Although many current data science models ignore relationships and their structure, we can add these predictive features fairly easily using the connections within the data we already have.
A great example of how ML can be improved by using context is in the sphere of fighting financial crimes such as transaction fraud, money laundering and credit fraud. All these white-collar crimes have one common thread: they rely on networks of criminals to move funds and cover their tracks. However, the fundamental problem with most ML methods currently in use to detect these behaviours is that they rely on flat data structures and tables which reduce or eliminate relationship information – one of the most predictive indicators of behaviour.
An easy way to improve our ML results is by using graph features (predictive elements based on relationships which graphs represent) because it does not require that we change our current machine learning workflow. There are a few different methods for employing graph features.
One method is query-based feature engineering, which draws on labels or inferred relationships. This method is useful when we know what we’re trying to find, such as identifying how many known fraudsters are in somebody’s network. We can imagine that, during an online transaction and approval process, we might want to score an applicant by the number and distance of fraudsters (or any predictive behaviour) in their known connections.
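The scoring idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fraud_proximity_features`, the `max_hops` cut-off, and the toy account network are all hypothetical, and a production system would more likely run an equivalent query inside a graph database. The sketch uses a breadth-first search to count known fraudsters within a few hops of an applicant and to report the distance to the nearest one.

```python
from collections import deque

def fraud_proximity_features(graph, applicant, known_fraudsters, max_hops=3):
    """Return (number of known fraudsters within max_hops, hops to the nearest one or None)."""
    distances = {applicant: 0}
    queue = deque([applicant])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    hits = [d for n, d in distances.items()
            if n in known_fraudsters and n != applicant]
    return len(hits), (min(hits) if hits else None)

# Hypothetical toy network: accounts linked by transfers or shared details.
toy_graph = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice"],
    "dave": ["bob", "erin"],
    "erin": ["dave"],
}
count, nearest = fraud_proximity_features(toy_graph, "alice", {"dave", "erin"})
print(count, nearest)  # 2 2
```

Both numbers can be appended as extra columns to an existing feature table, which is why this approach leaves the rest of the ML workflow unchanged.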
Another method uses graph algorithms to engineer features in an unsupervised fashion where we know the general structure we want, but not the exact pattern. For instance, graph algorithms simplify finding the node that reaches every other node the fastest, or quickly identifying community groupings.
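As a rough illustration of "the node that reaches every other node the fastest", the sketch below computes closeness centrality with plain breadth-first search. The article does not name a specific algorithm, so closeness is an assumption here, as are the function name `closeness_scores` and the toy graph. It also assumes an unweighted graph in which every node appears as a key of the adjacency dictionary.

```python
from collections import deque

def closeness_scores(graph):
    """Closeness per node: (number of reachable nodes) / (sum of hop distances to them)."""
    scores = {}
    for source in graph:
        distances = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in distances:
                    distances[neighbor] = distances[node] + 1
                    queue.append(neighbor)
        total = sum(distances.values())
        scores[source] = (len(distances) - 1) / total if total else 0.0
    return scores

# Hypothetical toy graph; every node is listed as a key.
toy_graph = {
    "hub": ["a", "b", "c"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "d"],
    "d": ["c"],
}
scores = closeness_scores(toy_graph)
most_central = max(scores, key=scores.get)
```

Here `most_central` is the node with the smallest average distance to the others, and the per-node score can serve directly as a model feature.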
Graph community detection algorithms measure community cohesion based on relationships and use those scores as features that improve predictions about financial crime. For example, it’s been noted that isolated islands of tight interactions are more indicative of certain types of fraud such as money laundering.
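To make the "isolated islands" idea concrete, here is a deliberately simplified sketch. It uses connected components as a stand-in for a real community detection algorithm (it is not Louvain or label propagation), and it measures cohesion as edge density, that is, the share of possible links inside an island that actually exist. The function name `island_features` and the toy edge list are hypothetical, and the graph is assumed to be undirected with no self-loops.

```python
def island_features(edges):
    """Map each node to (island size, island density) using connected components."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    features = {}
    seen = set()
    for start in adjacency:
        if start in seen:
            continue
        # Collect one connected component with an iterative depth-first search.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component

        size = len(component)
        internal_edges = sum(len(adjacency[n]) for n in component) // 2
        possible_edges = size * (size - 1) // 2
        density = internal_edges / possible_edges if possible_edges else 0.0
        for node in component:
            features[node] = (size, density)
    return features

# Hypothetical toy data: a tight triangle and a loose chain.
toy_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("x", "y"), ("y", "z"), ("z", "w"),
]
features = island_features(toy_edges)
```

In this toy data the triangle gets a density of 1.0 and the chain 0.5, so a model could learn that small, fully connected islands deserve a closer look.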
Even when we have a good set of features, we might use algorithms like PageRank to identify the features with the most influence as a form of feature selection. This helps prevent overfitting an AI system to a training data set by revealing which features can be safely discarded, making our model more applicable to more real-world data. In this scenario, perhaps a feature based on the type or number of unsecured loans turns out to be less relevant. We wouldn’t want to include this feature in our model, as it might decrease our accuracy by causing the model to miss new fraud that doesn’t involve unsecured loans.
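The article does not say how PageRank is applied to features, so the following is one possible reading rather than the author's method: treat each feature as a node, connect features with weights that express how strongly they relate (the weights and feature names below are invented for illustration), and rank the features with a small power-iteration PageRank. It assumes every neighbour also appears as a key of the dictionary.

```python
def pagerank(weights, damping=0.85, iterations=100):
    """Power-iteration PageRank on a weighted graph given as {node: {neighbor: weight}}."""
    nodes = list(weights)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    out_total = {node: sum(weights[node].values()) for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            if out_total[node] == 0:
                # Dangling node: spread its rank evenly over all nodes.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
                continue
            for neighbor, weight in weights[node].items():
                new_rank[neighbor] += damping * rank[node] * weight / out_total[node]
        rank = new_rank
    return rank

# Hypothetical feature graph; weights stand in for a measured association strength.
feature_graph = {
    "fraud_hops": {"island_density": 0.8, "shared_devices": 0.7, "unsecured_loans": 0.1},
    "island_density": {"fraud_hops": 0.8, "shared_devices": 0.6},
    "shared_devices": {"fraud_hops": 0.7, "island_density": 0.6},
    "unsecured_loans": {"fraud_hops": 0.1},
}
ranks = pagerank(feature_graph)
ranked = sorted(ranks, key=ranks.get, reverse=True)
```

With these made-up weights the weakly connected `unsecured_loans` feature lands at the bottom of `ranked`, which would make it a candidate for removal. In practice the ranking is only as meaningful as the weights fed into it.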
Connected feature engineering is used in many industries and has been particularly helpful for investigating financial crimes like fraud and money laundering. In these scenarios, criminals often try to hide activities through multiple layers of obfuscation and network relationships. This is where graph feature engineering adds relationship data to increase accuracy, precision and recall.
Perhaps most impressively, the International Consortium of Investigative Journalists used feature engineering and graph technology to uncover the complicated web we now know as the Panama Papers. Three years on from the 2016 investigation, more than $1.2 billion has been recovered by governments around the world.
AI must be both highly accurate and flexible as a baseline for robustness and trustworthiness. We know that AI systems and ML models become more correct and more broadly applicable once related context, adjacent information, and defining relationships are added. We also gain efficiencies by employing these predictive elements in their native format (relationships!) to process data and iterate on our experiments more quickly.