Overview of AI and ML Primer
[Start of recorded material at 00:00] [CSIRO Senior Research Scientist Dr Bevan Koopman appears prominent on screen]
Machine learning and artificial intelligence have become real buzzwords. So have you ever wondered how to separate real-world advancement from marketing spin? To help you do this in the health space, we’ve put together a report that details twenty-three real-world applications alongside easy-to-understand technical concepts that are crucial to know in this space.
[Animation of Australian e Health Research Centre logo appears on screen]
[Bevan Koopman appears prominent on screen. His name and title briefly appear on screen]
My name is Dr. Bevan Koopman and I’m one of the lead authors on the report. Artificial Intelligence is a somewhat amorphous term. There are many different techniques and algorithms that make up the wide family that is AI, and the boundary can be somewhat subjective. In this primer we constrain our focus on AI to these four different domains.
[A slide is presented. Headline is Our focus domains for AI in Health. Four dot points are displayed and the top one is highlighted. Text reads: Predictive Analytics and Data Driven Intelligence]
[Bevan Koopman appears prominent on screen]
First, predictive analytics and data-driven intelligence. This is concerned with extracting insights from existing data, often from large datasets, where it’s difficult for humans to derive such insights. In data-driven intelligence the intent is for the insights to be bottom-up, i.e., identifying trends and insights from often low-level data.
Second, knowledge representation and reasoning. This is how we represent or classify information about the world in a form that a computer system can utilize to solve complex tasks, enabling us to infer new knowledge. In health care this is typically about representing medical concepts such as diseases, their properties and relationships in a machine-readable and understandable form. In many instances, solving the knowledge representation problem is the key challenge. Once the data is represented in the right form, the problem becomes tractable. That is, it can be processed using compute power in an appropriate time scale.
[The slide reappears. This time the third dot point is highlighted. Text reads: Imaging and Vision]
[Bevan Koopman appears prominent on screen]
Third, imaging and vision. This involves analyzing images or videos to derive insights into the cause and impact of medical conditions. Computer vision and image processing are two areas that have been transformed by new AI methods, particularly deep learning based methods.
[The slide reappears. This time the fourth dot point is highlighted. Text reads: Human Language Understanding]
[Bevan Koopman appears prominent on screen]
Finally, human language understanding, which uses AI methods to better deal with natural language. While we do strive to standardise data and make it machine readable, humans still communicate in natural language and as such, there will always be data in this form. AI methods therefore aim to handle natural language by extracting meaning, by searching, summarising and classifying such language. These are our four focus areas. Now let’s talk about some of the techniques.
[A slide is presented. Headline is Symbolic or Statistical Artificial Intelligence? Three dot points are displayed. Text reads: Symbolic AI, representing human knowledge into known facts or rules; Statistical AI, learning from the underlying data; Historically segregated, health is one domain with successful hybrid approaches]
There are two major branches of AI, symbolic and statistical. Symbolic AI represents human knowledge into known facts or rules.
[Bevan Koopman appears prominent on screen]
These facts or rules can be combined with mathematical logic to undertake verifiable and explainable problem solving. This paradigm of AI often uses an ontology, that is, a collection of concepts with properties, including the relationships between concepts, to describe a particular domain. While less common in other domains, symbolic AI has a specific place in AI in health. This is mainly because in health the domain has had considerable effort to capture and explicitly represent health data in standards such as the SNOMED CT ontology.
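As a rough illustration of the symbolic approach, the Python sketch below encodes a handful of is-a relationships and applies a single logical rule (is-a is transitive) to infer facts that were never stated explicitly. The concept names are simplified placeholders chosen for this example; they are not actual SNOMED CT content, and real ontologies support far richer logic than this.

```python
# Toy ontology: each concept maps to its direct parent concepts ("is-a" links).
# Concept names are illustrative placeholders, not real SNOMED CT codes.
IS_A = {
    "viral pneumonia": ["pneumonia", "viral infection"],
    "pneumonia": ["lung disease"],
    "viral infection": ["infectious disease"],
    "lung disease": ["disease"],
    "infectious disease": ["disease"],
}


def ancestors(concept):
    """Return every concept that `concept` is a kind of, using the rule
    that is-a is transitive (if A is-a B and B is-a C, then A is-a C)."""
    found = set()
    stack = list(IS_A.get(concept, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(IS_A.get(parent, []))
    return found


def is_a(concept, candidate):
    """True if `concept` is the same as, or a kind of, `candidate`."""
    return concept == candidate or candidate in ancestors(concept)


def explain(concept, candidate):
    """Return one chain of is-a links from `concept` to `candidate`,
    or None if no chain exists. The chain is the explanation of the answer."""
    if concept == candidate:
        return [concept]
    for parent in IS_A.get(concept, []):
        chain = explain(parent, candidate)
        if chain is not None:
            return [concept] + chain
    return None
```

Calling `is_a("viral pneumonia", "disease")` returns `True` even though that fact is not stored anywhere, and `explain` returns the chain of stated facts that justifies it, which is the sense in which symbolic reasoning is verifiable and explainable.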
[The slide reappears. Symbolic or Statistical Artificial Intelligence?]
Statistical AI takes the opposite approach. Rather than predefining the knowledge and rules, it learns these from the data itself.
[Bevan Koopman appears prominent on screen]
This approach uses existing data and evidence, along with computational techniques, to extract patterns and insights and thus reason about the world. This process involves training a model using available data. Machine learning and its family of algorithms are the key techniques used in statistical AI. Large increases in the availability of data captured electronically and a huge increase in computing power have driven the growth in statistical AI. In many sectors, statistical approaches have superseded symbolic ones, or the two have remained quite segregated.
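To make "training a model using available data" concrete, here is a minimal Python sketch in which the decision rule is not written by hand but learned from labelled examples. The model is deliberately tiny (a single cut-off value), and the measurements and labels are made up for illustration; they are not clinical data, and the function names are hypothetical.

```python
def train_threshold(values, labels):
    """Learn a cut-off from labelled examples: predict True when value >= cut-off.
    Tries the midpoint between each pair of neighbouring values and keeps
    the one that misclassifies the fewest training examples.
    Returns None if there are fewer than two distinct values."""
    points = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    best_cut, best_errors = None, len(values) + 1
    for cut in candidates:
        errors = sum((v >= cut) != y for v, y in zip(values, labels))
        if errors < best_errors:
            best_cut, best_errors = cut, errors
    return best_cut


# Made-up training data: one measurement per patient and whether
# the condition was present. Not real clinical values.
measurements = [4.8, 5.1, 5.6, 6.0, 6.9, 7.4, 8.2, 9.0]
condition = [False, False, False, False, True, True, True, True]

# "Training": the cut-off comes from the data, not from a predefined rule.
cutoff = train_threshold(measurements, condition)


def predict(value):
    """Apply the learned cut-off to a new, unseen measurement."""
    return value >= cutoff
```

With this toy data the learned cut-off falls between 6.0 and 6.9, so `predict(7.0)` is `True` and `predict(5.0)` is `False`. Changing the training data changes the rule, which is the defining contrast with the symbolic approach.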
[The slide reappears. Symbolic or Statistical Artificial Intelligence]
However, in health, hybrid approaches that utilize symbolic representations with statistical learning have been successful.
AI depends on data, and the quality of the data used to either train AI models or for AI-based analysis will have a direct impact on the quality of outputs and downstream tasks. As many AI approaches are intrinsically linked to the data type, we outline the myriad of different forms that health data can take.
[A slide is presented with a table. Column headings are Data Type, Examples, Formats/Standards and Avg size/patient. Rows are Electronic Health Record, Genomics, Imaging, Administrative and Sensor Data]
This table provides a brief snapshot of the detailed breakdown found in the full report.
[Bevan Koopman appears prominent on screen]
Perhaps the most well known branch of AI is machine learning. This gives computers the ability to learn without being explicitly programmed.
[A slide is presented. Headline is Machine Learning. Three dot points, classification vs regression, supervised vs unsupervised and deep learning]
There are three main techniques for machine learning. Statistical machine learning aims to find types of predictive functions from training data. Reinforcement learning approaches provide AI algorithms with rewards or penalties based on their problem solving performance. And deep learning approaches make use of artificial neural networks.
[Bevan Koopman appears prominent on screen]
There are two main machine learning tasks: classification and regression. Classification involves using a machine learning model to classify some data according to a finite set of categories.
For example, classifying the type of cancer found in a pathology report as breast, lung, pancreatic, etc. Regression, in contrast, involves using a machine learning model to predict a continuous value rather than a category. For example, predicting the length of stay for a patient given their condition.
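The difference between the two tasks can be sketched in a few lines of Python. The first model returns one label from a finite set (classification); the second returns a number on a continuous scale (regression). All report snippets, severity scores and lengths of stay below are made up for illustration, and both models are deliberately simplistic stand-ins (word counting and a straight-line fit), not the methods used in the report.

```python
from collections import Counter

# Made-up examples for illustration only; not real reports or patient data.
REPORTS = [
    ("invasive ductal carcinoma of the breast", "breast"),
    ("breast tissue with lobular carcinoma", "breast"),
    ("adenocarcinoma in the upper lobe of the lung", "lung"),
    ("small cell carcinoma of the lung", "lung"),
]


def train_classifier(examples):
    """Count how often each word appears in each category."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.split())
    return counts


def classify(counts, text):
    """Classification: return one label from the finite set seen in training,
    chosen as the category whose training words best overlap the text."""
    words = text.split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(scores, key=scores.get)


def fit_line(xs, ys):
    """Regression: ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# Made-up data: a hypothetical severity score and length of stay in days.
severity = [1, 2, 3, 4, 5]
stay_days = [2.0, 3.5, 5.0, 6.5, 8.0]

model = train_classifier(REPORTS)
slope, intercept = fit_line(severity, stay_days)


def predict_stay(score):
    """Predict a continuous value (days) rather than a category."""
    return slope * score + intercept
```

Here `classify(model, "tumour in the lung")` can only ever answer with one of the categories it was trained on, whereas `predict_stay(3.5)` can return any value along the fitted line, including ones that never appeared in the training data.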