Image analysis and machine learning entered the medical field decades ago. Such tools and other artificial intelligence applications can analyze large amounts of data and assist in patient diagnosis and even in research, but they also have limitations.

The use of artificial intelligence (AI) in various domains of medicine began when the field of AI was still in its infancy, or at least in its adolescent stage. As far back as the 1970s, researchers began to integrate machine learning - a subfield of artificial intelligence - into the study of medical problems. Automatic image processing, which also belongs to the world of AI, has been used in computer-assisted diagnostics since the 1980s. These technologies and other applications of AI, which have developed and become more sophisticated over the past decades, are now integrated into various domains of medical science.

Artificial intelligence is an efficient tool in many tasks. A robot and a human shake hands | Andrey_Popov, Shutterstock

Deciphering Medical Imagery

Medical diagnostics often rely on the analysis and interpretation of images from medical imaging diagnostic tools, such as X-ray, CT scans and MRIs. The diagnosis is performed by trained doctors who specialize in both deciphering these images and in deriving precise medical conclusions. In other words, the analysis of medical imaging and the subsequent diagnostic processes necessitate the involvement of professionals who have undergone extensive training.

In recent years, technological developments in the field of machine learning have been introduced into medical imaging analysis, with impressive outcomes. Early diagnosis of breast cancer, for example, is performed using an X-ray image of the breast tissue, called a mammogram. Artificial intelligence algorithms succeed in identifying suspicious regions in mammogram images, thus helping doctors make accurate analyses and diagnoses. In experiments where doctors used AI to diagnose lung cancer, they achieved a lower error rate when the AI indicated suspicious regions in the radiological images. In the field of pathology too, where doctors examine tissue samples from patients’ bodies, computerized analysis has proven highly beneficial. In the past, doctors examined tissue samples under a microscope; today they scan the tissues digitally, and computerized processing of the information in the images facilitates the diagnostic process.

Disease diagnosis is not confined to static images; videos also play a pivotal role. Cardiovascular conditions, for example, are diagnosed with the help of echocardiography videos that document cardiac activity. Artificial intelligence algorithms that analyze echocardiograms can discern subtle differences between cardiac cycles, measure cardiac functions, and provide early warning of potential cardiac failure.

Artificial intelligence also proves invaluable in real-time medical procedures. Colonoscopy, for example, is an invasive procedure in which a tiny camera is inserted into the lower digestive system to examine its walls and detect lesions. During this procedure, minute samples of intestinal tissue can be extracted from suspicious areas for later inspection. Computerized assistance systems operating during the examination are able to identify particularly small lesions, which can easily evade the human eye and prove challenging for doctors to detect.

Geoffrey Hinton, one of the founding fathers of modern AI, was ahead of his time in the faith he placed in its abilities. At a conference in 2016, Hinton called for an end to the training of radiologists. It was inevitable, he argued, that AI would surpass the abilities of radiologists within 5 to 10 years. Later, Hinton revised his stance, suggesting that dismissing radiologists entirely is unwise and that it would be better to aim for a collaborative future in which artificial intelligence aids radiologists in image interpretation.

Artificial intelligence assists doctors in diagnosing diseases based on medical imaging. A medical scan of a patient with lung disease | Egor Kulinich, Shutterstock

Not So Fast  

For an AI-based system to effectively analyze regions of interest on a medical image, it must undergo a learning process from a multitude of images that have been previously processed, annotated, and classified by human professionals. This learning process is referred to as “training”. Artificial intelligence algorithms rely on databases of annotated images to identify underlying patterns or rules that can be extrapolated and applied to new images.

Different medical centers vary in several aspects: the equipment employed for imaging, the environmental conditions during image capture, the post-classification image annotation performed by doctors, and the unique characteristics of their patient demographics. These variations can influence the algorithm’s capability for generalization. An algorithm trained on images from one medical institution may not necessarily interpret images created in a different medical institution with the same degree of accuracy. Similarly, if an algorithm is trained on a certain demographic, its generalization capabilities may falter when it is introduced to different populations from another hospital. Furthermore, the annotations standardized within a particular medical center can also impact the algorithm’s efficiency. For instance, an algorithm trained to detect skin cancer on a dataset in which a particular scale marker appears in images of malignant tumors may rely too heavily on that marker, which is present only in that dataset, potentially leading to “shortcut” learning. This form of learning compromises the algorithm’s adaptability when presented with data of a slightly different nature. Effectively integrating AI tools into a medical facility often necessitates retraining or customizing the algorithm to fit that specific environment. To this end, the facility must provide a vast, meticulously classified and annotated dataset that is as free as possible from unnecessary additions.
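The shortcut effect described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the two binary features (`irregular_border` and `has_scale_marker`), the hand-built samples for the two sites, and the one-feature "decision stump" learner, which stands in for a real image model.

```python
# Toy illustration of "shortcut" learning. All data and names are hypothetical.
# Each sample is (irregular_border, has_scale_marker, label); label 1 = malignant.

def train_stump(samples):
    """Return the index of the single feature that best matches the labels."""
    n_features = len(samples[0]) - 1
    best_feature, best_acc = 0, -1.0
    for f in range(n_features):
        acc = accuracy(f, samples)
        if acc > best_acc:
            best_feature, best_acc = f, acc
    return best_feature

def accuracy(feature, samples):
    """Fraction of samples whose chosen feature value equals the label."""
    return sum(1 for s in samples if s[feature] == s[-1]) / len(samples)

# Site A: a scale marker happens to appear in every malignant image.
site_a = [
    (1, 1, 1), (1, 1, 1), (0, 1, 1), (1, 1, 1),
    (0, 0, 0), (0, 0, 0), (1, 0, 0), (0, 0, 0),
]
# Site B: same lesions, but nobody adds a scale marker.
site_b = [
    (1, 0, 1), (1, 0, 1), (0, 0, 1), (1, 0, 1),
    (0, 0, 0), (0, 0, 0), (1, 0, 0), (0, 0, 0),
]

chosen = train_stump(site_a)          # the learner latches onto the marker
acc_a = accuracy(chosen, site_a)      # looks perfect at the training site
acc_b = accuracy(chosen, site_b)      # collapses to chance at the new site
acc_b_border = accuracy(0, site_b)    # the medically meaningful feature holds up

print(chosen, acc_a, acc_b, acc_b_border)
```

In this constructed example the learner picks the marker (feature 1), scoring 1.0 at site A but only 0.5 at site B, while the border feature it ignored would have scored 0.75 at site B.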

It's crucial to note that in such classification tasks, machine learning does not generate any novel insights: it merely categorizes based on patterns identified from samples that were already classified by medical experts. Furthermore, while physicians can explain their diagnosis rationale, AI often operates as a “black box”, making it challenging to decipher the underlying reasons for the algorithms' decisions.


During the training process, artificial intelligence relies on a vast database of existing classifications to generalize to new cases. A robot reads a book next to piles of books | Vasilyev Alexandr, Shutterstock

A Helpmate

Researcher and neurosurgeon Antonio Di Ieva speculated a few years ago that machines would not replace doctors; instead, physicians aided by AI would supersede those unwilling to adopt it. Even if we don’t entrust the entirety of diagnostic tasks to AI, it can handle specific sub-tasks, serving as an efficient auxiliary tool and increasing the efficacy of medical professionals.

Medical image processing encompasses several stages of varying complexity. One of them involves cleaning the image data of unavoidable background noise. By filtering out this noise, machine learning algorithms can enhance medical images and raise their resolution. Once the image is ready for interpretation, one of the tasks involves identifying areas of interest. This division of the image into regions of interest is termed ‘segmentation’. Mapping of cardiac activity, for example, requires demarcation of the different parts of the heart, including the atria and ventricles, in order to assess their functioning. Segmentation by means of AI allows demarcation not only at the anatomical level - of organs, such as the sections of the heart - but also at the level of the tissues observable in microscopic images. Artificial intelligence applications can, for example, analyze a microscopic image and demarcate areas of cancerous tissue within it.
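To make the idea of segmentation concrete, here is a minimal sketch that uses a far simpler technique than the trained models discussed in this article: fixed-threshold segmentation followed by connected-region labeling. The tiny intensity grid, the threshold of 50, and the function name `segment` are all assumptions chosen for illustration.

```python
from collections import deque

def segment(image, threshold):
    """Label 4-connected regions of pixels with intensity >= threshold.

    Returns (labels, count): labels has the same shape as image, with 0 for
    background and 1..count for each separate region.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] < threshold or labels[r][c] != 0:
                continue  # background pixel, or already part of a region
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of the new region
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] >= threshold
                            and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count

# Hypothetical 4x5 "scan": bright values stand for tissue of interest.
image = [
    [10, 12, 11, 10, 10],
    [11, 90, 95, 10, 12],
    [10, 92, 11, 10, 80],
    [12, 10, 10, 85, 88],
]
labels, n_regions = segment(image, threshold=50)
for row in labels:
    print(row)
```

On this grid the sketch finds two separate regions, one in the upper left and one in the lower right. A learned segmentation model replaces the fixed threshold with patterns inferred from annotated examples, but its output has the same form: a map assigning each pixel to a region.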

Subsequent to the demarcation and identification of lesions, the task shifts to classification. The algorithms are trained to classify the lesion and determine whether it is benign or malignant, and in the latter case, to identify the type of cancer. Presently, algorithms are available that have been trained to analyze images of skin malignancies and classify them into different types of skin cancer, at different levels of risk.

Each additional annotation on the image, other than the biopsy or the medical image itself, may lead to incorrect training of the algorithm. A diagnostic image of coronary heart disease | kalewa, Shutterstock



Predicting the Future

A particularly complex and important task is that of prediction, such as assessing the patient’s risk level based on the diagnosis or determining the progression rate of the disease. Researchers are trying to develop AI tools capable of making such predictions. For example, in one study, researchers managed to train an AI to analyze CT images and determine in advance which patients were at risk of developing lung cancer, one to six years before disease onset. The researchers found that the algorithm identifies characteristics known to be early indicators of lung cancer. In breast cancer too, there are early indicators that can help predict a high risk of developing the disease in the future.

However, medical information does not come only in the form of diagnostic images, but also as indices from blood tests, such as blood glucose levels and red blood cell counts; as genetic information extracted from DNA tests; as types and quantities of different bacteria in our intestines, identified using microbiome tests, and more. In a study by the Israeli health maintenance organization (HMO), ‘Clalit’, researchers developed a predictive model that identifies patients at increased risk of developing chronic kidney failure within a few years. Research is under way for developing predictions for additional conditions, such as cardiovascular diseases and diabetes.

The significance of such tools is profound: they could potentially provide us with advance warning of diseases that a particular patient may suffer from, allowing for better preparation, preventive measures, and as effective treatment as possible. Nevertheless, it is essential to note that advance prediction is mainly relevant for diseases that develop slowly and gradually. This limitation narrows the range of diseases for which these tools hold relevance.


An automated gaming machine offering a range of “predictions” for the future pulls one at random when activated. In contrast, a successful prediction model provides us with a much better prediction than a random guess. A Zoltar fortune teller machine | Darryl Brooks, Shutterstock



A Complex Puzzle

The more technology advances, the more factors we are able to measure in the patient’s body with a higher level of accuracy. While it is difficult for humans to grasp the relationship between numerous variables, a machine is indifferent to the number of different variables it handles. Various studies utilize computational capabilities to analyze large amounts of diverse data, identifying complex relationships that hint at different health conditions. For example, certain studies have found a link between characteristics of the retina, which can be identified in images, and kidney disease, cardiac conditions and Alzheimer’s disease.

But we must understand that detecting such intricate relationships still does not provide us with a deep understanding of the biological system, let alone a solution or cure for the disease. The use of machine learning allows us to consider multiple variables and dissect large data sets, but to develop a healing method, the root cause of the problem must be understood.


What’s The Source Of The Information?

For an AI model to efficiently classify medical data and provide a reliable prediction of disease progression in a patient, it must receive accurate, unbiased information. Biases are omnipresent, starting from the equipment used to obtain the sample, through the individual processing the sample, to the cohort from which the sample is drawn. A significant bias that characterizes medical data originating from hospitals and medical centers is that those seeking a physician’s consultation often aren't at the peak of health. Most of us don’t frequent medical establishments for tests, especially when it concerns rare diseases. There are diseases for which the awareness is higher, such as breast cancer, benefitting from annual campaigns that urge the population to get screened. Yet, even in such cases, individuals less inclined towards regular check-ups are less likely to participate. This may lead to an underrepresentation of economically disadvantaged populations. The lack of balanced information becomes more pronounced when dealing with rarer diseases. To make up for this shortfall, several initiatives compile long-span medical data from the most diverse and random volunteer groups possible. The British database UK Biobank contains data that has been collected from half a million volunteers since 2006. In the US, an initiative known as the “All of Us Research Program”, managed by the National Institutes of Health (NIH), aims to create a large and diverse database of health information, gathering health data, genetic information and more from one million or more participants across the US. 


A collection of different indices from numerous individuals enables a search for intricate relationships within medical phenomena. An illustration to demonstrate the integration of technology and data analysis for the benefit of medicine | ArtemisDiana, Shutterstock

Each Snowflake is Unique

The collection and analysis of data from the general population isn’t enough to provide answers for individual patients. Each individual has unique health metrics that are considered to be in the normal range when they are in good health. A deviation from this personal norm could suggest a medical anomaly or condition. The inherent variability within the population, stemming from differences in the metrics between individuals, can be very large. For instance, a value in a patient’s blood test may appear anomalous compared to the wider population, yet it might be entirely typical for that individual’s usual healthy state. On the flip side, a result that seems ordinary when compared to a population’s average range might be notably atypical for a specific patient. In other words, the high variation in population metrics has the potential to mask significant deviations in the state of health of individual patients.

To detect an anomalous change in a patient’s metrics, it’s essential to monitor them over time and detect deviations from their personal norms. The long-term collection of personal data is a challenge in and of itself, which can be met in countries like Israel, where nearly every citizen belongs to an HMO from birth and medical records are collected over time. However, this is not the case in most countries. Wearable technologies, such as smartwatches, allow detection of deviations in the state of health of the person wearing them. Commercial smartwatches equipped with features like a heart rate monitor, an oximeter (measuring oxygen levels in blood), and more, allow the monitoring of different personal metrics and have in the past alerted wearers to abnormal readings, enabling critical and potentially life-saving medical diagnoses and interventions.
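The difference between a population range and a personal norm can be sketched in a few lines. This is only an illustration under stated assumptions: the resting heart rate readings are invented, the 60 to 100 beats-per-minute reference range is used as an illustrative population range, and the rule of flagging readings more than three standard deviations from the personal mean is one simple choice among many, not a clinical method.

```python
from statistics import mean, stdev

def deviates_from_personal_norm(history, new_value, z_limit=3.0):
    """Flag new_value if it lies more than z_limit sample standard deviations
    from the mean of this person's own earlier readings."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return new_value != mu
    return abs(new_value - mu) / sigma > z_limit

def outside_population_range(value, low, high):
    """Flag value if it falls outside a population-wide reference range."""
    return value < low or value > high

# Hypothetical resting heart rates in beats per minute.
POPULATION_RANGE = (60, 100)               # illustrative reference range (assumption)
history = [52, 54, 53, 55, 52, 54, 53]     # this person's usual healthy readings
new_reading = 75                           # today's reading

# A usual reading looks abnormal by population standards but normal personally.
usual_vs_population = outside_population_range(53, *POPULATION_RANGE)
usual_vs_personal = deviates_from_personal_norm(history, 53)

# Today's reading looks normal by population standards but abnormal personally.
new_vs_population = outside_population_range(new_reading, *POPULATION_RANGE)
new_vs_personal = deviates_from_personal_norm(history, new_reading)

print(usual_vs_population, usual_vs_personal, new_vs_population, new_vs_personal)
```

For this invented person, a reading of 53 is flagged by the population range but not by the personal baseline, while a reading of 75 passes the population check and is flagged only when compared with the person's own history.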

The vast variability in the population does not reflect the characteristics of a particular patient. Spotlight on a few individuals amidst a crowd | Shutterstock

Humans vs. Machines

In addition to the diagnosis of medical conditions, AI is employed in many interface fields where medical science meets technology. These fields include the development of medical equipment, drug research and development, and theoretical problems such as predicting protein structures.

Alongside the advantages of machine learning, it’s also crucial to recognize its limitations. For instance, if the AI learning process relies on existing medical data - such as identifying a certain tissue structure as indicative of a malignancy - the classification performed by the machine can save precious time and expedite the medical process, but it doesn’t contribute any new medical knowledge. Moreover, the assimilation of AI tools into the medical framework is not straightforward, demanding investment of significant resources and effort. Furthermore, while machine learning allows us to analyze vast data sets, trace numerous variables, discern complex interrelations among them, and make predictions regarding a patient’s future state of health, it cannot provide an explanation for the underlying mechanisms behind these relations. To truly address and cure a medical issue, comprehending the mechanism causing it is essential. Artificial intelligence is an excellent tool that enhances our predictive, diagnostic and research capabilities, but it cannot replace human research and reasoning.