Computers to Help Diagnose Lymphoma

Group at MIT hopes computer software can help doctors diagnose lymphoma subtypes.

Scientists at MIT’s Computer Science and Artificial Intelligence Laboratory envision a future where your doctor can plug your test results and medical history into a computer program to help pinpoint your specific diagnosis.

Artificial Intelligence - Aiming to Learn from Medical Reports

Lymphomas fall into two main types: Hodgkin lymphoma and non-Hodgkin lymphoma (NHL). Both categories contain many variations and subtypes. This is especially true of NHL, a term that simply covers every lymphoma that is not Hodgkin lymphoma.

According to The Leukemia & Lymphoma Society, there are actually more than 70 different kinds of lymphoma.

Though they are all cancers of lymphocytes, a type of white blood cell, they behave differently and often have different treatment options and outcomes. The advantage of computers, according to MIT researchers, comes in spotting subtle variations in tumor types, which may require sifting through hundreds of previous cases.

According to the group’s director, a co-author of a paper in the Journal of the American Medical Informatics Association, between 5 and 15 percent of lymphoma cases are initially misclassified, which can stand in the way of developing the best treatment plan.

The team at MIT is working on a software system that “learns” from existing medical reports. For a new case, it would generate the diagnoses its model deems most likely, giving the doctor a short list of top candidates to review and consider.

In their recently published paper, the team describes how the software might be used to identify the different types of lymphoma.

How It Works

The approach under way at MIT relies in part on a relatively recent development: the availability of vast sets of electronic medical records. The software sifts through a huge number of pathology reports, extracting the information that the World Health Organization’s classification system uses to define the subtypes of the cancer.

The system also identifies words and phrases commonly used in the medical records and associates them with a particular case. For example, if the words “large atypical cells” appear in a report, this is noted and linked to the record. Statements about the cancer’s immune-related characteristics (differences in which receptors the cells carry on their surface), such as “express CD30,” are registered and linked to the record in the same way, providing an additional layer of information.
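As a rough illustration of linking phrases to records, the sketch below uses plain substring matching. It is not the MIT team's method, which the article does not describe in detail. The phrase list, the record IDs, and the report texts are all made up for the example.

```python
import re

# Hypothetical phrase list for illustration; the vocabulary the real
# system uses is not given in the article.
PHRASES = ["large atypical cells", "express CD30"]


def extract_features(report_text, phrases=PHRASES):
    """Return the set of known phrases found in a report (case-insensitive)."""
    # Collapse runs of whitespace so line breaks do not hide a phrase.
    normalized = re.sub(r"\s+", " ", report_text.lower())
    return {p for p in phrases if p.lower() in normalized}


def index_reports(reports):
    """Map each record ID to the set of phrases found in its report text."""
    return {record_id: extract_features(text) for record_id, text in reports.items()}


# Invented report snippets, keyed by invented record IDs.
reports = {
    "case-001": "Sections show large atypical cells. The tumor cells express CD30.",
    "case-002": "Small lymphocytes without atypia.",
}
index = index_reports(reports)
```

Here `index["case-001"]` holds both phrases and `index["case-002"]` is empty. A real system would need far more than exact matching, since the same finding can be worded in many ways.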

The MIT system can then take a new set of test results and compare it to the existing data on a number of levels. This computation and analysis provides clinicians with medical data as well as written descriptions of similar cases, which may help to pinpoint the types of lymphoma a person is most likely to have.
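One simple way to picture the comparison step is to score how much a new case's findings overlap with each stored case and list the closest matches. The sketch below uses Jaccard similarity on sets of phrases. This is a stand-in chosen for clarity, not the tensor factorization the MIT paper describes, and the case IDs, findings, and "subtype" labels are placeholders.

```python
def jaccard(a, b):
    """Jaccard similarity: size of the overlap divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_similar_cases(new_features, known_cases, top_n=3):
    """Return up to top_n (case_id, diagnosis, score) tuples, most similar first."""
    scored = [
        (case_id, diagnosis, jaccard(new_features, features))
        for case_id, (diagnosis, features) in known_cases.items()
    ]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:top_n]


# Made-up records for illustration only; the diagnoses are placeholders.
known_cases = {
    "case-001": ("subtype A", {"large atypical cells", "express CD30"}),
    "case-002": ("subtype B", {"small lymphocytes"}),
    "case-003": ("subtype A", {"large atypical cells"}),
}
new_case = {"large atypical cells", "express CD30"}
ranking = rank_similar_cases(new_case, known_cases, top_n=2)
```

In this toy data the new case matches `case-001` exactly and shares one of two findings with `case-003`, so those two head the list a clinician would review.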

Still Some Way to Go

This is still pure research, and many details must be worked out before the approach can be used in practice. In their research paper, the MIT team points out that the whole process is what you might call “computationally intensive.” One step, the tensor factorization, takes 22 minutes on average on a computer with an Intel Core 2 Duo P8600 processor and 8 GB of RAM.

Other steps, including processing the clinical documents, also take a considerable amount of time.

The team also raises what it calls parsing challenges: computers struggle with language that human beings handle intuitively. A doctor can read another doctor’s discharge summary and know exactly what was meant, but the MIT system has trouble when clinical notes use informal or nonstandard constructions. For instance, a doctor writing for speed or brevity may omit small connecting words such as conjunctions, articles, and prepositions, and their absence can confuse the computer system for now.

Sources on Computers for Lymphoma Diagnosis

MIT News: How a computer can help your doctor better diagnose cancer. Accessed April 2015.

JAMIA: Subgraph Augmented Non-Negative Tensor Factorization (SANTF) for Modeling Clinical Narrative Text. Accessed April 2015.

LLS: New Partnership Launches to Empower People Diagnosed with Lymphoma. Accessed April 2015.
