Sunday, February 25, 2018

#AI and health diagnostics. "Reproducibility," anyone?

From Ars Technica:

AI trained to spot heart disease risks using retina scan
The blood vessels in the eye reflect the state of the whole circulatory system.


The idea behind using a neural network for image recognition is that you don't have to tell it what to look for in an image. You don't even need to care about what it looks for. With enough training, the neural network should be able to pick out details that allow it to make accurate identifications.

For things like figuring out whether there's a cat in an image, neural networks don't provide much, if any, advantages over the actual neurons in our visual system. But where they can potentially shine are cases where we don't know what to look for. There are cases where images may provide subtle information that a human doesn't understand how to read, but a neural network could pick up on with the appropriate training.

Now, researchers have done just that, getting a deep-learning algorithm to identify risks of heart disease using an image of a patient's retina.

The idea isn't quite as nuts as it might sound. The retina has a rich collection of blood vessels, and it's possible to detect issues in those that also affect the circulatory system as a whole; things like high levels of cholesterol or elevated blood pressure leave a mark on the eye. So, a research team consisting of people at Google and Verily Life Sciences decided to see just how well a deep-learning network could do at figuring those out from retinal images.

To train the network, they used a total of nearly 300,000 patient images tagged with information relevant to heart disease like age, smoking status, blood pressure, and BMI. Once trained, the system was set loose on another 13,000 images to see how it did.
Simply by looking at the retinal images, the algorithm was typically able to get within 3.5 years of a patient's actual age. It also did well at estimating the patient's blood pressure and body mass index. Given those successes, the team then trained a similar network to use the images to estimate the risk of a major cardiac problem within the next five years. It ended up having similar performance to a calculation that used many of the factors mentioned above to estimate cardiac risk—but the algorithm did it all from an image, rather than some tests and a detailed questionnaire.

The neat thing about this work is that the algorithm was set up so it could report back what it was focusing on in order to make its diagnoses. For things like age, smoking status, and blood pressure, the software focused on features of the blood vessels. Training it to predict gender ended up causing it to focus on specific features scattered throughout the eye, while body mass index ended up without any obvious focus, suggesting there are signals of BMI spread throughout the retina…
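For the curious, the basic recipe is sketchable in a few dozen lines: train a convolutional network to regress risk factors straight from retinal images, then ask it which pixels mattered. Below is a minimal toy sketch in PyTorch, assuming dummy data and an invented tiny network (this is emphatically not the Google/Verily model, and I've substituted a simple gradient saliency map for the paper's soft-attention technique):

```python
# Toy sketch only: a tiny CNN regressing cardiac risk factors (age, systolic BP,
# BMI) from images, plus a gradient saliency map. Dummy random data stands in
# for the ~300,000 labeled retinal photographs used in the actual study.
import torch
import torch.nn as nn

class RetinaRegressor(nn.Module):
    def __init__(self, n_targets: int = 3):       # age, systolic BP, BMI
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, n_targets)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

images  = torch.rand(256, 3, 64, 64)               # stand-in "retinal images"
targets = torch.rand(256, 3)                       # stand-in risk-factor labels

model   = RetinaRegressor()
optim   = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.L1Loss()                              # mean absolute error, e.g. "within 3.5 years" for age

for epoch in range(5):                             # a real model trains far longer
    optim.zero_grad()
    loss = loss_fn(model(images), targets)
    loss.backward()
    optim.step()

# Gradient saliency: which pixels most influence the "age" output?
test_img = torch.rand(1, 3, 64, 64, requires_grad=True)
model(test_img)[0, 0].backward()                   # output index 0 = age in this sketch
saliency = test_img.grad.abs().max(dim=1).values   # (1, 64, 64) heat map over the image
print("training MAE on dummy data:", loss.item())
print("saliency map shape:", tuple(saliency.shape))
```
The study's actual attention technique is more sophisticated than a raw gradient map, but the idea that a network can be made to show where it is looking is the same.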
OK, I'm all for reliable, accurate tech dx assistance. But, from my latest (paywalled) issue of Science Magazine:

Last year, computer scientists at the University of Montreal (U of M) in Canada were eager to show off a new speech recognition algorithm, and they wanted to compare it to a benchmark, an algorithm from a well-known scientist. The only problem: The benchmark's source code wasn't published. The researchers had to recreate it from the published description. But they couldn't get their version to match the benchmark's claimed performance, says Nan Rosemary Ke, a Ph.D. student in the U of M lab. “We tried for 2 months and we couldn't get anywhere close.”

The booming field of artificial intelligence (AI) is grappling with a replication crisis, much like the ones that have afflicted psychology, medicine, and other fields over the past decade. AI researchers have found it difficult to reproduce many key results, and that is leading to a new conscientiousness about research methods and publication protocols. “I think people outside the field might assume that because we have code, reproducibility is kind of guaranteed,” says Nicolas Rougier, a computational neuroscientist at France's National Institute for Research in Computer Science and Automation in Bordeaux. “Far from it.” Last week, at a meeting of the Association for the Advancement of Artificial Intelligence (AAAI) in New Orleans, Louisiana, reproducibility was on the agenda, with some teams diagnosing the problem—and one laying out tools to mitigate it.


The most basic problem is that researchers often don't share their source code. At the AAAI meeting, Odd Erik Gundersen, a computer scientist at the Norwegian University of Science and Technology in Trondheim, reported the results of a survey of 400 algorithms presented in papers at two top AI conferences in the past few years. He found that only 6% of the presenters shared the algorithm's code. Only a third shared the data they tested their algorithms on, and just half shared “pseudocode”—a limited summary of an algorithm. (In many cases, code is also absent from AI papers published in journals, including Science and Nature.)


Researchers say there are many reasons for the missing details: The code might be a work in progress, owned by a company, or held tightly by a researcher eager to stay ahead of the competition. It might be dependent on other code, itself unpublished. Or it might be that the code is simply lost, on a crashed disk or stolen laptop—what Rougier calls the “my dog ate my program” problem.


Assuming you can get and run the original code, it still might not do what you expect. In the area of AI called machine learning, in which computers derive expertise from experience, the training data for an algorithm can influence its performance. Ke suspects that not knowing the training for the speech-recognition benchmark was what tripped up her group. “There's randomness from one run to another,” she says. You can get “really, really lucky and have one run with a really good number,” she adds. “That's usually what people report.”…
Issues of "proprietary code," "intellectual property," etc? Morever, there's the additional problem, cited in the Science article, that AI applications, by virtue of their "learning" functions, are not strictly "algorithmic." There's a "random walk" aspect, no? Moreover, accuracy of AI results assumes accuracy of the training data. Otherwise, the AI software learns our mistakes.

Years ago, when I was Chair of the ASQ Las Vegas Section, we once had a presentation on the "software life cycle QA" of military fighter jets' avionics at nearby Nellis AFB. That stuff was tightly algorithmic, and was managed with an obsessive beginning-to-end focus on accuracy and reliability.

Reproducibility.

Update: of relevance, from Science Based Medicine: 
Replication is the cornerstone of quality control in science, and so failure to replicate studies is definitely a concern. How big a problem is replication, and what can and should be done about it?

As a technical point, there is a difference between the terms “replication” and “reproduction” although I often see the terms used interchangeably (and I probably have myself). Results are said to be reproducible if you analyse the same data again and get the same results. Results are replicable when you repeat the study to obtain fresh data and get the same results.

There are also different kinds of replication. An exact replication, as the name implies, is an effort to exactly repeat the original study in every detail. But scientists acknowledge that “exact” replications are always approximate. There are always going to be slight differences in the materials used and the methodology...
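A toy way to keep the two terms straight (my example, not theirs): reproduction is re-running the same analysis on the same data and getting the identical number; replication is drawing fresh data and getting a consistent one.

```python
# Reproduce vs. replicate, in miniature. The "analysis" here is just a mean;
# a real pipeline would be longer, but the distinction is the same.
import numpy as np

def analysis(data: np.ndarray) -> float:
    return float(np.mean(data))

population_mean, n = 5.0, 200
original = np.random.default_rng(seed=1).normal(population_mean, 2.0, n)

# Reproduction: same data, same code, identical result.
assert analysis(original) == analysis(original)

# Replication: fresh data from the same population, a similar but not identical result.
fresh = np.random.default_rng(seed=2).normal(population_mean, 2.0, n)
print(f"original estimate   : {analysis(original):.3f}")
print(f"replication estimate: {analysis(fresh):.3f}")
```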
From the lab methodology chapter of my 1998 grad school thesis:
The terms “accuracy” and “precision” are not synonyms. The former refers to closeness of agreement with agreed-upon reference standards, while the latter has to do with the extent of variability in repeated measurements. One can be quite precise, and quite precisely wrong. Precision, in a sense, is a necessary but insufficient prerequisite for the demonstration of “accuracy.” Do you hit the “bull’s eye” red center of the target all the time, or are your shots scattered all over? Are they tightly clustered lower left (high precision, poor accuracy), or widely scattered lower left (poor precision, poor accuracy)? In an analytical laboratory, the “accuracy” of production results cannot be directly determined; it is necessarily inferred from the results of quality control (“QC”) data. If the lab does not keep ongoing, meticulous (and expensive) QC records of the performance histories of all instruments and operators, determination of accuracy and precision is not possible….

A “spike” is a sample containing a “known” concentration of an analyte derived from an “NIST-traceable” reference source of established and optimal purity (NIST is the National Institute of Standards and Technology, official source of all U.S. measurement reference standards). A “matrix blank” is an actual sample specimen “known” to not contain any target analytes. Such quality control samples should be run through the lab production process “blind,” i.e., posing as normal client specimens. Blind testing is the preferred method of quality control assessment, simple in principle but difficult to administer in practice, as lab managers and technicians are usually adept at sniffing out inadequately concealed blinds, which subsequently receive special scrutiny. This is particularly true at certification or contract award time; staffs are typically put on “red alert” when Performance Evaluation samples are certain to arrive in advance of license approvals or contract competitions. Such costly vigilance may be difficult to maintain once the license is on the wall and the contracts signed and filed away…
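The target-shooting analogy translates directly into numbers. A small simulated illustration (the reference value, error magnitudes, and "readings" below are all invented for the example; NIST traceability is a property of the physical standard, not of anything in code):

```python
# Toy illustration of the accuracy/precision distinction and of spike recovery,
# using simulated instrument readings against a "known" reference value.
import numpy as np

rng = np.random.default_rng(seed=42)
reference = 10.00          # pretend NIST-traceable spike concentration, e.g. mg/L

def summarize(name: str, readings: np.ndarray) -> None:
    bias = readings.mean() - reference               # accuracy: closeness to the reference
    spread = readings.std(ddof=1)                    # precision: repeatability of the "shots"
    recovery = 100.0 * readings.mean() / reference   # percent recovery on the spike
    print(f"{name:30s} bias={bias:+.2f}  sd={spread:.2f}  recovery={recovery:.1f}%")

# Precisely wrong: a tight cluster well off the bull's-eye.
summarize("high precision, poor accuracy", rng.normal(8.0, 0.05, 20))
# Scattered and off-target.
summarize("poor precision, poor accuracy", rng.normal(8.0, 1.50, 20))
# What an ongoing QC program is supposed to demonstrate.
summarize("high precision, high accuracy", rng.normal(10.0, 0.05, 20))
```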
___

#AI developers, take note. Particularly in the health care space. If someone doesn't get their pizza delivery because of AI errors, that's trivial. Miss an exigent clinical dx, and that's another matter entirely.

Related Science Mag article (same issue, Feb. 15th, 2018): "Missing data hinder replication of artificial intelligence studies."

Also tangentially apropos, my November post "Artificial Intelligence and Ethics." And, "Digitech AI news updates."

UPDATE

Also of relevance. A nice long read:


The Coming Software Apocalypse
A small group of programmers wants to change how we code — before catastrophe strikes.


…It’s been said that software is “eating the world.” More and more, critical systems that were once controlled mechanically, or by people, are coming to depend on code. This was perhaps never clearer than in the summer of 2015, when on a single day, United Airlines grounded its fleet because of a problem with its departure-management system; trading was suspended on the New York Stock Exchange after an upgrade; the front page of The Wall Street Journal’s website crashed; and Seattle’s 911 system went down again, this time because a different router failed. The simultaneous failure of so many software systems smelled at first of a coordinated cyberattack. Almost more frightening was the realization, late in the day, that it was just a coincidence.

“When we had electromechanical systems, we used to be able to test them exhaustively,” says Nancy Leveson, a professor of aeronautics and astronautics at the Massachusetts Institute of Technology who has been studying software safety for 35 years. She became known for her report on the Therac-25, a radiation-therapy machine that killed six patients because of a software error. “We used to be able to think through all the things it could do, all the states it could get into.” The electromechanical interlockings that controlled train movements at railroad crossings, for instance, only had so many configurations; a few sheets of paper could describe the whole system, and you could run physical trains against each configuration to see how it would behave. Once you’d built and tested it, you knew exactly what you were dealing with.

Software is different. Just by editing the text in a file somewhere, the same hunk of silicon can become an autopilot or an inventory-control system. This flexibility is software’s miracle, and its curse. Because it can be changed cheaply, software is constantly changed; and because it’s unmoored from anything physical — a program that is a thousand times more complex than another takes up the same actual space — it tends to grow without bound. “The problem,” Leveson wrote in a book, “is that we are attempting to build systems that are beyond our ability to intellectually manage.”…
Read all of it.
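Leveson's "beyond our ability to intellectually manage" point can be made with a few lines of arithmetic. A toy sketch (my model, not hers): a relay-based interlocking with a handful of two-state components has a state space you can enumerate and test exhaustively, while every variable added to a software system multiplies the count until exhaustive testing stops being a plan.

```python
# State-space growth in miniature: n two-state components yield 2**n
# configurations. Small electromechanical systems can be enumerated and tested
# exhaustively; software adds state far faster than testers can keep up.
from itertools import product

def count_states(n_components: int) -> int:
    # Explicitly enumerate every combination of on/off settings.
    return sum(1 for _ in product((0, 1), repeat=n_components))

for n in (4, 10, 20):
    print(f"{n:2d} two-state components -> {count_states(n):,} configurations")

# Past a few dozen components, enumeration is hopeless:
print(f"64 two-state components -> {2**64:,} configurations")
```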
__

OPEN SOURCE TO THE RESCUE?

OpenAI's mission is to build safe AGI, and ensure AGI's benefits are as widely and evenly distributed as possible. We expect AI technologies to be hugely impactful in the short term, but their impact will be outstripped by that of the first AGIs.

We're a non-profit research company. Our full-time staff of 60 researchers and engineers is dedicated to working towards our mission regardless of the opportunities for selfish gain which arise along the way...
Lots of ongoing "Open AI" news here. They're on Twitter here.

UPDATE

From Wired:
Why Artificial Intelligence Researchers Should Be More Paranoid
LIFE HAS GOTTEN more convenient since 2012, when breakthroughs in machine learning triggered the ongoing frenzy of investment in artificial intelligence. Speech recognition works most of the time, for example, and you can unlock the new iPhone with your face.

People with the skills to build such systems have reaped great benefits—they’ve become the most prized of tech workers. But a new report on the downsides of progress in AI warns they need to pay more attention to the heavy moral burdens created by their work.

The 99-page document unspools an unpleasant and sometimes lurid laundry list of malicious uses of artificial-intelligence technology. It calls for urgent and active discussion of how AI technology could be misused. Example scenarios given include cleaning robots being repurposed to assassinate politicians, or criminals launching automated and highly personalized phishing campaigns.

One proposed defense against such scenarios: AI researchers becoming more paranoid, and less open. The report says people and companies working on AI need to think about building safeguards against criminals or attackers into their technology—and even to withhold certain ideas or tools from public release…
We all need to closely read both the article and the 99-page report, starting with its Executive Summary.


I assume that many of you watched the 2018 Winter Olympics. The opening and closing ceremonies, featuring dynamic choreographed drone light shows, were beautiful, amazing.


Now, imagine a huge hostile swarm of small drones, each armed with explosives, target-enabled with the GPS coordinates of the White House and/or Capitol Hill, AI-assisted, remotely "launched" by controllers halfway around the world.

From a distance they might well resemble a large flock of birds. They wouldn't all have to get through.

'eh?

_____________

More to come...
