
Tuesday, May 23, 2017

"Assuming / Despite / If / Then / Therefore / Else..." Could AI do "argument analysis?"


When I was a kid in grade school, back prior to indoor plumbing, it was just broadly referred to as "reading comprehension" -- "What was the author's main point? Did she provide good evidence for her point of view? Do you agree or disagree with the author's conclusion? Why? Explain..."

The oral equivalent was taught in "debate teams" prep. #NLP

Now along comes the area of "AI" (Artificial Intelligence) technology R&D known by the top-level acronym "NLP" (Natural Language Processing). We see increasing discourse on developments in "Machine Learning," "Deep Learning," "Natural Language Generation" (NLG), and "Natural Language Understanding" (NLU).
There's been a good bit of chatter of late in the Health IT news about the asserted utility of NLP. See here as well.
I am interested in particular in the latter (NLU), most specifically as it pertains to rational "argumentative" discourse (mostly of the written type). "Critical Thinking," for example, comes to mind (I was lucky to get to teach it for a number of years as an adjunct faculty member). I was subsequently accorded the opportunity to teach a graduate seminar in the higher-level "Argument Analysis."

From my grad seminar syllabus:
We focus on effective analysis and evaluation of arguments in ordinary language. The "analysis" part involves the process of getting at what is truly being argued by a proponent of a position on an issue. Only once we have done that can we begin to accurately assess the relative merits of a proposition—the "evaluation" phase. These skills are essential to grasp if we are to become honest and constructive contributors to debate and the resolution of issues.

Our 24/7 global communications civilization is awash in arguments ranging from the trivial to grand themes of moral import. Advocates of every stripe and theme pepper us relentlessly with persuasion messages ranging from the "short and sweet" to the dense and inscrutable. We have more to consider and evaluate than time permits, so we must prioritize. This we often do by making precipitous snap judgments—"Ready-Shoot-Aim"—which then frequently calcify into prejudice. The sophistication and nuance of language enables a savvy partisan to entice us into buying into an argument perhaps not well supported by the facts and logic...
I first encountered "Argument Analysis" in the fall of 1994 as an "Ethics & Policy Studies" graduate student myself. I chose for my first semester paper an analytic deconstruction of the PNHP 1994 JAMA paper "A Better-Quality Alternative: Single-Payer National Health System Reform."

The two opening paragraphs:
MANY MISCONSTRUE US health system reform options by presuming that "trade-offs" are needed to counter-balance the competing goals of increasing access, containing costs, and preserving quality. Standing as an apparent paradox to this zero-sum equation are countries such as Canada that ensure access to all at a cost 40% per capita less, with satisfaction and outcomes as good as or better than those in the United States. While the efficiencies of a single-payer universal program are widely acknowledged to facilitate simultaneous cost control and universal access, lingering concerns about quality have blunted support for this approach.
Quality is of paramount importance to Americans. Opponents of reform appeal to fears of diminished quality, warning of waiting lists, rationing, and "government control." Missing from more narrow discussions of the accuracy of such charges is a broader exploration of the quality implications of a universal health care program. Conversely, advocates of national health insurance have failed to emphasize quality issues as key criteria for reform, often assuming that we have "the best medical services in the world." They portray reform primarily as extending the benefits of private insurance to those currently uninsured, with safeguards added to preserve quality.
For the "analysis" phase I undertook to examine and "flowchart" the logic of the subordinate arguments across the 49 paragraphs of assertions comprising the PNHP article, numbering every argument statement as "paragraph(n), sentence(n.n), and sub-sentence truth-claim clause(n.n.a,b,c...) where warranted," as evident by close reading of the text. My full (pdf) copy of the paper is parked here.

Dotted lines denote a "despite" (a.k.a. "notwithstanding") statement, whereas solid lines depict "because-therefore" premise-to-conclusion movement in the direction of the arrowheads.

It was tedious. The bulk of the first 25 pages of the 56-page paper comprised this analytic "flowcharting" visualization, helpful for what the late Stephen Covey would characterize as a crucial "seek first to understand" effort. The remaining 31 pages then focused on my (in large measure subjective) critical evaluation of the logic and evidence provided by the authors.
BTW: I'm certain I didn't get everything exactly right on the "analysis" side (or the eval side, for that matter). It was my first run at this type of thing. And, I had a second course to deal with at the time ("History of Ethics," 11 required texts) and was still working full-time at my Medicare QIO job.
Look at sentence 1.1, for example. You could nit-pick my decision by splitting it up into "b" and "a" because-therefore clauses: because "presuming trade-offs are needed," therefore "Many misconstrue..." Not that it'd have made a material difference in the analysis, but still.
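That claim-numbering-plus-linkage scheme maps naturally onto a graph data structure, BTW. Here's a minimal Python sketch of how such an argument map might be represented in code -- every class and field name below is my own hypothetical construct for illustration, not anything from the 1994 paper or from any existing library:

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    """One numbered truth-claim, e.g. '1.1' or '1.2.a'."""
    ref: str   # paragraph.sentence[.clause] index
    text: str  # the claim as written

@dataclass
class ArgumentMap:
    """Directed graph of claims: 'therefore' edges are the solid
    because-therefore arrows; 'despite' edges are the dotted
    notwithstanding links."""
    claims: dict = field(default_factory=dict)
    therefore: list = field(default_factory=list)  # (premise, conclusion) refs
    despite: list = field(default_factory=list)    # (concession, conclusion) refs

    def add(self, ref, text):
        self.claims[ref] = Claim(ref, text)

    def because_therefore(self, premise, conclusion):
        self.therefore.append((premise, conclusion))

    def notwithstanding(self, concession, conclusion):
        self.despite.append((concession, conclusion))

# Sentence 1.1 split into its "because" (b) and "therefore" (a) clauses:
m = ArgumentMap()
m.add("1.1.b", 'presuming that "trade-offs" are needed to counter-balance competing goals')
m.add("1.1.a", "many misconstrue US health system reform options")
m.because_therefore("1.1.b", "1.1.a")
```

Given a structure like that, rendering the flowchart itself (solid vs. dotted arrows) becomes a routine graph-drawing exercise.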
UPDATE: per the topic of my 1994 paper, Dr. Danielle Ofri in the news:
Americans Have Realized They Deserve Health Care
How long until they accept that the only way to guarantee it is through single-payer?
I have a good 100 hours or so in that one grad school paper. Imagine trying to do that to an entire book. Utterly impractical. So, we mostly suffer our "confirmation bias" and similar heuristic afflictions and jump to premature conclusions -- the bells we can't un-ring.

Hmmm...


Could we develop an AI NLU "app" for that? (I don't underestimate the difficulty, given the myriad fluid nuances of natural language. But, still...)
Thinking about NLP applicability to digital health infotech (EHRs), the differential dx SOAP method (Subjective, Objective, Assessment, and Plan) is basically an argument process, no? You assemble and evaluate salient clinical evidence (the "S" and "O" data, whether numerical, encoded, or lexical narrative), which points in the aggregate to a dx conclusion and tx decision (the "A" and the "P"). I guess we'll see going forward whether applied NLP has any material net additional utility in the dx arena, or whether it will be just another HIT R&D sandbox fad.
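To make the SOAP-as-argument analogy concrete, a toy Python sketch -- the note contents and names below are entirely hypothetical illustration, not any real clinical decision-support API or actual patient data:

```python
# Toy sketch: a SOAP note viewed as an argument. The "S" and "O"
# entries are premises; "A" is the dx conclusion they support; "P"
# follows from "A". All names and data are hypothetical illustration.

soap_note = {
    "S": ["patient reports polyuria and fatigue"],     # subjective premises
    "O": [("fasting glucose", 162, "mg/dL"),           # objective premises
          ("HbA1c", 7.9, "%")],
    "A": "type 2 diabetes mellitus",                   # dx (conclusion)
    "P": "start metformin; recheck HbA1c in 3 months"  # tx (decision)
}

def argument_form(note):
    """Render the note as an explicit premise/conclusion argument."""
    premises = note["S"] + [f"{k} = {v} {u}" for k, v, u in note["O"]]
    lines = [f"Because: {p}" for p in premises]
    lines.append(f"Therefore (A): {note['A']}")
    lines.append(f"Hence (P): {note['P']}")
    return "\n".join(lines)

print(argument_form(soap_note))
```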
Logic visualization software is not exactly news. In the late '80s I developed an instrumentation statistical process control program for the radiation lab where I worked in Oak Ridge -- the "IQCstats" system (pdf). Below is one page of the 100 or so comprising the logic flowchart set included in my old 2" bound "User and Technical Guide" manual.


The flowcharts were generated by an "app" known as "CLEAR," which parsed my source code logic and rendered a complete set of flowcharts.

While "critical evaluation" of arguments proffered in ordinary language might not lend itself to automated digital assessment (human "judgments"), mapping the "Assuming / Despite / If / Then / Therefore / Else" logic might indeed be do-able in light of advances in "Computational Linguistics" (abetted by our exponentially increasing availability of ever-cheaper raw computing power).
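For a feel of where such mapping might start, here's a deliberately crude Python sketch that just spots surface cue words and assigns them logical roles. Real computational linguistics goes far beyond keyword matching (parsing, semantics, coreference resolution), so consider this a shape-of-the-problem toy, nothing more:

```python
import re

# Crude discourse-cue tagger: maps surface connectives to the logical
# roles named above. All patterns are illustrative assumptions; real
# NLU would need parsing, semantics, and coreference, not keywords.
CUES = {
    "assumption": r"\b(assuming|presuming|suppose|granted that)\b",
    "concession": r"\b(despite|notwithstanding|although|even though)\b",
    "condition":  r"\b(if|unless|provided that)\b",
    "conclusion": r"\b(therefore|thus|hence|consequently)\b",
    "premise":    r"\b(because|since|given that)\b",
}

def tag_cues(sentence):
    """Return the logical roles whose cue words appear in the sentence."""
    return [role for role, pat in CUES.items()
            if re.search(pat, sentence, re.IGNORECASE)]

print(tag_cues("Despite lingering concerns about quality, single-payer "
               "efficiencies are acknowledged; therefore support should grow."))
# -> ['concession', 'conclusion']
```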

Below, my graphical analogy for the fundamental unit of "argument" (a.k.a. "truth claim").




Any complex argument arises from assemblages of the foregoing "atomic" and "molecular" "particles" (once you've weeded through and discarded all of the "noise").
I should add that most of what I'm interested in here goes to "informal/propositional logic" in ordinary language. Formal syllogistic logic (e.g., formal deductive "proofs") comprises a far smaller subset of what we humans do in day-to-day reasoning.
English language discourse, recall, beyond the smaller "parts of speech," comprises four sentence types:
  1. Declarative;
  2. Interrogative;
  3. Imperative;
  4. Exclamatory.
We are principally interested in the subset of declaratives known as "truth claims" -- claims in need of evaluation prior to acceptance -- though we also have to be alert to the phony "interrogative" known as the "loaded question," i.e., an argument insinuation disingenuously posed as a "have-you-stopped-beating-your-wife" type of query. (Then there's also stuff like subtle inference, ambiguity, and sarcasm that might elude AI/NLU.)
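A first-cut heuristic for the four sentence types -- end punctuation plus a crude leading-word check -- is easy enough to sketch in Python, and its brittleness neatly illustrates the difficulty: it classifies the loaded question as a mere interrogative and takes every declarative at face value as a candidate truth claim. All of the below is illustrative assumption, not a real classifier:

```python
def sentence_type(s):
    """Naive four-way classifier by end punctuation and leading word.
    Sketch only: it calls the loaded question a mere interrogative,
    and can't see sarcasm, subtle inference, or verbless imperatives."""
    s = s.strip()
    if s.endswith("?"):
        return "interrogative"
    if s.endswith("!"):
        return "exclamatory"
    words = s.split()
    first = words[0].lower() if words else ""
    if first in {"do", "don't", "please", "stop", "consider", "let"}:
        return "imperative"   # crude leading-verb whitelist
    return "declarative"      # candidate truth claim, pending evaluation

for s in ["Single-payer would cut per capita costs.",
          "Have you stopped beating your wife?",
          "Consider the evidence.",
          "What a mess!"]:
    print(sentence_type(s), "->", s)
```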

NLP AND LINGUISTICS

It occurs to me that, notwithstanding my longstanding chops on the verbal/written side, I've never had any formal study in "linguistics," much less its application in NLP. Time to start reading up.

Introduction
Natural languages are the languages which have naturally evolved and used by human beings for communication purposes, For example Hindi, English, French, German are natural languages.  Natural language processing or NLP (also called computational linguistics) is the scientific study of languages from computational perspective. natural language processing (NLP) is a field of computer science and linguistics concerned with the interactions between computers and human (natural) languages. Natural language generation systems convert information from computer databases into readable human language. Natural language understanding systems convert samples of human language into more formal representations such as parse trees or first order logic that are easier for computer programs to manipulate. Many problems within NLP apply to both generating and understanding; for example, the computer must  be able to model morphology (the structure of words) in order to understand an English sentence, and a model of morphology is also needed for producing a grammatically correct English sentence, i.e., natural language generator.

NLP has significant overlap with the field of computational linguistics, and is often considered a subfield of artificial intelligence. The term natural language is used to distinguish human languages (such as Spanish, Swahili, or Swedish) from formal or computer languages (such as C++, Java, or LISP). Although NLP may encompass both text and speech, work on speech processing is conventionally done in a separate field.

In NLP, the techniques are developed which aim the computer to understand the commands given in natural language and perform according to it. At present, to work with computer, the input is required to be given in formal languages. The formal languages are those languages which are specifically developed to communicate  with computer and are understood by machine, e.g., FORTRAN, Pascal, etc. Obviously, to communicate with computer, the study of these formal languages is required. Understanding these languages is  cumbersome and requires additional efforts to understand these. Hence, it limits their applications in computer. As compared to this, the communication in natural language will facilitate the functioning and communication with computer easily and in user-friendly way.

 Natural language processing is a significant area of artificial intelligence because a computer would be considered intelligent  if it can understand the commands given in natural language instead of C, FORTRAN, or Pascal. Hence, with the ability of computers to understand natural language it becomes much easier to communicate with computers. Also the natural language processing can be applied as a productivity tool in applications ranging from summarization of news to translate from one language to another. Though, the surface level processing of natural languages seems to be easy the deep level processing of natural languages, understanding of implicit messages and intentions of the speaker are extremely difficult avenues...
Ya have to wonder whether that was written by a computer. Minimally, a non-native English speaker/writer.

I've also just read up on "linguistics" broadly via a couple of short books, just to survey the domain.


The real meat comes here:


801 pages of dense, comprehensive detail.
Introduction
The field of computational linguistics (CL), together with its engineering domain of natural language processing (NLP), has exploded in recent years. It has developed rapidly from a relatively obscure adjunct of both AI and formal linguistics into a thriving scientific discipline. It has also become an important area of industrial development. The focus of research in CL and NLP has shifted over the past three decades from the study of small prototypes and theoretical models to robust learning and processing systems applied to large corpora. This handbook is intended to provide an introduction to the main areas of CL and NLP, and an overview of current work in these areas. It is designed as a reference and source text for graduate students and researchers from computer science, linguistics, psychology, philosophy, and mathematics who are interested in this area.
The volume is divided into four main parts. Part I contains chapters on the formal foundations of the discipline. Part II introduces the current methods that are employed in CL and NLP, and it divides into three subsections. The first section describes several influential approaches to Machine Learning (ML) and their application to NLP tasks. The second section presents work in the annotation of corpora. The last section addresses the problem of evaluating the performance of NLP systems. Part III of the handbook takes up the use of CL and NLP procedures within particular linguistic domains. Finally, Part IV discusses several leading engineering tasks to which these procedures are applied...

(2013-04-24). The Handbook of Computational Linguistics and Natural Language Processing (Blackwell Handbooks in Linguistics) (p. 1). Wiley. Kindle Edition.
Interesting. BTW, nice summation of Computational Linguistics on the Wiki.
Computational linguistics is an interdisciplinary field concerned with the statistical or rule-based modeling of natural language from a computational perspective.

Traditionally, computational linguistics was performed by computer scientists who had specialized in the application of computers to the processing of a natural language. Today, computational linguists often work as members of interdisciplinary teams, which can include regular linguists, experts in the target language, and computer scientists. In general, computational linguistics draws upon the involvement of linguists, computer scientists, experts in artificial intelligence, mathematicians, logicians, philosophers, cognitive scientists, cognitive psychologists, psycholinguists, anthropologists and neuroscientists, among others.

Computational linguistics has theoretical and applied components. Theoretical computational linguistics focuses on issues in theoretical linguistics and cognitive science, and applied computational linguistics focuses on the practical outcome of modeling human language use...
"applied computational linguistics focuses on the practical outcome of modeling human language..."

Like, well, NLU Argument Analytics?

UPDATE: I'm hitting a mother lode of good stuff in Chapter 15 of "The Handbook..." on "computational semantics."

After getting up to speed on the technical concepts and salient details, perhaps the next step would involve learning Python.

"This book offers a highly accessible introduction to natural language processing, the field that supports a variety of language technologies, from predictive text and email filtering to automatic summarization and translation. With it, you'll learn how to write Python programs that work with large collections of unstructured text. You'll access richly annotated datasets using a comprehensive range of linguistic data structures, and you'll understand the main algorithms for analyzing the content and structure of written communication.

Packed with examples and exercises, Natural Language Processing with Python will help you:
  • Extract information from unstructured text, either to guess the topic or identify "named entities"
  • Analyze linguistic structure in text, including parsing and semantic analysis
  • Access popular linguistic databases, including WordNet and treebanks
  • Integrate techniques drawn from fields as diverse as linguistics and artificial intelligence
This book will help you gain practical skills in natural language processing using the Python programming language and the Natural Language Toolkit (NLTK) open source library. If you're interested in developing web applications, analyzing multilingual news sources, or documenting endangered languages -- or if you're simply curious to have a programmer's perspective on how human language works -- you'll find Natural Language Processing with Python both fascinating and immensely useful."
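For a quick taste, here's roughly what a first NLTK session looks like -- tokenize, POS-tag, chunk named entities. A minimal sketch, assuming NLTK is installed and its standard models (punkt, the perceptron tagger, the NE chunker and word lists) have been downloaded:

```python
import nltk

# One-time model downloads (uncomment on first run):
# nltk.download("punkt")
# nltk.download("averaged_perceptron_tagger")
# nltk.download("maxent_ne_chunker")
# nltk.download("words")

sentence = ("Countries such as Canada ensure access to all at a cost "
            "40% per capita less than in the United States.")

tokens = nltk.word_tokenize(sentence)  # split into words and punctuation
tagged = nltk.pos_tag(tokens)          # part-of-speech tags, e.g. ('Canada', 'NNP')
tree = nltk.ne_chunk(tagged)           # named-entity chunks, e.g. (GPE Canada/NNP)

print(tagged[:6])
print(tree)
```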
apropos of this topic generally, a couple of prior posts of mine come to mind. See "The Great A.I. Awakening? Health Care Implications?" and "Are structured data now the enemy of health care quality?"

Tangentially, my post of July 2015 "AI vs IA: At the cutting edge of IT R&D" as well.

So, could we use digital NLU technology to passably analyze natural language arguments, rather than just turning lab data and ICD-10 codes into SOAP narratives (and the converse)?

Me and my crazy ideas. Never gonna make it into any episodes of "Silicon Valley" (NSFW).

Perhaps our Bootcamp Insta-Engineer pals at ZIPcode Wilmington could have a run at Argumentation NLU?

Seriously, how about a new subset of CL tech R&D: "NLAA" -- "Natural Language Argument Analysis"?
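What might the humblest possible NLAA prototype even look like? Purely as a strawman (a hypothetical function, naive regexes, nothing remotely resembling production NLU), a sketch:

```python
import re

def nlaa_sketch(text):
    """Strawman 'NLAA' pipeline: naive sentence split, then naive
    because/therefore extraction into (premise, conclusion) pairs.
    A hypothetical toy, not an existing tool or method."""
    pairs = []
    for s in re.split(r"(?<=[.!?])\s+", text):
        m = re.match(r"[Bb]ecause (.+?), (?:therefore )?(.+)", s)
        if m:
            pairs.append({"premise": m.group(1), "conclusion": m.group(2)})
    return pairs

print(nlaa_sketch("Because single-payer cuts overhead, therefore costs fall. "
                  "The weather is nice."))
# -> [{'premise': 'single-payer cuts overhead', 'conclusion': 'costs fall.'}]
```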
__

UPDATE: RECENT NLG REPORTAGE

From Wired:
What News-Writing Bots Mean for the Future of Journalism
Joe Keohane, 02.16.17

WHEN REPUBLICAN STEVE King beat back Democratic challenger Kim Weaver in the race for Iowa’s 4th congressional district seat in November, The Washington Post snapped into action, covering both the win and the wider electoral trend. “Republicans retained control of the House and lost only a handful of seats from their commanding majority,” the article read, “a stunning reversal of fortune after many GOP leaders feared double-digit losses.” The dispatch came with the clarity and verve for which Post reporters are known, with one key difference: It was generated by Heliograf, a bot that made its debut on the Post’s website last year and marked the most sophisticated use of artificial intelligence in journalism to date.

When Jeff Bezos bought the Post back in 2013, AI-powered journalism was in its infancy. A handful of companies with automated content-generating systems, like Narrative Science and Automated Insights, were capable of producing the bare-bones, data-heavy news items familiar to sports fans and stock analysts. But strategists at the Post saw the potential for an AI system that could generate explanatory, insightful articles. What’s more, they wanted a system that could foster “a seamless interaction” between human and machine, says Jeremy Gilbert, who joined the Post as director of strategic initiatives in 2014. “What we were interested in doing is looking at whether we can evolve stories over time,” he says...
More and more examples abound on the NLG side of things. Just Google "written by AI."

UPDATE: SPEAKING OF NEWS

I cited this excellent book a while back.


Re: Chapter 8, "Computational Journalism"
In 2009, Fred Turner and I wrote: “What is computational journalism? Ultimately, interactions among journalists, software developers, computer scientists and other scholars over the next few years will have to answer that question. For now though, we define computational journalism as the combination of algorithms, data, and knowledge from the social sciences to supplement the accountability function of journalism.”

Hamilton, James T. (2016-10-10). Democracy’s Detectives (Kindle Locations 10750-10753). Harvard University Press. Kindle Edition.
NLP seems an obvious fit, eh?
__

UPDATE

Check this out:

Study the intersection of language and technology and place yourself at the forefront of a dynamic field by earning a Master of Science in Computational Linguistics from the University of Washington.

The powerful connections between text, human speech and computer technology are having a growing impact on society and our everyday lives. In this program, you can explore a discipline that has applications in a wide variety of fields – including business, law and medicine – and incorporates such diverse technologies as predictive text messaging, search engines, speech recognition, machine translation and dialogue systems...
Nice. That's something I would do in a heartbeat. Here's their web link.


Seattle is a special place for me to begin with. Both of my daughters were born there. (I'm writing this while sitting with my younger daughter as she goes through round 2 of her chemo tx.)

Seattle. Those were the days...
(With much gratitude for the Statute of Limitations.)
I still have numerous rock-solid friendships there. Sadly, recently lost one. He succumbed after a terrible 11 battle with Mantle Cell Lymphoma. He was a younger brother to me. Without qualification one of the best drummers on the planet. He could've played for Sting.

CODA

From The Atlantic:
Rethinking Ethics Training in Silicon Valley
“If technology can mold us, and technologists are the ones who shape that technology, we should demand some level of ethics training for technologists.”

- Irina Raicu
I work at an ethics center in Silicon Valley.

I know, I know, “ethics” is not the first word that comes to mind when most people think of Silicon Valley or the tech industry. It’s probably not even in the top 10. But given the outsized role that tech companies now play, it’s time to focus on the ethical responsibilities of the technologists who help shape our lives...
Yeah, I know, the jokes just write themselves. "Silicon Valley" and "ethics" in the same sentence?

I was not even aware of this place.



I will have to study up on them and report further.

apropos, see my 2015 post "The old internet of data, the new internet of things, and 'Big Data' and the evolving internet of YOU."

UPDATE

See my follow-on post "Continuing with NLP: a $4,200 'Study'."

OCT 2021 UPDATE


Wow. Just finished this riveting book. No, AI/NLU will not be doing ordinary language Argument Analysis and Evaluation anytime soon, if ever. Read it to understand precisely why. A masterwork.
____________
