
Tuesday, May 23, 2017

"Assuming / Despite / If / Then / Therefore / Else..." Could AI do "argument analysis?"


When I was a kid in grade school, back prior to indoor plumbing, it was just broadly referred to as "reading comprehension" -- "What was the author's main point? Did she provide good evidence for her point of view? Do you agree or disagree with the author's conclusion? Why? Explain..."

The oral equivalent was taught in "debate teams" prep. #NLP

Now along comes the area of "AI" (Artificial Intelligence) technology R&D known by the top-level acronym "NLP" (Natural Language Processing). We see increasing discourse on developments in "Machine Learning," "Deep Learning," "Natural Language Generation" (NLG), and "Natural Language Understanding" (NLU).
There's been a good bit of chatter of late in the Health IT news about the asserted utility of NLP. See here as well.
I am interested in particular in the latter (NLU), most specifically as it pertains to rational "argumentative" discourse (mostly of the written type). "Critical Thinking," for example, comes to mind (I was lucky to get to teach it for a number of years as an adjunct faculty member). I was subsequently accorded the opportunity to teach a graduate seminar in the higher-level "Argument Analysis."

From my grad seminar syllabus:
We focus on effective analysis and evaluation of arguments in ordinary language. The "analysis" part involves the process of getting at what is truly being argued by a proponent of a position on an issue. Only once we have done that can we begin to accurately assess the relative merits of a proposition—the "evaluation" phase. These skills are essential to grasp if we are to become honest and constructive contributors to debate and the resolution of issues.

Our 24/7 global communications civilization is awash in arguments ranging from the trivial to grand themes of moral import. Advocates of every stripe and theme pepper us relentlessly with persuasion messages ranging from the "short and sweet" to the dense and inscrutable. We have more to consider and evaluate than time permits, so we must prioritize. This we often do by making precipitous snap judgments—"Ready-Shoot-Aim"—which then frequently calcify into prejudice. The sophistication and nuance of language enables a savvy partisan to entice us into buying into an argument perhaps not well supported by the facts and logic...
I first encountered "Argument Analysis" in the fall of 1994 as an "Ethics & Policy Studies" graduate student myself. I chose for my first semester paper an analytic deconstruction of the PNHP 1994 JAMA paper "A Better-Quality Alternative: Single-Payer National Health System Reform."

The two opening paragraphs:
MANY MISCONSTRUE US health system reform options by presuming that "trade-offs" are needed to counter-balance the competing goals of increasing access, containing costs, and preserving quality. Standing as an apparent paradox to this zero-sum equation are countries such as Canada that ensure access to all at a cost 40% per capita less, with satisfaction and outcomes as good as or better than those in the United States. While the efficiencies of a single-payer universal program are widely acknowledged to facilitate simultaneous cost control and universal access, lingering concerns about quality have blunted support for this approach.
Quality is of paramount importance to Americans. Opponents of reform appeal to fears of diminished quality, warning of waiting lists, rationing, and "government control." Missing from more narrow discussions of the accuracy of such charges is a broader exploration of the quality implications of a universal health care program. Conversely, advocates of national health insurance have failed to emphasize quality issues as key criteria for reform, often assuming that we have "the best medical services in the world." They portray reform primarily as extending the benefits of private insurance to those currently uninsured, with safeguards added to preserve quality.
For the "analysis" phase I undertook to examine and "flowchart" the logic of the subordinate arguments across the 49 paragraphs of assertions comprising the PNHP article, numbering every argument statement as "paragraph(n), sentence(n.n), and sub-sentence truth-claim clause(n.n.a,b,c...) where warranted," as evident by close reading of the text. My full (pdf) copy of the paper is parked here.

Dotted lines denote a "despite" (a.k.a. "notwithstanding") statement, whereas solid lines depict "because-therefore" premise-to-conclusion movement in the direction of the arrowheads.

It was tedious. The bulk of the first 25 pages of the 56-page paper comprised this analytic "flowcharting" visualization, helpful for what the late Stephen Covey would characterize as a crucial "seek first to understand" effort. The remaining 31 pages then focused on my (in large measure subjective) critical evaluation of the logic and evidence provided by the authors.
BTW: I'm certain I didn't get everything exactly right on the "analysis" side (or the eval side, for that matter). It was my first run at this type of thing. And, I had a second course to deal with at the time ("History of Ethics," 11 required texts) and was still working full-time at my Medicare QIO job.
Look at sentence 1.1, for example. You could nit-pick my decision by splitting it up into "b" and "a" because-therefore clauses: because "presuming trade-offs are needed," therefore "Many misconstrue..." Not that it'd have made a material difference in the analysis, but still.
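That claim-numbering-plus-linkage scheme maps naturally onto a graph data structure, BTW. Here's a minimal Python sketch of how such an argument map might be represented in code -- every class and field name below is my own hypothetical construct for illustration, not anything from the 1994 paper or from any existing library:

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    """One numbered truth-claim, e.g. '1.1' or '1.2.a'."""
    ref: str   # paragraph.sentence[.clause] index
    text: str  # the claim as written

@dataclass
class ArgumentMap:
    """Directed graph of claims: 'therefore' edges are the solid
    because-therefore arrows; 'despite' edges are the dotted
    notwithstanding links."""
    claims: dict = field(default_factory=dict)
    therefore: list = field(default_factory=list)  # (premise, conclusion) refs
    despite: list = field(default_factory=list)    # (concession, conclusion) refs

    def add(self, ref, text):
        self.claims[ref] = Claim(ref, text)

    def because_therefore(self, premise, conclusion):
        self.therefore.append((premise, conclusion))

    def notwithstanding(self, concession, conclusion):
        self.despite.append((concession, conclusion))

# Sentence 1.1 split into its "because" (b) and "therefore" (a) clauses:
m = ArgumentMap()
m.add("1.1.b", 'presuming that "trade-offs" are needed to counter-balance competing goals')
m.add("1.1.a", "many misconstrue US health system reform options")
m.because_therefore("1.1.b", "1.1.a")
```

Given a structure like that, rendering the flowchart itself (solid vs. dotted arrows) becomes a routine graph-drawing exercise.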
UPDATE: per the topic of my 1994 paper, Dr. Danielle Ofri in the news:
Americans Have Realized They Deserve Health Care
How long until they accept that the only way to guarantee it is through single-payer?
I have a good 100 hours or so in that one grad school paper. Imagine trying to do that to an entire book. Utterly impractical. So, we mostly suffer our "confirmation bias" and similar heuristic afflictions and jump to premature conclusions -- the bells we can't un-ring.

Hmmm...


Could we develop an AI NLU "app" for that? (I don't underestimate the difficulty, given the myriad fluid nuances of natural language. But, still...)
Thinking about NLP applicability to digital health infotech (EHRs), the differential dx SOAP method (Subjective, Objective, Assessment, and Plan) is basically an argument process, no? You assemble and evaluate salient clinical evidence (the "S" and "O" data, whether numerical, encoded, or lexical narrative), which points in the aggregate to a dx conclusion and tx decision (the "A" and the "P"). I guess we'll see going forward whether applied NLP has any material net additional utility in the dx arena, or whether it will be just another HIT R&D sandbox fad.
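To make the SOAP-as-argument analogy concrete, a toy Python sketch -- the note contents and names below are entirely hypothetical illustration, not any real clinical decision-support API or actual patient data:

```python
# Toy sketch: a SOAP note viewed as an argument. The "S" and "O"
# entries are premises; "A" is the dx conclusion they support; "P"
# follows from "A". All names and data are hypothetical illustration.

soap_note = {
    "S": ["patient reports polyuria and fatigue"],     # subjective premises
    "O": [("fasting glucose", 162, "mg/dL"),           # objective premises
          ("HbA1c", 7.9, "%")],
    "A": "type 2 diabetes mellitus",                   # dx (conclusion)
    "P": "start metformin; recheck HbA1c in 3 months"  # tx (decision)
}

def argument_form(note):
    """Render the note as an explicit premise/conclusion argument."""
    premises = note["S"] + [f"{k} = {v} {u}" for k, v, u in note["O"]]
    lines = [f"Because: {p}" for p in premises]
    lines.append(f"Therefore (A): {note['A']}")
    lines.append(f"Hence (P): {note['P']}")
    return "\n".join(lines)

print(argument_form(soap_note))
```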
Logic visualization software is not exactly news. In the late '80s I developed an instrumentation statistical process control program for the radiation lab where I worked in Oak Ridge -- the "IQCstats" system (pdf). Below is one page of the 100 or so comprising the logic flowchart set included in my old 2" bound "User and Technical Guide" manual.


The flowcharts were generated by an "app" known as "CLEAR," which parsed my source code logic and rendered a complete set of flowcharts.

While "critical evaluation" of arguments proffered in ordinary language might not lend itself to automated digital assessment (human "judgments"), mapping the "Assuming / Despite / If / Then / Therefore / Else" logic might indeed be do-able in light of advances in "Computational Linguistics" (abetted by our exponentially increasing availability of ever-cheaper raw computing power).
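For a feel of where such mapping might start, here's a deliberately crude Python sketch that just spots surface cue words and assigns them logical roles. Real computational linguistics goes far beyond keyword matching (parsing, semantics, coreference resolution), so consider this a shape-of-the-problem toy, nothing more:

```python
import re

# Crude discourse-cue tagger: maps surface connectives to the logical
# roles named above. All patterns are illustrative assumptions; real
# NLU would need parsing, semantics, and coreference, not keywords.
CUES = {
    "assumption": r"\b(assuming|presuming|suppose|granted that)\b",
    "concession": r"\b(despite|notwithstanding|although|even though)\b",
    "condition":  r"\b(if|unless|provided that)\b",
    "conclusion": r"\b(therefore|thus|hence|consequently)\b",
    "premise":    r"\b(because|since|given that)\b",
}

def tag_cues(sentence):
    """Return the logical roles whose cue words appear in the sentence."""
    return [role for role, pat in CUES.items()
            if re.search(pat, sentence, re.IGNORECASE)]

print(tag_cues("Despite lingering concerns about quality, single-payer "
               "efficiencies are acknowledged; therefore support should grow."))
# -> ['concession', 'conclusion']
```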

Below, my graphical analogy for the fundamental unit of "argument" (a.k.a. "truth claim").




Any complex argument arises from assemblages of the foregoing "atomic" and "molecular" "particles" (once you've weeded through and discarded all of the "noise").
I should add that most of what I'm interested in here goes to "informal/propositional logic" in ordinary language. Formal syllogistic logic (e.g., formal deductive "proofs") comprises a far smaller subset of what we humans do in day-to-day reasoning.
English language discourse, recall, beyond the smaller "parts of speech," comprises four sentence types:
  1. Declarative;
  2. Interrogative;
  3. Imperative;
  4. Exclamatory.
We are principally interested in the subset of declaratives known as "truth claims" -- claims in need of evaluation prior to acceptance -- though we also have to be alert to the phony "interrogative" known as the "loaded question," i.e., an argument insinuation disingenuously posed as a "have-you-stopped-beating-your-wife" type of query. (Then there's also stuff like subtle inference, ambiguity, and sarcasm that might elude AI/NLU.)
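A first-cut heuristic for the four sentence types -- end punctuation plus a crude leading-word check -- is easy enough to sketch in Python, and its brittleness neatly illustrates the difficulty: it classifies the loaded question as a mere interrogative and takes every declarative at face value as a candidate truth claim. All of the below is illustrative assumption, not a real classifier:

```python
def sentence_type(s):
    """Naive four-way classifier by end punctuation and leading word.
    Sketch only: it calls the loaded question a mere interrogative,
    and can't see sarcasm, subtle inference, or verbless imperatives."""
    s = s.strip()
    if s.endswith("?"):
        return "interrogative"
    if s.endswith("!"):
        return "exclamatory"
    words = s.split()
    first = words[0].lower() if words else ""
    if first in {"do", "don't", "please", "stop", "consider", "let"}:
        return "imperative"   # crude leading-verb whitelist
    return "declarative"      # candidate truth claim, pending evaluation

for s in ["Single-payer would cut per capita costs.",
          "Have you stopped beating your wife?",
          "Consider the evidence.",
          "What a mess!"]:
    print(sentence_type(s), "->", s)
```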

NLP AND LINGUISTICS

It occurs to me that, notwithstanding my longstanding chops on the verbal/written side, I've never had any formal study in "linguistics," much less its application in NLP. Time to start reading up.

Introduction
Natural languages are the languages which have naturally evolved and used by human beings for communication purposes, For example Hindi, English, French, German are natural languages.  Natural language processing or NLP (also called computational linguistics) is the scientific study of languages from computational perspective. natural language processing (NLP) is a field of computer science and linguistics concerned with the interactions between computers and human (natural) languages. Natural language generation systems convert information from computer databases into readable human language. Natural language understanding systems convert samples of human language into more formal representations such as parse trees or first order logic that are easier for computer programs to manipulate. Many problems within NLP apply to both generating and understanding; for example, the computer must  be able to model morphology (the structure of words) in order to understand an English sentence, and a model of morphology is also needed for producing a grammatically correct English sentence, i.e., natural language generator.

NLP has significant overlap with the field of computational linguistics, and is often considered a subfield of artificial intelligence. The term natural language is used to distinguish human languages (such as Spanish, Swahili, or Swedish) from formal or computer languages (such as C++, Java, or LISP). Although NLP may encompass both text and speech, work on speech processing is conventionally done in a separate field.

In NLP, the techniques are developed which aim the computer to understand the commands given in natural language and perform according to it. At present, to work with computer, the input is required to be given in formal languages. The formal languages are those languages which are specifically developed to communicate  with computer and are understood by machine, e.g., FORTRAN, Pascal, etc. Obviously, to communicate with computer, the study of these formal languages is required. Understanding these languages is  cumbersome and requires additional efforts to understand these. Hence, it limits their applications in computer. As compared to this, the communication in natural language will facilitate the functioning and communication with computer easily and in user-friendly way.

 Natural language processing is a significant area of artificial intelligence because a computer would be considered intelligent  if it can understand the commands given in natural language instead of C, FORTRAN, or Pascal. Hence, with the ability of computers to understand natural language it becomes much easier to communicate with computers. Also the natural language processing can be applied as a productivity tool in applications ranging from summarization of news to translate from one language to another. Though, the surface level processing of natural languages seems to be easy the deep level processing of natural languages, understanding of implicit messages and intentions of the speaker are extremely difficult avenues...
Ya have to wonder whether that was written by a computer. Minimally, a non-native English speaker/writer.

I've also just read up on "linguistics" broadly via a couple of short books, just to survey the domain.


The real meat comes here:


801 pages of dense, comprehensive detail.
Introduction
The field of computational linguistics (CL), together with its engineering domain of natural language processing (NLP), has exploded in recent years. It has developed rapidly from a relatively obscure adjunct of both AI and formal linguistics into a thriving scientific discipline. It has also become an important area of industrial development. The focus of research in CL and NLP has shifted over the past three decades from the study of small prototypes and theoretical models to robust learning and processing systems applied to large corpora. This handbook is intended to provide an introduction to the main areas of CL and NLP, and an overview of current work in these areas. It is designed as a reference and source text for graduate students and researchers from computer science, linguistics, psychology, philosophy, and mathematics who are interested in this area.
The volume is divided into four main parts. Part I contains chapters on the formal foundations of the discipline. Part II introduces the current methods that are employed in CL and NLP, and it divides into three subsections. The first section describes several influential approaches to Machine Learning (ML) and their application to NLP tasks. The second section presents work in the annotation of corpora. The last section addresses the problem of evaluating the performance of NLP systems. Part III of the handbook takes up the use of CL and NLP procedures within particular linguistic domains. Finally, Part IV discusses several leading engineering tasks to which these procedures are applied...

(2013-04-24). The Handbook of Computational Linguistics and Natural Language Processing (Blackwell Handbooks in Linguistics) (p. 1). Wiley. Kindle Edition.
Interesting. BTW, nice summation of Computational Linguistics on the Wiki.
Computational linguistics is an interdisciplinary field concerned with the statistical or rule-based modeling of natural language from a computational perspective.

Traditionally, computational linguistics was performed by computer scientists who had specialized in the application of computers to the processing of a natural language. Today, computational linguists often work as members of interdisciplinary teams, which can include regular linguists, experts in the target language, and computer scientists. In general, computational linguistics draws upon the involvement of linguists, computer scientists, experts in artificial intelligence, mathematicians, logicians, philosophers, cognitive scientists, cognitive psychologists, psycholinguists, anthropologists and neuroscientists, among others.

Computational linguistics has theoretical and applied components. Theoretical computational linguistics focuses on issues in theoretical linguistics and cognitive science, and applied computational linguistics focuses on the practical outcome of modeling human language use...
"applied computational linguistics focuses on the practical outcome of modeling human language..."

Like, well, NLU Argument Analytics?

UPDATE: I'm hitting a mother lode of good stuff in Chapter 15 of "The Handbook..." on "computational semantics."

After getting up to speed on the technical concepts and salient details, perhaps the next step would involve learning Python.

"This book offers a highly accessible introduction to natural language processing, the field that supports a variety of language technologies, from predictive text and email filtering to automatic summarization and translation. With it, you'll learn how to write Python programs that work with large collections of unstructured text. You'll access richly annotated datasets using a comprehensive range of linguistic data structures, and you'll understand the main algorithms for analyzing the content and structure of written communication.

Packed with examples and exercises, Natural Language Processing with Python will help you:
  • Extract information from unstructured text, either to guess the topic or identify "named entities"
  • Analyze linguistic structure in text, including parsing and semantic analysis
  • Access popular linguistic databases, including WordNet and treebanks
  • Integrate techniques drawn from fields as diverse as linguistics and artificial intelligence
This book will help you gain practical skills in natural language processing using the Python programming language and the Natural Language Toolkit (NLTK) open source library. If you're interested in developing web applications, analyzing multilingual news sources, or documenting endangered languages -- or if you're simply curious to have a programmer's perspective on how human language works -- you'll find Natural Language Processing with Python both fascinating and immensely useful."
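For a quick taste, here's roughly what a first NLTK session looks like -- tokenize, POS-tag, chunk named entities. A minimal sketch, assuming NLTK is installed and its standard models (punkt, the perceptron tagger, the NE chunker and word lists) have been downloaded:

```python
import nltk

# One-time model downloads (uncomment on first run):
# nltk.download("punkt")
# nltk.download("averaged_perceptron_tagger")
# nltk.download("maxent_ne_chunker")
# nltk.download("words")

sentence = ("Countries such as Canada ensure access to all at a cost "
            "40% per capita less than in the United States.")

tokens = nltk.word_tokenize(sentence)  # split into words and punctuation
tagged = nltk.pos_tag(tokens)          # part-of-speech tags, e.g. ('Canada', 'NNP')
tree = nltk.ne_chunk(tagged)           # named-entity chunks, e.g. (GPE Canada/NNP)

print(tagged[:6])
print(tree)
```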
apropos of this topic generally, a couple of prior posts of mine come to mind. See "The Great A.I. Awakening? Health Care Implications?" and "Are structured data now the enemy of health care quality?"

Tangentially, my post of July 2015 "AI vs IA: At the cutting edge of IT R&D" as well.

So, could we use digital NLU technology to passably analyze natural language arguments, rather than just turning lab data and ICD-10 codes into SOAP narratives (and the converse)?

Me and my crazy ideas. Never gonna make it into any episodes of "Silicon Valley" (NSFW).

Perhaps our Bootcamp Insta-Engineer pals at ZIPcode Wilmington could have a run at Argumentation NLU?

Seriously, how about a new subset of CL tech R&D: "NLAA" -- "Natural Language Argument Analysis"?
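What might the humblest possible NLAA prototype even look like? Purely as a strawman (a hypothetical function, naive regexes, nothing remotely resembling production NLU), a sketch:

```python
import re

def nlaa_sketch(text):
    """Strawman 'NLAA' pipeline: naive sentence split, then naive
    because/therefore extraction into (premise, conclusion) pairs.
    A hypothetical toy, not an existing tool or method."""
    pairs = []
    for s in re.split(r"(?<=[.!?])\s+", text):
        m = re.match(r"[Bb]ecause (.+?), (?:therefore )?(.+)", s)
        if m:
            pairs.append({"premise": m.group(1), "conclusion": m.group(2)})
    return pairs

print(nlaa_sketch("Because single-payer cuts overhead, therefore costs fall. "
                  "The weather is nice."))
# -> [{'premise': 'single-payer cuts overhead', 'conclusion': 'costs fall.'}]
```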
__

UPDATE: RECENT NLG REPORTAGE

From Wired:
What News-Writing Bots Mean for the Future of Journalism
Joe Keohane, 02.16.17

WHEN REPUBLICAN STEVE King beat back Democratic challenger Kim Weaver in the race for Iowa’s 4th congressional district seat in November, The Washington Post snapped into action, covering both the win and the wider electoral trend. “Republicans retained control of the House and lost only a handful of seats from their commanding majority,” the article read, “a stunning reversal of fortune after many GOP leaders feared double-digit losses.” The dispatch came with the clarity and verve for which Post reporters are known, with one key difference: It was generated by Heliograf, a bot that made its debut on the Post’s website last year and marked the most sophisticated use of artificial intelligence in journalism to date.

When Jeff Bezos bought the Post back in 2013, AI-powered journalism was in its infancy. A handful of companies with automated content-generating systems, like Narrative Science and Automated Insights, were capable of producing the bare-bones, data-heavy news items familiar to sports fans and stock analysts. But strategists at the Post saw the potential for an AI system that could generate explanatory, insightful articles. What’s more, they wanted a system that could foster “a seamless interaction” between human and machine, says Jeremy Gilbert, who joined the Post as director of strategic initiatives in 2014. “What we were interested in doing is looking at whether we can evolve stories over time,” he says...
More and more examples abound on the NLG side of things. Just Google "written by AI."

UPDATE: SPEAKING OF NEWS

I cited this excellent book a while back.


Re: Chapter 8, "Computational Journalism"
In 2009, Fred Turner and I wrote: “What is computational journalism? Ultimately, interactions among journalists, software developers, computer scientists and other scholars over the next few years will have to answer that question. For now though, we define computational journalism as the combination of algorithms, data, and knowledge from the social sciences to supplement the accountability function of journalism.”

Hamilton, James T. (2016-10-10). Democracy’s Detectives (Kindle Locations 10750-10753). Harvard University Press. Kindle Edition.
NLP seems an obvious fit, eh?
__

UPDATE

Check this out:

Study the intersection of language and technology and place yourself at the forefront of a dynamic field by earning a Master of Science in Computational Linguistics from the University of Washington.

The powerful connections between text, human speech and computer technology are having a growing impact on society and our everyday lives. In this program, you can explore a discipline that has applications in a wide variety of fields – including business, law and medicine – and incorporates such diverse technologies as predictive text messaging, search engines, speech recognition, machine translation and dialogue systems...
Nice. That's something I would do in a heartbeat. Here's their web link.


Seattle is a special place for me to begin with. Both of my daughters were born there. (I'm writing this while sitting with my younger daughter as she goes through round 2 of her chemo tx.)

Seattle. Those were the days...
(With much gratitude for the Statute of Limitations.)
I still have numerous rock-solid friendships there. Sadly, recently lost one. He succumbed after a terrible 11 battle with Mantle Cell Lymphoma. He was a younger brother to me. Without qualification one of the best drummers on the planet. He could've played for Sting.

CODA

From The Atlantic:
Rethinking Ethics Training in Silicon Valley
“If technology can mold us, and technologists are the ones who shape that technology, we should demand some level of ethics training for technologists.”

- Irina Raicu
I work at an ethics center in Silicon Valley.

I know, I know, “ethics” is not the first word that comes to mind when most people think of Silicon Valley or the tech industry. It’s probably not even in the top 10. But given the outsized role that tech companies now play, it’s time to focus on the ethical responsibilities of the technologists who help shape our lives...
Yeah, I know, the jokes just write themselves. "Silicon Valley" and "ethics" in the same sentence?

I was not even aware of this place.



I will have to study up on them and report further.

apropos, see my 2015 post "The old internet of data, the new internet of things, and 'Big Data' and the evolving internet of YOU."

UPDATE

See my follow-on post "Continuing with NLP: a $4,200 'Study'."

OCT 2021 UPDATE


Wow. Just finished this riveting book. No, AI/NLU will not be doing ordinary language Argument Analysis and Evaluation anytime soon, if ever. Read it to understand precisely why. A masterwork.
____________
