Wednesday, May 31, 2017

Continuing with NLP, a $4,200 "study."

OK, I've been keyword-searching around on Natural Language Processing (NLP) topics in the wake of my prior post, while I finish Daniel Dennett's glorious new witty and detailed 496 pg book on the evolution of the human mind (btw, $9.18 Kindle price, and of direct relevance to AI/NLP/CL). #NLP

Ran into this beaut.

Physician Computer Assisted Coding for Professionals: Market Shares, Strategies, and Forecasts, Worldwide, 2017 to 2023

LEXINGTON, Massachusetts (March 13, 2017) – WinterGreen Research announces that it has published a new study Professional Physician Practice and Ambulatory Clinical Facility Computer Assisted Coding: Market Shares, Strategy, and Forecasts, Worldwide, 2017 to 2023. Next generation Computer Assisted Coding of medical information is able to leverage natural language software technology to support some automation of the billing process and use of analytics to achieve higher quality patient outcomes. The study has 299 pages and 110 tables and figures.

Computer assisted coding of medical information uses natural language solutions to link the physician notes in an electronic patient record to the codes used for billing Medicare, Medicaid, and private insurance companies. 

Natural language processing is used determine the links to codes. 88% of the coding can occur automatically without human review. Computer assisted coding is used in all parts of the healthcare delivery system. The coding systems work well to implement automated coding process.

Physicians think about patient conditions in terms of words. Software is configured to achieve working with physicians who are more comfortable describing a patient treatment in words than codes. The electronic patient record, created using physician dictation, is used to form the base for the coding.  Natural language solutions implement computer coding to identify key words and patterns of language. The physician dictation can be done using regular language that the software recognizes and translates into billing codes.  

Properly designed natural language processing (NLP) solutions do not require physicians to change the way they work. They can dictate in a free-flowing fashion, consistent with the way they think, and are not limited to structured inputs that may or may not fully capture the unique circumstances of each patient encounter.

Matching codes generated from physician notes to standard treatment protocols promises to improve health care delivery. Accompanying that type of physician patient management against best practice promises to revolutionize health care delivery. The ability to further check as to whether the recommendations for follow up made by radiologists and matching the commendations with the actual follow up heralds’ significant promise of vastly improved health care delivery. 

Computer assisted coding applications depend on the development of production quality natural language processing (NLP)-based computer assisted coding applications. This requires a process-driven approach to software development and quality assurance. 

A well-defined software engineering process consists of requirements analysis, preliminary design, detailed design, implementation, unit testing, system testing and deployment. NLP complex technology defines the key features of a computer assisted coding (CAC) application.

Automation of process will revolutionize health care delivery. In addition to automating the insurance, billing, and transaction systems, streamlined care delivery is an added benefit. The ability to look at workflow and compare actual care to best practice is fundamental to automated business process. 

The ability to link diagnostic patient information to treatment regimes and drug prescriptions is central to improving medical care delivery. Once a physician can see what conditions need to be followed, and see that appropriate care has been prescribed 100% of the time, care delivery improves dramatically. Diagnosis of conditions using radiology frequently results in detection of events that need follow-up.

According to Susan Eustis, lead author of the team that prepared the study, “Growing acceptance of computer assisted coding for physician offices represents a shift to cloud computing and billing by the procedure coded. Because SaaS based CAC provides an improvement over current coding techniques the value is being recognized. Administrators are realizing the benefits to quality of care. Patients feel better after robotic surgery and the surgeries are more likely to be successful.” 

The worldwide market for Computer Assisted Coding is $898 million in 2016, anticipated to reach $2.5 billion by 2023. The complete report provides a comprehensive analysis of Computer Assisted Coding in different categories, illustrating the diversity of software market segments. A complete procedure analysis is done, looking at numbers of procedures and doing penetration analysis. 

Major health plans report a smooth transition to ICD-10. This is due to rigorous testing for six years. ICD-10 has had a positive impact on reimbursement. ICD-10 coding system requires use of 72,000 procedure codes and 68,000 CM codes, as opposed to the 4,000 and 14,000 in the ICD-9 system. Managing high volume of codes requires automation. Healthcare providers and payers use complex coding systems, which drives demand for technologically advanced CAC systems. 

The market for computer-assisted coding grows because it provides management of workflow process value by encouraging increasing efficiency in care delivery for large Professional Physician Practice and Ambulatory Clinical Facility. By making more granular demarcation of diagnoses and care provided for each diagnosis, greater visibility into the care delivery system is provided. Greater visibility brings more ability to adapt the system to successful treatments...
Need I elaborate? Seriously? The writing is painful, as are the topic-skipping lack of focus and blinding-glimpses-of-the-obvious/not-exactly-news "research analysis" observations.
"Patients feel better after robotic surgery and the surgeries are more likely to be successful."
And, this is a result of back-office NLP/CAC precisely how?

Okeee-dokeee, then. A mere 14 bucks a page for a PDF file?
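
The per-page figure is easy to check from the two numbers quoted above ($4,200 for a 299-page report). A quick sketch in Python; the variable names are just illustrative:

```python
# Numbers taken from the post: list price and page count of the report.
price_usd = 4200
pages = 299

per_page = price_usd / pages
print(f"${per_page:.2f} per page")  # prints "$14.05 per page"
```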

First of all, "text mining" of EHR narrative fields has been commonplace for years now across a number of mainstream platforms (as has the converse: turning codes and data into faux-narrative text dx "impressions"). Re-labeling such now with the trendy "NLP" moniker doesn't change that (none of which is to imply that the infotech is not improving). Second, I'm not gonna pay $4,200 to maybe verify whether "[exactly?] 88% of the coding can occur automatically without human review" (in payor audit-defensible fashion). Finally, we all "think" about things in terms of "words," but dx narrative impressions are a result of the SOAP (Subjective, Objective, Assessment, Plan) process, not the cause. They're the "A" and the "P." The "S" and "O" comprise a mix of numeric data, informatics codes, and open-ended deliberative textual information.

Beyond those, the rest of the foregoing is a poorly-written rambling melange of platitudes and unhelpfully vague filler assertions.

CEO and "Senior Analyst" Susan Eustis:


Lordy, mercy. A proxy spokesmodel? How about the CEO herself?

Stay tuned, just getting underway. Behind the curve this week, tending to my ailing daughter.

UPDATE

From TechCrunch:
Mary Meeker’s latest trends report highlights Silicon Valley’s role in the future of health care

Mary Meeker’s latest Internet Trends Report, out today, was full of insights on how tech is shaping our future — including now in health care. This was the first year Meeker included healthcare in her report and it shows just how much of a role tech is going to play in improving our lives going forward...
Free download, 355 pages.

Mary Meeker, 2016:


__

NEXT, PREPARE TO ENJOY

Finished Daniel Dennett's book.

Consider medical education. Watson is just one of many computer-based systems that are beginning to outperform the best diagnosticians and specialists on their own turf. Would you be willing to indulge your favorite doctor in her desire to be an old-fashioned “intuitive” reader of symptoms instead of relying on a computer-based system that had been proven to be a hundred times more reliable at finding rare, low-visibility diagnoses than any specialist? Your health insurance advisor will oblige you to submit to the tests, and conscientious doctors will see that they must squelch their yearnings to be diagnostic heroes and submit to the greater authority of the machines whose buttons they push. What does this imply about how to train doctors? Will we be encouraged to jettison huge chunks of traditional medical education— anatomy, physiology, biochemistry— along with the ability to do long division and read a map? Use it or lose it is the rule of thumb cited at this point, and it has many positive instances. Can your children read road maps as easily as you do or have they become dependent on GPS to guide them? How concerned should we be that we are dumbing ourselves down by our growing reliance on intelligent machines?

So far, there is a fairly sharp boundary between machines that enhance our “peripheral” intellectual powers (of perception, algorithmic calculation, and memory) and machines that at least purport to replace our “central” intellectual powers of comprehension (including imagination), planning, and decision-making ...We can expect that boundary to shrink, routinizing more and more cognitive tasks, which will be fine so long as we know where the boundary currently is. The real danger, I think, is not that machines more intelligent than we are will usurp our role as captains of our destinies, but that we will over-estimate the comprehension of our latest thinking tools, prematurely ceding authority to them far beyond their competence.

Dennett, Daniel C. (2017-02-07). From Bacteria to Bach and Back: The Evolution of Minds (Kindle Locations 6649-6666). W. W. Norton & Company. Kindle Edition.
"How concerned should we be that we are dumbing ourselves down by our growing reliance on intelligent machines?"

Well, I recall a couple of books I've heretofore cited on that issue.


Back to Daniel Dennett, some concluding "Bacteria to Bach" broad-stroke thoughts:
We have now looked at a few of the innovations that have led us to relinquish the mastery of creation that has long been a hallmark of understanding in our species. More are waiting in the wings. We have been motivated for several millennia by the idea expressed in Feynman’s dictum, “What I cannot create, I do not understand.” But recently our ingenuity has created a slippery slope: we find ourselves indirectly making things that we only partially understand, and they in turn may create things we don’t understand at all. Since some of these things have wonderful powers, we may begin to doubt the value— or at least the preeminent value— of understanding. “Comprehension is so passé, so vieux jeux, so old-fashioned! Who needs understanding when we can all be the beneficiaries of artifacts that save us that arduous effort?”

Is there a good reply to this? We need something more than tradition if we want to defend the idea that comprehension is either intrinsically good— a good in itself, independently of all the benefits it indirectly provides— or practically necessary if we are to continue living the kinds of lives that matter to us. Philosophers, like me, can be expected to recoil in dismay from such a future. As Socrates famously said, “the unexamined life is not worth living,” and ever since Socrates we have taken it as self-evident that achieving an ever greater understanding of everything is our highest professional goal, if not our highest goal absolutely. But as another philosopher, the late Kurt Baier, once added, “the over-examined life is nothing to write home about either.” Most people are content to be the beneficiaries of technology and medicine, scientific fact-finding and artistic creation without much of a clue about how all this “magic” has been created. Would it be so terrible to embrace the over-civilized life and trust our artifacts to be good stewards of our well-being?

I myself have been unable to concoct a persuasive argument for the alluring conclusion that comprehension is “intrinsically” valuable— though I find comprehension to be one of life’s greatest thrills— but I think a good case can be made for preserving and enhancing human comprehension and for protecting it from the artifactual varieties of comprehension now under development in deep learning, for deeply practical reasons. Artifacts can break, and if few people understand them well enough either to repair them or substitute other ways of accomplishing their tasks, we could find ourselves and all we hold dear in dire straits. Many have noted that for some of our high-tech artifacts, the supply of repair persons is dwindling or nonexistent. A new combination color printer and scanner costs less than repairing your broken one. Discard it and start fresh. Operating systems for personal computers follow a similar version of the same policy: when your software breaks or gets corrupted, don’t bother trying to diagnose and fix the error, unmutating the mutation that has crept in somehow; reboot, and fresh new versions of your favorite programs will be pulled up from safe storage in memory to replace the copies that have become defective. But how far can this process go?

Consider a typical case of uncomprehending reliance on technology. A smoothly running automobile is one of life’s delights; it enables you to get where you need to get, on time, with great reliability, and for the most part, you get there in style, with music playing, air conditioning keeping you comfortable, and GPS guiding your path. We tend to take cars for granted in the developed world, treating them as one of life’s constants, a resource that is always available. We plan our life’s projects with the assumption that of course a car will be part of our environment. But when your car breaks down, your life is seriously disrupted. Unless you are a serious car buff with technical training you must acknowledge your dependence on a web of tow-truck operators, mechanics, car dealers, and more. At some point, you decide to trade in your increasingly unreliable car and start afresh with a brand new model. Life goes on, with hardly a ripple.

But what about the huge system that makes this all possible: the highways, the oil refineries, the automakers, the insurance companies, the banks, the stock market, the government? Our civilization has been running smoothly— with some serious disruptions— for thousands of years, growing in complexity and power, Could it break down? Yes, it could, and to whom could we then turn to help us get back on the road? You can’t buy a new civilization if yours collapses, so we had better keep the civilization we have running in good repair. Who, though, are the reliable mechanics? The politicians, the judges, the bankers, the industrialists, the journalists, the professors— the leaders of our society, in short— are much more like the average motorist than you might like to think: doing their local bit to steer their part of the whole contraption, while blissfully ignorant of the complexities on which the whole system depends. According to the economist and evolutionary thinker Paul Seabright (2010), the optimistic tunnel vision with which they operate is not a deplorable and correctable flaw in the system but an enabling condition. This distribution of partial comprehension is not optional. The edifices of social construction that shape our lives in so many regards depend on our myopic confidence that their structure is sound and needs no attention from us.

At one point Seabright compares our civilization with a termite castle. Both are artifacts, marvels of ingenious design piled on ingenious design, towering over the supporting terrain, the work of vastly many individuals acting in concert. Both are thus by-products of the evolutionary processes that created and shaped those individuals, and in both cases, the design innovations that account for the remarkable resilience and efficiency observable were not the brain-children of individuals, but happy outcomes of the largely unwitting, myopic endeavors of those individuals, over many generations. But there are profound differences as well. Human cooperation is a delicate and remarkable phenomenon, quite unlike the almost mindless cooperation of termites, and indeed quite unprecedented in the natural world, a unique feature with a unique ancestry in evolution. It depends, as we have seen, on our ability to engage each other within the “space of reasons,” as Wilfrid Sellars put it. Cooperation depends, Seabright argues, on trust, a sort of almost invisible social glue that makes possible both great and terrible projects, and this trust is not, in fact, a “natural instinct” hard-wired by evolution into our brains. It is much too recent for that. Trust is a by-product of social conditions that are at once its enabling condition and its most important product. We have bootstrapped ourselves into the heady altitudes of modern civilization, and our natural emotions and other instinctual responses do not always serve our new circumstances.

Civilization is a work in progress, and we abandon our attempt to understand it at our peril. Think of the termite castle. We human observers can appreciate its excellence and its complexity in ways that are quite beyond the nervous systems of its inhabitants. We can also aspire to achieving a similarly Olympian perspective on our own artifactual world, a feat only human beings could imagine. If we don’t succeed, we risk dismantling our precious creations in spite of our best intentions. Evolution in two realms, genetic and cultural, has created in us the capacity to know ourselves. But in spite of several millennia of ever-expanding intelligent design, we still are just staying afloat in a flood of puzzles and problems, many of them created by our own efforts of comprehension, and there are dangers that could cut short our quest before we— or our descendants— can satisfy our ravenous curiosity. [Dennett, op cit, Kindle Locations 6729-6787]
He closes,
[H]uman minds, however intelligent and comprehending, are not the most powerful imaginable cognitive systems, and our intelligent designers have now made dramatic progress in creating machine learning systems that use bottom-up processes to demonstrate once again the truth of Orgel’s Second Rule: Evolution is cleverer than you are. Once we appreciate the universality of the Darwinian perspective, we realize that our current state, both individually and as societies, is both imperfect and impermanent. We may well someday return the planet to our bacterial cousins and their modest, bottom-up styles of design improvement. Or we may continue to thrive, in an environment we have created with the help of artifacts that do most of the heavy cognitive lifting their own way, in an age of post-intelligent design. There is not just coevolution between memes and genes; there is codependence between our minds’ top-down reasoning abilities and the bottom-up uncomprehending talents of our animal brains. And if our future follows the trajectory of our past— something that is partly in our control— our artificial intelligences will continue to be dependent on us even as we become more warily dependent upon them. [ibid, Kindle Locations 6832-6840]
Eh? A lot to think about, in the context of "AI/IA" broadly (and "NLP/NLU" specifically).

Back to my original "NLU" question: will we be able to write code that can accurately parse, analyze, and "understand" arguments composed in ordinary language?

A lot more study awaits me. Suffice it to say I'm a bit skeptical at this point.

Maybe we could put the hackers at Hooli on it! ;) (NSFW)

MORE NLP NEWS
The evolution of computational linguistics and where it's headed next
May 31, 2017 by Andrew Myers

Earlier this year, Christopher Manning, a Stanford professor of computer science and of linguistics, was named the Thomas M. Siebel Professor in Machine Learning, thanks to a gift from the Thomas and Stacey Siebel Foundation.

Manning specializes in natural language processing – designing computer algorithms that can understand meaning and sentiment in written and spoken language and respond intelligently. His work is closely tied to the sort of voice-activated systems found in smartphones and in online applications that translate text between human languages. He relies on an offshoot of artificial intelligence known as deep learning to design algorithms that can teach themselves to understand meaning and adapt to new or evolving uses of language...
Worth re-citing/linking these prior posts at this point:
The Great A.I. Awakening? Health care implications?
Are structured data the enemy of health care quality?
"I'm a bit skeptical at this point."

15 Computational Semantics
CHRIS FOX

1 Introduction
In this chapter we will generally use ‘semantics’ to refer to a formal analysis of meaning, and ‘computational’ to refer to approaches that in principle support effective implementation, following Blackburn and Bos (2005). There are many difficulties in interpreting natural language. These difficulties can be classified into specific phenomena – such as scope ambiguity, anaphora, ellipsis and presuppositions. Historically, different phenomena have been explored within different frameworks, based upon different philosophical and methodological foundations. The nature of these frameworks, and how they are formulated, has an impact on whether a given analysis is computationally feasible. Thus the topic of computational semantics can be seen to be concerned with the analysis of semantic phenomena within computationally feasible frameworks...

2.1 A standard approach
In general it is difficult to reason directly in terms of sentences of natural language. There have been attempts to produce proof-theoretic accounts of sentential reasoning (for example, Zamansky et al., 2006; Francez & Dyckhoff 2007), but it is more usual to adopt a formal language, either a logic or some form of set theory, and then translate natural language expressions into that formal language. In the context of computational semantics, that means a precise description of an algorithmic translation rather than some intuitive reformulation of natural language. Such translations usually appeal to a local principle of compositionality. This can be characterized by saying that the meaning of an expression is a function of the meaning of its parts...

2.2 Basic types
When considering the representations of nouns, verbs, and sentences as properties, relations, and propositions respectively, we may have to pay attention to the nature of the permitted arguments. For example, we may have: properties of individuals; relationships between individuals; relationships between individuals and propositions (such as statements of belief and knowledge); and, in the case of certain modifiers, relations that take properties as arguments to give a new property of individuals. Depending upon the choice of permitted arguments, and how they are characterized, there can be an impact on the formal power of the underlying theory. This is of particular concern for a computational theory of meaning: if the theory is more powerful than first-order logic, then some valid conclusions will not be derivable by computational means; such a logic is said to be incomplete, which corresponds with the notion of decidability (Section 1, and Section 1.2 of Chapter 2, COMPUTATIONAL COMPLEXITY IN NATURAL LANGUAGE)...

2.3 Model theory and proof theory
There are two ways in which traditional formal semantic accounts of indicatives have been characterized. First, we may be interested in evaluating the truth of indicatives (or at least their semantic representation) by evaluating their truth conditions with respect to the world (or, more precisely, some formal representation or model of a world). This can be described as model-theoretic semantics. Model-theoretic accounts are typically formulated in set theory. Set theory is a very powerful formalism that does not lend itself to computational implementation. In practice, the full power of set theory may not be exploited. Indeed, if the problem domain itself is finite in character, then an effective implementation should be possible regardless of the general computational properties of the formal framework (see Klein 2006 for example).

On the second characterization of formal semantic accounts, the goal is to formalize some notion of inference or entailment for natural language. If one expression in natural language entails another, then we would like that relation to be captured by any formalization that purports to capture the meaning of natural language. This can be described as proof-theoretic semantics. Such rules may lend themselves to fairly direct implementation (see for example van Eijck and Unger (2004); Ramsay (1995); Bos and Oka (2002), the last of which supplements theorem proving with model building).

Although a proof-theoretic approach may seem more appropriate for computational semantics, the practical feasibility of general theorem proving is open to question. Depending on the nature of the theory, the formalization may be undecidable. Even with a decidable or semi-decidable theory, there may be problems of computational complexity, especially given the levels of ambiguity that may be present (Monz and de Rijke 2001)...


(2013-04-24). The Handbook of Computational Linguistics and Natural Language Processing (Blackwell Handbooks in Linguistics) (pp. 394 - 402). Wiley. Kindle Edition.
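
To make the excerpt's "compositionality" and finite-model points a little more concrete, here is a toy sketch. It is not from the handbook: the three-word sentence pattern, the tiny lexicon, and every name in it (`LEXICON`, `DETERMINERS`, `meaning`) are assumptions invented for illustration. Nouns and verbs denote sets of individuals, determiners denote functions over those sets, and the truth value of a sentence is computed from the meanings of its parts.

```python
# Hypothetical finite model: a small domain of individuals and word meanings.
DOMAIN = {"ann", "bob", "cat"}
LEXICON = {
    "doctor": {"ann", "bob"},        # a noun denotes a set of individuals
    "patient": {"cat"},
    "codes": {"ann"},                # an intransitive verb also denotes a set
    "sleeps": {"ann", "bob", "cat"},
}
# Every denotation must stay inside the model's domain.
assert all(members <= DOMAIN for members in LEXICON.values())


def every(noun_set):
    # "every N V" is true when the noun set is a subset of the verb set.
    return lambda verb_set: noun_set <= verb_set


def some(noun_set):
    # "some N V" is true when the two sets share at least one individual.
    return lambda verb_set: bool(noun_set & verb_set)


DETERMINERS = {"every": every, "some": some}


def meaning(sentence):
    """Truth value of a 'Det Noun Verb' sentence, built from its parts."""
    det, noun, verb = sentence.lower().split()
    return DETERMINERS[det](LEXICON[noun])(LEXICON[verb])


print(meaning("Every doctor codes"))   # False: bob is a doctor who does not code
print(meaning("Some doctor codes"))    # True: ann does
```

Because the domain is finite, evaluation always terminates, which is the practical loophole the excerpt mentions. It also shows how far this is from parsing a real prose argument: anything outside the rigid three-word pattern simply fails.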
Most of what I'm finding thus far is a lot of jargon-laden academic abstraction, none of it going to answering my core NLP question: can we develop code capable of accurately analyzing the logic in prose arguments -- the aggregate "semantic" "meanings" comprising a proffer? This book, notwithstanding its 801 pg. heft, frequently begs off with "beyond the scope" apologies.
Perhaps the difficulties are simply too numerous and imposing to surmount (as of today, anyway) in light of the innumerable ways to state any given prose argument -- ranging from the utterly inelegant (e.g., Eustis) to the eloquently evocative (e.g., Dennett), from the methodically Socratic/trial-lawyer-like to the rambling, unfocused, and redundant, from the fastidiously lexically and grammatically precise to the sloppily mistake-ridden, from the explicit and accessible to the obtusely inferential...
But, then,
"Computers will understand sarcasm before Americans do."  - Geoffrey Hinton
There's certainly a thriving international academic community energetically whacking away at this stuff.


From one of the "invited papers" (pdf) in this edition:



UPDATE

Good article.
How Language Led To The Artificial Intelligence Revolution
Dan Rowinski

In 2013 I had a long interview with Peter Lee, corporate vice president of Microsoft Research, about advances in machine learning and neural networks and how language would be the focal point of artificial intelligence in the coming years.

At the time the notion of artificial intelligence and machine learning seemed like a “blue sky” researcher’s fantasy. Artificial intelligence was something coming down the road … but not soon.
I wish I had taken the talk more seriously.

Language is, and will continue to be, the most important tool for the advancement of artificial intelligence. In 2017, natural language understanding engines are what drive the advancement of bots and voice-activated personal assistants like Microsoft’s Cortana, Google Assistant, Amazon’s Alexa and Apple’s Siri. Language was the starting point and the locus of all new machine learning capabilities that have come out in recent years...
____________

More to come...

Tuesday, May 23, 2017

"Assuming / Despite / If / Then / Therefore / Else..." Could AI do "argument analysis?"


When I was a kid in grade school, back prior to indoor plumbing, it was just broadly referred to as "reading comprehension" -- "What was the author's main point? Did she provide good evidence for her point of view? Do you agree or disagree with the author's conclusion? Why? Explain..."

The oral equivalent was taught in "debate teams" prep. #NLP

Now along comes the part of "AI" (Artificial Intelligence) technology R&D known by its top-level acronym "NLP" (Natural Language Processing). We see increasing discourse on developments in "Machine Learning," "Deep Learning," "Natural Language Generation" (NLG) and "Natural Language Understanding" (NLU).
There's been a good bit of chatter of late in the Health IT news about the asserted utility of NLP. See here as well.
I am interested in particular in the latter (NLU), most specifically as it pertains to rational "argumentative" discourse (mostly of the written type). "Critical Thinking" comes to mind (I was lucky to get to teach it for a number of years as an adjunct faculty member). I was subsequently accorded the opportunity to teach a graduate seminar in the higher-level "Argument Analysis."

From my grad seminar syllabus:
We focus on effective analysis and evaluation of arguments in ordinary language. The "analysis" part involves the process of getting at what is truly being argued by a proponent of a position on an issue. Only once we have done that can we begin to accurately assess the relative merits of a proposition—the "evaluation" phase. These skills are essential to grasp if we are to become honest and constructive contributors to debate and the resolution of issues.

Our 24/7 global communications civilization is awash in arguments ranging from the trivial to grand themes of moral import. Advocates of every stripe and theme pepper us relentlessly with persuasion messages ranging from the "short and sweet" to the dense and inscrutable. We have more to consider and evaluate than time permits, so we must prioritize. This we often do by making precipitous snap judgments—"Ready-Shoot-Aim"—which then frequently calcify into prejudice. The sophistication and nuance of language enables a savvy partisan to entice us into buying into an argument perhaps not well supported by the facts and logic...
I first encountered "Argument Analysis" in the fall of 1994 as an "Ethics & Policy Studies" graduate student myself. I chose for my first semester paper an analytic deconstruction of the PNHP 1994 JAMA paper "A Better-Quality Alternative: Single-Payer National Health System Reform."

The first two opening paragraphs:
MANY MISCONSTRUE US health system reform options by presuming that "trade-offs" are needed to counter-balance the competing goals of increasing access, containing costs, and preserving quality. Standing as an apparent paradox to this zero-sum equation are countries such as Canada that ensure access to all at a cost 40% per capita less, with satisfaction and outcomes as good as or better than those in the United States. While the efficiencies of a single-payer universal program are widely acknowledged to facilitate simultaneous cost control and universal access, lingering concerns about quality have blunted support for this approach.
Quality is of paramount importance to Americans. Opponents of reform appeal to fears of diminished quality, warning of waiting lists, rationing, and "government control." Missing from more narrow discussions of the accuracy of such charges is a broader exploration of the quality implications of a universal health care program. Conversely, advocates of national health insurance have failed to emphasize quality issues as key criteria for reform, often assuming that we have "the best medical services in the world." They portray reform primarily as extending the benefits of private insurance to those currently uninsured, with safeguards added to preserve quality.
For the "analysis" phase I undertook to examine and "flowchart" the logic of the subordinate arguments in the 49 paragraphs of assertions comprising the PNHP article, numbering every argument statement as "paragraph(n), sentence(n.n), and sub-sentence truth-claim clause(n.n.a,b,c...) where warranted," as evident by close reading of the text. My full (pdf) copy of the paper is parked here.

Dotted lines denote a "despite" (a.k.a. "notwithstanding") statement, whereas solid lines depict "because-therefore" premise-to-conclusion movement in the direction of the arrowheads.
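The numbering scheme and the two arrow types amount to a small labeled graph, so here is a minimal Python sketch of how such a map might be stored. The names `parse_claim_id` and `ArgumentMap` are hypothetical. The 1.1.a/1.1.b split follows the alternate reading of sentence 1.1 discussed in this post. The 1.3.a/1.3.b labels are made up for illustration and are not taken from the actual flowchart.

```python
import re
from dataclasses import dataclass, field

# Claim IDs follow the scheme described in the post:
# paragraph "n", sentence "n.n", optional clause letter "n.n.a".
_ID_PATTERN = re.compile(r"^(\d+)(?:\.(\d+)(?:\.([a-z]))?)?$")


def parse_claim_id(claim_id):
    """Return (paragraph, sentence, clause); missing parts come back as None."""
    match = _ID_PATTERN.match(claim_id)
    if match is None:
        raise ValueError(f"not a valid claim id: {claim_id!r}")
    paragraph, sentence, clause = match.groups()
    return int(paragraph), (int(sentence) if sentence else None), clause


@dataclass
class ArgumentMap:
    claims: dict = field(default_factory=dict)  # claim id -> claim text
    edges: list = field(default_factory=list)   # (source id, target id, kind)

    def add_claim(self, claim_id, text):
        parse_claim_id(claim_id)  # reject malformed ids early
        self.claims[claim_id] = text

    def link(self, source_id, target_id, kind):
        # "because" = solid arrow (premise to conclusion)
        # "despite" = dotted arrow (a.k.a. "notwithstanding")
        if kind not in ("because", "despite"):
            raise ValueError(f"unknown edge kind: {kind!r}")
        for claim_id in (source_id, target_id):
            if claim_id not in self.claims:
                raise KeyError(claim_id)
        self.edges.append((source_id, target_id, kind))

    def sources(self, target_id, kind):
        """IDs of claims pointing at target_id with the given edge kind."""
        return [s for s, t, k in self.edges if t == target_id and k == kind]


amap = ArgumentMap()
amap.add_claim("1.1.a", "Many misconstrue US health system reform options")
amap.add_claim("1.1.b", "presuming that trade-offs are needed")
amap.link("1.1.b", "1.1.a", "because")

# Illustrative labels only (assumption, not from the original flowchart).
amap.add_claim("1.3.a", "single-payer efficiencies are widely acknowledged")
amap.add_claim("1.3.b", "lingering concerns about quality have blunted support")
amap.link("1.3.a", "1.3.b", "despite")

print(amap.sources("1.1.a", "because"))
print(amap.sources("1.3.b", "despite"))
```

Keeping the edge kind as a plain string makes it easy to add other relations later, though nothing here attempts the hard part, which is deciding where one truth claim ends and the next begins.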

It was tedious. The bulk of the first 25 pages of the 56-page paper comprised this analytic "flowcharting" visualization, helpful for what the late Steve Covey would characterize as a crucial "seek first to understand" effort. The remaining 31 pages focused on my (in large measure subjective) critical evaluation of the logic and evidence provided by the authors.
BTW: I'm certain I didn't get everything exactly right on the "analysis" side (or the eval side, for that matter). It was my first run at this type of thing. And, I had a second course to deal with at the time ("History of Ethics," 11 required texts) and was still working full-time at my Medicare QIO job.
Look at sentence 1.1, for example. You could nit-pick my decision, by splitting it up into "b" and "a" because-therefore clauses. Because "presuming trade-offs are needed," therefore "Many misconstrue..." Not that it'd have made a material difference in the analysis, but, still.
UPDATE: per the topic of my 1994 paper, Dr. Danielle Ofri in the news:
Americans Have Realized They Deserve Health Care
How long until they accept that the only way to guarantee it is through single-payer?
I have a good 100 hours or so in that one grad school paper. Imagine trying to do that to an entire book. Utterly impractical. So, we mostly suffer our "confirmation bias" and similar heuristic afflictions and jump to premature conclusions -- the bells we can't un-ring.

Hmmm...


Could we develop an AI NLU "app" for that? (I don't underestimate the difficulty, given the myriad fluid nuances of natural language. But, still...)
Thinking about NLP applicability to digital health infotech (EHRs), the differential dx SOAP method (Subjective, Objective, Assessment, and Plan) is basically an argument process, no? You assemble and evaluate salient clinical evidence (the "S" and the "O" data, whether numerical, encoded, or lexical narrative), which point in the aggregate to a dx conclusion and tx decision (the "A" and the "P"). I guess we'll see going forward whether applied NLP has any material net additional utility in the dx arena, or whether it will be just another HIT R&D sandbox fad.
Logic visualization software is not exactly news. In the late 80's I developed an instrumentation statistical process control program for the radiation lab where I worked in Oak Ridge -- the "IQCstats" system (pdf). Below is one page of the 100 or so comprising the logic flowcharts set included in my old 2" bound "User and Technical Guide" manual.


The flowcharts were generated by an "app" known as "CLEAR," which parsed my source code logic and rendered a complete set of flowcharts.

While "critical evaluation" of arguments proffered in ordinary language might not lend itself to automated digital assessment (human "judgments"), mapping the "Assuming / Despite / If / Then / Therefore / Else" logic might indeed be do-able in light of advances in "Computational Linguistics" (abetted by our exponentially increasing availability of ever-cheaper raw computing power).
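To make the "Assuming / Despite / If / Then / Therefore / Else" idea concrete, here is a deliberately naive Python sketch that splits a sentence at those cue words and tags each piece with a role. The role names, the `MARKERS` table, and `tag_clauses` are all assumptions made up for illustration, and the example sentence is invented. Real computational-linguistics tooling would use a parser rather than keyword matching.

```python
import re

# Cue word -> rough logical role. Both columns are illustrative assumptions.
MARKERS = {
    "assuming": "ASSUMPTION",
    "despite": "CONCESSION",
    "notwithstanding": "CONCESSION",
    "because": "PREMISE",
    "if": "CONDITION",
    "then": "CONSEQUENT",
    "therefore": "CONCLUSION",
    "else": "ALTERNATIVE",
    "otherwise": "ALTERNATIVE",
}

_PATTERN = re.compile(r"\b(" + "|".join(MARKERS) + r")\b", re.IGNORECASE)


def tag_clauses(sentence):
    """Split a sentence at cue words; return a list of (role, clause) pairs."""
    matches = list(_PATTERN.finditer(sentence))
    tagged = []

    # Any text before the first cue word carries no explicit marker.
    first_start = matches[0].start() if matches else len(sentence)
    lead = sentence[:first_start].strip(" ,;.")
    if lead:
        tagged.append(("UNMARKED", lead))

    # Each cue word owns the text up to the next cue word (or sentence end).
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(sentence)
        clause = sentence[match.end():end].strip(" ,;.")
        if clause:
            tagged.append((MARKERS[match.group(1).lower()], clause))
    return tagged


example = "If costs fall and access rises, then trade-offs are not required."
for role, clause in tag_clauses(example):
    print(role, "|", clause)
```

The obvious weakness: a cue word swallows everything up to the next cue word, so "Despite X, Y" comes back as one long concession with the main clause Y buried inside it. That is roughly where keyword matching stops and actual syntactic parsing has to take over.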

Below, my graphical analogy for the fundamental unit of "argument" (a.k.a. "truth claim").




Any complex argument arises from assemblages of the foregoing "atomic" and "molecular" "particles" (once you've weeded through and discarded all of the "noise").
I should add that most of what I'm interested in here goes to "informal/propositional logic" in ordinary language. Formal syllogistic logic (e.g., formal deductive "proofs") comprises a far smaller subset of what we humans do in day-to-day reasoning.
English language discourse, recall, beyond the smaller "parts of speech," is comprised of four sentence types:
  1. Declarative;
  2. Interrogative;
  3. Imperative;
  4. Exclamatory.
We are principally interested in the subset of declaratives known as "truth claims" -- claims in need of evaluation prior to acceptance -- though we also have to be alert to the phony "interrogative" known as the "loaded question," i.e., an argument insinuation disingenuously posed as a "have-you-stopped-beating-your-wife" type of query. (Then there's also stuff like subtle inferences, ambiguities, sarcasm, etc., that might elude AI/NLU.)
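As a first-pass filter for the four sentence types, something like the following Python sketch might do. It is a crude heuristic under stated assumptions: the `IMPERATIVE_STARTERS` word list and the `LOADED_PATTERNS` phrases are hypothetical placeholders, and terminal punctuation is trusted more than it deserves.

```python
import re

# Hypothetical starter verbs; a real system would use part-of-speech tagging.
IMPERATIVE_STARTERS = {"look", "imagine", "see", "check", "consider", "stay"}

# Hypothetical cue phrases for "loaded" questions that smuggle in a premise.
LOADED_PATTERNS = [
    re.compile(r"^have you stopped\b", re.IGNORECASE),
    re.compile(r"^when did you stop\b", re.IGNORECASE),
]


def classify_sentence(sentence):
    """Return 'declarative', 'interrogative', 'imperative', or 'exclamatory'."""
    s = sentence.strip()
    if not s:
        raise ValueError("empty sentence")
    if s.endswith("?"):
        return "interrogative"
    if s.endswith("!"):
        return "exclamatory"
    first_word = re.match(r"[A-Za-z']+", s)
    if first_word and first_word.group(0).lower() in IMPERATIVE_STARTERS:
        return "imperative"
    return "declarative"


def is_loaded_question(sentence):
    """Flag questions that open with one of the known loaded-question cues."""
    s = sentence.strip()
    return s.endswith("?") and any(p.search(s) for p in LOADED_PATTERNS)


samples = [
    "Quality is of paramount importance to Americans.",
    "Could we develop an app for that?",
    "Imagine trying to do that to an entire book.",
    "Have you stopped beating your wife?",
]
for text in samples:
    print(classify_sentence(text), is_loaded_question(text), "|", text)
```

Only the sentences that come back "declarative" (plus any flagged loaded questions) would move on to truth-claim analysis. Sarcasm, insinuation, and ambiguity sail right past a filter like this.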

NLP AND LINGUISTICS

It occurs to me that, notwithstanding my longstanding chops on the verbal/written side, I've never had any formal study in "linguistics," much less its application in NLP. Time to start reading up.

Introduction
Natural languages are the languages which have naturally evolved and used by human beings for communication purposes, For example Hindi, English, French, German are natural languages.  Natural language processing or NLP (also called computational linguistics) is the scientific study of languages from computational perspective. natural language processing (NLP) is a field of computer science and linguistics concerned with the interactions between computers and human (natural) languages. Natural language generation systems convert information from computer databases into readable human language. Natural language understanding systems convert samples of human language into more formal representations such as parse trees or first order logic that are easier for computer programs to manipulate. Many problems within NLP apply to both generating and understanding; for example, the computer must  be able to model morphology (the structure of words) in order to understand an English sentence, and a model of morphology is also needed for producing a grammatically correct English sentence, i.e., natural language generator.

NLP has significant overlap with the field of computational linguistics, and is often considered a subfield of artificial intelligence. The term natural language is used to distinguish human languages (such as Spanish, Swahili, or Swedish) from formal or computer languages (such as C++, Java, or LISP).  Although NLP may end comp us both  text and speech, work on speech processing is conventionally done in a separate field.

In NLP, the techniques are developed which aim the computer to understand the commands given in natural language and perform according to it. At present, to work with computer, the input is required to be given in formal languages. The formal languages are those languages which are specifically developed to communicate  with computer and are understood by machine, e.g., FORTRAN, Pascal, etc. Obviously, to communicate with computer, the study of these formal languages is required. Understanding these languages is  cumbersome and requires additional efforts to understand these. Hence, it limits their applications in computer. As compared to this, the communication in natural language will facilitate the functioning and communication with computer easily and in user-friendly way.

 Natural language processing is a significant area of artificial intelligence because a computer would be considered intelligent  if it can understand the commands given in natural language instead of C, FORTRAN, or Pascal. Hence, with the ability of computers to understand natural language it becomes much easier to communicate with computers. Also the natural language processing can be applied as a productivity tool in applications ranging from summarization of news to translate from one language to another. Though, the surface level processing of natural languages seems to be easy the deep level processing of natural languages, understanding of implicit messages and intentions of the speaker are extremely difficult avenues...
Ya have to wonder whether that was written by a computer. Minimally, a non-native English speaker/writer.

I've also just read up on "linguistics" broadly via a couple of short books, just to survey the domain.


The real meat comes here:


801 pages of dense, comprehensive detail.
Introduction
The field of computational linguistics (CL), together with its engineering domain of natural language processing (NLP), has exploded in recent years. It has developed rapidly from a relatively obscure adjunct of both AI and formal linguistics into a thriving scientific discipline. It has also become an important area of industrial development. The focus of research in CL and NLP has shifted over the past three decades from the study of small prototypes and theoretical models to robust learning and processing systems applied to large corpora. This handbook is intended to provide an introduction to the main areas of CL and NLP, and an overview of current work in these areas. It is designed as a reference and source text for graduate students and researchers from computer science, linguistics, psychology, philosophy, and mathematics who are interested in this area.
The volume is divided into four main parts. Part I contains chapters on the formal foundations of the discipline. Part II introduces the current methods that are employed in CL and NLP, and it divides into three subsections. The first section describes several influential approaches to Machine Learning (ML) and their application to NLP tasks. The second section presents work in the annotation of corpora. The last section addresses the problem of evaluating the performance of NLP systems. Part III of the handbook takes up the use of CL and NLP procedures within particular linguistic domains. Finally, Part IV discusses several leading engineering tasks to which these procedures are applied...

(2013-04-24). The Handbook of Computational Linguistics and Natural Language Processing (Blackwell Handbooks in Linguistics) (p. 1). Wiley. Kindle Edition.
Interesting. BTW, nice summation of Computational Linguistics on the Wiki.
Computational linguistics is an interdisciplinary field concerned with the statistical or rule-based modeling of natural language from a computational perspective.

Traditionally, computational linguistics was performed by computer scientists who had specialized in the application of computers to the processing of a natural language. Today, computational linguists often work as members of interdisciplinary teams, which can include regular linguists, experts in the target language, and computer scientists. In general, computational linguistics draws upon the involvement of linguists, computer scientists, experts in artificial intelligence, mathematicians, logicians, philosophers, cognitive scientists, cognitive psychologists, psycholinguists, anthropologists and neuroscientists, among others.

Computational linguistics has theoretical and applied components. Theoretical computational linguistics focuses on issues in theoretical linguistics and cognitive science, and applied computational linguistics focuses on the practical outcome of modeling human language use...
"applied computational linguistics focuses on the practical outcome of modeling human language..."

Like, well, NLU Argument Analytics?

UPDATE: I'm hitting a mother lode of good stuff in Chapter 15 of "The Handbook..." on "computational semantics."

After getting up to speed on the technical concepts and salient details, perhaps the next step would involve learning Python.

"This book offers a highly accessible introduction to natural language processing, the field that supports a variety of language technologies, from predictive text and email filtering to automatic summarization and translation. With it, you'll learn how to write Python programs that work with large collections of unstructured text. You'll access richly annotated datasets using a comprehensive range of linguistic data structures, and you'll understand the main algorithms for analyzing the content and structure of written communication.

Packed with examples and exercises, Natural Language Processing with Python will help you:
  • Extract information from unstructured text, either to guess the topic or identify "named entities"
  • Analyze linguistic structure in text, including parsing and semantic analysis
  • Access popular linguistic databases, including WordNet and treebanks
  • Integrate techniques drawn from fields as diverse as linguistics and artificial intelligence
This book will help you gain practical skills in natural language processing using the Python programming language and the Natural Language Toolkit (NLTK) open source library. If you're interested in developing web applications, analyzing multilingual news sources, or documenting endangered languages -- or if you're simply curious to have a programmer's perspective on how human language works -- you'll find Natural Language Processing with Python both fascinating and immensely useful."
Apropos of this topic generally, a couple of prior posts of mine come to mind. See "The Great A.I. Awakening? Health Care Implications?" and "Are structured data now the enemy of health care quality?"

Tangentially, my post of July 2015 "AI vs IA: At the cutting edge of IT R&D" as well.

So, could we use digital NLU technology to passably analyze natural language arguments, rather than just turning lab data and ICD-10 codes into SOAP narratives (and the converse)?

Me and my crazy ideas. Never gonna make it into any episodes of "Silicon Valley" (NSFW).

Perhaps our Bootcamp Insta-Engineer pals at ZIPcode Wilmington could have a run at Argumentation NLU?

Seriously, how about a new subset of CL tech R&D, "NLAA" -- "Natural Language Argument Analysis?"
__

UPDATE: RECENT NLG REPORTAGE

From Wired:
What News-Writing Bots Mean for the Future of Journalism
Joe Keohane, 02.16.17

WHEN REPUBLICAN STEVE King beat back Democratic challenger Kim Weaver in the race for Iowa’s 4th congressional district seat in November, The Washington Post snapped into action, covering both the win and the wider electoral trend. “Republicans retained control of the House and lost only a handful of seats from their commanding majority,” the article read, “a stunning reversal of fortune after many GOP leaders feared double-digit losses.” The dispatch came with the clarity and verve for which Post reporters are known, with one key difference: It was generated by Heliograf, a bot that made its debut on the Post’s website last year and marked the most sophisticated use of artificial intelligence in journalism to date.

When Jeff Bezos bought the Post back in 2013, AI-powered journalism was in its infancy. A handful of companies with automated content-generating systems, like Narrative Science and Automated Insights, were capable of producing the bare-bones, data-heavy news items familiar to sports fans and stock analysts. But strategists at the Post saw the potential for an AI system that could generate explanatory, insightful articles. What’s more, they wanted a system that could foster “a seamless interaction” between human and machine, says Jeremy Gilbert, who joined the Post as director of strategic initiatives in 2014. “What we were interested in doing is looking at whether we can evolve stories over time,” he says...
More and more examples abound on the NLG side of things. Just Google "written by AI."

UPDATE: SPEAKING OF NEWS

I cited this excellent book a while back.


Re: Chapter 8, "Computational Journalism"
In 2009, Fred Turner and I wrote: “What is computational journalism? Ultimately, interactions among journalists, software developers, computer scientists and other scholars over the next few years will have to answer that question. For now though, we define computational journalism as the combination of algorithms, data, and knowledge from the social sciences to supplement the accountability function of journalism.”

Hamilton, James T. (2016-10-10). Democracy’s Detectives (Kindle Locations 10750-10753). Harvard University Press. Kindle Edition.
NLP seems an obvious fit, eh?
__

UPDATE

Check this out:

Study the intersection of language and technology and place yourself at the forefront of a dynamic field by earning a Master of Science in Computational Linguistics from the University of Washington.

The powerful connections between text, human speech and computer technology are having a growing impact on society and our everyday lives. In this program, you can explore a discipline that has applications in a wide variety of fields – including business, law and medicine – and incorporates such diverse technologies as predictive text messaging, search engines, speech recognition, machine translation and dialogue systems...
Nice. That's something I would do in a heartbeat. Here's their web link.


Seattle is a special place for me to begin with. Both of my daughters were born there. (I'm writing this while sitting with my younger daughter as she goes through round 2 of her chemo tx.)

Seattle. Those were the days...
(With much gratitude for the Statute of Limitations.)
I still have numerous rock-solid friendships there. Sadly, recently lost one. He succumbed after a terrible 11 battle with Mantle Cell Lymphoma. He was a younger brother to me. Without qualification one of the best drummers on the planet. He could've played for Sting.

CODA

From The Atlantic:
Rethinking Ethics Training in Silicon Valley
“If technology can mold us, and technologists are the ones who shape that technology, we should demand some level of ethics training for technologists.”

- Irina Raicu
I work at an ethics center in Silicon Valley.

I know, I know, “ethics” is not the first word that comes to mind when most people think of Silicon Valley or the tech industry. It’s probably not even in the top 10. But given the outsized role that tech companies now play, it’s time to focus on the ethical responsibilities of the technologists who help shape our lives...
Yeah, I know, the jokes just write themselves. "Silicon Valley" and "ethics" in the same sentence?

I was not even aware of this place.



I will have to study up on them and report further.

Apropos, see my 2015 post "The old internet of data, the new internet of things, and 'Big Data' and the evolving internet of YOU."
____________

More to come...

Wednesday, May 17, 2017

An "Innovation" Oopsie


So, I got a new Twitter Follow, and, as is my custom after just a quick bit of relevance and authenticity vetting, I reciprocated.


OK, I'll bite. Curious, despite knowing that the requisite "registration" form meant that I'd be getting pitched thereafter, despite my being a mere solo independent ankle-biting health care space blogger.

Corporate "Change Management" (remember that?) is by now long passé. We need Disruptive, Transformational Innovation Management.


Suppressing just a slight waft of a Jim Kwik Moment, I attempted to register.


Okeee-dokeee, then.

Below: at the outset of this encounter (after they got the registration form to work), this was rich.


Whatever, bro's. I thereupon substituted an email alias that bounces off one of my websites to my Comcast ISP inbox, and that worked, notwithstanding that it's the same destination.

Didn't matter (see above). Zip. Zilch. Nada. Nyet.

I expect they'll fix this mess straight away. I had other stuff to tend to.

My Twitter relationship with these peeps may well quickly go the way of my short-lived ZIPcode Wilmington mutual hang. They dropped me like a broken radioisotope container.

BTW: This Jeremiah Owyang fellow is all over YouTube.


OK, my Jim Kwik Moment is allayed. Notwithstanding the profuse Silicon Valley jargon.

It will not surprise me one whit to soon see an Irony-Free Zone pitch for "Creativity Management Software."

BTW: On "disruption," see my 2017 New Year's Day post.
__

OTHER STUFF

Trying to get back on pace with my reading.



The "Handbook of Computational Linguistics..." in particular will make your head spin.

Specifically interested in potential apps for "NLU" (Natural Language Understanding). The "Natural Language Generation" part of NLP is way further along (e.g., turning "data" into narrative text/language via "AI").

May have to dust off my ancient coding chops and learn Python.


Stay tuned.
____________

More to come...

Monday, May 15, 2017

The dx from Hell: ICD-10 code C25.9

Pancreatic Cancer
Pancreatic cancer (PC) has the highest case-fatality rate of any of the major cancers, both in the US and worldwide. The disease is difficult to detect, rapidly metastatic, resistant to treatment, and often results in death. Pancreatic adenocarcinoma is also one of the most difficult cancers to study. Case-control studies may be inaccurate because patients with PC often die within weeks of their diagnosis. At the same time, prospective studies of PC are challenging due to the relative rarity of this type of cancer (~ 1% lifetime risk) and low prevalence due to short life expectancy. Consequently, PC etiology is often investigated by analyzing data from large-scale prospective studies or clinical trials for diseases other than PC, but limited numbers of cases and methodological heterogeneity (e.g., no or incomplete histological verification) affect the validity of these results.

The etiology of PC is widely acknowledged to be multi-factorial. The incidence of PC is greater in males than in females, and higher in Blacks than Whites. According to SEER 17 areas data, the age-adjusted incidence of PC in 2006 per 100,000 individuals was 11.61 (95%CI 11.34-11.88) for Whites and 15.57 (95%CI 14.57-16.62) for Blacks, with 16.56 for Black men (95% CI 15.08-18.61). Environmental or host risk factors shown to be associated with PC include cigarette smoking, obesity, type II diabetes mellitus, chronic pancreatitis, physical inactivity and blood groups A or B. Dietary risks may be related to low fruit and vegetable intake and increased intake of high-heat cooked meats (i.e., grilled/fried animal protein sources). Two separate, recent studies linked pancreatic cancer to high consumption of carbohydrates and alcohol. Unfortunately, these common risk factors have small effect sizes, so it is difficult to produce highly accurate risk models. For example, smoking yields a risk ratio of approximately 2. The risk of developing PC is recognized as being exceptionally elevated in certain genetically predisposed families (e.g., hereditary pancreatitis,), but only about 10% of all PC cases can be attributed to genetic causes...
From pancreas.org

Lightning strikes yet again. I've alluded briefly to this new circumstance here and there in prior posts.

On March 29th, the radiologist's report from a CT scan done at a Kaiser Permanente facility indicated that my younger daughter Danielle is afflicted with Stage IV metastatic pancreatic cancer, a finding confirmed by a subsequent liver biopsy. (She's given permission to go public with this horrific news.)

Our world has been turned upside down ever since. My KHIT efforts have been significantly hamstrung. Danielle started Folfirinox chemo (after first being accepted into and then, in the wake of some subsequent disqualifyingly elevated adverse labs, excluded from a UCSF clinical trial). Her severe side-effects reaction to the first chemo round landed her in the hospital, where my wife and I spent all of last week by her bedside in shifts.


Above, Sissy (top) and Danielle (bottom) in 1974 in Seattle, the year I got a divorce and custody of both of them. The backstory on my salt and pepper kids.

Needless to say, we are all reeling. One of her friends started a crowdfunding page for her, for which we could not be more grateful. She will need every dime. Danielle's out-of-pocket expenses to date alone are mind-boggling (KP membership notwithstanding). If her illness doesn't kill her, it will most certainly bankrupt her -- in relatively short order. She will shortly be the former Executive Director of The Stepping Stones Project (they've generously given her extended "medical leave," which, though, necessarily runs out by month's end).

We've moved her back home, and I will be breaking her lease and packing up and stowing her apartment shortly, and tending to the myriad piling-up logistical and legal assistance details.
And, get excused from the jury duty summons I just got.
So, yeah, I'm a bit behind the curve. My life the past seven weeks has been an endless recursion of "oh, SHIT!" moments.

Prior to this news, I'd been trying to finish out my "One in Three" book. Gonna have to scrap the title and cover photo and broaden the scope.

1980, Knoxville
 __

AUGUST 13TH UPDATE
____________

More to come...