Above: yeah, that'd be @BobbyGcyborg in the wake of the Singularity.
Lotta good stuff in Science Magazine of late. To wit:
When Iosif Gidiotis began his doctoral studies in educational technology this year, he was intrigued by reports that new tools powered by artificial intelligence (AI) could help him digest the literature in his discipline. With the number of papers burgeoning—across all of science, close to 3 million were published last year—an AI research assistant “sounds great,” says Gidiotis, who is studying at the KTH Royal Institute of Technology. He hoped AI could find more relevant papers than other search tools and summarize their highlights.
He experienced a bit of a letdown. When he tried AI tools such as one called Elicit, he found that only some of the returned papers were relevant, and Elicit’s summaries weren’t accurate enough to win him over. “Your instinct is to read the actual paper to verify if the summary is correct, so it doesn’t save time,” he says. (Elicit says it is continuing to improve its algorithms for its 250,000 regular users, who in a survey credited it with saving them 90 minutes a week in reading and searching, on average.)
Created in 2021 by a nonprofit research organization, Elicit is part of a growing stable of AI tools aiming to help scientists navigate the literature. “There’s an explosion of these platforms,” says Andrea Chiarelli, who follows AI tools in publishing for the firm Research Consulting. But their developers face challenges. Among them: The generative systems that power these tools are prone to “hallucinating” false content, and many of the papers searched are behind paywalls. Developers are also looking for sustainable business models; for now, many offer introductory access for free. “It is very difficult to foresee which AI tools will prevail, and there is a level of hype, but they show great promise,” Chiarelli says.
Like ChatGPT and other large language models (LLMs), the new tools are “trained” on large numbers of text samples, learning to recognize word relationships. These associations enable the algorithms to summarize search results. They also identify relevant content based on context in the paper, yielding broader results than a query that uses only keywords. Building and training an LLM from scratch is too costly for all but the wealthiest organizations, says Petr Knoth, director of CORE, the world’s largest repository of open-access papers. So Elicit and others use existing open-source LLMs trained on a wide array of texts, many nonscientific...
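Under the hood, that "broader results than keywords" claim is easy to illustrate. Below is a toy Python sketch of my own (the embed() function is a hypothetical stand-in for whatever vector model a tool like Elicit actually uses, not anybody's real API): a keyword match whiffs on an obviously relevant paper that embedding similarity catches.

```python
# Toy contrast: keyword search vs. embedding ("semantic") search.
# NOTE: embed() is a hypothetical stand-in for a real text-embedding model;
# here it just counts a few hand-picked concept words so the demo runs.

import math

CONCEPTS = ["drug", "medication", "trial", "heart", "cardiac"]
SYNONYMS = {
    "drug": ["drug", "medication", "pharmaceutical"],
    "medication": ["medication", "drug"],
    "trial": ["trial", "study", "experiment"],
    "heart": ["heart", "cardiac"],
    "cardiac": ["cardiac", "heart"],
}

def embed(text):
    """Hypothetical embedding: one dimension per concept, counting loose
    synonyms together. A real model learns these associations from data."""
    t = text.lower()
    return [sum(t.count(w) for w in SYNONYMS[c]) for c in CONCEPTS]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

papers = [
    "A randomized study of a new cardiac medication",
    "Survey of asphalt road maintenance costs",
]
query = "heart drug trial"

# Keyword search: no query word appears verbatim in either paper.
print([any(w in p.lower() for w in query.split()) for p in papers])  # [False, False]
# Embedding search: the cardiac paper scores high anyway.
print([round(cosine(embed(query), embed(p)), 2) for p in papers])    # [1.0, 0.0]
```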
So, of course, I hopped on over to the Elicit website to rummage around and check things out.
- Good reasoning is reasoning that reliably arrives at true beliefs and good decisions.
- Good reasoning is rare but essential for government, companies, and research.
- Advanced AI systems are an opportunity to radically scale up good reasoning.
- Getting this right is crucial for guiding the transition to a society that runs on machine thinking.
"A society that runs on machine thinking." Hmmm... what could possibly go wrong?
What is good reasoning? We want people, organizations, and machines to arrive at:
- True beliefs
- Good decisions
Good reasoning is reasoning that reliably leads to these outcomes.
Yeah. No pick with any of that.
The status quo: Right now, good reasoning is rare. Bad reasoning is everywhere:
- Governments allocate resources for pandemics and similar risks based on biases, political pressures, or vested interests
- Courts judge cases based on selective, fallacious, or unfair use of evidence, arguments, and law
- Companies decide their strategies based on overconfidence, underestimation, or complacency
- Investors allocate capital based on herd mentality, hype, or fear
- Researchers do work that is unimpactful or harmful, and manipulate or hide data
But, "why do humans reason?" Well, predominantly to win the argument (Sperber & Mercier's "Adaptive Utility"). Should verifiable truth happen along the way, so much the better.
I sent an email to my son Nick and his lifelong best friend Nate:
Yo, FYI: As scientists face a flood of papers, AI developers aim to help | Science | AAAS
This stuff is of serious interest to me. One of my long-term questions has been, “can AI do argument analysis?”
What is “argument analysis?” Actually the full phrase is “argument analysis and evaluation.” I got hip to it in graduate school at UNLV. The “analysis” part is basically where you “flowchart” the logic in a prose piece making the case for some assertion of truth: all of the “if/then/therefore/else” contingency stuff. The “evaluation” part that follows is basically then an argument of YOURS assessing the relative strengths and weaknesses of the original proffer set forth by the author. The difficulty comes when you consider that everyone has a significantly different way of expressing the same ideas in prose writing: language fluidity (not to mention the 100+ different human languages around the world these days).
See my argument analysis paper here. (pdf)
Basically, I went through the JAMA paper paragraph by paragraph, flowcharting the “logic” that comprised the sub-arguments leading to the final conclusion (took me probably 100 hours). I wrote about this also on my blog, here:
Imagine trying to do this shit manually to a full length book. You’d fucking never be heard from again.
If someone could come up with an “AI/LLM” app that could parse the reasoning in a complex scientific or technical paper (or book), or, say, a complex political policy argument, it would totally kick ass. My 3GL/4GL "structured" RDBMS programming skills are two generations out of date by now. But, logic is logic, and the same basic principles apply – I assume.
Just looking for a bit of feedback. You guys are way smart.
Pop
PS- looking into this “Elicit” company cited in the article, oh, geez, they too are trying to “make the world a better place.” 🤣
'eh?
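For what it's worth, below is roughly the data structure I have in mind when I say "flowchart the logic." A minimal, hypothetical Python sketch of my own (every name in it is mine, not anybody's shipping product): claims as nodes, inference steps as edges, plus the most trivial possible evaluation pass.

```python
# Hypothetical sketch of an "argument map": the if/then/therefore skeleton
# you'd extract from a prose argument before evaluating it.

from dataclasses import dataclass, field

@dataclass
class Claim:
    id: str
    text: str
    kind: str = "premise"    # "premise" | "intermediate" | "conclusion"

@dataclass
class Inference:
    premises: list           # Claim ids this step relies on
    conclusion: str          # Claim id this step asserts
    rule: str = "therefore"  # e.g., "therefore", "if/then", "unless"

@dataclass
class ArgumentMap:
    claims: dict = field(default_factory=dict)
    inferences: list = field(default_factory=list)

    def add(self, claim: Claim):
        self.claims[claim.id] = claim

    def link(self, premises, conclusion, rule="therefore"):
        self.inferences.append(Inference(premises, conclusion, rule))

    def unsupported_conclusions(self):
        """Evaluation, step zero: any non-premise claim nobody argues for?"""
        supported = {inf.conclusion for inf in self.inferences}
        return [c for c in self.claims.values()
                if c.kind != "premise" and c.id not in supported]

# A two-step toy argument in the spirit of the JAMA paper:
m = ArgumentMap()
m.add(Claim("P1", "Administrative overhead consumes a large share of US health spending"))
m.add(Claim("P2", "A single-payer system reduces administrative overhead"))
m.add(Claim("C1", "A single-payer system would lower overall costs", kind="conclusion"))
m.link(["P1", "P2"], "C1")
print(m.unsupported_conclusions())  # [] -> every conclusion has support
```

The hard part, of course, isn't the data structure; it's getting from fluid prose to anything this tidy, which is exactly where I'm wondering whether the LLMs can earn their keep.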
Nick, our baby, is now 40. He started at his current employer, a custom beverage packaging & distribution company based here in Baltimore, as its Operations Manager, and has of late transitioned into an IT management role. He has degrees in accounting and finance, and operations management. He and his bud Nate, a year younger, met at The Hebrew Academy in Las Vegas when Nick was 9. We're not Jewish, but we sent him there anyway. Neither are we Catholic (I'm a fully recovered Episcopalian, dilettante UU, and Secular Zen Sympathizer), but we subsequently sent Nick to Bishop Gorman High School, also in Vegas (I have a funny story about that; I'll get to it).
Nate is absurd. He's the son of Frank Sinatra's Las Vegas bassist, Seth Kimball. Nate now has a Master's in Jazz Performance, a real estate agent license, a private pilot's license, and recently transitioned to work in software engineering for Microsoft (Azure platform). We used to call him "Number Two Son." He spent more time at our house than at home.
Below: From his jazz CD. An Old Soul, our Nate.
Could AI do accurate real-world-relevant "argument analysis" (absent significant "hallucination")?
More from Science:
EDITORIAL
Science, justice, and evidence
JENNIFER MNOOKIN
Courts in the United States have increasingly relied on scientific evidence and expert testimony to help resolve questions of fact. On 1 December 2023, amendments to Federal Rule of Evidence 702 will take effect, further clarifying the court’s responsibilities as a gatekeeper for expert evidence. This update comes just a few months after the 30-year anniversary of the Supreme Court’s landmark decision on how federal judges should evaluate scientific evidence. Daubert v. Merrell Dow was hailed as a victory for the use of scientific information in the legal system and certainly cast a much-needed spotlight on scientific evidence in the courtroom. But the nuanced and flexible nature of the “Daubert standard” has since led to substantial inconsistencies in its application. Most strikingly, it has had far more impact in civil cases than criminal cases. Daubert’s core tenet—that scientific evidence introduced in court should be adequately valid and reliable—needs to be taken just as seriously in the criminal justice system and for forensic science as it has been in civil cases.
Daubert instructs judges to be “gatekeepers” responsible for assessing the validity of the science brought to court. Previously, courts often asked only whether the science was “generally accepted” by the relevant scientific community. Because judges often treated an expert witness’s own assertion of general acceptance as adequate, the rule did not present much of a bar for admissibility. With Daubert, the onus is more squarely on the judge to assess validity. The opinion details numerous possible factors to consider (including testing, error rate, peer review, and general acceptance), but gives little truly concrete guidance and allows the court great flexibility in weighing these factors.
When the Daubert decision was handed down, many legal analysts and scientists agreed that the use of experts in court had long been a mess. Some critics lambasted judges for too often permitting expert testimony that wasn’t scientifically credible; others worried that juries were fundamentally incapable of making reasoned decisions when competing experts offered wildly different, contradictory testimony. Daubert shined a much-needed spotlight on expert evidence—but 30 years later, what has been its evidentiary impact?...
Jennifer Mnookin is chancellor and professor of law at the University of Wisconsin–Madison, Madison, WI, USA
Back in the mid-late 1980s I worked in a forensic-level environmental radiation lab in Oak Ridge as a systems programmer and QC analyst (we did a ton of litigation support and regulatory action analytics).
From JDSupra:
F.R.E. 702 12/01/23 AMENDMENTS
In April, the Supreme Court sent a list of proposed amendments to Congress that amend the Federal Rules of Evidence. Absent action by Congress, the rules go into effect December 1, 2023. The proposed amendments affect Rules 106, 615 and, relevant to this article, 702.
Rule 702 addresses testimony by an expert witness. The proposed rule reads as follows (in the original, new material is underlined and omitted matter is lined through; that markup doesn't survive here, so the change to subsection (d) is shown in brackets below):
A witness who is qualified as an expert by knowledge, skill, experience, training, or education may testify in the form of an opinion or otherwise if the proponent demonstrates to the court that it is more likely than not that:
(a) the expert’s scientific, technical, or other specialized knowledge will help the trier of fact to understand the evidence or to determine a fact in issue;
(b) the testimony is based on sufficient facts or data;
(c) the testimony is the product of reliable principles and methods; and
(d) the expert has reliably applied [amended to: “the expert’s opinion reflects a reliable application of”] the principles and methods to the facts of the case.
The proposed amendment seeks to clarify how a judge should view his or her gatekeeping role without substantively changing the rule. Assuming Congress adopts the proposed rule, the proposal amends the rule in two ways.
First, it clarifies that a court should not admit expert testimony unless the proponent demonstrates that it is “more likely than not” that the proffered testimony meets Rule 702’s admissibility requirements. The rules committee recommended the change because many courts hold that the critical question of the sufficiency of the basis for an expert’s opinions goes to the weight of the testimony, not its admissibility. Thus, to be admissible in the future, the proponent of expert testimony must demonstrate, consistent with Rule 104(a) and case law interpreting Rule 104(a), that the testimony meets admissibility requirements – i.e., meets a preponderance of the evidence standard…
Our (now-late) founder and CEO, John A. Auxier, PhD, CHP, was the nation's premier expert on radiation dose/exposure. Former Director of Industrial Health & Safety at Oak Ridge National Laboratory, he was a member of the Three Mile Island Commission, and Editor of the Health Physics Society Journal.
The governing forensic legal standard then was "the Frye Standard." One hopes that this new FRE 702 Amendment will materially improve things with respect to “Science, Justice, and Evidence” in our endless adversarial venues.
And, let us always keep in mind "Ethics."
More to come. Still studyin' up on these Elicit peeps (pdf). Smart bunch.
Introduction
The world has changed since the rise of civilization some 10,000 years ago. Look around—what do you see? You may be sitting on a chair, in front of a computer, a glass of water next to it on the table, within the walls of a house, the neighbor’s dog barking outside, cars driving by on asphalt roads built next to power lines supplying electricity for the local school or hospital. Almost every part of your environment has been shaped by humans; it is there because we intend for it to be there, or at least approve of its existence.
This has arguably been a change for the better—as indicated by the existence of the notion of progress—even if not without exceptions, and not without contention. Basic human needs such as food, shelter, health, and physical safety are provided to a degree far beyond hunter-gatherer times. What is responsible for this change? While it may be difficult to pin down the relative contributions of different causes and enabling factors, it is safe to say that our capability for thought was a necessary ingredient…
Chapter 2
Background: Probabilistic Programming
A probabilistic program is a program in a universal programming language with primitives for sampling from probability distributions, such as Bernoulli, Gaussian, and Poisson. Execution of such a program leads to a series of computations and random choices. Probabilistic programs thus describe models of the stochastic generation of results, implying a distribution on return values. Most of our examples use Church [26], a probabilistic programming language based on the stochastic lambda calculus. This calculus is universal in the sense that it can be used to define any computable discrete probability distribution [49] (and indeed, continuous distributions when encoded via rational approximation)…
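The thesis's examples are in Church, which won't mean much to most readers, so here's the same core idea as a hypothetical Python sketch of my own: ordinary code with random primitives, whose return value thereby has a probability distribution you can estimate just by running the program a lot.

```python
# A probabilistic program is ordinary code plus random primitives.
# Running it many times traces out the distribution on its return value.

import random
from collections import Counter

def flip(p):
    return random.random() < p  # Bernoulli draw

def model():
    # A tiny generative story: two coin flips and a noisy measurement.
    a = flip(0.3)
    b = flip(0.6)
    noise = random.gauss(0, 1)  # Gaussian primitive
    return (a or b) and noise > 0

runs = 100_000
counts = Counter(model() for _ in range(runs))
print({k: v / runs for k, v in counts.items()})
# P(True) should come out near (1 - 0.7 * 0.4) * 0.5 = 0.36
```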
Reasoning about Reasoning as Nested Conditioning
Hmmm... WWSS? (What Would Sapolsky Say?)
4.1 Introduction
Reasoning about the beliefs, desires, and intentions of other agents—theory of mind—is a central part of human cognition and a critical challenge for human-like artificial intelligence. Reasoning about an opponent is critical in competitive situations, while reasoning about a compatriot is critical for cooperation, communication, and maintaining social connections. A variety of approaches have been suggested to explain humans’ theory of mind. These include informal approaches from philosophy and psychology, and formal approaches from logic, game theory, artificial intelligence, and, more recently, Bayesian cognitive science.
Many of the older approaches neglect a critical aspect of human reasoning (uncertainty), while recent probabilistic approaches tend to treat theory of mind as a special mechanism that cannot be described in a common representational framework with other aspects of mental representation. In this chapter, we discuss how probabilistic programming, a recent merger of programming languages and Bayesian statistics, makes it possible to concisely represent complex multi-agent reasoning scenarios. This formalism, by representing reasoning itself as a program, exposes an essential contiguity with more basic mental representations.
Probability theory provides tools for modeling reasoning under uncertainty: distributions formalize agents’ beliefs, conditional updating formalizes updating of beliefs based on evidence or assertions. This approach can capture a wide range of reasoning patterns, including induction and non-monotonic inference. In cognitive science, probabilistic methods have been very successful at capturing aspects of human learning and reasoning [92]. However, the fact that conditioning is an operation applied to such models and not itself represented in such models makes it difficult to accommodate full theory of mind: We would like to view reasoning as probabilistic inference and reasoning about others’ reasoning as inference about inference; however, if inference is not itself represented as a probabilistic model we cannot formulate inference about inference in probabilistic terms.
Probabilistic programming is a new, and highly expressive, approach to probabilistic modeling. A probabilistic program defines a stochastic generative process that can make use of arbitrary deterministic computation. In probabilistic programs, conditioning itself can be defined as an ordinary function within the modeling language. By expressing conditioning as a function in a probabilistic program, we represent knowledge about the reasoning processes of agents in the same terms as other knowledge. Because conditioning can be used in every way an ordinary function can, including composition with arbitrary other functions, we may easily express nested conditioning: we can condition any random variable, including random variables that are defined in terms of other conditioned random variables. Nested conditioning describes reasoning about reasoning and this makes theory of mind amenable to the kind of statistical analysis that has been applied to the study of mental representation more generally.
The probabilistic program view goes beyond other probabilistic views by extending compositionality from a restricted model specification language to a Turing-complete language, which allows arbitrary composition of reasoning processes. For example, the multi-agent influence diagrams proposed by Koller and Milch [45] combine the expressive power of graphical models with the analytical tools of game theory, but their focus is not on representing knowledge that players might have about other players' reasoning…
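To make "conditional updating formalizes updating of beliefs" concrete, here's a hedged Python sketch of my own (not the thesis's Church code), using rejection sampling, the slowest but most transparent way to condition: sample from the prior, keep only the runs that agree with the evidence.

```python
# Belief updating as conditioning, via rejection sampling.

import random

def flip(p):
    return random.random() < p

def prior():
    # Prior belief: 10% chance the coin is a trick coin (always heads).
    return flip(0.1)

def heads_run(trick, n):
    # Generate n tosses and check they all came up heads.
    return all(flip(1.0 if trick else 0.5) for _ in range(n))

def posterior_trick(n_heads, n_samples=20_000):
    kept = []
    while len(kept) < n_samples:
        trick = prior()
        if heads_run(trick, n_heads):  # condition on the evidence
            kept.append(trick)
    return sum(kept) / len(kept)

print(posterior_trick(5))  # ~0.78: five straight heads shifts belief a lot
# Exact Bayes: 0.1 / (0.1 + 0.9 / 32) = 0.78...
```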
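And the chapter's punchline, nested conditioning, falls out once inference is just an ordinary function you can call inside a model. Below, a toy Schelling-style coordination game, my own sketch in the spirit of (but not copied from) the thesis's examples: each agent picks a bar by conditioning on meeting a partner who reasons one level shallower.

```python
# Nested conditioning: inference about inference, to a fixed depth.

import random
from collections import Counter

def prior_choice():
    # Shared prior: the popular bar is the default draw.
    return "popular" if random.random() < 0.6 else "quiet"

def infer(model, samples=200):
    """Inference as an ordinary function: forward-sample the model
    and return the most common outcome (a crude MAP estimate)."""
    return Counter(model() for _ in range(samples)).most_common(1)[0][0]

def agent(depth):
    """Choose a bar, conditioning on actually meeting a partner who
    reasons one level shallower. Depth 0 just follows the prior."""
    if depth == 0:
        return prior_choice()
    while True:                     # rejection sampling = conditioning
        mine = prior_choice()
        partner = agent(depth - 1)  # model the other agent's reasoning
        if mine == partner:         # condition: we meet
            return mine

print(infer(lambda: agent(depth=2)))
# Each level of mutual reasoning amplifies the prior -> "popular"
```

Because the conditioning loop is just a function, it composes: an agent can condition on another agent's conditioned choices, which is precisely the "nested conditioning" of the chapter title.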
OK, then...
___
UPDATE: ARTIFICIAL PARROTS?
MORE…
Good stuff.
In the wake of studying Mustafa Suleyman's new book "The Coming Wave," I've begun interacting a bit with his Pi.ai app.
e.g., a question I posed yesterday:
Again, my 1994 grad school Argument Analysis & Eval paper (pdf).
ME
"In 1994, the journal of the American Medical Association, published a paper by an organization known as “Physicians for a National Healthcare Plan.“ In the paper, they argue for what is known as a “single payer system“ for healthcare. Can you analyze and evaluate their argument for me?"
Pi
"The 1994 JAMA paper by Physicians for a National Healthcare Plan (PNHP) argued for the adoption of a single-payer healthcare system in the US. Their main argument was that such a system would provide universal coverage, eliminate the need for private insurance, and reduce administrative costs, thereby improving access to healthcare and reducing overall costs. They proposed that a national formulary would be used to determine which prescription drugs and medical supplies would be covered, and that an expert panel would oversee the program to ensure its effectiveness. Their argument was based on the premise that all Americans should have access to quality healthcare, and that a single-payer system was the most efficient and cost-effective way to achieve this. However, their proposal was met with criticism from those who argued that it would lead to a decline in the quality of care and a loss of patient choice."
Pi.ai left me a bit less than dazzled here. The response was "accurate," but nowhere near the depth I wanted to see. Sounded like a Google search result. But, it was lightning fast, gotta give 'em that.
Q*
In AI research—and, really, in all of science—the rise and fall of ideas is not based on pure meritocracy. Usually, the scientists and companies with the most resources and the biggest loudspeakers exert the greatest influence. Consensus forms around these entities, which effectively means that they determine the direction of AI development. Within the AI industry, power is already consolidated in just a few companies—Meta, Google, OpenAI, Microsoft, and Anthropic. This imperfect process of consensus-building is the best we have, but it is becoming even more limited because the research, once largely performed in the open, now happens in secrecy.
Over the past decade, as Big Tech became aware of the massive commercialization potential of AI technologies, it offered fat compensation packages to poach academics away from universities. Many AI Ph.D. candidates no longer wait to receive their degree before joining a corporate lab; many researchers who do stay in academia receive funding, or even a dual appointment, from the same companies. A lot of AI research now happens within or connected to tech firms that are incentivized to hide away their best advancements, the better to compete with their business rivals…
via @TheAtlantic, “Why Won’t OpenAI Say What the Q* Algorithm Is?”
COMPLETING THE PICTURE (FOR NOW)
AGI software (e.g., languages like "Scheme" & "Church") has to run on cutting-edge hardware, e.g., GPU technology: massively parallel computing.
"AI and the Nvidia GPU"
The revelation that ChatGPT, the astonishing artificial-intelligence chatbot, had been trained on an Nvidia supercomputer spurred one of the largest single-day gains in stock-market history. When the Nasdaq opened on May 25, 2023, Nvidia’s value increased by about two hundred billion dollars. A few months earlier, Jensen Huang, Nvidia’s C.E.O., had informed investors that Nvidia had sold similar supercomputers to fifty of America’s hundred largest companies. By the close of trading, Nvidia was the sixth most valuable corporation on earth, worth more than Walmart and ExxonMobil combined. Huang’s business position can be compared to that of Samuel Brannan, the celebrated vender of prospecting supplies in San Francisco in the late eighteen-forties. “There’s a war going on out there in A.I., and Nvidia is the only arms dealer,” one Wall Street analyst said…
UPDATE:
THE NAKED CAPITALISM SKUNK AT THE AI GARDEN PARTY
CODA
__________