Evaluating Inductive Reasoning and Programming Capabilities of Large Language Models With The One-Dimensional Abstract Reasoning Corpus.
We present an initial automated test to evaluate LLMs' capacity to perform inductive reasoning tasks. We use the GPT-3.5 and GPT-4 models to create a system that generates Python programs as inductive-reasoning hypotheses for transforming sequences from the One-Dimensional Abstract Reasoning Corpus (1D-ARC) challenge. We experiment with three prompting techniques, namely standard prompting, Chain of Thought (CoT), and direct feedback. We provide results and an analysis of the cost-to-success rate and the benefit-cost ratio. Our best result is an overall 25% success rate with CoT prompting on GPT-4, significantly surpassing the standard prompting approach. We assess the programming capabilities of the LLMs by analysing the execution rate and the errors of the generated code. We discuss potential avenues for improving our experiments, testing other strategies, and combining deductive reasoning with LLM-based inductive reasoning.
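
The following is a minimal sketch, not the authors' implementation, of the kind of evaluation loop the abstract describes: an LLM proposes a Python transformation as a hypothesis, and a harness executes it on 1D-ARC-style training and test pairs before accepting it. The task representation (lists of integers), the function name "transform", and the helper generate_hypothesis are assumptions introduced here for illustration.

    # Sketch of an LLM-generated-hypothesis harness for 1D-ARC-style tasks.
    # Assumptions: tasks are (input, output) pairs of integer lists, and the
    # generated code defines a function `transform(seq)`.
    from typing import Callable, List, Optional, Tuple

    Sequence1D = List[int]
    Pair = Tuple[Sequence1D, Sequence1D]


    def generate_hypothesis(prompt: str) -> str:
        """Placeholder for a GPT-3.5/GPT-4 call that returns Python source code.

        In the setup described by the abstract, this would be a standard, CoT,
        or feedback prompt sent to the model; here it is stubbed out.
        """
        raise NotImplementedError("plug in an LLM client of your choice")


    def compile_hypothesis(source: str) -> Optional[Callable[[Sequence1D], Sequence1D]]:
        """Try to turn generated source into a callable; failures feed the execution-rate statistic."""
        namespace: dict = {}
        try:
            exec(source, namespace)  # does the generated code run at all?
            fn = namespace.get("transform")
            return fn if callable(fn) else None
        except Exception:
            return None


    def hypothesis_solves_task(
        fn: Callable[[Sequence1D], Sequence1D],
        train_pairs: List[Pair],
        test_pairs: List[Pair],
    ) -> bool:
        """Accept a hypothesis only if it reproduces every training and test output."""
        try:
            return all(fn(x) == y for x, y in train_pairs + test_pairs)
        except Exception:
            return False

A direct-feedback variant, as named in the abstract, could re-prompt generate_hypothesis with the failing pair or error message appended; that loop is omitted here.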