【免费】LLM归纳与演绎推理能力的研究资源-CSDN文库

语言模型

需积分: 0 65 浏览量 2024-09-22 11:56:25 上传评论收藏 1.62MB PDF 举报

资源推荐

资源详情

资源评论

Inductive or Deductive? Rethinking the Fundamental Reasoning Abilities

of LLMs

Kewei Cheng

1,2

, Jingfeng Yang

, Haoming Jiang

, Zhengyang Wang

, Binxuan Huang

Ruirui Li

, Shiyang Li

, Zheng Li

, Yifan Gao

, Xian Li

, Bing Yin

, Yizhou Sun

University of California, Los Angeles

Amazon

{chenkewe, jingfe, jhaoming, zhengywa, binxuan, ruirul,

syangli, amzzhe, yifangao, xianlee, alexbyin}@amazon.com

yzsun@cs.ucla.edu

Abstract

Reasoning encompasses two typical types: de-

ductive reasoning and inductive reasoning. De-

spite extensive research into the reasoning ca-

pabilities of Large Language Models (LLMs),

most studies have failed to rigorously diﬀerenti-

ate between inductive and deductive reasoning,

leading to a blending of the two. This raises an

essential question: In LLM reasoning, which

poses a greater challenge - deductive or induc-

tive reasoning? While the deductive reasoning

capabilities of LLMs, (i.e. their capacity to

follow instructions in reasoning tasks), have

received considerable attention, their abilities

in true inductive reasoning remain largely unex-

plored. To investigate the true inductive reason-

ing capabilities of LLMs, we propose a novel

framework, SolverLearner. This framework

enables LLMs to learn the underlying function

(i.e.,

𝑦 = 𝑓

𝑤

(𝑥)

), that maps input data points

(𝑥)

to their corresponding output values

(𝑦)

using only in-context examples. By focusing

on inductive reasoning and separating it from

LLM-based deductive reasoning, we can isolate

and investigate inductive reasoning of LLMs in

its pure form via SolverLearner. Our observa-

tions reveal that LLMs demonstrate remarkable

inductive reasoning capabilities through Solver-

Learner, achieving near-perfect performance

with ACC of 1 in most cases. Surprisingly, de-

spite their strong inductive reasoning abilities,

LLMs tend to relatively lack deductive reason-

ing capabilities, particularly in tasks involving

“counterfactual” reasoning.

1 Introduction

Recent years have witnessed notable progress in

Natural Language Processing (NLP) with the de-

velopment of Large Language Models (LLMs) like

GPT-3 (Brown et al., 2020) and ChatGPT (Ope-

nAI, 2023). While these models exhibit impressive

reasoning abilities across various tasks, they face

challenges in certain domains. For example, a re-

cent study (Wu et al., 2023) has shown that while

LLMs excel in conventional tasks (e.g., base-10

arithmetic), they often experience a notable decline

in accuracy when dealing “counterfactual” reason-

ing tasks that deviate from the conventional cases

seen during pre-training (e.g., base-9 arithmetic).

It remains unclear whether they are capable of fun-

damental reasoning, or just approximate retrieval.

In light of this, our paper seeks to investigate

the reasoning capabilities of LLMs. Reasoning

can encompasses two types: deductive reasoning

and inductive reasoning, as depicted in Fig. 1. De-

ductive reasoning starts with a general hypothesis

and proceeds to derive speciﬁc conclusions about

individual instances while inductive reasoning in-

volves formulating broad generalizations or princi-

ples from a set of instance observations. Despite

extensive research into the reasoning capabilities of

LLMs, most studies have not clearly diﬀerentiated

between inductive and deductive reasoning. For in-

stance, arithmetic reasoning task primarily focuses

on comprehending and applying mathematical con-

cepts to solve arithmetic problems, aligning more

with deductive reasoning. Yet, when employing

in-context learning for arithmetic reasoning tasks,

where the model is prompted with a few

〈

input,

output

〉

examples, the observed improvements are

often attributed to their inductive reasoning capac-

ity. This fusion of reasoning types poses a critical

question: Which is the more signiﬁcant limita-

tion in LLM reasoning, deductive or inductive

reasoning?

To explore this question, it’s crucial to diﬀeren-

tiate between deductive and inductive reasoning.

Current methods that investigate deductive and in-

ductive reasoning often rely on disparate datasets,

making direct comparisons challenging (Xu et al.,

2023a; Tang et al., 2023; Dalvi et al., 2021; Han

et al., 2022; Sinha et al., 2019; Yu et al., 2020). To

overcome this limitation, we have designed a set of

comparative experiments that utilize a consistent

task across diﬀerent contexts, each emphasizing

arXiv:2408.00114v2 [cs.AI] 7 Aug 2024

(b) Few-shot IO w/ Mapping Function

Q: Assuming that all numbers are in base-8 where

the digits are "01234567", what is 57+27?

A: The result for 57+27 is 106.

Q: Assuming that all numbers are in base-8 where

the digits are "01234567", what is 36+33?

(Output) The result for 36+33 is 71.

(a) Zero-shot

Q: Assuming that all numbers are in

base-8 where the digits are

"01234567", what is 36+33?

(Output) 71

Deductive Reasoning Inductive Reasoning

General Principle Specific Conclusion

The result for 36+33 is 71.

General ➔ Specific

Deductive

Reasoning

Q: You are asked to add two numbers, the

base of which is unknown, what is 57+27?

A: The result for 57+27 is 106.

Q: You are asked to add two numbers, the

base of which is unknown, what is 36+33?

(Output) The result for 36+33 is 71.

(d) SolverLearner

Q: You are asked to add two numbers, the base of which is unknown, what is 57+27?

A: The result for 57+27 is 106.

Q: What is the function to map the input to the output?

(Output) def solver(n1: str, n2: str) -> str:\n # Let's analyze the given examples to find

the base\n # 57 + 27 = 106\n # It seems like the base is 8 (octal)\n\n # Convert the

input strings to integers in base 8\n num1 = int(n1, 8)\n num2 = int(n2, 8)\n\n #

Calculate the sum\n result = num1 + num2\n\n # Convert the result back to a string in

base 8\n return oct(result)[2:]

Addition in base 8

Begin from the rightmost digit, perform the addition. If the sum

exceeds 8, subtract 8, record the remainder, and carry over 1

to the next column. Repeat this process from right to left for

each column, and your final result will be the sum in base 8.

The result for 71+44 is 135.

The result for 42+70 is 132.

The result for 50+45 is 115.

The result for 61+55 is 136.

The result for 63+22 is 105.

Specific ➔ Specific Specific ➔ GeneralGeneral + Specific ➔ Specific

Specific Observation General Principle

Addition in base 8

Begin from the rightmost digit, perform the addition. If the sum

exceeds 8, subtract 8, record the remainder, and carry over 1

to the next column. Repeat this process from right to left for

each column, and your final result will be the sum in base 8.

Inductive

Reasoning

Integrate the deductive reasoning with few-shot

examples

Traditional IO prompting for inductive

reasoning

Completely decouple inductive reasoning from deductive reasoningDeductive Reasoning

Deductive Setting (mapping function is provided)

Inductive Setting (mapping function is not provided)

Figure 1: We have designed a set of comparative experiments that utilize a consistent task across diﬀerent contexts, each

emphasizing either deductive (i.e., methods (a) and (b)) or inductive reasoning (i.e., methods (c) and (d)). As we move from left

to right across the ﬁgure, the methods gradually transition their primary focus from deductive reasoning to inductive reasoning.

Speciﬁcally, method (a) is designed to demonstrate the LLMs’ deductive reasoning in its pure form. Conversely, method (c)

utilizes Input-Output (IO) prompting strategies, which are prevalent for probing the inductive reasoning skills of LLMs. However,

we can observe that methods (c) cannot fully disentangle inductive reasoning from deductive reasoning as their learning process

directly moves from observations to speciﬁc instances, blurring the lines between the two. To exclusively focus on and examine

inductive reasoning, we introduce a novel framework called SolverLearner, positioned at the far right of the spectrum.

either deductive (i.e., methods (a) and (b)) or induc-

tive reasoning (i.e., methods (c) and (d)), as depicted

in Fig 1. For instance, in an arithmetic task, the pro-

ﬁciency of a LLM in deductive reasoning depends

on its ability to apply a given input-output mapping

function to solve problems when this function is

explicitly provided. Conversely, an LLM’s skill

in inductive reasoning is measured by its ability

to infer these input-output mapping functions (i.e.,

𝑦 = 𝑓

𝑤

(𝑥)

), that maps input data points

(𝑥)

to their

corresponding output values

(𝑦)

, based solely on

in-context examples. The base system often serves

as the input-output mapping function in an arith-

metic task. In line with the aforementioned setup,

we employ four methods to investigate the reason-

ing capacity of LLMs. As we move from left to

right across Fig. 1, the methods gradually transition

their primary focus from deductive reasoning to

inductive reasoning. Method (a), at the far left of

the ﬁgure, aims to explore the deductive reasoning

capabilities of LLMs in its pure form, where no in-

context-learning examples are provided (zero-shot

settings). While exploring deductive reasoning in

its pure form appears relatively straightforward in

zero-shot settings, untangling inductive reasoning

poses more signiﬁcant challenges. Recent studies

have investigated the inductive reasoning abilities

of LLMs (Yang et al., 2022; Gendron et al., 2023;

Xu et al., 2023b), they have primarily used Input-

Output (IO) prompting (Mirchandani et al., 2023),

which involves providing models with a few

〈

in-

put, output

〉

as demonstrations without providing

the underlying mapping function. The models

are then evaluated based on their ability to han-

dle unseen examples, as illustrated in method (c).

These studies often ﬁnd LLMs facing diﬃculties

with inductive reasoning. Our research suggests

that the use of IO prompting might not eﬀectively

separate LLMs’ deductive reasoning skills from

their inductive reasoning abilities. This is because

the approach moves directly from observations to

speciﬁc instances, obscuring the inductive reason-

ing steps. Consequently, the underperformance in

the context of inductive reasoning tasks may be

attributed to poor deductive reasoning capabilities,

i.e., the ability of LLMs to execute tasks, rather than

being solely indicative of their inductive reasoning

capability.

To disentangle inductive reasoning from deduc-

tive reasoning, we propose a novel model, referred

to as SolverLearner. Given our primary focus on in-

ductive reasoning, SolverLearner follows a two-step

process to segregate the learning of input-output

mapping functions from the application of these

functions for inference. Speciﬁcally, functions are

applied through external interpreters, such as code

interpreters, to avoid incorporating LLM-based

deductive reasoning.

We evaluate the performance of several LLMs

across various tasks. LLMs consistently demon-

strate remarkable inductive reasoning capabilities

through SolverLearner, achieving near-perfect per-

formance with ACC of 1 in most cases. Surprisingly,

despite their strong inductive reasoning abilities,

LLMs tend to exhibit weaker deductive capabilities,

particularly in terms of “counterfactual” reasoning.

This ﬁnding, though unexpected, aligns with the

previous research (Wu et al., 2023). In a zero-shot

scenario, the ability of an LLM to correctly exe-

cute tasks by applying principles (i.e. deductive

reasoning) heavily relies on the frequency with

which the model was exposed to the tasks during

its pre-training phase.

2 Task Deﬁnition

Our research is focused on a relatively unexplored

question: Which presents a greater challenge to

LLMs - deductive reasoning or inductive reasoning?

To explore this, we designed a set of comparative

experiments that apply a uniform task across var-

ious contexts, each emphasizing either deductive

or inductive reasoning. The primary distinction

between the deductive and inductive settings is

whether we explicitly present input-output map-

pings to the models. Informally, we can describe

these mappings as a function

𝑓

𝑤

: 𝑋 → 𝑌

, where

an input

𝑥 ∈ 𝑋

is transformed into an output

𝑦 ∈ 𝑌

We distinguish between the deductive and inductive

settings as follows:

•

Deductive setting: we provide the models with

direct input-output mappings (i.e., 𝑓

𝑤

•

Inductive setting: we oﬀer the models a few

examples (i.e.,

(𝑥, 𝑦)

pairs) while intentionally

leaving out input-output mappings (i.e., 𝑓

𝑤

For example, consider arithmetic tasks, where the

base system is the input-output mapping function.

The two approaches on the left side of Fig. 1 (i.e.,

method (a) and (b)) follow the deductive setting,

illustrating the case where the arithmetic base is

explicitly provided. In contrast, the two methods

(i.e., method (c) and (d)) on right side of Fig. 1

adhere to the inductive setting, depicting the sce-

nario characterized by the absence of a speciﬁed

arithmetic base, while a few input-output examples

are provided for guidance.

3 Our Framework for Inductive

Reasoning: SolverLearner

While recent studies have explored the inductive

reasoning abilities of LLMs (Yang et al., 2022; Gen-

dron et al., 2023; Xu et al., 2023b), they have primar-

ily relied on Input-Output (IO) prompting (Mirchan-

dani et al., 2023). This method involves providing

models with a few

〈

input, output

〉

demonstrations

and then evaluating their performance on unseen

examples, as depicted in method (c) in Fig. 1. Our

research suggests that the use of IO prompting and

directly evaluating the ﬁnal instance performance

might not eﬀectively separate LLMs’ deductive

reasoning skills from their inductive reasoning abil-

ities. This is because the approach moves directly

from observations to speciﬁc instances, obscuring

the inductive reasoning steps. To better disentangle

inductive reasoning, we propose a novel framework,

SolverLearner. This framework enables LLMs to

learn the function (i.e.,

𝑦 = 𝑓

𝑤

(𝑥)

), that maps in-

put data points

(𝑥)

to their corresponding output

values

(𝑦)

, using only in-context examples. By

focusing on inductive reasoning and setting aside

LLM-based deductive reasoning, we can isolate and

investigate inductive reasoning of LLMs in its pure

form via SolverLearner. SolverLearner includes

two-stages as illustrated in Fig. 2:

•

Function Proposal: In this initial phase, we

propose a function, that could be used to map

input data points

(𝑥)

to their corresponding output

values

(𝑦)

. This is corresponding to the inductive

reasoning process.

•

Function Execution: In the second phase, the

proposed function is applied through external

code interpreters to solve the test queries for

evaluation purposes. This phase ensures that

the LLM is fully prevented from engaging in

deductive reasoning.

3.1 Framework

In this subsection, we will take the arithmetic task

as a case study to demonstrate the entire process.

Function Proposal: Given the in-context ex-

amples, the primary goal of LLMs is to learn a

function that can map input data points

(𝑥)

to their

corresponding output values

(𝑦)

. This process of

learning the mapping between inputs and outputs

is akin to inductive reasoning, while employing

the learned function to address unseen queries

aligns with deductive reasoning. In order to sepa-

rate inductive reasoning from deductive reasoning,

the execution of the learned function should be

completely detached from LLMs. To achieve this

separation, external tools such as code interpreters

serve as eﬃcient way to execute these functions in-

dependently. By encapsulating the learned function

within Python code, we can eﬀectively detach the

duty of deductive reasoning from LLMs, assigning

The result for 71+44 is 135.

The result for 42+70 is 132.

The result for 50+45 is 115.

The result for 61+55 is 136.

The result for 63+22 is 105.

The result for 72+62 is 154.

The result for 57+27 is 106.

The result for 52+76 is 150.

8 Shot Examples

Python Function

14+57

44+45

...

61+23

22+77

Test Queries

② Function

Execution

① Function

Proposal

Figure 2: An overview of our framework SolverLearner for inductive reasoning. SolverLearner follows a two-step process to

segregate the learning of input-output mapping functions from the application of these functions for inference. Speciﬁcally,

functions are applied through external code interpreters, to avoid incorporating LLM-based deductive reasoning.

it solely to these external executors. For instance,

in function proposal stage for an arithmetic task,

we have:

“You are an expert mathematician and program-

mer. You are asked to add two numbers, the base

of which is unknown. Below are some provided

examples: The result for 76+76 is 174.

Please identify the underlying pattern to determine

the base being used and implement a solver() func-

tion to achieve the goal.

def solver(n1: str, n2: str) -> str:

# Let’s write a Python program step by step

# Each input is a number represented as a string.

# The function computes the sum of these numbers

and returns it as a string. ”

Function Execution: In the second phase, func-

tions are executed through external code interpreters

to solve the test cases for evaluation purposes. These

code interpreters act as “oracle” deductive reason-

ers, fully preventing the LLM from involving deduc-

tive reasoning. This ensures that the ﬁnal results

reﬂect only the inductive reasoning capability of the

LLM. To further decouple the LLM’s inﬂuence in

this phase, test cases are generated using a template

without involving the LLM. More details can be

found in Appendix A.1.3.

4 Tasks

In this section, we provide a brief overview of the

tasks under consideration. Our focus is on inves-

tigating the reasoning abilities of LLMs in both

deductive and inductive reasoning scenarios. To

ensure a robust evaluation, we carefully select tasks

that lend themselves well to comparison. Firstly, to

prevent LLMs from reciting tasks seen frequently

during pre-training, which could artiﬁcially inﬂate

performance in deductive reasoning, a signiﬁcant

portion of the tasks falls into the category of “coun-

terfactual reasoning” tasks. Secondly, in the context

of inductive reasoning, where only a few in-context

examples are available without the mapping func-

tion, our objective is to learn the function that

maps inputs to outputs based on this restricted

dataset. To achieve this, we choose tasks that are

well-constrained, ensuring the existence of a single,

unique function capable of ﬁtting this limited data.

Detailed descriptions of each task and the prompts

used can be found in Appendix A.1 and A.2.

Arithmetic In this study, we focus on the two-

digit addition task previously explored in the work

of Wu et al. (2023). We investigate multiple

numerical bases, speciﬁcally base-8, 9, 10, 11, and

16 where base 10 corresponds to the commonly

observed case during pretraining. In the context of

deductive reasoning, the base is explicitly provided

without any accompanying in-context examples,

and the LLM is expected to perform the addition

computation by relying on its inherent deductive

reasoning abilities. Conversely, in the context of

inductive reasoning, instead of explicitly providing

the base information to LLMs, we provide LLMs

solely with few-shot examples and require them

to induce the base through these examples and

subsequently generate a function to solve arithmetic

problems.

Basic Syntactic Reasoning In this setting, we

concentrate on tasks related to syntactic recognition

previously explored by Wu et al. (2023). Our

objective is to evaluate LLMs using artiﬁcially

constructed English sentences that vary from the

conventional subject-verb-object (SVO) word order.

For deductive reasoning, we directly provide the

new word order to LLMs without any contextual

examples, challenging them to identify the subject,

verb, and object within this artiﬁcial language. In

contrast, for inductive reasoning, we do not give

剩余17页未读，继续阅读

评论收藏

内容反馈

豪AI冰

粉丝: 73
资源: 68

LLM归纳与演绎推理能力的研究

LLM.04 TimeBench:LLM时间推理能力的综合评估

非图宾根基准下LLM ChatGPT的因果发现和因果推理能力.pdf

ChatGPT使用指南-文本推理

结合多智能体树思维验证器改进大规模语言模型推理能力

LLM原理与ChatPDF实现.pdf

Re-Reading提升大型语言模型推理能力

C/C++ 中的 LLM 推理

Byzer-LLM 基于 Ray 技术构建，是一款覆盖大语言模型（LLM）完整生命周期的解决方案包括预训练、微调、部署及推理服务

一个基于 Python 的 LLM (大语言模型) 快速推理和服务框架，GPU利用率大幅提升

Python_用于在现代消费级gpu上本地运行llm的快速推理库.zip

基于LLM的知识图谱补全研究

Python_ToRA是一系列工具集成推理LLM代理，旨在通过与工具ICLR24交互来解决具有挑战性的数学推理问题.zip

中文LLM大语言模型支持列表

深入LLM与RAG 原理、实现与应用

基于 Ray 技术构建，是一款覆盖大语言模型（LLM）完整生命周期的解决方案，包括预训练、微调、部署及推理服务等阶段

深入LLM与RAG 原理、实现与应用教程

推荐系统(RS)与大模型(LLM)的结合

LLM as DBA LLM as DBA LLM as DBA LLM as DBA LLM as DBA

大语言模型LLM：微调、量化、推理.zip

Python_为了加快llm的推理速度和增强llm对关键信息的感知，压缩提示符和KVCache，以最小的性能损失实现高.zip

通向AGI之路：大型语言模型（LLM）技术精要 - 知乎.pdf

一个基于 Java 的 LLM（大语言模型）应用开发框架

Python_面向llm的高吞吐量、高内存利用率推理与服务引擎.zip

【大模型-多模态】复杂视觉推理任务中利用多模态增强大规模语言模型能力的方法研究

终端LLM AI模型:mlc-llm

Building LLM Powered Applications-2024.5.pdf

论文译文：LLM Maybe LongLM: SelfExtend LLM Context Window Without Tun

YOLOv8-deepsort 实现智能车辆目标检测+车辆跟踪+车辆计数

Transformer模型实现长期预测并可视化结果（附代码+数据集+原理介绍）

YOLOv8网络结构图，自制visio文件，yolov8.vsds，需要的自取，在原有的基础上直接改就行了

最新资源