Model Based Learning in Stochastic Environments

TAMURA, Takeshi; TSURUOKA, Hisashi; YAMAGUCHI, Akihiro

http://hdl.handle.net/11478/1289

File: 11478-1289_p9.pdf (855.2 kB)

Item type

Departmental Bulletin Paper

Date made public

2019-05-28

Title (Japanese)

確率環境下でのモデルベース学習

Title (English)

Model Based Learning in Stochastic Environments

Language

jpn

Keywords (subject scheme: Other)

model based learning; prioritized sweeping; Q-value; Bellman error; confidence interval

Resource type

departmental bulletin paper

Resource type identifier

http://purl.org/coar/resource_type/c_6501

Authors

田村, 剛士 (TAMURA, Takeshi)
鶴岡, 久 (TSURUOKA, Hisashi)
山口, 明宏 (YAMAGUCHI, Akihiro)

Author name readings (identifier scheme: WEKO)

2503: タムラ, タケシ
2504: ツルオカ, ヒサシ
2505: ヤマグチ, アキヒロ

Author names in other languages (identifier scheme: WEKO)

2506: TAMURA, Takeshi
2507: TSURUOKA, Hisashi
2508: YAMAGUCHI, Akihiro

Description (type: Other)

Model based learning can reevaluate the utility of every state according to a measure of urgency. Prioritized sweeping is a typical algorithm for efficient state updating. In a stochastic environment, a probability distribution can be used to represent the uncertainty in the Q-value caused by probabilistic state transitions or probabilistic rewards. The product of the confidence interval and the Bellman error is used as the priority measure, which accounts for both the level of confidence and the urgency of the update. The performance of this approach in the trap domain is examined and compared with that of the ordinary sweeping method. Experimental results indicate that the proposed approach explores the states more effectively than conventional sweeping methods.
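The priority measure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not say how the confidence interval is computed, so the sketch assumes a normal-approximation half-width over sampled Q-value targets, and the names `ci_half_width`, `bellman_error`, `priority`, `push`, and `pop_most_urgent`, the discount factor `gamma`, and the fallback width `default` are all hypothetical.

```python
import heapq
import math
from statistics import stdev


def ci_half_width(samples, z=1.96, default=1.0):
    """Half-width of a normal-approximation confidence interval of the mean.

    Assumption: uncertainty in the Q-value is summarized from sampled targets.
    With fewer than two samples the spread is undefined, so a hypothetical
    default width is returned instead.
    """
    n = len(samples)
    if n < 2:
        return default
    return z * stdev(samples) / math.sqrt(n)


def bellman_error(q, state, action, reward, next_state, gamma=0.9):
    """Absolute one-step Bellman error |r + gamma * max_a' Q(s', a') - Q(s, a)|."""
    next_values = q.get(next_state)
    best_next = max(next_values.values()) if next_values else 0.0
    return abs(reward + gamma * best_next - q[state][action])


def priority(samples, q, state, action, reward, next_state, gamma=0.9):
    """Priority as the product of confidence-interval width and Bellman error."""
    width = ci_half_width(samples)
    return width * bellman_error(q, state, action, reward, next_state, gamma)


def push(queue, prio, state_action):
    """Store negated priorities because heapq is a min-heap."""
    heapq.heappush(queue, (-prio, state_action))


def pop_most_urgent(queue):
    """Return the state-action pair with the largest priority and that priority."""
    neg_prio, state_action = heapq.heappop(queue)
    return state_action, -neg_prio
```

In a sweeping loop, each predecessor of an updated state would be pushed with its `priority` value and `pop_most_urgent` would select the next pair to update, so pairs that are both uncertain and far from consistency with their successors are revisited first.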

Bibliographic information

福岡工業大学研究論集

Vol. 44, No. 1, pp. 9-12, issued 2011-09

Source identifier (type: ISSN)

02876620

Format (description type: Other)

application/pdf

Extent

855208 bytes

Author version flag

Publication type

VoR

Publication type resource

http://purl.org/coar/version/c_970fb48d4fbd8a85

Other title (reading)

カクリツカンキョウカデノモデルベースガクシュウ

Publisher

福岡工業大学 (Fukuoka Institute of Technology)

Publisher (reading)

フクオカコウギョウダイガク

Resource type (description type: Other)

Article

Resource type (local)

Departmental bulletin paper

Resource type (NII)

Departmental Bulletin Paper

Resource type (DCMI)

text

Versions

Ver.1

2023-05-15 12:43:46.260952
