PyoSignal Logo
PyoSignal
Back to Research

ENPIRE: Agentic Robot Policy Self-Improvement in the Real World

Paper ID: 2606.19980 โ€ข 7 Upvotes
Agentic Workflow Robotics Automated ML Closed-loop Control Agent Vision Evaluation
ENPIRE: Agentic Robot Policy Self-Improvement in the Real World

๐Ÿ“ ํ•ต์‹ฌ ์š”์•ฝ

์ฝ”๋”ฉ ์—์ด์ „ํŠธ๊ฐ€ ๋ฌผ๋ฆฌ์  ํ™˜๊ฒฝ์˜ ํ”ผ๋“œ๋ฐฑ์„ ํ†ตํ•ด ์Šค์Šค๋กœ ๋กœ๋ด‡ ์ œ์–ด ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ๊ฐœ์„ ํ•˜๋Š” ํ์‡„ ๋ฃจํ”„(Closed-loop) ํ”„๋ ˆ์ž„์›Œํฌ ์ œ์•ˆ

๐Ÿ“– ์ƒ์„ธ ๋‚ด์šฉ

๋กœ๋ด‡์˜ ์ •๊ตํ•œ ์กฐ์ž‘ ๊ธฐ์ˆ ์€ ์—ฌ์ „ํžˆ ์ธ๊ฐ„์˜ ๊ฐœ์ž…๊ณผ ์•Œ๊ณ ๋ฆฌ์ฆ˜ ์—”์ง€๋‹ˆ์–ด๋ง์— ํฌ๊ฒŒ ์˜์กดํ•˜๊ณ  ์žˆ์–ด ๋ฒ”์šฉ ๋ฌผ๋ฆฌ ์ง€๋Šฅ ๊ตฌํ˜„์˜ ๋ณ‘๋ชฉ์ด ๋˜๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค. ๊ธฐ์กด ์ฝ”๋”ฉ ์—์ด์ „ํŠธ๋Š” ๋””์ง€ํ„ธ ํ™˜๊ฒฝ์— ๊ตญํ•œ๋˜์–ด ์žˆ์–ด, ์‹ค์ œ ๋ฌผ๋ฆฌ ํ™˜๊ฒฝ์—์„œ์˜ ๋ฐ˜๋ณต์ ์ธ ์‹คํ—˜๊ณผ ๊ฐœ์„  ๋ฃจํ”„๋ฅผ ์ž๋™ํ™”ํ•˜๋Š” ๊ฒƒ์ด ํ•„์ˆ˜์ ์ž…๋‹ˆ๋‹ค. ๋ณธ ๋…ผ๋ฌธ์€ ํ™˜๊ฒฝ ๋ฆฌ์…‹, ์ •์ฑ… ์‹คํ–‰, ๊ฒฐ๊ณผ ๊ฒ€์ฆ, ์•Œ๊ณ ๋ฆฌ์ฆ˜ ๊ฐœ์„ ์ด ์œ ๊ธฐ์ ์œผ๋กœ ์—ฐ๊ฒฐ๋œ ENPIRE ํ”„๋ ˆ์ž„์›Œํฌ๋ฅผ ์†Œ๊ฐœํ•ฉ๋‹ˆ๋‹ค. ์ด ์‹œ์Šคํ…œ์€ ์ฝ”๋”ฉ ์—์ด์ „ํŠธ๊ฐ€ ๋กœ๊ทธ ๋ถ„์„๊ณผ ๋ฌธํ—Œ ์กฐ์‚ฌ๋ฅผ ํ†ตํ•ด ์Šค์Šค๋กœ ํ•™์Šต ์ธํ”„๋ผ์™€ ์ฝ”๋“œ๋ฅผ ์ˆ˜์ •ํ•˜๋ฉฐ ์ •์ฑ…์„ ์ตœ์ ํ™”ํ•˜๋„๋ก ์„ค๊ณ„๋˜์—ˆ์Šต๋‹ˆ๋‹ค. ์‹คํ—˜ ๊ฒฐ๊ณผ, ์—์ด์ „ํŠธ๋Š” ํ•€ ๋ฐ•์Šค ์ •๋ฆฌ๋‚˜ ๋„๊ตฌ ์‚ฌ์šฉ๊ณผ ๊ฐ™์€ ๊ณ ๋‚œ๋„ ์ž‘์—…์—์„œ 99%์˜ ์„ฑ๊ณต๋ฅ ์„ ๋‹ฌํ•˜๋Š” ์ •์ฑ…์„ ์ž์œจ์ ์œผ๋กœ ํ•™์Šตํ–ˆ์Šต๋‹ˆ๋‹ค. ์ด๋Š” ๋กœ๋ด‡ ์—ฐ๊ตฌ ํ”„๋กœ์„ธ์Šค๋ฅผ ์ž๋™ํ™” ๊ฐ€๋Šฅํ•œ ์ตœ์ ํ™” ๋ฌธ์ œ๋กœ ์ „ํ™˜ํ•˜์—ฌ ์ธ๊ฐ„์˜ ๊ฐœ์ž…์„ ์ตœ์†Œํ™”ํ•  ์ˆ˜ ์žˆ์Œ์„ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค.

๐Ÿ”‘ ์ฃผ์š” ๋‚ด์šฉ (Key Points)

  • ๋ฌผ๋ฆฌ์  ํ”ผ๋“œ๋ฐฑ ๋ฃจํ”„(Reset-Execute-Verify-Refine)๋ฅผ ์ž๋™ํ™”ํ•˜๋Š” ENPIRE ํ”„๋ ˆ์ž„์›Œํฌ ๊ฐœ๋ฐœ
  • ์ฝ”๋”ฉ ์—์ด์ „ํŠธ๊ฐ€ ๋กœ๊ทธ ๋ถ„์„ ๋ฐ ๋ฌธํ—Œ ์กฐ์‚ฌ๋ฅผ ํ†ตํ•ด ์•Œ๊ณ ๋ฆฌ์ฆ˜๊ณผ ํ•™์Šต ์ธํ”„๋ผ๋ฅผ ์ง์ ‘ ๊ฐœ์„ ํ•˜๋Š” Evolution ๋ชจ๋“ˆ ๋„์ž…
  • ๋ฉ€ํ‹ฐ ๋กœ๋ด‡ ํ™˜๊ฒฝ์—์„œ ์—์ด์ „ํŠธ ํŒ€์„ ์šด์šฉํ•˜์—ฌ ํ•™์Šต ์†๋„๋ฅผ ๊ฐ€์†ํ™”ํ•˜๋Š” ํ™•์žฅ์„ฑ ํ™•๋ณด

๐Ÿ’ก ์‹ค๋ฌด์  ๊ฐ€์น˜ (Relevance)

๋กœ๋ด‡ ์ œ์–ด ์•Œ๊ณ ๋ฆฌ์ฆ˜ ๊ฐœ๋ฐœ ์‹œ ๋ฐ˜๋ณต๋˜๋Š” ์‹คํ—˜-์ˆ˜์ • ๊ณผ์ •์„ ์ž๋™ํ™”ํ•˜์—ฌ ์—”์ง€๋‹ˆ์–ด์˜ ์ˆ˜๋™ ๊ฐœ์ž…์„ ํš๊ธฐ์ ์œผ๋กœ ์ค„์ผ ์ˆ˜ ์žˆ๋Š” ๋ฐฉ๋ฒ•๋ก ์„ ์ œ์‹œํ•ฉ๋‹ˆ๋‹ค.

โœ… ์ถ”์ฒœ ์•ก์…˜ (Actionable Items)

  • ์—์ด์ „ํŠธ๊ฐ€ ์ƒ์„ฑํ•œ ์ฝ”๋“œ๊ฐ€ ๋ฌผ๋ฆฌ์  ํ•˜๋“œ์›จ์–ด์˜ ์•ˆ์ „ ๊ฐ€์ด๋“œ๋ผ์ธ์„ ์ค€์ˆ˜ํ•˜๋Š”์ง€ ๊ฒ€์ฆํ•˜๋Š” ์ƒŒ๋“œ๋ฐ•์Šค ๊ตฌ์ถ•
  • ์‹ค์ œ ๋กœ๋ด‡ ํ™˜๊ฒฝ๊ณผ ์œ ์‚ฌํ•œ ์‹œ๋ฎฌ๋ ˆ์ด์…˜ ํ™˜๊ฒฝ์—์„œ ENPIRE์˜ ํ”ผ๋“œ๋ฐฑ ๋ฃจํ”„ ์„ฑ๋Šฅ ํ…Œ์ŠคํŠธ
  • ์‹คํŒจ ๋กœ๊ทธ๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ์—์ด์ „ํŠธ๊ฐ€ ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ์กฐ์ •ํ•˜๋Š” ๋กœ์ง์˜ ์œ ํšจ์„ฑ ๊ฒ€์ฆ