AI Research

Daily trending papers (Source: Hugging Face)

SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning

Proposes SpargeAttention2, which cuts attention computation in video diffusion models by up to 95% while preserving generation quality, and outperforms existing sparse attention methods.

Unified Latents (UL): How to train your latents

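The Unified Latents (UL) framework uses a diffusion prior and a diffusion model to improve latent representation learning. It achieves SOTA results on the ImageNet-512 and Kinetics-600 datasets, which can improve the efficiency and quality of image and video generation models.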

Diffusion Model, Latent Representation, Image Generation

Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents

Releases GUI-Owl-1.5, an agent model for GUI automation across multiple platforms, which helps developers carry out UI automation and testing efficiently.

GUI Automation, Agent, Multi-Platform

"What Are You Doing?": Effects of Intermediate Feedback from Agentic LLM In-Car Assistants During Multi-Step Processing

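For LLM-based agentic in-car assistants that carry out multi-step tasks while the user is driving, providing intermediate feedback improved user experience, increased trust, and reduced workload.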

Agent, LLM, User Experience

Arcee Trinity Large Technical Report

Arcee has released Trinity, its series of MoE models (Large, Mini, Nano). The Large model in particular trained stably thanks to SMEBU, a new load-balancing strategy. Developers can download the models from Hugging Face.

MoE, Large Language Model, Sparse Model