Dera News
dera · from heavy users
16:07 JST
WEDNESDAY, 1 JULY 2026

half the internet is terrified of AI. we are on the other half, taking this into our daily life, trying to understand better.

we use AI in everything we do — so every monday we read the whole week of it and work out what actually happened. something real, from heavy users. it makes our day if what we produce makes someone find AI more interesting for their life.

read our version of what happened this week in AI. free.
01
Hugging Face Daily Papers · 2 HR AGO / primary source

BrainJanus: A Unified Model for Understanding and Generation across Brain, Vision, and Language

A new neuroscience model, BrainJanus, unifies brain activity, vision, and language, enabling two-way conversions between them.

What happened
  • BrainJanus is the first model to integrate brain activity with visual and linguistic sensory inputs within a unified framework.
  • It allows for bidirectional mapping, converting images/text to brain activity and vice-versa, by quantizing continuous brain data into 'tokens'.
Why it matters
  • This unified approach marks a significant advance in neuroscience, moving beyond separate encoding and decoding tasks.
  • It could lay foundational groundwork for future technologies that better understand and interact with human perception and thought.
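For readers who want a feel for what "quantizing continuous data into tokens" can mean, here is a minimal sketch of nearest-neighbour vector quantization. It is an illustration only, not BrainJanus's actual tokenizer: the codebook, the toy signal, and the function names `quantize` and `dequantize` are all hypothetical and chosen for this example.

```python
# Hypothetical illustration of vector quantization: each continuous vector is
# replaced by the index ("token") of its nearest entry in a fixed codebook.
# Nothing here is taken from the BrainJanus paper; all values are made up.

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quantize(vectors, codebook):
    """Map each continuous vector to the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda i: squared_distance(v, codebook[i]))
        for v in vectors
    ]

def dequantize(tokens, codebook):
    """Map token indices back to their codebook vectors (a lossy reconstruction)."""
    return [codebook[t] for t in tokens]

# Toy codebook with three entries and a toy "signal" of three 2-D samples.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
signal = [[0.1, -0.2], [0.9, 1.2], [-0.8, 0.7]]

tokens = quantize(signal, codebook)
reconstruction = dequantize(tokens, codebook)
print(tokens)          # [0, 1, 2]
print(reconstruction)  # [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
```

Once continuous data is expressed as discrete indices like these, it can in principle share a sequence with text or image tokens, which is the general idea behind a two-way mapping. A real system would typically learn the codebook rather than fix it by hand.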
— the story beneath
02
Hugging Face Daily Papers · 2 HR AGO / primary source

Xiaomi-GUI-0 Technical Report

Institution: Xiaomi Research | Authors: Wanxia Cao, Chengzhen Duan, Pei Fu, Pengzhi Gao, Niu Lian

A native multimodal GUI agent trained in real-device environments demonstrates superior performance and stability compared to traditional benchmark-based approaches. (AI summary generated by Qwen/Qwen2.5-Coder-32B-Instruct)

03
Hugging Face Daily Papers · 2 HR AGO / primary source

PolyFlow: Continuous Topology Embedding Flow Matching for Artist-style Mesh Generation

Tencent Hunyuan's new PolyFlow technology promises to greatly accelerate high-quality 3D mesh generation by enabling parallel processing.

What happened
  • Tencent Hunyuan unveiled 'PolyFlow,' a new 3D mesh generation technique that significantly speeds up model creation.
  • PolyFlow converts discrete mesh data into a continuous representation, allowing for parallel processing via a Transformer-based framework.
Why it matters
  • This could drastically reduce the time and resources needed for 3D content creation, potentially lowering barriers for businesses in virtual spaces.
  • Faster and more precise 3D model generation could lead to richer, more dynamic digital experiences in industries like gaming, design, and virtual commerce.
04
Hugging Face Daily Papers · 2 HR AGO / primary source

PhotoQuilt: Training-Free Arbitrary-Resolution Photomosaics via Bootstrapped Tiled Denoising

University of Toronto researchers developed PhotoQuilt, a new framework to efficiently generate high-resolution photo mosaics, overcoming diffusion mo...

What happened
  • University of Toronto team launched PhotoQuilt, a framework for generating high-resolution photo mosaics.
  • The system uses 'bootstrapped tiled denoising' to maintain both overall structure and individual tile detail.
Why it matters
  • This innovation addresses a key challenge in AI image generation, where previous models struggled with large-scale mosaic creation.
  • It demonstrates progress in AI's ability to handle complex image synthesis efficiently, potentially impacting future visual content creation tools.
05
Hugging Face Daily Papers · 2 HR AGO / primary source

AVTok: 1D Unified Tokenization for Holistic Audio-Video Generation

New AI technology, AVTok, unifies audio and video generation, making content creation more natural and synchronized.

What happened
  • AVTok, a new unified tokenizer, integrates audio and video AI generation processes.
  • It uses a dual-stream transformer architecture to efficiently encode audio-visual pairs into one compact data representation.
Why it matters
  • This integration addresses the challenge of out-of-sync and unnatural content in AI-generated video with sound.
  • It's an important step toward future large-scale multimodal AI models that can handle audio and video together more effectively.
06
Hugging Face Daily Papers · 2 HR AGO / primary source

Evolution Fine-Tuning: Learning to Discover Across 371 Optimization Tasks

Institution: Minnesota NLP | Authors: Young-Jun Lee, Seungone Kim, Minki Kang, Alistair Cheong Liang Chuen, Zerui Chen

Evolutionary fine-tuning enables large language models to develop cross-task problem-solving capabilities by learning from search trajectories, demonstrating improved performance on mathematical conjectures and optimization tasks. (AI summary generated by Qwen/Qwen2.5-Coder-32B-Instruct)

— the rundown
08

DOPD: Dual On-policy Distillation

A new AI learning method, DOPD, enhances how large AI models transfer knowledge by solving the 'privilege illusion' problem.

Hugging Face Daily Papers · 2 HR AGO / primary source
12

Drop-Then-Recovery: How Redundant Are Vision-Language-Action Models?

Institution: LLM-Drop | Authors: Guoheng Sun, Kaixi Feng, Shwai He, Xiaochuan Gong, Yexiao He

Research reveals that language backbones in Vision-Language-Action models are highly redundant for robotic manipulation tasks, while vision and action pathways are more critical, suggesting a need for deliberate capacity allocation in future architectures. (AI summary generated by Qwen/Qwen2.5-Coder-32B-Instruct)

Hugging Face Daily Papers · 10 HR AGO / primary source
14

MirrorPPR: Exemplar-Based Portrait Photo Retouching

Institution: DENG Lab @ SJTU | Authors: Zhihong Liu, Zheng Li, Jiachun Jin, Siqi Kou, Yitao Jian

An exemplar-based portrait retouching framework using a Diffusion Transformer with LoRA adaptation and self-augmented training data achieves superior quality and identity preservation. (AI summary generated by Qwen/Qwen2.5-Coder-32B-Instruct)

Hugging Face Daily Papers · 8 HR AGO / primary source
19

A Gravitational Interpretation of Fine-Tuning Reversion

Institution: Mohamed Bin Zayed University of Artificial Intelligence | Authors: Samuele Poppi, Nils Lukas

Post-alignment safety degradation arises from geometric properties of training history, where fine-tuning reversion follows a persistent direction defined by early training dynamics. (AI summary generated by Qwen/Qwen2.5-Coder-32B-Instruct)

Hugging Face Daily Papers · 15 HR AGO / primary source

546 stories scored · 14-day window · 15/20 fully briefed / ranked entirely by dera's own scoring · the score stays hidden

Weekly AI brief for practitioners

A practitioner-first AI newsletter by dera.ai — what changed, why it matters, what to try next

By dera.ai • Every Monday, free
