arXiv Paper Digest

Manifold limit for the training of shallow graph convolutional neural networks

Authors: Johanna Tengler, Christoph Brune, José A. Iglesias

First: 2026-01-09T18:59:20+00:00 · Latest: 2026-01-09T18:59:20+00:00

Comments: 44 pages, 0 figures, 1 table

Abstract

We study the discrete-to-continuum consistency of the training of shallow graph convolutional neural networks (GCNNs) on proximity graphs of sampled point clouds under a manifold assumption. Graph convolution is defined spectrally via the graph Laplacian, whose low-frequency spectrum approximates that of the Laplace-Beltrami operator of the underlying smooth manifold, and shallow GCNNs of possibly infinite width are linear functionals on the space of measures on the parameter space. From this functional-analytic perspective, graph signals are seen as spatial discretizations of functions on the manifold, which leads to a natural notion of training data consistent across graph resolutions. To enable convergence results, the continuum parameter space is chosen as a weakly compact product of unit balls, with Sobolev regularity imposed on the output weight and bias, but not on the convolutional parameter. The corresponding discrete parameter spaces inherit the corresponding spectral decay, and are additionally restricted by a frequency cutoff adapted to the informative spectral window of the graph Laplacians. Under these assumptions, we prove $Γ$-convergence of regularized empirical risk minimization functionals and corresponding convergence of their global minimizers, in the sense of weak convergence of the parameter measures and uniform convergence of the functions over compact sets. This provides a formalization of mesh and sample independence for the training of such networks.

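Title / Abstract (translated from Chinese)

Title: Manifold limit for the training of shallow graph convolutional neural networks

We study the discrete-to-continuum consistency of training shallow graph convolutional neural networks (GCNNs) on proximity graphs of sampled point clouds under a manifold assumption. Graph convolution is defined spectrally through the graph Laplacian, whose low-frequency spectrum approximates that of the Laplace-Beltrami operator of the underlying smooth manifold, and shallow GCNNs, possibly of infinite width, are linear functionals on the space of measures on the parameter space. From this functional-analytic viewpoint, graph signals are treated as spatial discretizations of functions on the manifold, which yields a natural notion of training data that is consistent across graph resolutions. To make convergence results possible, the continuum parameter space is chosen as a weakly compact product of unit balls, with Sobolev regularity imposed on the output weight and bias but not on the convolutional parameter. The corresponding discrete parameter spaces inherit the corresponding spectral decay and are further restricted by a frequency cutoff adapted to the informative spectral window of the graph Laplacians. Under these assumptions, we prove Γ-convergence of the regularized empirical risk minimization functionals and the corresponding convergence of their global minimizers, in the sense of weak convergence of the parameter measures and uniform convergence of the functions over compact sets. This provides a formalization of mesh and sample independence for the training of such networks.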

Summary

The study investigates the discrete-to-continuum consistency of training shallow graph convolutional neural networks (GCNNs) on proximity graphs of sampled point clouds under a manifold assumption. Graph convolution is defined spectrally via the graph Laplacian, and shallow GCNNs are treated as linear functionals on the space of measures on the parameter space. The paper proves Γ-convergence of the regularized empirical risk minimization functionals and convergence of their global minimizers, which formalizes mesh and sample independence for the training of such networks.

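This study examines the discrete-to-continuum consistency of training shallow graph convolutional neural networks (GCNNs) on proximity graphs of point clouds under a manifold assumption. It focuses on spectral graph convolution defined through the graph Laplacian and on linear functionals on measures over the parameter space. The main results are proofs of Γ-convergence of the regularized empirical risk minimization functionals and of convergence of their global minimizers, which formalize mesh and sample independence for the training of such networks.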
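To make the phrase "graph convolution is defined spectrally via the graph Laplacian" concrete, the following is a minimal NumPy sketch, not the paper's construction. It assumes an unnormalized Laplacian with 0/1 edge weights on a fixed-radius proximity graph and a hard cutoff that keeps only the lowest frequencies; the function names `proximity_graph_laplacian` and `spectral_convolution`, the radius, and the circle used as the manifold are all illustrative choices.

```python
import numpy as np


def proximity_graph_laplacian(points, radius):
    """Unnormalized Laplacian L = D - W of the graph linking points closer than `radius`."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    weights = (dists <= radius).astype(float)
    np.fill_diagonal(weights, 0.0)  # no self-loops
    return np.diag(weights.sum(axis=1)) - weights


def spectral_convolution(laplacian, signal, filter_coeffs):
    """Scale the lowest len(filter_coeffs) Laplacian frequencies of `signal`.

    Frequencies above the cutoff are discarded (assumed hard cutoff).
    """
    filter_coeffs = np.asarray(filter_coeffs, dtype=float)
    _, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues come back in ascending order
    low = eigvecs[:, : len(filter_coeffs)]  # low-frequency eigenvectors
    return low @ (filter_coeffs * (low.T @ signal))


# Point cloud sampled from the unit circle (a 1D manifold embedded in the plane).
rng = np.random.default_rng(0)
n = 200
angles = 2.0 * np.pi * (np.arange(n) + 0.5 * rng.uniform(size=n)) / n
points = np.stack([np.cos(angles), np.sin(angles)], axis=1)

laplacian = proximity_graph_laplacian(points, radius=0.3)
noisy = np.cos(angles) + 0.3 * rng.standard_normal(n)
smoothed = spectral_convolution(laplacian, noisy, np.ones(5))
print(np.linalg.norm(noisy - np.cos(angles)), np.linalg.norm(smoothed - np.cos(angles)))
```

The filter here acts only on the first few eigenvectors, which loosely mirrors the abstract's frequency cutoff adapted to the part of the graph spectrum that approximates the Laplace-Beltrami spectrum. How the paper chooses that cutoff and the graph weights is not stated in the abstract.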

AdaFuse: Adaptive Ensemble Decoding with Test-Time Scaling for LLMs

Authors: Chengming Cui, Tianxin Wei, Ziyi Chen, Ruizhong Qiu, Zhichen Zeng, Zhining Liu, Xuying Ning, Duo Zhou, Jingrui He

First: 2026-01-09T18:58:22+00:00 · Latest: 2026-01-09T18:58:22+00:00


Abstract

Large language models (LLMs) exhibit complementary strengths arising from differences in pretraining data, model architectures, and decoding behaviors. Inference-time ensembling provides a practical way to combine these capabilities without retraining. However, existing ensemble approaches suffer from fundamental limitations. Most rely on fixed fusion granularity, which lacks the flexibility required for mid-generation adaptation and fails to adapt to different generation characteristics across tasks. To address these challenges, we propose AdaFuse, an adaptive ensemble decoding framework that dynamically selects semantically appropriate fusion units during generation. Rather than committing to a fixed granularity, AdaFuse adjusts fusion behavior on the fly based on the decoding context, with words serving as basic building blocks for alignment. To be specific, we introduce an uncertainty-based criterion to decide whether to apply ensembling at each decoding step. Under confident decoding states, the model continues generation directly. In less certain states, AdaFuse invokes a diversity-aware scaling strategy to explore alternative candidate continuations and inform ensemble decisions. This design establishes a synergistic interaction between adaptive ensembling and test-time scaling, where ensemble decisions guide targeted exploration, and the resulting diversity in turn strengthens ensemble quality. Experiments on open-domain question answering, arithmetic reasoning, and machine translation demonstrate that AdaFuse consistently outperforms strong ensemble baselines, achieving an average relative improvement of 6.88%. The code is available at https://github.com/CCM0111/AdaFuse.

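Title / Abstract (translated from Chinese)

Title: AdaFuse: Adaptive Ensemble Decoding with Test-Time Scaling for LLMs

Large language models (LLMs) show complementary strengths that stem from differences in pretraining data, model architecture, and decoding behavior. Inference-time ensembling offers a practical way to combine these capabilities without retraining. Existing ensemble methods, however, have fundamental limitations. Most rely on a fixed fusion granularity, which lacks the flexibility to adjust mid-generation and cannot adapt to the generation characteristics of different tasks. To address these challenges, we propose AdaFuse, an adaptive ensemble decoding framework that dynamically selects semantically appropriate fusion units during generation. Rather than fusing at a fixed granularity, AdaFuse adjusts its fusion behavior on the fly according to the decoding context, using words as the basic units of alignment. Specifically, we introduce an uncertainty-based criterion that decides at each decoding step whether to apply ensembling. In confident decoding states, the model continues generating directly. In uncertain states, AdaFuse invokes a diversity-aware scaling strategy to explore alternative candidate continuations and inform the ensemble decision. This design creates a synergy between adaptive ensembling and test-time scaling: ensemble decisions guide targeted exploration, and the resulting diversity in turn improves ensemble quality. Experiments on open-domain question answering, arithmetic reasoning, and machine translation show that AdaFuse consistently outperforms strong ensemble baselines, with an average relative improvement of 6.88%. The code is available at https://github.com/CCM0111/AdaFuse.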

Summary

AdaFuse is an adaptive ensemble decoding framework designed to dynamically select fusion units during generation, addressing the limitations of fixed-granularity ensembling. It uses an uncertainty-based criterion to decide whether to apply ensembling at each step, and employs a diversity-aware scaling strategy in uncertain states to explore alternative continuations. Experiments show that AdaFuse outperforms strong ensemble baselines, achieving an average relative improvement of 6.88%.

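AdaFuse is an adaptive ensemble decoding framework that dynamically selects fusion units during generation, addressing the limitations of fixed-granularity ensembling. It uses an uncertainty criterion to decide at each step whether to apply ensembling and, in states of higher uncertainty, uses a diversity-aware scaling strategy to explore alternative candidate continuations and inform ensemble decisions. Experiments show that AdaFuse outperforms strong ensemble baselines on open-domain question answering, arithmetic reasoning, and machine translation, with an average relative improvement of 6.88%.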
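The uncertainty-gated decision described in the abstract can be illustrated with a small sketch. This is not the AdaFuse implementation: the abstract does not say which uncertainty measure or fusion rule is used, so the code below assumes Shannon entropy with a fixed threshold and plain probability averaging. The names `entropy` and `gated_step` and the `max_entropy` parameter are hypothetical. Candidates are keyed by word strings to echo the idea of words as the alignment unit, and the diversity-aware scaling step is not modeled.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a {candidate: probability} distribution."""
    return -sum(p * math.log(p) for p in probs.values() if p > 0)


def gated_step(primary, others, max_entropy=0.5):
    """Pick the next word, ensembling only when the primary model is uncertain.

    primary: {word: probability} from the model currently driving generation.
    others:  list of {word: probability} dicts from the other ensemble members.
    Returns (chosen_word, used_ensemble).
    """
    if entropy(primary) <= max_entropy:
        # Confident state: continue with the primary model's own top choice.
        return max(primary, key=primary.get), False
    # Uncertain state: average the distributions over the union of candidates
    # (assumed fusion rule, chosen for simplicity).
    models = [primary, *others]
    candidates = set().union(*models)
    fused = {c: sum(m.get(c, 0.0) for m in models) / len(models) for c in candidates}
    return max(fused, key=fused.get), True


confident = {"Paris": 0.97, "Lyon": 0.03}
uncertain = {"4": 0.40, "5": 0.35, "6": 0.25}
second_model = {"5": 0.70, "4": 0.20, "6": 0.10}

print(gated_step(confident, [second_model]))
print(gated_step(uncertain, [second_model]))
```

In the first call the primary distribution is sharply peaked, so the other model is never consulted. In the second call the entropy exceeds the threshold and the averaged distribution decides the word.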

LookAroundNet: Extending Temporal Context with Transformers for Clinically Viable EEG Seizure Detection

Authors: Þór Sverrisson, Steinn Guðmundsson

First: 2026-01-09T18:52:24+00:00 · Latest: 2026-01-09T18:52:24+00:00