Showing 1–50 of 50 results for author: Frei, S

  1. arXiv:2410.07746  [pdf, other]

    cs.LG stat.ML

    Benign Overfitting in Single-Head Attention

    Authors: Roey Magen, Shuning Shang, Zhiwei Xu, Spencer Frei, Wei Hu, Gal Vardi

    Abstract: The phenomenon of benign overfitting, where a trained neural network perfectly fits noisy training data but still achieves near-optimal test performance, has been extensively studied in recent years for linear models and fully-connected/convolutional networks. In this work, we study benign overfitting in a single-head softmax attention model, which is the fundamental building block of Transformers…

    Submitted 10 October, 2024; originally announced October 2024.

  2. arXiv:2410.01774  [pdf, other]

    cs.LG stat.ML

    Trained Transformer Classifiers Generalize and Exhibit Benign Overfitting In-Context

    Authors: Spencer Frei, Gal Vardi

    Abstract: Transformers have the capacity to act as supervised learning algorithms: by properly encoding a set of labeled training ("in-context") examples and an unlabeled test example into an input sequence of vectors of the same dimension, the forward pass of the transformer can produce predictions for that unlabeled test example. A line of recent work has shown that when linear transformers are pre-traine…

    Submitted 2 October, 2024; originally announced October 2024.

    Comments: 34 pages

  3. arXiv:2407.18904  [pdf, other]

    math.AG

    Birational geometry of Fano varieties of lines on cubic fourfolds containing pairs of cubic scrolls

    Authors: Corey Brooke, Sarah Frei, Lisa Marquand, Xuqiang Qin

    Abstract: We characterize the birational geometry of some hyperkähler fourfolds of Picard rank $3$ obtained as the Fano varieties of lines on cubic fourfolds containing pairs of cubic scrolls. In each of the two cases considered, we provide a census of the birational models, relating each model to familiar geometric constructions. We also provide structural results about the birational automorphism groups,…

    Submitted 2 August, 2024; v1 submitted 26 July, 2024; originally announced July 2024.

    Comments: v2: updated figures and ancillary files. 34 pages, 4 figures, two ancillary files

    MSC Class: 14J42; 14E08; 14E07; 14J35

  4. arXiv:2406.13510  [pdf, ps, other]

    math.AG

    Conic bundle threefolds differing by a constant Brauer class and connections to rationality

    Authors: Sarah Frei, Lena Ji, Soumya Sankar, Bianca Viray, Isabel Vogt

    Abstract: A double cover $Y$ of $\mathbb{P}^1 \times \mathbb{P}^2$ ramified over a general $(2,2)$-divisor will have the structure of a geometrically standard conic bundle ramified over a smooth plane quartic $Δ\subset \mathbb{P}^2$ via the second projection. These threefolds are rational over algebraically closed fields, but over nonclosed fields, including over $\mathbb{R}$, their rationality is an open p…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 18 pages

    MSC Class: 14C25 (Primary) 14E08; 14G27; 14H40; 14K30 (Secondary)

  5. arXiv:2404.00522  [pdf, other]

    cs.LG stat.ML

    Minimum-Norm Interpolation Under Covariate Shift

    Authors: Neil Mallinar, Austin Zane, Spencer Frei, Bin Yu

    Abstract: Transfer learning is a critical part of real-world machine learning deployments and has been extensively studied in experimental works with overparameterized neural networks. However, even in the simplest setting of linear regression a notable gap still exists in the theoretical understanding of transfer learning. In-distribution research on high-dimensional linear regression has led to the identi…

    Submitted 17 July, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: The Forty-first International Conference on Machine Learning (ICML 2024)

  6. arXiv:2403.12517  [pdf, other]

    math.AG

    On decompositions for Fano schemes of intersections of two quadrics

    Authors: Pieter Belmans, Jishnu Bose, Sarah Frei, Benjamin Gould, James Hotchkiss, Alicia Lamarche, Jack Petok, Cristian Rodriguez Avila, Saket Shah

    Abstract: We propose conjectural semiorthogonal decompositions for Fano schemes of linear subspaces on intersections of two quadrics, in terms of symmetric powers of the associated hyperelliptic (resp. stacky) curve. When the intersection is odd-dimensional, we moreover conjecture an identity in the Grothendieck ring of varieties and other motivic contexts. The evidence for these conjectures is given by upg…

    Submitted 8 April, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: 27 pages, comments are very welcome; v2: added an inadvertently omitted reference in support of Conjecture A

  7. arXiv:2402.15615  [pdf, other]

    physics.flu-dyn math.NA

    Attached and separated rotating flow over a finite height ridge

    Authors: Stefan Frei, Erik Burman, Edward R Johnson

    Abstract: This paper discusses the effect of rotation on the boundary layer in high Reynolds number flow over a ridge using a numerical method based on stabilised finite elements that captures steady solutions up to Reynolds number of order $10^6$. The results are validated against boundary layer computations in shallow flows and for deep flows against experimental observations reported in Machicoane et al.…

    Submitted 23 February, 2024; originally announced February 2024.

  8. arXiv:2402.00209  [pdf, other]

    math.NA

    Modeling and numerical simulation of fully Eulerian fluid-structure interaction using cut finite elements

    Authors: Stefan Frei, Tobias Knoke, Marc C. Steinbach, Anne-Kathrin Wenske, Thomas Wick

    Abstract: We present a monolithic finite element formulation for (nonlinear) fluid-structure interaction in Eulerian coordinates. For the discretization we employ an unfitted finite element method based on inf-sup stable finite elements. So-called ghost penalty terms are used to guarantee the robustness of the approach independently of the way the interface cuts the finite element mesh. The resulting system…

    Submitted 31 January, 2024; originally announced February 2024.

  9. arXiv:2312.17373  [pdf, other]

    math.NA math.OC

    A non-intrusive neural-network based BFGS algorithm for parameter estimation in non-stationary elasticity

    Authors: Stefan Frei, Jan Reichle, Stefan Volkwein

    Abstract: We present a non-intrusive gradient and a non-intrusive BFGS algorithm for parameter estimation problems in non-stationary elasticity. To avoid multiple (and potentially expensive) solutions of the underlying partial differential equation (PDE), we approximate the PDE solver by a neural network within the algorithms. The network is trained offline for a given set of parameters. The algorithms are…

    Submitted 15 August, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

  10. arXiv:2310.02541  [pdf, other]

    cs.LG stat.ML

    Benign Overfitting and Grokking in ReLU Networks for XOR Cluster Data

    Authors: Zhiwei Xu, Yutong Wang, Spencer Frei, Gal Vardi, Wei Hu

    Abstract: Neural networks trained by gradient descent (GD) have exhibited a number of surprising generalization behaviors. First, they can achieve a perfect fit to noisy training data and still generalize near-optimally, showing that overfitting can sometimes be benign. Second, they can undergo a period of classical, harmful overfitting -- achieving a perfect fit to training data with near-random performanc…

    Submitted 3 October, 2023; originally announced October 2023.

  11. arXiv:2309.00531  [pdf, ps, other]

    math.NT math.AG

    On abelian varieties whose torsion is not self-dual

    Authors: Sarah Frei, Katrina Honigs, John Voight

    Abstract: We construct infinitely many abelian surfaces $A$ defined over the rational numbers such that, for $\ell\leqslant 7$ prime, the $\ell$-torsion subgroup of $A$ is not isomorphic as a Galois module to the $\ell$-torsion subgroup of the dual abelian surface. We do this by analyzing the action of the Galois group on the $\ell$-adic Tate module and its reduction modulo $\ell$.

    Submitted 1 September, 2023; originally announced September 2023.

    Comments: 22 pages

  12. arXiv:2308.03215  [pdf, other]

    stat.ML cs.LG

    The Effect of SGD Batch Size on Autoencoder Learning: Sparsity, Sharpness, and Feature Learning

    Authors: Nikhil Ghosh, Spencer Frei, Wooseok Ha, Bin Yu

    Abstract: In this work, we investigate the dynamics of stochastic gradient descent (SGD) when training a single-neuron autoencoder with linear or ReLU activation on orthogonal data. We show that for this non-convex problem, randomly initialized SGD with a constant step size successfully finds a global minimum for any batch size choice. However, the particular global minimum found depends upon the batch size…

    Submitted 6 August, 2023; originally announced August 2023.

  13. arXiv:2306.09927  [pdf, other]

    stat.ML cs.AI cs.CL cs.LG

    Trained Transformers Learn Linear Models In-Context

    Authors: Ruiqi Zhang, Spencer Frei, Peter L. Bartlett

    Abstract: Attention-based neural networks such as transformers have demonstrated a remarkable ability to exhibit in-context learning (ICL): Given a short prompt sequence of tokens from an unseen task, they can formulate relevant per-token and next-token predictions without any parameter updates. By embedding a sequence of labeled training data and unlabeled test data as a prompt, this allows for transformer…

    Submitted 19 October, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: 50 pages, revised definition 3.2 and corollary 4.3

  14. A threefold violating a local-to-global principle for rationality

    Authors: Sarah Frei, Lena Ji

    Abstract: In this note we construct an example of a smooth projective threefold that is irrational over $\mathbb Q$ but is rational at all places. Our example is a complete intersection of two quadrics in $\mathbb P^5$, and we show it has the desired rationality behavior by constructing an explicit element of order $4$ in the Tate--Shafarevich group of the Jacobian of an associated genus $2$ curve.

    Submitted 26 June, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

    Comments: 7 pages. v2: the example is now unconditional

    MSC Class: Primary: 14E08. Secondary: 14G12; 14J20; 11G30

    Journal ref: Res. Number Theory 10 (2024), no. 2, Paper No. 39, 9 pp

  15. arXiv:2303.01462  [pdf, ps, other]

    cs.LG stat.ML

    Benign Overfitting in Linear Classifiers and Leaky ReLU Networks from KKT Conditions for Margin Maximization

    Authors: Spencer Frei, Gal Vardi, Peter L. Bartlett, Nathan Srebro

    Abstract: Linear classifiers and leaky ReLU networks trained by gradient flow on the logistic loss have an implicit bias towards solutions which satisfy the Karush--Kuhn--Tucker (KKT) conditions for margin maximization. In this work we establish a number of settings where the satisfaction of these KKT conditions implies benign overfitting in linear classifiers and in two-layer leaky ReLU networks: the estim…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: 53 pages

  16. arXiv:2303.01456  [pdf, ps, other]

    cs.LG stat.ML

    The Double-Edged Sword of Implicit Bias: Generalization vs. Robustness in ReLU Networks

    Authors: Spencer Frei, Gal Vardi, Peter L. Bartlett, Nathan Srebro

    Abstract: In this work, we study the implications of the implicit bias of gradient flow on generalization and adversarial robustness in ReLU networks. We focus on a setting where the data consists of clusters and the correlations between cluster means are small, and show that in two-layer ReLU networks gradient flow is biased towards solutions that generalize well, but are highly vulnerable to adversarial e…

    Submitted 31 October, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

    Comments: 42 pages; NeurIPS 2023 camera ready

  17. arXiv:2210.07082  [pdf, other]

    cs.LG stat.ML

    Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data

    Authors: Spencer Frei, Gal Vardi, Peter L. Bartlett, Nathan Srebro, Wei Hu

    Abstract: The implicit biases of gradient-based optimization algorithms are conjectured to be a major factor in the success of modern deep learning. In this work, we investigate the implicit bias of gradient flow and gradient descent in two-layer fully-connected neural networks with leaky ReLU activations when the training data are nearly-orthogonal, a common property of high-dimensional data. For gradient…

    Submitted 13 October, 2022; originally announced October 2022.

    Comments: 54 pages

  18. arXiv:2207.14035  [pdf, ps, other]

    math.AG math.NT

    Groups of symplectic involutions on symplectic varieties of Kummer type and their fixed loci

    Authors: Sarah Frei, Katrina Honigs

    Abstract: We describe the Galois action on the middle $\ell$-adic cohomology of smooth, projective fourfolds $K_A(v)$ that occur as a fiber of the Albanese morphism on moduli spaces of sheaves on an abelian surface $A$ with Mukai vector $v$. We show this action is determined by the action on $H^2_{ét}(A_{\bar{k}},\mathbb{Q}_\ell(1))$ and on a subgroup $G_A(v) \leqslant (A\times \hat{A})[3]$, which depends o…

    Submitted 15 April, 2023; v1 submitted 28 July, 2022; originally announced July 2022.

    Comments: 44 pages. v3: minor revisions. final version to appear in Forum Math. Sigma

  19. Curve classes on conic bundle threefolds and applications to rationality

    Authors: Sarah Frei, Lena Ji, Soumya Sankar, Bianca Viray, Isabel Vogt

    Abstract: We undertake a study of conic bundle threefolds $π\colon X\to W$ over geometrically rational surfaces whose associated discriminant covers $\tilde{Δ}\to Δ\subset W$ are smooth and geometrically irreducible. First, we determine the structure of the group $\mathrm{CH}^2 X_{\overline{k}}$ of rational equivalence classes of curves. Precisely, we construct a Galois-equivariant group homomorphism from…

    Submitted 21 July, 2023; v1 submitted 14 July, 2022; originally announced July 2022.

    Comments: 39 pages. Comments welcome! v2: Updated introduction. v3: Added Section 3, Subsection 5.5, and Example 8.6

    MSC Class: 14C25 (Primary); 14E08; 14G27; 14H40; 14K30 (Secondary)

    Journal ref: Algebr. Geom. 11 (3) (2024) 421-459

  20. arXiv:2207.02081  [pdf, other]

    math.NA

    Efficient coarse correction for parallel time-stepping in plaque growth simulations

    Authors: Stefan Frei, Alexander Heinlein

    Abstract: In order to make the numerical simulation of atherosclerotic plaque growth feasible, a temporal homogenization approach is employed. The resulting macro-scale problem for the plaque growth can be further accelerated by using parallel time integration schemes, such as the parareal algorithm. However, the parallel scalability is dominated by the computational cost of the coarse propagator. Therefore…

    Submitted 5 July, 2022; originally announced July 2022.

    Comments: This article has been written as a contribution to a conference proceedings, which builds on our previous article arXiv:2203.06526

    MSC Class: 65M60; 65Y05

  21. arXiv:2203.06581  [pdf, other]

    math.NA

    Analysis of an implicitly extended Crank-Nicolson scheme for the heat equation on a time-dependent domain

    Authors: Stefan Frei, Maneesh Kumar Singh

    Abstract: We consider a time-stepping scheme of Crank-Nicolson type for the heat equation on a moving domain in Eulerian coordinates. As the spatial domain varies between subsequent time steps, an extension of the solution from the previous time step is required. Following Lehrenfeld & Olshanskii [ESAIM: M2AN, 53(2):585-614, 2019], we apply an implicit extension based on so-called ghost-penalty terms. Fo…

    Submitted 28 April, 2023; v1 submitted 13 March, 2022; originally announced March 2022.

    MSC Class: 65M12; 65M60; 65M85

  22. Towards parallel time-stepping for the numerical simulation of atherosclerotic plaque growth

    Authors: Stefan Frei, Alexander Heinlein

    Abstract: The numerical simulation of atherosclerotic plaque growth is computationally prohibitive, since it involves a complex cardiovascular fluid-structure interaction (FSI) problem with a characteristic time scale of milliseconds to seconds, as well as a plaque growth process governed by reaction-diffusion equations, which takes place over several months. In this work we combine a temporal homogenizatio…

    Submitted 25 April, 2023; v1 submitted 12 March, 2022; originally announced March 2022.

  23. arXiv:2202.07626  [pdf, other]

    cs.LG math.ST stat.ML

    Random Feature Amplification: Feature Learning and Generalization in Neural Networks

    Authors: Spencer Frei, Niladri S. Chatterji, Peter L. Bartlett

    Abstract: In this work, we provide a characterization of the feature-learning process in two-layer ReLU networks trained by gradient descent on the logistic loss following random initialization. We consider data with binary labels that are generated by an XOR-like function of the input features. We permit a constant fraction of the training labels to be corrupted by an adversary. We show that, although line…

    Submitted 13 September, 2023; v1 submitted 15 February, 2022; originally announced February 2022.

    Comments: 46 pages; JMLR camera ready revision

  24. arXiv:2202.05928  [pdf, ps, other]

    cs.LG math.ST stat.ML

    Benign Overfitting without Linearity: Neural Network Classifiers Trained by Gradient Descent for Noisy Linear Data

    Authors: Spencer Frei, Niladri S. Chatterji, Peter L. Bartlett

    Abstract: Benign overfitting, the phenomenon where interpolating models generalize well in the presence of noisy data, was first observed in neural network models trained with gradient descent. To better understand this empirical observation, we consider the generalization error of two-layer neural networks trained to interpolation by gradient descent on the logistic loss following random initialization. We…

    Submitted 13 September, 2023; v1 submitted 11 February, 2022; originally announced February 2022.

    Comments: 39 pages; minor corrections

  25. arXiv:2111.08668  [pdf, ps, other]

    math.AG math.NT

    Reduction of Brauer classes on K3 surfaces, rationality and derived equivalence

    Authors: Sarah Frei, Brendan Hassett, Anthony Várilly-Alvarado

    Abstract: We consider the reduction of Brauer classes on surfaces over number fields, with a view toward applications to rationality and derived equivalence. We show that a Brauer class on a very general polarized K3 surface over a number field becomes trivial upon reduction for a set of places of positive natural density. As a consequence, there are cubic fourfolds which become rational upon reduction for…

    Submitted 4 March, 2022; v1 submitted 16 November, 2021; originally announced November 2021.

    Comments: 23 pages; minor changes to clarify main argument

    MSC Class: 14J28; 14F22 (primary); 14E08; 14F08 (secondary)

  26. arXiv:2106.13805  [pdf, other]

    cs.LG math.OC stat.ML

    Self-training Converts Weak Learners to Strong Learners in Mixture Models

    Authors: Spencer Frei, Difan Zou, Zixiang Chen, Quanquan Gu

    Abstract: We consider a binary classification problem when the data comes from a mixture of two rotationally symmetric distributions satisfying concentration and anti-concentration properties enjoyed by log-concave distributions among others. We show that there exists a universal constant $C_{\mathrm{err}}>0$ such that if a pseudolabeler $\boldsymbol{β}_{\mathrm{pl}}$ can achieve classification error at most…

    Submitted 25 August, 2021; v1 submitted 25 June, 2021; originally announced June 2021.

    Comments: 23 pages. This version has added more detailed comparisons with related work, fixed a technical issue in the original proof, and improved the convergence guarantee to be about the last iterate of stochastic gradient descent

  27. arXiv:2106.13792  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent

    Authors: Spencer Frei, Quanquan Gu

    Abstract: Although the optimization objectives for learning neural networks are highly non-convex, gradient-based methods have been wildly successful at learning neural networks in practice. This juxtaposition has led to a number of recent studies on provable guarantees for neural networks trained by gradient descent. Unfortunately, the techniques in these works are often highly specific to the particular s…

    Submitted 13 September, 2022; v1 submitted 25 June, 2021; originally announced June 2021.

    Comments: 16 pages. Updated presentation, changed results from online SGD to batch GD

  28. arXiv:2106.09394  [pdf, other]

    math.NA

    On temporal homogenization in the numerical simulation of atherosclerotic plaque growth

    Authors: Stefan Frei, Alexander Heinlein, Thomas Richter

    Abstract: A temporal homogenization approach for the numerical simulation of atherosclerotic plaque growth is extended to fully coupled fluid-structure interaction (FSI) simulations. The numerical results indicate that the two-scale approach yields significantly different results compared to a simple heuristic averaging, where only stationary long-scale FSI problems are solved, confirming the importance of…

    Submitted 17 June, 2021; originally announced June 2021.

    MSC Class: 65M60; 74F10; 76M10; 76Z05; 92C35

  29. arXiv:2104.09437  [pdf, other]

    cs.LG cs.CR math.OC stat.ML

    Provable Robustness of Adversarial Training for Learning Halfspaces with Noise

    Authors: Difan Zou, Spencer Frei, Quanquan Gu

    Abstract: We analyze the properties of adversarial training for learning adversarially robust halfspaces in the presence of agnostic label noise. Denoting $\mathsf{OPT}_{p,r}$ as the best robust classification error achieved by a halfspace that is robust to perturbations of $\ell_{p}$ balls of radius $r$, we show that adversarial training on the standard binary cross-entropy loss yields adversarially robust…

    Submitted 19 April, 2021; originally announced April 2021.

    Comments: 42 pages, 2 figures

  30. A mechanically consistent model for fluid-structure interactions with contact including seepage

    Authors: Erik Burman, Miguel A. Fernández, Stefan Frei, Fannie M. Gerosa

    Abstract: We present a new approach for the mechanically consistent modelling and simulation of fluid-structure interactions with contact. The fundamental idea consists of combining a relaxed contact formulation with the modelling of seepage through a porous layer of co-dimension 1 during contact. For the latter, a Darcy model is considered in a thin porous layer attached to a solid boundary in the limit of…

    Submitted 20 March, 2021; originally announced March 2021.

  31. arXiv:2101.01152  [pdf, other]

    cs.LG math.OC stat.ML

    Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise

    Authors: Spencer Frei, Yuan Cao, Quanquan Gu

    Abstract: We consider a one-hidden-layer leaky ReLU network of arbitrary width trained by stochastic gradient descent (SGD) following an arbitrary initialization. We prove that SGD produces neural networks that have classification accuracy competitive with that of the best halfspace over the distribution for a broad class of distributions that includes log-concave isotropic and hard margin distributions. Eq…

    Submitted 15 February, 2021; v1 submitted 4 January, 2021; originally announced January 2021.

    Comments: 30 pages, 10 figures

  32. arXiv:2011.08691  [pdf, other]

    physics.flu-dyn cs.CE math.NA physics.comp-ph

    Falling balls in a viscous fluid with contact: Comparing numerical simulations with experimental data

    Authors: Henry von Wahl, Thomas Richter, Stefan Frei, Thomas Hagemeier

    Abstract: We evaluate a number of different finite element approaches for fluid-structure (contact) interaction problems against data from physical experiments. For this we take the data from experiments by Hagemeier [Mendeley Data, doi: 10.17632/mf27c92nc3.1]. This consists of trajectories of single particles falling through a highly viscous fluid and rebounding off the bottom fluid tank wall. The resultin…

    Submitted 17 November, 2020; originally announced November 2020.

    Journal ref: Physics of Fluids 33, 033304 (2021)

  33. arXiv:2010.00539  [pdf, other]

    cs.LG math.OC stat.ML

    Agnostic Learning of Halfspaces with Gradient Descent via Soft Margins

    Authors: Spencer Frei, Yuan Cao, Quanquan Gu

    Abstract: We analyze the properties of gradient descent on convex surrogates for the zero-one loss for the agnostic learning of linear halfspaces. If $\mathsf{OPT}$ is the best classification error achieved by a halfspace, by appealing to the notion of soft margins we are able to show that gradient descent finds halfspaces with classification error $\tilde O(\mathsf{OPT}^{1/2}) + \varepsilon$ in…

    Submitted 13 February, 2021; v1 submitted 1 October, 2020; originally announced October 2020.

    Comments: 25 pages, 1 table

  34. arXiv:2007.13906  [pdf, other]

    math.NA

    A locally modified second-order finite element method for interface problems and its implementation in 2 dimensions

    Authors: Stefan Frei, Gozel Judakova, Thomas Richter

    Abstract: The locally modified finite element method, which is introduced in [Frei, Richter: SINUM 52(2014), p. 2315-2334], is a simple fitted finite element method that is able to resolve weak discontinuities in interface problems. The method is based on a fixed structured coarse mesh, which is then refined into sub-elements to resolve an interior interface. In this work, we extend the locally modified f…

    Submitted 28 April, 2023; v1 submitted 27 July, 2020; originally announced July 2020.

  35. arXiv:2005.14426  [pdf, other]

    cs.LG math.OC stat.ML

    Agnostic Learning of a Single Neuron with Gradient Descent

    Authors: Spencer Frei, Yuan Cao, Quanquan Gu

    Abstract: We consider the problem of learning the best-fitting single neuron as measured by the expected square loss $\mathbb{E}_{(x,y)\sim \mathcal{D}}[(σ(w^\top x)-y)^2]$ over some unknown joint distribution $\mathcal{D}$ by using gradient descent to minimize the empirical risk induced by a set of i.i.d. samples $S\sim \mathcal{D}^n$. The activation function $σ$ is an arbitrary Lipschitz and non-decreasin…

    Submitted 31 August, 2020; v1 submitted 29 May, 2020; originally announced May 2020.

    Comments: 31 pages, 3 tables. This version improves the risk bound from O(OPT^1/2) to O(OPT) for strictly increasing activation functions

  36. arXiv:1912.08503  [pdf, other]

    math.NA

    3D-2D Stokes-Darcy coupling for the modelling of seepage with an application to fluid-structure interaction with contact

    Authors: Erik Burman, Miguel A. Fernández, Stefan Frei, Fannie M. Gerosa

    Abstract: In this note we introduce a mixed dimensional Stokes-Darcy coupling where a $d$ dimensional Stokes' flow is coupled to a Darcy model on the $d-1$ dimensional boundary of the domain. The porous layer introduces tangential creeping flow along the boundary and allows for the modelling of boundary flow due to surface roughness. This leads to a new model of flow in fracture networks with reservoirs in…

    Submitted 18 December, 2019; originally announced December 2019.

  37. arXiv:1910.03054  [pdf, other]

    math.NA

    Eulerian time-stepping schemes for the non-stationary Stokes equations on time-dependent domains

    Authors: Erik Burman, Stefan Frei, Andre Massing

    Abstract: This article is concerned with the discretisation of the Stokes equations on time-dependent domains in an Eulerian coordinate framework. Our work can be seen as an extension of a recent paper by Lehrenfeld & Olshanskii [ESAIM: M2AN, 53(2):585-614, 2019], where BDF-type time-stepping schemes are studied for a parabolic equation on moving domains. For space discretisation, a geometrically unfitted f…

    Submitted 1 December, 2020; v1 submitted 7 October, 2019; originally announced October 2019.

  38. arXiv:1910.02934  [pdf, other]

    cs.LG math.OC stat.ML

    Algorithm-Dependent Generalization Bounds for Overparameterized Deep Residual Networks

    Authors: Spencer Frei, Yuan Cao, Quanquan Gu

    Abstract: The skip-connections used in residual networks have become a standard architecture choice in deep learning due to the increased training stability and generalization performance with this architecture, although there has been limited theoretical understanding for this improvement. In this work, we analyze overparameterized deep residual networks trained by gradient descent following random initial…

    Submitted 7 October, 2019; originally announced October 2019.

    Comments: 37 pages. In NeurIPS 2019

  39. Weak imposition of Signorini boundary conditions on the boundary element method

    Authors: Erik Burman, Stefan Frei, Matthew W. Scroggs

    Abstract: We derive and analyse a boundary element formulation for boundary conditions involving inequalities. In particular, we focus on Signorini contact conditions. The Calderón projector is used for the system matrix and boundary conditions are weakly imposed using a particular variational boundary operator designed using techniques from augmented Lagrangian methods. We present a complete numerical a pr…

    Submitted 13 May, 2020; v1 submitted 15 August, 2019; originally announced August 2019.

    Journal ref: SIAM Journal on Numerical Analysis 58(4), 2020, 2334-2350

  40. Rational points and derived equivalence

    Authors: Nicolas Addington, Benjamin Antieau, Sarah Frei, Katrina Honigs

    Abstract: We give the first examples of derived equivalences between varieties defined over non-closed fields where one has a rational point and the other does not. We begin with torsors over Jacobians of curves over Q and F_q(t), and conclude with a pair of hyperkaehler 4-folds over Q. The latter is independently interesting as a new example of a transcendental Brauer-Manin obstruction to the Hasse princip…

    Submitted 16 December, 2020; v1 submitted 5 June, 2019; originally announced June 2019.

    Comments: 20 pages, magma code as ancillary files. final version to appear in Compositio Math

    Journal ref: Compositio Math. 157 (2021) 1036-1050

  41. Efficient approximation of flow problems with multiple scales in time

    Authors: Stefan Frei, Thomas Richter

    Abstract: In this article we address flow problems that carry a multiscale character in time. In particular we consider the Navier-Stokes flow in a channel on a fast scale that influences the movement of the boundary which undergoes a deformation on a slow scale in time. We derive an averaging scheme that is of first order with respect to the ratio of time-scales $ε$. In order to cope with the problem of un…

    Submitted 31 March, 2020; v1 submitted 28 March, 2019; originally announced March 2019.

    Journal ref: SIAM Multiscale Modeling and Simulation 18(2), 2020

  42. arXiv:1810.06735  [pdf, ps, other]

    math.AG math.NT

    Moduli spaces of sheaves on K3 surfaces and Galois representations

    Authors: Sarah Frei

    Abstract: We consider two K3 surfaces defined over an arbitrary field, together with a smooth proper moduli space of stable sheaves on each. When the moduli spaces have the same dimension, we prove that if the étale cohomology groups (with Q_ell coefficients) of the two surfaces are isomorphic as Galois representations, then the same is true of the two moduli spaces. In particular, if the field of definitio…

    Submitted 12 May, 2021; v1 submitted 15 October, 2018; originally announced October 2018.

    Comments: 16 pages. Minor changes to match published version, which appeared in Selecta Math

  43. arXiv:1810.04766  [pdf, other]

    math.NA

    An edge-based pressure stabilisation technique for finite elements on arbitrarily anisotropic meshes

    Authors: Stefan Frei

    Abstract: In this article, we analyse a stabilised equal-order finite element approximation for the Stokes equations on anisotropic meshes. In particular, we allow arbitrary anisotropies in a sub-domain, for example along the boundary of the domain, with the only condition that a maximum angle is fulfilled in each element. This discretisation is motivated by applications on moving domains as arising e.g. in…

    Submitted 10 October, 2018; originally announced October 2018.

    MSC Class: 65N12; 65N30; 76D07

  44. arXiv:1808.08758  [pdf, other]

    math.NA

    A Nitsche-based formulation for fluid-structure interactions with contact

    Authors: Erik Burman, Miguel A. Fernández, Stefan Frei

    Abstract: We derive a Nitsche-based formulation for fluid-structure interaction (FSI) problems with contact. The approach is based on the work of Chouly and Hild [SIAM Journal on Numerical Analysis. 2013;51(2):1295--1307] for contact problems in solid mechanics. We present two numerical approaches, both of them formulating the FSI interface and the contact conditions simultaneously in equation form on a joi…

    Submitted 27 August, 2018; originally announced August 2018.

    MSC Class: 65M60

  45. arXiv:1806.00999  [pdf, other]

    math.NA

    On the implementation of a locally modified finite element method for interface problems in deal.II

    Authors: Stefan Frei, Thomas Richter, Thomas Wick

    Abstract: In this work, we describe a simple finite element approach that is able to resolve weak discontinuities in interface problems accurately. The approach is based on a fixed patch mesh consisting of quadrilaterals, that will stay unchanged independent of the position of the interface. Inside the patches we refine once more, either in eight triangles or in four quadrilaterals, in such a way that the i…

    Submitted 4 June, 2018; originally announced June 2018.

    MSC Class: 65M60

  46. arXiv:1804.02768  [pdf, other]

    math.NA

    Finite element simulation of fluid dynamics and CO$_2$ gas exchange in the alveolar sacs of the human lung

    Authors: Luis J. Caucha, Stefan Frei, Obidio Rubio

    Abstract: In this article we present a numerical framework based on continuum models for the fluid dynamics and the CO$_2$ gas distribution in the alveolar sacs of the human lung during expiration and inspiration, including the gas exchange to the cardiovascular system. We include the expansion and contraction of the geometry by means of the Arbitrary Lagrangian Eulerian (ALE) method. For discretisation, we…

    Submitted 8 April, 2018; originally announced April 2018.

    MSC Class: 65M60; 76Z05

  47. arXiv:1706.00632  [pdf, other]

    math.OC

    An adaptive Newton algorithm for optimal control problems with application to optimal electrode design

    Authors: Thomas Carraro, Simon Dörsam, Stefan Frei, Daniel Schwarz

    Abstract: In this work we present an adaptive Newton-type method to solve nonlinear constrained optimization problems in which the constraint is a system of partial differential equations discretized by the finite element method. The adaptive strategy is based on a goal-oriented a posteriori error estimation for the discretization and for the iteration error. The iteration error stems from an inexact soluti…

    Submitted 2 June, 2017; originally announced June 2017.

  48. arXiv:1703.03516  [pdf, ps, other]

    math.NT

    The a-number of hyperelliptic curves

    Authors: Sarah Frei

    Abstract: It is known that for a smooth hyperelliptic curve to have a large $a$-number, the genus must be small relative to the characteristic of the field, $p>0$, over which the curve is defined. It was proven by Elkin that for a genus $g$ hyperelliptic curve $C$ to have $a_C=g-1$, the genus is bounded by $g<\frac{3p}{2}$. In this paper, we show that this bound can be lowered to $g<p$. The method of proof…

    Submitted 26 June, 2017; v1 submitted 9 March, 2017; originally announced March 2017.

    Comments: 7 pages. v2: revised and improved the proof of the main theorem based on suggestions from the referee. To appear in the proceedings volume of Women in Numbers Europe-2

  49. arXiv:1603.04130  [pdf, ps, other]

    math.PR

    A lower bound for $p_c$ in range-$R$ bond percolation in two and three dimensions

    Authors: Spencer Frei, Edwin Perkins

    Abstract: We use the connection between bond percolation and SIR epidemics to establish lower bounds for the critical percolation probability in $2$ and $3$ dimensions as the range becomes large. The bound agrees with the conjectured asymptotics for the long range critical probability, refines results of M. Penrose, and complements results of van der Hofstad and Sakai in dimensions greater than $6$.

    Submitted 14 March, 2016; originally announced March 2016.

    MSC Class: 60K35 (Primary) 60J68; 60J80; 92D30 (Secondary)

  50. Quantification of deviations from rationality with heavy-tails in human dynamics

    Authors: Thomas Maillart, Didier Sornette, Stefan Frei, Thomas Duebendorfer, Alexander Saichev

    Abstract: The dynamics of technological, economic and social phenomena is controlled by how humans organize their daily tasks in response to both endogenous and exogenous stimulations. Queueing theory is believed to provide a generic answer to account for the often observed power-law distributions of waiting times before a task is fulfilled. However, the general validity of the power law and the nature of o…

    Submitted 23 July, 2010; originally announced July 2010.

    Comments: 17 pages, 4 figures