-3 votes
0 answers
13 views

Why is normalisation good for learning in neural networks, in an intuitive sense? [closed]

I have been wondering about this: when we normalise, we lose the magnitude part of the information and only directional information remains, so why doesn't losing this magnitude information impact ...
White Hat
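As a rough intuition for the question above: normalising a vector keeps its direction and discards its magnitude, and any quantity that depends only on direction (such as cosine similarity) is unchanged by it. A minimal sketch in plain Python (illustrative only, not code from the question):

```python
import math

def normalise(v):
    """Scale a vector to unit length, keeping only its direction."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(u, v):
    """Cosine similarity depends only on direction, not magnitude."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

v = [3.0, 4.0]
u = normalise(v)      # direction preserved, magnitude discarded
w = [30.0, 40.0]      # same direction as v, 10x the magnitude
# v and w are indistinguishable to any direction-only quantity,
# which is one intuition for why dropping magnitude can be harmless.
```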
0 votes
0 answers
20 views

Autograd on a specific layer’s parameters

I am trying to get the Jacobian matrix of a specific layer's parameters. Below is my network model, and I apply functional_call on it. def fm(params, input): return functional_call(self.model, ...
Klae zhou
0 votes
0 answers
14 views

Do we plug in the old values or the new values during the gradient descent update?

I have a scenario where I am trying to optimize a vector of D dimensions. Every component of the vector depends on the other components according to a function such as: summation over (i,j): (1-e(x_i)(...
Darkmoon Chief
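For the question above, the standard convention is a simultaneous update: all partial derivatives are evaluated at the old point before any component changes. A plain-Python sketch of both variants on a toy coupled objective (illustrative only; this is not the function from the question):

```python
def grad_step_simultaneous(x, grad, lr):
    """Standard gradient descent: every component is updated using
    gradients evaluated at the OLD point (a simultaneous update)."""
    g = grad(x)                         # all partials at the old x
    return [xi - lr * gi for xi, gi in zip(x, g)]

def grad_step_in_place(x, grad, lr):
    """Coordinate-wise variant: each component sees the NEW values of
    the components updated before it (Gauss-Seidel style)."""
    x = list(x)
    for i in range(len(x)):
        x[i] = x[i] - lr * grad(x)[i]   # re-evaluates with updated entries
    return x

# Toy coupled objective f(x) = (x0 + x1)^2, grad = [2(x0+x1), 2(x0+x1)]
grad = lambda x: [2 * (x[0] + x[1]), 2 * (x[0] + x[1])]
# The two conventions give different results as soon as components interact:
# the in-place step lets the second coordinate see the already-updated first one.
```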
0 votes
0 answers
24 views

ValueError: One or more gradients are None, meaning no gradients are flowing

I'm trying to train a model I wrote based on this paper: A lightweight model using frequency, trend and temporal attention for long sequence time-series prediction. Now, during the training I ...
Giacomo Golino
-1 votes
1 answer
48 views

What is wrong with my gradient descent implementation (SVM classifier with hinge loss)

I am trying to implement and train an SVM multi-class classifier from scratch using Python and NumPy in Jupyter notebooks. I have been using the CS231n course as my base of knowledge, especially this ...
ho88it
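For the CS231n-style setup in the question above, a useful reference point is the gradient of the multiclass hinge loss with respect to the class scores; the gradient with respect to the weights then follows by the chain rule through scores = W·x. A plain-Python sketch (names are illustrative, not from the question's code):

```python
def svm_hinge_grad(scores, y, delta=1.0):
    """Gradient of the multiclass hinge loss
    L = sum over j != y of max(0, s_j - s_y + delta)
    with respect to the score vector, for a single example with label y."""
    grad = [0.0] * len(scores)
    for j, s in enumerate(scores):
        if j == y:
            continue
        if s - scores[y] + delta > 0:   # margin violated by class j
            grad[j] += 1.0              # push the wrong score down ...
            grad[y] -= 1.0              # ... and the correct score up
    return grad
```

Each margin-violating class contributes +1 to its own score's gradient and -1 to the correct class's gradient; classes that already satisfy the margin contribute nothing.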
0 votes
0 answers
45 views

Lagrangian relaxation (Pytorch implementation) with inequality constraints does not converge

I'm using PyTorch to solve an optimization problem with Lagrangian relaxation. I think my implementation is correct, but it just does not converge and I do not know why. I'm unsure what the problem is and ...
Z. Ma
0 votes
0 answers
23 views

Custom Quantile Loss Function in LightGBM R Predicts Same Value for All Quantiles

I'm trying to implement a custom quantile loss function using LightGBM in R, which works like the already implemented quantile loss, to later add further customisation to this function. However, when ...
qweurtz
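One frequent pitfall with custom quantile objectives in gradient-boosting libraries is the gradient/Hessian pair: the pinball loss has zero second derivative almost everywhere, so a positive constant is usually substituted for the Hessian, and the gradient's sign must account for differentiating with respect to the prediction rather than the residual. A plain-Python sketch of the per-observation pair a LightGBM-style custom objective is expected to return (a sketch of the math only, not tested against LightGBM's R API):

```python
def quantile_grad_hess(y_true, y_pred, alpha):
    """Per-observation gradient and Hessian of the pinball (quantile) loss
    L(r) = alpha * r if r >= 0 else (alpha - 1) * r, with r = y_true - y_pred.
    The true second derivative is 0 almost everywhere, so a positive constant
    is substituted (a common workaround for Newton-style boosters)."""
    grad, hess = [], []
    for yt, yp in zip(y_true, y_pred):
        r = yt - yp
        # dL/dy_pred: the minus sign comes from r = y_true - y_pred
        grad.append(-alpha if r >= 0 else (1.0 - alpha))
        hess.append(1.0)
    return grad, hess
```

Because only alpha changes the gradient, a bug that fixes alpha (or a Hessian of exactly zero) is a classic cause of every quantile producing the same predictions.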
0 votes
0 answers
35 views

How can I calculate just a part of the gradient using make_functional in PyTorch?

func_model, func_params = make_functional(self.model) def fm(x, func_params): fx = func_model(func_params, x) return fx.squeeze(0).squeeze(0) def floss(...
Klae zhou
0 votes
1 answer
59 views

Simple Gradient Descent in Python vs Keras

I am practicing neural networks by building my own in notebooks. I am trying to check my model against an equivalent model in Keras. My model seems to work the same as other simple coded neural ...
AdamS
1 vote
1 answer
35 views

PyTorch function involving softmax and log2 second derivative is always 0

I'm trying to compute the second derivatives (Hessian) of a function t with respect to a tensor a using PyTorch. Below is the code I initially wrote: import torch torch.manual_seed(0) a = torch....
Ray Bern
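When an autograd Hessian comes back identically zero, as in the question above, a central finite difference is a quick numerical cross-check on whether the second derivative really is zero or the computation graph is being cut somewhere. A generic plain-Python sketch for the scalar case (illustrative, independent of PyTorch):

```python
def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x). If this is clearly
    nonzero while autograd reports 0, the autograd graph is likely being
    broken somewhere (e.g. a detach or a non-differentiable op)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)
```

For f(t) = t^3 this should return roughly 6t, giving a known value to compare an autograd result against.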
0 votes
0 answers
18 views

In NLOpt, with constrained gradient-based optimization, should I use equality constraints or inequality constraints?

It seems like quite often, either one could apply. That is, I have some function which I know has a minimum of 0 and I want to specify that it has to equal 0. Does it make more sense to specify a ...
dspyz
0 votes
1 answer
59 views

Can you affine warp a tensor while preserving gradient flow?

I'm trying to recreate the cv2.warpAffine() function, taking a tensor input and output rather than a Numpy array. However, gradients calculated from the output tensor produce a Non-None gradient ...
arcanespud
0 votes
0 answers
18 views

How to improve the gradient algorithm?

I've recently become interested in the gradient descent algorithm. After a first implementation in Node.js, I obtained satisfactory results. Sometimes, however, the algorithm diverges. For example, with the Rosenbrock ...
pronodingo
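On the divergence mentioned above: with a fixed step size, plain gradient descent on the Rosenbrock function blows up once the step exceeds the stability limit set by the valley's curvature, and shrinking the step restores (slow) progress. A plain-Python sketch (in Python rather than Node.js; the bound is just a blow-up detector):

```python
def rosenbrock(x, y):
    """f(x, y) = (1 - x)^2 + 100 (y - x^2)^2, minimum 0 at (1, 1)."""
    return (1 - x) * (1 - x) + 100 * (y - x * x) * (y - x * x)

def gradient_descent(lr, steps, x=0.0, y=0.0, bound=1e6):
    """Plain gradient descent on Rosenbrock. Returns (x, y, diverged),
    flagging divergence as soon as an iterate leaves [-bound, bound]."""
    for _ in range(steps):
        gx = -2 * (1 - x) - 400 * x * (y - x * x)   # df/dx
        gy = 200 * (y - x * x)                       # df/dy
        x, y = x - lr * gx, y - lr * gy
        if abs(x) > bound or abs(y) > bound:
            return x, y, True
    return x, y, False

# lr = 0.02 exceeds the local stability limit (about 2/200 in the steep
# y-direction) and the iterates explode; lr = 0.001 stays stable and makes
# slow but steady progress down the curved valley.
```

Line search, gradient clipping, or momentum-style methods are the usual remedies when a single fixed step size cannot handle curvature this uneven.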
-1 votes
1 answer
38 views

Learning rate in Gradient Descent algorithm

In the gradient descent algorithm, I update the B and M values according to their derivatives multiplied by the learning rate value, but when I use the same value for L, such as 0.0001,...
Fhurky
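For the question above: the update rule is new = old − L · derivative for both parameters, and L trades speed against stability: too small is slow, too large makes the iterates oscillate with growing amplitude. A plain-Python sketch fitting y = m·x + b by mean-squared-error gradient descent (the names m and b mirror the question's M and B; the data is a made-up toy set):

```python
def gd_step(m, b, points, lr):
    """One gradient-descent step for y = m*x + b under mean squared error.
    Each derivative is scaled by the learning rate lr before subtracting."""
    n = len(points)
    dm = sum(-2 * x * (y - (m * x + b)) for x, y in points) / n
    db = sum(-2 * (y - (m * x + b)) for x, y in points) / n
    return m - lr * dm, b - lr * db

points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # exactly y = 2x
m, b = 0.0, 0.0
for _ in range(5000):
    m, b = gd_step(m, b, points, lr=0.01)
# m drifts toward 2 and b toward 0; with a much larger lr the same loop
# overshoots the minimum on every step and the values grow without bound.
```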
0 votes
0 answers
15 views

Not calculating gradients for backward propagation

I have the following code for my custom loss function, which I want to minimize, but it doesn't calculate any gradients for the backward pass. My code for the custom loss function: class CustomLossFunction(nn....
fabone
