geondpt docs

class geondpt.Paraboloid(input_features, output_features, bias=True, device=None, dtype=None, output_factor=0.1, input_factor=0.01, lr_factor=10.0, wd_factor=10.0, init='live', h_factor=0.01, p_factor=-1e-06, grad_factor=1.0, init_from_numpy=None)

Bases: Module

Passes the incoming data through a layer of paraboloid neurons.

Args:

input_features

Size of each input sample.

output_features

Size of each output sample.

bias

Has no effect. Accepted only so that a Linear layer can be replaced with a Paraboloid one without changing the call. Default: True.

output_factor

Multiplies the output of the module. Default: 0.1.

input_factor

Multiplies the input before passing it through the layer. Default: 0.01.

lr_factor

Multiplies the learning rate applied to the parameters by the optimizer. Default: 10.0.

wd_factor

Multiplies the weight decay applied to the parameters by the optimizer. Default: 10.0.

init

Selects the initialization method for the parameters. Valid options are 'spotlight', 'live', 'linear'. Default: 'live'.

h_factor

Affects the 'spotlight' and 'live' initializations. Multiplies the magnitude of the directrix vector. Default: 0.01.

p_factor

Affects the 'spotlight' and 'live' initializations. Determines the offset of the focus from the data subspace. Default: -0.000001.

grad_factor

Multiplies the outgoing delta signal. Default: 1.0.

init_from_numpy

Initializes the parameter tensor directly from a NumPy array. Default: None.
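
As a rough illustration of lr_factor and wd_factor, the sketch below computes the values the optimizer would effectively apply to this layer's parameters. It assumes the factors act as plain multipliers, as the descriptions above state. The helper name effective_hyperparams is hypothetical and not part of geondpt.

```python
def effective_hyperparams(lr, weight_decay, lr_factor=10.0, wd_factor=10.0):
    """Return (learning rate, weight decay) after per-layer scaling.

    Hypothetical helper: assumes both factors are simple multipliers.
    The default factors match the Paraboloid signature.
    """
    return lr * lr_factor, weight_decay * wd_factor


# Optimizer configured with lr=0.1 and weight_decay=5e-4.
layer_lr, layer_wd = effective_hyperparams(0.1, 5e-4)
print(layer_lr, layer_wd)
```

With the Paraboloid defaults, both values come out ten times larger than the ones passed to the optimizer, which is worth keeping in mind when reusing hyperparameters tuned for a Linear layer.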


Shape:

  • Input: \((*, H_{in})\) where \(*\) means any number of dimensions including none and \(H_{in} = \text{input_features}\).

  • Output: \((*, H_{out})\) where all but the last dimension are the same shape as the input and \(H_{out} = \text{output_features}\).
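
The shape rule above can be sketched in plain Python without building the layer. The function paraboloid_output_shape is a hypothetical helper written for this illustration. It only encodes the rule that the last dimension changes from input_features to output_features.

```python
def paraboloid_output_shape(input_shape, input_features, output_features):
    """Apply the documented shape rule: only the last dimension changes."""
    shape = tuple(input_shape)
    if not shape or shape[-1] != input_features:
        raise ValueError(
            f"last dimension must be {input_features}, got shape {shape}"
        )
    return shape[:-1] + (output_features,)


print(paraboloid_output_shape((128, 20), 20, 30))    # (128, 30)
print(paraboloid_output_shape((20,), 20, 30))        # (30,)
print(paraboloid_output_shape((4, 7, 20), 20, 30))   # (4, 7, 30)
```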


Example:

>>> import torch
>>> import geondpt as gpt
>>> pb = gpt.Paraboloid(20, 30)
>>> input = torch.randn(128, 20)
>>> output = pb(input)
>>> print(output.size())
torch.Size([128, 30])

class geondpt.ParaConv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, bias=True, padding_mode='constant', device=None, dtype=None, output_factor=0.1, input_factor=0.01, lr_factor=1.0, wd_factor=2.0, skip_input_grad=False, init='spotlight', h_factor=0.01, p_factor=-1e-06, grad_factor=1.0, init_from_numpy=None)

Bases: Module

Applies a 2D convolution over an input signal composed of several input planes using the paraboloid neuron computation.

The arguments kernel_size, stride, padding, dilation can either be:

  • a single int – in which case the same value is used for the height and width dimension.

  • a tuple of two ints – in which case, the first int is used for the height dimension, and the second int for the width dimension.

This module currently does not support grouping.
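
The int-or-tuple convention described above can be sketched as a small normalization step. The helper to_pair is hypothetical and only illustrates the convention. How ParaConv2d normalizes these arguments internally is not shown in this documentation.

```python
def to_pair(value):
    """Expand an int to (height, width); pass a tuple of two ints through."""
    if isinstance(value, int):
        return (value, value)
    pair = tuple(value)
    if len(pair) != 2 or not all(isinstance(v, int) for v in pair):
        raise ValueError(f"expected an int or a tuple of two ints, got {value!r}")
    return pair


print(to_pair(3))       # (3, 3)
print(to_pair((3, 5)))  # (3, 5)
```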

Args:

in_channels

Number of channels in the input image.

out_channels

Number of channels produced by the convolution.

kernel_size

Size of the convolving kernel.

stride

Stride of the convolution. Default: 1.

padding

Padding added to all four sides of the input. Default: 0.

dilation

Spacing between kernel elements. Default: 1.

bias

Has no effect. Accepted only so that an existing layer can be replaced with a paraboloid one without changing the call. Default: True.

padding_mode

Padding mode, interpreted the same way as the mode argument of torch.nn.functional.pad. Default: 'constant'.

output_factor

Multiplies the output of the module. Default: 0.1.

input_factor

Multiplies the input before passing it through the layer. Default: 0.01.

lr_factor

Multiplies the learning rate applied to the parameters by the optimizer. Default: 1.0.

wd_factor

Multiplies the weight decay applied to the parameters by the optimizer. Default: 2.0.

skip_input_grad

If True, skips the computation of the delta signal. Should only be set for the very first layer of the network. Default: False.

init

Selects the initialization method for the parameters. Valid options are 'spotlight', 'linear'. Default: 'spotlight'.

h_factor

Affects the 'spotlight' and 'live' initializations. Multiplies the magnitude of the directrix vector. Default: 0.01.

p_factor

Affects the 'spotlight' and 'live' initializations. Determines the offset of the focus from the data subspace. Default: -0.000001.

grad_factor

Multiplies the outgoing delta signal. Default: 1.0.

init_from_numpy

Initializes the parameter tensor directly from a NumPy array. Default: None.


Shape:

  • Input: \((N, C_{in}, H_{in}, W_{in})\)

  • Output: \((N, C_{out}, H_{out}, W_{out})\), where

    \[H_{out} = \left\lfloor\frac{H_{in} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel_size}[0] - 1) - 1}{\text{stride}[0]} + 1\right\rfloor\]
    \[W_{out} = \left\lfloor\frac{W_{in} + 2 \times \text{padding}[1] - \text{dilation}[1] \times (\text{kernel_size}[1] - 1) - 1}{\text{stride}[1]} + 1\right\rfloor\]
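
As a quick check of the two formulas above, the following sketch evaluates them in plain Python. The function conv_output_size is a hypothetical helper, not part of geondpt, and it takes every argument as a (height, width) pair.

```python
def conv_output_size(h_in, w_in, kernel_size, stride=(1, 1),
                     padding=(0, 0), dilation=(1, 1)):
    """Evaluate the H_out and W_out formulas for (height, width) pairs."""
    def one_dim(size, k, s, p, d):
        # floor(x / s + 1) equals x // s + 1 for a positive integer stride s
        return (size + 2 * p - d * (k - 1) - 1) // s + 1

    h_out = one_dim(h_in, kernel_size[0], stride[0], padding[0], dilation[0])
    w_out = one_dim(w_in, kernel_size[1], stride[1], padding[1], dilation[1])
    return h_out, w_out


# A 50 x 100 input with a (3, 5) kernel, stride (2, 1), padding (4, 2),
# and dilation (3, 1).
print(conv_output_size(50, 100, (3, 5), stride=(2, 1),
                       padding=(4, 2), dilation=(3, 1)))  # (26, 100)
```

For the height, the numerator is 50 + 8 - 6 - 1 = 51, so the result is floor(51 / 2) + 1 = 26. For the width, it is 100 + 4 - 4 - 1 = 99, giving 99 + 1 = 100.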

Example:

>>> import torch
>>> import geondpt as gpt
>>> pb = gpt.ParaConv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2), dilation=(3, 1))
>>> input = torch.randn(20, 16, 50, 100)
>>> output = pb(input)

class geondpt.GeoNDSGD(params, lr=0.001, momentum=0, dampening=0, weight_decay=0, nesterov=False, *, maximize=False, foreach: bool | None = None)

Bases: Optimizer

Implements stochastic gradient descent that properly handles weight decay for models that include paraboloid neurons. The foreach parameter is removed, as it does not seem to work properly. Otherwise the same as torch.optim.SGD.

Args:

params

Iterable of parameters to optimize or dicts defining parameter groups.

lr

Learning rate. Default: 1e-3.

momentum

Momentum factor. Default: 0.

weight_decay

Weight decay. Default: 0.

dampening

Dampening for momentum. Default: 0.

nesterov

Enables Nesterov momentum. Default: False.

maximize

Maximize the objective with respect to the params, instead of minimizing. Default: False.
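
To show how the arguments above interact, here is a minimal scalar sketch of one plain SGD update with weight decay, momentum, dampening, and Nesterov momentum, following the update rule of torch.optim.SGD. It is an illustration only: it does not reproduce the paraboloid-specific weight decay handling of GeoNDSGD, it omits maximize, and the function sgd_step is hypothetical.

```python
def sgd_step(param, grad, buf, lr=1e-3, momentum=0.0, dampening=0.0,
             weight_decay=0.0, nesterov=False):
    """One scalar SGD update; returns the new (param, momentum buffer).

    `buf` is the momentum buffer from the previous step, or None on the
    first step.
    """
    if weight_decay != 0:
        # Weight decay enters as an extra gradient term.
        grad = grad + weight_decay * param
    if momentum != 0:
        if buf is None:
            buf = grad  # the first step seeds the buffer with the gradient
        else:
            buf = momentum * buf + (1 - dampening) * grad
        grad = grad + momentum * buf if nesterov else buf
    return param - lr * grad, buf


# First step with the settings used in the example below.
param, buf = sgd_step(1.0, grad=0.5, buf=None, lr=0.1, momentum=0.9,
                      weight_decay=5e-4, nesterov=True)
print(param, buf)
```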


Example:

>>> import geondpt as gpt
>>> optimizer = gpt.GeoNDSGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4, nesterov=True)
>>> optimizer.zero_grad()
>>> loss_fn(model(input), target).backward()
>>> optimizer.step()