geondpt docs

class geondpt.Paraboloid(input_features, output_features, bias=True, device=None, dtype=None, output_factor=0.1, input_factor=0.01, lr_factor=10.0, wd_factor=10.0, init='live', h_factor=0.01, p_factor=-1e-06, grad_factor=1.0, init_from_numpy=None)

Bases: Module

Passes the incoming data through a layer of paraboloid neurons.

Args:

input_features
Size of each input sample.

output_features
Size of each output sample.

bias
Accepted only so that Paraboloid can serve as a drop-in replacement for Linear layers; it has no effect. Default: True.

output_factor
Multiplies the output of the module. Default: 0.1.

input_factor
Multiplies the input before passing it through the layer. Default: 0.01.

lr_factor
Multiplies the learning rate applied to the parameters by the optimizer. Default: 10.0.

wd_factor
Multiplies the weight decay applied to the parameters by the optimizer. Default: 10.0.

init
Selects the initialization method for the parameters. Valid options are 'spotlight', 'live', 'linear'. Default: 'live'.

h_factor
Affects the 'spotlight' and 'live' initializations. Multiplies the magnitude of the directrix vector. Default: 0.01.

p_factor
Affects the 'spotlight' and 'live' initializations. Determines the offset of the focus from the data subspace. Default: -0.000001.

grad_factor
Multiplies the outgoing delta signal. Default: 1.0.

init_from_numpy
Initializes the parameter tensor directly from a NumPy array. Default: None.
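
As a rough illustration of how the scaling arguments above combine, the sketch below assumes that input_factor and output_factor wrap the layer's computation, and that lr_factor and wd_factor simply multiply the optimizer's base values. The names scaled_forward, effective_hyperparams, and core are hypothetical and not part of geondpt; core stands in for the paraboloid computation, which is not described here.

```python
def scaled_forward(core, x, input_factor=0.01, output_factor=0.1):
    """Scale the input, apply the layer's computation, then scale the output."""
    scaled_input = [input_factor * v for v in x]
    return [output_factor * y for y in core(scaled_input)]


def effective_hyperparams(base_lr, base_wd, lr_factor=10.0, wd_factor=10.0):
    """Assumed effective learning rate and weight decay for the layer's parameters."""
    return base_lr * lr_factor, base_wd * wd_factor


# Stand-in computation (identity), used purely for illustration.
scaled = scaled_forward(lambda values: values, [100.0, 200.0])
lr, wd = effective_hyperparams(base_lr=0.1, base_wd=5e-4)
print(scaled, lr, wd)
```

Under these assumptions, the default factors shrink the activations by a combined factor of 0.001 while the layer's parameters train with ten times the optimizer's learning rate and weight decay.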

Shape:

Input: \((*, H_{in})\) where \(*\) means any number of dimensions including none and \(H_{in} = \text{input_features}\).

Output: \((*, H_{out})\) where all but the last dimension are the same shape as the input and \(H_{out} = \text{output_features}\).
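
The shape rule above can be expressed as a small helper. This is only a sketch of the rule as stated; paraboloid_output_shape is a hypothetical name and not a geondpt function.

```python
def paraboloid_output_shape(input_shape, input_features, output_features):
    """Return the output shape for an input of shape (*, input_features)."""
    shape = tuple(input_shape)
    if not shape or shape[-1] != input_features:
        raise ValueError(
            f"last dimension must be {input_features}, got shape {shape}"
        )
    # Every leading dimension is preserved; only the last one changes.
    return shape[:-1] + (output_features,)


batch_shape = paraboloid_output_shape((128, 20), 20, 30)
vector_shape = paraboloid_output_shape((20,), 20, 30)
print(batch_shape, vector_shape)
```

The second call covers the "no leading dimensions" case, where a single sample of 20 features maps to a single output of 30 features.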

Example:

>>> import torch
>>> import geondpt as gpt
>>> pb = gpt.Paraboloid(20, 30)
>>> input = torch.randn(128, 20)
>>> output = pb(input)
>>> print(output.size())
torch.Size([128, 30])

class geondpt.ParaConv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, bias=True, padding_mode='constant', device=None, dtype=None, output_factor=0.1, input_factor=0.01, lr_factor=1.0, wd_factor=2.0, skip_input_grad=False, init='spotlight', h_factor=0.01, p_factor=-1e-06, grad_factor=1.0, init_from_numpy=None)

Bases: Module

Applies a 2D convolution over an input signal composed of several input planes using the paraboloid neuron computation.

The arguments kernel_size, stride, padding, dilation can either be:

a single int – in which case the same value is used for the height and width dimension.

a tuple of two ints – in which case, the first int is used for the height dimension, and the second int for the width dimension.
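
The int-or-tuple convention described above could be normalized as follows. This is a minimal sketch; to_pair is a hypothetical helper, and the library's own handling may differ.

```python
def to_pair(value):
    """Normalize an int or a 2-tuple of ints to a (height, width) tuple."""
    if isinstance(value, int):
        # A single int applies to both the height and the width dimension.
        return (value, value)
    pair = tuple(value)
    if len(pair) != 2:
        raise ValueError(f"expected an int or a tuple of two ints, got {value!r}")
    return pair


square_kernel = to_pair(3)
rect_kernel = to_pair((3, 5))
print(square_kernel, rect_kernel)
```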

This module currently does not support grouping.

Args:

in_channels
Number of channels in the input image.

out_channels
Number of channels produced by the convolution.

kernel_size
Size of the convolving kernel.

stride
Stride of the convolution. Default: 1.

padding
Padding added to all four sides of the input. Default: 0.

dilation
Spacing between kernel elements. Default: 1.

bias
Accepted only so that ParaConv2d can serve as a drop-in replacement for Conv2d layers; it has no effect. Default: True.

padding_mode
Padding mode, with the same meaning as the mode argument of torch.nn.functional.pad in PyTorch. Default: 'constant'.

output_factor
Multiplies the output of the module. Default: 0.1.

input_factor
Multiplies the input before passing it through the layer. Default: 0.01.

lr_factor
Multiplies the learning rate applied to the parameters by the optimizer. Default: 1.0.

wd_factor
Multiplies the weight decay applied to the parameters by the optimizer. Default: 2.0.

skip_input_grad
If set to True, skips the computation of the delta signal. This should only be set for the very first layer of the network. Default: False.

init
Selects the initialization method for the parameters. Valid options are 'spotlight', 'linear'. Default: 'spotlight'.

h_factor
Affects the 'spotlight' and 'live' initializations. Multiplies the magnitude of the directrix vector. Default: 0.01.

p_factor
Affects the 'spotlight' and 'live' initializations. Determines the offset of the focus from the data subspace. Default: -0.000001.

grad_factor
Multiplies the outgoing delta signal. Default: 1.0.

init_from_numpy
Initializes the parameter tensor directly from a NumPy array. Default: None.

Shape:

Input: \((N, C_{in}, H_{in}, W_{in})\)

Output: \((N, C_{out}, H_{out}, W_{out})\), where

\[H_{out} = \left\lfloor\frac{H_{in} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel_size}[0] - 1) - 1}{\text{stride}[0]} + 1\right\rfloor\]

\[W_{out} = \left\lfloor\frac{W_{in} + 2 \times \text{padding}[1] - \text{dilation}[1] \times (\text{kernel_size}[1] - 1) - 1}{\text{stride}[1]} + 1\right\rfloor\]

Example:

>>> import torch
>>> import geondpt as gpt
>>> pb = gpt.ParaConv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2), dilation=(3, 1))
>>> input = torch.randn(20, 16, 50, 100)
>>> output = pb(input)
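
The output size of the example above can be worked out directly from the shape formulas. The sketch below only evaluates those formulas in plain Python and does not call geondpt; conv_output_size is a hypothetical helper name.

```python
def conv_output_size(size_in, kernel_size, stride=1, padding=0, dilation=1):
    """Evaluate the H_out / W_out formula for one spatial dimension."""
    return (size_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Values from the example: kernel (3, 5), stride (2, 1), padding (4, 2), dilation (3, 1).
h_out = conv_output_size(50, kernel_size=3, stride=2, padding=4, dilation=3)
w_out = conv_output_size(100, kernel_size=5, stride=1, padding=2, dilation=1)
print((20, 33, h_out, w_out))  # (20, 33, 26, 100)
```

According to the formulas, the example input of shape (20, 16, 50, 100) should therefore produce an output of shape (20, 33, 26, 100).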

class geondpt.GeoNDSGD(params, lr=0.001, momentum=0, dampening=0, weight_decay=0, nesterov=False, *, maximize=False, foreach: bool | None = None)

Bases: Optimizer

Implements stochastic gradient descent that properly handles weight decay for models that include paraboloid neurons. The foreach option of torch.optim.SGD has been removed because it does not appear to work properly with this optimizer. Otherwise, the behavior is the same as torch.optim.SGD.

Args:

params
Iterable of parameters to optimize or dicts defining parameter groups.

lr
Learning rate. Default: 1e-3.

momentum
Momentum factor. Default: 0.

weight_decay
Weight decay. Default: 0.

dampening
Dampening for momentum. Default: 0.

nesterov
Enables Nesterov momentum. Default: False.

maximize
Maximize the objective with respect to the params, instead of minimizing. Default: False.

Example:

>>> import geondpt as gpt
>>> optimizer = gpt.GeoNDSGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4, nesterov=True)
>>> optimizer.zero_grad()
>>> loss_fn(model(input), target).backward()
>>> optimizer.step()
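
For orientation, the arguments above interact as in the following scalar sketch of the update rule used by torch.optim.SGD, which this optimizer is stated to follow apart from its weight decay handling. The function sgd_step is hypothetical and is not GeoNDSGD's implementation; in particular, the special treatment of paraboloid parameters is not described here and is not modeled.

```python
def sgd_step(param, grad, buf, lr=0.001, momentum=0.0, dampening=0.0,
             weight_decay=0.0, nesterov=False, maximize=False):
    """Apply one SGD update to a scalar parameter; return (new_param, new_buf)."""
    if maximize:
        grad = -grad  # ascend instead of descend
    if weight_decay != 0:
        grad = grad + weight_decay * param  # plain L2-style weight decay
    if momentum != 0:
        if buf is None:
            buf = grad  # the first step seeds the momentum buffer
        else:
            buf = momentum * buf + (1 - dampening) * grad
        grad = grad + momentum * buf if nesterov else buf
    return param - lr * grad, buf


# Two steps with a constant gradient of 0.5 and momentum 0.9.
p, b = sgd_step(1.0, 0.5, None, lr=0.1, momentum=0.9)
p, b = sgd_step(p, 0.5, b, lr=0.1, momentum=0.9)
print(p, b)
```

The momentum buffer is threaded through explicitly here to keep the sketch free of state; an optimizer object would store it per parameter between calls to step().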