Using PyTorch in PureBasic for inference
Hello,
I wrote a very minimal example of how to use PyTorch in PureBasic for inference, if anyone is interested. It involves C/C++, Python, and PureBasic; you have to know all three languages to some extent, because models are usually designed and trained in Python, and there is no standard C interface for PyTorch, so you first have to add a C++-to-C layer.
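The hand-off from Python to the native side is a TorchScript file (the same torch.jit.save call shows up in train.py further down this thread). As a rough sketch of the Python-side export, with a tiny linear model standing in for a real network:
Code:
import torch

# Rough sketch: export a model as TorchScript so the C++ layer can load
# it with torch::jit::load(). The tiny linear model is a stand-in for a
# real network.
model = torch.nn.Linear(784, 10)
model.eval()  # inference mode: disables dropout and the like
scripted = torch.jit.script(model)  # compile the module to TorchScript
scripted.save("model.pt")  # this file is all the native side ever sees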
At the time of writing, the example works on 64-bit Arch Linux:
https://github.com/Bradan/pytorch-in-purebasic
It's far from perfect and probably still has a memory leak (tensors with requires_grad set keep gathering gradient information). However, it shows that it is possible, and that's all I wanted to achieve.
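On the requires_grad point: in Python you would wrap inference in torch.no_grad() so that autograd records nothing (the libtorch C++ API has a corresponding torch::NoGradGuard). A rough sketch of the Python side, with a stand-in model:
Code:
import torch

# Rough sketch: inference without autograd bookkeeping, so no gradient
# information piles up. The linear model is a stand-in.
model = torch.nn.Linear(784, 10)
model.eval()
x = torch.rand(1, 784)  # dummy input
with torch.no_grad():  # autograd records nothing inside this block
    y = model(x)
print(y.requires_grad)  # False: the output carries no gradient history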
Enjoy.
bye,
Daniel
Re: Using PyTorch in PureBasic for inference
I'm excited to see someone working towards AI in PureBasic, even if other languages are involved.
I hope we can use AI with PureBasic in an easy way.
Thank you very much for the effort.
Re: Using PyTorch in PureBasic for inference
Thanks, I will have a good look at this.
Re: Using PyTorch in PureBasic for inference
Thanks for showing interest in this; it made me fix some minor problems and add a few more forward calls, in case you have models with more than one input (language transformers usually take a mask, for example).
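As a rough sketch (not from the repository) of what a model with more than one input looks like on the Python side, which is what the extra forward calls in the C layer have to cover:
Code:
import torch

# Rough sketch (not from the repo): a model whose forward takes two
# tensors, e.g. token embeddings plus an attention mask. The native
# layer then needs a matching forward call with two inputs.
class TwoInputModel(torch.nn.Module):
    def forward(self, x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        return (x * mask).sum(dim=-1)

model = torch.jit.script(TwoInputModel())
print(model(torch.rand(2, 5), torch.ones(2, 5)))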
To get basic learning functionality (backpropagation utilizing the gradient) you'd need more tensor functions, meaning slicing/view/add/multiply/divide/..., and the loss functions would be great to have as well. However, almost everyone does that in Python, as the syntax allows straightforward tensor usage through operator overloading and so on.
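As a small illustration (not from the repository) of what those tensor functions and the operator overloading look like in Python, with autograd keeping track of the gradient:
Code:
import torch

# Small illustration: slicing, view and overloaded arithmetic on tensors,
# with autograd recording the gradient along the way.
a = torch.tensor([1.0, 2.0, 3.0, 4.0], requires_grad=True)
b = a.view(2, 2)         # reshape without copying
c = b[:, 0] * 3.0 + 1.0  # slicing, multiply and add via overloaded operators
loss = c.sum()           # a scalar "loss" to backpropagate from
loss.backward()          # autograd fills in a.grad
print(a.grad)            # tensor([3., 0., 3., 0.])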
The whole machine learning business reduces to numerical optimization problems. Imagine you are standing on a landscape and want to find its deepest point. You would start from an arbitrary spot and let something roll downhill until it stands still. You cannot be sure there is no deeper point anywhere on the landscape, but you have reached a local minimum. The process of letting something roll down the landscape is called gradient descent. This way you only find local optima, not global ones, but the high number of dimensions/parameters in some problems often makes it difficult to find a global optimum anyway.
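The rolling-downhill picture fits in a few lines of code; a toy sketch of gradient descent on a one-dimensional "landscape":
Code:
import torch

# Toy sketch: gradient descent on the 1-D "landscape" f(x) = (x - 3)^2.
# Starting from an arbitrary point, each step rolls a little further downhill.
x = torch.tensor(0.0, requires_grad=True)
lr = 0.1  # learning rate = step length
for step in range(100):
    loss = (x - 3.0) ** 2  # height of the landscape at x
    loss.backward()        # slope (gradient) at the current point
    with torch.no_grad():
        x -= lr * x.grad   # step downhill
        x.grad.zero_()     # forget the old slope before the next step
print(float(x))  # close to 3.0, the (here: global) minimum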
Frameworks like PyTorch and TensorFlow support you in finding the next local optimum. The basic elements you always work with are tensors. You add them, multiply them, slice them, ..., and the frameworks keep track of the gradient for you (autograd). This means you convert the input image to a tensor object, push it through some neural network calculations (usually matrix multiplications of the input with weights, etc.), calculate the loss (usually the difference to the expected output), push the gradient back along the calculations again, and the optimizer does a step, with the learning rate as the step length, on a specific subset of weights/parameters inside the chain of calculations.
You can also see the whole thing as a complex construct of interconnected springs under tension. As soon as you let the springs go, they jump around, relaxing and reducing the tension, until the whole system stands still.
You can see exactly this in train.py (I added some comments):
Code:
import torch

def train(model, dataset):
    data_loader = torch.utils.data.DataLoader(dataset,
                                              batch_size=8,
                                              shuffle=True,
                                              num_workers=8)
    loss_fn = torch.nn.MSELoss()  # Mean Square Error as loss function
    # optimize only the model's weights with the given learning rate
    optim = torch.optim.AdamW(model.parameters(), lr=0.0001)
    for epoch in range(3):
        loss_sum = 0.0
        count = 0
        model.train()
        for inputs, outputs in data_loader:
            batch_size = inputs.size(0)
            outputs = torch.nn.functional.one_hot(outputs, 10).to(dtype=torch.float)
            result = model(inputs)
            loss = loss_fn(result, outputs)  # calculate the loss
            loss_sum += float(loss)  # accumulate the per-batch mean loss
            count += batch_size
            optim.zero_grad()  # clear the gradients left over from the previous step
            loss.backward()  # push the gradient back along all calculations
            optim.step()  # do a step and update the weights
            if count >= 10000:
                break
        print(f"Epoch {epoch} finished with loss {loss_sum / count}")
        model.eval()
        torch.jit.save(torch.jit.script(model), f"epoch_{epoch}.pt")
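A hypothetical way to call this train() function; the one_hot(outputs, 10) implies a ten-class dataset, so torchvision's MNIST and the tiny stand-in model here are assumptions, not part of the repository:
Code:
import torch
import torchvision

# Hypothetical usage of train() above, assuming a 10-class image dataset.
dataset = torchvision.datasets.MNIST(
    "data", train=True, download=True,
    transform=torchvision.transforms.ToTensor())
model = torch.nn.Sequential(      # a stand-in for the repo's real model
    torch.nn.Flatten(),
    torch.nn.Linear(28 * 28, 10))
train(model, dataset)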
bye,
Daniel
Re: Using PyTorch in PureBasic for inference
A lot of this is way over my head. I'm interested in BERT masked language models for code prediction; I can do the n-gram DB natively, but I still have to dig around to see how to utilize it.
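For a feel of what a masked language model does, here is a rough sketch using the Hugging Face transformers library (an assumption; it is not used in the repository above):
Code:
from transformers import pipeline

# Rough sketch: BERT-style masked-token prediction. The model ranks
# candidates for the [MASK] slot, which is the mechanism behind using
# such models for (code) prediction.
fill = pipeline("fill-mask", model="bert-base-uncased")
for candidate in fill("A variable [MASK] should describe its contents."):
    print(candidate["token_str"], candidate["score"])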