Today’s objectives

OK. Another morning. Every day above ground is a good day. Let’s goooooooooooooo.

The way VideoPose3D works is that it is actually two algorithms in one.

First, Detectron2 (another Facebook algorithm) is used for 2D detection, then a temporal CNN takes this 2D output and turns it into 3D predictions.

The dataset I will use is 3DPW.

My goals for today are:

1-Format 3DPW in a way that’s usable.

2-Code up some basic neural network architecture.

So what my architecture will do is take as input both the 2D output from Detectron2 (or rather just the shoulder joints) and the 3D outputs of the architecture described in the previous episode to .

I’ll start with, say, 6 fully-connected layers of 19 neurons. This is a ballpark figure. You can read up on the universal approximation theorem if you want to know more about this.
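To pin down the shapes, here is a minimal sketch of such a network in plain Python with no dependencies. Several things in it are assumptions and not settled design: `IN_DIM` and `OUT_DIM` are hypothetical placeholders, I read "6 layers of 19 neurons" as 6 hidden layers followed by a linear output layer, and the ReLU activation and the weight scaling are just common defaults.

```python
import random

IN_DIM = 10    # hypothetical: shoulder 2D coordinates plus some 3D values
OUT_DIM = 3    # hypothetical: one 3D point
HIDDEN = 19    # neurons per hidden layer (ballpark figure from the text)
N_HIDDEN = 6   # number of hidden fully-connected layers


def init_layer(n_in, n_out, rng):
    """One dense layer: an n_out x n_in weight matrix and a zero bias vector."""
    scale = (2.0 / n_in) ** 0.5  # common default scaling for ReLU networks
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def build_mlp(rng):
    """Stack IN_DIM -> 19 -> ... -> 19 -> OUT_DIM."""
    sizes = [IN_DIM] + [HIDDEN] * N_HIDDEN + [OUT_DIM]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(layers, x):
    """Run one input vector through the network."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]  # ReLU on hidden layers only
    return x


def count_params(layers):
    """Total number of trainable weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


layers = build_mlp(random.Random(0))
prediction = forward(layers, [0.0] * IN_DIM)
```

With these placeholder sizes the whole thing has only a couple of thousand parameters, so it should be cheap to train and to evaluate per frame.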

You might wonder: why am I not taking several frames into account? I’m repeating myself.

It SHOULD work without taking several frames into account, because think about it: fix your shoulder, fix your hand and fix the height of your elbow. I don’t know about you, but if you’re capable of moving your elbow in this position, you might want to check for Martian ancestry and be wary of the FBI.
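The geometry behind this can be checked with a small sketch. It assumes rigid upper arm and forearm segments of known length and a z-up coordinate system; the function name and arguments are made up for illustration. The elbow has to sit on the circle where a sphere around the shoulder meets a sphere around the hand, and fixing its height cuts that circle down to at most two points, mirror images of each other across the vertical plane through shoulder and hand. Only one of the two is normally a pose a human arm can reach, which is why a single frame should be enough.

```python
import math


def elbow_candidates(shoulder, hand, upper_len, fore_len, elbow_height):
    """Return the elbow positions (0, 1 or 2) that match all constraints.

    Points are (x, y, z) with z pointing up. Hypothetical helper.
    """
    diff = [h - s for s, h in zip(shoulder, hand)]
    d = math.sqrt(sum(c * c for c in diff))
    if d == 0 or d > upper_len + fore_len or d < abs(upper_len - fore_len):
        return []  # the two spheres do not intersect
    n = [c / d for c in diff]  # unit vector from shoulder to hand
    # Distance from the shoulder to the circle's center, and circle radius.
    t = (upper_len ** 2 - fore_len ** 2 + d ** 2) / (2 * d)
    r = math.sqrt(max(0.0, upper_len ** 2 - t ** 2))
    center = [s + t * c for s, c in zip(shoulder, n)]
    horiz = math.hypot(n[0], n[1])
    if horiz < 1e-9:
        raise ValueError("arm axis is vertical: height does not pin the elbow")
    if r < 1e-9:
        # Fully stretched arm: the circle collapses to a single point.
        return [center] if abs(center[2] - elbow_height) < 1e-9 else []
    # Orthonormal basis of the circle's plane: v is horizontal, u points upward.
    v = [n[1] / horiz, -n[0] / horiz, 0.0]
    u = [-n[0] * n[2] / horiz, -n[1] * n[2] / horiz, horiz]
    c = (elbow_height - center[2]) / (r * horiz)  # cosine of the circle angle
    if abs(c) > 1.0:
        return []  # requested height is out of reach
    s = math.sqrt(1.0 - c * c)
    signs = [1.0] if s < 1e-12 else [1.0, -1.0]
    return [[center[i] + r * (c * u[i] + sign * s * v[i]) for i in range(3)]
            for sign in signs]


candidates = elbow_candidates((0.0, 0.0, 0.0), (0.4, 0.0, 0.0), 0.3, 0.3, -0.1)
```

For this example the two candidates differ only in the sign of their y coordinate, so one sits in front of the shoulder-hand line and the other behind it.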

In order to train it, I will use a combination of backprop and probably some ad hoc visualization + training method, but that’s a blog post for another day.

Now this will be done in Julia. Why Julia?

IIRC, Julia has neural network layers as first-class citizens. It can accept math symbols as input. It has multi-line lambdas. It has things like Lisp-style macros, which probably won’t help that much here, but I’ve always wanted to explore the topic. My understanding is that they’re mostly useful for larger programs, but who knows? It is also reputed to have excellent interoperability with Python and to be really fast.

So I hope Julia can be my go-to language for quick and dirty experimentation.

My guess is that since VRC is a real-time video game, I’ll probably run into issues eventually due to garbage collection, but alea jacta est.

