{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "%matplotlib inline"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# Going between DLT and intrinsic/extrinsic formulations\nI've had issues going between two common camera triangulation \nformulations (what I call the Projection matrix method, and the commonly known DLT method) \nwhen using experimental data. Experimental data is of course bound to be noisy, and\nso here I'll use synthetic data with no noise to solidify my understanding.\n\nI'll first generate 2 cameras and a few moving objects. Then project the \nobjects onto the camera planes, and finally try to recover the 3D coordinates.\n\n\n## Some background\nThe simulated data is created using the codebase developed by Giray Tandogan and \nmodified by Thejasvi Beleyur. It creates a group of objects moving randomly within a \n3D volume and such that the points are visible to both cameras on all frames.\n\nBoth cameras are `not` in plane, don't have any distortion - and have sensor and\nfocal length that match a Go Pro Hero (exact model here). \n\n## References\n* DLT to/from intrinsic+extrinsic https://biomech.web.unc.edu/dlt-to-from-intrinsic-extrinsic/ acc 2022-03-08\n* http://www.kwon3d.com/theory/dlt/dlt.html#3d\n\n\n\nAuthor: Thejasvi Beleyur, March 2022\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import cv2\nimport matplotlib.pyplot as plt\nimport numpy as np \nimport pandas as pd\nnp.set_printoptions(precision=8)\nnp.set_printoptions(suppress=True)\nimport track2trajectory.synthetic_data as syndata\nfrom track2trajectory.projection import project_to_2d_and_3d\nfrom track2trajectory.dlt_to_world import transformation_matrix_from_dlt\nfrom track2trajectory.dlt_to_world import cam_centre_from_dlt\nfrom scipy.spatial import distance\nfrom scipy.spatial.transform import Rotation\nfrom dlt_reconstruction_easyWand import dlt_reconstruct\nfrom track2trajectory.synthetic_data import make_rotation_mat_fromworld\n\nnp.random.seed(82319)\n\ndef point2point_matrix(xyz):\n    distmat = np.zeros((xyz.shape[0],xyz.shape[0]))\n    for i in range(distmat.shape[0]):\n        for j in range(distmat.shape[1]):\n            distmat[i,j] = distance.euclidean(xyz[i,:], xyz[j,:])\n    return distmat"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Generate the synthetic data\nLet's also rotate the second camera a bit to the origin\nThe rotations are around the `camera` axis wrt World axis\ni.e. x,y is on image plane and z points out into the 'depth'\nBoth cameras aren't really aligned completely with the 'world' axis.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "cam1_R = Rotation.from_euler('xyz', [5,-10,2],degrees=True).as_matrix() \ncam2_R = Rotation.from_euler('xyz', [1,10,-4],degrees=True).as_matrix()\ncam1_C = [-1,0,1] # camera centres in world XYZ, Y faces into the depth, Z goes up/down\ncam2_C = [1,0.5,-0.25]\n# generate camera instances with pre-specified parameters\ncam1, cam2 = syndata.generate_two_synthetic_cameras_version2(cam2_Rotation=cam2_R,\n                                                             cam1_Rotation=cam1_R,\n                                                             cam1_C=cam1_C,\n                                                             cam2_C=cam2_C)\n# generate brownian type particles that move within a specified bounding box\nobjects_in_3d = syndata.make_brownian_particles(3,[[-2,2],[6,8],[-1,1]], frames=2)\n# project 3D particle position to 2D camera pixel positions\ncam1_pixels, _ = project_to_2d_and_3d(objects_in_3d, cam1)\ncam2_pixels, _ = project_to_2d_and_3d(objects_in_3d, cam2)\n\ncam1_xy = cam1_pixels.loc[:,['x','y']].to_numpy()\ncam2_xy = cam2_pixels.loc[:,['x','y']].to_numpy()\n\nprint('Ground truth points')\nobjects_in_3d"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Verify Projection method: it works\nNow using the projection matrix let's get the 3D positions back using the \nintrinsic/extrinsic methods based on the 3x4 projection matrix P. \n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "threed_Pbased = []\nfor (xy1, xy2) in zip(cam1_xy, cam2_xy):\n    pos_homog = cv2.triangulatePoints(cam1.cm_mtrx, cam2.cm_mtrx, xy1, xy2)\n    xyz_position = cv2.convertPointsFromHomogeneous(pos_homog.T)\n    threed_Pbased.append(xyz_position)\nthreed_Pbased = np.array(threed_Pbased).reshape(-1,3)\npbased_distmat = point2point_matrix(threed_Pbased)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "We see the y-z axes are interchanged, but consistently so. \n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print('Projection matrix based triangulation:')\n\nthreed_Pbased"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### DLT coefficients from Projection matrix: it works\nLet's go from the projection matrix to DLT coefficients and try reconstructing\nthe 3D positions using the DLT method. \n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "fx, fy  = [1230,1230]\npx, py = 960, 540\nK = np.array([[fx, 0, px],\n              [0, fy, py],\n              [0, 0,  1]])\nm = np.column_stack((np.eye(3), np.zeros(3)))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Now, follow the 12 entry -> 11 entry conversion by normalising the \nfirst 11 entries with the 12th entry. The first 11 entries then become the \n11 DLT coefficients used to triangulate.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "dlt_c1 = cam1.cm_mtrx.flatten()[:-1]/cam1.cm_mtrx[-1,-1]\ndlt_c2 = cam2.cm_mtrx.flatten()[:-1]/cam2.cm_mtrx[-1,-1]\ndlt_11coefs_fromP = np.column_stack((dlt_c1, dlt_c2))\n\nxyz_11dlt_P = []\nfor (xy1, xy2) in zip(cam1_xy, cam2_xy):\n    object_pixels = np.append(xy1, xy2).reshape(1,-1)\n    xyz = dlt_reconstruct(dlt_11coefs_fromP, object_pixels)[0]\n    xyz_11dlt_P.append(xyz)\nxyz_11dlt_fromP = np.array(xyz_11dlt_P).reshape(-1,3)\n\npd.DataFrame(dlt_11coefs_fromP, columns=['cam1DLT','cam2DLT'])"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Aside from the y-z axes interchange here too - the values match up. \n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print('DLT from projection matrix triangulation:')\n\npd.DataFrame(xyz_11dlt_fromP, columns=['x','y','z'])"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Projection matrix from DLT \nOkay, so now let's do the `reverse`. Let's say we have the DLT coefficients. \nHow do we go about getting the `projection matrix`?\nIn the Hedrick write-up, `P` is actually not the Projection matrix as \ntypically understood (which maps 2d<->3d, `x = P X`, where P = K[R|t], where R is\nrotation and t is the translation). Instead it seems that P = [R|t] a 4x4 matrix which contains\nthe rotation and translation info.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# To fill the 3x4 matrix with 12 entries from the (normalised) 11 DLT coefficients,\n# let's append a 1 at the end.\n\nrev_P_cam1 = np.append(dlt_11coefs_fromP[:,0], [1]).reshape(3,4)\nrev_P_cam2 = np.append(dlt_11coefs_fromP[:,1], [1]).reshape(3,4)\n\nrev_Pbased = []\nfor (xy1, xy2) in zip(cam1_xy, cam2_xy):\n    pos_homog = cv2.triangulatePoints(rev_P_cam1, rev_P_cam2, xy1, xy2)\n    xyz_position = cv2.convertPointsFromHomogeneous(pos_homog.T)\n    rev_Pbased.append(xyz_position)\nrev_Pbased = np.array(rev_Pbased).reshape(-1,3)\n\nprint('Projection matrix from DLT:')\npd.DataFrame(rev_Pbased, columns=['x','y','z'])"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "This examples with synthetic data shows two things: 1) it is indeed possible to \nback and forth between the projection matrix and DLT formulations and get the `same`\nxyz coordinates 2) the established workflows in my codebase are correct. \n\nOf course, one major point of difference here is that my synthetic data has no \nnoise - be it in the 2D pixel coordinates, or in the intrinsic/extrinsic parameter\nestimation, and/or DLT coefficient estimation. Perhaps the noiseless nature of\nthe \n\nHere, we've shown that we can recover the ground truth 3D coordinates\nof a synthetic data set using multiple methods 1) 'standard' projection matrix\nbased methods 2) obtaining 11 DLT coefficients from the 3x4 projection matrix\nand finally 3) 'regenerating' the 3x4 projection matrix from the 11 DLT coefficients\nat hand. The coordinates match the ground truth extremely well (<1e-6)\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Getting the projection matrix from `DLTcameraPosition`\n\nThe function :code:`transformation_matrix_from_dlt` is my Python implementation of \nTy Hedrick's `DLTcameraPosition`. :code:`transformation_matrix_from_dlt` is correct\nas it calculates the expected principal points, and focal lengths of the cameras. \nThe T1, T2 matrices are also numerically equivalent to the function output in Octave.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "T1, z1, ypr1 = transformation_matrix_from_dlt(dlt_11coefs_fromP[:,0]) # python implementation of DLTcameraPosition\nT2, z2, ypr2 = transformation_matrix_from_dlt(dlt_11coefs_fromP[:,1])\n\n#T1 = T1.T; T2= T2.T # transpose and convert the T matrix into the 'classic'\n# left-upper 3x3 rotation mat and the last column with the translation\n\nprint('DLTcameraPosition T matrices: \\n \\n ', T1, '\\n \\n ' ,T2)\n\n# shifts the left/right-handed to the other (it's still magic as of now)\nshifter_mat = np.row_stack(([1,0,0,0],\n                            [0,1,0,0],\n                            [0,0,-1,0],\n                            [0,0,0,1]))\nshifted_rotmat1 = np.matmul(T1, shifter_mat)[:3,:3]\nshifted_rotmat2 = np.matmul(T2, shifter_mat)[:3,:3]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Convert world coordiantes to camera frame to generate the projection matrix\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "T1cam = make_rotation_mat_fromworld(shifted_rotmat1, T1[-1,:3])\nT2cam = make_rotation_mat_fromworld(shifted_rotmat2, T2[-1,:3])\n\nprint('T matrices converted to camera frame \\n \\n', T1cam, ' \\n \\n ' ,T2cam)\n\nfx, fy = cam1.f_x, cam1.f_y\n\nK = np.array(([fx, 0, cam1.c_x],\n              [0, fy, cam1.c_y],\n              [0, 0,     1    ]))\nKm = np.matmul(K,m)\nm = np.column_stack((np.eye(3), np.zeros(3))) # the 3x4 matrix to 'grease the wheels'\n\nP11kmt = np.matmul(Km, T1cam)\nP22kmt = np.matmul(Km, T2cam)\n\nxyz_tmat_based = []\nfor (xy1, xy2) in zip(cam1_xy, cam2_xy):\n    pos_homog = cv2.triangulatePoints(P11kmt, P22kmt, xy1, xy2)\n    xyz_position = cv2.convertPointsFromHomogeneous(pos_homog.T)\n    xyz_tmat_based.append(xyz_position)\n\nxyz_tmat_based = np.array(xyz_tmat_based).reshape(-1,3)\n\nprint('XYZ coordinates generated through DLTcameraPosition inputs: \\n ')\npd.DataFrame(xyz_tmat_based, columns=['x','y','z'])"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.7.9"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}