Hi all, I am trying to understand the values returned by ARKit's projectPoint function.
My goal is to take a point p = (x, y, z), where each component is in the range [-1, 1], and convert it to p' = (x', y'), where both components are in the range [0, 1] – i.e. project from world space onto screen/image space.
I know that arframe.camera.imageResolution = (1920.0, 1440.0), and I know that my screen size is (390.0, 844.0).
So I call the function like this: arframe.camera.projectPoint(p, orientation: .portrait, viewportSize: arframe.camera.imageResolution)
This returns (x, y), but sometimes x > 1440.0 and y > 1920.0. Furthermore, sometimes x and y are negative.
I am confused about how the function can return values greater than the viewportSize we passed in, and also about why the values can be negative.
The documentation provided by Apple states that the function "returns the projection of a point from the 3D world space detected by ARKit into the 2D space of a view rendering the scene."
What does "2D space of a view rendering the scene" mean, more explicitly?
CodePudding user response:
Theory
func projectPoint(_ point: simd_float3,
                  orientation: UIInterfaceOrientation,
                  viewportSize: CGSize) -> CGPoint
The Xcode quick-help tip for the instance method projectPoint(...) says it returns the projection of the specified point into a 2D pixel coordinate space whose origin is in the upper-left corner and whose size matches that of the viewportSize parameter.
The difference between screen size and viewport size is described here and here (I see you said you know about that).
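This also answers the out-of-range question: projectPoint applies the camera's projection math to any point, including points outside the visible viewport, so coordinates greater than the viewport size or less than zero simply mean the point does not land inside the view. A minimal sketch of normalizing the result and rejecting off-screen points (the helper name and structure are my own, not part of ARKit):

```swift
import ARKit

/// Projects a world-space point and normalizes it to [0, 1] relative
/// to the given viewport. Returns nil when the projection falls
/// outside the viewport. (Hypothetical helper, not an ARKit API.)
func normalizedProjection(of point: simd_float3,
                          camera: ARCamera,
                          viewportSize: CGSize) -> CGPoint? {
    let pp = camera.projectPoint(point,
                                 orientation: .portrait,
                                 viewportSize: viewportSize)
    let nx = pp.x / viewportSize.width
    let ny = pp.y / viewportSize.height

    // Values outside 0...1 mean the point is off-screen, which is
    // why the raw result can exceed the viewport size or be negative.
    guard (0...1).contains(nx), (0...1).contains(ny) else { return nil }
    return CGPoint(x: nx, y: ny)
}
```

Note that a point behind the camera can still project to in-bounds coordinates (mirrored through the projection), so a frustum check like the one below is still advisable.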
Solution
The trick is that the 2D point is projected meaningfully only when the 3D point lies inside the camera's view frustum. Points outside the frustum (or behind the camera) are still run through the projection math, which is why you can get coordinates larger than the viewport size, or negative ones.
import ARKit

extension ViewController: ARSessionDelegate {

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        // Same position the sphere was placed at in viewDidLoad
        let point = simd_float3(0.3, 0.5, -2.0)

        // Only project when the node is inside the camera's frustum
        guard let pointOfView = sceneView.pointOfView,
              sceneView.isNode(sphere, insideFrustumOf: pointOfView)
        else { return }

        let pp = frame.camera.projectPoint(point,
                                           orientation: .portrait,
                                           viewportSize: CGSize(width: 375,
                                                                height: 812))

        // Divide by the viewport dimensions to normalize
        label_A.text = String(format: "%.2f", pp.x / 375)
        label_B.text = String(format: "%.2f", pp.y / 812)
    }
}
As you can see, outputting values in normalized coordinates (0.00 ... 1.00) is very simple:
class ViewController: UIViewController {

    @IBOutlet var sceneView: ARSCNView!
    @IBOutlet var label_A: UILabel!
    @IBOutlet var label_B: UILabel!

    let sphere = SCNNode(geometry: SCNSphere(radius: 0.1))

    override func viewDidLoad() {
        super.viewDidLoad()
        sceneView.session.delegate = self
        sceneView.scene = SCNScene()

        // Green sphere two meters in front of the world origin
        sphere.geometry?.firstMaterial?.diffuse.contents = UIColor.green
        sphere.position = SCNVector3(0.3, 0.5, -2.0)
        sceneView.scene.rootNode.addChildNode(sphere)

        let config = ARWorldTrackingConfiguration()
        sceneView.session.run(config)
    }
}
I used iPhone X parameters – its portrait (vertical) viewport size is 375 x 812 points.
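As for the negative values specifically: one way to see whether a point is behind the camera is to transform it into camera space using the camera's transform. In ARKit's camera space the camera looks down the negative Z axis, so points behind it have a positive Z component. A short sketch (the helper name is my own):

```swift
import ARKit

/// Returns true when a world-space point is in front of the camera.
/// (Hypothetical helper, not an ARKit API.)
func isInFront(of camera: ARCamera, point: simd_float3) -> Bool {
    let worldPoint = simd_float4(point.x, point.y, point.z, 1)
    // Transform from world space into camera space
    let cameraSpacePoint = camera.transform.inverse * worldPoint
    // The camera looks along -Z, so in-front points have negative Z
    return cameraSpacePoint.z < 0
}
```

Combining this with the normalized-range check gives you a reliable way to decide whether a projected point is actually visible on screen.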