Apple's Machine Learning team, in collaboration with researchers from Nanjing University and the Hong Kong University of Science and Technology, has announced an interesting new AI model for 3D reconstruction. Called Matrix3D, this large photogrammetry model can reconstruct 3D objects and scenes from just a few two-dimensional photos, but with one major difference from current approaches. Here is why that matters. Photogrammetry uses photographs to create 3D models or maps. Today, this process typically relies on separate models for steps such as pose estimation and depth prediction, which can lead to inefficiency and errors.
Matrix3D simplifies this by doing all of it at once. It accepts images, camera parameters (such as angle and focal length), and depth data, and processes them with a single architecture. This not only simplifies the workflow but also improves accuracy. The researchers used a masked learning strategy, much like the one used by the early transformer-based AI systems that helped pave the way for the first versions of ChatGPT.
They randomly hid parts of the input data during training, which forced Matrix3D to learn, in essence, to fill in the gaps. This method is key because it allows Matrix3D to train effectively even on smaller or incomplete datasets.
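To make the idea of masked learning more concrete, here is a minimal Python sketch of the masking step alone. It is not Matrix3D's actual code: the function name `mask_inputs`, the `mask_ratio` parameter, the dictionary layout, and the use of `None` as the mask placeholder are all assumptions made for illustration. It simply hides a random subset of a sample's images, camera poses, and depth maps, leaving the rest visible as context.

```python
import random

# Placeholder standing in for a hidden input (an assumption of this sketch).
MASK = None


def mask_inputs(sample, mask_ratio=0.5, rng=None):
    """Randomly hide a fraction of the entries in `sample`.

    Returns (masked_sample, hidden_keys). During training, a model would
    see `masked_sample` and be asked to predict the entries in `hidden_keys`.
    """
    rng = rng or random.Random()
    keys = sorted(sample)
    # Hide at least one entry, but always keep at least one visible.
    n_hidden = max(1, int(len(keys) * mask_ratio))
    n_hidden = min(n_hidden, len(keys) - 1)
    hidden = set(rng.sample(keys, n_hidden))
    masked = {k: (MASK if k in hidden else v) for k, v in sample.items()}
    return masked, hidden


# Toy sample: three views, each with an image, a camera pose, and a depth map.
# The string values are stand-ins for real tensors.
sample = {
    "image_0": "pixels_0", "pose_0": "camera_0", "depth_0": "depthmap_0",
    "image_1": "pixels_1", "pose_1": "camera_1", "depth_1": "depthmap_1",
    "image_2": "pixels_2", "pose_2": "camera_2", "depth_2": "depthmap_2",
}

masked, hidden = mask_inputs(sample, mask_ratio=0.5, rng=random.Random(0))
for key in sorted(masked):
    state = "hidden" if key in hidden else "visible"
    print(f"{key}: {state}")
```

Because a different subset is hidden on every call, the same sample can serve as a pose-estimation example in one training step and a depth-prediction example in the next, which is one plausible reason this kind of training copes well with incomplete data.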
The results are impressive. With only three input images, Matrix3D can generate detailed 3D reconstructions of objects and even entire environments, which could clearly have very interesting applications for immersive headsets such as the Apple Vision Pro.