We developed a web interface for generating AR/VR renders of various three-dimensional objects, such as asteroids and comets. Users upload a screenshot or picture, OCR identifies the object named in the image, and the interface generates the corresponding AR model.
Our project addresses the challenge by giving users an accessible interface for visualizing a wide range of objects, as well as uploading their own. It can help students in any field better understand the objects they work with in their studies.
React makes it painless to create interactive UIs. Design simple views for each state in your application, and React will efficiently update and render just the right components when your data changes.
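As a rough illustration of "simple views for each state", the upload, OCR, and AR flow could be modelled as a small set of states with a pure transition function. This is only a sketch: the state names, action names, and `describeView` helper are hypothetical and not taken from the project's code. A function of this shape could be passed to React's `useReducer` hook.

```typescript
// Hypothetical UI states for the upload -> OCR -> AR flow (names are illustrative).
type AppState =
  | { status: "idle" }
  | { status: "recognizing"; fileName: string }
  | { status: "ready"; objectName: string }
  | { status: "error"; message: string };

type AppAction =
  | { type: "upload"; fileName: string }
  | { type: "recognized"; objectName: string }
  | { type: "failed"; message: string }
  | { type: "reset" };

const initialState: AppState = { status: "idle" };

// Pure transition function: given the current state and an action, return the next state.
function appReducer(state: AppState, action: AppAction): AppState {
  switch (action.type) {
    case "upload":
      return { status: "recognizing", fileName: action.fileName };
    case "recognized":
      // Ignore OCR results that arrive when no recognition is in progress.
      return state.status === "recognizing"
        ? { status: "ready", objectName: action.objectName }
        : state;
    case "failed":
      return { status: "error", message: action.message };
    case "reset":
      return initialState;
  }
}

// One simple view description per state; a real component would return JSX instead.
function describeView(state: AppState): string {
  switch (state.status) {
    case "idle":
      return "Show upload form";
    case "recognizing":
      return `Running OCR on ${state.fileName}`;
    case "ready":
      return `Show AR model for ${state.objectName}`;
    case "error":
      return `Show error: ${state.message}`;
  }
}
```

Keeping the transitions in one pure function means each view only needs to handle the state it is given, and the flow can be checked without rendering anything.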
WebXR, with the WebXR Device API at its core, provides the functionality needed to bring both augmented and virtual reality (AR and VR) to the web. Mixed reality is a large and complex subject, with much to learn and many other APIs to bring together to create an engaging experience for users.
Three.js allows the creation of graphical processing unit (GPU)-accelerated 3D animations using the JavaScript language as part of a website without relying on proprietary browser plugins. This is possible due to the advent of WebGL, a low-level graphics API created specifically for the web.
MindAR is an open-source web augmented reality library. It supports Image Tracking and Face Tracking. MindAR started with A-Frame integration. Starting from version 1.1.0, MindAR supports direct integration with three.js.
Tailwind CSS is a low-level framework: unlike other CSS frameworks such as Bootstrap and Materialize, Tailwind doesn't offer fully styled components like buttons, dropdowns, and navbars. Instead, it offers utility classes so you can create your own reusable components.
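To make the idea of building reusable components from utility classes concrete, here is a minimal sketch of a helper that assembles the class string for a button. The helper name `buttonClasses`, the variant names, and the specific colour choices are assumptions made for illustration; only the individual utility classes come from Tailwind itself.

```typescript
// Hypothetical button variants for this sketch.
type ButtonVariant = "primary" | "secondary";

// Utility classes shared by every button: padding, rounded corners, font weight.
const BASE_CLASSES = "px-4 py-2 rounded font-semibold";

// Utility classes that differ per variant: background, text colour, hover state.
const VARIANT_CLASSES: Record<ButtonVariant, string> = {
  primary: "bg-blue-600 text-white hover:bg-blue-700",
  secondary: "bg-gray-200 text-gray-900 hover:bg-gray-300",
};

// Join the shared, variant-specific, and optional extra classes into one string.
function buttonClasses(variant: ButtonVariant, extra: string = ""): string {
  return [BASE_CLASSES, VARIANT_CLASSES[variant], extra]
    .filter((part) => part.length > 0)
    .join(" ");
}
```

In a React component the result would typically be passed as `className={buttonClasses("primary")}`, so the styling decisions live in one place instead of being repeated on every button.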
OCR stands for Optical Character Recognition. It is a widespread technology to recognize text inside images, such as scanned documents and photos. OCR technology is used to convert virtually any kind of image containing written text (typed, handwritten, or printed) into machine-readable text data.
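Once OCR has produced machine-readable text, the application still has to decide which object the text refers to. The sketch below shows one possible way to do that: discard low-confidence words, normalize the rest, and look them up in a catalog of models. The `OcrWord` shape, the 0 to 100 confidence scale, the threshold of 60, and the catalog entries are all assumptions for illustration; real OCR libraries report results in their own formats.

```typescript
// Hypothetical shape of one recognized word; confidence is assumed to be 0-100.
interface OcrWord {
  text: string;
  confidence: number;
}

// Hypothetical catalog entry mapping an object name to a 3D model file.
interface ModelEntry {
  name: string;
  modelUrl: string;
}

// Lowercase the text and strip everything except ASCII letters and digits,
// so "Comet," and "comet" compare equal. English-only by design.
function normalize(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Return the first catalog entry whose name matches a sufficiently confident word.
// Only single-word names are matched in this sketch.
function findModel(
  words: OcrWord[],
  catalog: ModelEntry[],
  minConfidence: number = 60
): ModelEntry | undefined {
  const seen = new Set(
    words
      .filter((word) => word.confidence >= minConfidence)
      .map((word) => normalize(word.text))
      .filter((text) => text.length > 0)
  );
  return catalog.find((entry) => seen.has(normalize(entry.name)));
}

// Example catalog; the paths are placeholders, not real project files.
const exampleCatalog: ModelEntry[] = [
  { name: "Asteroid", modelUrl: "models/asteroid.glb" },
  { name: "Comet", modelUrl: "models/comet.glb" },
];
```

For instance, `findModel([{ text: "Comet,", confidence: 91 }], exampleCatalog)` returns the comet entry, while a word below the confidence threshold is ignored and yields `undefined` if nothing else matches.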
Implementing user login using Firebase. Using a database to store previous search results.
Implementing multi-language support for OCR to include languages other than English.
Building a more comprehensive library of high-quality 2D and 3D models.
Implementing a mature computer vision ML model for better text detection accuracy in a wider range of lighting conditions.
OCR
Running OCR on a live webcam video stream in the browser was difficult to implement. Once it worked, accuracy was low, and the OCR could not reliably discern words in non-ideal conditions.
AR Models
Most good AR models are behind a hefty paywall and often don't support the required format, which would mean creating custom models from scratch.
Markerless AR and other libraries
General lack of simple documentation and tutorials for building a functioning product.