[https://www.youtube.com/results?search_query=3D+model+artificial+intelligence+deep+learning+deep+machine+learning+ML+AI Youtube search...]
[https://www.google.com/search?q=3D+model+artificial+intelligence+deep+learning+deep+machine+learning+ML+AI ...Google search]
  
* [[Robotics]] ... [[Transportation (Autonomous Vehicles)|Vehicles]] ... [[Autonomous Drones|Drones]] ... [[3D Model]] ... [[Point Cloud]]
* [[Simulation]] ... [[Simulated Environment Learning]] ... [[World Models]] ... [[Minecraft]]: [[Minecraft#Voyager|Voyager]]
* [[Cybersecurity]] ... [[Open-Source Intelligence - OSINT |OSINT]] ... [[Cybersecurity Frameworks, Architectures & Roadmaps | Frameworks]] ... [[Cybersecurity References|References]] ... [[Offense - Adversarial Threats/Attacks| Offense]] ... [[National Institute of Standards and Technology (NIST)|NIST]] ... [[U.S. Department of Homeland Security (DHS)| DHS]] ... [[Screening; Passenger, Luggage, & Cargo|Screening]] ... [[Law Enforcement]] ... [[Government Services|Government]] ... [[Defense]] ... [[Joint Capabilities Integration and Development System (JCIDS)#Cybersecurity & Acquisition Lifecycle Integration| Lifecycle Integration]] ... [[Cybersecurity Companies/Products|Products]] ... [[Cybersecurity: Evaluating & Selling|Evaluating]]
* [[Spatial-Temporal Dynamic Network (STDN)]]
* [[Hyperdimensional Computing (HDC)]]
* [[Video/Image]] ... [[Vision]] ... [[Enhancement]] ... [[Fake]] ... [[Reconstruction]] ... [[Colorize]] ... [[Occlusions]] ... [[Predict image]] ... [[Image/Video Transfer Learning]]
* [https://arxiv.org/pdf/1808.01462.pdf A survey on Deep Learning Advances on Different 3D Data Representations | E. Ahmed, A. Saint, A. Shabayek, K. Cherenkova, R. Das, G. Gusev, and D. Aouada] - extending 2D deep learning to 3D data is not a straightforward task; the approach depends on the data representation itself and the task at hand.
* [https://www.forbes.com/sites/bernardmarr/2019/08/23/the-amazing-ways-youtube-uses-artificial-intelligence-and-machine-learning/#553d467a5852 The Amazing Ways YouTube Uses Artificial Intelligence And Machine Learning | Bernard Marr - Forbes]
* [https://azure.microsoft.com/en-us/services/cognitive-services/custom-vision-service/ Azure Custom Vision] ...an AI service and end-to-end platform for applying computer vision
  
  
 
== Geometric Deep Learning ==
* [https://www.synthetik-technologies.com/ Synthetik Applied Technologies]

<youtube>vfL6uJYFrp4</youtube>
<youtube>sSGBHHk9WGM</youtube>
<youtube>be6Iw0QrI8w</youtube>
<youtube>wLU4YsC_4NY</youtube>
<youtube>Rsh4poEpahI</youtube>
<youtube>FHdOUFmPgWw</youtube>
<youtube>D3fnGG7cdjY</youtube>
  
== [https://github.com/timzhang642/3D-Machine-Learning 3D Machine Learning | GitHub] ==
  
* [https://github.com/timzhang642/3D-Machine-Learning#courses Courses]
* [https://github.com/timzhang642/3D-Machine-Learning#datasets Datasets]
** [https://github.com/timzhang642/3D-Machine-Learning#3d_models 3D Models]
** [https://github.com/timzhang642/3D-Machine-Learning#3d_scenes 3D Scenes]
* [https://github.com/timzhang642/3D-Machine-Learning#pose_estimation 3D Pose Estimation]
* [https://github.com/timzhang642/3D-Machine-Learning#single_classification Single Object Classification]
* [https://github.com/timzhang642/3D-Machine-Learning#multiple_detection Multiple Objects Detection]
* [https://github.com/timzhang642/3D-Machine-Learning#segmentation Scene/Object Semantic Segmentation]
* [https://github.com/timzhang642/3D-Machine-Learning#3d_synthesis 3D Geometry Synthesis/Reconstruction]
** [https://github.com/timzhang642/3D-Machine-Learning#3d_synthesis_model_based Parametric Morphable Model-based methods]
** [https://github.com/timzhang642/3D-Machine-Learning#3d_synthesis_template_based Part-based Template Learning methods]
** [https://github.com/timzhang642/3D-Machine-Learning#3d_synthesis_dl_based Deep Learning Methods]
* [https://github.com/timzhang642/3D-Machine-Learning#material_synthesis Texture/Material Analysis and Synthesis]
* [https://github.com/timzhang642/3D-Machine-Learning#style_transfer Style Learning and Transfer]
* [https://github.com/timzhang642/3D-Machine-Learning#scene_synthesis Scene Synthesis/Reconstruction]
* [https://github.com/timzhang642/3D-Machine-Learning#scene_understanding Scene Understanding]
  
  
 
== 3D Convolutional Neural Networks (3DCNN) ==
[https://www.youtube.com/results?search_query=3DCNN+artificial+intelligence+deep+learning+deep+machine+learning+ML+AI Youtube search...]
[https://www.google.com/search?q=3DCNN+artificial+intelligence+deep+learning+deep+machine+learning+ML+AI ...Google search]
  
3D CNN models are widely used for object classification and detection within varying data modalities such as [[3D Model#LiDAR|LiDAR]] [[Point Cloud]], RGB-Depth data, 3D Computer Aided Design (CAD) models, and medical CT imagery. RGB-Depth data can also be processed with a CNN by first extracting proposals from 2D RGB images using a 2D object detector and transforming the proposals and corresponding depth information into 3D [[Point Cloud]]s (a minimal back-projection sketch follows below). The generated 3D point clouds can be further explored by 3D CNN models such as PointNet. Models designed for [[Point Cloud]]s, RGBD data, CAD models or medical CT images are not readily transferable to volumetric 3D CT imagery for baggage security screening, since the modality of the input data for a 3D CNN can differ significantly. However, the design of 3D CNN architectures and the training strategies used in existing work can be repurposed for prohibited item classification and detection within 3D CT baggage imagery. [https://arxiv.org/pdf/2003.12625.pdf On the Evaluation of Prohibited Item Classification and Detection in Volumetric 3D Computed Tomography Baggage Security Screening Imagery | Q. Wang, N. Bhowmik, and T. Breckon - Durham, UK]
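The proposal-to-point-cloud step above is ordinary pinhole-camera back-projection. A minimal NumPy sketch, assuming illustrative camera intrinsics (fx, fy, cx, cy are made-up values, not from the cited paper):

<syntaxhighlight lang="python">
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map (H x W, metres) into an (N, 3) point cloud
    with a pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]                 # drop pixels with no depth

# Toy example: a flat 4x4 depth image, everything 2 m from the camera.
cloud = depth_to_point_cloud(np.full((4, 4), 2.0),
                             fx=525.0, fy=525.0, cx=2.0, cy=2.0)
print(cloud.shape)  # (16, 3)
</syntaxhighlight>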
  
 
Schematic diagram of the Deep 3D Convolutional Neural Network and FEATURE-Softmax Classifier models. a) Deep 3D Convolutional Neural Network. The feature extraction stage includes 3D convolutional and [[Pooling / Sub-sampling: Max, Mean]] layers. 3D filters in the 3D convolutional layers search for recurrent spatial patterns that best capture the local biochemical features needed to separate the 20 amino acid microenvironments. [[Pooling / Sub-sampling: Max, Mean]] layers down-sample the input to increase the translational invariance of the network. By following the 3D convolutional and 3D [[Pooling / Sub-sampling: Max, Mean]] layers with fully connected layers, the pooled filter responses of all filters across all positions in the protein box can be integrated. The integrated information is then fed to the [[Softmax]] classifier layer to calculate class probabilities and make the final predictions. Prediction error drives updates of the trainable parameters in the classifier, fully connected layers, and convolutional filters to learn the best features for optimal performance. b) The FEATURE [[Softmax]] Classifier. The FEATURE [[Softmax]] model begins with an input layer, which takes in FEATURE vectors, followed by two fully connected layers, and ends with a [[Softmax]] classifier layer. In this case, the input layer is equivalent to the feature extraction stage. In contrast to the 3DCNN, the prediction error only drives parameter learning of the fully connected layers and classifier; the input feature is fixed during the whole training process.
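To make the 3DCNN pattern above concrete (3D convolutions, 3D max pooling, fully connected layers, then a softmax over 20 classes), here is a minimal [[PyTorch]] sketch; layer counts, channel widths and the 32x32x32 input are illustrative choices, not the architecture from the paper:

<syntaxhighlight lang="python">
import torch
import torch.nn as nn

class Tiny3DCNN(nn.Module):
    """Volumetric classifier: Conv3d -> MaxPool3d -> fully connected -> softmax.
    Input is a batch of 1-channel 32x32x32 voxel grids; all sizes illustrative."""
    def __init__(self, num_classes=20):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1),  # local 3D patterns
            nn.ReLU(),
            nn.MaxPool3d(2),                             # 32 -> 16; translational invariance
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool3d(2),                             # 16 -> 8
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 8 * 8 * 8, 128),  # integrate pooled responses across positions
            nn.ReLU(),
            nn.Linear(128, num_classes),     # logits; softmax gives class probabilities
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = Tiny3DCNN()
logits = model(torch.randn(2, 1, 32, 32, 32))
probs = torch.softmax(logits, dim=1)
print(probs.shape)  # torch.Size([2, 20])
</syntaxhighlight>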
  
https://www.researchgate.net/publication/317704070/figure/fig3/AS:513475445260288@1499433494423/Schematic-diagram-of-the-Deep-3D-Convolutional-Neural-Network-and-FEATURE-Softmax.png
  
 
<youtube>vd0czNKL1DE</youtube>
  
  
== <span id="LiDAR"></span>LiDAR ==
[https://www.youtube.com/results?search_query=LiDAR+3D+artificial+intelligence+deep+learning+deep+machine+learning+ML+AI Youtube search...]
[https://www.google.com/search?q=LiDAR+3D+artificial+intelligence+deep+learning+deep+machine+learning+ML+AI ...Google search]
  
* [https://arxiv.org/pdf/1711.06396.pdf VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection | Yin Zhou and Oncel Tuzel]
* [https://www.ri.cmu.edu/pub_files/2015/9/voxnet_maturana_scherer_iros15.pdf VoxNet: A 3D Convolutional Neural Network for real-time object recognition | Daniel Maturana and Sebastian Scherer]
  
LiDAR involves firing rapid laser pulses at objects and measuring how much time they take to return to the sensor. This is similar to the "time of flight" technology for RGB-D cameras we described above, but LiDAR has significantly longer range, captures many more points, and is much more robust to interference from other light sources. Most 3D LiDAR sensors today have several (up to 64) beams aligned vertically, spinning rapidly to see in all directions around the sensor. These are the sensors used in most self-driving cars because of their accuracy, range, and robustness, but the problem with LiDAR sensors is that they're often large, heavy, and extremely expensive (the 64-beam sensor that most self-driving cars use costs $75,000!). As a result, many companies are currently trying to develop cheaper “solid state LiDAR” systems that can sense in 3D without having to spin. [https://thegradient.pub/beyond-the-pixel-plane-sensing-and-learning-in-3d/ Beyond the pixel plane: sensing and learning in 3D | Mihir Garimella, Prathik Naidu - Stanford]
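The time-of-flight arithmetic itself is simple: the pulse covers the distance twice (out and back), so range = speed of light x elapsed time / 2. A toy illustration with made-up timing:

<syntaxhighlight lang="python">
C = 299_792_458.0               # speed of light in m/s

def lidar_range(round_trip_seconds):
    """Range to target from a pulse's round-trip time of flight."""
    return C * round_trip_seconds / 2.0

print(round(lidar_range(400e-9), 2))  # 400 ns round trip -> ~59.96 m
</syntaxhighlight>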
  
[https://arxiv.org/pdf/1711.06396.pdf VoxelNet] is an end-to-end 3D object detector specially designed for LiDAR data. It consists of three modules (the sketch after this list shows the voxel subdivision step):
* a feature learning network (subdivides the [[Point Cloud]] into many subvolumes/voxels; feature engineering + a fully connected neural network),
* a convolutional middle layer (3D convolution applied to the stacked voxel feature volumes; each subvolume/voxel is a feature vector),
* and region proposal networks.
VoxNet is a more generic model, able to handle different types of 3D data including LiDAR [[Point Cloud]], CAD and RGBD data. Qi et al. improved the performance of VoxNet by introducing auxiliary subvolume supervision to alleviate the overfitting issue. [https://arxiv.org/pdf/2003.12625.pdf On the Evaluation of Prohibited Item Classification and Detection in Volumetric 3D Computed Tomography Baggage Security Screening Imagery | Q. Wang, N. Bhowmik, and T. Breckon - Durham, UK]
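The sketch below covers only the first VoxelNet/VoxNet-style step: subdividing a point cloud into a fixed voxel grid. It stores binary occupancy for simplicity; a real feature learning network would store a learned feature vector per voxel. NumPy sketch under those assumptions:

<syntaxhighlight lang="python">
import numpy as np

def voxelize(points, grid=(32, 32, 32)):
    """Subdivide an (N, 3) point cloud into an occupancy voxel grid."""
    mins, maxs = points.min(axis=0), points.max(axis=0)
    scale = (np.array(grid) - 1) / np.maximum(maxs - mins, 1e-9)
    idx = ((points - mins) * scale).astype(int)   # voxel index per point
    vox = np.zeros(grid, dtype=np.float32)
    vox[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0    # mark occupied voxels
    return vox

cloud = np.random.rand(1000, 3) * 10.0    # synthetic LiDAR-like points
print(voxelize(cloud).sum())              # number of occupied voxels
</syntaxhighlight>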
  
https://media.giphy.com/media/xT1XH1NoZlqskNb2fu/giphy.gif
  
 
<youtube>RXQBMAGaabs</youtube>
  
 
== MVCNN ==
[https://www.youtube.com/results?search_query=MVCNN+3D+artificial+intelligence+deep+learning+deep+machine+learning+ML+AI Youtube search...]
[https://www.google.com/search?q=MVCNN+3D+artificial+intelligence+deep+learning+deep+machine+learning+ML+AI ...Google search]
  
A longstanding question in computer vision concerns the representation of 3D shapes for recognition: should 3D shapes be represented with descriptors operating on their native 3D formats, such as voxel grid or polygon mesh, or can they be effectively represented with view-based descriptors? We address this question in the [[context]] of learning to recognize 3D shapes from a collection of their rendered views on 2D images. We first present a standard CNN architecture trained to recognize the shapes’ rendered views independently of each other, and show that a 3D shape can be recognized even from a single view at an accuracy far higher than using state-of-the-art 3D shape descriptors. Recognition rates further increase when multiple views of the shapes are provided. In addition, we present a novel CNN architecture that combines information from multiple views of a 3D shape into a single and compact shape descriptor offering even better recognition performance. The same architecture can be applied to accurately recognize human hand-drawn sketches of shapes. We conclude that a collection of 2D views can be highly informative for 3D shape recognition and is amenable to emerging CNN architectures and their derivatives. [https://vis-www.cs.umass.edu/mvcnn/ Multi-view Convolutional Neural Networks for 3D Shape Recognition | H. Su, S. Maji, E. Kalogerakis, and E. Learned-Miller] and [https://github.com/jongchyisu/mvcnn_pytorch MVCNN] with [[PyTorch]]
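A minimal sketch of the MVCNN idea in [[PyTorch]]: one 2D CNN shared across all rendered views, with element-wise max pooling across views producing a single compact shape descriptor. The tiny backbone and the 12-view/40-class setup are illustrative stand-ins, not the paper's architecture:

<syntaxhighlight lang="python">
import torch
import torch.nn as nn

class MiniMVCNN(nn.Module):
    """Run a shared 2D CNN on each view, then max-pool features across views."""
    def __init__(self, num_classes=40):
        super().__init__()
        self.backbone = nn.Sequential(          # shared across all views
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(4),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, num_classes)

    def forward(self, views):                   # views: (B, V, 3, H, W)
        b, v, c, h, w = views.shape
        feats = self.backbone(views.reshape(b * v, c, h, w)).reshape(b, v, -1)
        shape_desc = feats.max(dim=1).values    # view pooling (element-wise max)
        return self.head(shape_desc)

model = MiniMVCNN()
out = model(torch.randn(2, 12, 3, 64, 64))     # 12 rendered views per shape
print(out.shape)  # torch.Size([2, 40])
</syntaxhighlight>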
  
 
<youtube>P0ivrbPjvnM</youtube>
<youtube>QQbOy6J2PI0</youtube>
  
https://vis-www.cs.umass.edu/mvcnn/images/mvcnn.png
  
 
== Quadtrees and Octrees ==
[https://www.youtube.com/results?search_query=O-CNN+3D+artificial+intelligence+deep+learning+deep+machine+learning+ML+AI Youtube search...]
[https://www.google.com/search?q=O-CNN+3D+artificial+intelligence+deep+learning+deep+machine+learning+ML+AI ...Google search]
  
* [https://arxiv.org/pdf/1712.01537.pdf O-CNN: Octree-based convolutional neural networks for 3D shape analysis | P. Wang, Y. Liu, Y. Guo, C. Sun, and X. Tong]
** [https://github.com/microsoft/O-CNN O-CNN | P. Wang, Y. Liu, Y. Guo, C. Sun, and X. Tong | GitHub] repository contains the implementation of O-CNN and Adaptive O-CNN ...built upon the Caffe framework; it supports octree-based convolution, deconvolution, pooling, and unpooling.
* [https://arxiv.org/abs/1611.05009 OctNet: Learning Deep 3D Representations at High Resolutions | G. Riegler, A. Ulusoy, and A. Geiger] a representation for deep learning with sparse 3D data. In contrast to existing models, our representation enables 3D convolutional networks which are both deep and high resolution. Towards this goal, we exploit the sparsity in the input data to hierarchically partition the space using a set of unbalanced octrees where each leaf node stores a pooled feature representation. This allows [[memory]] allocation and computation to be focused on the relevant dense regions and enables deeper networks without compromising resolution. OctNet uses efficient space partitioning structures (i.e. octrees) to reduce the [[memory]] and compute requirements of 3D convolutional neural networks, thereby enabling deep learning at high resolutions (see the octree sketch after this list).
** [https://github.com/griegler/octnet OctNet | GitHub]
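To make the octree idea concrete, here is a plain-Python sketch (not the O-CNN or OctNet code) that recursively subdivides only occupied octants; sparse regions terminate early, which is where the memory savings come from:

<syntaxhighlight lang="python">
import numpy as np

def build_octree(points, center, half, depth):
    """Recursively subdivide a cube (given center and half-width) until a
    cell holds <= 1 point or the depth budget runs out. Returns a nested dict;
    empty octants are simply never created."""
    if depth == 0 or len(points) <= 1:
        return {"leaf": True, "count": len(points)}
    # octant code per point: bit i is set when coordinate i >= center[i]
    codes = ((points >= center).astype(int) * np.array([1, 2, 4])).sum(axis=1)
    children = {}
    for octant in range(8):
        mask = codes == octant
        if mask.any():                           # only occupied octants recurse
            sign = np.array([(octant >> i) & 1 for i in range(3)]) * 2 - 1
            children[octant] = build_octree(points[mask],
                                            center + sign * (half / 2),
                                            half / 2, depth - 1)
    return {"leaf": False, "children": children}

pts = np.random.rand(200, 3)                     # points in the unit cube
tree = build_octree(pts, np.array([0.5, 0.5, 0.5]), 0.5, depth=4)
print(len(tree["children"]))                     # occupied top-level octants
</syntaxhighlight>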
  
<youtube>8jfAqRzAudw</youtube>
<youtube>qYyephF2BBw</youtube>
<youtube>xFcQaig5Z2A</youtube>
<youtube>KjSZpJUX5F4</youtube>
  
https://github.com/griegler/octnet/raw/master/doc/teaser.png
 
 
  
 
== 3D Models from 2D Images ==

<youtube>EMjPqgLX14A</youtube>
  
== [[ChatGPT]] & Blender ==

<youtube>rUUgLsspTZA</youtube>
<youtube>HobnMo7AZbM</youtube>
  
 
== 3D Printing ==
[https://www.youtube.com/results?search_query=ai+3D+printing+model+STL+mesh YouTube]
[https://www.quora.com/search?q=ai%203D%20printing%20model%20STL%20mesh ... Quora]
[https://www.google.com/search?q=ai+3D+printing+model+STL+mesh ...Google search]
[https://news.google.com/search?q=ai+3D+printing+model+STL+mesh ...Google News]
[https://www.bing.com/news/search?q=ai+3D+printing+model+STL+mesh&qft=interval%3d%228%22 ...Bing News]
 
<youtube>q92Px7Z3KXk</youtube>
 
<youtube>q92Px7Z3KXk</youtube>
=== [[Generative AI]] & 3D Printing ===
[https://www.youtube.com/results?search_query=ai+Generative+ChatGPT+GPT+3D+printing+model+STL+mesh YouTube]
[https://www.quora.com/search?q=ai%20Generative%20ChatGPT%20GPT%203D%20printing%20model%20STL%20mesh ... Quora]
[https://www.google.com/search?q=ai+Generative+ChatGPT+GPT+3D+printing+model+STL+mesh ...Google search]
[https://news.google.com/search?q=ai+Generative+ChatGPT+GPT+3D+printing+model+STL+mesh ...Google News]
[https://www.bing.com/news/search?q=ai+Generative+ChatGPT+GPT+3D+printing+model+STL+mesh&qft=interval%3d%228%22 ...Bing News]

* [[Generative AI]] ... [[Conversational AI]] ... [[OpenAI]]'s [[ChatGPT]] ... [[Perplexity]] ... [[Microsoft]]'s [[Bing]] ... [[You]] ... [[Google]]'s [[Bard]] ... [[Baidu]]'s [[Ernie]]
* [https://3dwithus.com/how-can-chatgpt-be-used-for-3d-printing How Can ChatGPT Be Used for 3D Printing | Andrew Sink - 3DWithUs]
* [https://www.blender.org/ Blender] ... free and open-source 3D creation suite that supports the entirety of the 3D pipeline: modeling, sculpting, rigging, 3D and 2D animation, simulation, rendering, compositing, motion tracking and video editing
* [https://3dprintingindustry.com/news/generating-3d-models-from-text-with-nvidias-magic3d-220520/ Generating 3D Models From Text With Nvidia’s Magic3D | Ada Shaikhnag - 3D Printing Industry] ... combining 3D printing and generative AI

While ChatGPT isn’t quite ready to create a functional model of a complex engine, it is capable of making simple shapes and of writing programs that can be used to make 3D models.
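For example, a chat model can emit a small self-contained script like the following, which writes an axis-aligned cube as an ASCII STL file ready for a slicer or for import into Blender; the shape, size, and file name are arbitrary choices for illustration:

<syntaxhighlight lang="python">
def facet(f, v1, v2, v3):
    """Write one triangular facet in ASCII STL. The normal is left at 0 0 0;
    slicers recompute it from the counter-clockwise vertex winding."""
    f.write("facet normal 0 0 0\n outer loop\n")
    for v in (v1, v2, v3):
        f.write(f"  vertex {v[0]:.3f} {v[1]:.3f} {v[2]:.3f}\n")
    f.write(" endloop\nendfacet\n")

def write_cube_stl(path, s=10.0):
    """Cube of side s (mm) as 12 triangles, two per face, wound outward."""
    v = [(x, y, z) for z in (0, s) for y in (0, s) for x in (0, s)]
    tris = [(0, 2, 1), (1, 2, 3), (4, 5, 6), (5, 7, 6),   # bottom, top
            (0, 1, 4), (1, 5, 4), (2, 6, 3), (3, 6, 7),   # front, back
            (0, 4, 2), (2, 4, 6), (1, 3, 5), (3, 7, 5)]   # left, right
    with open(path, "w") as f:
        f.write("solid cube\n")
        for a, b, c in tris:
            facet(f, v[a], v[b], v[c])
        f.write("endsolid cube\n")

write_cube_stl("cube.stl")   # slice and print, or inspect in a mesh viewer
</syntaxhighlight>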
<youtube>tIIFKPzysok</youtube>
<youtube>a1iV4fcWJJg</youtube>
