{{#seo:
|title=PRIMO.ai
|titlemode=append
|keywords=ChatGPT, artificial, intelligence, machine, learning, GPT-4, GPT-5, NLP, NLG, NLC, NLU, models, data, singularity, moonshot, Sentience, AGI, Emergence, Explainable, TensorFlow, Google, Nvidia, Microsoft, Azure, Amazon, AWS, Hugging Face, OpenAI, Meta, LLM, metaverse, assistants, agents, digital twin, IoT, Transhumanism, Immersive Reality, Generative AI, Conversational AI, Perplexity, Bing, You, Bard, Ernie, Prompt Engineering, LangChain, Video/Image, Vision, End-to-End Speech, Synthesize Speech, Speech Recognition, Stanford, MIT
|description=Helpful resources for your journey with artificial intelligence; videos, articles, techniques, courses, profiles, and tools
}}
[https://www.youtube.com/results?search_query=deep+Neural+Networks+DNN YouTube]
[https://www.quora.com/search?q=deep%20Neural%20Networks%20DNN ... Quora]
[https://www.google.com/search?q=deep+Neural+Networks+DNN ...Google search]
[https://news.google.com/search?q=deep+Neural+Networks+DNN ...Google News]
[https://www.bing.com/news/search?q=deep+Neural+Networks+DNN&qft=interval%3d%228%22 ...Bing News]
  
* [[What is Artificial Intelligence (AI)? | Artificial Intelligence (AI)]] ... [[Generative AI]] ... [[Machine Learning (ML)]] ... [[Deep Learning]] ... [[Neural Network]] ... [[Reinforcement Learning (RL)|Reinforcement]] ... [[Learning Techniques]]
* [[Conversational AI]] ... [[ChatGPT]] | [[OpenAI]] ... [[Bing/Copilot]] | [[Microsoft]] ... [[Gemini]] | [[Google]] ... [[Claude]] | [[Anthropic]] ... [[Perplexity]] ... [[You]] ... [[phind]] ... [[Grok]] | [https://x.ai/ xAI] ... [[Groq]] ... [[Ernie]] | [[Baidu]]
* [[Neural Architecture]]
* [[Symbiotic Intelligence]] ... [[Bio-inspired Computing]] ... [[Neuroscience]] ... [[Connecting Brains]] ... [[Nanobots#Brain Interface using AI and Nanobots|Nanobots]] ... [[Molecular Artificial Intelligence (AI)|Molecular]] ... [[Neuromorphic Computing|Neuromorphic]] ... [[Evolutionary Computation / Genetic Algorithms| Evolutionary/Genetic]]
** [https://en.wikipedia.org/wiki/Types_of_artificial_neural_networks Types of artificial neural networks | Wikipedia]
* [[AI Solver]] ... [[Algorithms]] ... [[Algorithm Administration|Administration]] ... [[Model Search]] ... [[Discriminative vs. Generative]] ... [[Train, Validate, and Test]]
** [[...predict categories]]
* [[Reservoir Computing (RC) Architecture]]
* [[Backpropagation]] ... [[Feed Forward Neural Network (FF or FFNN)|FFNN]] ... [[Forward-Forward]] ... [[Activation Functions]] ... [[Softmax]] ... [[Loss]] ... [[Boosting]] ... [[Gradient Descent Optimization & Challenges|Gradient Descent]] ... [[Algorithm Administration#Hyperparameter|Hyperparameter]] ... [[Manifold Hypothesis]] ... [[Principal Component Analysis (PCA)|PCA]]
* [[Artificial General Intelligence (AGI) to Singularity]] ... [[Inside Out - Curious Optimistic Reasoning| Curious Reasoning]] ... [[Emergence]] ... [[Moonshots]] ... [[Explainable / Interpretable AI|Explainable AI]] ... [[Algorithm Administration#Automated Learning|Automated Learning]]
* [[Large Language Model (LLM)]] ... [[Natural Language Processing (NLP)]] ... [[Natural Language Generation (NLG)|Generation]] ... [[Natural Language Classification (NLC)|Classification]] ... [[Natural Language Processing (NLP)#Natural Language Understanding (NLU)|Understanding]] ... [[Language Translation|Translation]] ... [[Natural Language Tools & Services|Tools & Services]]
* [https://neurosciencenews.com/neuroscience-terms/neural-networks/ Neuroscience News - Neural Networks]
* [https://pathmind.com/wiki/neural-network A Beginner's Guide to Neural Networks and Deep Learning | Chris Nicholson - A.I. Wiki pathmind]
  
<img src="https://www.drchriseducation.com/wp-content/uploads/2021/05/Neural-Networks-for-Babies.jpg" width="300">
 
  
<b>Neural Networks (NNs)</b>, also referred to as Artificial Neural Networks (ANNs) or neural nets, are mathematical systems modeled on the human brain that learn skills by finding patterns in data through layers of artificial neurons, outputting predictions or classifications. Neural networks are computing systems inspired by the biological neural networks that constitute animal brains. They can be hardware-based (neurons are represented by physical components) or software-based (computer models), and can use a variety of topologies and learning algorithms. A neural network is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal to other neurons. An artificial neuron receives signals, processes them, and can signal neurons connected to it. The "signal" at a connection is a real number, and the output of each neuron is computed by some non-linear function of the sum of its inputs. The connections are called edges. Neurons and edges typically have a [[Activation Functions#Weights|weight]] that adjusts as learning proceeds; the [[Activation Functions#Weights|weight]] increases or decreases the strength of the signal at a connection. Neurons may have a threshold such that a signal is sent only if the aggregate signal crosses that threshold. Typically, neurons are aggregated into layers. Different layers may perform different transformations on their inputs. Signals travel from the first layer (the input layer) to the last layer (the output layer), possibly after traversing the layers multiple times. [https://en.wikipedia.org/wiki/Artificial_neural_network Artificial neural network | Wikipedia]
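
Below is a minimal sketch, in Python with NumPy, of the forward pass just described: each neuron applies a non-linear function to the weighted sum of its inputs, and signals flow from the input layer through a hidden layer to the output layer. The layer sizes, random weights, and sigmoid activation are illustrative assumptions, not taken from any particular model.

<pre>
import numpy as np

def sigmoid(z):
    # Non-linear activation: squashes the weighted sum into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Illustrative network: 4 inputs -> 5 hidden neurons -> 2 outputs
W1, b1 = rng.normal(size=(5, 4)), np.zeros(5)  # edge weights and biases (thresholds)
W2, b2 = rng.normal(size=(2, 5)), np.zeros(2)

def forward(x):
    # Each layer computes a non-linear function of the weighted sum of its inputs
    hidden = sigmoid(W1 @ x + b1)
    output = sigmoid(W2 @ hidden + b2)
    return output

x = np.array([0.2, -1.0, 0.5, 0.7])  # input "signals" (real numbers)
print(forward(x))                     # two output activations
</pre>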
  
<youtube>aircAruvnKk</youtube>
<youtube>0QczhVg5HaI</youtube>
 
<youtube>GQVLl0RqpSs</youtube>
 
<youtube>v2tKoymKIuE</youtube>
 
<youtube>LCzufhtIFnY</youtube>
 
<youtube>SGZ6BttHMPw</youtube>
 
  
 
= <span id="Deep Neural Network (DNN)"></span>Deep Neural Network (DNN) =
 
* [[Deep Learning]]
** [[(Deep) Convolutional Neural Network (DCNN/CNN)]]
** [[(Deep) Residual Network (DRN) - ResNet]]
** [[Deep Belief Network (DBN)]]
** [[ResNet-50]]
* [https://medium.com/intuitionmachine/deep-learnings-uncertainty-principle-13f3ffdd15ce Deep Learning’s Uncertainty Principle]
* [https://www.youtube.com/watch?v=7PiK4wtfvbA&list=PLBAGcD3siRDguyYYzhVwZ3tLvOyyG5k6K Andrew Ng's Deep Learning]

The MIT research trio of Tomaso Poggio, Andrzej Banburski, and Qianli Liao (Center for Brains, Minds, and Machines) compared deep and shallow networks in which both used identical sets of procedures such as pooling, convolution, linear combinations, a fixed nonlinear function of one variable, and dot products. Why do deep networks have such great approximation power, and why do they tend to achieve better results than shallow networks, given that both are universal approximators?
 
The scientists observed that for convolutional deep neural networks with hierarchical locality, this exponential approximation cost vanishes and becomes more linear again. They then demonstrated that the curse of dimensionality can be avoided by deep networks of the convolutional type for certain types of compositional functions. The implication is that for problems with hierarchical locality, such as image classification, deep networks are exponentially more powerful than shallow networks.
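
As a rough illustration of hierarchical locality, here is a sketch under assumed toy functions (not the researchers' actual experiment): a compositional function of eight variables is built from constituent functions that each see only two inputs, the structure a hierarchical, convolutional-style network can mirror layer by layer, while a generic shallow approximator must handle all eight variables at once.

<pre>
import numpy as np

# Toy compositional function on 8 inputs, built from 2-argument constituents:
# f(x) = h( h(h(x1,x2), h(x3,x4)), h(h(x5,x6), h(x7,x8)) )
def h(a, b):
    # Each constituent only ever sees two variables (hierarchical locality)
    return np.tanh(a + 2.0 * b)

def f(x):
    l1 = [h(x[0], x[1]), h(x[2], x[3]), h(x[4], x[5]), h(x[6], x[7])]
    l2 = [h(l1[0], l1[1]), h(l1[2], l1[3])]
    return h(l2[0], l2[1])

x = np.linspace(-1.0, 1.0, 8)
print(f(x))
# A deep network whose layers mirror this binary tree only has to learn
# two-variable functions at each node; a shallow network approximating f
# directly faces an eight-dimensional problem (the curse of dimensionality).
</pre>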
  
<img src="https://miro.medium.com/max/880/1*tfqGFuYQHCAfRZTI0yw-dg.gif" width="700">


“In approximation theory, both shallow and deep networks are known to approximate any continuous functions at an exponential cost,” the researchers wrote. “However, we proved that for certain types of compositional functions, deep networks of the convolutional type (even without [[Activation Functions#Weights|weight sharing]]) can avoid the curse of dimensionality.”

The team then set out to explain why deep networks, which tend to be over-parameterized, perform well on out-of-sample data. The researchers demonstrated that for classification problems, given a standard deep network trained with gradient descent algorithms, it is the direction in the parameter space that matters, rather than the norms or the size of the [[Activation Functions#Weights|weights]].
  
 
The implications are that the dynamics of gradient descent on deep networks are equivalent to those with explicit constraints on both the norm and size of the parameters: gradient descent converges to the max-margin solution. The team noted a similarity with linear models, in which vector machines converge to the pseudoinverse solution, the solution that minimizes the norm.
  
In effect, the team posits that the act of training deep networks serves to provide implicit regularization and norm control. The scientists attribute the ability of deep networks to generalize, without explicit capacity controls such as a regularization term or a constraint on the norm of the [[Activation Functions#Weights|weights]], to a mathematical computation showing that the unit vector (computed from the solution of gradient descent) remains the same whether or not the constraint is enforced during gradient descent. In other words, deep networks select minimum-norm solutions, hence the gradient flow of deep networks with an exponential-type loss locally minimizes the expected error. [https://www.psychologytoday.com/us/blog/the-future-brain/202008/new-ai-study-may-explain-why-deep-learning-works A New AI Study May Explain Why Deep Learning Works: MIT researchers’ new theory illuminates machine learning’s black box | Cami Rosso - Psychology Today] ... [https://www.pnas.org/ PNAS (Proceedings of the National Academy of Sciences of the United States of America) | T. Poggio, A. Banburski, and Q. Liao]
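
A hedged numerical sketch of the direction-versus-norm observation, using plain logistic regression on linearly separable toy data (an illustrative simplification; the paper's analysis concerns deep networks): as gradient descent runs on an exponential-type loss, the norm of the weights keeps growing, but the normalized weight vector, i.e. its direction, settles toward the maximum-margin separator, and it is that direction that determines the classifier's predictions.

<pre>
import numpy as np

rng = np.random.default_rng(1)

# Linearly separable toy data: two Gaussian blobs, labels in {-1, +1}
X = np.vstack([rng.normal(+2.0, 0.5, (50, 2)), rng.normal(-2.0, 0.5, (50, 2))])
y = np.concatenate([np.ones(50), -np.ones(50)])

w = np.zeros(2)
lr = 0.1
for step in range(1, 20001):
    margins = y * (X @ w)
    # Gradient of the logistic (exponential-type) loss
    grad = -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
    w -= lr * grad
    if step in (100, 1000, 20000):
        print(step, "||w|| =", round(float(np.linalg.norm(w)), 3),
              "direction =", np.round(w / np.linalg.norm(w), 3))

# ||w|| grows without bound, but w / ||w|| converges; the predictions
# sign(X @ w) depend only on that direction, echoing the max-margin result.
</pre>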
 
 
  
<youtube>FBpPjjhJGhk</youtube>
<youtube>ILsA4nyG7I0</youtube>
<youtube>TkwXa7Cvfr8</youtube>
<youtube>sAkfoGAywks</youtube>
<youtube>gzYFDNKHSMM</youtube>
 
 
= Opening the Black Box =
 
 
* [https://arxiv.org/pdf/1703.00810.pdf Opening the Black Box of Deep Neural Networks via Information | Ravid Schwartz-Ziv and Naftali Tishby - The Hebrew University of Jerusalem]
 
  
 
<img src="https://i0.wp.com/syncedreview.com/wp-content/uploads/2019/06/image-45.png" width="600">
 
= <span id="Decoding the Human Mind"></span>Decoding the Human Mind =
<youtube>wPonuHqbNds</youtube>
  
 
= <span id="Neural Network History"></span>Neural Network History =
 
* [[Perceptron (P)]]
* [[Creatives]] ... [[History of Artificial Intelligence (AI)]] ... [[Neural Network#Neural Network History|Neural Network History]] ... [[Rewriting Past, Shape our Future]] ... [[Archaeology]] ... [[Paleontology]]
 
 
  
 
The history of neural networks is long and complex, but it can be traced back to the early days of artificial intelligence research. In 1943, <b>Warren McCulloch</b> and <b>Walter Pitts</b> published a paper in which they proposed a mathematical model of the human brain. This model, which was based on the idea that neurons in the brain are connected to each other, became the foundation for modern neural networks.

<i>The inventors of the neural network, Walter Pitts and Warren McCulloch, pictured in 1949. (Semantic Scholar)</i>

In the 1950s, neural networks began to be used to solve real-world problems. One of the most famous early applications was the development of the [[Perceptron (P)|Perceptron]] by Frank Rosenblatt in 1958. The Perceptron was a simple neural network that could be used to classify data. It was initially very successful, but it was later shown to be limited in its capabilities.

In the 1960s and 1970s, neural networks fell out of favor as researchers focused on other approaches to artificial intelligence. However, in the 1980s there was a resurgence of interest in neural networks, due in part to the development of new algorithms that made neural networks more powerful.

In the 1990s, neural networks began to be used to solve a wide range of problems, including image recognition, speech recognition, and natural language processing. This was due in part to the availability of large amounts of data that could be used to train neural networks. Notable contributors to neural networks in this period include:

* <b>Geoffrey Hinton</b>, who helped develop and popularize [[Backpropagation|backpropagation]], a technique for training neural networks that is still used today.
* <b>Yann LeCun</b>, who developed convolutional neural networks, now widely used in image recognition and natural language processing.
* <b>Yoshua Bengio</b>, who advanced recurrent neural networks, used in tasks such as speech recognition and machine translation.
* <b>Michael Jordan</b>, who contributed theoretical insights into neural networks and their applications.
* <b>Terrence Sejnowski</b>, who developed new learning algorithms and models and applied neural networks to a variety of problems in machine learning and neuroscience.

In the 2000s, neural networks became even more powerful with the development of deep learning. [[Deep Learning]] uses neural networks with multiple layers of artificial neurons and has achieved state-of-the-art results on a wide range of problems, including image recognition, speech recognition, and natural language processing.

Today, neural networks are one of the most important tools in artificial intelligence. They are used in a wide range of applications, including self-driving cars, facial recognition, and fraud detection.

<hr>

* AI will soon become impossible for humans to comprehend – the story of neural networks tells us why | David Beer - The Conversation

David Rumelhart, who had a background in psychology and co-authored a set of books published in 1986 that would later drive attention back towards neural networks, found himself collaborating on the development of neural networks with his colleague Jay McClelland. As well as being colleagues, they had recently encountered each other at a conference in Minnesota, where Rumelhart’s talk on “story understanding” had provoked some discussion among the delegates. Following that conference, McClelland returned with a thought about how to develop a neural network that might combine models to be more interactive. What matters here is Rumelhart’s recollection of the “hours and hours and hours of tinkering on the computer”...

<i>We sat down and did all this in the computer and built these computer models, and we just didn’t understand them. We didn’t understand why they worked or why they didn’t work or what was critical about them.</i>

<youtube>XKC-4Tosdd8</youtube>


= The Next Great Scientific Theory =

* [[Perspective]] ... [[Context]] ... [[In-Context Learning (ICL)]] ... [[Transfer Learning]] ... [[Out-of-Distribution (OOD) Generalization]]


<b>The Next Great Scientific Theory is Hiding Inside a Neural Network</b>: Machine learning methods such as neural networks are quickly finding uses in everything from text generation to construction cranes. Excitingly, those same tools also promise a new paradigm for scientific discovery. In this Presidential Lecture, Miles Cranmer will outline an innovative approach that leverages neural networks in the scientific process.


<hr><center><b><i>

Rather than directly modeling data, the approach interprets neural networks trained using the data. Through training, the neural networks can capture the physics underlying the system being studied. By extracting what the neural networks have learned, scientists can improve their theories.

</i></b></center><hr>


He will also discuss the Polymathic AI initiative, a collaboration between researchers at the Flatiron Institute and scientists around the world. Polymathic AI is designed to spur scientific discovery using technology similar to that powering [[ChatGPT]]. Using Polymathic AI, scientists will be able to model a broad range of physical systems across different scales. [https://www.simonsfoundation.org/event/the-next-great-scientific-theory-is-hiding-inside-a-neural-network/ The Next Great Scientific Theory is Hiding Inside a Neural Network | Miles Cranmer - Simons Foundation]

<youtube>fk2r8y5TfNY</youtube>
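
A minimal sketch of the idea of interpreting the trained network rather than the raw data (illustrative only; Cranmer's actual work uses symbolic regression tools such as PySR applied to neural networks): fit a small neural network to noisy measurements, then probe the network's learned input-output function on a clean grid and distill a compact symbolic form from it. The toy law, layer sizes, and polynomial fit below are all assumptions made for the example.

<pre>
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)

# Noisy "measurements" of an assumed underlying law: y = 3x^2 - 2x
x = rng.uniform(-2.0, 2.0, 400)
y = 3.0 * x**2 - 2.0 * x + rng.normal(0.0, 0.3, 400)

# Step 1: let a neural network absorb the data
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000, random_state=0)
net.fit(x.reshape(-1, 1), y)

# Step 2: interpret the trained network, not the raw data - query its learned
# function on a clean grid and distill a symbolic form from the predictions
grid = np.linspace(-2.0, 2.0, 200)
learned = net.predict(grid.reshape(-1, 1))
coeffs = np.polyfit(grid, learned, deg=2)  # stand-in for symbolic regression
print("recovered law ~ %.2f x^2 + %.2f x + %.2f" % tuple(coeffs))
</pre>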
