[https://www.youtube.com/results?search_query=ai+Verification+Validation YouTube]
[https://www.quora.com/search?q=ai%20Verification%20Validation ... Quora]
[https://www.google.com/search?q=ai+Verification+Validation ...Google search]
[https://news.google.com/search?q=ai+Verification+Validation ...Google News]
[https://www.bing.com/news/search?q=ai+Verification+Validation&qft=interval%3d%228%22 ...Bing News]
* [[Risk, Compliance and Regulation]] ... [[Ethics]] ... [[Privacy]] ... [[Law]] ... [[AI Governance]] ... [[AI Verification and Validation]]
* [[Data Quality]] ... [[AI Verification and Validation|validity]], [[Evaluation - Measures#Accuracy|accuracy]], [[Data Quality#Data Cleaning|cleaning]], [[Data Quality#Data Completeness|completeness]], [[Data Quality#Data Consistency|consistency]], [[Data Quality#Data Encoding|encoding]], [[Data Quality#Zero Padding|padding]], [[Data Quality#Data Augmentation, Data Labeling, and Auto-Tagging|augmentation, labeling, auto-tagging]], [[Data Quality#Batch Norm(alization) & Standardization|normalization, standardization]], [[Data Quality#Imbalanced Data|imbalanced data]]
* [[Artificial General Intelligence (AGI) to Singularity]] ... [[Inside Out - Curious Optimistic Reasoning|Curious Reasoning]] ... [[Emergence]] ... [[Moonshots]] ... [[Explainable / Interpretable AI|Explainable AI]] ... [[Algorithm Administration#Automated Learning|Automated Learning]]
* [[Policy]] ... [[Policy vs Plan]] ... [[Constitutional AI]] ... [[Trust Region Policy Optimization (TRPO)]] ... [[Policy Gradient (PG)]] ... [[Proximal Policy Optimization (PPO)]]
* [[Strategy & Tactics]] ... [[Project Management]] ... [[Best Practices]] ... [[Checklists]] ... [[Project Check-in]] ... [[Evaluation]] ... [[Evaluation - Measures|Measures]]
* [[AI Governance]] / [[Algorithm Administration]]
* [[Data Science]] ... [[Data Governance|Governance]] ... [[Data Preprocessing|Preprocessing]] ... [[Feature Exploration/Learning|Exploration]] ... [[Data Interoperability|Interoperability]] ... [[Algorithm Administration#Master Data Management (MDM)|Master Data Management (MDM)]] ... [[Bias and Variances]] ... [[Benchmarks]] ... [[Datasets]]
* [[Development]] ... [[Notebooks]] ... [[Development#AI Pair Programming Tools|AI Pair Programming]] ... [[Codeless Options, Code Generators, Drag n' Drop|Codeless]] ... [[Hugging Face]] ... [[Algorithm Administration#AIOps/MLOps|AIOps/MLOps]] ... [[Platforms: AI/Machine Learning as a Service (AIaaS/MLaaS)|AIaaS/MLaaS]]
* [[Libraries & Frameworks Overview]] ... [[Libraries & Frameworks]] ... [[Git - GitHub and GitLab]] ... [[Other Coding options]]
* [[AI Solver]] ... [[Algorithms]] ... [[Algorithm Administration|Administration]] ... [[Model Search]] ... [[Discriminative vs. Generative]] ... [[Train, Validate, and Test]]
* [[Agents#AI Agent Optimization|AI Agent Optimization]] ... [[Optimization Methods]] ... [[Optimizer]] ... [[Objective vs. Cost vs. Loss vs. Error Function]] ... [[Exploration]]
* [https://www.sogeti.com/globalassets/global/downloads/reports/testing-of-artificial-intelligence_sogeti-report_11_12_2017-.pdf Testing of Artificial Intelligence | Sogeti]
* [[Other Challenges]] in Artificial Intelligence
* [https://towardsdatascience.com/data-science-concepts-explained-to-a-five-year-old-ad440c7b3cbd Data Science Concepts Explained to a Five-year-old | Megan Dibble - Towards Data Science]
= Guardrails AI =
* [https://www.guardrailsai.com/ Guardrails AI]
** [https://github.com/shreyaR/guardrails Guardrails | GitHub]
* [[Conversational AI]] ... [[ChatGPT]] | [[OpenAI]] ... [[Bing/Copilot]] | [[Microsoft]] ... [[Gemini]] | [[Google]] ... [[Claude]] | [[Anthropic]] ... [[Perplexity]] ... [[You]] ... [[phind]] ... [[Ernie]] | [[Baidu]]
* [[Large Language Model (LLM)]] ... [[Large Language Model (LLM)#Multimodal|Multimodal]] ... [[Foundation Models (FM)]] ... [[Generative Pre-trained Transformer (GPT)|Generative Pre-trained]] ... [[Transformer]] ... [[Attention]] ... [[Generative Adversarial Network (GAN)|GAN]] ... [[Bidirectional Encoder Representations from Transformers (BERT)|BERT]]
Guardrails AI is a [[Python]] package for specifying the structure and type of large language model (LLM) outputs, and for validating and correcting them. It works by wrapping [[Large Language Model (LLM)|LLM API calls]] so that it can structure, validate, and correct what the model returns. It can be used to enforce a wide range of requirements, such as:

* Ensuring that the output is of a certain type (e.g., JSON, [[Python]] code, etc.)
* Checking for bias in the output
* Identifying and correcting factual errors
* Preventing the output from containing certain keywords or phrases

Guardrails AI can be used to improve the safety and reliability of LLMs in a wide range of applications, such as:

* Generating text for websites and blogs
* Writing code and scripts
* Translating languages
* Answering questions in a comprehensive and informative way

Here are some examples of how Guardrails AI could be used:

* A news organization could use Guardrails AI to ensure that the articles it generates are free of bias and factual errors.
* A software company could use Guardrails AI to generate code that is well-formatted and bug-free.
* A customer service chatbot could use Guardrails AI to ensure that its responses are helpful and informative.
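A minimal sketch of this wrapping pattern is shown below, written against a pydantic-style ''Guard'' interface similar to the one in the Guardrails documentation. Guardrails' API has changed across releases, so the class names, keyword arguments, prompt variables, and return type here are illustrative assumptions rather than a definitive usage guide; the field names and review text are invented for the example.

<pre>
# Illustrative sketch only -- Guardrails' API differs between releases, so check
# the current documentation for exact class names and signatures.
from pydantic import BaseModel, Field
import guardrails as gd
import openai

# Declare the structure the LLM output must satisfy.
class ReviewSummary(BaseModel):
    summary: str = Field(description="One-sentence summary of the review")
    sentiment: str = Field(description="positive, negative, or neutral")

# The Guard wraps the LLM call: it adds schema instructions to the prompt,
# parses and validates the response, and can re-ask the model when validation fails.
guard = gd.Guard.from_pydantic(
    output_class=ReviewSummary,
    prompt="Summarize the following product review:\n${review}\n\n${gr.complete_json_suffix_v2}",
)

result = guard(
    openai.chat.completions.create,   # the wrapped LLM API call
    prompt_params={"review": "Battery life is great but the screen scratches easily."},
    model="gpt-3.5-turbo",
)

# Older Guardrails releases return a (raw_output, validated_output) tuple;
# newer ones return a ValidationOutcome object with a .validated_output field.
print(result)
</pre>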
<youtube>mspA9SUgjYw</youtube>

= Testing =
Covering both:
* Testing ‘of’ AI
* Testing ‘with’ AI
- Model selection
- Discriminatory Variables
- Adversarial Sensitivity

About Olivier: Olivier is cofounder and Head of Data Science of Moov AI, a data science consulting company. He is co-holder of a patent for an advanced algorithm for assessing borrowing capacity. Olivier is a data science expert with extensive experience supporting and implementing digital-transformation projects for companies in different sectors: financial services, technology, aerospace, [[telecommunications]] and consumer products. He led the data team and implemented a data culture at Pratt & Whitney Canada, L’Oréal and GSoft.
<b>Improving Testing through Automation and AI | Tariq King | STARWEST
</b><br>In this interview, Tariq King, the director of test engineering at Ultimate Software, discusses how we can innovate and make our testing better through smarter automation and the use of artificial intelligence. https://starwest.techwell.com/ He also explains the fundamentals of white box testing so you can find bugs as soon as they happen, and do more thorough, targeted testing during software [[development]].
== <span id="A/B Testing"></span>A/B Testing == | == <span id="A/B Testing"></span>A/B Testing == | ||
[https://www.youtube.com/results?search_query=A/B+Testing+test+quality+Artificial+Intelligence+IV%26V YouTube search...]
[https://www.google.com/search?q=A/B+Testing+test+quality+deep+machine+learning+ML ...Google search]

* [https://towardsdatascience.com/data-science-you-need-to-know-a-b-testing-f2f12aff619a Data science you need to know! A/B testing | Michael Barber - Towards Data Science]
* [https://medium.com/analytics-vidhya/a-b-testing-for-data-science-f1203e9503b6 A/B Testing for Data Science | Anjali Tiwari - Analytics Vidhya - Medium]
* [https://www.kdnuggets.com/2017/05/data-analyst-guide-ab-testing.html A Data Analyst guide to A/B testing | Jacob Joseph - CleverTap - KDnuggets]

A/B testing (also known as bucket testing or split-run testing) is a user experience research methodology. A/B tests consist of a randomized experiment with two variants, A and B, and apply statistical hypothesis testing ("two-sample hypothesis testing") from the field of statistics. A/B testing is a way to compare two versions of a single variable, typically by testing a subject's response to variant A against variant B and determining which of the two variants is more effective. [https://en.wikipedia.org/wiki/A/B_testing Wikipedia]
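To make the "two-sample hypothesis testing" above concrete, here is a small worked sketch in [[Python]] (standard library only) that compares the conversion rates of two page variants with a two-proportion z-test. The visitor and conversion counts are invented for illustration.

<pre>
# Two-proportion z-test for an A/B test (invented conversion data).
from math import sqrt
from statistics import NormalDist

conversions_a, visitors_a = 120, 2400   # variant A: 5.00% conversion
conversions_b, visitors_b = 150, 2400   # variant B: 6.25% conversion

rate_a = conversions_a / visitors_a
rate_b = conversions_b / visitors_b

# Pooled rate under the null hypothesis that A and B convert equally well.
pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
std_err = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))

z = (rate_b - rate_a) / std_err
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value

print(f"A: {rate_a:.2%}  B: {rate_b:.2%}  z = {z:.2f}  p = {p_value:.4f}")
# If p_value is below the chosen significance level (e.g. 0.05),
# the difference between the variants is unlikely to be due to chance alone.
</pre>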
<b>Incorporating AI in A/B Testing - Pavel Dmitriev
</b><br>Despite the rapidly growing number of applications of A.I., accurately measuring the quality of A.I. solutions remains a challenge. In this talk, I will highlight the issues with traditional approaches to evaluating A.I. systems and explain how A/B testing - the gold standard for measuring causal effects - can be used to resolve them. I will share practical learnings and pitfalls from a decade of applying A/B testing to evaluate A.I. systems, which practitioners will be able to apply in their domains. Magnimind Academy TV Presents - July 2020
== <span id="Multivariate Testing (MVT)"></span>Multivariate Testing (MVT) == | == <span id="Multivariate Testing (MVT)"></span>Multivariate Testing (MVT) == | ||
[https://www.youtube.com/results?search_query=Multivariate+Testing+test+quality+Artificial+Intelligence+IV%26V YouTube search...]
[https://www.google.com/search?q=Multivariate+Testing+test+quality+deep+machine+learning+ML ...Google search]

* [https://vwo.com/blog/difference-ab-testing-multivariate-testing/ Key Differences Between Multivariate Testing (MVT) & A/B Testing | Paras Chopra - Visual Website Optimizer (VWO)]
* [https://www.invespcro.com/blog/what-is-multivariate-testing/ What is Multivariate Testing? | Khalid Saleh - Invesp]
* [https://www.smashingmagazine.com/2011/04/multivariate-testing-101-a-scientific-method-of-optimizing-design/ Multivariate Testing 101: A Scientific Method Of Optimizing Design | Paras Chopra - Visual Website Optimizer (VWO)]

Multivariate testing is a technique for testing a hypothesis in which multiple variables are modified. The goal of multivariate testing is to determine which combination of variations performs the best out of all of the possible combinations. Websites and mobile apps are made of combinations of changeable elements. A multivariate test changes multiple elements at once, such as a picture and a headline. For example, three variations of the image and two variations of the headline combine to create six versions of the content, which are tested concurrently to find the winning variation. [https://www.optimizely.com/optimization-glossary/multivariate-testing/ Optimizely]
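As a toy illustration of the 3 × 2 example above, the sketch below enumerates the six image/headline combinations in [[Python]] and reports the one with the highest observed conversion rate. The counts are invented, and a real analysis would also check each comparison for statistical significance.

<pre>
# Toy multivariate test: 3 images x 2 headlines = 6 page variants (invented data).
from itertools import product

images = ["image_A", "image_B", "image_C"]
headlines = ["headline_1", "headline_2"]

# Observed (conversions, visitors) per combination, all served concurrently.
observed = {
    ("image_A", "headline_1"): (95, 2000),
    ("image_A", "headline_2"): (110, 2000),
    ("image_B", "headline_1"): (105, 2000),
    ("image_B", "headline_2"): (140, 2000),
    ("image_C", "headline_1"): (90, 2000),
    ("image_C", "headline_2"): (101, 2000),
}

# Enumerate every combination and compute its conversion rate.
rates = {}
for combo in product(images, headlines):
    conversions, visitors = observed[combo]
    rates[combo] = conversions / visitors
    print(combo, f"{rates[combo]:.2%}")

# The winning variation is the combination with the highest rate
# (a real analysis would also test whether its lead is statistically significant).
winner = max(rates, key=rates.get)
print("Winning combination:", winner)
</pre>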
<youtube>tVnnG7mFeqA</youtube>
<b>Introduction to multivariate data analysis using vegan
</b><br>Get started using the vegan package for R for multivariate data analysis and community ecology. Further information about the webinar is in the [https://github.com/gavinsimpson/intro-vegan-webinar-july-2020 GitHub repo].