* [[AIOps / MLOps]]
* [[Libraries & Frameworks]]
* [http://ico.org.uk/media/about-the-ico/consultations/2617219/guidance-on-the-ai-auditing-framework-draft-for-consultation.pdf Guidance on the AI auditing framework | Information Commissioner's Office (ICO)]
* [http://www.gao.gov/products/GAO-20-48G Technology Readiness Assessments (TRA) Guide | US GAO] ...used to evaluate the maturity of technologies and whether they are developed enough to be incorporated into a system without too much risk.
* [http://dodcio.defense.gov/Portals/0/Documents/Cyber/2019%20Cybersecurity%20Resource%20and%20Reference%20Guide_DoD-CIO_Final_2020FEB07.pdf Cybersecurity Reference and Resource Guide | DOD]
* [http://recruitingdaily.com/five-ways-to-evaluate-ai-systems/ Five ways to evaluate AI systems | Felix Wetzel - Recruiting Daily]
* [http://github.com/cisagov/cset/releases Cyber Security Evaluation Tool (CSET®)] ...provides a systematic, disciplined, and repeatable approach for evaluating an organization’s security posture.
* [http://towardsdatascience.com/3-common-technical-debts-in-machine-learning-and-how-to-avoid-them-17f1d7e8a428 3 Common Technical Debts in Machine Learning and How to Avoid Them | Derek Chia - Towards Data Science]
* [http://ai.facebook.com/blog/new-code-completeness-checklist-and-reproducibility-updates/ New code completeness checklist and reproducibility updates |] [[Facebook]] AI
* [http://www.oreilly.com/radar/why-you-should-care-about-debugging-machine-learning-models/ Why you should care about debugging machine learning models | Patrick Hall and Andrew Burt - O'Reilly]

Many products today leverage artificial intelligence across a wide range of industries, from healthcare to marketing. However, most business leaders who need to make strategic and procurement decisions about these technologies have no formal AI background or academic training in data science. The purpose of this article is to give business people with no AI expertise a general guideline on how to assess an AI-related product and decide whether it is potentially relevant to their business. [http://emerj.com/ai-sector-overviews/how-to-assess-an-artificial-intelligence-product-or-solution-for-non-experts/ How to Assess an Artificial Intelligence Product or Solution (Even if You’re Not an AI Expert) | Daniel Faggella - Emerj]

Nature of risks inherent to AI applications: We believe that the challenge in governing AI is less about dealing with completely new types of risk and more about existing risks either being harder to identify in an effective and timely manner, given the complexity and speed of AI solutions, or manifesting themselves in unfamiliar ways. As such, firms do not require completely new processes for dealing with AI, but they will need to enhance existing ones to take AI into account and fill the necessary gaps. The likely impact on the level of resources required, as well as on roles and responsibilities, will also need to be addressed. [http://www2.deloitte.com/content/dam/Deloitte/nl/Documents/innovatie/deloitte-nl-innovate-lu-ai-and-risk-management.pdf AI and risk management: Innovating with confidence | Deloitte]

<hr>
<center><b>Assessment Questions</b> - Artificial Intelligence (AI) / Machine Learning (ML) / Machine Intelligence (MI)</center>
<hr>

* What challenge does the AI investment solve?
** Is the intent of AI to increase performance (detection), reduce costs (predictive maintenance, reduced inventory), decrease response time, or achieve other outcome(s)?
** How does the AI investment meet the challenge?
** What analytics are being implemented? [[What is AI? | Descriptive (what happened?), Diagnostic (why did it happen?), Predictive/Preventive (what could happen?), Prescriptive (what should happen?), Cognitive (what steps should be taken?)]]
** Is AI being used for [[Cybersecurity]]? Is AI used to protect the AI investment against targeted attacks, often referred to as [http://link.springer.com/article/10.1007/s11416-016-0273-3 advanced targeted attacks (ATAs)] or [http://link.springer.com/chapter/10.1007/978-3-662-44885-4_5 advanced persistent threats (APTs)]?
* Is the organization using the AI investment to gain better capability in the future?
** Is the right [[Evaluation#Leadership| Leadership]] in place?
** Is [[Evaluation#Leadership| Leadership]]'s [[Strategy & Tactics | AI strategy]] documented and well articulated?
** Does the AI investment strategy align with the organization's overall strategy and values?
** Is the AI investment properly resourced: budgeted, with trained staff and key positions filled?
** Is responsibility clearly defined and communicated for AI research, data science, applied machine intelligence engineering, quality assurance, software development, implementing foundational capabilities, user experience, change management, configuration management, security, backup/contingency, domain expertise, and project management?
** Is the organization positioned, or positioning itself, to scale its current state with AI?
* Does the AI reside in a [[Evaluation#Procuring| procured item/application/solution, or is it developed in house]]?
** If the AI is [[Evaluation#Buying| procured]], e.g. embedded in a sensor product, what items are included in the contract to future-proof the solution?
** Are contract items in place to protect the organization's data reuse rights?
* Are [[Evaluation#Best Practices| Best Practices]] being followed? Is the team trained in the [[Evaluation#Best Practices| Best Practices]]?
* What is the [[Evaluation#Return on Investment (ROI)| Return on Investment (ROI)]]? Is the AI investment on track with the original ROI target?
** Is there a clear and realistic way of measuring the success of the AI investment?
* What are the significant [[Evaluation - Measures| measures]] that indicate the AI investment is achieving success?
** What [[Evaluation - Measures]] are documented? Are the [[Evaluation - Measures|Measures]] being used correctly?
** How would you be able to tell if the AI investment was working properly?
** How perfect does the AI have to be before it can be trusted? What is the inference/prediction performance metric for the AI investment?
** What is the current inference/prediction/[[Evaluation - Measures#Receiver Operating Characteristic (ROC) | True Positive Rate (TPR)]]?
** What is the [[Evaluation - Measures#Receiver Operating Characteristic (ROC) | False Positive Rate (FPR)]]? How does the AI reduce false positives without increasing false negatives?
** Is there a [[Evaluation - Measures#Receiver Operating Characteristic (ROC) |Receiver Operating Characteristic (ROC) curve]] plotting the [[Evaluation - Measures#Receiver Operating Characteristic (ROC) | True Positive Rate (TPR)]] against the [[Evaluation - Measures#Receiver Operating Characteristic (ROC) | False Positive Rate (FPR)]]? (See the ROC sketch after this list.)
** When the AI model is updated, how is it determined that performance actually improved?
** Are response plans, procedures, and training in place to address AI attack or failure incidents? How are the AI investment's models audited for security vulnerabilities?
* What is the [[Evaluation#ML Test Score| ML Test Score]]?
* Does [[Data Governance]] treat data as a first-class asset?
** Is [[Master Data Management (MDM) / Feature Store / Data Lineage / Data Catalog | Master Data Management (MDM)]] in place?
** Is there a data management plan (or planning underway)? Does data planning address metadata for dataflows and data transitions? Data quality?
** Has the data been identified for the current AI investment? For future AI investment(s)?
** Are the internal data resources available and accessible? For external data resources, are contracts in place to make the data available and accessible?
** Are permissions in place to use the data, with privacy constraints considered and mitigated?
** Is the data labeled, or does it require manual labeling?
** What is the quality of the data: is it skewed, does it have gaps, is it clean? (See the data quality sketch after this list.)
** Is a sufficient amount of data available?
** Have the key features to be used in the AI model been identified? If needed, what algorithms are used to combine AI features? What is the approximate number of features used?
** How are the [[Datasets|dataset(s)]] used for AI training, testing, and validation managed? Are logs kept on which data is used for different executions/training so that the information used is traceable?
** How is access to the information guaranteed? Are the [[Datasets|dataset(s)]] for AI published (repo, marketplace) for reuse, and if so, where?
* What [[AI Governance]] is in place?
** What are the [[Enterprise Architecture (EA)|AI architecture]] specifics, e.g. [[Ensemble Learning]] methods used, [[Graph Convolutional Network (GCN), Graph Neural Networks (Graph Nets), Geometric Deep Learning|graph network]], or [[Distributed]] learning?
** What AI model type(s) are used? [[Regression]], [[K-Nearest Neighbors (KNN)]], [[Graph Convolutional Network (GCN), Graph Neural Networks (Graph Nets), Geometric Deep Learning|Graph Neural Networks]], [[Reinforcement Learning (RL)]], [[Association Rule Learning]], etc.
** Is [[Transfer Learning]] used? If so, which AI models are used? What mission-specific [[Datasets|dataset(s)]] are used to tune the AI model?
** Are the AI models published (repo, marketplace) for reuse, and if so, where?
** Is the AI model reused from a repository (repo, marketplace)? If so, which one? How are you notified of updates? How often is the repository checked for updates?
** Are AI service(s) used for inference/prediction?
** What AI languages, [[Libraries & Frameworks]], and scripting are implemented? e.g. [[Python]], [[Javascript]], [[PyTorch]], etc.
** What optimizers are used? Is augmented machine learning (AugML) or automated machine learning (AutoML) used?
** Against what [[Benchmarks|benchmark]] standard(s) is the AI model compared/scored? e.g. [[Global Vectors for Word Representation (GloVe)]]
** How often is the deployed AI process [[Model Monitoring | monitored or its measures re-evaluated]]? (See the drift-monitoring sketch after this list.)
** How is bias accounted for in the AI process? How is it assured that the [[Datasets|dataset(s)]] used represent the problem space? What is the process for removing features/data believed not to be relevant? What assurance is provided that the model (algorithm) is not biased? (See the bias check sketch after this list.)
** Is the model [[Explainable Artificial Intelligence (XAI)| (implemented or to be implemented) explainable? Interpretable?]] How so?
** Has role/job displacement due to automation and/or AI implementation been addressed?
** Are [http://en.wikipedia.org/wiki/User_behavior_analytics User and Entity Behavior Analytics (UEBA)] and AI used to help create a baseline for trusted workload access?
* What foundational capabilities are defined or in place for the AI investment, e.g. infrastructure platform, cloud resources?
* Is the AI investment implementing an [[AIOps / MLOps]] pipeline/toolchain?
** What tools are used for [[AIOps / MLOps]]? Please identify both on-premises and online services.
** Are the AI languages, libraries, scripting, and [[AIOps / MLOps]] applications registered in the organization?
** Does the AI investment depict the [[AIOps / MLOps]] pipeline/toolchain applications in its tech stack?
** Is the AI investment identified in the [[AIOps / MLOps| SecDevOps]] architecture?
** Is [[Master Data Management (MDM) / Feature Store / Data Lineage / Data Catalog | data management]] reflected in the [[AIOps / MLOps]] pipeline/toolchain processes/architecture?
** Are end-to-end visibility and bottleneck risks for the [[AIOps / MLOps]] pipeline/toolchain reflected in the risk register, with a mitigation strategy for each risk?
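
Several of the questions above can be made concrete with short, illustrative sketches. These are minimal examples, not prescriptions; the data, file names, and column names in them are hypothetical placeholders.

The TPR/FPR and ROC questions can be checked with standard tooling. A minimal sketch, assuming scikit-learn is available and that <code>y_true</code> (held-out labels) and <code>y_score</code> (model scores) are hypothetical placeholder arrays:

<syntaxhighlight lang="python">
# Minimal sketch: TPR, FPR at one threshold, and the full ROC curve for a binary classifier.
# y_true and y_score are hypothetical placeholder arrays.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_curve, roc_auc_score

y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])                    # held-out ground truth
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9])  # model scores

# TPR and FPR at a single operating threshold (0.5 here)
y_pred = (y_score >= 0.5).astype(int)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TPR = {tp / (tp + fn):.2f}, FPR = {fp / (fp + tn):.2f}")

# Full ROC curve (TPR vs. FPR across all thresholds) and the area under it
fprs, tprs, thresholds = roc_curve(y_true, y_score)
print("AUC =", roc_auc_score(y_true, y_score))
</syntaxhighlight>

Plotting <code>fprs</code> against <code>tprs</code> gives the ROC curve referenced above; the operating threshold is where the false-positive/false-negative trade-off is set.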
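
For the data quality and sufficiency questions, a quick profiling pass often surfaces gaps, skew, and duplicates before any modeling starts. A minimal sketch with pandas, assuming a hypothetical file <code>training_data.csv</code> with a <code>label</code> column:

<syntaxhighlight lang="python">
# Minimal sketch: basic data quality checks (row count, missing values, duplicates, label skew).
# "training_data.csv" and the "label" column are hypothetical placeholders.
import pandas as pd

df = pd.read_csv("training_data.csv")

print("Rows:", len(df))
print("Missing values per column:\n", df.isna().sum())                    # gaps
print("Duplicate rows:", df.duplicated().sum())                           # duplicates
print("Label distribution:\n", df["label"].value_counts(normalize=True))  # skew

# Flag columns where more than 20% of values are missing (the threshold is a judgment call)
too_sparse = df.columns[df.isna().mean() > 0.20].tolist()
print("Columns exceeding the missing-value threshold:", too_sparse)
</syntaxhighlight>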
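
For the monitoring question, one lightweight re-evaluation check is to compare the distribution of the model's scores (or of a key feature) in production against a training-time baseline. A minimal sketch using SciPy's two-sample Kolmogorov–Smirnov test; the score arrays are hypothetical placeholders:

<syntaxhighlight lang="python">
# Minimal sketch: drift check comparing production-time scores against a training-time baseline.
# baseline_scores and production_scores are hypothetical placeholder arrays.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
baseline_scores = rng.normal(loc=0.30, scale=0.10, size=1000)    # scores captured at training time
production_scores = rng.normal(loc=0.45, scale=0.10, size=1000)  # scores observed in production

stat, p_value = ks_2samp(baseline_scores, production_scores)
if p_value < 0.01:
    print(f"Possible drift (KS statistic {stat:.3f}, p = {p_value:.4f}); re-evaluate the model.")
else:
    print("No significant drift detected.")
</syntaxhighlight>

A check like this only flags that something changed; deciding whether to retrain still requires re-scoring the model against fresh labeled data.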
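
For the bias questions, a first-pass check is to compare a selection or performance metric across groups defined by a protected attribute. A minimal sketch computing the demographic parity difference (the gap in positive-prediction rates) with pandas; the column names and values are hypothetical placeholders:

<syntaxhighlight lang="python">
# Minimal sketch: demographic parity difference across groups of a protected attribute.
# The "group" and "prediction" columns are hypothetical placeholders (prediction is 0/1).
import pandas as pd

results = pd.DataFrame({
    "group":      ["A", "A", "A", "B", "B", "B", "B", "A"],
    "prediction": [1,   0,   1,   0,   0,   1,   0,   1],
})

rates = results.groupby("group")["prediction"].mean()
print("Positive-prediction rate per group:\n", rates)
print("Demographic parity difference:", rates.max() - rates.min())
</syntaxhighlight>

Demographic parity is only one lens on bias; the appropriate metric depends on the application and should be chosen with domain and legal input.
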
{|<!-- T -->
| valign="top" |
{| class="wikitable" style="width: 550px;"
||
<youtube>7CcSm0PAr-Y</youtube>
<b>How Should We Evaluate Machine Learning for AI?: Percy Liang
</b><br>Machine learning has undoubtedly been hugely successful in driving progress in AI, but it implicitly brings with it the train-test evaluation paradigm. This standard evaluation only encourages behavior that is good on average; it does not ensure robustness, as demonstrated by adversarial examples, and it breaks down for tasks such as dialogue that are interactive or do not have a correct answer. In this talk, I will describe alternative evaluation paradigms with a focus on natural language understanding tasks, and discuss ramifications for guiding progress in AI in meaningful directions. Percy Liang is an Assistant Professor of Computer Science at Stanford University (B.S. from MIT, 2004; Ph.D. from UC Berkeley, 2011). His research spans machine learning and natural language processing, with the goal of developing trustworthy agents that can communicate effectively with people and improve over time through interaction. Specific topics include question answering, dialogue, program induction, interactive learning, and reliable machine learning. His awards include the IJCAI Computers and Thought Award (2016), an NSF CAREER Award (2016), a Sloan Research Fellowship (2015), and a Microsoft Research Faculty Fellowship (2014).
|}

= <span id="ML Test Score"></span>ML Test Score =
* [http://research.google/pubs/pub43146/ Machine Learning: The High Interest Credit Card of Technical Debt | D. Sculley, G. Holt, D. Golovin, E. Davydov, T. Phillips, D. Ebner, V. Chaudhary, and M. Young -] [[Google]] Research
* [http://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf Hidden Technical Debt in Machine Learning Systems | D. Sculley, G. Holt, D. Golovin, E. Davydov, T. Phillips, D. Ebner, V. Chaudhary, M. Young, J. Crespo, and D. Dennison -] [[Google]] Research
|}<!-- B -->

= <span id="Procuring"></span>Procuring =
* [http://c3.ai/wp-content/uploads/2020/06/Enterprise-AI-Buyers-Guide.pdf Enterprise AI Buyer’s Guide | C3.ai]
* [http://www3.weforum.org/docs/WEF_AI_Procurement_in_a_Box_AI_Government_Procurement_Guidelines_2020.pdf AI Procurement in a Box: AI Government Procurement Guidelines - Toolkit | World Economic Forum]