Strategy & Tactics


  1. Think backwards
  2. Build a data pipeline before a model
  3. Deliver actions over accuracy
  4. Modularise and abstract
  5. Brand the solution


Artificial Intelligence is well suited to complex problems where traditional approaches are difficult or impossible to implement, such as when a long list of rules would be required or when conditions are constantly changing.

Application | Kaggle Competition
----------- | ------------------
Translate a business question into a data question |
Think about how the model is going to be consumed |
Validate assumptions and methods |
Consider how a machine learning model can connect to an (existing) tech stack |
Determine data sources |
Extract data: SQL queries (sometimes across multiple databases), third-party systems, web scraping, APIs or data from partners | Download some data (probably one or several CSV files)
Transform and clean the data: remove erroneous data and outliers, handle missing values | Perhaps do a little cleaning, or chances are the data set may already be clean enough
Exploratory data analysis & feature extraction |
Feature engineering (vast range of variables) | Feature engineering (a finite number of variables)
Perform preprocessing such as converting categorical data into numerical data | Perform preprocessing such as converting categorical data into numerical data
Version models, feature selection, hyperparameter tuning, and web service endpoints |
Model selection: the best model that can be integrated into the existing tech stack with the least amount of engineering | Model selection
Build the model | Build the model
Set up training and deployment pipelines |
Train the model | Train the model
Validate the model | Validate the model
Test the model | Test the model
Optimise the model until it is 'good enough' given the business value; perform hyperparameter tuning and compare results (see the pipeline sketch below) | Run the data through a variety of suitable models until you find the best one; perform hyperparameter tuning and compare results
Make experiments reproducible |
If new tech stack: review prior art and available libraries, services, scaffolding tools; implement and test the tech stack |
Deploy the model (see the deployment sketch below) |
Monitor the model performance in production |
Set up monitoring alert systems |
Retrain when/if necessary |
Mature to automatically retrain and redeploy models |
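
The "Application" column compresses naturally into a short script. Below is a minimal sketch, not this page's prescribed method: it assumes a single CSV extract ("extract.csv") with a binary target column named "churned", and every file name, column name, and candidate model is a placeholder. It covers the preprocessing, model-selection, hyperparameter-tuning, and reproducibility rows using scikit-learn:

```python
# Minimal sketch of the "Application" workflow above; "extract.csv" and the
# "churned" target are placeholders, not names taken from this page.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.read_csv("extract.csv")        # stand-in for the SQL/API extraction step
df = df.dropna(subset=["churned"])     # minimal cleaning: drop unlabeled rows

X, y = df.drop(columns=["churned"]), df["churned"]
categorical = X.select_dtypes(include="object").columns
numerical = X.select_dtypes(exclude="object").columns

# Preprocessing: convert categorical data into numerical data, scale the rest.
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ("num", StandardScaler(), numerical),
])

# Hold out a test set; a fixed random_state keeps the experiment reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Model selection and hyperparameter tuning over a small candidate grid.
pipeline = Pipeline([("prep", preprocess),
                     ("model", LogisticRegression(max_iter=1000))])
grid = GridSearchCV(
    pipeline,
    param_grid=[
        {"model": [LogisticRegression(max_iter=1000)],
         "model__C": [0.1, 1.0, 10.0]},
        {"model": [RandomForestClassifier(random_state=42)],
         "model__n_estimators": [100, 300]},
    ],
    cv=5, scoring="roc_auc")
grid.fit(X_train, y_train)

print("best configuration:", grid.best_params_)
print("held-out test ROC AUC:", grid.score(X_test, y_test))
```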
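
For the "Web Service Endpoints" and "Deploy the model" rows, one common pattern (assumed here, not mandated by this page) is to wrap the tuned pipeline in a small HTTP service. A hedged sketch using FastAPI, where "model.pkl" and the request payload are placeholders:

```python
# Hedged deployment sketch: serve a pickled scikit-learn pipeline over HTTP.
# "model.pkl" is a placeholder for the tuned pipeline from the sketch above.
import pickle

import pandas as pd
from fastapi import FastAPI

app = FastAPI()
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.post("/predict")
def predict(features: dict):
    # Score a single record; the JSON keys must match the training columns.
    # Logging inputs and outputs here supports the "monitor the model
    # performance in production" row.
    row = pd.DataFrame([features])
    return {"prediction": model.predict(row).tolist()[0]}
```

A POST to /predict with a JSON object of feature values returns one prediction; in production this endpoint would feed the monitoring, alerting, and retraining rows above.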


Project Questions

  1. What challenge does the AI solve?
  2. Is the intent to increase performance (detection), reduce costs (predictive maintenance, reduce inventory), decrease response time, or other outcome(s)?
  3. What is the clear and realistic way of measuring the success of the AI initiative?
  4. Does the AI reside in a procured item/application/solution, or is it developed in house?
  5. If the AI is procured, e.g. embedded in a sensor product, what items are included in the contract to future-proof the solution? Do contract items let the organization use the implementation to gain better capability in the future? Are there contract items that protect the organization's rights to reuse its data?
  6. What type of analytics is the AI addressing? Descriptive (what happened?), diagnostic (why did it happen?), predictive/preventive (what could happen?), prescriptive (what should happen?), cognitive (what steps should be taken?)
  7. What is the current inference/prediction true-positive rate (TPR)?
  8. How accurate does the AI have to be before it is trusted? What is the inference/prediction performance metric for the Program?
  9. What is the false-positive rate? How does the AI reduce false positives without increasing false negatives? What is the false-positive-rate performance metric for the Program? Is there a Receiver Operating Characteristic (ROC) curve plotting the true positive rate (TPR) against the false positive rate (FPR)? (A worked sketch follows this list.)
  10. Has the data been identified for the AI initiative(s) (current application or future use)? Is the data labeled, or does it require manual labeling?
  11. Have the key features to be used in the AI model been identified? If needed, what algorithms are used to combine AI features? What is the approximate number of features used?
  12. How are the dataset(s) used for AI training, testing, and validation managed? Are logs kept of which data is used for each execution/training run so that the information used is traceable? How is access to the information guaranteed?
  13. Are the dataset(s) for AI published (repo, marketplace) for reuse? If so, where?
  14. What AI model type(s) are used? e.g. regression, k-nearest neighbors (KNN), BERT, reinforcement learning, rule-based
  15. What are the AI architecture specifics, e.g. ensemble methods used, graph network, or distributed learning?
  16. Are the AI models published (repo, marketplace) for reuse? If so, where?
  17. Is the AI model reused from a repository (repo, marketplace)? If so, which one? How are you notified of updates? How often is the repository checked for updates?
  18. Is transfer learning used? If so, which AI models are used? What mission-specific dataset(s) are used to tune the AI model?
  19. Are AI service(s) used for inference/prediction? How?
  20. What AI languages, libraries, scripting, are implemented?
  21. What tools are used for AIOps? Please identify both on-premises and online services.
  22. Are the AI languages, libraries, scripting, and AIOps applications registered in the technical reference model?
  23. What optimizers are used? Is augmented machine learning (AugML) or automated machine learning (AutoML) used?
  24. When the AI model is updated, how is it determined that performance actually improved?
  25. Against what benchmark standard(s) is the AI model compared/scored? e.g. General Language Understanding Evaluation (GLUE)
  26. How often is the deployed AI process monitored or its measures re-evaluated?
  27. How is bias accounted for in the AI process? How is it assured that the dataset(s) used represent the problem space? What is the process for removing features/data believed to be irrelevant? What assurance is provided that the model (algorithm) is not biased? (A coarse per-group check is sketched after this list.)
  28. Is the AI model (implemented or to be implemented) explainable? How so?
  29. Has role/job displacement due to automation and/or AI implementation been addressed?
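
Question 9 turns on the TPR/FPR trade-off. The worked sketch below uses scikit-learn with invented toy labels and scores (nothing here comes from a real program) to show how each decision threshold trades false positives against false negatives:

```python
# Toy ROC illustration for question 9; y_true and y_score are invented values.
from sklearn.metrics import roc_auc_score, roc_curve

y_true = [0, 0, 1, 1, 0, 1, 1, 0]                     # ground-truth labels
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.9, 0.5]   # model scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)
print("AUC:", roc_auc_score(y_true, y_score))
for f, t, th in zip(fpr, tpr, thresholds):
    # Lowering the threshold raises TPR but also raises FPR.
    print(f"threshold={th:.2f}  FPR={f:.2f}  TPR={t:.2f}")
```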
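
For question 27, one coarse first check (a sketch only, far short of a real fairness audit) is to compare a simple metric across subgroups; the "group" column and all values below are invented:

```python
# Coarse bias check: compare per-group accuracy; all data here is invented.
import pandas as pd

results = pd.DataFrame({
    "group":  ["A", "A", "B", "B", "B", "A"],   # hypothetical subgroup labels
    "y_true": [1, 0, 1, 1, 0, 1],
    "y_pred": [1, 0, 0, 1, 1, 1],
})
results["correct"] = results["y_true"] == results["y_pred"]
# Large accuracy gaps between groups are one (rough) signal of model bias.
print(results.groupby("group")["correct"].mean())
```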