Strategy & Tactics
- AI Solver
- Evaluation Measures - Classification Performance
- Building Your Environment
- Enterprise Architecture (EA)
- The 5 Rules Of Product-Driven Data Science | David Foster - Medium
- Think backwards
- Build a data pipeline before a model
- Deliver actions over accuracy
- Modularise and abstract
- Brand the solution
- Exploring Enterprise AI - Insights, Use-Cases, and Best-Practices for Business Leaders | Emerj.com
- Downloadable: Cheat Sheets for AI, Neural Networks, Machine Learning, Deep Learning & Data Science PDF | Stefan Kojouharov - Being Human
Artificial Intelligence is well suited to complex problems where traditional approaches are difficult or impossible to implement, such as when a long list of rules would be required or when the conditions are constantly changing.
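A minimal sketch of that point, using a made-up spam-filtering example (all data and function names are hypothetical, stdlib Python only): hand-coded rules keep accumulating as conditions change, while a learned model is simply retrained on fresh examples.

```python
# Hypothetical illustration: hand-coded rules vs. a model learned from data.
from collections import Counter

# Rule-based approach: every new spam pattern needs another hand-written rule.
def is_spam_rules(message: str) -> bool:
    text = message.lower()
    if "free money" in text:
        return True
    if "winner" in text and "claim" in text:
        return True
    # ... dozens more rules accumulate as conditions change ...
    return False

# Learned approach: score words by how often they appear in spam vs. ham,
# then retrain whenever the data (i.e. the conditions) change.
def train(examples):
    spam_words, ham_words = Counter(), Counter()
    for message, label in examples:
        counter = spam_words if label == "spam" else ham_words
        counter.update(message.lower().split())
    return spam_words, ham_words

def is_spam_learned(model, message: str) -> bool:
    spam_words, ham_words = model
    score = sum(spam_words[w] - ham_words[w] for w in message.lower().split())
    return score > 0

training_data = [
    ("free money now", "spam"),
    ("claim your prize winner", "spam"),
    ("lunch meeting at noon", "ham"),
    ("project status update", "ham"),
]
model = train(training_data)
print(is_spam_learned(model, "claim your free prize"))  # True
print(is_spam_learned(model, "status of the project"))  # False
```

Handling a new spam pattern here means adding labelled examples and re-running `train`, not writing and testing another rule.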
|Machine learning in production (enterprise)|Machine learning in a course or competition|
|---|---|
|Translate a business question into a data question| |
|Think about how the model is going to be consumed| |
|Validate assumptions and methods| |
|Consider how a machine learning model can connect to an (existing) tech stack| |
|Determine data sources| |
|Extract data: SQL queries (sometimes across multiple databases), third-party systems, web scraping, APIs, or data from partners|Download some data (probably one or several CSV files)|
|Transform and clean data: remove erroneous data and outliers, handle missing values|Perhaps do a little cleaning, or chances are the data set may already be clean enough|
|Exploratory data analysis & feature extraction| |
|Feature engineering (vast range of variables)|Feature engineering (a finite number of variables)|
|Perform preprocessing such as converting categorical data into numerical data|Perform preprocessing such as converting categorical data into numerical data|
|Version models, feature selection, hyperparameter tuning, and web service endpoints| |
|Model selection: the best model that can be integrated into the existing tech stack with the least amount of engineering|Model selection|
|Build the model|Build the model|
|Set up training and deployment pipelines| |
|Train the model|Train the model|
|Validate the model|Validate the model|
|Test the model|Test the model|
|Optimise the model until it is ‘good enough’ considering the business value: perform hyperparameter tuning and compare results|Run the data through a variety of suitable models until you find the best one: perform hyperparameter tuning and compare results|
|Make experiments reproducible| |
|If a new tech stack: review prior art and available libraries, services, and scaffolding tools; implement and test the tech stack| |
|Deploy the model| |
|Monitor the model performance in production| |
|Set up monitoring alert systems| |
|Retrain when/if necessary| |
|Mature to automatically retrain and redeploy models| |
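The core stages of the enterprise workflow above (extract → clean → feature engineering → train → validate) can be sketched end to end as plain functions. Everything here is hypothetical, stdlib Python only, and the "model" is deliberately trivial (a learned threshold on one feature); a real pipeline would swap in actual data sources and an ML library.

```python
# Hypothetical pipeline sketch: extract -> clean -> featurize -> train -> validate.
import random

def extract() -> list[dict]:
    # Stand-in for SQL queries / APIs / partner feeds: generate fake records.
    random.seed(0)
    return [{"usage": random.gauss(10 if churned else 20, 3), "churned": churned}
            for churned in [True, False] * 100]

def clean(rows):
    # Remove erroneous records and obvious outliers.
    return [r for r in rows if 0 < r["usage"] < 100]

def featurize(rows):
    # Feature engineering: here just (feature, label) pairs.
    return [(r["usage"], r["churned"]) for r in rows]

def train(data):
    # "Model": a threshold halfway between the two class means.
    churn = [x for x, y in data if y]
    stay = [x for x, y in data if not y]
    return (sum(churn) / len(churn) + sum(stay) / len(stay)) / 2

def predict(threshold, x):
    return x < threshold  # low usage -> predicted churn

def evaluate(threshold, data):
    correct = sum(predict(threshold, x) == y for x, y in data)
    return correct / len(data)

# Train/validation split, then the train -> validate steps from the table.
data = featurize(clean(extract()))
random.shuffle(data)
split = int(0.8 * len(data))
train_set, valid_set = data[:split], data[split:]
threshold = train(train_set)
accuracy = evaluate(threshold, valid_set)
print(f"validation accuracy: {accuracy:.2f}")
```

Keeping each stage a separate function is what makes the later steps in the table (reproducible experiments, deployment pipelines, automated retraining) tractable: retraining is just re-running the same chain on fresh data.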
- Predictive Analysis 101 | Ravi Kalakota
- Machine Learning Algorithms: Which One to Choose for Your Problem | Daniil Korbut
- Automating business processes, gaining insight through data analysis, engaging with customers and employees
- Machine Learning Algorithms
- Neural Network Zoo | Fjodor Van Veen
- Insights | McKinsey & Company
- Outline_of_machine_learning | Wikipedia
- What challenge does the AI solve?
- Is the intent to increase performance (detection), reduce costs (predictive maintenance, reduce inventory), decrease response time, or other outcome(s)?
- What is the clear and realistic way of measuring the success of the AI initiative?
- Does the AI reside in a procured item/application/solution or developed in house?
- If the AI is procured, e.g. embedded in a sensor product, what items are included in the contract to future-proof the solution? Does the contract let the organization use the implementation to gain better capability in the future? Are there contract items protecting the organization's data-reuse rights?
- What type of analytics does the AI address? Descriptive (what happened?), Diagnostic (why did it happen?), Predictive/Preventive (what could happen?), Prescriptive (what should happen?), Cognitive (what steps should be taken?)
- What is the current inference/prediction/true-positive rate (TPR)?
- How perfect does AI have to be to trust it? What is the inference/prediction rate performance metric for the Program?
- What is the false-positive rate? How does the AI reduce false positives without increasing false negatives? What is the false-positive-rate performance metric for the Program? Is there a Receiver Operating Characteristic (ROC) curve plotting the true-positive rate (TPR) against the false-positive rate (FPR)?
- Has the data been identified for the AI initiative(s) (current application or future use)? Is the data labelled, or does it require manual labelling?
- Have the key features to be used in the AI model been identified? If needed, what are the algorithms used to combine AI features? What is the approximate number of features used?
- How are the dataset(s) used for AI training, testing, and validation managed? Are logs kept on which data is used for different executions/training runs so that the information used is traceable? How is access to the information guaranteed?
- Are the dataset(s) for AI published (repo, marketplace) for reuse, if so where?
- What AI model type(s) are used? e.g. regression, K-Nearest Neighbors (KNN), BERT, reinforcement learning, rule-based
- What are the AI architecture specifics, e.g. ensemble methods used, graph network, or distributed learning?
- Are the AI models published (repo, marketplace) for reuse, if so where?
- Is the AI model reused from a repository (repo, marketplace)? If so, which one? How are you notified of updates? How often is the repository checked for updates?
- Is transfer learning used? If so, which AI models are used? What mission specific dataset(s) are used to tune the AI model?
- Are AI service(s) used for inference/prediction? How?
- What AI languages, libraries, and scripts are implemented?
- What tools are used for AIOps? Which are on-premises and which are online services?
- Are the AI languages, libraries, scripting, and AIOps applications registered in the technical reference model?
- What optimizers are used? Is augmented machine learning (AugML) or automated machine learning (AutoML) used?
- When the AI model is updated, how is it determined that performance actually improved?
- Against what benchmark standard(s) is the AI model compared/scored? e.g. General Language Understanding Evaluation (GLUE)
- How often is the deployed AI process monitored or measures re-evaluated?
- How is bias accounted for in the AI process? How is it assured that the dataset(s) used represent the problem space? What is the process for removing features/data believed to be irrelevant? What assurance is there that the model (algorithm) is not biased?
- Is the AI model (implemented or to be implemented) explainable? How so?
- Has role/job displacement due to automation and/or AI implementation been addressed?
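The classification-metric questions above (TPR, FPR, ROC) can be made concrete with a short stdlib-Python sketch; the scores and labels below are made up for illustration. Sweeping the decision threshold traces the ROC curve: each threshold yields one (FPR, TPR) point, showing the trade-off between catching positives and raising false alarms.

```python
# Hypothetical scores from a binary classifier, with ground-truth labels.
labels = [1, 1, 1, 1, 0, 0, 0, 0]          # 1 = positive class
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]

def rates(threshold):
    # Count the confusion-matrix cells at this decision threshold.
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    tpr = tp / (tp + fn)   # true-positive rate (recall, sensitivity)
    fpr = fp / (fp + tn)   # false-positive rate
    return fpr, tpr

# One ROC point per candidate threshold; lowering the threshold raises TPR
# but usually raises FPR too -- the trade-off the questions above probe.
for t in [0.95, 0.65, 0.5, 0.35, 0.05]:
    fpr, tpr = rates(t)
    print(f"threshold {t:.2f}: FPR={fpr:.2f}, TPR={tpr:.2f}")
```

Setting a Program-level TPR or FPR performance metric, as the questions suggest, amounts to choosing which point on this curve the deployed system must operate at.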