Python

Revision as of 08:58, 8 July 2020 by BPeat

Scientific Python Overview | Daniel Rothenberg and the GCST

Python Data Science Handbook

NumPy

  • NumPy - manipulation of numerical arrays. NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data types can be defined, which allows NumPy to integrate seamlessly and speedily with a wide variety of databases.
  • Python Numpy Tutorial | Justin Johnson
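
As a rough illustration of the "arbitrary data-types" point, the sketch below defines a structured dtype and mixes it with ordinary array arithmetic. The field names and values are made up for the example.

```python
import numpy as np

# Hypothetical record layout: a name, an age, and a weight per row.
person_dtype = np.dtype([("name", "U10"), ("age", "i4"), ("weight", "f8")])

people = np.array(
    [("Alice", 31, 58.5), ("Bob", 45, 82.0), ("Carol", 27, 64.2)],
    dtype=person_dtype,
)

# Each field is accessed by name and behaves like a regular NumPy array.
mean_age = people["age"].mean()
heaviest = people[np.argmax(people["weight"])]["name"]

# Plain multi-dimensional arrays support vectorized reductions per axis.
grid = np.arange(6).reshape(2, 3)
column_sums = grid.sum(axis=0)

print(mean_age, heaviest, column_sums)
```

Structured arrays like this are one way to hold database-style rows in NumPy; for labeled tabular work a Pandas DataFrame is usually more convenient.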

Pandas

  • Python Data Analysis library - data structures and data analysis tools for the Python programming language. Pandas is a newer package built on top of NumPy, and provides an efficient implementation of a Pandas DataFrame. Pandas DataFrames are essentially multidimensional arrays with attached row and column labels, and often with heterogeneous types and/or missing data. As well as offering a convenient storage interface for labeled data, Pandas implements a number of powerful data operations familiar to users of both database frameworks and spreadsheet programs.
  • Modin accelerates Pandas by automatically distributing the computation across all of the system’s available CPU cores
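
A small sketch of what "labels, heterogeneous types and missing data" look like in practice. The column names and values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Row labels (index) and column labels, with mixed column types
# and one missing temperature reading.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Lima", "Oslo", "Lima"],
        "temp_c": [4.0, np.nan, 6.0, 22.0],
        "rainy": [True, False, True, False],
    },
    index=["mon", "tue", "wed", "thu"],
)

# Count missing values in a column.
missing_count = int(df["temp_c"].isna().sum())

# Database-style aggregation; NaN values are skipped by default.
mean_by_city = df.groupby("city")["temp_c"].mean()

# Spreadsheet-style lookup by row and column label.
wed_temp = df.loc["wed", "temp_c"]

print(missing_count, wed_temp)
print(mean_by_city)
```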

Pandas DataFrame

SciPy

  • SciPy library - one of the core packages that make up the SciPy stack. It provides many user-friendly and efficient numerical routines such as routines for numerical integration, interpolation, optimization, linear algebra and statistics.
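
Two of the routine families mentioned above, numerical integration and optimization, can be sketched as follows. The integrand and the objective function are arbitrary choices for the example.

```python
import numpy as np
from scipy import integrate, optimize

# Numerical integration: the integral of sin(x) from 0 to pi is exactly 2.
area, abs_error = integrate.quad(np.sin, 0.0, np.pi)

# Scalar optimization: this parabola has its minimum at x = 3.
result = optimize.minimize_scalar(lambda x: (x - 3.0) ** 2 + 1.0)

print(area, result.x)
```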

SymPy

  • SymPy library - a Python library for symbolic mathematics aiming to become a full-featured computer algebra system (CAS)
  • mpmath | Fredrik Johansson library for real and complex floating-point arithmetic with arbitrary precision
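
A minimal sketch of symbolic work in SymPy: differentiation, integration, and solving an equation. The expressions are arbitrary examples.

```python
import sympy as sp

x = sp.symbols("x")
expr = sp.sin(x) * sp.exp(x)

# Symbolic derivative and antiderivative of the same expression.
derivative = sp.diff(expr, x)
antiderivative = sp.integrate(expr, x)

# Exact roots of a polynomial equation.
roots = sp.solve(x**2 - 5 * x + 6, x)

print(derivative)
print(antiderivative)
print(roots)
```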

mpmath

  • mpmath | Fredrik Johansson library for real and complex floating-point arithmetic with arbitrary precision. It can be used as a library, interactively via the Python interpreter, or from within the SymPy or Sage computer algebra systems, which include mpmath as a standard component. CoCalc lets you use mpmath directly in the browser. CoCalc, or "Collaborative Calculation in the Cloud", enables programming online without the need to install any software.
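
Used as a library, arbitrary precision is mostly a matter of setting the working number of decimal places. A short sketch, with 50 digits chosen arbitrarily:

```python
from mpmath import mp, mpf, pi, sqrt

# Work with 50 significant decimal digits instead of the ~16 of a float.
mp.dps = 50

root_two = sqrt(mpf(2))

# Render pi to 30 significant digits as a string.
pi_text = mp.nstr(pi, 30)

print(root_two)
print(pi_text)
```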

NetworkX

scikit-learn

  • scikit-learn library for machine learning in Python. A toolkit that implements a wide variety of algorithms for supervised and unsupervised machine learning tasks, including regression, clustering, manifold learning, principal components, density estimation, and much more. It also provides many useful tools to help build “pipelines” for managing modeling tasks such as data processing/normalization, feature engineering, cross-validation, fitting, and prediction. Install the package with pip install scikit-learn, but import it in your code with import sklearn.
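
The pipeline idea can be sketched with a scaler and a classifier evaluated by cross-validation. The dataset (the bundled iris data) and the choice of logistic regression are assumptions made only for this example.

```python
# Installed as "scikit-learn", imported as "sklearn".
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Normalization and model fitting chained into one estimator, so the
# scaler is re-fitted on the training part of every cross-validation fold.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

scores = cross_val_score(model, X, y, cv=5)

# Fit on all data, then predict the first three samples.
model.fit(X, y)
predictions = model.predict(X[:3])

print(scores.mean(), predictions)
```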

Python & Excel

What is the best library out there for working with Excel through Python?

You can just export to CSV if it's just a table of data that doesn't need any formatting. Pandas works great for this. You don't need anything else.
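
A sketch of the CSV route with Pandas. The table contents are invented, and an in-memory buffer stands in for a real file path so the example stays self-contained.

```python
import io

import pandas as pd

table = pd.DataFrame({"item": ["widget", "gadget"], "qty": [3, 5]})

# Write CSV text; pass a filename instead of the buffer to create a file
# that Excel can open directly.
buffer = io.StringIO()
table.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# Reading the text back gives the same table.
round_trip = pd.read_csv(io.StringIO(csv_text))

print(csv_text)
```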

pyxll

  • pyxll - Python Excel Add-In =python(“in Excel”)

PyXLL is an Excel Add-In that enables developers to extend Excel’s capabilities with Python code. For organizations that want to provide their end users with functionality within Excel, PyXLL makes Python a productive, flexible back-end for Excel worksheets. With PyXLL, your own Python code runs in Excel using any Python distribution you like (e.g. Anaconda, Enthought’s Canopy or any other CPython distribution from 2.3 to 3.7). Because PyXLL runs your own full Python distribution you have access to all 3rd party Python packages such as NumPy, Pandas and SciPy and can call them from Excel.


xlwings

  • xlwings - Innovative Solutions For Microsoft Excel; excel with Microsoft Excel

If you want a user to enter some data in Excel, hand it off to Python, and then show the results to your user in Excel, xlwings is great.


openpyxl


XlsxWriter

I am just pulling data from an SQL Server, manipulating the results, and then dumping the results into an Excel spreadsheet. Just working with Excel cells, and ranges.

PyMC3

StatsModels

  • StatsModels A module for fitting and estimating many different types of statistical models as well as performing hypothesis testing and exploratory data analysis. It features tools for fitting generalized linear models, survival analyses, and multi-variate statistics.

OpenCV

  • OpenCV - Open Source Computer Vision library - for those who work with images and/or videos and wish to add a variety of classical and state-of-the-art vision algorithms to their toolbox.

LibROSA

  • LibROSA - audio and voice processing which can extract various kinds of features from audio segments, such as the rhythm, beats and tempo.

PyGame


Parallel

DASK

  • DASK provides advanced parallelism for analytics, enabling performance at scale for the tools you love - it is developed in coordination with other community projects like NumPy, Pandas, and scikit-learn

Joblib

  • Joblib provides lightweight pipelining in Python
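
One common use of Joblib is running a loop in parallel with Parallel and delayed. The sketch below uses two workers and a thread-based backend hint; both settings are arbitrary choices for the example.

```python
from math import sqrt

from joblib import Parallel, delayed

# Each delayed(...) call describes one task; Parallel runs them on the
# workers and returns the results in the original order.
roots = Parallel(n_jobs=2, prefer="threads")(
    delayed(sqrt)(i ** 2) for i in range(6)
)

print(roots)
```

For CPU-bound pure-Python functions the default process-based backend is usually the better fit; threads are used here only to keep the sketch lightweight.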

Tornado

  • Tornado is a web framework and asynchronous networking library. By using non-blocking network I/O, Tornado can scale to tens of thousands of open connections, making it ideal for long polling, WebSockets, and other applications that require a long-lived connection to each user.

Numba

  • Numba JIT compiler that translates a subset of Python and NumPy code into fast machine code.

xarray

  • xarray makes working with labelled multi-dimensional arrays simple and efficient. Xarray introduces labels in the form of dimensions, coordinates and attributes on top of raw NumPy-like arrays, which allows for a more intuitive, more concise, and less error-prone developer experience. The package includes a large and growing library of domain-agnostic functions for advanced analytics and visualization with these data structures. Xarray was inspired by and borrows heavily from Pandas, the popular data analysis package focused on labelled tabular data. It is particularly tailored to working with netCDF files, which were the source of xarray’s data model, and integrates tightly with DASK for parallel computing.

IPython Blocks

  • IPython Blocks a tool for practicing Python in the Jupyter Notebook, giving learners a grid of colors to manipulate while practicing for loops, if statements, and other aspects of Python.

Metaflow

  • Metaflow - an open source Python library from Netflix and AWS

Web Automation with Python - Data Gathering

Write a Python crawler to extract information from websites by identifying patterns, both in the URLs and in the XPath locations of the data. Once these patterns are figured out, these tools can automatically extract the needed information and organize the data into a usable structure.
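
The two kinds of pattern can be sketched with the standard library alone. Everything about the page below (markup, class names, the /items/ URL scheme) is hypothetical, and the page is deliberately well-formed so that xml.etree.ElementTree can parse it; real-world HTML usually needs a lenient parser such as the ones covered in the following sections.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page used only for this illustration.
PAGE = """<html><body>
<div class="product"><a href="/items/101">Lamp</a><span class="price">19.99</span></div>
<div class="product"><a href="/items/102">Desk</a><span class="price">149.00</span></div>
<p><a href="/about">About us</a></p>
</body></html>"""

# URL pattern: product pages look like /items/<numeric id>.
ITEM_URL = re.compile(r"^/items/(\d+)$")

root = ET.fromstring(PAGE)
records = []

# XPath pattern: every product sits in <div class="product">.
for product in root.findall(".//div[@class='product']"):
    link = product.find("a")
    match = ITEM_URL.match(link.get("href", ""))
    if match is None:
        continue  # skip links that do not follow the product URL pattern
    price = product.find("span[@class='price']")
    records.append(
        {"id": int(match.group(1)), "name": link.text, "price": float(price.text)}
    )

print(records)
```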

Requests

Beautiful Soup - bs4

Allows you to import its functions and use them in-line. Therefore, you could even use it in your Jupyter notebooks.
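
In-line use might look like the following sketch, which parses an HTML string with the built-in html.parser backend. The markup is invented for the example.

```python
from bs4 import BeautifulSoup

html = (
    "<html><body><h1>Title</h1>"
    "<ul><li><a href='/a'>First</a></li><li><a href='/b'>Second</a></li></ul>"
    "</body></html>"
)

soup = BeautifulSoup(html, "html.parser")

# Navigate by tag name, or search the whole tree.
heading = soup.h1.get_text()
links = [(a.get_text(), a["href"]) for a in soup.find_all("a")]

print(heading, links)
```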

Scrapy

  • Scrapy - web scraping; an open source and collaborative framework for extracting the data you need from websites

Selenium

Initialises a web browser such as Chrome and then simulates all the actions defined in the code, including those that depend on JavaScript functions: for example, registering an account, then logging in and getting the content after clicking some buttons and links.

Twisted

  • Twisted an event-driven networking engine

Twisted has been around a long time in the Python world. Pioneering the Deferred abstraction, which later turned into Promises and found their way into JavaScript, it is a fertile ground for asynchronous I/O experimentation. Through its groundbreaking protocol/transport design that some of us might take for granted these days, and a strict adherence to unit testing and representing things through abstract interfaces, it lets you talk a lot of different network protocols without really having to know everything about them.

Pipelines

Python is one of the most crucial orchestration and infrastructure automation components of AIOps, reducing or almost eliminating the disconnect between developers and system admins. AIOps is centered on enabling AI pipelines for continuous integration and continuous deployment (CI/CD) with no downtime.

Vaex

PyCaret


TPOT

ELI5

  • ELI5 "Explain it like I'm 5" helps to...
    • debug machine learning classifiers and explain their predictions.
      • scikit-learn - Currently ELI5 can explain weights and predictions of scikit-learn linear classifiers and regressors, print decision trees as text or as SVG, show feature importances and explain predictions of decision trees and tree-based ensembles. ELI5 understands text processing utilities from scikit-learn and can highlight text data accordingly. Pipeline and FeatureUnion are supported. It can also debug scikit-learn pipelines which contain HashingVectorizer, by undoing hashing.
      • xgboost - show feature importances and explain predictions of XGBClassifier, XGBRegressor and xgboost.Booster.
      • LightGBM - show feature importances and explain predictions of LGBMClassifier and LGBMRegressor. A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks. It is under the umbrella of the Microsoft Distributed Machine Learning Toolkit (DMTK) project of Microsoft.
      • CatBoost - show feature importances of CatBoostClassifier, CatBoostRegressor and catboost.CatBoost. A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU
      • lightning - explain weights and predictions of lightning classifiers and regressors. Large-scale linear classification, regression and ranking in Python
      • sklearn-crfsuite - ELI5 can check weights of sklearn_crfsuite.CRF models. CRFsuite is an implementation of Conditional Random Fields (CRFs) for labeling sequential data.
    • ELI5 also implements several algorithms for inspecting black-box models (see Inspecting Black-Box Estimators):
      • TextExplainer can explain predictions of any text classifier using the LIME algorithm. There are utilities for using LIME with non-text data and arbitrary black-box classifiers as well, but this feature is currently experimental.
      • Permutation importance method can be used to compute feature importances for black box estimators.
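
The permutation importance idea can be sketched without ELI5 itself. The example below swaps in scikit-learn's own sklearn.inspection.permutation_importance, which is not ELI5's API, on synthetic data where only the first feature carries signal.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

# Synthetic data: the target depends on feature 0 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = 5.0 * X[:, 0] + 0.1 * rng.normal(size=300)

# Any fitted estimator can be treated as a black box here.
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Shuffle one column at a time and measure how much the score drops.
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)

# Feature indices ordered from most to least important.
ranking = np.argsort(result.importances_mean)[::-1]

print(ranking, result.importances_mean)
```

A feature whose shuffling barely changes the score contributes little to the model's predictions, which is why the two noise features end up with importances near zero.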

Explanation and formatting are separated; you can get a text-based explanation to display in the console, an HTML version embeddable in an IPython notebook or web dashboards, a Pandas DataFrame object if you want to process results further, or a JSON version which lets you implement custom rendering and formatting on a client.

yellowbrick

This library is essentially an extension of the scikit-learn library and provides some really useful and pretty looking visualisations for machine learning models. The visualiser objects, the core interface, are scikit-learn estimators and so if you are used to working with scikit-learn the workflow should be quite familiar.

MLxtend

This library contains a host of helper functions for machine learning. This covers things like stacking and voting classifiers, model evaluation, feature extraction and engineering and plotting.

LIME

SHAP

Python Stack

Flask

Flask is considered more Pythonic than the Django web framework because in common situations the equivalent Flask web application is more explicit. Flask is also easy to get started with as a beginner because there is little boilerplate code for getting a simple app up and running. Flask | Full Stack Python
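
To give a sense of how little boilerplate is involved, here is a minimal sketch of a Flask app with two routes, exercised through Flask's built-in test client rather than a running server. The route paths and response contents are made up for the example.

```python
from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/")
def index():
    # A plain string becomes an HTML response.
    return "Hello from Flask"


@app.route("/api/square/<int:n>")
def square(n):
    # The <int:n> converter parses the URL segment into an integer.
    return jsonify(n=n, square=n * n)


# The test client issues requests without starting a web server.
client = app.test_client()
home = client.get("/")
payload = client.get("/api/square/7").get_json()

print(home.status_code, payload)
```

To serve the app during development, app.run() or the flask run command would be used instead of the test client.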

Flask & React

Flask & Docker

Flask, React, & Docker

Django

  • Django - a high-level Python Web framework that encourages rapid development and clean, pragmatic design. Built by experienced developers, it takes care of much of the hassle of Web development, so you can focus on writing your app without needing to reinvent the wheel. It’s free and open source.

Django is a widely-used Python web application framework with a "batteries-included" philosophy. The principle behind batteries-included is that the common functionality for building web applications should come with the framework instead of as separate libraries. Django | Full Stack Python

Other Web Frameworks supporting Python

Visualization with Python

Matplotlib

seaborn

Plotly

  • Plotly | Plotly - graphing library that makes interactive, publication-quality graphs online. Examples of how to make line plots, scatter plots, area charts, bar charts, error bars, box plots, histograms, heatmaps, subplots, multiple-axes, polar charts, and bubble charts. Plotly is a web-based service by default, but you can use the library offline in Python and upload plots to Plotly's free, public server or paid, private server. From there, you can embed your plots in a web page. All Plotly graphs include tooltips, and you can build custom controls (like sliders and filters) on top of a chart once it's embedded using Plotly's JavaScript API. Plotly.js is built on D3.js and WebGL (via stack.gl). Another way to work in Plotly and share plots is in Mode Notebooks. You can pull data with SQL, use the Plotly offline library in the Python Notebook to plot the results of your query, and then add the interactive chart to a report.
  • Learn Plotly | SuperDataScience
  • Driving your graphic via ipyWidgets
  • It’s 2019 — Make Your Data Visualizations Interactive with Plotly | Jeff Hale - Towards Data Science Find the path to make awesome figures quickly with Express and Cufflinks

Dash

  • Dash a framework for building analytical web applications. No Javascript required; sits on top of Flask
    • Dash Bio a web application framework that provides pure Python abstraction around HTML, CSS, and JavaScript. Dash Bio is a suite of bioinformatics components that make it simpler to analyze and visualize bioinformatics data and interact with them in a Dash application.

Cufflinks


         Cufflinks --> Plotly  -->  plotly.js  -->  D3.js


plotly.js

Plotly Chart Studio

  • Plotly Chart Studio - editor for creating d3.js and WebGL charts. Chart Studio is built on top of Plotly React, Plotly React Editor, the Plotly Image Server, Sheet.js, Handsontable and many other top-quality, open-source projects.

mpld3

  • mpld3 | Jake VanderPlas - brings together Matplotlib, the popular Python-based graphing library, and D3.js, the popular JavaScript library for creating interactive data visualizations for the web. The result is a simple API for exporting your Matplotlib graphics to HTML code which can be used within the browser, within standard web pages, blogs, or tools such as the IPython notebook.

Bokeh

  • Bokeh | Continuum Analytics an interactive visualization library that targets modern web browsers for presentation - inspired by the concepts outlined in The Grammar of Graphics. Interactive plotting in web browsers, running JavaScript but controlled by Python. You can layer components on top of one another to create a finished plot—for example, you can start with the axes and then add points, lines, labels, etc. Plots can be output as JSON objects, HTML documents, or interactive web applications. Bokeh does a good job of allowing users to manipulate data in the browser, with sliders and dropdown menus for filtering. Like in mpld3, you can zoom and pan to navigate plots, but you can also focus in on a set of data points with a box or lasso select.

HoloViews

Pygal

  • Pygal | Florian Mounier for producing beautiful out-of-the-box charts with very few lines of code. Each chart type is packaged into a method (e.g. pygal.Histogram() makes a histogram, pygal.Box() makes a box plot), and there's a variety of colorful default styles. If you want more control, you can configure almost every element of a plot—including sizing, titles, labels, and rendering. You can output charts as SVGs and add them to a web page with an embed tag or by inserting the code directly into the HTML.

scikit-image

  • scikit-image An image processing library featuring many common operations including convolutional mapping, filtering, edge detection, and image segmentation.

Shapely

  • Shapely - a spatial analysis library which extends Python to work as a fully-featured GIS environment comparable to commercial software such as ArcGIS.

Satellite Imagery

  • Open Street Map a map of the world, created by people
  • Geospatial Data Abstraction Library (GDAL) a translator library for raster and vector geospatial data formats
  • Pyresample - re-projecting earth observing satellite data, capable of handling both swath data from polar-orbiting satellites and gridded data from geostationary satellites.
  • Fiona - handle vector data
  • rasterio - handle raster data
  • pyproj - transforming spatial reference systems - python interface to PROJ (cartographic projections and coordinate transformations library).
  • Folium - creating maps
  • GeoPandas - geospatial analysis; extends the datatypes used by pandas to allow spatial operations on geometric types. Geometric operations are performed by shapely. Geopandas further depends on fiona for file access and descartes and matplotlib for plotting.
  • GeoViews - visualizable geographic data that can be mixed and matched with HoloViews objects
