Decentralized: Federated & Distributed


Centralized vs. Decentralized vs. Distributed


Distributed machine learning refers to multi-node machine learning algorithms and systems that are designed to improve performance, increase accuracy, and scale to larger input data sizes. Increasing the input data size for many algorithms can significantly reduce the learning error and can often be more effective than using more complex methods [8]. Distributed machine learning allows companies, researchers, and individuals to make informed decisions and draw meaningful conclusions from large amounts of data. Many systems exist for performing machine learning tasks in a distributed environment. These systems fall into three primary categories: database, general, and purpose-built systems. Each type of system has distinct advantages and disadvantages, but all are used in practice depending upon individual use cases, performance requirements, input data sizes, and the amount of implementation effort. | SpringerLink

Distinguished Lecturer : Eric Xing - Strategies & Principles for Distributed Machine Learning
The rise of Big Data has led to new demands for Machine Learning (ML) systems to learn complex models with millions to billions of parameters that promise adequate capacity to digest massive datasets and offer powerful predictive analytics (such as high-dimensional latent features, intermediate representations, and decision functions) thereupon. In order to run ML algorithms at such scales, on a distributed cluster with 10s to 1000s of machines, it is often the case that significant engineering efforts are required --- and one might fairly ask if such engineering truly falls within the domain of ML research or not. Taking the view that Big ML systems can indeed benefit greatly from ML-rooted statistical and algorithmic insights --- and that ML researchers should therefore not shy away from such systems design --- we discuss a series of principles and strategies distilled from our recent effort on industrial-scale ML solutions that involve a continuum from application, to engineering, and to theoretical research and development of Big ML system and architecture, on how to make them efficient, general, and with convergence and scaling guarantees. These principles concern four key questions which traditionally receive little attention in ML research: How to distribute an ML program over a cluster? How to bridge ML computation with inter-machine communication? How to perform such communication? What should be communicated between machines? By exposing underlying statistical and algorithmic characteristics unique to ML programs but not typical in traditional computer programs, and by dissecting successful cases of how we harness these principles to design both high-performance distributed ML software and general-purpose ML framework, we present opportunities for ML researchers and practitioners to further shape and grow the area that lies between ML and systems. This is joint work with the CMU Petuum Team.

Distributed TensorFlow training (Google I/O '18)
To efficiently train machine learning models, you will often need to scale your training to multiple GPUs, or even multiple machines. TensorFlow now offers rich functionality to achieve this with just a few lines of code. Join this session to learn how to set this up.
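The session's own code is not reproduced here. As a rough, framework-free sketch of what synchronous multi-GPU training does conceptually (each replica computes a gradient on its shard of the batch, the gradients are averaged, and every replica applies the same update), the following assumes a toy one-parameter model. The helper names `gradient`, `split_evenly`, and `synchronous_step` are hypothetical and are not TensorFlow APIs.

```python
# Conceptual sketch only: plain Python, no TensorFlow calls.
# Toy model: y = w * x, trained with mean squared error.

def gradient(w, batch):
    """Mean gradient of (w*x - y)^2 with respect to w over one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def split_evenly(batch, num_replicas):
    """Split a batch into equal-sized shards, one per replica.

    Assumes len(batch) is divisible by num_replicas.
    """
    size = len(batch) // num_replicas
    return [batch[i * size:(i + 1) * size] for i in range(num_replicas)]

def synchronous_step(w, batch, num_replicas, lr):
    """One data-parallel step: local gradients, then one averaged update."""
    shards = split_evenly(batch, num_replicas)
    local_grads = [gradient(w, shard) for shard in shards]  # one per replica
    avg_grad = sum(local_grads) / num_replicas  # stands in for an all-reduce
    return w - lr * avg_grad

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # samples of y = 2x

w = 0.0
for _ in range(50):
    w = synchronous_step(w, data, num_replicas=2, lr=0.05)
print(w)  # approaches the true slope of 2
```

With equal-sized shards, the averaged gradient matches the full-batch gradient, so adding replicas changes where the work happens but not the update itself.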

Distribution Strategy API:

ResNet50 Model Garden example with MirroredStrategy API:

Performance Guides:

Commands to set up a GCE instance and run distributed training:

Multi-machine distributed training with train_and_evaluate:


Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis
In this video from the 2018 Swiss HPC Conference, Torsten Hoefler from ETH Zürich presents: Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis. "Deep Neural Networks (DNNs) are becoming an important tool in modern computing applications. Accelerating their training is a major challenge and techniques range from distributed algorithms to low-level circuit design. In this talk, we describe the problem from a theoretical perspective, followed by approaches for its parallelization. Specifically, we present trends in DNN architectures and the resulting implications on parallelization strategies. We discuss the different types of concurrency in DNNs; synchronous and asynchronous stochastic gradient descent; distributed system architectures; communication schemes; and performance modeling. Based on these approaches, we extrapolate potential directions for parallelism in deep learning."
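To make the difference between synchronous and asynchronous stochastic gradient descent concrete, here is a minimal simulation. It is not taken from the talk. It assumes a toy loss f(w) = (w - 3)^2 and models asynchrony as a fixed staleness, meaning every update uses a gradient computed from parameters that are a fixed number of updates old. Real asynchronous systems have variable delays, so this is a deliberate simplification, and the function names are illustrative.

```python
def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

def sync_sgd(w, num_workers, steps, lr):
    """All workers read the same w; their averaged gradient is applied once."""
    for _ in range(steps):
        grads = [grad(w) for _ in range(num_workers)]
        w -= lr * sum(grads) / num_workers
    return w

def async_sgd(w, staleness, steps, lr):
    """Each update uses a gradient computed `staleness` updates ago."""
    history = [w]  # history[-1] is always the current parameter value
    for _ in range(steps):
        stale_w = history[max(0, len(history) - 1 - staleness)]
        w -= lr * grad(stale_w)
        history.append(w)
    return w

# Same learning rate, different amounts of staleness.
fresh = async_sgd(0.0, staleness=0, steps=200, lr=0.4)
stale = async_sgd(0.0, staleness=2, steps=200, lr=0.4)
print("distance from optimum, no staleness:", abs(fresh - 3.0))
print("distance from optimum, staleness 2: ", abs(stale - 3.0))
```

In this toy setting, a learning rate that converges with fresh gradients becomes unstable once gradients are two updates old. Asynchronous schemes trade this risk against the time synchronous workers spend waiting for each other.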

Machine Learning Systems for Highly Distributed and Rapidly Growing Data
Microsoft Research: The usability and practicality of machine learning are largely influenced by two critical factors: low latency and low cost. However, achieving low latency and low cost is very challenging when machine learning depends on real-world data that are rapidly growing and highly distributed (e.g., training a face recognition model using pictures stored across many data centers globally). In this talk, I will present my work on building low-latency and low-cost machine learning systems that enable efficient processing of real-world, large-scale data. I will describe a system-level approach that is inspired by the general characteristics of machine learning algorithms, machine learning model structures, and machine learning training/serving data. In line with this approach, I will first present a system that provides both low-latency and low-cost machine learning serving (inferencing) over large-scale continuously-growing datasets (e.g. videos). Shifting the focus to model training, I will then present a system that makes machine learning training over geo-distributed datasets as fast as training within a single data center. Finally, I will discuss our ongoing efforts to tackle a fundamental and largely overlooked problem: machine learning training over skewed data partitions (e.g., facial images collected by cameras in different countries).



Introduction to Decentralized P2P Apps
Most people think peer-to-peer (P2P) networks are just for file sharing, but it turns out you can also build other types of applications on P2P networks with advantages like enhanced privacy and security. We’ll walk through the process of building an increasingly complex P2P cloud storage system (think Dropbox), and touch on the challenges you’d run into and some of their possible solutions. Topics include efficiently locating data within a large network and building a system where we can trust random people on the internet with our personal files. EVENT: SFNode Meetup July 2018 SPEAKER: Dylan Barnard PERMISSIONS: SFNode Meetup Organizer provided Coding Tech with the permission to republish this video.
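The description does not say which lookup scheme the talk uses. One widely used way to locate data without a central index is consistent hashing: peers and keys are hashed onto the same ring, and each key is stored on the first peer clockwise from its position. The sketch below assumes that scheme. The `HashRing` class, the peer names, and the file names are all hypothetical.

```python
import bisect
import hashlib

def ring_position(name: str) -> int:
    """Map a string to a position on the ring using SHA-256."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

class HashRing:
    """Toy consistent-hash ring: each key belongs to the next peer clockwise."""

    def __init__(self, peers):
        self._ring = sorted((ring_position(p), p) for p in peers)

    def locate(self, key: str) -> str:
        """Return the peer responsible for `key`."""
        positions = [pos for pos, _ in self._ring]
        i = bisect.bisect_right(positions, ring_position(key))
        return self._ring[i % len(self._ring)][1]  # wrap around the ring

    def without(self, peer: str) -> "HashRing":
        """Return a new ring with one peer removed (e.g. it went offline)."""
        return HashRing([p for _, p in self._ring if p != peer])

ring = HashRing(["peer-a", "peer-b", "peer-c", "peer-d"])
keys = [f"file-{i}" for i in range(200)]
before = {k: ring.locate(k) for k in keys}

smaller = ring.without("peer-c")
moved = [k for k in keys if smaller.locate(k) != before[k]]
print(len(moved), "of", len(keys), "keys changed owner")
```

Only the keys that lived on the departed peer change owner, which is why this kind of scheme suits networks where peers join and leave frequently.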

What is a Peer to Peer Network? Blockchain P2P Networks Explained
A peer-to-peer network, often referred to as a P2P network, is one of the key aspects of blockchain technology. In this video, we break down the complexity of peer-to-peer networks by first defining what a network is and how P2P networks differ from traditional networks.



Proxy vs. Reverse Proxy (Explained by Example)
Hussein Nasser: In this episode we explain the difference between a proxy (forward proxy) and a reverse proxy by example, and list the benefits of each server.

Proxy vs. Peer-to-Peer (P2P) Connections | webinar
In this webinar, we will explain how each connection type works, and in what applications you may prefer to use one or the other. You will learn how to use on Windows or macOS and on mobile (iOS/Android) apps to make P2P connections while we present the advantages of P2P versus traditional proxy connections from the web portal. All while making port forwardless connections.